Sunday, 10 August 2014

Why Mersenne Twister, BBS... ?


I wanted to have a blog called "random blog". However that implies a lack of purpose for the blog and lacks a certain style to the name. Being in the field of computational biology, we have simulation and the stochastic ones have random number generators and eventually you get curious enough to find out how random numbers are generated. Then you stumble upon these inventively named (or seemingly so because of author names being unusual) Pseudo Random Number Generators (PRNG's) and there you have it. Rather than writing "random blog" I just get a couple of PRNG's in my title to indicate the randomness!

It seemed quite a mouthful to begin with but now I like it much better. Sounds like a fun story book.

Addendum 08-12-2014

I own a Philips GoGear 2GB MP3 player which I bought second-hand in the early part of 2006, for the first 4 years I used it to listen to music while traveling, then after that I used it for listening to music while riding the bike because sound pollution can be extremely exhausting.

No scroll wheel, press button to scroll, slow interface and everything else reduced this player to a random shuffle player. I just put it on shuffle and listened to whatever came out of the earphones.

About a year later of this I noticed that the song orders were becoming predictable. I could remember that I had heard these songs in that particular order. Soon I noticed that the randomization had reduced considerably to the point that the songs were almost playing in alphabetical order or track order. It seemed that the random number generator in the MP3 player had run out of random numbers to pick.

I went online to read about this and discovered the concept of entropy in random number generators. Pseudo Random number generators (PRNG's) are typically initialized with a particular number called the "seed". Thereafter this seed produces a stream of random numbers. Using the same seed will produce the same series of random numbers which is why you can see the command set.seed() in stochastic simulations in statistics that allows reproducibility of results when you try it out on your computer.

So seeds can make a stream of predictable sequences. So normally if you want your PRNG to not be predictable you set a random seed every time. The amount of randomness available is called the "entropy" of the RNG. So entropy decreased if the pool of seeds is not refilled with new seeds. New seeds are generated by sampling a Hardware Random Number Generator and using it to generate a few "true" random numbers which are then used to seed a PRNG.

In the case of my MP3 player, because the playlist had remained more or less unchanged in the last 2 years of using it, the entropy had presumably decreased. The randomness of the "shuffle" function grew worse until one day I decided to put a few new songs into it. Suddenly the randomness in the shuffle was back again, which leads me to conclude that the seed for the RNG was probably the number of songs in the player or something like that, there was no HRNG because that would presumably add to the cost. Interestingly, HRNG for development boards like the Arduino are available for 4~5 $.

No comments:

Post a Comment