random?

bossk171 · 2007-10-16 09:07:19

This has always bothered me:

How do random number generators work? I can't see how anything in computers can be random.

My two guesses are:

1. Take an irrational number (let's say pi=3.14159...) and when you use a random number function, it calls 3, then 1, then 4, then 1 and so on... giving the appearance of randomness.

2. Using really really small time intervals (beyond our perception) and designating them as values 0.001s = 1, 0.002s = 2, 0.003s = 3 etc. So when the random number function is called, at a certain time, a number is spat out based on the time it is called. This idea seems more flawed to me, but easier do.

So how is it really done? Can some one give me a good idea, or better yet, write something up in C++? Please keep it as simple as possible, I'm still pretty new at programming (relatively).

Ricky · 2007-10-16 18:36:44

1. Take an irrational number (let's say pi=3.14159...) and when you use a random number function, it calls 3, then 1, then 4, then 1 and so on... giving the appearance of randomness.

This only pushes the problem back one step. How do you choose which number to start off with and/or what digit to start at?

2. Using really really small time intervals (beyond our perception) and designating them as values 0.001s = 1, 0.002s = 2, 0.003s = 3 etc. So when the random number function is called, at a certain time, a number is spat out based on the time it is called. This idea seems more flawed to me, but easier do.

Yes, they are called clock cycles. Every time your cpu executes an instruction, there is at least 1 clock cycle. Typically, random number generators use this as a "seed" (think like a plant, you need something to start with). There is one rather large benefit to this: no two seeds can ever be of the exact value in the same uptime of a computer. Why? Because getting the tick count (number of cycles) costs clock cycles.

However, typically such things are necessary. Instead, programs use the number of milliseconds from the year 1970 or the number of milliseconds of uptime. In the Windows 32 library, the function GetTickCount() doesn't actually return the number of clock ticks, but rather number of milliseconds since the computer was booted.

Now what you do with this value is input it into a function. I don't remember much more, but there is an entire field of study in analysis where the purpose is to find functions which "appear" random and have an even distribution.

MathsIsFun · 2007-10-16 20:35:56

Called "pseudorandom number generators". Having a seed is useful, as you can replicate the sequence (say you get an error one time, it is nice to rerun it). I don't think software can generate truly random numbers.

But a computer *could* have a true random number generator implemented in hardware, for truly "organic/biological" programs.

bossk171 · 2007-10-17 03:22:49

MathsIsFun wrote:

But a computer *could* have a true random number generator implemented in hardware, for truly "organic/biological" programs.

It could? That kind of brings me back to my original question of how...

Ricky · 2007-10-17 04:47:53

I don't think software can generate truly random numbers.

It depends what you mean by random. This involves a bit of quantum physics and philosophy. There is one school of thought that everything is determined from the start of the universe. If such were the case, then certainly nothing can truly be random. But then quantum physics came along and told us that not only are there random things, but that just about everything is in fact random. The randomness just gets "canceled" out when you get really big. In either case, what is to follow is the closest to random you can get. I've stripped out much of the details as they aren't important, but I can provide them is you wish.

volatile int count;

void Add(int n) {
   int i, x;
   for (i = 0; i < n; i++) {
      x = count;
      x++;
      count = x;
   }
}

int main() {
   //Create 4 threads, each calling Add(500000)
   //Wait till all threads are done
   cout << count << endl;
   return 0;
}

Your computer can execute multiple threads (programs, in a way) at once. It does so by executing thread A for so many milliseconds, then executing B, C, and D, and then going back to A till all of them are done. So in the above example, Add is sort of like the "main" for each thread. They will exit after getting done with the for loop. However, as they are all executing at the same time, one thread will interrupt another thread and take over. So here is what happens.

The program starts, and thread A is the first to get execution time. Let's say that count gets increased up to 50 and then just as we finish executing the line "x = count", thread A is told to stop and B starts. Now remember count is being shared across threads (it is global). So count is 50, and B starts increasing it. Then B stops and C starts, then C stops and D starts, then D stops and A starts. count has been increased by a whole bunch, say it's now 1000. But remember where thread A stopped. The next two lines we execute are x++ and count = x; In thread A, our x is 50. So we do 50++; and count = 51;. So everything we added in the other threads got lost.

Now the key thing here is that thread switching depends on a whole lot of things. How your OS chooses to switch, all the operations that occur upon a switch, whether the cache gets hits or not, and so on. If you have to swap an extra value out of a register, tons of different things can happen. So where a thread gets interrupted is entirely and purely random. Thus, the final result must be random as well.

bossk171 · 2007-10-17 05:07:17

That is far better an answer than I would have expected, Ricky, thanks.

However, if you re-run that same problem, wouldn't you get the same "random" numbers? If not, then how? You said it's how my OS chooses to switch, won't it choose to switch the same way each time? Won't the cache gets hit (or not) the same each time?

My computer knowledge is pathetic at best, so please bare with me if some of my questions don't make sense.

Ricky · 2007-10-17 05:45:37

No, the amount of time can not be determined. What you must realize is that to run a single instruction, a lot of things have to happen. For example, we wish to add 1 to the variable x. First we need to read in the code that tells us what we want to do. Once this is done, we update what's called the "program counter" (tells us where our next line of code is). Now we know we want to add 1 to the variable x. So we go to the cache and we say, "Is the variable x in here?" If it is, we can take it's value and move it into a register. If it isn't we go to the bus, take a short trip to the memory banks, we ask them for the variable x at such and such a location. We get back on the bus, we go back to the cpu and we say, "here it is, the variable x." Now we take this register, send it through the ALU with the instruction to add one to it. We get it back, and we go to the cache and we say, "put x in here." If the cache is full, then the cache has to eject some other value out so that it can make room for x.

This is what happens when 1 is added to x. And it's the most cut down version of what I could give to you. If I were to consider all the steps it would take to include pipelining, imagine multiplying that paragraph by at least 3. So let's take a look at some of the steps in more detail.

When we ask the cache if x is in there, the answer could be yes or no. Typically, if we just used it, it would be yes. But along with the threads that are running in our program, other threads are running too. By definition, the kernel (the OS "program") must also be running. Typically on any given computer system, there are 20-40 processes also running in the background. Now some of these can also take up CPU time which would alter the results of the code I posted. So the output is based upon what these other processes do as well. If say the kernel fills up the cache with a whole bunch of data, then our x will have been ejected and we would find it. As a result, the "lookup" of x will take a lot longer than it would have of if we found x in the cache. The same sort of thing can happen when we put x back into the cache. If it is full, then the operations needed to eject something first will take longer than if it wasn't full.

Now we have the CPU, the bus, and the memory. Each can (and hopefully do) have a different amount of hertz. This means that the cpu ticks X times every second, the bus transfers data Y times every second, and the memory looks up values Z times every second. What is going to happen is that if you start a request for memory at a certain time, the differences in X, Y, and Z could sync up so this happens very quickly. Or conversely, if you time it right there will be a lot of waiting in between. So it not only depends on all the above factors, but the exact time you start the program as well.

On top of all this, I don't believe everything inside a cpu happens in precise clock ticks. I'm not a computer engineer, so I'm not certain, but I believe the same memory look up can take different amounts of time based solely on the fact that our machines don't work perfectly. Combine this with all the above, and you get a purely random execution time. However, much like quantum physics, most of the differences will cancel each other out so that they don't appear. However, certain programs (like the one above) can emphasize that they do in fact occur.

bossk171 · 2007-10-17 05:56:06

I (think) I get it. It seems pretty strait forward, but it takes a bit to get my head around.

Thanks for the speedy response, but more importantly, that for the lengthy and informative response.

Math Is Fun Forum

#1 2007-10-16 09:07:19

random?

#2 2007-10-16 18:36:44

Re: random?

#3 2007-10-16 20:35:56

Re: random?

#4 2007-10-17 03:22:49

Re: random?

#5 2007-10-17 04:47:53

Re: random?

#6 2007-10-17 05:07:17

Re: random?

#7 2007-10-17 05:45:37

Re: random?

#8 2007-10-17 05:56:06

Re: random?

Board footer