Hello, bonjour, zdravstvuyte, mambo, welcome to the fourth tutorial of Statistical Mechanics: Algorithms and Computations from the Physics Department of Ecole Normale Superieure. This week we will study random numbers and the sampling of discrete and continuous one-dimensional distributions. You need to understand these relatively simple problems in order to attack the more complex sampling problem that we are interested in, for their physical applications. For three weeks already, we heavily relied on random numbers Upsilon generated through Upsilon=random.uniform(0,1) This means that we called on the module 'random' of Python and use the function 'uniform' between 0 and 1 to generate random numbers between 0 and 1, uniformly distributed with a flat distribution. We left out many things, for example how these random numbers are generated on a computer, the most deterministic of all devices and whether we can trust them. To get an idea of what random numbers really are let us consider a baby version of a random number generator: it is called the congruential linear random number generator. Starting from a seed, an initial integer value, it generates a deterministic but seemingly erratic sequence of integers, that are folded back into a given interval through a modulo operation. It is this modulo operation that gives the name congruential to this and this whole class of generators Then, these integer numbers are divided by a large number, to give the familiar objects the random floats between 0 and 1. Of course, this program is deterministic and if we run it several times we will obtain exactly the same output. Now, we can set the initial seed equal for example to the system time and then the output of our random number generator will be a different sequence at each time that we run it. This is also done in the random.uniform that you have in Python, unless you take special precautions. Let us briefly discuss the properties of our naive generator, naive_ran.py but as it always more fun to find answers by yourself than to have them told to you I ask you to download the program from the Coursera website and then to answer the questions. You may have to run the program in order to find them. So, question 1: the program naive_ran.py generates a sequence of integers idum and of floats idum / float(m) what is the range of the integers idum? Well, the answer is clearly that the integers idum are contained in an interval including 0 and (m-1). This is done by the modulo operation: modulo m Now question 2: we have understood that the integers idum are between 0 and (m-1) does the generator generate all the numbers between 0 and (m-1)? or does it leave out a few of them? Are there holes? Well, as you may find out for yourself, for example by sorting the output of the random number generator along the first column, you see that all the numbers between 0 and (m-1) are represented. Question 3: we have seen that the sequence of random integers idum is bounded so sure enough after certain time we'll hit a number that we have seen before what happens then? Run the program and see for yourself. Well, this answer is very simple: the output sequence is periodic if you hit a number a certain time, you'll generate the same sequence as you did the first time it is this, plus the fact that there are no holes in the distribution that makes that this distribution is totally flat. You can learn about the Python generator by reading the documentation or by running this little program. In our applications, we need to sample integers and real, but also physical objects, from a general distribution pi(x) this is of course what the Markov chain method is all about but the direct sampling of discrete and one-dimensional continuous systems is an important subject that Michael will next treat in this tutorial. Before continuing, please take a moment to download, run and modify the program naive_ran.py, if you haven't done so already and also to familiarize yourself with basic usage of random numbers by running this program. Throughout this course we use the Metropolis algorithm many times, as Werner explained in the first lecture the Metropolis acceptance rate is given by pi(a->b) = minimum of 1 and the ratio of the statistical weights pi(b) and pi(a) of the states b and a But what did we actually do in order to decide whether or not a particular move is accepted or rejected? This can be determined by comparing the acceptance rate to a random number between 0 and 1. If the random number is smaller than the acceptance rate the move is accepted, and otherwise it is rejected Of course we do not even need to make this comparison if the ratio of the statistical weights is larger than one cause then - according to Metropolis - the acceptance rate will be 1 and the move is certainly accepted. We have already used the Metropolis algorithm to sample non-constant probabilities. For example in last week's homework (homework number 3) we sampled non-constant distributions of hard-disks but now let us use the Metropolis algorithm to sample a simple one-dimensional function, in fact the Gaussian normal distribution How does this look in Python? We start at x=0 then we propose a random move with a step size between (-delta) and delta the acceptance rate of this proposed move, according to Metropolis, is shown here and we have really understood that the minimum does not need to appear in the code. As we see, the algorithm samples the Gaussian distribution nicely but it much better to choose another route. We implement a direct sampling algorithm in fact a direct sampling rejection algorithm. Once again, we are now talking about rejections in a direct sampling algorithm, and not about rejections of the algorithmic transition in a Markov chain algorithm. So how do we proceed? Well, we can put our Gaussian into a box between (-x_cut) and x_cut and if you want, we can do this even on the Monte Carlo beach in a set-up which is only slightly modified from what we used in lecture 1. Just like the children threw pebbles randomly into a circle and a square to estimate the number pi we can now randomly throw pebbles into our rectangular box whenever a pebble falls below the Gaussian curve we accept it, otherwise we reject it. So what is the probability that a pebble that is fallen at a position x landed below the Gauss function? Well, clearly this probability is proportional to the function itself as there's the same number of pebbles landing at each position x due to the constant box side for all x then the fraction of the accepted pebbles is proportional to pi(x) In Python, our new pebble game can be implemented as follows we first select a random position within the rectangle then we check whether it is above or below the Gaussian curve and only if it is below we accept it This direct sampling rejection algorithm again samples the Gaussian distribution very nicely but note that we had to introduce this inelegant cutoff (-x_cut), x_cut. Rejection sampling can be problematic It is not due to the presence of rejections per se, but because the rejection rate can become really enormous, even prohibitive. To see this, let's look at a function that is not quite as friendly as a Gaussian distribution: the distribution 2 / sqrt(x) Maybe it troubles you that this distribution diverges for x->0 as sqrt(x)->0 and 1/sqrt(x) becomes infinite. However, such distributions exist and Vivien will later show you how to sample them without problems using a simple sample transformation But for our rejection method, that works so nicely in the simple Gaussian case, this presents a real headache, and this is because we simply don't know what box size to choose Should we set y_max equal to 1, 10, 1000, 1000000 Our rejection rate would sky-rocket, and whatever value we choose some part of the curve will still be above our box. Let's give it a try, we implement the rejection method (just like we did in the Gaussian case) and set y_max=100 although the result does not look so bad, in order to get a feeling for the potential danger of cutting off the divergent probability distribution at finite y_max, we compare with a Markov chain algorithm. Note that we do not need to introduce any cut-off as the Markov chain algorithm does not have any problem coping with the infinite probability density. The programs records the highest values of y and the lowest values of x that it obtains while it randomly moves along the x-axis between 0 and 1. This is the output of our program, in a few second of runtime it goes down to values in x as small as 10^(-7) and up to values in y of a few thousand, way beyond our chosen cutoff of 100. Should we actually use this Markov chain algorithm? Alberto in a few moments and Vivien later will show you that this is not at all called for. So before continuing, please familiarize yourself thoroughly with the way the Metropolis acceptance probability is implemented. Furthermore take a moment to download, run and modify the programs we just discussed. On the Coursera website you will find markov_gauss.py that implements the Metropolis choice directly. There's also the graphic version, markov_gauss_movie.py. Furthermore you will find the program reject_direct_gauss_cut.py note that there's the word 'cut' in its name as we have to cut-off the tails of the distribution at (-x_cut), x_cut. In this example this was not a problem, as the distribution is suppressed exponentially for large absolute values of x. There's also the program reject_direct_inv_sqrt.py Now we had to introduce a cut-off in y, and it was not exponentially small. This is really dangerous and we must avoid it at any price. Before Alberto will next introduce the method of tower sampling I invite you to take a look at markov_inv_sqrt_movie.py and take it apart, just in case you still doubt the existence of divergent distributions. Get it to run for other values of the exponent: positive ones pi proportional to x, x^2, x^3, for x between 0 and 1 but also for negative ones: pi proportional to x^(-1/2) x^(-1) and x^(-3/2), again for x between 0 and 1 There are many strange phenomena to observe, and we'll get back to them in week 10, the final week of this course. Sampling non-uniform finite distributions has an archetypical example in what we call the Saturday night problem. As discussed by Werner in lecture 4, we imagine to have k possible evening activities and we do not feel equally enthusiastic about them: go out with friend, visit family, do your homework, sports, laundry, and so on.. We know all the pi_k probabilities but we can still have troubles in deciding what to do. This means that we have troubles in sampling the probability pi_0, pi_1, pi_2 and so on we can adapt Michael's rejection method, and instead of sampling x between -x_cut and x_cut, we can draw integer random numbers 0, 1, .., K-1, select an activity which we'll be accepted if the pebble falls inside the red box. This occurs with a probability which is proportional to the weight of the activity. As Michael explained, this algorithm can have problems because of the high rejection rate. But for the Saturday night problem this is ok, we can program the algorithm and plan our activities week by week. Bad luck.. Let's look again to the last figure. You can see that the number of pebbles you should draw before having a single hit is typically equal to the ratio between pi_max and pi_average. Where pi_max is the maximal weight and pi_average is the mean value between all the activities. This number can be very large. The tower sampling method is much faster and much more elegant. Instead of placing the boxed one next to the other, we pile up them, one on top of each other, and the pebble is drawn from the bottom to the top of the tower. Once again, you choose your activity with a probability which is proportional to the weight of the activity But the algorithm is much faster, and the output is statistically the same. in Python, this gives the following rejection-free program How we can actually find the box j where the pebble falls? for a small number of activities K we can just go through the ordered table pi_0, pi_1, .. until we find a j for which pi_(j-1)