SOL GRUNER: I'd like to welcome you to the first Bethe lecture of 2015. These lectures commemorate Hans Bethe, who is surely one of the giants of physics in the 20th century. The span of Bethe's work over his long career, which was three-quarters of a century long, was simply phenomenal. His work completely changed the direction of solid state, atomic, nuclear, particle, and astrophysics. It's really quite a statement.
He was, among other things, a Nobel Prize winner, a leader during the Manhattan Project, and afterwards a leader in efforts to control nuclear weapons. Freeman Dyson, who was a Bethe student, appropriately said that he was the supreme problem solver of the 20th century. Most of Bethe's career was, in fact, spent at Cornell, and perhaps more than any other single individual he shaped the physics department into the form that we presently see, for which Cornell is forever grateful.
Each year, Cornell invites a distinguished lecturer to give three lectures as part of the Bethe lecture series. And this year, the Bethe lecturer is Professor William Bialek. I think Bethe would have been very pleased by the selection of Bill, since Bill's career, like Bethe's, is very, very broad.
Bill is a theorist who received his PhD from Berkeley in 1983. After postdoc stints at Groningen and then UC Santa Barbara, he joined the Berkeley faculty, until the NEC Research Institute managed to attract him to New Jersey from Berkeley-- that's quite a feat-- in its effort to become the Bell Labs of soft matter.
Since 2001, Bill has been at Princeton University, where he is presently the John Archibald Wheeler and Battelle professor of physics. At Princeton, he's also a member and former associate director of the Lewis-Sigler Institute for Integrated Genomics. He is an associate faculty member and director in the programs of Applied and Computational Mathematics, Neuroscience, and Biophysics. And he's a fellow at the Princeton Center for Theoretical Physics.
Bill has many other honors, of which I'll only mention that he's a fellow of the American Physical Society and a member of the National Academy of Sciences. He's also, for those of us who know him, a superb teacher. Princeton recognized this with the President's Award for Distinguished Teaching. And most recently, he's received the Swartz Prize for Theoretical and Computational Neuroscience from the Society for Neuroscience.
Bill will be giving three lectures, the titles are shown here. The one today is "Are Biological Networks Poised at Criticality?" He will give a second lecture tomorrow at 4:00 in 700 Clark Hall that will be "Predictive Information and The Problem of Long Time Scales in the Brain." And finally, but not least, in this room at 7:30 on Wednesday, he'll be giving a lecture oriented toward the public, on "More Perfect Than We Imagined, A Physicist's View of Life."
So please join me in welcoming Professor Bialek to Cornell University.
[APPLAUSE]
WILLIAM BIALEK: Thanks, Sol. Finding your name in the same sentence with Hans Bethe's is an enormous honor. I got to encounter Bethe a little bit during his wintertime visits to Santa Barbara, which overlapped the period I was a post-doc there.
But the memory that is strongest is of the first time I set foot on the Cornell campus. I was a freshly minted PhD. My wife and I were on our way to the Netherlands for a year. And I came up to give a seminar, which was hosted by Watt Webb. Watt handed me my schedule for the afternoon at lunch. The schedule consisted of speaking to Ken Wilson, Michael Fisher, and then giving my talk. I was terrified.
And so I got out of the elevator at the correct floor in Newman, and, well, you walk out of the elevator and turn, and there was an office with an open door. And the reason I chose this picture is that I couldn't find one of the correct vintage. But basically, the behavior was the same. He was working away. By then, there was this glorious halo of white hair, to which I aspire.
And he looked up and smiled. I walked past his office and he sat back down to work again. And--
SPEAKER 1: Was he using a slide rule?
WILLIAM BIALEK: He was not using slide rule at that point. I mean, there are other things that give away the vintage of the photograph, but I particularly liked the pear. So it really reminded me of that scene. Anyhow.
SPEAKER 2: He was using a six-inch [INAUDIBLE].
WILLIAM BIALEK: Yeah I don't think he had it in his hand as I walked by. I think that I would have remembered.
But the other thing I aspire to is to make sure that I'm working and not distracted by something else. Of the many things over which he had mastery, sitting and working and not being distracted by the rest of life in the university was one of them. It was extraordinary. There's a lot to think about there.
So I was asked to give three talks, obviously to somewhat different audiences. And since I work at the interface of physics and biology, trying to reach a view of the living world that corresponds to our theoretical physicist's view of the inanimate world, I thought I would begin this talk by sharing with you the things that really worry me.
There's no question that there are an enormous number of beautiful phenomena in the biological world. They occur on all scales from the behavior of single molecules. This is here to remind me to talk about the fact that the emergence of the structure of even single biological molecules is the result of interactions among a very large number of elements. A typical protein might consist of 100 or 200 or more amino acids.
And on the one hand, we know actually from the theory of disordered systems that if you make a random heteropolymer, so if you choose amino acids at random and stick them together, then that doesn't fold into a well-defined structure at all. In fact, the most likely thing is that they'll agglomerate with each other and sink to the bottom of the test tube.
On the other hand, it is not the case that the identity of every single amino acid matters. The structure is somehow emergent from the collection of them. And you, artificially, and evolution, just through its history, have sampled many different sequences that can fold into the same structure.
And in fact, this is a serine protease, chosen because it was one of the landmarks in our understanding of this. There are both mammalian serine proteases and bacterial serine proteases. You know that they're both serine proteases because they have a serine in them. And they're both proteases, which is to say they're enzymes that cut other proteins.
But it is not the case that they resemble each other in amino acid sequence very much. And so if it wasn't for these few key properties, you really wouldn't recognize them as belonging to the same family at all. Now the mammalian serine proteases were the second enzyme whose structure was solved by x-ray diffraction.
So we knew what it looked like. It took many years until one of the bacterial structures was solved. And these two molecules that share almost no amino acids lie perfectly on top of each other in structure within a few tenths of an Angstrom. And the residues which actually participate in the chemistry of breaking the bonds lie on top of each other within tenths of an Angstrom.
So this tells you that even at the level of single molecules, the things that matter for biology are emergent. They're not the properties of individual amino acids. They're collective properties of the entire sequence, such that you can have two sequences that letter by letter look very different. And yet somehow the same structure, and hence the same biological function, emerges.
So one can look at all sorts of other scales. You can look at the molecules that are relevant in forming patterns in embryonic development. So this is a fruit fly. What you're seeing is one particular protein has been lit up by fusing it with a fluorescent protein. You see these marvelous stripes forming. This is the mechanics of the embryo during this period. And then this is an example of the little creature that might walk out.
For those of you who know a little more biology, you'll realize that this is not the end result of the development of that. This is rather cuter. The thing that walks out of here is a maggot. This is a caterpillar. By convention, caterpillars are nicer than maggots. But you'll notice that they also have stripes.
And if you went away thinking that these patterns of gene expression of concentration of proteins are what provide the road map or the blueprint for making these stripes, you wouldn't be far off. And again, although we are looking at the concentration of one particular molecule, what's actually happening at this stage is that there are perhaps 20 different molecules that are regulating each other's synthesis and degradation. And together, that network produces these patterns and gives each individual cell its identity, which will eventually translate into these beautiful structures.
So here one is looking in at a particular region of the nervous system. Throughout the brain, individual neurons generate electrical signals, which if you record them consist of a sequence of identical pulses, which are very brief. They're about a millisecond long. They're called action potentials or spikes.
And here, what's been done is to engineer into this little mouse an indicator protein whose fluorescence depends on the concentration of calcium in the cell, which is a sort of rough indicator of this electrical activity. And you can see that as the mouse pauses in different places, different cells are active. And these are the famous place cells in the hippocampus, for the discovery of which John O'Keefe shared the Nobel Prize last year.
And then you sort of have collective behavior on much larger scales. There are other beautiful phenomena that occur throughout.
Now the problem isn't that we don't know how to write down equations that describe any of these. So indeed, over the last decade or more, all of biology has become a much more quantitative enterprise. And it's actually fun to watch. Each subfield of biology discovers that it can become more quantitative on its own without any reference to the other subfields.
So in neuroscience, there is actually a quantitative tradition that reaches back very far. Collective animal behavior is something that people have finally figured out how to observe quantitatively-- there were lots of theories, but it was hard to make measurements-- and so on also in developmental biology and in molecular and cellular biology, where structural biology has been quantitative for a very long time.
I should say that in many cases, it's the influx of physicists who have helped to drive things forward, both experimentally and theoretically. Although, the best cases are ones where one builds on a sort of foundation of biological knowledge, hard won by very classical techniques.
So if you now go to a modern conference on quantitative biology or even you go to listen to the biological physics talks at the American Physical Society, you'll hear about all of these things. The problem is that you'll sort of hear about them separately. And each one is its own special story. And of course, the biological literature is very divided.
And if you dig in, each one of these special systems is complicated in its own way. And people have tried to deal with that. And so this is what keeps me up at night.
There is a fantastic example of our quantitative understanding of biology. In fact, an understanding that we now can phrase in very physical terms. The electrical signals that are generated by neurons throughout your brain are the result of special protein molecules that sit in the cell membrane. So membranes are insulating.
So there would be no interesting electrical dynamics of membranes if they were just membranes. There are special proteins that sit in the membrane which are called channels. And they can open and close and allow ionic currents to flow through. The interesting thing is that as they change structure to open or close, charges move around inside the molecule.
And so as a result, the probability of their being open or closed is influenced by the voltage or the electric field across the membrane. So you have in the membrane, a device, which if you increase the voltage, you can cause more current to flow. That's normal. All conductors do that.
But it can happen non-linearly. It can happen regeneratively. So a small change in voltage can cause an enormous amount of current to flow.
And as a result, the cell membrane becomes this amplifier. It has all sorts of interesting nonlinear dynamical properties. And in particular, those nonlinear dynamical properties can lead to the production of these stereotyped pulses, which are the action potentials that are the currency of communication among all neurons in your brain, between your sensory neurons and your central brain, and between your brain and your motor neurons as you try to act. OK?
And really, our understanding of this goes back to Hodgkin and Huxley in 1952, who actually wrote down equations to describe all of this. They understood that there were these channels and so on, although they had never seen them. And they did an incredibly ingenious set of experiments designed to dissect out the contributions from channels that are selective for the flow of sodium across the membrane, channels that are selective for the flow of potassium across the membrane, and so on.
And they studied all this under conditions in which, by essentially passing a wire down the center of the long axon that the cell sends from one end of the body to the other, they ensured that the current flowing across the membrane was spatially uniform. So they could study the dynamics in time.
But then you know how currents flow along the length. That's just conduction through an ionic solution. So when you take the wire out, if you know the conductivity of the solution, you now know what's going to happen. And what happens is that there's a little pulse and it propagates. And it propagates with exactly the correct velocity.
And so it was really a triumph. And I think we can say, we understand the electrical dynamics of neurons.
The only problem is that this description already in the paper by Hodgkin and Huxley has 20 parameters. Now they were at great pains to measure these parameters by fitting to very different kinds of experiments. And then when they predicted the propagation of the action potential, there weren't any parameters left. So that really was a prediction.
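For reference, the textbook form of the Hodgkin-Huxley equations (a standard reconstruction, not copied from the talk's slides) for the membrane voltage V and the gating variables m, h, and n is

$$C_m \frac{dV}{dt} = -\bar{g}_{\mathrm{Na}}\, m^3 h\,(V - E_{\mathrm{Na}}) - \bar{g}_{\mathrm{K}}\, n^4 (V - E_{\mathrm{K}}) - \bar{g}_{L}\,(V - E_{L}) + I_{\mathrm{ext}},$$

$$\frac{dx}{dt} = \alpha_x(V)\,(1 - x) - \beta_x(V)\, x, \qquad x \in \{m, h, n\},$$

where most of the fitted parameters hide in the voltage dependence of the rate functions $\alpha_x(V)$ and $\beta_x(V)$.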
Unfortunately, or fortunately, depending on your point of view, Hodgkin and Huxley studied an incredibly simple example. Our genomes, in fact, encode more than 100 different kinds of these channels. And a typical neuron in your brain chooses some cocktail of perhaps eight different kinds in different proportions. So in fact, it's even enormously more complicated than what Hodgkin and Huxley let on.
And so if you try this style of description, you're led down this path of ever increasing complexity with more parameters and so on. So I don't really think that for this audience I need to explain why this is troubling. But just for the sake of comparison, there is a paper that appeared five years later, which perhaps some of you know.
And this is the first paragraph of the discussion section at the end. If you ever encountered some of the authors, you can imagine the conflict between modesty and triumphalism among the different characters that led to the writing of this paragraph, which really is marvelous. But the part that's relevant is the sentence I've underlined. Only the critical temperature involves the superconducting phase. The other two parameters are determined from the normal phase.
So the theory of superconductivity is a three parameter theory. And as pointed out, two of them you can measure from the normal metal. And one of them is the critical temperature. You just measure the temperature at which the resistance is gone.
So in fact, the theory of superconductivity is almost a no-parameter theory. And in fact, I think physics at its best tends to be that way, at least the physics of materials-- particle physics is a little more complicated in a funny way. Whether you want to take the theory of superconductivity, or the theory of liquid crystals, or fluid mechanics, these are in some sense no-parameter theories. Right?
And if I choose my units correctly, then I have the Navier-Stokes equations. And the only thing I really care about is the Reynolds number. And that's dimensionless. Right? So things have to change when you change a dimensionless parameter. And then we can classify flows and their transitions and so on.
So a lot of very beautiful physics is almost parameter free. And of course, to mention local triumphs, which will be relevant in a moment, the theory of critical phenomena, for example, is essentially parameter free. Right? Once I tell you the symmetries of the order parameters and the dimensionality in which you live, the critical exponents are calculable. They're pure numbers, no parameters.
So it's almost too easy to point at our understanding of biology and say that we have a problem with the proliferation of parameters, because it's obvious. I would encourage you to think a little harder about this side, why this works. Right? How is it that we have theories of real phenomena in rather complex materials that are almost parameter free?
So at this point you know there are branches. There are different points of view as to how one should proceed. So one of the options is unspeakable.
And then there are these other two, which I think are both sort of interesting. One is that somehow biology works because although we spend all our time fitting these parameters, in fact, they don't really matter. That's somehow a distraction. The essence of what's going on is a behavior that's not dependent on the parameters.
And that maybe that's important because the parameters that we have difficulty determining may also be difficult for the organism to determine. So you can imagine, for example, that some of these parameters are effectively how many copies of one particular molecule do I have in the cell. Well if cells can't actually control that number, then it is not only true that we will have difficulty determining that number as we try to fit the data, but the cell also would have difficulty determining that number.
And so if things are going to work, whatever work means, they better work without depending on the precise value of that parameter. And that's a really interesting idea, but it's not the one I'm going to talk about.
The one I'm going to talk about is the idea that maybe the parameters chosen by nature are special. That is to say there's been a lot of selection. Right? In evolution, there's often an emphasis on generation of random variation. But of course, it's also true that there is selection. And if selection is strong enough, that can drive you to points in parameter space that you might actually be able to recognize and pick out by some theoretical principles that are not just let's fit the parameters and see what happens. OK? So that's the spirit which actually you'll hear echoed, not just in this talk, but in subsequent talks.
But as I said, it is useful by way of introduction to remind yourself of why it is or why it might be that the situation seems so different in physics. So there is a statement which you all know is true, but is perhaps not the thing that immediately leapt to mind when you heard this question.
So if you're talking about equilibrium phenomena, you know that in some sense our description of a system in thermal equilibrium is incredibly simple. The probability that you will find the system in any microscopic configuration is given by the Boltzmann distribution. But the Boltzmann distribution can be thought of-- it is mathematically, and it's useful to think about it this way-- as the probability distribution that is as random as it can be given the average value of the energy. The colloquial "as random as it can be," of course, is replaced mathematically by "make the entropy as large as possible."
So what that means is that the states that you'll actually find are as nearly random as you can make them, except that somebody told you that the average energy has a certain value. And so that shapes the probability distribution, but it shapes it in a very simple way. And forces it to be the Boltzmann distribution.
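Spelled out (a standard derivation, not taken from the slides): maximize the entropy subject to a constraint on the mean energy,

$$\max_p \; S[p] = -\sum_x p(x)\ln p(x)\quad \text{subject to}\quad \sum_x p(x)\,E(x) = \langle E\rangle,\qquad \sum_x p(x) = 1,$$

and the Lagrange multiplier solution is the Boltzmann form

$$p(x) = \frac{1}{Z}\,e^{-\beta E(x)},\qquad Z = \sum_x e^{-\beta E(x)},$$

with $\beta$ chosen so that the mean energy comes out right.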
And that's true, but it's a little bit of a cheat because, of course, when I write down the energy as a function of the microscopic variables, there could be all sorts of complicated stuff in there that I'm hiding from you. However, it's usually the case that that energy function is something that I can build locally. Right? Things don't arbitrarily interact with other elements on the other side of the sample. Atoms interact with their neighbors.
Now, if I tell you that you're going to write the energy function only out of local operators, of course, there's still lots of local operators. But what Wilson taught us was that if we want to get the macroscopic physics right you don't need to keep all of the local operators. You only need to keep the ones that are relevant.
And I love talking about this actually to non-physics audiences because you realize that we have this incredible luxury to work in a field in which the words relevant and irrelevant have been made mathematically precise. Right? Everybody knows you should only keep the relevant things. Right? But you know here it really means something. It's amazing.
So this, of course, was a huge triumph. But there's another part that really makes it work. And that is there aren't too many relevant operators. And this is why it's possible to have simple theories. Right? If you want to have a description of the macroscopic behavior of matter, and you insist that the interactions are local, then you only need to write down a limited number of terms and you will get everything right.
And in fact, if you're talking about writing down a theory in thermal equilibrium, what is in fact true is that the successful models are the maximum entropy probability distributions that are consistent with the average values of these relevant operators. And so it really is true that our successful theories of macroscopic equilibrium matter are theories that describe matter as being as random as it can possibly be except for a few things. And we understand why it is that fixing those few things is sufficient to get the macroscopic behavior right.
So this is a kind of animating spirit behind what many of us are trying to do in the biological case. And I hope you'll recognize the spirit, even if not so much of the technicality comes through. So I've learned, living as I do in between physics and biology, that one of the many differences between talks in physics and in biology is that biologists only thank people at the end, whereas in physics there's a bit more of a tradition to say you know I'm going to talk about work done in part by other people so I should tell you who they are before I start.
So I'm going to talk about things that began many years ago in collaboration with Michael Berry, no relation to the other Michael Berry, and his post-doc Ronen Segev. On this side are the people who are experimentalists, and on the other side are the theorists. This was work on neural systems, and I hope to say a few words at the end about its descendants. There was then a realization that some of the things that we were trying to do in neural systems were connected to things that people were doing in the context of amino acid sequences and families of proteins.
And then I spent a sabbatical in Rome and discovered many wonderful things including this remarkable work on flocks of birds. Then came back to Princeton to think about embryos. And returned again to the problems of the retina and neural systems. And all these wonderful theoretical colleagues biased rather toward students and post-docs who have now gone off on their own, one of them even coming here.
And of course, if you want to find out about this, you can find all the papers on the arXiv. That's another difference between physics and biology talks. Good.
So with that rather long prologue, how do we do this? How do we implement all these ideas for a flock of birds?
So the two things to have in mind are this sort of renormalization group spirit of trying to find a limited number of local operators or a limited number of relevant operators. And this problem of maybe the parameters that nature has chosen are special. So those are the things in the background.
So this is here to let you know that my colleagues in Rome, Andrea Cavagna and Irene Giardina have done this wonderful thing of actually filming flocks of birds with multiple cameras so that they can reconstruct positions and velocities of individual birds in three dimensions, even in flocks that contain a couple of thousand birds, which involves a lot of fantastic image processing using methods that are actually borrowed from statistical physics. Wonderful sort of loop of ideas.
And so you should think that one can get from these data essentially collections of the positions and velocities of all the birds in the flock at each moment of time over some window during which the flock is in front of their filming apparatus, which sits on top of one of the historic buildings in the center of Rome looking out over the piazza in front of the train station for those of you who've been there.
So historic buildings, you need special permission from a different ministry to put your equipment up on the top. And you have to bring it down every day. It's a marvelous Italian story. It's why some people want to do these experiments in the lab instead of in the field.
So look, it could be that the way flocks organize themselves is that there is a bird and he sees a bird all the way on the other side of the flock, and he says, I'm going to go follow him. But that seems a little bit implausible, although there are still papers appearing with that idea. It seems much more likely that what's happening is that birds look at their neighbors and try to orient to do something similar to what their neighbors are doing. And that that ordering tendency somehow propagates throughout the entire flock. And this idea has been in the physics literature for a very long time.
The only question is, it's not clear whether it's a metaphor-- all the birds deciding to fly in the same direction is like all the spins lining up in a magnet, the way life is like a river-- or whether this is a real theory in which I could actually calculate things.
So we're going to try and calculate things. What we're going to try and do is to implement this intuition and this RG idea that what I should do is to build the maximum entropy distribution that's consistent with the expectation values of the operators. Well, I wish I could do the relevant ones. But what I'm going to do is the ones I think are important. OK? And so there's a little bit of finger crossing involved here. And we'll see how it goes.
So if you think that local things are important, then you focus on this guy and you look at his neighborhood. And you say, he's trying to align to his neighbors. So let's measure the similarity of the velocity of this bird to that of his neighbors. We'll average over a neighborhood, which we'll take to have some size n sub c. We can discuss how n sub c is determined, if you like. It's not very large. It's 15 or 16. Remember, you're in three dimensions. So it's not as bad as it looks.
And alternatively, you can do something a little bit more sophisticated than just picking a fixed neighborhood if people are curious. So you want to do this in a dimensionless way. So we're going to measure the similarity to your neighbors by computing this quantity. It's a mean square difference from your neighbors.
You realize that if this is the only thing you fix about the flock, then if you add a constant velocity to everybody, then you get the same number back. So you'd also like to fix the average speed at which everybody's going. You'd like to preserve the symmetry that the flock can fly in any direction.
And actually if you do that, you also discover that in some sense this is a gradient term. Right? It's telling you about differences from your neighbors. But you'd also like to say something about your overall variance. So let's also fix the variance between an individual and the average over the flock.
So if I try to write down the maximum entropy distribution that is consistent with these expectation values, well remember, the maximum entropy distribution consistent with the average value of the energy is the Boltzmann distribution. So it's e to the minus a constant times the thing whose average you're fixing. And that turns out to generalize. If you have many things whose average you'd like to fix, then you have basically an energy that has many terms corresponding to each of the things that you fix.
So that's the maximum entropy distribution consistent with the three expectation values that I just told you. Roughly speaking, j determines the degree of similarity to your neighbors. mu determines your mean speed. And g determines the variance of the speed.
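In schematic form (suppressing the normalizations that make each term dimensionless, and using notation that may differ from the published parameterization), the maximum entropy distribution consistent with those three expectation values looks like

$$P(\{\vec{v}_i\}) = \frac{1}{Z}\,\exp\!\left[-\frac{J}{2}\sum_i \frac{1}{n_c}\sum_{j\in N_i}\left|\vec{v}_i - \vec{v}_j\right|^2 \;-\; \mu\sum_i |\vec{v}_i| \;-\; \frac{g}{2}\sum_i\big(|\vec{v}_i| - \bar{v}\big)^2\right],$$

where $N_i$ is the $n_c$-bird neighborhood of bird $i$ and $\bar{v}$ is the flock-average speed.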
So you measure those three expectation values from the data. You do the inverse problem of tuning. This is inverse in the sense of statistical mechanics. Right? So usually in statistical mechanics I write down the Boltzmann distribution. And it has coefficients in the energy. And then I compute the expectation values and correlation functions and so on.
Here I'm doing the opposite. I measure the expectation values, and then I have to tune the coefficients in the energy so that the expectation values come out right. Same idea, but backwards.
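As a sketch of what tuning the coefficients means in practice (a generic recipe, not the specific numerical method used for the flock data): for a maximum entropy model P(x) proportional to exp[-sum_mu lambda_mu O_mu(x)], matching the model averages of the O_mu to the measured ones is equivalent to maximizing the likelihood, and the gradient is simply the difference between model and data averages.

```python
import numpy as np

def fit_maxent(data_means, model_means_fn, lambdas0, lr=0.1, steps=1000):
    """Generic moment-matching loop for a maximum entropy model.

    data_means     : measured expectation values <O_mu> from the data
    model_means_fn : function lambdas -> expectation values <O_mu> under the
                     model (in practice estimated by Monte Carlo sampling)
    lambdas0       : initial guess for the coupling constants lambda_mu
    """
    lambdas = np.array(lambdas0, dtype=float)
    data_means = np.asarray(data_means, dtype=float)
    for _ in range(steps):
        model_means = model_means_fn(lambdas)
        # With P(x) proportional to exp(-sum_mu lambda_mu O_mu(x)), the gradient
        # of the log-likelihood with respect to lambda_mu is
        # <O_mu>_model - <O_mu>_data, so this is plain gradient ascent.
        lambdas += lr * (model_means - data_means)
    return lambdas
```

The expensive step is estimating the model averages, which for the flock means sampling the distribution written above.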
So you do it. And now you can compute anything you want. So for example, you can decompose the velocity into speed and direction. And then you can compute the correlation functions of these quantities separately. And we want a definition of the correlation function in which, as always, you measure connected correlations, which means you subtract the averages.
But average over what? Do you want to do average over time? In this case, what we want to do is average over the particular snapshot of the flock, which means that the correlation function can go negative. But that's a detail. I just wanted to be sure you know what we're computing.
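A minimal sketch of that computation (illustrative code, with the average taken over the particular snapshot of the flock as just described, so the result is allowed to go negative):

```python
import numpy as np

def connected_correlation(positions, velocities, r_bins):
    """Connected correlation of velocity fluctuations vs. distance, one snapshot.

    positions : (N, 3) array of bird positions
    velocities: (N, 3) array of bird velocities
    r_bins    : edges of the distance bins
    """
    # Fluctuations around the average over this particular snapshot of the flock.
    du = velocities - velocities.mean(axis=0)
    N = len(positions)
    i, j = np.triu_indices(N, k=1)
    r = np.linalg.norm(positions[i] - positions[j], axis=1)
    c = np.sum(du[i] * du[j], axis=1)           # dot products of fluctuations
    corr = np.zeros(len(r_bins) - 1)
    counts = np.zeros(len(r_bins) - 1)
    which = np.digitize(r, r_bins) - 1
    for k in range(len(r_bins) - 1):
        mask = which == k
        counts[k] = mask.sum()
        corr[k] = c[mask].mean() if counts[k] else np.nan
    return corr, counts
```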
The typical neighborhood size that we average over is a couple of meters. So the birds within a small region, but the flock is 30 meters across. So the things that we're fixing are related to the behavior in this small region. So of course, we get that right. But what's surprising is that you get the correlation function both for directions and for speed exactly right for the whole flock.
In fact, if you dig in, you can even compute four point correlations. So I have two birds here, and I check whether their directions are a little more correlated than average. And then do that for two birds over here. And then ask whether if these guys are a little more correlated than average, are these guys also a little more correlated than average? And they are. And that depends on their distance, actually in a non-monotonic way because of the effects of the boundary.
You can get all sorts of details right. It's astonishing because this is in some sense a very simple model, but it all works.
Since this is a physics colloquium, let me draw your attention to a few things. You'll notice that these are rather featureless, sort of boring looking curves. What you do not see, that you might have expected, is an exponential decay of correlations on some scale, which is perhaps a few times the interaction range. Right? You don't see this. OK.
Furthermore, the only thing that you could point to as possibly being a correlation length is the place where it crosses zero. In fact, if you do the analysis for flocks of different sizes, you discover that both in the data and in the theory this characteristic scale is proportional to the linear size of the flock itself. It's not a scale at all.
So now you think, oh, aha, now I know why he put criticality in the title. There's no correlation length. You should be careful. This is a directional correlation. The flock breaks a continuous symmetry. They decide to fly this way and not that way.
So if the probability distribution that you write down for the velocities is e to the minus something local in those directions and that probability distribution has the property that breaks a continuous symmetry, there will be Goldstone modes, which are the direction waves going through the thing. And that will give you long range correlations.
And that is what these are. So as my friend Andrea Cavagna likes to say, the birds know Goldstone's theorem. They knew it before Goldstone, in fact.
This however, shows the same behavior in the correlation function of the speed for which there is no continuous symmetry being broken. So this one you should be a little more worried about. So hold the thought.
And let's talk about a very different problem. Let's talk about that problem of a protein folding up as a result of interactions among all its amino acids. And I told you that if you look across evolutionary history you can find many proteins that fold into the same structure, but have different amino acid sequences.
So then you can ask what is the probability distribution of amino acid sequences that are consistent with folding into this particular structure? And you notice, for example, that at site number 17 there is the probability P(1, 17) of using that alanine and the probability P(2, 17) of using a glycine and so on.
But importantly, you see that there are correlations among the sites. If I swap amino acids over here, I also seem to swap amino acids over there statistically. Right? It's correlated.
So you say, I'd like to build a model to capture those pairwise correlations. So if you want to build the model that captures those pairwise correlations, but otherwise has as little structure as possible, that is to say the distribution you write down has the largest possible entropy, then it is of this form. But again, you don't really believe that if you change amino acids over here that that should directly require you to change an amino acid over here.
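The form being referred to is presumably the standard pairwise maximum entropy (Potts-like) model used in this literature,

$$P(s_1,\dots,s_L) = \frac{1}{Z}\exp\!\left[\sum_i h_i(s_i) + \sum_{i<j} J_{ij}(s_i, s_j)\right],$$

where $s_i$ is the amino acid at site $i$, the fields $h_i(s_i)$ fix the single-site frequencies, and the couplings $J_{ij}(s_i, s_j)$ fix the pairwise correlations.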
You suspect that if you could write the probability distribution in this form, then the only non-zero terms would be from the amino acids that actually bump into each other. Right? If I make this amino acid a little bigger, I should make the one next to it a little smaller. If I make this one a little more positively charged, I should make this one a little more negatively charged, and so on.
So people have tried to do this. And you realize that this is still very much an open area. But you realize that if you could do this, if you could look at let's say 2,000 examples of the sequences of proteins that go in the same family, and there are many families of proteins for which you can do that now because sequencing is cheap, you could build a model like this.
And then you would discover that the only non-zero terms pointed to pairs ij that were near each other in space. But if that's true, then you could imagine-- remember, it's a polymer. Right? A protein is a polymer. So they're connected.
And then you know, by reading this off, that residue number 17 is close to residue number 149. So the polymer must fold up and bring 17 and 149 next to each other. If I give you a list of all of the residues that are in contact with each other, that's called the contact map. And if I tell you that, I've basically told you the structure of the protein. Because you have to have continuity along the chain and then satisfy all these contacts. And that's enough to determine the structure of the protein.
So if this is true, then it would mean that by looking at the statistics of amino acid sequences in protein families I could determine protein structure. And so this is an example of a paper that tries to do this. And this, of course, would be astonishing. And has lots of people excited.
Here I'd like to point out something slightly different, which is the reason this problem is not trivial. So you might say, what's the problem? If things only interact with each other locally, then they should only be correlated locally. But that's not true. OK?
If you measure the correlation between amino acids-- remember, there are 20 possible amino acids, so I don't want to just measure correlation by multiplying things together. I should measure something like the mutual information between the choice of amino acid at site a and site b. If I plot in this family the mutual information between amino acid choices at sites which are separated by distance r, you see, indeed, it's true that things that are close together tend to be more correlated.
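As an illustration (standalone code; the alignment columns are hypothetical inputs, and the finite-sample corrections one would want in practice are omitted), the mutual information between the amino acid choices at two sites of a protein family alignment is computed like this:

```python
import numpy as np
from collections import Counter

def mutual_information(column_a, column_b):
    """Mutual information (in bits) between amino acid choices at two sites.

    column_a, column_b : sequences of single-letter amino acid codes, one entry
                         per protein in the alignment of the family.
    """
    n = len(column_a)
    pa = Counter(column_a)
    pb = Counter(column_b)
    pab = Counter(zip(column_a, column_b))
    mi = 0.0
    for (a, b), n_ab in pab.items():
        p_ab = n_ab / n
        p_a, p_b = pa[a] / n, pb[b] / n
        mi += p_ab * np.log2(p_ab / (p_a * p_b))
    return mi
```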
But actually you can find examples of things that are right next to each other that are only a little bit correlated. And more disturbingly, you can find examples of things that are on completely opposite sides of the protein that are equally correlated. And in fact, although there's a little bit of a decay, there's not much. And the most obvious feature of these data is that they're flat. Right?
So again, the correlations don't seem to decay. So if you imagine trying to build a model of this form, such that the correlations between s and s prime at sites i and j extend throughout the protein, remember that these interactions start out being local. The only way that the correlations can become long ranged involves the strength of these interactions-- or alternatively, if you want, the temperature that goes out in front of this, although that would be a little bit too much like the Boltzmann distribution. Right? Remember, this is just a probability. There's no real energy.
The strength of these terms had better be tuned to its critical point or else you're never going to reproduce this behavior. Now I should say that we're not done. We don't really know yet that there is a local model that perfectly reproduces this behavior. All of these papers that go at this question have made use of various approximations. And what Lucy and I are actually working on now is trying to relieve these approximations and test more directly whether one can make these local models work.
But if they're going to work, and there's a lot of reasons to think so, then the only way it's going to work is if somehow the strength of these interactions is tuned so that you've got long range correlations, which should remind you of what I told you about in the birds. In fact, if we go back to the flock, I told you that there was a coupling constant g that basically controlled the variance of the speed. So I can now reach into that model and tune that coupling constant.
And what you discover is that if you make that coupling constant bigger, you do see the correlations of speed decaying on some characteristic scale. And in fact, the characteristic scale very quickly becomes the distance over which you have interactions all together.
The actual value is such that you don't get any decay at all, or you get something that approximates this linear decay. In fact, there is a critical point in that model essentially at the value of g that's picked out by the data.
So what's interesting here is, I was joking about whether the statistical mechanics approaches to collective behavior are meant to be metaphorical or they're meant to be a real theory. What we've done is to say, OK, let's take this maximum entropy idea, which is you take certain expectation values from the data, certain average properties from the data, completely seriously so you want to match them exactly. But otherwise, you make your model as random as possible.
Those models are statistical mechanics models. Right? They're of the form of the Boltzmann distribution. If the things whose average you're keeping track of are the things for which you have an intuition, they even have the detailed form of models that we understand in statistical physics. And furthermore, in this case, there are local interactions. So if you're reproducing long range correlations with local interactions, and it's not the Goldstone mechanism, then the only way that could possibly be working is if the parameters have been tuned to be close to a critical point.
And remember, we didn't tune the parameters. You'll notice that effectively it really is true that g determines the variance of the speed. So you could imagine a flock which was the same, but in which the variance of the speed was half as big. Then if we tried to go through this exercise, what we would predict is this sort of correlation function, which is not what we see.
So there's something about if I think of the phase diagram of the system being not in my parameters of my model but the values of the things I actually observe the similarity function q and the variance of the velocity sigma, then in that parameter space there is a critical point. And the real system seems to be poised right on that critical surface.
So once you're armed with these things, you might say let's look around at other examples. So let's look at that embryo that I was showing you. The signals that produce those beautiful spatial patterns, they start with a signal that comes from the mother who essentially establishes boundary conditions in the egg. That leads to spatially varying concentrations of what are called the primary morphogens, the ones that sort of set the basic rules. This is going to be the head. This is going to be the backside of the animal.
And then those molecules feed a network of things that are called gap genes, which have spatial profiles of expression which are broad. And then these molecules then feed into the things that make the molecules that make the stripes that you saw in the second slide. What you see here are optical sections through an embryo. So you're looking sort of cutting the embryo down the middle, although you're doing it optically so you don't actually have to cut.
In the fly embryo, there's this marvelous thing that at the early stages of development things are going so fast that the cells don't even stop to make membranes in between the nuclei. So an egg starts with one big cell, one nucleus. And you get two nuclei, four nuclei, eight nuclei, and so on. And it's not until you've gone through 14 divisions that you actually pause and make walls or membranes between the cells. So it really is a big box. It's as close to the physicist idealization as we're going to get.
And furthermore, for the convenience of the experimentalists, somewhere around the ninth round of nuclear duplication, all of the nuclei, except for a handful that do something special, move to the surface of the egg where you can see them. And thus, if you take an optical section through the middle, you see all of the nuclei around the edge. And the proteins that are being labeled here in the different colors are proteins that serve to regulate the reading out of other genes. And so they localize to the nucleus.
So it's a fantastic experimental system. So these molecules all have names. And I can see if I can give the talk without telling you the names. Because that's another difference between physics talks and biology talks.
OK so now let's project onto the long axis. And you see that one of the molecules, so this one for example, it's on in the front part. It goes off. And then in the back there's another bump. This one is this one. Unfortunately, the colors in here and the colors in there don't line up. That's unfortunate.
Anyhow, you get these complicated spatial patterns. And what you should think is that essentially the cell's identity, where am I along this axis, is determined by the values of these protein concentrations. And in fact, one of the things that we did was to try and ask how accurately can you read out position by knowing these values. And indeed, these protein concentrations taken together, just these four molecules, are sufficient to determine the position of a nucleus along this axis to an accuracy of 1% of the length of the entire egg. And that's about the distance to the next nucleus. So it really is true that these molecules contain enough information to set up a blueprint that determines the position of every cell.
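A toy version of that readout (an illustration only, not the estimator used in the actual analysis): treat the measured mean and variability of the four gap gene concentrations as a function of position as a lookup table, and decode a new measurement by maximum likelihood, assuming independent Gaussian variability for simplicity.

```python
import numpy as np

def decode_position(c, mean_profiles, sd_profiles, x_grid):
    """Toy maximum likelihood position decoder from gap gene concentrations.

    c             : length-4 vector of concentrations measured in one nucleus
    mean_profiles : (n_positions, 4) mean concentration vs. position x
    sd_profiles   : (n_positions, 4) standard deviation vs. position x
    x_grid        : positions (fraction of egg length) for the rows above
    """
    # Log-likelihood of the measurement at each candidate position,
    # assuming independent Gaussian variability for each of the four genes.
    log_like = -0.5 * np.sum(((c - mean_profiles) / sd_profiles) ** 2
                             + 2.0 * np.log(sd_profiles), axis=1)
    return x_grid[np.argmax(log_like)]
```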
OK. But that's not what I want to focus on here. What I want to focus on is are there long range correlations in the system? You'll notice there are little error bars here. And those error bars are actually not the measurement errors. Those are the standard deviations across many, many different embryos. The measurement errors are about 3% on this scale, which is one of the triumphs of my experimental colleague Thomas Gregor that he can actually do that.
And it matters because it turns out that the biological variations are only 10%. So you really do have to make a precision measurement in order to reveal the underlying precision of the biological system.
So how do I decide whether these fluctuations are correlated along long distances? So our path to this was a little complicated. You might think well, I don't know, measure the fluctuation here and the fluctuation there and ask if they're correlated. That turns out not to be quite the right thing to do. Remember, you have four different variables. And so the correlation is really a matrix.
And so what I want to do is to focus on some elements of that matrix and tell you how those correlations work. So what we did was we noticed that if you look in these crossing regions here, where you see the concentration of one molecule go up and the other one go down, roughly speaking there are only two molecules that have significantly non-zero concentrations. So in this region, you can sort of blow things up and think just about the two dimensional problem.
And so you know that if you're in two dimensions, then all of the different combinations that you might think about are just sort of rotations in those two dimensions. And it turns out that if you measure the correlations, the combinations that diagonalize the covariance matrix, the sort of normal modes if you want, are pretty much the symmetric and anti-symmetric combinations.
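A sketch of that step (illustrative code, with hypothetical variable names): collect the fluctuations of the two genes across many embryos at a given position, diagonalize the 2-by-2 covariance matrix, and compare the eigenvectors with the symmetric and anti-symmetric combinations.

```python
import numpy as np

def crossing_modes(g1, g2):
    """Eigenmodes of the fluctuations of two gap genes at one position.

    g1, g2 : arrays of the two normalized concentrations, one entry per embryo.
    """
    fluct = np.stack([g1 - g1.mean(), g2 - g2.mean()], axis=1)
    cov = np.cov(fluct, rowvar=False)        # 2 x 2 covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # normal modes of the fluctuations
    # For the gap gene crossings the eigenvectors come out close to the
    # anti-symmetric (g1 - g2) and symmetric (g1 + g2) combinations,
    # with very unequal variances.
    return eigvals, eigvecs
```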
Furthermore, if you track how these things vary in time, so here what you do is you stop the action in the embryo, and you stain it. So you say, how do I do things in time? Well, you take lots of embryos. They've each been progressing for different lengths of time. You stop them all. And now all you need is a way of determining at what time they stopped.
Remember, I told you that the system pauses to build the membranes in between the nuclei. It turns out that if you look in a live image and watch those membranes growing, or rather invaginating-- so there's a membrane that is around the entire egg and it invaginates to put barriers in between the nuclei. If you watch the progress of that motion, and you measure how deep it has gone, that gives you a clock, which is accurate to plus or minus one minute from embryo to embryo. There's another piece of extraordinary precision.
So you can do this. And you can basically get snapshots at different times. And what you discover is that the two modes, the symmetric and anti-symmetric combination, one of them has a lot of variance. And the other one has a little variance. Right?
You're diagonalizing the covariance matrix. So the two eigenvalues aren't equal. In fact, they're very different from each other. So most of the variance is in the anti-symmetric mode and a little bit of the variance is in the symmetric mode.
And then you watch in time. And if you watch the individual genes it's all very complicated. But if you rotate and look at the symmetric and anti-symmetric combinations, you discover that the mode that has small variance relaxes very quickly to its steady state value. And the mode that has a large variance drifts very slowly.
Furthermore, if you look at the spatial correlations of the fast mode, the symmetric combination, so let's just again stay inside this region and compute the correlation of the fluctuations. So the correlation of the fast mode at this position with the value of the fast mode at this other position, you get something like this. You'll notice this scale is 0.01. So what that means is that basically by the time you've gone to the next nucleus over, you've lost all your correlations.
On the other hand, if you look at the high variance mode, which is also the slow mode, you notice that its correlations extend essentially over this entire crossing region. But now what you can do is to compute the slow mode. So in this region, the slow mode is blue minus red. In this region, the slow mode is red minus green. In this region, the slow mode is green minus yellow. In this region, the slow mode is yellow minus blue.
So you can calculate the fluctuations along each of those modes at every position. And what you discover is, well, this one is the mode associated with this crossing correlated with itself. That's a reproduction of this curve. But then you can find these other modes correlated with this mode as you move throughout the embryo. And what you see is that the correlations are decaying, or the envelope of the correlations decays.
But if you know which mode to pick, you can find correlations that are decaying very slowly. In fact, these lines have a slope, which is just 1 over l or 1 in these units.
So what this says is that somehow this network of interacting genes also is generating many of the signatures that you'd expect near a critical point. You have slowing down of certain modes. And you have long range correlations.
In fact, if you look closely, you discover that at each point the distribution of the slow modes is also non-Gaussian, whereas the distribution of the fast modes is Gaussian, which you'd, again, expect near criticality.
SPEAKER 3: Is there any resource limitation to the developing embryo?
WILLIAM BIALEK: Oh sure. Yeah, that's a great question. There's a whole other part of our understanding of the physics of embryonic development which has to do with why the noise levels are what they are, which I think I'm going to talk about on Wednesday night. And that's definitely a resource limitation. Presumably, you just can't make lots of molecules.
OK. Time is short, but let me try to say a few words about neurons. Because this is where we started and where we're still digging in. So my colleagues can now record from more than 100 neurons simultaneously in a small patch of the retina. And this is a patch of the retina whose linear dimensions are such that essentially every cell can talk to every other cell, because the cells can reach out. And in this neighborhood, which you might think of as being a kind of elementary repeating unit of the retina, there are perhaps 250 or 300 cells. So if you're recording from 120 of them, not only do you have a lot of neurons, you also have most of the ones that are relevant in that neighborhood.
If you basically mark at every moment when each cell generates an action potential-- so here are the neurons in order. Here is time. There's some movie being shown in the background. I can now cut time into small slices. You should worry about how you choose your time slice, but that's another story. And now the responses are binary. Either you see an action potential and you can write down a 1, or you don't and you write down a 0. Or, when I was at NEC, we had physicists and computer scientists, and it was amazing how long it took to translate between 1 and 0 and spin up and spin down, but whichever one you want.
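In code, the binarization step looks something like this (illustrative only; the bin size is the choice he warns about):

```python
import numpy as np

def binarize_spikes(spike_times, t_start, t_stop, bin_size=0.02):
    """Turn spike times into a binary word per time bin.

    spike_times : list of arrays, spike times (in seconds) for each neuron
    bin_size    : width of the time slice; choosing it is its own story
    Returns an array of shape (n_neurons, n_bins) of 0s and 1s.
    """
    edges = np.arange(t_start, t_stop + bin_size, bin_size)
    words = np.zeros((len(spike_times), len(edges) - 1), dtype=int)
    for i, times in enumerate(spike_times):
        counts, _ = np.histogram(times, bins=edges)
        words[i] = (counts > 0).astype(int)   # 1 if the cell spiked in the bin
    return words
```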
And so the responses are intrinsically discrete, or binary. And then you could say, well, let me build a model that matches various features of the data. So let me match the probability that every neuron generates a spike. Let me match the correlations between pairs of neurons. And let me match the overall probability that K out of the N cells spike together.
So these models are-- again, the model that does that is this, which except for this term, you should recognize. I should say, some of the correlations are positive, and some of them are negative. And correspondingly, some of these interactions, Jij, are positive, and some of them are negative. So this is indeed a spin glass. One of the things you would expect is that then, if you think in terms of the energy, it should have many minima. And indeed, alternatively, the probability distribution should have many maxima. And, in fact, it does. And that's something you predict from the model, and then you can go find those in the data. And there's a rich phenomenology associated with those multiple peaks in the distribution, which I'm not going to talk about.
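The model being pointed to is presumably the "K-pairwise" maximum entropy form used in this line of work, which in standard notation reads

$$P(\sigma) = \frac{1}{Z}\exp\!\left[\sum_i h_i\,\sigma_i + \frac{1}{2}\sum_{i\neq j} J_{ij}\,\sigma_i\sigma_j + V\!\Big(\sum_i \sigma_i\Big)\right],$$

with $\sigma_i = 1$ if neuron $i$ spikes in a time bin and $0$ otherwise. The fields $h_i$ match the single-cell spike probabilities, the couplings $J_{ij}$ match the pairwise correlations, and the potential $V(K)$ matches the probability that $K$ of the $N$ cells spike together-- presumably the one term described as unfamiliar.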
In the example I gave you with the birds, it was important that we could show that the model we wrote down actually worked, right? It's not enough to say, well, I want to write down the simplest model, or the least structured model that's consistent with certain expectation values. I have to convince you that I picked the right expectation values, that I picked the relevant variables. So I have to compute something to show you that I got it right.
When you start talking about 120 neurons, and there are no symmetries that relate the neurons to each other, it's not clear what you should check. But there's one thing that's natural to check. And that is that the model predicts, if you want, the energy of every possible state, that is to say, it's log probability. So you can just count how many states have energy less than E, or greater than E, whichever way you want to do it. And look at the cumulative-- well, you could look at the distribution of E in the data. Cumulative distributions are easier because then you don't need to make bins. And you can look at it in the data, and you can look at it in a Monte Carlo simulation of the model. And you see that they agree with each other very, very well.
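Concretely (an illustrative sketch; the energy function is whatever the fitted model assigns, up to the constant log Z): assign each observed binary word its energy, do the same for Monte Carlo samples of the model, and compare the two cumulative counts.

```python
import numpy as np

def cumulative_energy_counts(words, energy_fn, e_grid):
    """Number of states with energy less than E, for each E in e_grid.

    words     : (n_samples, n_neurons) binary array, either the recorded data
                or Monte Carlo samples drawn from the fitted model
    energy_fn : function mapping one binary word to E = -log P (up to log Z)
    """
    energies = np.array([energy_fn(w) for w in words])
    return np.array([(energies < e).sum() for e in e_grid])
```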
What's impressive is that this energy here corresponds to events-- states-- that should occur, on average, once in the experiment, which lasts a couple of hours. The agreement, however, extends all the way out to here, which is, in energy units, 15 more. Well, kT is equal to 1. So that means that the things that are happening out here should occur with probability E to the minus 15 in the experiment. Which means that if we did the experiment all year, instead of for an afternoon, these things still wouldn't happen very often. But, of course, there are many, many states out here.
And so what this shows you is that the distribution of the likelihood of those states is actually coming out right in the model, in the experiment. So there are many other things you can check. For our purposes, the question you might want to ask is, where are you in the phase diagram? And there are lots of ways of asking this question. But one very simple one is, let me scale up or down the correlations. Remember, this model is chosen to match the pairwise correlations, so let me scale those up and down.
And, I mean, if I wanted to, I could put a temperature out in front of the whole thing. But if I do that, then I'm really showing too much of my physicist prejudice. Because when I do that, every time I change the temperature, I change everything about the network. So let me keep the average probability of every cell generating an action potential fixed, change the strength of correlations, and then let me do what it says in the textbooks I should do, which is to measure the specific heat.
You'll recall that the specific heat is really just the variance of the energy. And you'll see that it has a peak as a function of this parameter in the phase diagram. And that peak is at 1, or very close to 1, which is the real system. And that peak, even when you normalize in the proper way-- so this really is a specific heat and not a heat capacity-- that peak is growing and sharpening and moving toward 1 as you look at more and more neurons, suggesting that indeed the real model is sitting near a critical surface.
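In equations, with E(sigma) = -log P(sigma) playing the role of the energy (so that kT = 1 for the real system), the quantity being plotted is essentially

$$c = \frac{1}{N}\left(\langle E^2\rangle - \langle E\rangle^2\right),$$

the variance of the log probability per neuron, evaluated in a family of models in which the pairwise correlations are scaled up or down while the single-cell spike probabilities are held fixed; the claim is that the peak of c sits at, or very close to, the unscaled correlations of the real data.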
So we don't have the whole phase diagram for any of these systems, but we do have hints that they're sitting near a critical surface. Now, in each of these cases, we've gotten at this question in different ways. And you might ask, can we do better? I would say-- not to belabor the points here-- the best thing that my colleagues are trying to do is to do an adaptation experiment. So if you think about the distribution of states taken on by the neurons in the retina, if you suddenly changed the statistics of the movies that the retina is watching, that distribution will go completely crazy. The distribution of neural states will go crazy in response to a sudden change in the statistics. And you know this because if you're sitting in a dark movie theater and you walk outside, you can't see for a moment, right?
Then the distribution relaxes back. It does not relax back to exactly the same thing. But it might relax back to something else that's sitting on the critical surface. And if we could show that that was true, that would show you that during this process of adaptation, the system was being actively pulled back to a surface whose structure we can actually calculate. And there are hints of that in the data, but we don't know for sure yet.
So let me end by talking about what this all means. I want to be sure you understand that all of the things I've talked to you about are driven by these tremendous advances in experiment. So all of the ideas that I've told you about in thinking about networks of genes in the embryo, interactions in flocks of birds, statistical mechanics approaches to networks of neurons-- these all have a theoretical literature whose precursors you can trace back for decades.
But suddenly, we have experiments against which these theories can be compared, if we're tough on ourselves and really try to make the comparison rigorous. And that's a fantastic thing and something to celebrate. Now, the question is, what are these data telling us? We think that what the data are telling us is that all of these different systems are somehow sitting near a critical surface. And that's also an old idea, but it's not clear why it should be true.
I think one of the things that's interesting is that if you are near a critical surface, then many of these problems about determining parameters go away. Because there's something about the behavior near criticality that we can calculate without knowing all of the microscopic parameters. And so, in a way, this is something of an answer to the problem that I started with.
It's also true that critical points are extremal in a variety of ways. For example, they have a peak in the specific heat, which is the variance of the energy. And the energy is minus the log of the probability. So what that means is that they are places where the dynamic range of how surprised you should be by what you see is very, very large.
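To make that explicit, with k_B = 1 and a fictitious temperature T scaling the learned energies, the identity being used is

```latex
E(s) \;\equiv\; -\ln P(s), \qquad
C(T) \;=\; \frac{\partial \langle E \rangle_T}{\partial T}
     \;=\; \frac{\mathrm{Var}_T[E]}{T^{2}},
\qquad
C(T{=}1) \;=\; \mathrm{Var}\big[-\ln P(s)\big]
```

so a sharp peak in C at T = 1 means that, for the real system, the distribution of surprise, -ln P(s), is unusually broad.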
Critical systems, of course, have divergent susceptibilities. So maybe part of the point of being critical is to be able to respond maximally to very small signals. It's also true that they have diverging correlation lengths. So maybe what's important is the ability to propagate information over very long distances without the accompanying rigidity of being solid.
And finally, although I haven't mentioned it here, critical points are almost always associated with long time scales. And the ability to translate between the microscopic time scales of activity in the nervous system and the macroscopic time scales associated with behavior is a real problem in understanding the neural basis of behavior. It's something that I'll say a few words about tomorrow.
So thank you. It's a pleasure to be here. And actually, to talk about criticality at Cornell was also sort of a special thing. So thanks very much.
[APPLAUSE]
SPEAKER 1: Want to take a question?
WILLIAM BIALEK: Please.
SPEAKER 2: In reference to the protein structure molecule--
WILLIAM BIALEK: Yeah. It went by kind of fast. Yeah.
SPEAKER 2: Could this method be used with a known protein structure [INAUDIBLE] that have been induced by a [INAUDIBLE] or drug. Could it be used as a [INAUDIBLE] model?
WILLIAM BIALEK: So, in fact, what we're doing now-- so the prediction that most people have been trying to make is to see if they can go from the sequence variation to the structure itself. The first successful example of that was to look at bacterial two-component signaling systems. So you have two proteins that you know touch each other. And you have many examples of each of them. Yeah, in fact, you have many examples in a single bacterium. And, of course, there are many bacteria.
So you can look at the co-evolution of the residues on the two sides. And by this sort of argument, it was possible to identify which residues are in contact. What Lucy and I are trying to do is to lean on known structures to really test this idea that local models are sufficient. The question you're asking is whether, in some sense, if I take this model, can I now imagine doing something at one end of the protein and using it to discuss, as it were, the propagation of information through the molecule.
And so there are ideas about doing that. This specific class of models, I'm not sure anybody is trying that. There are closely related models where people are trying to do that. It is a very good idea. And this somehow might be connected to what I was saying at the end, that it is a very special feature of proteins that you can poke them at many different places and have influences that are very far away. And so one way that could happen is they're just completely rigid. But that isn't the way it happens, right? It's still sort of soft, yet the information propagates over very long distances.
SPEAKER 3: So back to what you were saying before about adaptation experiments. So if you change the input to the system-- so I guess the retina-- would that launch the system on some sort of random walk until it reaches another point on the critical surface?
WILLIAM BIALEK: So if you imagine drawing a phase diagram in two dimensions, so you're cutting through the critical surface-- remember, the system-- I mean, this surely isn't like systems in which there's a critical point and it's simple. The correct description is higher-dimensional than that. So if you imagine there's a critical surface that you've cut through, and here's one line of that surface, and you're sitting here. Then at the moment that you suddenly switch the inputs, the distribution is going to move away from this point. And then the hope is that during the process of adaptation, it's going to come back. If it comes back to exactly the same point, you haven't learned anything. All you've learned is that adaptation managed to undo whatever change you made.
But if it comes back to a different point on the critical surface, then I think people would have difficulty arguing that criticality isn't important, right? Because I can calculate what this surface looks like. And in this rather high-dimensional space of models, the distribution should come back to land right on that surface. So that's the experiment that my colleagues are trying to do.
SPEAKER 4: Yeah. Well, I'm thinking about some of the flocking experiments. And from what you were saying in the last slide, you seem to be implying-- why is it good to be in this critical state? What the benefit is of having these long-range correlations across a large flock. But I guess my question is, how do you tease that out further? For example, one thing that comes to mind is [INAUDIBLE] anisotropically [INAUDIBLE] horizontal, versus fish, where they're more likely to be isotropic. So in those two examples--
WILLIAM BIALEK: Except for the fish that live in shallow water. But yes, go on.
SPEAKER 4: So I'm thinking, if you put [INAUDIBLE] a correlation that you have with the flock of birds to schools of fish-- [INAUDIBLE], which means this sort of thing.
WILLIAM BIALEK: So I don't-- good idea. It's also worth pointing out that different birds flock differently, and even physicists can tell the difference, right? We just don't know the names of all the birds. So the birds that my Roman colleagues were studying are European starlings, which make these kind of fluid flocks that you saw at the beginning. And this is in contrast, for example, to geese flying in formation, just to pick something on the far extreme, which really do look quite crystalline.
I don't know about schools of fish. The example that [INAUDIBLE] have been working on is swarming in midges, gnats, these annoying little insects that come out at dusk. And one of the things that's beautiful about that example is that if you look at a swarm of midges, it looks like they're all just moving around at random, except that they tend to stay in this neighborhood. But then if you actually measure carefully, you can discover that their trajectories are actually correlated. And, in fact, they're correlated across the entire swarm, beyond what you'd expect just from the fact that they stay in the neighborhood.
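A minimal sketch of the kind of measurement being described, for a single snapshot of a swarm-- subtract the collective motion and ask whether the remaining velocity fluctuations are correlated even between distant pairs. Array shapes and binning here are illustrative, not the published analysis:

```python
import numpy as np

def connected_velocity_correlation(pos, vel, r_bins):
    """Connected correlation of velocity fluctuations vs. pairwise distance.
    pos, vel: arrays of shape (N, 3) for one snapshot of the swarm."""
    dv = vel - vel.mean(axis=0)                      # fluctuations about the collective motion
    dv /= np.linalg.norm(dv, axis=1, keepdims=True)  # keep only directional information
    dots = dv @ dv.T                                 # fluctuation overlaps for every pair
    dist = np.linalg.norm(pos[:, None] - pos[None, :], axis=-1)
    iu = np.triu_indices(len(pos), k=1)              # each pair once, no self-pairs
    corr, edges = np.histogram(dist[iu], bins=r_bins, weights=dots[iu])
    counts, _ = np.histogram(dist[iu], bins=r_bins)
    return corr / np.maximum(counts, 1)              # average overlap in each distance bin
```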
In fact, if you look at different swarms, which are of different densities, and you make a model of things interacting with their neighbors, and so on, then if you have interactions that extend over a fixed distance, the density effectively becomes the control parameter that determines the degree of ordering. And they've been able to give a finite size scaling argument for how the density tracks as a function of the number of things in the swarm. And so there you seem to be able to do much more.
It's not clear what the best system is-- look, let me just be completely honest, right? The hope is that there's something more general we can say. But it's clear that in every one of these systems, we're boxed in by particulars. And we're trying to see if we can pull out the thing that's general. And one should still be skeptical. Yes.
SPEAKER 5: There was something about this example I didn't follow. Did you say that the experimentalists had a way of changing that variance parameter?
WILLIAM BIALEK: Oh, no. What I said--
SPEAKER 5: That you had the pictures that showed what happens if you go away from the criticality.
WILLIAM BIALEK: This. So you'll notice there's only one example of the data, and there are many examples of the model. So we can do that.
SPEAKER 5: I see.
WILLIAM BIALEK: But if you ask yourself, what does that correspond to doing? That is, as if you had found a flock in which the value of [? Q ?] was the same, but the variance of the speed was larger or smaller. So you could think about making a phase diagram, not in the parameters of our model, but in the observable-- do you measure versus [INAUDIBLE]?
SPEAKER 5: No, just wondering what that would look like if you were showing those pictures.
WILLIAM BIALEK: Ah, we should do that. OK. Yeah.
SPEAKER 5: And then the other thing regarding that-- in the slide you just flashed past us, there was a deviation for R, which you didn't mention. Presumably, that's not important, because it's just a finite size [INAUDIBLE].
WILLIAM BIALEK: So the treatment of the boundaries is important. I would say that the reason that we don't think-- well, OK. This is what it is. We're not trying to hide anything. If you do it on a different flock, some of them are a little better, and some of them are a little worse. I think what you should remember is that there are not very many birds which are this far apart. So although the error bar is small-- I mean, my cosmologist friends talk about cosmic variance, right? You don't have very many examples of that component. You've measured it very accurately. But whether it's typical for the distribution or not, you don't know.
SPEAKER 1: Let's take one last question.
SPEAKER 6: Yeah, so the technical framework that you describe is usually associated with equilibrium?
WILLIAM BIALEK: Yes.
SPEAKER 6: But [INAUDIBLE] systems, they're usually not equilibrium. Can you comment on why it works so well?
WILLIAM BIALEK: Equilibrium is death. Not my quote.
SPEAKER 1: Bill, you may want to repeat that.
WILLIAM BIALEK: Yeah. Yeah. So the question was, how is it that we're using a maximum entropy framework that we usually think of as being an equilibrium framework, and yet, we're describing a system that patently is not at equilibrium? Remember that maximum entropy, where the thing whose average you constrain is the mechanical energy, is only correct at equilibrium. But the things that we're constraining aren't the mechanical energy. They're just some correlation functions.
So the way to think about this-- and maybe this is actually a very good last question, because it's about the whole strategy. I have a very complex system. In principle, in order to describe it, I have to tell you lots and lots of things. So the maximum entropy framework is a framework for taking account of various things I might tell you about the system, in order. I tell you one thing, and I build the maximum entropy distribution that's consistent with that. Then I tell you another thing, and that produces another maximum entropy distribution which, of course, has lower entropy. I'm describing more and more order.
And the hope is that I can stop before too long. In some sense, the number of states in the system is exponential in the number of degrees of freedom. If I tell you as many numbers as there are states, I've determined the entire probability distribution. Well, that's a disaster, right? That's 2 to the 100 numbers for a hundred neurons-- we're done, right? But maybe I can tell you things which are maybe linear or quadratic in the size of the system. And maybe if I were really good, I could tell you some things that are only a fixed number. It doesn't matter how big the system is.
That's what we can do in the systems that we understand in condensed matter physics, right? We're not quite there. Well, with the flocks, maybe we're getting there. But the things that I tell you, the fact that you've built the least structured model consistent with the things I told you, doesn't mean the system is in equilibrium. In fact, it doesn't even mean the model is right. It just means that you're getting a good approximation. OK. All right.
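A minimal sketch of that hierarchy of constraints, for a toy system small enough to enumerate all of its states-- constrain only the means, then the means plus the pairwise correlations, and watch the entropy drop. The fitting scheme and numbers are illustrative, not the analysis used on the real data:

```python
import numpy as np
from itertools import product

N = 6                                               # small enough to enumerate all 2^N states
STATES = np.array(list(product([0, 1], repeat=N)), dtype=float)

def entropy(p):
    p = p[p > 0]
    return -(p * np.log2(p)).sum()                  # entropy in bits

def maxent_means_only(means):
    """Constrain only the mean of each variable: the maximum entropy answer is independence."""
    p = np.prod(STATES * means + (1 - STATES) * (1 - means), axis=1)
    return p / p.sum()

def maxent_pairwise(means, corrs, steps=5000, lr=0.1):
    """Also constrain the pairwise correlations: fit fields h and couplings J by
    simple moment matching (an Ising-like model; the scheme is a sketch)."""
    h, J = np.zeros(N), np.zeros((N, N))
    for _ in range(steps):
        E = -(STATES @ h) - 0.5 * np.einsum('si,ij,sj->s', STATES, J, STATES)
        p = np.exp(-E); p /= p.sum()
        h += lr * (means - p @ STATES)
        dJ = lr * (corrs - np.einsum('s,si,sj->ij', p, STATES, STATES))
        np.fill_diagonal(dJ, 0)
        J += dJ
    return p

rng = np.random.default_rng(0)                      # a made-up "true" distribution standing in for data
p_true = rng.random(2**N); p_true /= p_true.sum()
means = p_true @ STATES
corrs = np.einsum('s,si,sj->ij', p_true, STATES, STATES)

p1 = maxent_means_only(means)
p2 = maxent_pairwise(means, corrs)
print(entropy(p1), ">=", entropy(p2), ">=", entropy(p_true))   # each added constraint lowers the entropy
```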
SPEAKER 1: [INAUDIBLE] Thank you, Bill, for a great talk.
WILLIAM BIALEK: Pleasure.
[APPLAUSE]
Using examples from families of proteins, networks of neurons and flocks of birds, William Bialek of Princeton University describes successes constructing statistical mechanics models of biological systems directly from real data, March 16, 2015, as part of the Department of Physics Bethe Lecture Series.