Protein Folding Blog

Protein Structure: Overview

Amino Acids

You should know the basic structure of one and be able to identify the amine, the alpha carbon, the “R” side chain and the carboxyl group (acid). My goal would be for you to be able to recognize most of them. But the main thing I want you to learn is the categories: which are charged, which are hydrophobic, etc. There is a good chart in your book. Given an example, I would like you to be able to identify whether the "R" side chain is hydrophobic or what other important functional groups it has. Example: presented with serine:
you should be able to spot that the side-chain has an OH (top) on it. You could guess that it is a decent H-bond donor. You might also ask whether that OH is ever chemically attacked by a phosphate (it is, quite often).

Primary structure

Primary structure is nothing other than the sequence of amino acids that make up a protein. While I don’t want you to sit and memorize the structures (well, I kind of do, but it’s a lot of memorization), I do want you to get to know the categories as they are identified in the book or on Wikipedia. We haven’t gone into that yet. So far we have ignored the “R” groups.
Primary structure just gives you one long chain, like the chain of magnetic beads.
You should understand the chemistry of how amino acids connect to form peptides (dehydration synthesis). A typical peptide will include hundreds of amino acids. Notice that there is an amino terminus and a carboxyl terminus. There are no actual amino acids left after peptide bond formation. They are then called “residues.”

Secondary Structure

Secondary structure is the first step of how the chain of amino acids folds. This does not directly involve the R groups, though they will have an impact. The secondary structure is due to hydrogen bonds between the carbonyl oxygen of one residue and the amine hydrogen of another. Those are part of the backbone.
For our purposes, there are two main secondary structures:
ɑ helices and β sheets (often called “beta pleated sheets,” by non biochemists).
Alpha helices will be like the small helices we built with the magnet spheres. There will not be enough space in the center of the helix for other molecules, even water, to fit. They are “right handed” screws. Looking down the spiral from either direction, the spiral runs clockwise away from you.
When we used the yellow models, we saw three main features:
  • The helix is held together by backbone hydrogen bonds between a carbonyl oxygen and the amino hydrogen of the residue four positions farther along the chain.
  • This structure is fairly rigid.
  • This repeating structure places all the “R” side chains on the outside (they couldn’t fit in the center anyway). A subtle result of this is that residues 1, 4, 8 etc will be on the same side of the helix.
Here is a close up of a helix in the estrogen receptor
Notice the dotted lines indicating hydrogen bonds between a carbonyl in the backbone (little red sphere for oxygen) and the hydrogen of the amine on the loop in front of it. I have labeled two amino acids. There’s the ringed histidine (you probably are almost able to spot it by now) labeled H516 (that just means it’s the 516th amino acid residue in the primary structure). The one on the same side of the helix exactly one loop further to the right is lysine 520 (K520). It’s four residues farther along.
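If you want to check the "same side of the helix" idea with numbers, here is a minimal sketch. It assumes the textbook value of 3.6 residues per turn for an ideal alpha helix (so each residue sits 100° around the axis from the previous one). That number is an assumption I am bringing in, not something we measured with the models, and the function name is made up for illustration.

```python
# Assumption: ideal alpha helix with 3.6 residues per turn, i.e. 100 degrees per residue.
DEGREES_PER_RESIDUE = 100.0

def angle_between(res_a, res_b):
    """Angular separation (0 to 180 degrees) of two residues, viewed down the helix axis."""
    raw = (abs(res_b - res_a) * DEGREES_PER_RESIDUE) % 360
    return min(raw, 360 - raw)

# H516 and K520 are the two labeled residues in the estrogen receptor helix
print(angle_between(516, 520))  # four residues apart
print(angle_between(516, 518))  # two residues apart
```

Under that assumption, residues four apart land only 40° from each other (same face), while residues two apart land 160° apart (nearly opposite faces).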

Why do some amino acids form helices and some not?

The other common form of secondary structure is called a “beta sheet” or “beta pleated sheet.” (See below) Why do some chains form helices and some form sheets?

Dihedral angles:

These angles, also known as the “phi/psi” angles, denote the rotation around the bond between N and the alpha carbon (phi, ϕ) or the rotation around the bond between the alpha carbon and the carbonyl carbon (psi, ψ).
The key thing to know is that, while the atoms are free to spin around the single bonds, there is a preferred dihedral angle. That is generally dictated by the nature of the R side chain. You can force any amino acid (except proline) into the correct angles for an alpha helix. But, some are better at it than others. Alpha helices form because they are strings of amino acids that prefer the correct phi/psi angles. If you have a run of these alpha-helix-preferring residues, they form quickly into the correct structure.
Beta sheets are made up of residues that prefer a slightly different dihedral angle. Again, any amino acid except proline can fit into a beta sheet. But, some residues are better at it.
Beta sheets will be like those flat sheets we made, or at least the two strands we made, with the magnet spheres. As with the spheres, the sheet can be made either of antiparallel strands or of parallel ones. Also like the spheres, the details of the structure will be different in parallel and antiparallel sheets. However, the wonderful beads break down as a nearly perfect model at this point.
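For those who like to see how a dihedral angle is actually computed, here is a rough sketch. Given the positions of four atoms in a row, it returns the rotation about the bond between the middle two. For phi the four atoms would be the previous residue's carbonyl C, then N, alpha C, carbonyl C; for psi they would be N, alpha C, carbonyl C, then the next residue's N. The helper names and the test coordinates are made up for illustration; they are not taken from a real protein.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Dihedral angle in degrees for rotation about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    length = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / length for x in b1)
    # Remove the part of each outer bond that lies along the central bond,
    # leaving their projections onto the plane perpendicular to it.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))

# Made-up coordinates: the two outer bonds are at right angles when viewed down the middle bond
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

With real atomic coordinates you would run this once per residue for phi and once for psi, then see which residues cluster near helix-like or sheet-like angles.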


Folding of proteins and structure:

Recap: proteins are made as one long strand of hundreds of amino acid residues (the largest known has over 33,000 residues). Given that the average mass of an amino acid residue is 110 Daltons, a “typical” protein might have a mass of 50,000-60,000 Daltons and a really big one several million Daltons. We actually use the “kilodalton,” 1000 Daltons, as the unit of choice. So, 50,000 Daltons would be 50 kDa.
They are in a line and the sequence of residues is known as the primary structure. There is an amino-terminus and a carboxyl terminus. But, there are no actual amino acids left, since water was removed when the peptide bond was formed between each residue.
Secondary structure, for A.P. bio, is simplified to be alpha helices and beta sheets. There are more subtle things I want you to know. First and foremost, I want you to know that there are more subtle things. You might encounter a 3/10 helix or a pi helix (we may build them). You know that beta sheets can fold into a barrel, like the one we folded.
My wife recently solved one that is a “beta propeller.” It’s a cool variant on a beta sheet.
Secondary structure is mediated by the backbone carbonyls interacting with backbone amines.
The last thing about secondary structure I mentioned was that an alpha helix is easy to make with one side hydrophobic and the other hydrophilic by incorporating residues with hydrophobic side-chains at roughly every third or fourth position. This is called an amphipathic helix. Beta sheets have side-chains that alternate up/down along each strand.
Structures with lots of alpha helices tend to be soluble in water. Beta sheets, not so much.
Remember that, while the “R” groups or side-chains of amino acids are not directly involved in secondary structure, they influence the dihedral angles and thereby indirectly determine what will form.
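The mass arithmetic from the recap is easy to sketch. This uses the 110 Dalton average from above; the 500-residue protein is a hypothetical example, and the function names are mine.

```python
AVG_RESIDUE_MASS_DA = 110  # average mass per residue quoted in the recap

def estimate_mass_kda(n_residues):
    """Rough protein mass in kilodaltons from its residue count."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000

def waters_released(n_residues):
    """One water leaves per peptide bond, so a chain of n residues released n - 1."""
    return max(n_residues - 1, 0)

print(estimate_mass_kda(500))     # a hypothetical 500-residue protein
print(estimate_mass_kda(33_000))  # roughly the largest known protein
print(waters_released(500))
```

The 500-residue case comes out at 55 kDa, right in the "typical" range, and the 33,000-residue case lands in the millions of Daltons.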

Tertiary structure:

This is the first aspect of structure that is mediated directly by the “R groups” or side-chains. The side chains influence the preferred dihedral angles and therefore influence secondary structure. But tertiary structure results from how helices and sheets come together in a 3-D shape.
Tertiary structure may be mediated by hydrophobic interactions, leaving the hydrophilic faces on the surface if the protein is found in water. But, there may also be “salt bridges” formed by interactions between negatively charged (Acid) side chains on one section with a positive (Basic) side-chain on another.
There also are covalent interactions, where two sulfhydryl-containing side chains form a disulfide link. Two cysteine residues that may be very far apart in the primary structure may fold so that the structures in which they reside come close together. They can then form a covalent link:
This is an oxidation, as two electrons are removed. It looks like you would get hydrogen gas out. But, usually, the electrons go somewhere else.

Quaternary Structure:

This is when two or more folded polypeptide chains interact to form a larger structure. These are mediated by the same interactions as found in tertiary structure.

Lipids and such.

  • Your Bio book has intro to lipids starting on page 74. It looks at most of what I cover here.
  • We didn't get to cis and trans double bonds. But, this includes a description.
  • Outline

  • You must know the classes of lipids, how to spot saturated and unsaturated fatty acids and trans and cis double bonds, and know how those things affect the interactions among fatty acids. You must be able to read a “shorthand” structure (applies to proteins and sugars too). You must be able to identify a mono, di or triglyceride (really "mono, di or triacylglyceride"). Among diacylglycerides, identify a phospholipid and describe how phospholipids form a lipid bilayer. You should know that the fatty acids become attached to glycerol via a dehydration synthesis step, similar to what happens with both peptide bonds and glycosidic links. You must know that the membrane of the cell, the plasma membrane, is made primarily of phospholipids (it also includes lots of proteins, cholesterol and other stuff). Phospholipids are probably the most important class for our purposes. We will talk about them again when we do membranes.
  • Note, most images below are taken from Wikicommons. The others were constructed with the program “chemdoodle.”
  • Lipids

  • We break lipids into two classes that don’t look a lot alike, but are both hydrophobic. The first are called sterols, which are based on this structure.
  • There are four carbon rings, three of which have six carbons and one with five. There is a hydroxyl at the end. That’s what makes it an alcohol (ol ending).
  • simplesterol
  • The most well known of the sterols and most abundant in you is cholesterol, which looks like this:
  • Cholesterol
  • While you have heard that cholesterol is bad in your diet, it actually is an important molecule you need to live. You make it in your body, as do all animals. In addition to cholesterol, all the steroid hormones are based on the sterol molecule (for example, testosterone and estradiol, which we saw in the essay on structure).
  • That’s pretty much all you need to know about sterols for now. We will revisit them when we look at hormones.
  • CisFA
  • Fatty Acid-based lipids

  • As noted above, those structures are Fatty acids. These are the components of the other class of lipid. They comprise a chain of hydrocarbon with a carboxyl (acid) group at the end. We start counting carbons at the carboxyl group. The one above has 18, as noted.
  • Saturated or Unsaturated

  • These terms originally referred to whether a fat could accept more hydrogens into its structure. However, what that means structurally is whether it has any double bonds. Recall that carbon must make four bonds total. At the site of the double bond (carbons 9 and 10) in the middle of those structures above, the two carbons have only one H each. If we break the double bond, we would have to add one more hydrogen atom to each. So, that bond is “unsaturated.” This would be known as a monounsaturated fatty acid. In the popular media, that’s usually shortened, incorrectly, to “monosaturated.”
  • In contrast, a saturated fatty acid has no double bonds.
  • Cis and Trans

  • Cis and Trans ONLY apply to positions where there are double bonds…that is, unsaturated bonds.
  • Note that the bottom structure has a big kink in it whereas the other one is fairly straight, like a saturated chain.
  • Pasted Graphic 3
  • That’s because the carbons cannot rotate around the double bond and you therefore have two different ways to arrange the bond: the long carbon chains on the same side (both down in this case) of the double bond. That’s known as “cis” and results in the kink.
  • Or, the long chain on one side goes “up” and the one on the other goes “down.” That is known as “trans” (opposite directions) and results in a fairly straight molecule.
  • Trans fatty acids generally are not found in biology. The Cis fatty acids are important because of the kink. The kink keeps the fatty acids from sticking together as well and lowers the melting point of the fat. Plant oils (not from the tropics) tend to have CIS unsaturated bonds and are liquids at room temp. Animal fats and tropical plant fats tend to have saturated fatty acids and be solid.
  • Trans fats occur almost any time you chemically treat (or even heat) fatty acids with double bonds.
  • Mono-, Di- and Triglycerides

  • These are all fatty acids attached to glycerol, a three-carbon chain with an OH on each carbon.
  • The Mono, di and tri refer not to the number of glycerols, but the number of fatty acids stuck to the glycerol. I know…dumb naming scheme.
  • Once the bond is formed, since an OH is taken off the acid and an H off the hydroxyl, it is no longer a fatty acid (it’s a fatty ester).
  • The synthesis is another example of dehydration synthesis.
  • Pasted Graphic 12
  • This is a triglyceride, also known as a “Fat.” It’s primarily for storing fatty acids for use in membranes or for energy. In this case, there are three very different fatty acids on the glycerol. You can also see the alternate numbering of the “alpha” and “omega” carbon. But, again, don’t worry about that.
  • One misleading thing in this structure is that the double bonds are Cis, but the person drawing it has left out the kinks.
  • Phospholipids

  • These are the main components of the cell membrane, and any membrane within the cell. They allow us to build cells with an outside and an inside, as well as internal compartments, transport vesicles (that weird “bag” the Kinesin molecule was dragging in the movie).
  • They comprise glycerol with two fatty-ester chains. On the end position of the glycerol, there is a phosphate, which is then in turn connected to some other hydrophilic group (such as the amino acid, serine or the ionic structure, choline). The thing below is phosphatidylcholine:
  • Phospholipid
  • You see the two fatty acid (ester) chains going to the right and angled to the right. You should see the phosphate.
  • The key point is this: the part on the left is VERY attracted to water and the fatty acids avoid water at all costs. If you get a bunch of things like this together, they will arrange in the only way that allows each part to be where it wants. They will line up in two layers, fatty acid tails pointing toward each other and lined up alongside, with the hydrophilic part out.
  • This depiction from Wikipedia LipidBilayer
  • is a good one because it shows how thick the membrane is. It is bad because it doesn’t tell you what comes after the phosphate. Remember, though, I told you that varies all over the place. It just has to be hydrophilic. The completely hydrophobic (dehydrated) area is about 3.5nm thick. A nm (nanometer) is 1/1,000,000 of a millimeter. The membrane is about 35 carbon atoms thick.
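Going back to saturated versus unsaturated fatty acids: the hydrogen counting can be sketched in a few lines. A saturated, unbranched fatty acid with n carbons has the formula C(n)H(2n)O2, and each double bond costs two hydrogens. The function names here are mine, and "polyunsaturated" (more than one double bond) is a standard term I did not define above.

```python
def fatty_acid_formula(carbons, double_bonds=0):
    """Molecular formula of an unbranched fatty acid.

    A saturated chain is C(n)H(2n)O2; each C=C double bond removes two hydrogens.
    """
    hydrogens = 2 * carbons - 2 * double_bonds
    return f"C{carbons}H{hydrogens}O2"

def classify(double_bonds):
    """Name the saturation class from the number of double bonds."""
    if double_bonds == 0:
        return "saturated"
    return "monounsaturated" if double_bonds == 1 else "polyunsaturated"

print(fatty_acid_formula(18, 0), classify(0))  # 18 carbons, no double bonds
print(fatty_acid_formula(18, 1), classify(1))  # 18 carbons, one double bond
```

Note that the formula alone cannot tell you whether a double bond is cis or trans; that takes the actual structure.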

Energy Blog

(If you want to skip my "capacity to cause pain" intro in which I imagine throwing things at you, you can jump down to "Let's make a law" below).
What is Energy?
This is trickier than it may seem, but it's very important that we have some understanding. One common definition is the capacity to do work. This begs the question: "what is 'work'?" Well, if you look that up, you find that work is a form of energy...that didn't help much.
I prefer to think of energy as "the capacity to cause pain." Yes, that's something of a joke, but it works better than you might think. Remember, the goal here is to come up with some way to relate things to each other. Pain is pain...we can all relate to it, no matter what causes it.
Imagine this: I'm standing before you with a spherical object I plan to throw at you; what are the things that matter to you? As the object is flying at your head, what determines how scared you are?
  1. how fast is it coming in at my head? (meters/second )
  2. how much does it weigh (or more properly, how much mass does it have, since it would hurt just as much if you were in space and the object were 'weightless'). Mass is in kg.
Which do you care more about, the mass or the speed? This might not be so obvious, but some objects with low mass can cause a great deal of pain if they are traveling fast enough (e.g. think of a bullet). So, velocity matters more than mass. Based on this we can come up with an equation for how much pain the object will cause: KE=½mv², where "KE" stands for the pain it causes. Actually, "KE" means "Kinetic energy," or the energy of motion. The reason for the 1/2 is not something I'm going to address here.
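Here is a quick sketch of why speed matters more than mass in that equation. The object and its numbers are made up; the point is only the ratios.

```python
def kinetic_energy(mass_kg, speed_m_per_s):
    """KE = 1/2 * m * v**2, in kg*m^2/s^2."""
    return 0.5 * mass_kg * speed_m_per_s ** 2

base = kinetic_energy(1.0, 10.0)          # made-up object: 1 kg at 10 m/s
double_mass = kinetic_energy(2.0, 10.0)   # same speed, twice the mass
double_speed = kinetic_energy(1.0, 20.0)  # same mass, twice the speed

print(double_mass / base)   # 2.0
print(double_speed / base)  # 4.0
```

Doubling the mass doubles the pain; doubling the speed quadruples it.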

The units of pain
It's worth thinking about the units for this measure of capacity to cause pain. Mass is in kg (kilograms) and velocity is in meters/second. So the units of this pain, KE, are kg·m²/s².
More Pain
Okay, let's suppose this time, instead of throwing the object at your head, I'm holding it over your head and I plan to drop it (I'm really turning out to be a mean S.O.B. aren't I). Now, what matters to you?
    1. What is its mass? (kilograms, kg)
    2. How high is it above your head? (meters, or "m")
    3. What planet are we on?
Did that last one surprise you? Well, unlike the last example, it will matter whether we are in space, or on the moon, or here on Earth. If we are in outer space and there is very little gravity, I can drop something massive over your head and it will just float there. No worries. Gravity (g) is described in terms of acceleration, which is in units of m/s². So, I can write an equation for this pain: PE=mgh, where "m" is mass (in kg), "g" is acceleration due to gravity in m/s², and "h" is height (in meters). Here's something interesting: the units of this capacity to cause pain are again kg·m²/s², the same as the units when I am throwing something at your head. Hmmm...seems like I have the beginnings of a relationship here.
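To see how much "what planet are we on?" matters, here is a small sketch of PE=mgh. The surface-gravity numbers are approximate reference values I am supplying, and the 5 kg object held 2 m up is hypothetical.

```python
# Approximate surface gravity in m/s^2 (reference values, not derived in this essay)
GRAVITY = {"Earth": 9.81, "Moon": 1.62}

def potential_energy(mass_kg, height_m, g):
    """PE = m * g * h, in kg*m^2/s^2."""
    return mass_kg * g * height_m

for place, g in GRAVITY.items():
    # hypothetical 5 kg object held 2 m above your head
    print(place, potential_energy(5.0, 2.0, g))
```

Same object, same height, but on the Moon it carries only about a sixth of the capacity to cause pain.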

What else causes pain? Well, I could slap you...but that would probably be against the law. Besides, it's the same as the first case: how fast is my hand moving (squared) times how massive my hand is.
It Burns!!!
Another thing that causes pain is when something is very hot. The connection to the other cases might not be so obvious. But, let's suppose for a moment that the hot thing I'm touching is made of particles (I'll give you evidence for that soon) and temperature is related to how fast these particles are moving. Even in a solid, they are vibrating. The higher the temperature, the faster they are moving (some of you may already know that temperature is a measure of average Kinetic Energy). I can give you evidence to support this soon too, but it probably seems intuitive.
So, why does a thing at high temperature hurt? Because you are being hit by particles...yes, they are small, but there are a lot of them and they are moving very fast. In fact, the average velocity of molecules in air at room temperature is about 500 m/s (about 1100 miles per hour). Temperature is a measure of the kinetic energy of the particles. So, again, we have the pain being related to ½mv² and we have units of kg·m²/s² yet again.
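You can sanity-check that 500 m/s figure. The sketch below assumes the standard kinetic-theory relation that the average kinetic energy of a gas molecule is (3/2)kT, where k is the Boltzmann constant; I have not derived that here. It gives the root-mean-square speed, which is close to, but not exactly, the plain average speed.

```python
import math

BOLTZMANN = 1.380649e-23   # J/K
AMU_KG = 1.66053907e-27    # kg per atomic mass unit

def rms_speed(mass_amu, temp_kelvin):
    """Root-mean-square speed from (1/2) m v^2 = (3/2) k T."""
    mass_kg = mass_amu * AMU_KG
    return math.sqrt(3 * BOLTZMANN * temp_kelvin / mass_kg)

# Nitrogen (N2, about 28 amu) makes up most of air; 293 K is about room temperature
print(round(rms_speed(28.0, 293.0)))
```

The result comes out a bit over 500 m/s, in line with the number quoted above.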
Work, work, work:
So, we really have established the units of Energy, or, the capacity to cause pain. Okay most people talk about energy as capacity to do work. So, how should we define work? One type of work is when you lift or move something. Suppose you run out of gas and have to push your car. If you push on it, you are applying some force, so work must involve force. But, if you don't succeed in moving it, you haven't actually accomplished any work (in terms of moving the car, anyway). You have to move it some distance in order to do work. The farther you move it, the more work you have done. So, work is defined as force times distance: the amount of force you are applying, times the distance over which you apply it. You may remember from physics, F=ma (mass times acceleration), which means units of force are kg·m/s² (we give this a special name, 1 kg·m/s² = 1 Newton, or 1 N). If I multiply that times the distance I move the object, I get kg·m²/s², the same units we had above. So, these really are the units of energy. If that's the case, maybe we should give those units their own name, rather than having to write kg·m²/s² all the time. So, 1 kg·m²/s² = 1 Joule. Joule is abbreviated "J" and comes from the name of another famous dead guy, James Joule (you will become very familiar throughout the year with the “Famous Dead Guy” rule for naming units).
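The car-pushing example, as a tiny sketch. The 400 N push and the 20 m distance are made-up numbers.

```python
def force(mass_kg, accel_m_per_s2):
    """F = m * a, in newtons."""
    return mass_kg * accel_m_per_s2

def work(force_n, distance_m):
    """Work = force * distance; newtons times meters gives joules."""
    return force_n * distance_m

push = 400.0  # hypothetical push on the car, in newtons
print(work(push, 0.0))   # car never moves
print(work(push, 20.0))  # car rolls 20 m
```

Push as hard as you like: if the distance is zero, so is the work.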
Let's make a law:
Or, let's make two. For now, we are going to do so without any theoretical underpinnings. We are just going to base it on observations.
  1. First, let's say that energy does not get created or destroyed in normal processes, just converted from one form to another. This law is called the first law of thermodynamics.
  2. Energy will tend to distribute evenly in the space available to it. This is one way to state the second law of thermodynamics. In its most general form, it says that a quantity called “the entropy of the universe” will always increase. Some books refer to entropy as a measure of disorder. But, that’s not really a good way to look at it. There is a very useful statistical definition of it I will save for later. For now, increased entropy means that the energy and mass have become more evenly and randomly distributed.

A bowling ball at the top of a cliff has potential energy (PE=mgh). If I let it fall off the cliff, the potential energy decreases as the height drops. So, where does it go? Well, the ball moves faster, which means that its kinetic energy (KE=½mv²) is increasing. It turns out that the total increase in KE exactly matches the loss in PE. The reason I stressed "total" is that some of it goes to heat up particles in the air and the ball through friction. If the temperature of the air goes up, that's an increase in KE also. This fits with our law that energy is not created or destroyed, just converted between types of energy. Can you think of other examples?
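Here is the bowling ball as a sketch, in the idealized case where none of the energy goes into heating the air (so all of the lost PE shows up as KE of the ball). The 7 kg ball and the 20 m cliff are hypothetical, and 9.81 m/s² is an approximate value for Earth.

```python
import math

G_EARTH = 9.81  # m/s^2, approximate

def speed_after_fall(height_m, g=G_EARTH):
    """Speed at the bottom if all lost PE becomes KE of the ball (no air friction)."""
    # m*g*h = 1/2*m*v^2, so the mass cancels and v = sqrt(2*g*h)
    return math.sqrt(2 * g * height_m)

mass, height = 7.0, 20.0  # hypothetical 7 kg bowling ball, 20 m cliff
v = speed_after_fall(height)
pe_lost = mass * G_EARTH * height
ke_gained = 0.5 * mass * v ** 2
print(v, pe_lost, ke_gained)
```

The last two numbers match, which is the first law at work. In the real world the ball's KE would come out slightly lower, with the difference showing up as warmer air.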
Suppose that you have a tank of water and you stir it up and make some waves in it. You have just put energy into it, right? All the particles will be moving, so they will have Kinetic Energy (KE). Some will be in the tops (crests) of the waves. Particles in the crest of a wave will be above the level of the water, so they will have gravitational PE, and will tend to fall back to the flat level. Over time, the water will calm back down and be flat on the surface.

What happened to the energy? Is it gone? Actually, no, all the energy will be in the form of KE uniformly distributed in the water, measured by its temperature. So, if our first law is right, the energy increase resulting in the gain in temperature (kinetic energy of the molecules) will be the same as the energy that you put into making the waves in the first place (James Joule, the famous dead guy for whom the unit of energy is named, actually determined that the increase in temperature corresponded to the input kinetic energy). So, the energy is still there; it's just harder to recognize because it has become evenly distributed in the tank. That it tends to do that is our second law. It’s also worth noting that once the energy is evenly distributed, it is no longer very useful. You cannot do work with it.

The second law basically suggests that there is a "landscape" of energy analogous to the surface of the water, and it will tend to get flat and even over time. If there is one place that has more energy (kinetic or potential) than a spot next to it, the energy will tend to flatten out so that there are no peaks and valleys unless there is some barrier that keeps that from happening.

Think of some simple examples of this. Suppose you have a hot brick sitting in the middle of a room. Over time, the brick will cool down and the room will warm up. Eventually they will be the same temperature, which means all the particles will have the same average kinetic energy. There is no loss of energy here. At the start of the experiment, the particles in the brick had higher KE than those in the air in the room. They end up with uniform KE. Again, you go from energy concentrated in one location to energy evenly distributed.

Here's another one: suppose there is a ball sitting on the top platform of a ladder. It has potential energy. Even a modest nudge will push it off, and the ball will drop. When it settles on the ground, the potential energy is gone, converted to kinetic energy, eventually passing to kinetic energy of the particles the ball interacts with. The air particles and the molecules in the floor will each increase in kinetic energy (temperature). That is, the ball falling will convert the potential energy that resided entirely with the ball into kinetic energy dispersed among the particles in the air and floor (some of which is sound).
So, as was the case for the waves, high potential energy is an unstable state. Things at high PE will eventually fall to a state of lower PE because, when they do, they distribute the energy more evenly.
In the example of the ball, it only takes a small nudge to knock it off the ladder. Thus, chances are it will fall sometime. Now, if the ball were in a bucket glued to the top of the ladder, it would take more than a small nudge and it would not be as likely to fall. So, just because something could go to a lower potential energy, doesn't mean it will. The sides of the bucket represent a barrier to the energy distributing evenly.
Things in a high-energy state will tend to move to a lower energy state, so that energy will become more evenly distributed.
This is the big principle we have been working toward. This ultimately is why things happen.
Analogous to gravitational potential energy, there is a thing known as chemical potential energy. And the same things apply to it. Why does gasoline burn? Because when it does, the chemicals end up at a lower state of PE (like the ball falling) and pass the difference in energy onto the surroundings as heat (Some of it can be used to do work, like moving your car).
That’s another big theme: even though there is no net loss of energy, the transfer of energy can be used to do work. Once it is all evenly distributed, there’s nothing you can do with it.

The Big Rule

Energy will tend to distribute evenly (entropy will increase). Because particles carry kinetic energy (unless they are at absolute zero), the distribution of particles will tend to become random over time because that distributes their energy. Why doesn’t everything fly apart into random distributions? Well, the universe as a whole seems to be heading that way. But, locally, things can stay fairly ordered-looking. Biological systems in particular seem very ordered.
It turns out there is another way to distribute energy more randomly: form bonds. You see, forming bonds always releases enthalpy, which can be loosely thought of as heat, to the surroundings. Conversely breaking bonds always requires input enthalpy.
This seems like forming bonds should be favored by our laws. If I release heat to the surroundings, I impart that energy to the surrounding particles, which can then distribute that energy more randomly.
But…there is a tension here.
Let’s consider the melting of ice or freezing of water. Most of you did a lab in the beginning of last year in which you saw that as water froze, heat was released to the surroundings. That’s consistent with what I just said: forming bonds releases enthalpy. That’s good because that released heat gets to distribute more randomly. But, forming bonds also restricts the motion of the molecules (in this case, water molecules become ice). That’s bad, because whatever kinetic energy the water molecules have gets pinned down in the crystal and can no longer distribute as widely.
On the other hand, when ice melts, the particles get to distribute more widely, carrying their energy and distributing it more randomly. That’s good. However, breaking the bonds of the ice crystal requires input enthalpy. That takes energy from the surroundings, cooling it down, and concentrates that energy as higher potential energy in the water molecules. That’s bad.
So, which one wins? Is it more energetically favorable for the bonds among the water molecules to form, releasing enthalpy to the surroundings, but constricting the molecules themselves? Or, does the freedom of the water molecules to move around win out? Does ice melt spontaneously or does water freeze spontaneously? The answer is, it depends on the temperature. And if you understand why, you understand a lot about energy.
At temperatures above 0 °C (273 K), pinning the molecules down is more costly. The higher the temperature, the more kinetic energy the molecules have, the more they are able to break the bonds holding the ice together, or, more correctly, the greater the benefit to letting them move freely. At temperatures below 0 °C, however, the cost of taking enthalpy from the surroundings and using it to break the bonds is not offset by the benefit of releasing the molecules.
Notice that at really low or really high temperatures, you are likely to have all ice or all water. But, right at 0 °C, you can have both ice and water. And, you can nudge it one way or the other by making relatively small changes to conditions. That’s because you are right at the point where the two players in the tug of war are equal to each other. This will be an important consideration in biology too.
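The tug of war can be put into numbers with the standard free-energy relation ΔG = ΔH − TΔS, which I have not formally introduced yet, so take this as a preview. The sketch assumes an approximate textbook enthalpy of fusion for ice (about 6010 J/mol) and assumes that neither ΔH nor ΔS changes with temperature, which is only roughly true.

```python
DELTA_H_FUSION = 6010.0   # J/mol absorbed when ice melts (approximate textbook value)
T_MELT = 273.15           # K, i.e. 0 degrees C
# Entropy gain on melting, chosen so that the two sides balance exactly at 0 degrees C
DELTA_S_FUSION = DELTA_H_FUSION / T_MELT  # J/(mol*K)

def delta_g_melting(temp_kelvin):
    """Free-energy change for ice -> water: dG = dH - T*dS. Negative means melting wins."""
    return DELTA_H_FUSION - temp_kelvin * DELTA_S_FUSION

for t in (263.15, 273.15, 283.15):  # -10, 0 and +10 degrees C
    print(t, round(delta_g_melting(t), 1))
```

Below 0 °C the result is positive (freezing wins), above 0 °C it is negative (melting wins), and right at 0 °C it is zero, which is why a small nudge can tip things either way.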
We will come back again and again to breaking and reforming interactions. It is absolutely imperative for life that virtually all the reactions in our body be reversible, favoring one direction under some conditions and the other direction when those conditions change. We eke out a living at the margins of free-energy differences, always paying for it by heating up the surroundings (releasing enthalpy) satisfying the rule that we must increase the entropy of the universe.

A biological Example:

By now you have heard “DNA contains the information that specifies living things.” Or perhaps you’ve heard it called a “blueprint.” You almost certainly have seen representations of it as a “double helix.” You may even have heard that “A” binds with “T” and “G” binds with “C.” (The four “bases,” Adenine, Thymine, Guanine and Cytosine, are abbreviated by their initial letters.) But, how does DNA encode information? Why do “A” and “T” pair and not “A” and “C”?
Well, for one thing, adenine sometimes will bind with cytosine…just poorly and less stably. This is one way that mistakes, or mutations, happen when DNA is copied. The reason A binds with T most of the time and better than it would bind with C is simply because the change in energy is more favorable. The enthalpy released to the surroundings is greater when A binds with T than when A binds with C. It turns out that G binding with C releases even more energy.
The energy involved is very much along the same lines as that involved when ice freezes. In both cases, the bonds are hydrogen bonds, either among water molecules or between the bases of DNA.
If the analogy is to hold up, what do you think will happen to a pair of bases, A-T or G-C, if we raise the temperature? If the two strands are held together because of the same principles that hold water together as a liquid, or ice as a solid, what should happen to a double helix of DNA if I raise the temperature?
If you answered: “the bonds should break and the two strands of the helix should come apart,” you are correct. We even call this “melting” the DNA.
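One way to get a feel for why some stretches of DNA melt more easily than others is to count hydrogen bonds. An A-T pair is held by two hydrogen bonds and a G-C pair by three; those counts are standard facts I have not stated above. The sketch below is only a crude proxy, since real melting behavior also depends on other interactions, and the sequences are made up.

```python
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}   # hydrogen bonds each base makes with its partner
PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(strand):
    """Base-by-base partner strand (written in the same left-to-right order)."""
    return "".join(PARTNER[base] for base in strand)

def hydrogen_bonds(strand):
    """Total hydrogen bonds holding this strand to its partner."""
    return sum(H_BONDS[base] for base in strand)

print(complement("ATGC"))
print(hydrogen_bonds("ATATAT"), hydrogen_bonds("GCGCGC"))
```

Two made-up six-base stretches of the same length differ by six hydrogen bonds, so you would expect the G-C-rich one to need a higher temperature to come apart.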
This is just a first brush against a set of principles we will return to often in the coming weeks. The answer to questions like: “why do proteins fold up into their active shapes?”; “how does the muscle contract?” and many more will come down to the same simple principles, though applied in a rather complex pattern.

Link to old document

  1. The enantiomer document also suggested you revisit the "Why structure is important" essay. So, here is a link to it.


Polymers of sugar

or, polysaccharides.

A monomer of sugar, with the empirical formula CH2O, is called a simple sugar. For our purposes, hexoses (6-carbon sugars) including glucose and fructose are most important here. Pentoses such as ribose in RNA and deoxyribose in DNA will be dealt with later.
Sugars have a carbonyl on one carbon and hydroxyls on the others. The carbonyl can be on the end (an aldose, such as glucose), or not on the end (a ketose, such as fructose). We often draw them in a linear form. But, in water, they mostly are not linear; they close up into rings.

Sugar Cyclization.

Here is the linear form of glucose:
Pasted Graphic 8
and another view in ball-and-stick model.
Pasted Graphic 9
Carbon 1 is on the right, the carbonyl.
The cyclization is just a rearrangement of the atoms in the molecule. No atoms are lost (e.g. water is not released).
The dashed-line bonds are bonds that would go slightly back into the page and the dark wedge-shaped bonds would come forward out of the page, as indicated in the 3-d view.
The chemistry of the interaction, if you care, is that the carbonyl carbon is very electron poor (the brutish oxygen is stealing its electrons). The lone pairs on any of the alcohol (OH) group oxygens could in principle initiate a reaction with the carbonyl carbon (alcohols and aldehydes often react, so when you put a sugar in solution, it reacts with itself). The most stable ring is formed when the OH on carbon 5 attacks the carbonyl carbon. As the bonds are exchanging, that oxygen ends up bridging carbon 1 and carbon 5 AND having a hydrogen bonded to it…. This intermediate is unstable. Both the oxygen and carbon 1 are making too many bonds, which cannot stay. So carbon 1 has to lose one bond to what had been the carbonyl oxygen. That leaves the carbon looking good, but we have one oxygen (in the ring) with one too many bonds (+1 formal charge) and the other, on carbon 1, with not enough bonds (-1 formal charge). The more stable structure trades the hydrogen off the ring oxygen (originally from carbon 5) to the former carbonyl oxygen on carbon 1.
For those keeping score, this adds one more asymmetric carbon (carbon 1) and so one more pair of possible isomers, which we call the alpha and beta forms, as we see in starch and cellulose, respectively.
It seems like such a small thing: does the OH on carbon 1 stick straight out of the plane of the ring (called “axial”) as below:
Pasted Graphic 7
Or does it stick out more in plane with the ring (“equatorial”), as it does in this form:
Pasted Graphic 10

But, it is a big deal. The top form is called alpha and the bottom is called beta.


Polymerization is carried out by an enzyme that joins carbon 1 of one ring with carbon 4 of another. It’s a dehydration synthesis: the net result is that the OH on carbon 1 leaves with an H from the hydroxyl on carbon 4 to form water, and the oxygen left on carbon 4 bridges the two rings. The bond can be hydrolyzed by a different enzyme. I said animals lack the enzymes to deal with the beta form of glucose…But, actually, I was wrong about that, to some extent. We do have a (poor) beta glucosidase…in our tears. It’s called “lysozyme” and it’s part of our defense against bacteria. Also, a sugar similar to glucose (another hexose/aldose with the formula C6H12O6) called galactose pairs in its beta form with glucose to form a disaccharide called lactose (perhaps you’ve heard of it?). At least as infants, we make a beta galactosidase protein called “lactase.” Most of us don’t make that when we get older and so we have some level of lactose intolerance.
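The bookkeeping of dehydration synthesis is easy to check for yourself. Here is a small Python sketch (my own illustration, with made-up function names) that assumes an unbranched chain of hexoses: every link between two sugars releases one water, so a chain of n sugars has lost n - 1 waters.

```python
from collections import Counter

# Atoms in one hexose (glucose or galactose) and in one water.
HEXOSE = Counter({"C": 6, "H": 12, "O": 6})
WATER = Counter({"H": 2, "O": 1})


def polymer_formula(n_units: int) -> Counter:
    """Atoms in an unbranched chain of n hexoses joined by dehydration synthesis."""
    if n_units < 1:
        raise ValueError("need at least one sugar")
    total = Counter()
    for element, count in HEXOSE.items():
        total[element] = count * n_units
    # One water is released for each bond formed; n sugars need n - 1 bonds.
    for element, count in WATER.items():
        total[element] -= count * (n_units - 1)
    return total


def format_formula(atoms: Counter) -> str:
    """Write the atom counts in the usual C, H, O order."""
    return "".join(f"{element}{atoms[element]}" for element in ("C", "H", "O"))


print(format_formula(polymer_formula(1)))  # C6H12O6
print(format_formula(polymer_formula(2)))  # C12H22O11
print(format_formula(polymer_formula(4)))  # C24H42O21
```

The two-sugar result, C12H22O11, is the formula of lactose. Run it backwards (add the waters back) and you have hydrolysis.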

But, in general, we deal poorly with beta forms of glucose in polymers.

Pasted Graphic 3
This is a representation of cellulose (also known as “Fiber”) from Wikipedia, the same one we looked at today. It is a main structural component of plants, in what is known as the “cell wall”. Note that there are four VERY short chains linked through the oxygens in a beta 1-4 link. The actual chains would be much longer.
Notice that the oxygen on carbon 1 (right-hand carbon in each ring) is sticking out to the side of the ring. The net result of this is that the sugars alternate orientation. You can see this best by looking at the carbon 6, which is sticking out of the rings, either up or down, alternating.
As a result of this, there are hydrogen bonds both within each chain (from each sugar to the next) and to the chain running alongside it. Seems like a recipe for something fairly strong, but flexible (chains can slide along each other under stress, simply making new hydrogen bonds).
This is a great example of how details of the fine structure explain the behavior of the larger structure. Wood is essentially made of single fibers, all cross-linked in many directions, which allows at least some sliding, and therefore bending.

Compare it to this, which is starch (we can digest that).

Pasted Graphic 5

In this case, you see the O off carbon 1 sticking down, out of the plane of the ring. This leads to all the sugars more or less orienting the same way, no great ability for hydrogen-bond cross-linking, and a sort of slow spiraling of the strand.

Key properties of polysaccharides for AP:
  1. Cellulose and starch (and the slightly branched “amylopectin”) are made by plants.
  2. Glycogen is made in animals (mainly in the liver).
  3. Starch, amylopectin and glycogen all use alpha glucose and are mainly storage forms of glucose. Your body, for example, will either make or hydrolyze glycogen to take up or release glucose to your blood.
  4. Cellulose is the beta form of glucose and is used primarily to provide structure.
  5. Though a strand of polysaccharide is joined from carbon 1 of a glucose to carbon 4 of the next, you can have branching where a long strand will attack carbon 6 (the one outside the ring) with its carbon 1.
  6. Cellulose has no branching; starch (amylose) has no branching and amylopectin a little branching. Glycogen is highly branched.

Here are some things I would like you to know at this point or very soon.
  1. What are polymers? How are monomers assembled into polymers? Explain hydrolysis and dehydration synthesis.
  2. Identify the number of the carbons in a monosaccharide both in linear and ring form.
  3. What is the difference between an aldose and a ketose sugar?
  4. What’s the difference between a triose sugar and a trisaccharide? (or hexose versus hexasaccharide).
  5. How do sugars form the ring structure? Show that no atoms are gained or lost.
  6. What is wrong with this sentence commonly found in biology texts: “cells obtain energy by breaking the bonds in sugar molecules”? Students who had me in chemistry may have a leg up on that one.
  7. Explain the key differences and similarities between glycogen, starch and cellulose. What makes cellulose a good “structural” polysaccharide? (you just read the answer).

Molecules and such

This is a 25-minute screencast on bonding that should be entirely review for most of you. Listen to it if you think you would benefit from the review.
Here is a link:
screencast covering Bonding Basics.
There is a “table of contents” icon (just 4 horizontal lines at the lower right as you mouse over it). If you click on it, it will list several topics. You can skip to those you want to hear again.
Also, here is a PDF on functional groups that I think you should look over.
Finally, here is a primer on "molecular shorthand" we use when we draw molecules.

Structural shorthand.

When we are writing out a larger molecule, we often use a shorthand notation. So, it is important that you know what they mean. Bonds are line segments, atoms are at the junction or ends of line segments. Double line segments mean double bonds. Beyond that, it’s really simple and based on two rules:
  1. Carbon is the most common atom in all organic molecules. To save time, I only label things that are NOT carbon. Everywhere else you see a place for an atom (end of line segment) that is not labeled, it is a carbon. Anything else, I will label.
  2. Carbon will have four bonds around it. If you see fewer than four, the rest are made up by bonds to hydrogens, which I don’t bother to write.
In its most extreme case it could look a little silly. For example, a simple line segment could represent the molecule “ethane,” which is this:
Each end of the line segment is a C, which I don’t write. Each C must have three more hydrogens to fill it out, which I also don’t write.
However, consider this molecule:
It is a really short fatty acid called “butyric acid.” How many carbons and hydrogens are there?

I’ll wait.

You should have gotten 4 carbons and 8 hydrogens. Here it is explicitly.
Butyric Acid
The carbonyl carbon has four bonds already (two to the O on top, one to the O to its right, and one to the carbon on the left) and has no room for hydrogens. Carbons in the middle of a chain have two hydrogens and the one on the end has to have 3.
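If you want to check your counting mechanically, here is a minimal Python sketch of the two shorthand rules. The atom names (C1, O2 and so on) and the function name are just labels I made up for this example. It assumes what the drawing shows: unlabeled atoms are carbons, every carbon makes four bonds, and the only hydrogen actually written out is the one on the OH.

```python
# Butyric acid as drawn in shorthand.
# Each atom: (element, hydrogens written explicitly in the drawing).
atoms = {
    "C1": ("C", 0),  # carbonyl (carboxyl) carbon
    "C2": ("C", 0),  # middle of the chain
    "C3": ("C", 0),  # middle of the chain
    "C4": ("C", 0),  # end of the chain
    "O1": ("O", 0),  # double-bonded oxygen
    "O2": ("O", 1),  # drawn as "OH"
}

# Each bond: (atom, atom, bond order). A double line segment counts as 2.
bonds = [
    ("C1", "O1", 2),
    ("C1", "O2", 1),
    ("C1", "C2", 1),
    ("C2", "C3", 1),
    ("C3", "C4", 1),
]


def hydrogens_per_atom(atoms, bonds):
    """Apply rule 2: each carbon gets enough hydrogens to reach four bonds."""
    drawn = {name: 0 for name in atoms}
    for a, b, order in bonds:
        drawn[a] += order
        drawn[b] += order
    result = {}
    for name, (element, explicit_h) in atoms.items():
        if element == "C":
            result[name] = 4 - drawn[name]  # unwritten hydrogens
        else:
            result[name] = explicit_h  # labeled atoms show their own H
    return result


h = hydrogens_per_atom(atoms, bonds)
n_carbons = sum(1 for element, _ in atoms.values() if element == "C")
print(h)
print(n_carbons, sum(h.values()))  # 4 8
```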
Here’s a much harder one:

Pasted Graphic 3
We will talk a lot more about these below. Both of these have the same number of carbons and hydrogens. How many do you see?

You should have gotten 18 carbons and 34 hydrogens (one on the OH, none on the carboxyl carbon itself, three on the last carbon, one on each of the two carbons in the double bond, which makes 6; then each of the remaining 14 carbons has 2, for 28 more).
Here it is in more explicit form.
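The same tally works for any straight-chain fatty acid, so here it is as a short Python sketch. This is my own generalization, not something from the figure: it assumes an unbranched chain with one carboxyl group on one end, a CH3 on the other, and double bonds only in the middle of the chain (so each double-bond carbon carries exactly one hydrogen). The function name is made up.

```python
def fatty_acid_hydrogens(n_carbons: int, n_double_bonds: int = 0) -> int:
    """Count hydrogens in an unbranched fatty acid by type of carbon."""
    n_ch2 = n_carbons - 2 - 2 * n_double_bonds  # carbons left for plain CH2
    if n_ch2 < 0:
        raise ValueError("chain is too short for that many double bonds")
    carboxyl_oh = 1                  # the H on the OH; the carboxyl carbon has none
    methyl = 3                       # the last carbon
    alkene = 2 * n_double_bonds      # one H on each double-bond carbon
    return carboxyl_oh + methyl + alkene + 2 * n_ch2


print(fatty_acid_hydrogens(4))      # butyric acid: 8
print(fatty_acid_hydrogens(18, 1))  # the 18-carbon example: 34
print(fatty_acid_hydrogens(18, 0))  # same length, no double bond: 36
```

Notice the pattern: with no double bonds you get twice as many hydrogens as carbons, and each double bond costs you two hydrogens.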

You Are Here

This is a graphic of the “tree of life” based on comparisons of ribosomes...the machine that makes protein. It was compiled a long time ago by Norman Pace. I’ve modified it slightly.
It’s not the most up-to-date or complete. But, it gives you the idea.
Pace Tree of Life

This is the most "up-to-date" tree of life, published last year:


If you are wondering where we are on it, the line labeled Opisthokonta, found at the lower right, includes all animals and fungi.

Here is a little light reading for tonight entitled: "The importance of stupidity."