Transcriptional regulation

As usual, images are from Wikimedia Commons.
The first question is: how does the machine that makes RNA recognize a “gene”? In the past, I’ve called a “gene” the stretch of DNA encoding a protein and its regulatory sequences. We can separate it into the “structural gene,” which contains the actual code of the protein as it would be found in the mRNA, and the regulatory sequences. A very basic regulatory region is called the “promoter.” It is a region that serves as the initial recognition site that says: “This is a gene, transcribe here.”
In addition to the hydrogen bonds that make up the “Watson-Crick” base pairs, there are additional sites for hydrogen bonds and other contacts that can be made without “unzipping” the DNA, primarily along the major groove. In particular, the major groove is just about the right size for an alpha helix of a protein to fit in. There, side chains (R groups) on the outside of the helix can make specific contacts with the bases and “read” the sequence (that is, bind in a sequence-specific manner). The first step in transcription is recognition of the promoter, which is almost always upstream from the structural gene. There are several types of DNA-binding proteins. Here are two examples:
[Image: Lambda repressor]
This is “Lambda repressor.” Notice two things. First, the protein binds as a dimer. In fact, the two subunits are identical, and the overall DNA sequence they bind is a “palindrome.” More on that later. Second, notice that one of the alpha helices in each subunit is reaching down into the major groove, where it makes sequence-specific contacts. This particular protein actually blocks access of the RNA polymerase in E. coli. But similar proteins act to promote binding.
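[Image: bZIP]
This is a bZIP (basic leucine zipper) DNA-binding domain, from a class of proteins that acts to promote transcription in our cells. Notice the same two things: it binds as a dimer, and an alpha helix from each subunit sits in the major groove. Even though the overall structure of the proteins is different (note that much of the structure is left out of the image), the binding has some similarities.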
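To make the “palindrome” idea concrete: a DNA palindrome is not a word that reads the same backwards, but a sequence that equals its own reverse complement, so each strand reads the same 5' to 3'. That is why two identical subunits can each see the same half-site. Here is a minimal Python sketch of that check. The function names and the test sequences are just illustrations I made up for this purpose; they are not the actual Lambda operator sequence.

```python
# Map each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_dna_palindrome(seq: str) -> bool:
    """True if the sequence reads the same on both strands."""
    s = seq.upper()
    return s == reverse_complement(s)


# Toy sequences (hypothetical examples, not a real operator site).
print(is_dna_palindrome("GAATTC"))   # True: the other strand also reads GAATTC
print(is_dna_palindrome("GATTACA"))  # False
```

Real binding sites are usually only approximately palindromic, so an exact-match test like this is stricter than what the protein actually requires.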

Assembly of the transcription complex

What is the sequence?
Well, that depends. The promoter is the most important thing that determines which genes are expressed (turned on) in which cells. Thus, genes that have to be on in every cell have different promoters than those that are only on in some. Genes that are turned on following a signal cascade have their own specific promoters. One of the main “housekeeping” promoters has a sequence called the “TATA box” (I wonder if you can guess why). It is bound by a complex depicted below. This is known as a “general transcription factor” because it is common to most genes.
As you can see, the “TATA-binding protein,” or TBP, interacts with the DNA over a larger region than just the TATA box. It's also a little unusual in that it binds to the minor groove, using a beta-sheet domain. That bends the DNA rather strongly and partially opens it. You may remember from the video we saw that there was a sharp bend in the DNA where the initiation complex was forming.
After initial proteins assemble at the binding site, they serve as binding sites for still more proteins that will be necessary to start transcription.
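If you want to see what “recognizing a TATA box” might look like as a pure sequence search, here is a small Python sketch. It assumes a simplified consensus of TATA, then A or T, then A, then A or T; the function name and the example sequence are hypothetical. Keep in mind that this is only string matching. The real recognition is done by TBP and its partner proteins, and it depends on much more than a seven-letter motif.

```python
import re

# Simplified TATA-box consensus (an assumption for illustration):
# TATA, then A or T, then A, then A or T.
# The lookahead lets overlapping matches be reported too.
TATA_PATTERN = re.compile(r"(?=(TATA[AT]A[AT]))")


def find_tata_boxes(seq: str) -> list[tuple[int, str]]:
    """Return (0-based position, matched motif) for each TATA-like site."""
    s = seq.upper()
    return [(m.start(), m.group(1)) for m in TATA_PATTERN.finditer(s)]


# Hypothetical stretch of upstream sequence.
example = "GGCGCTATAAAAGGCTGCATG"
print(find_tata_boxes(example))  # [(5, 'TATAAAA')]
```

A search like this finds many sites in a real genome that are never used as promoters, which is part of why the additional proteins and regulatory sequences matter.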

Here is a depiction of some of the proteins involved:
[Image: Transcription Factors]

Notice the factors off to the right labeled 4 and 5. These have more to do with those larger levels of regulation I was talking about.

Here is a cool video on the process that shows some of the next step also: