Lecture 3 - Genetic Engineering

Overview

Professor Saltzman introduces the elements of molecular structure of DNA such as backbone, base composition, base pairing, and directionality of nucleic acids. He describes the processes of DNA synthesis, transcription, RNA splicing, translation, and post-translational processing required to make a protein such as insulin from its genetic code (DNA). Professor Saltzman describes the genetic code. RNA interference is also discussed as a way to control gene expression, which can be applied as a new way to treat diseases.

Lecture Chapters

Introduction
Building Blocks of DNA
Structure of DNA and RNA
Central Dogma and DNA Synthesis
Genetic Code and Protein Synthesis
Control of Gene Expression

Frontiers of Biomedical Engineering BENG 100 - Lecture 3 - Genetic Engineering

Chapter 1. Introduction [00:00:00]

Professor Mark Saltzman: This week we're going to talk about DNA technology and genetic engineering: this is Chapter 3 of the book. Some of this will be familiar to some of you who've had biology in high school or other places, you know something about DNA. In fact, even if you haven't had a biology class it's hard to be alive in 2008 and not know something about DNA; it's become such an important part of our lives. I'm going to ask you to indulge me while I go back to the beginning and talk about some things that you know but I'm going to go through this pretty rapidly. I think the book has a fairly good description of it so if you don't pick up everything in the lecture, hopefully you've read that beforehand, and you can go back to it afterwards and read about things that didn't make sense. We'll talk about some chemistry today, what DNA molecules are like, why they have the behavior that they do, and you need to understand this in order to understand how you manipulate DNA. So my goal today is to talk about the basics of the molecules, their chemistry, the function of DNA in cells, the basic side of that. Then on Thursday we're going to start talking about how to manipulate DNA and get closer to using it in Biomedical Engineering.

Chapter 2. Building Blocks of DNA [00:01:35]

DNA is a double helix, you know this. The structure of DNA was discovered about the time that I was born and so it's been known throughout your lifetime, you've always lived with it. It's really remarkable how fast we have come from just knowing the structure of this molecule to being able to manipulate it and study it in great detail. I want to start by showing you this cartoon that you already know about with the structure of a double helix.
It's a twisted ladder and there's a couple of things to notice about this familiar structure. One is that there are two backbones right here in the light blue, so this would be the upright parts of the ladder that are twisted. Those are two continuous strands that wind around each other to form the double helix. One thing to notice is this part of the double helix that we'll call the backbone. The backbone's on the outside of the molecule like the upright struts of a ladder on the outside of a ladder. On the inside are the rungs or the struts that hold the ladder together. There are several things that you'll notice about the struts in this particular cartoon. One is that there's four different colors and so you can see red, blue, yellow, green here - four different colors and that's all there are, there aren't more than four. That there are two colors per strut, so what's linking the two backbones together are two colored segments that come from the outside towards the middle, and that the colors occur only in certain combinations, red and green, yellow and blue, that's all you see. You don't see a red and yellow, you don't see green and blue. This is a feature of DNA shown in this cartoon form, so if you can keep that sort of schematic in mind, it makes it a lot easier to understand the detailed structure. That's what I want to do for the first few minutes of the lecture here is tell you a little bit about the details of the structure and how molecules fit into this image of DNA that's already very familiar to you. The things that you need to know are the things that are really listed on this slide. You're going to know more details about it, they're not really colors, they're chemicals, specific chemicals - but the pattern is the same. The molecules that really make up DNA are nucleotides and DNA is a polymer of nucleotides. A polymer is just a large molecule that's made up of repeated units. 
We're familiar with polymers, plastics in our daily life; the chairs that you're sitting on are made of a kind of a plastic polymer that is basically an organic chemical that is cross-linked together. Cross-linked is not the right word--that is chemically bonded with repeat units to make large molecules so that when you have a bunch of large molecules together they have certain physical properties like the solid property of the plastic that you're sitting on. Nucleic acids, of which DNA is an example, are polymers of nucleotides. So the repeating unit in DNA is this structure here, a nucleotide, which has three different regions. There's a sugar, a five carbon sugar, which forms the core of the nucleotide and attached to this five carbon sugar at specific positions on the sugar relative to this oxygen, which is part of the sugar ring, is a phosphate group and an organic base. When these nucleotides get polymerized to form a long DNA molecule they all get polymerized in exactly the same way, the chemistry is the same. The phosphate group of one nucleotide gets linked to the sugar group of another nucleotide and I'm going to show you that in a few minutes. So what's going to form the backbone is this continual link, phosphate to sugar, phosphate to sugar, phosphate to sugar, all linked together to form one long, long molecule. What's hanging off of the side of this long molecule that's formed by polymerizing nucleotides are - is this base unit. It's the phosphate and the pentose that make up the backbone - that make up the upright struts of the ladder and it's the bases that make up the connecting struts, so the bases are the colors. There are four different bases, which I'll talk about in a moment. 
We're going to talk about two different nucleic acid molecules, two different nucleic acid polymers: one is DNA, deoxyribonucleic acid, the other is RNA, ribonucleic acid. One of the differences between the two is that the pentose, or the sugar, that makes up DNA is deoxyribose, shown here, and the pentose that makes up RNA is ribose, shown here, so every pentose in a DNA polymer is deoxyribose. The phosphate is linked to this carbon on the pentose, and notice that there is a number on this carbon: it's the 5' carbon. This is a convention that organic chemists use when they're describing molecules like this. They'd like to be able to refer to each carbon separately so they can talk about reactions with this molecule, so they number the carbons. In these pentose molecules, whether it's ribose or deoxyribose, the carbons are numbered the same, 1', 2', 3', 4', 5'; those are the five carbons that make up the pentose. So I could refer to the 4' carbon and you'd know I mean this one, or the 2' carbon and you'd know I mean this one. The ones that are important to us are the 3' carbon and the 5' carbon. The reason for that is that the 5' carbon is where the phosphate is attached. In a nucleotide the phosphate is always attached to the 5' carbon. The reason that 3' is important is that when you polymerize two nucleotides together, and a third nucleotide, and a fourth nucleotide, the phosphate of one gets linked to the 3' carbon of another. This is important because this molecule here, deoxyribose, is not symmetrical upside down and right side up; it's different because the 5' carbon's either pointed up or pointed down. The nucleotide has a directionality, there's an up and a down to it, and it's going to turn out the chain that's formed by polymerizing these has a directionality as well and that's important in defining the structure.
These are the pentoses - remember 5' and 3' because that orients you with respect to what direction the molecule is facing. Now the bases, and you don't need to memorize the structures of these. I'm describing the whole molecule to you in its molecular detail and then we're going to simplify it down to a version that we can talk about more easily. To give you all the detail, there's two classes of bases that appear here. One class is called the purines and they have two ring-like structures. There are two of them that are going to be important to us, one is adenine and the other is guanine, shown here. Because saying adenine takes a long time and saying guanine takes a long time we're going to simplify it by calling adenine (A) and guanine (G). The second class is the pyrimidines and there's three of those that are important: uracil, thymine and cytosine, which we're going to simplify by calling (U), (T) and (C). Now remember that there were only four different colors in the cartoon of the DNA double helix that we talked about and I told you that those colors really represent the bases, but there's five of them here. There's five because there's one of these that appears in only RNA, and there's one of them that appears in only DNA. The one that appears only in RNA is (U), the one that appears only in DNA is (T). In DNA there's only four colors, there's only four bases, (C)(T)(G)(A). In RNA there's four bases, (A)(G)(U)(C). So (U) and (T) are interchangeable in the sense that (U) appears in RNA where (T) would appear in DNA, and (T) appears in DNA where (U) would appear in RNA. If I drew this all together, this is one particular nucleotide, now shown in more detail: all of the carbons of the pentose are shown here, the phosphate is shown, and a base is shown. This particular base is (A), this is ribose.
So this is a monomer or a single unit from an RNA molecule, you know that because it's ribose and it's the particular molecule that has (A) as its organic base.

Chapter 3. Structure of DNA and RNA [00:11:18]

We don't need to really talk about all the molecular detail in order to completely describe a DNA or an RNA molecule because these structures repeat themselves. Every unit in the backbone of this ladder has the same sugar unit, the same pentose; it's either RNA or DNA and has ribose or deoxyribose, so you don't need to describe the whole thing. You can just say it's RNA or DNA and you know everything about the pentose in every molecule on the chain. You don't need to say anything more about the phosphate because they all have the phosphate and every set of these is hooked together in the same way. The 5' carbon has a phosphate off of it and that phosphate is linked to the 3' carbon of the next one and they all have a base hanging off the side. The only thing I need to say in order to distinguish this particular part of the chemistry of a DNA or an RNA molecule is to say 'it's DNA or it's RNA', and 'what the base is'. If I told you 'draw me a nucleotide from RNA that has (A)', you could go back to this picture and you could draw the whole thing. You can just talk about it in a simpler way. You can say a polymer of DNA, for example, is four bases long; that means it has four of these repeat units and they go in the sequence from 5' to 3' of (A)(G)(T)(G). If I told you that a DNA sequence went 5' to 3' (A)(G)(T)(G) you could draw the whole thing referring back to these notes, right? In fact, you don't have to draw it this complicated way, you could just draw it as a line with (A)(G)(T) and (G) hanging off of it, and you would know that that's a DNA molecule four bases long. Now what does the line represent here?
This represents the upright struts on a ladder that I showed you before; it represents this backbone that's formed by polymerizing the pentoses together through phosphates, always going 5' to 3', 5' to 3'. I could take and draw a line continually down the molecule where my finger was touching; my finger would be touching a phosphate here, the 5' carbon, the 4' carbon, the 3' carbon, and the next phosphate. The backbone is what? It's phosphate, carbon, carbon, carbon, phosphate, carbon, carbon, carbon, phosphate and it has this structure hanging off the side. Now DNA is a double helix and I've only shown you one part of the helix, right? I've shown you one upright strut and a base hanging off of it, but it forms a double helix because complementary strands of DNA strongly associate with one another and that's a very stable structure. They do that because the bases can interact with one another in particular ways, and this you know about. This was the famous finding of Watson and Crick in describing the structure of DNA. What this diagram shows you is the forces that hold these individual strands of DNA into a double stranded form. The forces occur because of hydrogen bonding between complementary pairs of the bases, and the complementary pairs are adenine and thymine, (A) and (T), and guanine and cytosine, (G) and (C). Now if you read in the book where this figure is shown, you can understand more about why these structures line up in the right way so that the right molecular elements are together to form hydrogen bonding pairs between them. That's really beyond what I'll be asking you to understand for the course but you can understand that if you read it, I'm sure. Now remember that (T) only appears in DNA and (U) appears in RNA, and so (U) can also form a hydrogen bonding pair with (A).
The whole structure of a DNA molecule looks like this, going back to a more cartoon version like I showed you before but adding some detail onto it now. There are two upright struts of the ladder, one shown in blue here, the other shown in black. They are linked together by four different colored segments indicated here not by colors now but by letters. It's DNA, so it's (G)(C)(A)(T) and they always occur in pairs. Where before it was two colored pairs now it's two lettered pairs, (G)(C) and (T)(A). The chains are different now. The backbones were colored the same in the earlier diagram. They're colored differently here to indicate one new difference that you know about now, and that's that there is an orientation, there's an up and down on the chain. That's due to the asymmetry of the nucleotide, that there's a 5' and a 3' end, and the way that they're linked together. This blue chain here goes from the 5' carbon all along the chain and there's a 3' carbon left open at the bottom. If I wanted to link another nucleotide to this DNA chain what would I attach here on the bottom? I would attach the phosphate that's connected to the 5' carbon of another nucleotide. I would link this one facing in this direction; I would add onto this. The other chain is facing in the other direction, the 3' carbon is up, the 5' carbon is down. Remember that this molecule, let's look at the blue one, wouldn't be the same if I turned it upside down. It wouldn't be the same because the rings here, the pentoses, would all be turned over, the chemistry would look different and the sequence of bases would look different. The corresponding half of the ladder that corresponds to any given strand, let's say the black DNA molecule that corresponds to the blue one, is not just a mirror image.
We call it the complement and each strand of DNA, each polymer of DNA that you could make or you could draw, has only one complement and that complement has the following features. One, its chain is oriented in the opposite direction: where this one goes 5' to 3', this one goes 3' to 5'. Two, it has the complementary base at each position. Where there's an (A) here there's got to be a (T) here, where there's a (T) here there's got to be an (A) here, where there's a (C) here a (G), where there's a (G) a (C). That's because you have to satisfy this base pair matching in order to have hydrogen bonding in each of the struts of the ladder in order to form a stable structure. If I'm talking about two DNA strands and they differ in only one or two base pairs they won't be exact complements and they won't form this double helix. This notion of complementary strands is very important. It's the way that DNA exists inside the cells of your body. It exists in a double stranded form where every strand is matched by its complement. These molecules of DNA, very long molecules of DNA, are condensed and packaged within the nucleus of every cell in your body. Every cell in your body has exactly the same DNA; that is, if I could stretch out all the DNA and look at the sequences of bases along all the DNA in your chromosomes, they'd be identical in all the cells. They'd be different in each of us and that leads to the diversity between people. You've heard about the human genome project, we'll talk about that a little bit later. The goal of that was to look at the sequence of base pairs that makes up the DNA of a typical human and write them all out (in a similar project you might be looking at fruit flies, at all the DNA in a fruit fly); we'll talk about that later.
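The two features of the complement described above (opposite orientation, and the matching base at each position) can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, both strands are assumed to be written 5' to 3', and pairing is treated as all-or-nothing, the way the lecture describes it.

```python
# Watson-Crick pairing for DNA: A pairs with T, G pairs with C.
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement_strand(seq_5_to_3: str) -> str:
    """Return the complementary strand, also written 5' to 3'."""
    # Pair each base, then reverse the result, because the partner
    # strand runs in the opposite direction.
    paired = [DNA_PAIRS[base] for base in seq_5_to_3.upper()]
    return "".join(reversed(paired))


def is_exact_complement(strand_a: str, strand_b: str) -> bool:
    """True only if strand_b matches strand_a's complement at every position."""
    return complement_strand(strand_a) == strand_b.upper()


if __name__ == "__main__":
    strand = "AGTG"  # the four-base example from the lecture, 5' to 3'
    partner = complement_strand(strand)
    print(partner)                               # CACT
    print(is_exact_complement(strand, partner))  # True
    print(is_exact_complement(strand, "CACA"))   # False: one base differs
```

Reversing after pairing is what makes the result read 5' to 3'; without it the output would be the partner strand written 3' to 5'.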
This slide shows one important feature of the physical chemistry of DNA that turns out to be very important for all of the technology that is built on DNA. It has to do with the nature of this complementary binding between double stranded DNA and the fidelity of this base pair matching in forming stable DNA molecules. I told you that the fidelity is very high. What does that mean? That only strands that have this exact complement can form double stranded DNA. Because of that you can do the following experiment, and it's a simple experiment, it's simple to understand, but the concept is very important so I encourage you to think about it and make sure you understand it. If I took two double stranded DNA molecules and I exposed them to certain conditions that caused them to denature, that means its native structure falls apart. The native structure is this double stranded structure here and if I heat it up slightly and I add some base, so under slightly basic conditions, these molecules will fall apart because you've created conditions where the hydrogen bonding is no longer favorable so they peel apart. If I had a beaker sitting on the table here and it contained a million blue double stranded DNA molecules and a million red double stranded DNA molecules and I heated it up and added a little base, I'd soon have four million individual strands just floating around in the solution because I've broken up this hydrogen bonding and the DNA molecules fall apart. That's called denaturing DNA. That tells you something about the physical chemistry of the molecule; that it's these hydrogen bonds that hold the double strands and I can break those down under certain conditions. If I then put it back into its original condition, lower the heat say, temperature back to body temperature and reduce the pH down to seven again, the molecules will re-nature. They will reform their natural structure, and for DNA that means forming double helixes. 
But they will do that in a very particular way, in that only strands that exactly match will be able to reform their native structure. A blue strand here will never re-nature with a red strand because their sequences don't match exactly, but a complementary blue strand will always rematch with its partner. Now this is the basis of a physical chemistry process called hybridization. It turns out that this is how we can identify specific DNA sequences, how we can do things like DNA fingerprinting, and how we can clone DNA molecules from one organism to another; all of these rely very heavily on this principle of re-naturation and hybridization. Hybridization simply means that DNA will re-nature and form a stable double helix only with its particular match, only with the strand that it is perfectly complementary to. That's something about the physical chemistry of DNA, what it looks like, and how it behaves in the simple sense. What I want to spend the rest of the time doing is talking about some of the biological properties of DNA. Again, I know this is something that's familiar to most of you and so indulge me just for the rest of this lecture, I'll go through it. I want to try to hit the points that I think are important to remember because they're going to be concepts that come up again and again throughout the course, and I want to make sure that we're on the same page.

Chapter 4. Central Dogma and DNA Synthesis [00:24:16]

This diagram at the top here is a very familiar one to most of you, it's sometimes called the central dogma of molecular biology. It indicates how information flows in cells and indicates a lot about the work that a cell does in maintaining and recreating itself, and maintaining its environment. That is, the information needed to operate a cell is stored in its DNA. That information gets put into action through a biological process called transcription, where particular regions of DNA are transcribed into RNA.
That RNA is made into proteins, and proteins are the working molecules of the cell: they're enzymes, they're structural molecules, there are proteins that exist in the membrane that allow things to go in and out of the cell, so really they're the working molecules of the cell in every sense. RNA is converted into protein by a process called translation. Here's another picture of it here, showing it in a little bit more detail, that you have lots of DNA in each of the cells in your body but you're not using all that DNA at any one time. Every cell in your body is only using a fraction of the DNA that's available to it. Cells in your pancreas, for example, are making the protein insulin. They're making that because you need this protein insulin, it's a hormone, and it's important for sugar metabolism in your body. Those cells in your pancreas are making insulin. That means using the gene that encodes insulin, the sequence of base pairs that encodes insulin. I'll talk about what encoding insulin means in a minute, but there's a gene that tells your body what insulin looks like and that gets transcribed, but only in those cells that make insulin. It gets converted into a protein, insulin, only in those cells that are able to make the RNA, that are able to express the protein. Well, it turns out that proteins are essential in driving this process too. In order to have DNA you have to make DNA and your cells are continually making DNA inside your body, through a process of DNA synthesis, and that synthesis is occurring because of the presence of an enzyme, a protein called DNA polymerase. In this same way, this process of transcription, which is occurring in cells throughout your body all the time, is made possible by a protein called RNA polymerase. It allows RNA to be made from a DNA template. It's not as simple as DNA going to RNA going to protein, because proteins need to be present in order to make these things happen as well. Let's talk about DNA synthesis for a minute.
When a cell divides in your body, when cells of your intestine divide, when cells of your skin divide, and they're doing this all the time, in order for a cell to divide and form two daughter cells--we'll talk about that process next week--but in order for that to happen the parent cell has to copy all of its DNA in order to have enough DNA to pass on to two daughter cells. It does that through a process of DNA synthesis. What happens is the machinery of the cell, largely this protein DNA polymerase, is able to open up the double stranded DNA, to denature it locally, exposing two strands which it then makes - allows it to make copies of. What's shown here is what's called a replication fork in DNA that's undergoing synthesis. The DNA molecule here has been spread apart, opening up two single stranded DNA's which have complementary base sequences because they were double stranded DNA. A new single stranded DNA is formed on each one of these open single strands. So DNA is replicated using one strand of the DNA as a template. The result of this process if this replication went down the whole length of the DNA would be to form two identical, double stranded DNA molecules. Now the book talks in more detail about this and you can read about it. Polymerase needs a primer and that turns out to be important. A primer is a short RNA sequence or DNA sequence that gets sort of the process of replication jump started, and that's just because of the biological properties of DNA polymerase that that primer's needed. Synthesis always occurs in one direction and that makes sense to you now because you know there's a directionality and the chemistry is different going one way than the other and this DNA polymerase only works on the chemistry going in one direction. The correct complement is made because of these principles of Watson-Crick base pairing that we talked about before. 
It's easy to know what nucleotide to put in each position as you're going along and polymerizing a new molecule. Because this process occurs this way, if a parent cell replicates its DNA and then passes them along to two daughter cells, one of the daughter cells has one strand from the parent, the dark blue strand here for example, the other daughter cell has the light blue strand, the complementary strand, and each of the daughter cells has a newly synthesized piece of DNA. That's synthesis and that has to happen in order for cells to replicate and cell replication is happening in your body all the time. Transcription is also happening. Certain segments of DNA are being converted into RNA, and whereas in replication, you have to copy the whole genome, the whole - all of the chromosomes, all of the DNA contained in the chromosomes of the cell in order to completely replicate it; transcription only works on particular sequences of DNA. The DNA that encodes the proteins that are important to the life of that cell. A pancreas--cell in the pancreas, for example, needs to make insulin and so the gene for insulin is transcribed. Transcription just means making a single stranded RNA copy of a sequence of base pairs in a DNA. I told you that that's driven by a protein called RNA polymerase. RNA polymerase is smart, it knows where it needs to go in order to make the copy of RNA that's required. It operates in a similar fashion to DNA polymerase in that it denatures locally or opens up the double stranded DNA, but it's different in that it creates a new polymer from the DNA template in the language of RNA, using RNA nucleotides and not DNA nucleotides. The end result of transcription is not double stranded DNA, it's single stranded RNA where the RNA that's produced is called messenger RNA. It's the transcribed version of DNA, and it's the exact complement of a particular region of DNA. Again, more details in your book if you want to read that. 
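Transcription as described above, copying a DNA template into the "language of RNA" by base pairing, can be sketched as a simple lookup. This is a minimal illustration rather than a model of RNA polymerase: the function name and the example template are hypothetical, and the template strand is assumed to be written 3' to 5' so that the messenger RNA comes out 5' to 3'.

```python
# Base placed in the new RNA opposite each DNA template base.
# RNA uses U where DNA would use T.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Build the mRNA (5' to 3') from a DNA template written 3' to 5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


if __name__ == "__main__":
    template = "TACGCG"  # hypothetical template strand, 3' to 5'
    print(transcribe(template))  # AUGCGC
```

No reversal is needed here because the template is already written in the opposite orientation to the RNA being made.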
Well, what I said is not entirely true. That used to be the way that we thought about it. DNA goes to RNA, goes to protein, that's it, and that is the way it happens in simple organisms like bacteria. In complex organisms like humans there's another step that we're still only learning about now. We know some parts of it, we don't know all of it. It's very important in the biological operation of human cells and that step is RNA splicing, or processing of this RNA, single stranded RNA that's produced by transcription. We're going to talk more about this as we go through some specific examples of where RNA processing is important. For now, just think about modifying your picture of this sort of information flow through a cell to include another step that RNA is produced by transcription from DNA, double stranded DNA goes to a single stranded RNA molecule, and that RNA is processed in the cell in some way in order to form messenger RNA. One of the forms of processing that happens, that's very important in human gene expression is that some of the sections of the DNA molecule are not really necessary for describing the protein. The regions that are necessary for describing what the protein is like are called exons, the regions that are not are called introns. A section of DNA that is responsible for encoding a gene, let's say it's the insulin gene for example, might be some stretch of DNA on a certain chromosome inside your cells, inside the cells of the pancreas. If it was directly transcribed there'd be regions that are important for making insulin and regions that are not. Those regions that are not are spliced out during RNA processing to form the mRNA transcript that's used to make the protein. Now there are other kinds of processing that can happen to RNA as well, and again, I said this is really still an emerging science, but this is one that's well known, and you could imagine that it's important. 
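The idea of splicing out introns can be sketched as keeping only the exon segments of a transcript and joining them in order. Everything specific here is an assumption made for illustration: the sequence, the exon positions, and the `splice` helper are hypothetical, and real splicing depends on sequence signals that this sketch ignores.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join exon segments, given as (start, end) half-open index pairs, in order."""
    return "".join(pre_mrna[start:end] for start, end in exons)


if __name__ == "__main__":
    # Hypothetical transcript: uppercase marks exons, lowercase marks an intron.
    pre_mrna = "AUGGCU" + "guaaguccag" + "CUCUAA"
    exons = [(0, 6), (16, 22)]  # assumed exon positions for this example
    print(splice(pre_mrna, exons))  # AUGGCUCUCUAA
```

If the exon boundaries were off by even one base, every codon after the junction would be read differently, which is why the lecture stresses knowing where the introns are.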
If I want to clone a gene from a human, if I want to clone the gene for human insulin, meaning make many copies of the gene that's responsible for making insulin, I need to know whether there are introns there or not. If I'm going to make insulin from this I have to know that I've got the introns spliced out correctly. That's something that will come up in the lecture tomorrow, so remember that concept and that revision of this sort of classical picture. I don't want to go through this in detail because I assume that you know it, plus I think it's a little bit easier to read and have some time to digest, but this process of translation or conversion of messenger RNA into a protein is a complicated biological process that's occurring all the time. Before, we talked about how do you know what messenger RNA to make, how do you know what RNA to copy from a DNA template? Well you do that by this Watson-Crick base pairing, so if I have (A)(C)(G)(C)(G)(A) I know what messenger RNA to make from that because I have to satisfy these base pairing rules.

Chapter 5. Genetic Code and Protein Synthesis [00:35:15]

It's more complicated in making protein from an RNA strand and that complication is called the genetic code. You know that messenger RNA is read in three base units called codons, and so this particular piece of messenger RNA is drawn in this cartoon in three base units. That's because every three bases describe an amino acid in a protein. There are only four different bases that make up either RNA or DNA, and so the complexity of an RNA polymer is limited: it's only got one of four possible choices at each position. There are more than 20 amino acids that make up the biological polymers called proteins, so there are 20 choices of amino acid at each position on a protein. Why are there three bases in a codon? Because it takes three units, where there's only four choices at each position, to have at least 20 unique combinations.
How many combinations of codons are there if there's three bases and four possibilities at each base? 4 x 4 x 4 possibilities, because I could have (A)(G)(C)(U) here, (A)(G)(C)(U), (A)(G)(C)(U). For the first two positions alone, 4 x 4 = 16. If I only had two per codon I wouldn't have enough. I'd only have 16 possible two base sequences, that's not enough to specify over 20 amino acids. If I have three, I have 16 x 4 or 64 possible choices, way more than enough. That creates a problem in the genetic code in that there's 64 possible sequences but there's only 20 some amino acids, so each amino acid can be specified by more than one codon. There are 64 combinations of three bases and I only need to describe 20, so there's combinations to spare. This table here shows you how biological translation takes place whenever a three base sequence is identified. Say it's (G)(C)(U); that specifies an amino acid. How would I know what amino acid that is? Well, I could look it up in this table because somebody's figured it out for you. (G) in the first position, (C) in the second position, (U) in the third: (G)(C)(U) right here is alanine and that's the amino acid that's in that position in the protein. You could read through this sequence and you could figure out what the sequence of amino acids would be. The genetic code is said to be degenerate because I can only read it in one direction. I can read (G)(C)(U) as alanine, for example, from this table and if I see a (G)(C)(U), I know it has to be alanine. If I have the protein and I want to say what the messenger RNA that produced that protein looked like, I can't go backwards because there's more than one possibility for alanine, right? There's four of them right here, (G)(C)(U), (G)(C)(C), (G)(C)(A), (G)(C)(G), so I don't know exactly what gene that came from, I can't read backwards. That has to do with the statistics of this, right?
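The counting argument and the one-way nature of the lookup can be checked with a short sketch. The codon table below is deliberately partial: it holds only codons that appear in this lecture (the four alanine codons and one leucine codon), and the function names are hypothetical. A real translation table has all 64 entries.

```python
from itertools import product

BASES = "AGCU"  # the four RNA bases

# Deliberately partial codon table: only codons that appear in this lecture.
PARTIAL_CODE = {
    "GCU": "Ala", "GCC": "Ala", "GCA": "Ala", "GCG": "Ala",
    "CUC": "Leu",
}


def count_codons(length: int) -> int:
    """Number of distinct base sequences of the given length."""
    return len(list(product(BASES, repeat=length)))


def translate(mrna: str) -> list[str]:
    """Read the mRNA three bases at a time and look up each codon."""
    usable = len(mrna) - len(mrna) % 3  # ignore a trailing partial codon
    codons = [mrna[i:i + 3] for i in range(0, usable, 3)]
    # "?" marks codons missing from the partial table above.
    return [PARTIAL_CODE.get(codon, "?") for codon in codons]


def codons_for(amino_acid: str) -> list[str]:
    """Reverse lookup: every codon in the partial table for one amino acid."""
    return sorted(c for c, aa in PARTIAL_CODE.items() if aa == amino_acid)


if __name__ == "__main__":
    print(count_codons(2))         # 16: too few for 20 amino acids
    print(count_codons(3))         # 64: more than enough
    print(translate("GCUCUCGCG"))  # ['Ala', 'Leu', 'Ala']
    print(codons_for("Ala"))       # ['GCA', 'GCC', 'GCG', 'GCU']
```

Going forward, each codon gives exactly one amino acid; going backward, `codons_for("Ala")` returns four candidates, which is the sense in which the code cannot be read in reverse.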
There's just more sequences in a three unit codon than I need for the amino acids. How does translation occur biologically? As shown in this cartoon here, again you don't need to know the details of this, but if you're interested in knowing what's the biological basis of the genetic code this is it. Inside cells in your body there are special RNA molecules called transfer RNA. They are RNA molecules but at some points in their life span they have amino acids attached to them. For example, this transfer RNA has a unit here, (G)(A)(G), at one end of the transfer RNA molecule. At the other end is attached the amino acid leucine. And your cells making transfer RNA, if they make a transfer RNA molecule that has (G)(A)(G) here, they only put leucine at the other end. That's the physical basis of the genetic code because when a messenger RNA sequence is being translated one codon at a time, when the sequence (C)(U)(C) appears in the messenger RNA, that sequence (C)(U)(C) can only bind with one particular three base complement, it has to be the complement (G)(A)(G). This is the codon, this is the anti-codon, there's only one anti-codon that matches this one and that anti-codon always occurs in a molecule that has leucine attached to the other side. Translation occurs by a special kind of polymerization where these transfer RNAs operate by Watson-Crick base pairing. They bring into proximity an amino acid so that instead of forming a new polymer of a nucleic acid, a polymer of amino acids is formed. A polymer of amino acids is a protein. Again, I'm just trying to highlight things you already know a little bit about, the book describes the details. This isn't particularly important for us to know here, but that messenger RNA gets converted into a protein of a specific composition through a biological process called translation is important. Chapter 6. 
Control of Gene Expression [00:41:55] The last thing I want to talk about today is control of gene expression. Control of gene expression is a very big topic and so I'm going to show you one cartoon to sort of tell you that it is a big topic that's really important. Why is control of gene expression important? Well, I've mentioned several times that all the cells in my body, all the cells in your body, have essentially the same genomic chromosomal DNA in their nucleus. If you looked at cells in my pancreas and cells in my brain, and cells in my skin they all have the same DNA. But skin cells and brain cells and pancreas cells aren't doing the same things. They don't look the same, they don't behave the same, they don't perform the same biological functions. Why? Because brain cells and pancreas cells are expressing different proteins. All the cells in your body have the capability of making all the proteins that you make, but they're not all made in every cell. Only cells of your pancreas make insulin, for example. Only cells in your brain make the enzymes that produce certain neurotransmitters that are responsible for brain function. How do cells in your brain know which proteins they ought to be making, and how do cells in the pancreas know which proteins they ought to be making? They do that because they can control the expression of genes. Gene expression, for us, will mean the same thing as production of a particular protein. When a gene gets expressed, that means its protein is produced. When we talk about gene expression then we're talking about this whole sequence of events I just described: transcription, RNA processing, translation to make the protein. All those things have to happen in an orderly fashion, in enough quantity in order for a particular cell to make a protein. 
To make insulin, for example, the cells of your pancreas have to be transcribing that gene, it has to be processed, it has to be translated into the protein insulin. But that's not all: that protein insulin is made in the form of a long polypeptide, and that's not always the final version of the protein. In fact, for insulin, it's not the final version of the protein that comes out of translation. There are more steps that have to happen correctly in order for that insulin to become active. Those steps are called post-translational modifications. It's a long word that just means other chemistry that happens on the molecule after translation. It turns out that the kinds of post-translational modifications that human cells are able to do are very complicated. Your cells are capable of doing many post-translational modifications. Bacteria, or simple organisms, are not always capable of that. Now that's going to be important when we talk about making human proteins inside alternate hosts like bacteria, that they can't do all the things that your cells can do. How is gene expression controlled? It can be controlled in a variety of ways. The most basic control is by controlling when transcription happens. It turns out that there's a whole biology associated with this, including molecules that are floating around inside your cells called transcription factors. They are molecules that know about particular genes and what some of their sequences are, and are able to turn on those genes inside cells, to make them be transcribed. Gene expression also requires RNA processing to happen smoothly, so if you can interfere with any stage in RNA processing you can stop a gene from being expressed, and this is a very hot topic in molecular biology now and in human therapeutics. 
You've heard about RNA interference, for example, and that is the process of stopping this, to stop a gene from being expressed. For example, a gene that makes a cell cancerous, I'd like to stop it from being expressed. And you could interfere with translation by degrading messenger RNA, for example. If you had a way to specifically chew up all the RNA molecules that are responsible for making a particular protein, you could stop it from being expressed even though your cell is trying to make it. I want you to look at this picture, read the little bit about control of gene expression that's in the book, know that it's a big topic, that we're not going to talk about it except we're going to talk about some examples where control of gene expression can be exploited in order to treat diseases, for example. So I'll see you on Thursday. [end of transcript]

Frontiers of Biomedical Engineering

BENG 100 - Lecture 3 - Genetic Engineering

Chapter 1. Introduction [00:00:00]

Professor Mark Saltzman: This week we're going to talk about DNA technology and genetic engineering: this is Chapter 3 of the book. Some of this will be familiar to those of you who've had biology in high school or other places; you know something about DNA. In fact, even if you haven't had a biology class it's hard to be alive in 2008 and not know something about DNA; it's become such an important part of our lives. I'm going to ask you to indulge me while I go back to the beginning and talk about some things that you know but I'm going to go through this pretty rapidly. I think the book has a fairly good description of it so if you don't pick up everything in the lecture, hopefully you've read that beforehand, and you can go back to it afterwards and read about things that didn't make sense.

We'll talk about some chemistry today, what DNA molecules are like, why they have the behavior that they do, and you need to understand this in order to understand how you manipulate DNA. So my goal today is to talk about sort of the basics of the molecules, their chemistry, the function of DNA in cells, sort of basic - the basic side of that. Then on Thursday we're going to start talking about how to manipulate DNA and get closer to using it in Biomedical Engineering.

Chapter 2. Building Blocks of DNA [00:01:35]

DNA is a double helix, you know this; the structure of DNA was discovered about the time that I was born and so it's been known throughout your lifetime, you've always lived with it. It's really remarkable how fast we have come from just knowing the structure of this molecule to being able to manipulate it and study it in great detail. I want to start by showing you this cartoon that you already know about with the structure of a double helix. It's a twisted ladder and there's a couple of things to notice about this familiar structure. One is that there are two backbones right here in the light blue, so this would be the upright parts of the ladder that are twisted. Those are two continuous strands that wind around each other to form the double helix. One thing to notice is this part of the double helix that we'll call the backbone. The backbone's on the outside of the molecule like the upright struts of a ladder on the outside of a ladder. On the inside are the rungs or the struts that hold the ladder together.

There are several things that you'll notice about the struts in this particular cartoon. One is that there's four different colors and so you can see red, blue, yellow, green here - four different colors and that's all there are, there aren't more than four. Another is that there are two colors per strut, so what's linking the two backbones together are two colored segments that come from the outside towards the middle. And the colors occur only in certain combinations, red and green, yellow and blue, that's all you see. You don't see a red and yellow, you don't see green and blue.

This is a feature of DNA shown in this cartoon form, so if you can keep that sort of schematic in mind, it makes it a lot easier to understand the detailed structure. That's what I want to do for the first few minutes of the lecture here is tell you a little bit about the details of the structure and how molecules fit into this image of DNA that's already very familiar to you. The things that you need to know are the things that are really listed on this slide. You're going to know more details about it, they're not really colors, they're chemicals, specific chemicals - but the pattern is the same.

The molecules that really make up DNA are nucleotides and DNA is a polymer of nucleotides. A polymer is just a large molecule that's made up of repeated units. We're familiar with polymers, plastics in our daily life; the chairs that you're sitting on are made of a kind of a plastic polymer that is basically an organic chemical that is cross-linked together. Cross-linked is not the right word--that is chemically bonded with repeat units to make large molecules so that when you have a bunch of large molecules together they have certain physical properties like the solid property of the plastic that you're sitting on.

Nucleic acids, of which DNA is an example, are polymers of nucleotides. So the repeating unit in DNA is this structure here, a nucleotide, which has three different regions. There's a sugar, a five carbon sugar, which forms the core of the nucleotide, and attached to this five carbon sugar at specific positions relative to this oxygen, which is part of the sugar ring, are a phosphate group and an organic base. When these nucleotides get polymerized to form a long DNA molecule they all get polymerized in exactly the same way, the chemistry is the same. The phosphate group of one nucleotide gets linked to the sugar group of another nucleotide and I'm going to show you that in a few minutes. So what's going to form the backbone is this continual link, phosphate to sugar, phosphate to sugar, phosphate to sugar, all linked together to form one long, long molecule. What's hanging off of the side of this long molecule that's formed by polymerizing nucleotides is this base unit.

It's the phosphate and the pentose that make up the backbone - that make up the upright struts of the ladder and it's the bases that make up the connecting struts, so the bases are the colors. There are four different bases, which I'll talk about in a moment. We're going to talk about two different nucleic acid molecules, two different nucleic acid polymers, one is DNA, deoxyribonucleic acid, the other is RNA, ribonucleic acid and one of the differences between the two is that the pentose, or the sugar, that makes up DNA is deoxyribose shown here, and the pentose that makes up RNA is ribose, shown here so every pentose in a DNA polymer is deoxyribose.

The phosphate is linked to this carbon on the pentose, and notice that there is a number on this carbon, it's the 5' carbon. This is a convention that organic chemists use when they're describing molecules like this. They'd like to be able to refer to each carbon separately so they can talk about reactions with this molecule, so they number the carbons. In these pentose molecules, whether it's ribose or deoxyribose, the carbons are numbered the same 1', 2', 3', 4', 5', those are the five carbons that make up the pentose. So I could refer to the 4' carbon and you'd know I mean this one, or the 2' carbon and you'd know I mean this one. The ones that are important to us are the 3' carbon and the 5' carbon. The reason for that is that the 5' carbon is where the phosphate is attached.

In a nucleotide the phosphate is always attached to the 5' carbon. The reason that 3' is important is that when you polymerize two nucleotides together, and a third nucleotide, and a fourth nucleotide, the phosphate of one gets linked to the 3' carbon of another. This is important because this molecule here, deoxyribose, is not symmetrical upside down and right side up; it's different because the 5' carbon's either pointed up or pointed down. The nucleotide has a directionality, there's an up and a down to it, and it's going to turn out the chain that's formed by polymerizing these has a directionality as well and that's important in defining the structure. These are the pentoses - remember 5' and 3' because that orients you with respect to what direction the molecule is facing.

Now the bases, and you don't need to memorize the structures of these. I'm describing the whole molecule to you in its molecular detail and then we're going to simplify it down to a version that we can talk about more easily. To give you all the detail, there's two classes of bases that appear here. One class is called the purines and they have two ring-like structures. There are two of them that are going to be important to us, one is adenine and the other is guanine, shown here. Because saying adenine takes a long time and saying guanine takes a long time we're going to simplify it by calling adenine (A) and guanine (G). The second class is the pyrimidines and there's three of those that are important: uracil, thymine and cytosine, which we're going to simplify by calling (U), (T) and (C).

Now remember that there were only four different colors in the cartoon of the DNA double helix that we talked about and I told you that those colors are really - represent the bases but there's five of them here. There's five because there's one of these that's particular to RNA only, that appears in only RNA, and there's one of them that appears in only DNA. The one that appears only in RNA is (U), the one that appears only in DNA is (T). In DNA there's only four colors, there's only four bases, (C)(T)(G)(A). In RNA there's four bases (A)(G)(U)(C). So (U) and (T) are interchangeable in a sense that (U) appears where (T) would appear in RNA and (T) appears where (U) would appear in DNA.
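
The two alphabets can be written down directly as sets, which makes the overlap easy to see. This is a small sketch for illustration only; the function name `nucleic_acid_type` and the decision rule it uses (look for a T or a U) are assumptions of the example, not something from the lecture or the book.

```python
PURINES = {"A", "G"}          # two-ring bases
PYRIMIDINES = {"C", "T", "U"}  # one-ring bases
DNA_BASES = {"A", "C", "G", "T"}
RNA_BASES = {"A", "C", "G", "U"}

def nucleic_acid_type(sequence):
    """Guess whether a sequence of base letters is DNA or RNA.

    Returns "DNA" if it contains T, "RNA" if it contains U, and
    "either" if it uses only the three shared bases A, C and G.
    """
    letters = set(sequence.upper())
    if not letters <= (DNA_BASES | RNA_BASES):
        raise ValueError("sequence contains letters that are not bases")
    if "T" in letters and "U" in letters:
        raise ValueError("sequence mixes T and U")
    if "T" in letters:
        return "DNA"
    if "U" in letters:
        return "RNA"
    return "either"

print(sorted(DNA_BASES & RNA_BASES))   # bases shared by both: ['A', 'C', 'G']
print(DNA_BASES - RNA_BASES)           # only in DNA: {'T'}
print(RNA_BASES - DNA_BASES)           # only in RNA: {'U'}
print(nucleic_acid_type("AGTG"))       # DNA
print(nucleic_acid_type("AGUC"))       # RNA
```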

If I drew this all together, this is one particular nucleotide, now shown in more detail: all of the carbons of the pentose are shown here, the phosphate is shown, and a base is shown. This particular base is (A), this is ribose. So this is a monomer or a single unit from an RNA molecule, you know that because it's ribose, and it's the particular nucleotide that has (A) as its organic base.

Chapter 3. Structure of DNA and RNA [00:11:18]

We don't need to really talk about all the molecular detail in order to completely describe a DNA or an RNA molecule because these structures repeat themselves. Every unit in the backbone of this ladder has the same sugar unit, the same pentose; it's either RNA or DNA and has ribose or deoxyribose, so you don't need to describe the whole thing. You can just say it's RNA or DNA and you know everything about the pentose in every molecule on the chain. You don't need to say anything more about the phosphate because they all have the phosphate and every set of these is hooked together in the same way. The 5' carbon has a phosphate off of it and that phosphate is linked to the 3' carbon of the next one and they all have a base hanging off the side.

The only thing I need to say in order to distinguish this particular part of the chemistry of a DNA or an RNA molecule is to say whether it's DNA or RNA, and what the base is. If I told you, 'draw me a nucleotide from RNA that has (A)', you could go back to this picture and you could draw the whole thing. You can just talk about it in a simpler way. You can say a polymer of DNA, for example, is four bases long, that means it has four of these repeat units and they go in the sequence from 5' to 3' of (A)(G)(T)(G). If I told you that a DNA sequence went 5' to 3' (A)(G)(T)(G) you could draw the whole thing referring back to these notes, right? In fact, you don't have to draw it this complicated way, you could just draw it as a line with (A)(G)(T) and (G) hanging off of it, and you would know that that's a DNA molecule four bases long.

Now what does the line represent here? This represents the upright struts on a ladder that I showed you before, it represents this backbone that's formed by polymerizing the pentoses together through phosphates, always going 5' to 3', 5' to 3'. I could take and draw a line continually down the molecule where my finger was touching; my finger would be touching a phosphate here, the 5' carbon, the 4' carbon, the 3' carbon, and the next phosphate. The backbone is what? It's phosphate, carbon, carbon, carbon, phosphate, carbon, carbon, carbon, phosphate and it has this structure hanging off the side.

Now DNA is a double helix and I've only shown you one part of the helix, right? I've shown you one upright strut and a base hanging off of it, but it forms a double helix because complementary strands of DNA strongly associate with one another and that's a very stable structure. They do that because the bases can interact with one another in particular ways, and this you know about. This was the famous finding of Watson and Crick in describing the structure of DNA. This is - what this diagram shows you is--the forces that hold these individual strands of DNA into a double stranded form. The forces occur because of hydrogen bonding between complementary pairs of the bases and the complementary pairs are adenine and thymine, (A) and (T) and guanine and cytosine, (G) and (C).

Now if you read in the book, where this figure is shown, you can understand more about why these structures line up in the right way so that the right molecular elements are together to form hydrogen bonding pairs between them. That's really beyond what I'll be asking you to understand for the course but you can understand that if you read it, I'm sure.

Now remember that (T) only appears in DNA and (U) appears in RNA, and so (U) can also form a hydrogen bonding pair with (A). The whole structure of a DNA molecule looks like this, going back to a more cartoon version like I showed you before but adding some detail onto it now. There are two upright struts of the ladder, one shown in blue here, the other shown in black. They are linked together by four different colored segments indicated here not by colors now but by letters. It's DNA, so it's (G)(C)(A)(T) and they always occur in pairs. Where before it was two colored pairs now it's two lettered pairs, (G)(C) and (T)(A).

The chains are different now. They were colored the same, the backbones were colored the same in the diagram. They're colored different here to indicate one new difference that you know about now, and that's that there is an orientation, there's an up and down on the chain. That's due to the asymmetry of the nucleotide, that there's a 5' and a 3' end and the way that they're linked together. This blue chain here goes from 5' carbon all along the chain and there's a 3' carbon left open at the bottom.

If I wanted to link another nucleotide to this DNA chain what would I attach here on the bottom? I would attach the phosphate that's connected to the 5' carbon of another nucleotide. I would link this one facing in this direction and add onto this. The other chain is facing in the other direction, the 3' carbon is up, the 5' carbon is down. Remember that this molecule, let's look at the blue one, wouldn't be the same if I turned it upside down, because the rings here, the pentoses, would all be turned over; the chemistry would look different and the sequence of bases would look different.

The other half of the ladder that corresponds to any given strand, let's say the black DNA molecule that corresponds to the blue one, is not just a mirror image. We call it the complement, and each strand of DNA, each polymer of DNA that you could make or you could draw, has only one complement, and that complement has the following features. One, its chain is oriented in the opposite direction: where this one goes 5' to 3', this one goes 3' to 5'. Two, it has the complementary base at each position. Where there's an (A) here there's got to be a (T) here, where there's a (T) here there's got to be an (A) here, where there's a (C) here a (G), and where there's a (G) a (C). That's because you have to satisfy this base pair matching in order to have hydrogen bonding in each of the struts of the ladder in order to form a stable structure.
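
Both features of the complement, opposite orientation and paired bases, can be captured in a short function. The sketch below is illustrative only; the names `DNA_PAIRS` and `complement_strand` are invented for the example, and it assumes both strands are written 5' to 3' as plain strings of A, C, G and T.

```python
# Watson-Crick partners in DNA: A pairs with T, G pairs with C.
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement_strand(strand_5_to_3):
    """Return the one complementary strand, also written 5' to 3'."""
    # Pair each base with its partner; this list runs 3' to 5'
    # because the complement is oriented in the opposite direction.
    paired = [DNA_PAIRS[base] for base in strand_5_to_3]
    # Reverse it so the result is written in the usual 5' to 3' direction.
    return "".join(reversed(paired))

print(complement_strand("AGTG"))                      # CACT
print(complement_strand(complement_strand("AGTG")))   # AGTG
```

Taking the complement twice returns the original strand, which matches the statement that each strand has exactly one complement.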

If I'm talking about two DNA strands and they differ only in one or two base pairs they won't be exact complements and they won't form this double helix. This notion of complementary strands is very important. It's the way that DNA exists inside the cells of your body. It exists in a double stranded form where every strand is matched by its complement. These molecules of DNA, very long molecules of DNA, are condensed and packaged within the nucleus of every cell in your body.

Every cell in your body has exactly the same DNA; that is if I could stretch out all the DNA and look at the base pair sequence, the sequences of bases along all the DNA in your chromosomes, they'd be identical in all the cells. They'd be different in each of us and that leads to the difference in the diversity between people. You've heard about the human genome project, we'll talk about that a little bit later. The goal of that was to take for a typical human, or for a typical - in the case of the human genome project maybe you're looking at fruit flies, you want to look at all the DNA in a fruit fly, but to look at the sequence of base pairs that makes up human DNA and write them all out; we'll talk about that later.

This slide shows one important feature of the physical chemistry of DNA that turns out to be very important for all of the technology that is built on DNA. It has to do with the nature of this complementary binding between double stranded DNA and the fidelity of this base pair matching in forming stable DNA molecules. I told you that the fidelity is very high. What does that mean? That only strands that have this exact complement can form double stranded DNA.

Because of that you can do the following experiment, and it's a simple experiment, it's simple to understand, but the concept is very important so I encourage you to think about it and make sure you understand it. Suppose I took two double stranded DNA molecules and I exposed them to certain conditions that caused them to denature; that means their native structure falls apart. The native structure is this double stranded structure here and if I heat it up slightly and I add some base, so under slightly basic conditions, these molecules will fall apart because you've created conditions where the hydrogen bonding is no longer favorable so they peel apart. If I had a beaker sitting on the table here and it contained a million blue double stranded DNA molecules and a million red double stranded DNA molecules and I heated it up and added a little base, I'd soon have four million individual strands just floating around in the solution because I've broken up this hydrogen bonding and the DNA molecules fall apart. That's called denaturing DNA. That tells you something about the physical chemistry of the molecule; that it's these hydrogen bonds that hold the double strands together and I can break those down under certain conditions.

If I then put it back into its original condition, lower the temperature back to body temperature, say, and reduce the pH down to seven again, the molecules will re-nature. They will reform their natural structure, and for DNA that means forming double helixes. But they will do that in a very particular way, in that only strands that exactly match will be able to reform their native structure. A blue strand here will never re-nature with a red strand because their sequences don't match exactly, but a complementary blue strand will always rematch with its partner. Now this is the basis of a physical chemistry process called hybridization. It turns out that this is how we can identify specific DNA sequences; things like DNA fingerprinting and cloning DNA molecules from one organism to another rely very heavily on this principle of re-naturation and hybridization. Hybridization simply means that DNA will re-nature and form a stable double helix only with its particular match, only with the strand that it is perfectly complementary to.
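
The exact-match rule described here can be sketched as a simple check on two strands. This is an idealized illustration of the rule as stated in the lecture, not a model of real annealing chemistry; the function name `can_hybridize` and the example sequences are made up, and both strands are assumed to be written 5' to 3'.

```python
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def can_hybridize(strand_a, strand_b):
    """True only if strand_b is the exact antiparallel complement of strand_a.

    Both strands are written 5' to 3', so strand_b is read backwards
    when it is lined up against strand_a.
    """
    if len(strand_a) != len(strand_b):
        return False
    return all(DNA_PAIRS[a] == b for a, b in zip(strand_a, reversed(strand_b)))

blue = "AGTGCA"
blue_partner = "TGCACT"   # exact complement of blue
red = "TGCATT"            # differs from blue's partner at one position

print(can_hybridize(blue, blue_partner))  # True
print(can_hybridize(blue, red))           # False
```

In the beaker experiment, this is why a blue strand finds its blue partner again and ignores the red strands floating around it.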

That's something about the physical chemistry of DNA, what it looks like, and how it behaves in the simple sense. What I want to spend the rest of the time doing is talking about some of the biological properties of DNA. Again, I know this is something that's familiar to most of you and so indulge me just for the rest of this lecture, I'll go through it. I want to try to hit the points that I think are important to remember because they're going to be concepts that come up again and again throughout the course, and I want to make sure that we're on the same page.

Chapter 4. Central Dogma and DNA Synthesis [00:24:16]

This diagram at the top here is a very familiar one to most of you, it's sometimes called the central dogma of molecular biology. It indicates how information flows in cells and indicates a lot about the work that a cell does in maintaining and recreating itself, and maintaining its environment. That is, that the information needed to operate a cell is stored in its DNA. That information gets put into action through a biological process called transcription, where particular regions of DNA are transcribed into RNA. That RNA is made into proteins, and proteins are the working molecules of the cell: they're enzymes, they're structural molecules, there are proteins that exist in the membrane that allow things to go in and out of the cell, so really they are the working molecules of the cell in every sense. RNA is converted into protein by a process called translation.

Here's another picture of it here, showing it in a little bit more detail, that you have lots of DNA in each of the cells in your body but you're not using all that DNA at any one time. Every cell in your body is only using a fraction of the DNA that's available to it. Cells in your pancreas, for example, are making the protein insulin. They're making that because you need this protein insulin, it's a hormone, and it's important for sugar metabolism in your body. Those cells in your pancreas are making insulin. That means the gene that encodes insulin, the sequence of base pairs that encodes insulin, is being used. I'll talk about what encoding insulin means in a minute, but there's a gene that tells your body what insulin looks like and that gets transcribed but only in those cells that make insulin. It gets converted into a protein, insulin, only in those cells that are able to make the RNA, that are able to express the protein.

Well, it turns out that proteins are essential in driving this process too. In order to have DNA you have to make DNA and your cells are continually making DNA inside your body, through a process of DNA synthesis, and that synthesis is occurring because of the presence of an enzyme, a protein called DNA polymerase. In this same way, this process of transcription which is occurring in cells throughout your body all the time is made possible by a protein called RNA polymerase. It allows RNA to be made from a DNA template. It's not as simple as DNA going to RNA going to protein, because proteins need to be present in order to make these things happen as well.

Let's talk about DNA synthesis for a minute. When a cell divides in your body, when cells of your intestine divide, when cells of your skin divide, and they're doing this all the time, in order for a cell to divide and form two daughter cells--we'll talk about that process next week--but in order for that to happen the parent cell has to copy all of its DNA in order to have enough DNA to pass on to two daughter cells. It does that through a process of DNA synthesis. What happens is the machinery of the cell, largely this protein DNA polymerase, is able to open up the double stranded DNA, to denature it locally, exposing two strands, which allows it to make copies of them.

What's shown here is what's called a replication fork in DNA that's undergoing synthesis. The DNA molecule here has been spread apart, opening up two single stranded DNA's which have complementary base sequences because they were double stranded DNA. A new single stranded DNA is formed on each one of these open single strands. So DNA is replicated using one strand of the DNA as a template. The result of this process if this replication went down the whole length of the DNA would be to form two identical, double stranded DNA molecules. Now the book talks in more detail about this and you can read about it. Polymerase needs a primer and that turns out to be important. A primer is a short RNA sequence or DNA sequence that gets sort of the process of replication jump started, and that's just because of the biological properties of DNA polymerase that that primer's needed.

Synthesis always occurs in one direction and that makes sense to you now because you know there's a directionality and the chemistry is different going one way than the other and this DNA polymerase only works on the chemistry going in one direction. The correct complement is made because of these principles of Watson-Crick base pairing that we talked about before. It's easy to know what nucleotide to put in each position as you're going along and polymerizing a new molecule. Because this process occurs this way, if a parent cell replicates its DNA and then passes them along to two daughter cells, one of the daughter cells has one strand from the parent, the dark blue strand here for example, the other daughter cell has the light blue strand, the complementary strand, and each of the daughter cells has a newly synthesized piece of DNA. That's synthesis and that has to happen in order for cells to replicate and cell replication is happening in your body all the time.
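
The outcome of replication described here, where each daughter keeps one parental strand and gets one new strand, can be sketched as follows. This is a toy illustration only: the names `complement_strand` and `replicate` are invented for the example, strands are plain strings written 5' to 3', and primers, the replication fork and the polymerase itself are left out entirely.

```python
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement_strand(strand_5_to_3):
    """Build the complementary strand by base pairing, written 5' to 3'."""
    return "".join(DNA_PAIRS[base] for base in reversed(strand_5_to_3))

def replicate(double_helix):
    """Copy a double helix given as a pair of complementary strands.

    Each parental strand serves as the template for one new strand, so
    every daughter helix holds one old strand and one newly made strand.
    """
    parent_a, parent_b = double_helix
    daughter_1 = (parent_a, complement_strand(parent_a))  # old strand, new strand
    daughter_2 = (parent_b, complement_strand(parent_b))  # old strand, new strand
    return daughter_1, daughter_2

parent = ("AGTG", "CACT")
first, second = replicate(parent)
print(first)    # ('AGTG', 'CACT')
print(second)   # ('CACT', 'AGTG')
```

Both daughters carry the same pair of sequences as the parent, which is the point of the copying step before a cell divides.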

Transcription is also happening. Certain segments of DNA are being converted into RNA. Whereas in replication you have to copy the whole genome, all of the DNA contained in the chromosomes of the cell, in order to completely replicate it, transcription only works on particular sequences of DNA: the DNA that encodes the proteins that are important to the life of that cell. A cell in the pancreas, for example, needs to make insulin and so the gene for insulin is transcribed.

Transcription just means making a single stranded RNA copy of a sequence of base pairs in a DNA. I told you that that's driven by a protein called RNA polymerase. RNA polymerase is smart, it knows where it needs to go in order to make the copy of RNA that's required. It operates in a similar fashion to DNA polymerase in that it denatures locally or opens up the double stranded DNA, but it's different in that it creates a new polymer from the DNA template in the language of RNA, using RNA nucleotides and not DNA nucleotides. The end result of transcription is not double stranded DNA, it's single stranded RNA where the RNA that's produced is called messenger RNA. It's the transcribed version of DNA, and it's the exact complement of a particular region of DNA. Again, more details in your book if you want to read that.
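As a rough sketch of the copying rule just described, the snippet below builds the RNA complement of a short DNA template. The function name and the example template are assumptions for illustration only, and the strand orientation that a real polymerase respects is ignored here.

```python
# DNA template base -> RNA base that pairs with it.
# RNA uses U (uracil) where DNA would use T (thymine).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template):
    """Return the single stranded RNA complement of a DNA template."""
    return "".join(DNA_TO_RNA[base] for base in template)


rna = transcribe("ACGCGA")  # illustrative six-base template
print(rna)  # UGCGCU
```

The only difference from copying DNA into DNA is the pairing table: an A in the template is matched with U rather than T.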

Well, what I said is not entirely true. That used to be the way that we thought about it. DNA goes to RNA, goes to protein, that's it, and that is the way it happens in simple organisms like bacteria. In complex organisms like humans there's another step that we're still only learning about now. We know some parts of it, we don't know all of it. It's very important in the biological operation of human cells and that step is RNA splicing, or processing of this RNA, single stranded RNA that's produced by transcription. We're going to talk more about this as we go through some specific examples of where RNA processing is important. For now, just think about modifying your picture of this sort of information flow through a cell to include another step that RNA is produced by transcription from DNA, double stranded DNA goes to a single stranded RNA molecule, and that RNA is processed in the cell in some way in order to form messenger RNA.

One of the forms of processing that happens, that's very important in human gene expression is that some of the sections of the DNA molecule are not really necessary for describing the protein. The regions that are necessary for describing what the protein is like are called exons, the regions that are not are called introns. A section of DNA that is responsible for encoding a gene, let's say it's the insulin gene for example, might be some stretch of DNA on a certain chromosome inside your cells, inside the cells of the pancreas. If it was directly transcribed there'd be regions that are important for making insulin and regions that are not. Those regions that are not are spliced out during RNA processing to form the mRNA transcript that's used to make the protein.
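Splicing can be pictured as cutting a string at known positions and joining the pieces that are kept. The sketch below assumes the exon positions are already known; the sequence, the coordinates, and the function name are all made up for illustration and do not come from any real gene.

```python
def splice(pre_mrna, exons):
    """Join the exon segments, given as (start, end) with end excluded.

    Everything outside the listed segments is treated as intron and dropped.
    """
    return "".join(pre_mrna[start:end] for start, end in exons)


# Hypothetical transcript: exon 1 + intron + exon 2.
exon_1 = "AUGGCU"
intron = "GUAAGUCCAG"
exon_2 = "CUCUAA"
pre_mrna = exon_1 + intron + exon_2

# Exon coordinates in the unspliced transcript (assumed known).
exons = [(0, 6), (16, 22)]

mature_mrna = splice(pre_mrna, exons)
print(mature_mrna)  # AUGGCUCUCUAA
```

If the coordinates are wrong by even one base, every codon downstream of the mistake shifts, which is one reason it matters to know exactly where the introns are.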

Now there are other kinds of processing that can happen to RNA as well, and again, as I said, this is really still an emerging science, but this is one that's well known, and you could imagine that it's important. If I want to clone a gene from a human, if I want to clone the gene for human insulin, meaning make many copies of the gene that's responsible for making insulin, I need to know whether there are introns there or not. If I'm going to make insulin from this I have to know that I've got the introns spliced out correctly. That's something that will come up in the lecture tomorrow, so remember that concept and that revision of this sort of classical picture.

I don't want to go through this in detail because I assume that you know it, plus I think it's a little bit easier to read and have some time to digest, but this process of translation or conversion of messenger RNA into a protein is a complicated biological process that's occurring all the time. Before, we talked about how do you know what messenger RNA to make, how do you know what RNA to copy from a DNA template? Well you do that by this Watson-Crick base pairing, so I know if I have (A)(C)(G)(C)(G)(A) I know what messenger RNA to make from that because I have to satisfy these base pairing rules.

Chapter 5. Genetic Code and Protein Synthesis [00:35:15]

It's more complicated in making protein from an RNA strand, and that complication is called the genetic code. You know that messenger RNA is read in three-base units called codons, and so this particular piece of messenger RNA is drawn in this cartoon in three-base units. That's because every three bases describe an amino acid in a protein. There are only four different bases that make up either RNA or DNA, and so the complexity of an RNA polymer is limited: it's only got one of four possible choices at each position. There are more than 20 amino acids that make up the biological polymers called proteins, so there are more than 20 choices of amino acid at each position on a protein.

Why are there three bases in a codon? Because it takes three units, where there's only four choices at each position, to have at least 20 unique combinations. How many combinations of codons are there if there's three bases and four possibilities at each base? 4 x 4 x 4 possibilities, because I could have (A), (G), (C), or (U) here, (A), (G), (C), or (U) here, and (A), (G), (C), or (U) here. If I only had two bases per codon I wouldn't have enough: 4 x 4 = 16, so I'd only have 16 possible two-base sequences, and that's not enough to specify over 20 amino acids. If I have three, I have 16 x 4 or 64 possible choices, way more than enough. That creates a problem in the genetic code in that there's 64 possible sequences but there's only 20-some amino acids, so each amino acid can be specified by more than one codon. There are 64 combinations of three bases and I only need to describe 20 or so, so there's combinations to spare.
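The counting argument can be checked by simply listing every possible sequence of a given length. The snippet below is a small sketch using Python's standard `itertools.product`; the variable names are arbitrary.

```python
from itertools import product

BASES = "AGCU"  # the four RNA bases

# Count every possible sequence of one, two, and three bases.
counts = {}
for length in (1, 2, 3):
    sequences = ["".join(seq) for seq in product(BASES, repeat=length)]
    counts[length] = len(sequences)
    print(length, "bases per codon:", counts[length], "possible sequences")

# Two bases give 16 sequences, fewer than 20; three give 64, more than enough.
```

The count is 4 raised to the codon length, so three is the shortest length that reaches 20.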

This table here shows you how biological translation takes place whenever a three-base sequence is identified. Say it's (G)(C)(U); that specifies an amino acid. How would I know what amino acid that is? Well, I could look it up in this table because somebody's figured it out for you. (G) in the first position, (C) in the second position, (U) in the third: (G)(C)(U) right here is alanine, and that's the amino acid that's in that position in the protein. You could read through this sequence and you could figure out what the sequence of amino acids would be.

The genetic code is said to be degenerate because I can only read it in one direction. I can read (G)(C)(U) as alanine, for example, from this table, and if I see a (G)(C)(U), I know it has to be alanine. If I have the protein and I want to say what the messenger RNA that produced that protein looked like, I can't go backwards because there's more than one possibility for alanine, right? There's four of them right here, (G)(C)(U), (G)(C)(C), (G)(C)(A), (G)(C)(G), so I don't know exactly what gene that came from; I can't read backwards. That has to do with the statistics of this, right? There's just more sequences in a three-unit codon than I need for the amino acids.
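The table lookup, and the reason it cannot be run backwards, can be sketched as a dictionary. The layout below is the standard genetic code written with one-letter amino acid symbols ("A" is alanine, "L" is leucine, "M" is methionine, "*" is a stop signal); the compact string encoding, the function names, and the example message are choices made for this sketch rather than anything shown in the lecture.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon, in UCAG order.
# "*" marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Read an mRNA three bases at a time until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def codons_for(amino_acid):
    """Reverse lookup: every codon that specifies the given amino acid."""
    return [c for c, aa in CODON_TABLE.items() if aa == amino_acid]


print(CODON_TABLE["GCU"])         # forward lookup gives one answer: A
print(codons_for("A"))            # backward lookup gives four candidates
print(translate("AUGGCUCUCUAA"))  # illustrative message, reads as MAL
```

Reading forward, each codon maps to exactly one amino acid. Reading backward, alanine alone has four candidate codons, so a protein sequence does not pin down a unique messenger RNA.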

How does translation occur biologically? As shown in this cartoon here, and again you don't need to know the details of this, but if you're interested in knowing what's the biological basis of the genetic code, this is it. Inside cells in your body there are special RNA molecules called transfer RNA. They are RNA molecules, but at some points in their life span they have amino acids attached to them. For example, this transfer RNA has a unit here, (G)(A)(G), at one end of the transfer RNA molecule. At the other end is attached the amino acid leucine. When your cells make a transfer RNA molecule that has (G)(A)(G) here, they only put leucine at the other end. That's the physical basis of the genetic code, because when a messenger RNA sequence is being translated one codon at a time, and the sequence (C)(U)(C) appears in the messenger RNA, that sequence (C)(U)(C) can only bind with one particular three-base complement: it has to be the complement (G)(A)(G). This is the codon, this is the anti-codon, there's only one anti-codon that matches this one, and that anti-codon always occurs in a molecule that has leucine attached to the other side.
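The codon and anti-codon matching can be sketched as one more complement lookup followed by a second lookup for the attached amino acid. The two-entry transfer RNA dictionary and the function names are hypothetical, and the bases are matched position by position as in the cartoon, without modeling strand orientation.

```python
# RNA-to-RNA pairing: A pairs with U, G pairs with C.
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Hypothetical, partial lookup: anti-codon -> amino acid carried by that tRNA.
TRNA_CARGO = {
    "GAG": "leucine",
    "CGA": "alanine",
}


def anticodon_for(codon):
    """Return the three-base complement that can pair with this codon."""
    return "".join(RNA_PAIR[base] for base in codon)


def amino_acid_delivered(codon):
    """Find the tRNA whose anti-codon matches and report what it carries."""
    return TRNA_CARGO[anticodon_for(codon)]


print(anticodon_for("CUC"))         # GAG
print(amino_acid_delivered("CUC"))  # leucine
```

In this picture the genetic code is not stored anywhere as a table; it is the combined effect of which anti-codon pairs with each codon and which amino acid the cell attaches to each transfer RNA.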

Translation occurs by a special kind of polymerization where these transfer RNAs operate by Watson-Crick base pairing. They bring into proximity an amino acid so that instead of forming a new polymer of a nucleic acid, a polymer of amino acids is formed. A polymer of amino acids is a protein. Again, I'm just trying to highlight things you already know a little bit about; the book describes the details. The details aren't particularly important for us to know here, but the fact that messenger RNA gets converted into a protein of a specific composition through a biological process called translation is important.

Chapter 6. Control of Gene Expression [00:41:55]

The last thing I want to talk about today is control of gene expression. Control of gene expression is a very big topic and so I'm going to show you one cartoon to sort of tell you that it is a big topic that's really important. Why is control of gene expression important? Well, I've mentioned several times that all the cells in my body, all the cells in your body, have essentially the same genomic chromosomal DNA in their nucleus. If you looked at cells in my pancreas and cells in my brain, and cells in my skin, they all have the same DNA. But skin cells and brain cells and pancreas cells aren't doing the same things. They don't look the same, they don't behave the same, they don't perform the same biological functions. Why? Because brain cells and pancreas cells are expressing different proteins. All the cells in your body have the capability of making all the proteins that you make, but they're not all made in every cell. Only cells of your pancreas make insulin, for example. Only cells in your brain make the enzymes that produce certain neurotransmitters that are responsible for brain function.

How do cells in your brain know which proteins they ought to be making, and how do cells in the pancreas know which proteins they ought to be making? They do that because they can control the expression of genes. Gene expression, for us, will mean the same thing as production of a particular protein. When a gene gets expressed, that means its protein is produced. When we talk about gene expression, then, we're talking about this whole sequence of events I just described: transcription, RNA processing, translation to make the protein. All those things have to happen in an orderly fashion, and in enough quantity, in order for a particular cell to make a protein.

To make insulin, for example, the cells of your pancreas have to be transcribing that gene, it has to be processed, and it has to be translated into the protein insulin. But that's not all. That protein is made in the form of a long polypeptide that's not always the final version of the protein. In fact, for insulin, it's not the final version of the protein that comes out of translation. There are more steps that have to happen correctly in order for that insulin to become active. Those steps are called post-translational modifications. It's a long word that just means other chemistry that happens on the molecule after translation. It turns out that the kinds of post-translational modifications that human cells are able to do are very complicated. Your cells are capable of doing many post-translational modifications. Bacteria, or simple organisms, are not always capable of that. Now that's going to be important when we talk about making human proteins inside alternate hosts like bacteria, because they can't do all the things that your cells can do.

How is gene expression controlled? It can be controlled in a variety of ways. The most basic control is by controlling when transcription happens. It turns out that there's a whole biology associated with this, including molecules that are floating around inside your cells called transcription factors. They are molecules that know about particular genes and some of their sequences, and are able to turn on those genes inside cells, to make them transcribe. Gene expression also requires RNA processing to happen smoothly, so if you can interfere with any stage in RNA processing you can stop a gene from being expressed, and this is a very hot topic in molecular biology now and in human therapeutics.

You've heard about RNA interference, for example, and that is a process for stopping a gene from being expressed. For example, a gene that makes a cell cancerous, I'd like to stop it from being expressed. And you could interfere with translation by degrading messenger RNA, for example. If you had a way to specifically chew up all the RNA molecules that are responsible for making a particular protein, you could stop it from being expressed even though your cell is trying to make it. I want you to look at this picture, read the little bit about control of gene expression that's in the book, and know that it's a big topic. We're not going to talk about it, except we're going to talk about some examples where control of gene expression can be exploited in order to treat diseases, for example. So I'll see you on Thursday.

[end of transcript]

Assignment

Biomedical Engineering: Bridging Medicine and Technology, in preparation by Mark Saltzman (forthcoming by Cambridge University Press); chapter 3

Summary and Key Concepts: Chapter 3

Resources

Summary and Key Concepts: Chapter 3 [PDF]
