WEBVTT 00:01.820 --> 00:02.900 Prof: Very good. 00:02.900 --> 00:06.390 So today we're going to talk about phylogenetics and 00:06.390 --> 00:10.360 systematics, and the lecture has this kind of structure. 00:10.360 --> 00:14.020 I'll remind you of what the Tree of Life looks like. 00:14.020 --> 00:17.770 Then I will motivate the lecture basically by giving you 00:17.765 --> 00:21.915 some surprising recent results from molecular systematics, 00:21.920 --> 00:25.660 and then I will go into basically what phylogenetic 00:25.660 --> 00:29.550 concepts are and how to build a phylogenetic tree. 00:29.550 --> 00:33.360 I won't do this in great detail, but I hope I do enough 00:33.362 --> 00:37.742 of it so that you at least have a good feel for the issues that 00:37.739 --> 00:40.139 are involved when you do this. 00:40.140 --> 00:44.540 So this is the same picture of the Tree of Life that I used 00:44.542 --> 00:49.112 earlier in the course, and basically it shows you that 00:49.105 --> 00:52.195 since about 3.5 billion years ago, 00:52.200 --> 00:55.360 there have been three large clades that have developed. 00:55.360 --> 01:20.720 01:20.720 --> 01:23.410 Take a moment and look at this picture, 01:23.409 --> 01:25.709 and look at the next one--which is a-- 01:25.709 --> 01:33.809 this next one is essentially a blowup of what's been going on 01:33.810 --> 01:39.720 since about here-- and think about how much that 01:39.715 --> 01:42.365 tells us about biology. 01:42.370 --> 01:45.390 It provides a very basic structure, the structure of 01:45.391 --> 01:46.341 relationships. 01:46.340 --> 01:49.750 It tells you which things shared common ancestors, 01:49.753 --> 01:54.003 and why we might expect them to be one way rather than another 01:54.001 --> 01:54.561 way. 01:54.560 --> 01:58.230 It sets up thousands of comparisons in our minds about 01:58.230 --> 02:00.310 questions that we might ask.
02:00.310 --> 02:04.110 It provides an extremely useful, overarching structure. 02:04.108 --> 02:08.628 But the question is, how did phylogenetic biologists 02:08.634 --> 02:13.964 actually get this picture, and are they still changing it? 02:13.960 --> 02:17.220 And the answer is: they got it with the methods of 02:17.217 --> 02:19.877 inference that I'm going to sketch today, 02:19.877 --> 02:22.137 and they're still changing it. 02:22.139 --> 02:26.059 It's not written in stone and it's been changing ever since 02:26.058 --> 02:29.368 the first time somebody tried to write it down. 02:29.370 --> 02:33.650 So these things are working hypotheses, 02:33.650 --> 02:39.110 and we are able to get a better and better refinement of them as 02:39.111 --> 02:43.591 new information comes in, but there are significant 02:43.591 --> 02:44.331 changes. 02:44.330 --> 02:50.440 Now this is what Darwin had to say in the Origin about 02:50.436 --> 02:52.366 the Tree of Life. 02:52.370 --> 02:56.320 It's a truly wonderful fact, the wonder of which we are apt 02:56.319 --> 02:59.669 to overlook from familiarity, that all animals and all 02:59.669 --> 03:01.619 plants, throughout all time and space, 03:01.620 --> 03:04.190 should be related to each other in groups. 03:04.188 --> 03:07.208 And he goes on about how these groups are hierarchical. 03:07.210 --> 03:10.290 He said, "The affinities of all the beings of the same 03:10.293 --> 03:13.273 class have sometimes been represented by a great tree. 03:13.270 --> 03:17.720 I believe this simile is largely the truth." 03:17.720 --> 03:22.500 So in this beautiful Victorian prose, Darwin rhapsodizes a bit 03:22.503 --> 03:26.663 about the Tree of Life, and it was the only picture he 03:26.661 --> 03:28.231 had in his book. 03:28.229 --> 03:28.419 Okay? 03:28.420 --> 03:29.840 So he thought it was real important. 
03:29.840 --> 03:33.610 He made a sketch of it, and by it, by this sketch, 03:33.610 --> 03:38.810 he meant to indicate that a lot of things had gone extinct and 03:38.812 --> 03:44.272 that through ancestry and shared common ancestry you could define 03:44.270 --> 03:45.720 relationship. 03:45.720 --> 03:49.180 Now he could see immediately that the tree isn't given, 03:49.181 --> 03:50.851 it should be discovered. 03:50.848 --> 03:53.798 He said, "Our classifications will come to be, 03:53.800 --> 03:56.810 as far as they can be so made, genealogies;" 03:56.810 --> 04:00.290 So they will reflect what actually happened in terms of 04:00.293 --> 04:03.393 evolutionary history to generate relationship; 04:03.389 --> 04:07.229 "and then they will truly give what may be called the plan 04:07.231 --> 04:08.101 of creation. 04:08.098 --> 04:11.338 The rules for classifying will no doubt become simpler when we 04:11.337 --> 04:13.087 have a definite object in view. 04:13.090 --> 04:16.420 We possess no pedigree or armorial bearings." 04:16.420 --> 04:16.750 Okay? 04:16.745 --> 04:20.585 That's mid-nineteenth century code for there's no barcode on 04:20.591 --> 04:21.571 the species. 04:21.569 --> 04:21.809 Okay? 04:21.812 --> 04:24.772 They're not wearing their name on the forehead and they're not 04:24.774 --> 04:26.624 telling you who they're related to. 04:26.620 --> 04:31.420 So we have to discover and trace these diverging lines. 04:31.420 --> 04:38.270 Well it actually has taken a long time for phylogenetics to 04:38.274 --> 04:43.244 settle on clear logic and clear methods. 04:43.240 --> 04:49.650 Up until about oh somewhere around 19-- 04:49.649 --> 04:53.899 the ideas were there but they weren't really implemented until 04:53.904 --> 04:57.294 about 1965 to 1970, and there ensued a huge 04:57.291 --> 05:01.161 controversy that lasted for about twenty years. 
05:01.160 --> 05:05.740 And now that all--the dust has settled and that seems to be in 05:05.738 --> 05:08.498 the past, but many people actually my age 05:08.500 --> 05:12.480 are still marked by it, because we witnessed it. 05:12.480 --> 05:16.450 And I'm now going to more or less summarize what came out of 05:16.454 --> 05:19.414 it, but I just want to signal to 05:19.408 --> 05:23.598 you that this was, in the not very distant past, 05:23.596 --> 05:27.266 an extremely controversial area of science, 05:27.269 --> 05:31.089 and that it was revolutionized both by the advent of DNA 05:31.089 --> 05:34.799 sequence data, and by the development of 05:34.800 --> 05:40.570 powerful mathematical and computer methods for determining 05:40.569 --> 05:43.729 relationship; and there was an argument about 05:43.733 --> 05:45.033 which ones we should use. 05:45.029 --> 05:50.669 So the first step in that is taken by a guy named Zimmermann, 05:50.670 --> 05:54.810 and it's just sketched here, and it seems very simple: 05:54.812 --> 05:59.582 Sharing a more recent common ancestor defines relationship. 05:59.579 --> 06:02.819 So magnolias and apple trees are more closely related to each 06:02.817 --> 06:04.757 other than either is to a ginkgo, 06:04.759 --> 06:10.669 because this point on the tree is later in time, 06:10.670 --> 06:13.750 and connects them, than this point on the tree, 06:13.750 --> 06:15.900 which connects them to ginkgoes. 06:15.899 --> 06:17.809 It seems like a very simple idea. 06:17.810 --> 06:21.700 It wasn't really clearly articulated until Zimmermann 06:21.704 --> 06:24.554 laid it down as a principle in 1931. 06:24.550 --> 06:28.260 Now some of these concepts you've already been through. 06:28.259 --> 06:31.909 So I'm going to simply mention these names again, 06:31.911 --> 06:36.551 just so that they get repeated and start becoming part of your 06:36.553 --> 06:38.003 own vocabulary.
06:38.000 --> 06:41.800 Monophyletic is a group that contains all of the descendants 06:41.800 --> 06:45.150 from a single common ancestor and nothing that is not 06:45.151 --> 06:47.731 descended from that common ancestor; 06:47.730 --> 06:51.370 paraphyletic groups are groups that do not contain all of the 06:51.370 --> 06:54.830 species descended from the most recent common ancestor; 06:54.829 --> 06:58.719 and polyphyletic groups really are a hodgepodge of stuff that 06:58.716 --> 07:01.046 shouldn't be in that sort of bin, 07:01.050 --> 07:05.610 under that category at all, because they have-- 07:05.610 --> 07:08.020 basically independent evolutionary lines are being 07:08.016 --> 07:09.486 incorrectly lumped together. 07:09.490 --> 07:13.110 07:13.110 --> 07:17.230 The basis of a lot of this inference is the concept of 07:17.226 --> 07:19.646 homology, and homology and analogy--I'll 07:19.648 --> 07:22.328 repeat them in a minute when we get to some slides that 07:22.327 --> 07:24.707 illustrate them-- but they were defined by 07:24.711 --> 07:27.441 Richard Owen in the early nineteenth century, 07:27.439 --> 07:29.949 before Darwin's Origin of Species. 07:29.949 --> 07:32.859 He was a great morphologist, in London, 07:32.860 --> 07:36.340 and basically the idea of homology is that a trait is 07:36.336 --> 07:40.546 identical in two or more species because they are descended from 07:40.548 --> 07:42.018 a common ancestor. 07:42.019 --> 07:45.499 So they got it because their ancestor had it. 07:45.500 --> 07:47.730 And homoplasy, or convergence, 07:47.733 --> 07:52.203 is similarity for any reason other than common ancestry. 07:52.199 --> 07:57.079 So convergence in morphological traits, mutation to the same 07:57.076 --> 08:00.626 sequence, in DNA, will lead to homoplasy. 08:00.629 --> 08:05.949 So homology is helpful, and homoplasy is confusing in 08:05.949 --> 08:08.609 determining phylogenies. 08:08.610 --> 08:11.200 So here's a good monophyletic group. 
08:11.199 --> 08:13.759 Okay? This is the dogs. 08:13.759 --> 08:16.259 And here the dogs are. 08:16.259 --> 08:19.919 Here's Canis and Lycaon--so the Lycaon is the African Hunting 08:19.922 --> 08:23.892 Dog--and their closest relatives are the South American wolves. 08:23.889 --> 08:27.439 Canis, by the way, Canis--the timber wolf is in 08:27.435 --> 08:31.075 the genus Canis, and all domestic dogs are 08:31.077 --> 08:34.637 descended from wolves, which is a nice example of 08:34.640 --> 08:36.630 really rapid evolution; take a wolf, 08:36.633 --> 08:39.063 turn it into a Saint Bernard and a Chihuahua. 08:39.059 --> 08:40.239 I give you 5000 years. 08:40.240 --> 08:40.980 Can you do it? 08:40.980 --> 08:43.910 Well it takes pretty tough breeding to do that, 08:43.908 --> 08:44.798 but you can. 08:44.798 --> 08:47.118 And, by the way, I had a colleague--Armand 08:47.120 --> 08:50.520 Kuris, who's at Santa Barbara, decided he was going to create 08:50.517 --> 08:52.797 his own race; he wanted to create the ugliest 08:52.802 --> 08:53.572 dog in the world. 08:53.570 --> 08:56.250 He was going to name it the Louisiana Swamp Dog. 08:56.250 --> 08:58.270 It took him six generations. 08:58.269 --> 08:59.529 I mean, this dog is really ugly. 08:59.529 --> 09:00.329 09:00.330 --> 09:03.340 But it's registered with the American Kennel Club. 09:03.340 --> 09:05.230 So you can do rapid breeding with dogs. 09:05.230 --> 09:07.740 That's, of course, on a timescale up here that you 09:07.735 --> 09:10.695 couldn't even fit onto that little white line at the top of 09:10.701 --> 09:11.521 the picture. 09:11.519 --> 09:16.659 Interestingly, all this stuff over here is 09:16.657 --> 09:17.907 extinct. 09:17.908 --> 09:19.938 There are a few little marks here that are kind of 09:19.940 --> 09:20.520 interesting.
09:20.519 --> 09:24.629 Out here, on the branch going up to the Caninae, 09:24.629 --> 09:26.969 is digitigrady, which means that at that point 09:26.970 --> 09:30.090 they started chasing after things that were running fast, 09:30.090 --> 09:31.960 and evolution, just like it did with horses, 09:31.960 --> 09:35.830 started to cause them to be selected to run on their toes. 09:35.830 --> 09:39.500 So they got longer legs by running on their toe tips, 09:39.495 --> 09:42.945 rather than on the pads of their feet directly. 09:42.950 --> 09:46.620 And bone cracking came in right about here, so they could get 09:46.619 --> 09:48.209 the marrow out of bones. 09:48.210 --> 09:50.000 That's a good monophyletic group. 09:50.000 --> 09:51.840 You probably can't see it up there. 09:51.840 --> 09:53.710 I'll give you a little bit of timeline. 09:53.710 --> 09:57.680 The whole thing starts taking off about 40,000,000 years ago, 09:57.681 --> 09:58.741 in the Eocene. 09:58.740 --> 10:03.640 And Canis, the dog genus itself, is about 5,000,000 years 10:03.644 --> 10:05.584 old; it's about as old as Homo. 10:05.580 --> 10:09.640 10:09.639 --> 10:12.129 Okay, here are some paraphyletic groups. 10:12.129 --> 10:15.559 10:15.559 --> 10:17.969 Reptiles is paraphyletic. 10:17.970 --> 10:18.400 Okay? 10:18.399 --> 10:22.359 It's got turtles, lizards and crocodiles in it, 10:22.359 --> 10:25.199 but it doesn't have the birds. 10:25.200 --> 10:28.810 So reptiles is an inaccurate term. 10:28.808 --> 10:31.398 This is the monophyletic group--lizards, 10:31.403 --> 10:35.333 crocodiles and birds--and we don't have an everyday language 10:35.326 --> 10:36.386 term for it. 10:36.389 --> 10:40.029 Crocodiles, by the way, build nests and guard them, 10:40.029 --> 10:42.119 and when the baby crocodiles hatch out, 10:42.120 --> 10:44.880 they cheep like birds: "cheep, 10:44.875 --> 10:46.005 cheep." 
10:46.009 --> 10:51.459 So there are--this relationship between crocs and birds is well 10:51.458 --> 10:52.688 established. 10:52.690 --> 10:55.050 Here's a polyphyletic group. 10:55.048 --> 10:59.028 If we define a group called the homoeothermic tetrapods, 10:59.027 --> 11:01.917 it contains the birds and the mammals. 11:01.918 --> 11:06.718 But look at all of the things that are more closely related to 11:06.721 --> 11:09.321 the birds than the mammals are. 11:09.320 --> 11:11.760 So if we were to define the birds and the mammals as a 11:11.755 --> 11:13.845 group, it would be a false group, 11:13.850 --> 11:17.730 because phylogenetically the birds have many things which are 11:17.727 --> 11:21.087 more closely related to them than the mammals do. 11:21.090 --> 11:22.730 And this group is polyphyletic. 11:22.730 --> 11:22.990 Okay? 11:22.986 --> 11:25.506 It has contributions from two different sources. 11:25.509 --> 11:28.539 Another good example of a polyphyletic group would be if 11:28.542 --> 11:31.852 you decided to link together all of the things that look like 11:31.849 --> 11:34.109 cactuses in Africa and South America. 11:34.110 --> 11:37.830 The ones in South America are cactuses, but the ones in Africa 11:37.833 --> 11:41.053 are euphorbs; they look just like cactuses. 11:41.048 --> 11:43.148 There's a nice example in the Peabody Museum. 11:43.149 --> 11:45.759 You can look at them sitting next to each other. 11:45.759 --> 11:46.229 Okay? 11:46.226 --> 11:48.926 That's a polyphyletic group. 11:48.928 --> 11:53.058 They are convergent, they came together. 11:53.058 --> 11:56.538 Similarly, the homeothermy in the birds and mammals is 11:56.544 --> 11:57.404 convergent. 11:57.399 --> 12:01.449 The ancestor back here did not have warm blood. 12:01.450 --> 12:04.760 It evolved twice, once in the line to mammals and 12:04.764 --> 12:06.704 once in the line to birds. 
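To make the three group definitions concrete, here is a minimal Python sketch that tests whether a named group is monophyletic on a toy tetrapod tree. The nested-tuple tree, its exact branching order, and the helper names `leaves` and `is_monophyletic` are illustrative assumptions and are not taken from the lecture slides. The only relationships borrowed from the lecture are that lizards, crocodiles and birds form a monophyletic group and that mammals sit outside it.

```python
# Toy rooted tree written as nested tuples; each leaf is a string.
# The topology is an illustrative assumption, not the lecture's figure.
TREE = ("mammals", ("turtles", ("lizards", ("crocodiles", "birds"))))


def leaves(node):
    """Return the set of leaf names at or below a node."""
    if isinstance(node, str):
        return {node}
    found = set()
    for child in node:
        found |= leaves(child)
    return found


def is_monophyletic(tree, group):
    """True if some node's complete leaf set equals the group exactly."""
    group = set(group)
    if leaves(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, group) for child in tree)


reptiles = {"turtles", "lizards", "crocodiles"}  # leaves out the birds
lizards_crocs_birds = {"lizards", "crocodiles", "birds"}
warm_blooded = {"birds", "mammals"}

for name, group in [
    ("reptiles", reptiles),
    ("lizards + crocodiles + birds", lizards_crocs_birds),
    ("birds + mammals", warm_blooded),
]:
    print(name, "->", is_monophyletic(TREE, group))
```

On this toy tree only the lizards, crocodiles and birds group passes. The check by itself does not separate paraphyletic from polyphyletic failures; telling those apart also needs information about whether the trait that defines the group was inherited from the common ancestor or arose more than once.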
12:06.700 --> 12:09.030 I know there were some warm-blooded dinosaurs, 12:09.030 --> 12:10.120 but that was later. 12:10.120 --> 12:13.380 Warm-bloodedness probably came in, in this line, 12:13.375 --> 12:17.375 about here somewhere; don't know exactly. 12:17.379 --> 12:23.489 Then this central concept, homology. 12:23.490 --> 12:26.630 Here are the forelimbs of turtles, humans, 12:26.634 --> 12:29.094 horses, birds, bats and seals. 12:29.090 --> 12:33.720 So we've got some stuff here that's spanning quite a bit of 12:33.721 --> 12:38.091 the tetrapods; vertebrates that are living on 12:38.091 --> 12:38.711 land. 12:38.710 --> 12:44.220 And you can see that it's possible to match up sections of 12:44.221 --> 12:48.091 these structures, all the way through. 12:48.090 --> 12:52.550 And actually if you study that, and you realize that they were 12:52.548 --> 12:55.618 all in an ancestral condition together, 12:55.620 --> 12:58.760 you can see how evolution has changed their proportions, 12:58.759 --> 13:02.129 changed their thickness, but it hasn't changed their 13:02.129 --> 13:04.639 spatial relationships to each other. 13:04.639 --> 13:05.969 And, in fact, if you go through the 13:05.971 --> 13:07.991 development, you'll discover that the same 13:07.985 --> 13:10.735 nerves coming out of the backbone are running to the same 13:10.738 --> 13:13.538 parts of the limb, and all of those conditions 13:13.541 --> 13:16.661 have been held together, over evolutionary time. 13:16.658 --> 13:19.468 And if you look at the HOX genes that are controlling their 13:19.471 --> 13:20.981 development, you can see, 13:20.977 --> 13:24.027 as you saw in a lecture a little bit earlier, 13:24.028 --> 13:26.728 that the DNA sequences in the HOX genes, 13:26.730 --> 13:29.800 that are telling it whether to make a humerus, 13:29.798 --> 13:34.618 a radius, an ulna or digits, are actually homologous in 13:34.620 --> 13:37.210 their DNA sequence as well.
13:37.210 --> 13:41.470 So there's a molecular homology that underlies the morphological 13:41.466 --> 13:42.206 homology. 13:42.210 --> 13:45.010 And if you look at molecular sequences, 13:45.009 --> 13:47.559 here's a gene called aniridia in humans, 13:47.558 --> 13:50.578 and a gene called eye-less in fruit flies, 13:50.580 --> 13:54.330 and only six--these are not DNA sequences, 13:54.330 --> 13:58.540 this is protein sequence; so these are amino acids--only 13:58.542 --> 14:01.452 six of the sixty amino acids are different. 14:01.450 --> 14:03.910 The two sequences are 90% identical. 14:03.908 --> 14:07.028 There are search algorithms, like BLAST, that go out and 14:07.028 --> 14:09.238 look for these sorts of similarities. 14:09.240 --> 14:11.750 So that if you get a candidate gene or a candidate protein 14:11.750 --> 14:14.080 sequence, it's possible simply to put a 14:14.076 --> 14:16.356 search term in, to a search engine, 14:16.363 --> 14:19.483 and have other genes from other species pop up. 14:19.480 --> 14:21.410 So you can look for molecular homology that way. 14:21.409 --> 14:25.719 14:25.720 --> 14:31.000 A good molecular homology is the fruit fly homeobox complex 14:31.000 --> 14:35.330 and the human HOX complex-- we talked about this 14:35.325 --> 14:40.715 earlier--where the sequence of the genes along the chromosome, 14:40.720 --> 14:44.500 and the parts of the body that are being controlled by those 14:44.504 --> 14:48.624 genes developmentally, are similar in humans and in 14:48.623 --> 14:52.043 fruit flies, and actually unite everything 14:52.035 --> 14:53.445 that you see here. 14:53.450 --> 14:56.570 So this sort of thing is a signal of shared ancestry, 14:56.570 --> 15:00.110 and it's the kind of molecular information used to construct 15:00.110 --> 15:01.850 the broader Tree of Life. 15:01.850 --> 15:04.870 So this is something that is linking together arthropods, 15:04.865 --> 15:07.445 annelids, mollusks, echinoderms and chordates. 
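The 90% figure quoted for the two eye-development proteins is a simple percent-identity calculation: 54 matching positions out of 60. A minimal Python sketch of that arithmetic follows. The function name `percent_identity` and the placeholder sequences are assumptions made for illustration; they are not the real aniridia or eyeless sequences, and the sketch assumes the two sequences are already aligned, ungapped and of equal length, which is much simpler than what a search tool such as BLAST does.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of positions at which two aligned sequences match.

    Assumes the sequences are already aligned, ungapped and of equal
    length. This is only the arithmetic behind the quoted figure, not
    a similarity search.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Placeholder 60-residue strings that differ at exactly 6 positions.
# They are made up and are not the real protein sequences.
human_like = "A" * 60
fly_like = "A" * 54 + "G" * 6

print(percent_identity(human_like, fly_like))
```

With 6 differences in 60 positions the function returns 90.0, which matches the figure given in the lecture.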
15:07.450 --> 15:12.030 15:12.029 --> 15:13.639 Now analogy. 15:13.639 --> 15:18.689 Analogy or convergence is a misleading kind of information, 15:18.690 --> 15:23.320 because that means that natural selection has taken things that 15:23.317 --> 15:28.457 were evolutionarily independent, and which have sister groups, 15:28.457 --> 15:32.187 have relatives, that don't look anything like 15:32.187 --> 15:34.877 this, and then shaped both of those 15:34.881 --> 15:38.001 things to come together to a common form. 15:38.000 --> 15:42.040 So the dolphin and the ichthyosaur have a very similar 15:42.043 --> 15:44.893 fusiform body, and this is because of strong 15:44.892 --> 15:48.322 selection to swim rapidly in the ocean and to chase down fish and 15:48.317 --> 15:51.347 squid; which they both did. 15:51.350 --> 15:55.110 And the analogy goes deeper than that. 15:55.110 --> 15:59.250 As you probably know, the dolphin has live birth, 15:59.248 --> 16:00.798 it's viviparous. 16:00.799 --> 16:02.659 So is the ichthyosaur. 16:06.860 --> 16:09.510 in Germany, and look at the world's largest collection of 16:09.506 --> 16:13.246 pregnant ichthyosaurs, you can see an ichthyosaur that 16:13.250 --> 16:17.710 was giving birth at the point when it was fossilized. 16:17.710 --> 16:23.170 And they often had twins, or triplets. 16:23.168 --> 16:28.798 So the definition of analogy is two things that look very much 16:28.798 --> 16:31.638 alike, even though they have many 16:31.638 --> 16:36.038 relatives that look quite different and are distant on the 16:36.041 --> 16:36.661 tree. 16:36.658 --> 16:40.298 So the dolphin is more closely related to a kangaroo than it is 16:40.304 --> 16:43.264 to an ichthyosaur, and an ichthyosaur is more 16:43.259 --> 16:47.099 closely related to a hummingbird than it is to a dolphin; 16:47.100 --> 16:48.480 nevertheless they look similar. 16:48.480 --> 16:49.570 So that's analogy. 
16:49.570 --> 16:52.930 16:52.928 --> 16:59.468 So once people got their DNA sequences and their logic under 16:59.470 --> 17:02.300 control, and they started doing a lot of 17:02.302 --> 17:04.702 molecular systematics, they discovered some 17:04.700 --> 17:07.290 relationships that were kind of surprising, 17:07.288 --> 17:09.738 because analogy, convergence, 17:09.740 --> 17:12.980 had been covering up relationship, 17:12.980 --> 17:17.260 or because evolution had so radically changed the external 17:17.255 --> 17:21.425 appearance of these creatures, that it was very difficult to 17:21.430 --> 17:23.190 see who they were related to. 17:23.190 --> 17:25.520 Here are a few of these insights. 17:25.519 --> 17:29.399 I'll bet that if I ask Alex or Jeremy or Katie or the other 17:29.396 --> 17:32.736 teaching fellows if they've got a favorite one, 17:32.740 --> 17:35.680 that they could probably come up with others as well. 17:35.680 --> 17:37.800 Maybe I will in a minute; so start thinking. 17:37.799 --> 17:39.859 Okay? 17:39.858 --> 17:44.208 Pentastomids were a mysterious group of creatures, 17:44.212 --> 17:49.102 and they turn out to be closely related to fish lice. 17:49.098 --> 17:52.038 I'll show you a pentastomid in a minute. 17:52.038 --> 17:54.528 There used to be a group called carnivorous plants, 17:54.529 --> 17:56.109 the pitcher plants and the sundews, 17:56.108 --> 17:58.208 and people thought that the pitcher plants and the sundews 17:58.214 --> 17:58.774 were related. 17:58.769 --> 18:02.399 These are plants that are adapted to living under very low 18:02.403 --> 18:05.703 nitrogen conditions, and they need nitrogen to make 18:05.696 --> 18:08.566 all of their proteins, and they get it by killing 18:08.568 --> 18:09.978 insects and other things. 18:09.980 --> 18:13.240 Some of them can even kill a small frog. 
18:13.240 --> 18:17.140 So it was thought that it was likely that this was a natural 18:17.144 --> 18:20.914 group and that they had all evolved these capacities in an 18:20.914 --> 18:23.484 ancestral condition, and that they were all related 18:23.482 --> 18:23.942 to each other. 18:23.940 --> 18:24.510 But they're not. 18:24.509 --> 18:27.969 It's happened several times. 18:27.970 --> 18:30.830 It wasn't really clear where the whales had come from. 18:30.828 --> 18:34.708 One might have thought that well maybe whales are related to 18:34.711 --> 18:37.931 seals, or perhaps they're related to other aquatic 18:37.934 --> 18:39.584 mammals, like otters. 18:39.578 --> 18:41.398 But seals and otters are carnivores, 18:41.400 --> 18:44.730 and it turns out that whales, including the toothed whales, 18:44.730 --> 18:47.550 the active carnivorous dolphins and sperm whales, 18:47.549 --> 18:49.729 are ungulates. 18:49.730 --> 18:50.010 Okay? 18:50.012 --> 18:53.472 So there was an ungulate that used to go around eating plants and 18:53.472 --> 18:56.652 it went into the water and it started to eat all kinds of 18:56.648 --> 18:57.668 other things. 18:57.670 --> 18:58.040 Okay? 18:58.038 --> 19:01.808 It stopped eating plants and it started eating fish, 19:01.805 --> 19:05.475 squid; and some of them eat a lot of 19:05.479 --> 19:08.799 crustacea, if they filter feed. 19:08.799 --> 19:10.469 Sycamores. 19:10.470 --> 19:14.190 Sycamores, or plane trees, are the classic tree which is 19:14.189 --> 19:16.759 used to decorate the European plaza. 19:16.759 --> 19:17.279 Okay? 19:17.281 --> 19:23.141 If you like to sit out in the summer, in Italy or France, 19:23.137 --> 19:29.197 and watch the people go by, you're probably sitting under a 19:29.202 --> 19:31.192 sycamore tree.
19:31.190 --> 19:34.220 And they have a leaf that looks like a maple leaf, 19:34.219 --> 19:37.679 and if you just look at a sycamore--and by the way it has 19:37.680 --> 19:41.220 a kind of blotchy bark; so it has sort of white bark 19:41.215 --> 19:44.455 but with blotches on it-- if you just look at a sycamore 19:44.464 --> 19:46.744 and you just look at the shape of the leaf, 19:46.740 --> 19:48.050 you might think, "Oh, these are related to 19:48.051 --> 19:48.481 maples." 19:48.480 --> 19:49.620 They aren't at all. 19:49.618 --> 19:54.308 They are in fact more closely related to water lilies. 19:54.308 --> 19:56.958 Now those are some pretty radical surprises. 19:56.960 --> 19:59.580 Those are things that were buried in the DNA sequences, 19:59.578 --> 20:02.298 that were not apparent in the morphology, 20:02.298 --> 20:05.828 and they are not only testimony to the power of molecular 20:05.825 --> 20:08.795 systematics, they are testimony to the power 20:08.798 --> 20:12.858 of natural selection to change the shape of things in ways that 20:12.858 --> 20:15.868 are profoundly altering and create all sorts of 20:15.871 --> 20:18.491 mis-impressions about relationship. 20:18.490 --> 20:20.920 So here are some pentastomids. 20:20.920 --> 20:25.200 This set of pentastomid worms here is actually crawling across 20:25.195 --> 20:27.575 the roof of a crocodile's mouth. 20:27.578 --> 20:32.638 They tend to live in the noses of crocodiles and dogs. 20:32.640 --> 20:36.170 They don't tend to live in humans, this is one of those 20:36.165 --> 20:40.075 yucky, creepy-crawlies that you don't have to worry about too 20:40.084 --> 20:41.264 much yourself. 20:41.259 --> 20:44.489 But it looks--actually it looks something like a beetle larva, 20:44.487 --> 20:47.707 or something like that; it looks kind of like a 20:47.711 --> 20:48.541 mealworm. 20:48.538 --> 20:53.818 But look, it's got segments in it, and it has some kind of 20:53.815 --> 20:56.125 funny structure inside. 
20:56.130 --> 20:59.470 It turns out that it's most closely related to this thing. 20:59.470 --> 21:06.620 Here is a fish louse on the outside of a triggerfish, 21:06.618 --> 21:09.918 and this is an isopod. 21:09.920 --> 21:13.030 Pentastomids are related to fish lice. 21:13.028 --> 21:17.458 They are not related to beetles, or to nematode worms, 21:17.461 --> 21:19.971 or to a lot of other things. 21:19.970 --> 21:23.190 And, in fact, they're nested within the fish 21:23.185 --> 21:23.705 lice. 21:23.710 --> 21:28.030 So evolution took something that looked like that and turned 21:28.030 --> 21:29.130 it into that. 21:29.130 --> 21:33.200 And like a lot of things, this probably was accomplished 21:33.201 --> 21:37.051 by things like crocodiles eating things like fish. 21:37.048 --> 21:40.278 And when the parasite that was living on the fish got ingested 21:40.279 --> 21:42.029 by the crocodile, it was more or less, 21:42.025 --> 21:44.355 "Oh my heavens, how am I going to adapt to the 21:44.362 --> 21:45.262 crocodile?" 21:45.259 --> 21:48.699 Well if it could fall off the fish and stay in the mouth of 21:48.700 --> 21:51.960 the crocodile and crawl up its nose, it can survive; 21:51.960 --> 21:53.430 which essentially is what these things did. 21:53.430 --> 21:56.930 21:56.930 --> 21:59.650 Here are two carnivorous plants. 21:59.650 --> 22:02.140 Okay? They are polyphyletic. 22:02.140 --> 22:05.300 Pitcher plants have evolved independently at least three 22:05.299 --> 22:07.669 times; flypaper traps at least five 22:07.672 --> 22:08.152 times. 22:08.150 --> 22:11.090 There are two groups of pitcher plants that are sister groups to 22:11.092 --> 22:14.852 two clades of flypaper traps; but others have other sister 22:14.848 --> 22:15.508 groups. 22:15.509 --> 22:18.559 So these are actually deeply cool plants.
22:18.558 --> 22:22.548 The world hotspot of pitcher plants, 22:22.548 --> 22:25.818 if you have a deep desire to go collect a lot of pitcher plants, 22:25.818 --> 22:31.848 is Borneo, and it is no coincidence that Borneo is an 22:31.854 --> 22:37.054 island that has very, very nitrogen poor soil, 22:37.048 --> 22:42.568 and the trees and all the shrubs that live on Borneo, 22:42.568 --> 22:46.348 many of them have special adaptations to dealing with this 22:46.348 --> 22:48.138 low nutrient environment. 22:48.140 --> 22:53.110 Things like flypaper and the sundews, the flypaper traps, 22:53.105 --> 22:57.355 they often live in bogs, which also are extremely 22:57.363 --> 22:58.963 nutrient poor. 22:58.960 --> 23:02.480 If you go to Bethany Bog here, you will find these living in 23:02.482 --> 23:03.322 Bethany Bog. 23:03.318 --> 23:07.298 It's a kettle lake that was left after the glaciers 23:07.295 --> 23:08.245 retreated. 23:08.250 --> 23:10.470 There's a mat of vegetation growing out over it, 23:10.469 --> 23:13.209 which means that in the middle of it, the plants are living 23:13.208 --> 23:14.198 right over water. 23:14.200 --> 23:19.230 The water is nutrient poor, and there are flypaper traps 23:19.229 --> 23:23.709 catching flies out there, to get their protein. 23:23.710 --> 23:27.760 This is a chunk of the Tree of Life that shows you the 23:27.756 --> 23:29.966 radiation of the ungulates. 23:29.970 --> 23:33.290 And you can see that both the toothed and the baleen whales 23:33.291 --> 23:36.901 are nested within the ungulates, and their closest relatives are 23:36.900 --> 23:37.760 the hippos. 
23:37.759 --> 23:41.319 So you should think that the ancestor of the hippo went into 23:41.316 --> 23:44.936 the ocean, probably about 35,000,000 years 23:44.940 --> 23:46.870 ago, and right here, 23:46.865 --> 23:50.435 marked on the tree, are some of the genes which 23:50.442 --> 23:53.752 have changed along these particular lines and which are 23:53.748 --> 23:55.828 signals of those relationships. 23:55.829 --> 24:00.809 24:00.808 --> 24:05.478 So the take-home point is that appearances are deceptive and 24:05.480 --> 24:07.620 detective work is needed. 24:07.618 --> 24:09.918 Now how do you do the detective work? 24:09.920 --> 24:11.420 How do you build a phylogenetic tree? 24:11.420 --> 24:16.100 24:16.098 --> 24:18.408 Before I do that, do you guys have any favorite 24:18.411 --> 24:21.361 sort of phylogenetic--you have; Jeremy what's yours? 24:21.358 --> 24:22.828 Teaching Fellow: Yes, gnetales, 24:22.827 --> 24:25.167 not being related to flowering plants, for me personally. 24:25.170 --> 24:27.760 Because gnetales have double fertilization, 24:27.759 --> 24:30.719 which is a very interesting innovation of plants, 24:30.721 --> 24:32.511 or flowering plants have. 24:32.509 --> 24:35.529 And actually ginkgos are gnetales in the gymnosperms, 24:35.531 --> 24:36.521 the pine trees. 24:36.519 --> 24:40.569 That personally--we learned that from Burleigh and Mathews . 24:40.568 --> 24:42.148 Prof: So how recently was that discovered? 24:42.150 --> 24:43.640 Teaching Fellow: Like three years ago; 24:43.640 --> 24:44.240 three or four years ago. 24:44.240 --> 24:46.350 Prof: Okay, see, the tree keeps changing. 24:46.349 --> 24:47.289 It's a moving target. 24:47.289 --> 24:48.849 It gets better. 24:48.848 --> 24:52.118 The basic branches are not moving too much, 24:52.116 --> 24:56.626 but out there on the tips there's still a lot of action. 24:56.630 --> 24:57.520 So how do you get it? 24:57.519 --> 24:58.839 How do you build a phylogenetic tree? 
24:58.839 --> 25:04.709 Well, this is an important point. 25:04.710 --> 25:05.790 You need to have some characters; 25:05.789 --> 25:07.349 so those are states of traits. 25:07.348 --> 25:09.838 They could be nucleotide sequences. 25:09.838 --> 25:13.158 It could be whether the thing has scales or fur, 25:13.159 --> 25:17.679 or it could be whether it has a three- or four-chambered heart. 25:17.680 --> 25:19.760 It could be a lot of things. 25:19.759 --> 25:23.609 So you need characters, and they need to have different 25:23.607 --> 25:24.247 states. 25:24.250 --> 25:29.260 And the characters that are informative are shared derived 25:29.257 --> 25:30.397 characters. 25:30.400 --> 25:33.210 I will go into this issue with a full slide, 25:33.209 --> 25:37.079 because this is a key point; the characters that give you 25:37.077 --> 25:41.367 phylogenetic information are the ones that everything in a group 25:41.369 --> 25:44.829 shares with each other, and that are different from the 25:44.828 --> 25:45.428 ancestor. 25:45.430 --> 25:51.130 25:51.130 --> 25:55.580 Well, you can only define derived by comparison with 25:55.582 --> 25:56.632 primitive. 25:56.630 --> 25:56.940 Okay? 25:56.942 --> 26:00.642 So primitive is like what it used to be, and derived is what 26:00.635 --> 26:04.015 it is now, somewhere on the tree, and you can't do that 26:04.017 --> 26:05.267 without a tree. 26:05.269 --> 26:07.309 So there's kind of a paradox. 26:07.308 --> 26:10.688 You don't have a tree, and if you don't have it, 26:10.690 --> 26:13.760 you don't have a way to determine what came first, 26:13.759 --> 26:16.969 and therefore you don't know about character polarization. 26:16.970 --> 26:21.890 Character polarization means knowing which state is primitive 26:21.892 --> 26:26.922 and which is derived; that polarizes that series of 26:26.915 --> 26:28.425 trait states. 26:28.430 --> 26:31.140 So there are a number of ways out of this logical dilemma.
26:31.140 --> 26:35.610 One is you look at all possible trees-- 26:35.608 --> 26:38.318 this is, by the way, a huge computational problem, 26:38.318 --> 26:40.468 as you'll see at the end--and you choose the ones that are 26:40.473 --> 26:40.893 simplest. 26:40.890 --> 26:43.940 So that's the principle of parsimony. 26:43.940 --> 26:46.650 And it's a logical principle; it's not an empirical 26:46.647 --> 26:49.037 principle, and it's not necessarily the way that 26:49.039 --> 26:50.159 evolution operates. 26:50.160 --> 26:55.340 But given that there are many, many possible trees, 26:55.338 --> 26:59.168 choosing the simplest one basically is a way of saying, 26:59.170 --> 27:03.440 "This is how we're dealing with our ignorance." 27:03.440 --> 27:06.970 Or you could choose the tree that would make it most likely 27:06.965 --> 27:10.855 that you would have observed the character data that you actually 27:10.856 --> 27:14.206 did observe; that's called the principle of 27:14.211 --> 27:15.801 maximum likelihood. 27:15.798 --> 27:19.518 And, in fact, in the computer programs and in 27:19.516 --> 27:24.326 the theoretical arguments that go on in phylogenetics, 27:24.328 --> 27:28.478 these are two of the main themes, and many of the methods 27:28.482 --> 27:33.212 combine them, in various ways. 27:33.210 --> 27:37.930 Okay, so a bit about shared-derived characters. 27:37.930 --> 27:40.690 Remember that picture with the different colors, 27:40.686 --> 27:43.206 with the different parts of the forelimb? 27:43.210 --> 27:44.760 Having a forelimb that has a humerus, 27:44.759 --> 27:47.399 a radius, an ulna, carpals and metacarpals, 27:47.400 --> 27:50.660 in that sequence, really doesn't help us to 27:50.656 --> 27:53.136 distinguish bats from turtles. 27:53.140 --> 27:54.940 Okay? 27:54.940 --> 27:59.140 That trait's shared among all tetrapods. 
27:59.140 --> 28:02.840 So the fact that you're looking at a thing, that's got those 28:02.839 --> 28:06.669 parts to it, doesn't help you to distinguish bats from turtles 28:06.665 --> 28:08.355 from whales from seals. 28:08.358 --> 28:10.598 Sure, they've all got it, but that's not telling you 28:10.599 --> 28:13.059 whether they're closely related to each other or not, 28:13.058 --> 28:16.328 because they all got it from a common ancestor; 28:16.329 --> 28:18.589 it's not derived. 28:18.588 --> 28:23.148 However, that structure distinguishes the tetrapods from 28:23.153 --> 28:25.233 the lobe-finned fishes. 28:25.230 --> 28:29.040 So at that point on the tree it becomes useful as a 28:29.042 --> 28:31.792 shared-derived marker, of a group; 28:31.788 --> 28:33.998 which is why, one of the reasons why, 28:34.000 --> 28:36.710 we're pretty confident that the tetrapods is a good group, 28:36.710 --> 28:41.440 and that things didn't- that the vertebrates didn't come out 28:41.442 --> 28:43.852 of the water multiple times. 28:43.848 --> 28:47.148 In this context it's marking a trait that originated once in 28:47.147 --> 28:49.827 their common ancestor; it's shared by all of them; 28:49.828 --> 28:52.588 it's not found in their closest relatives. 28:52.588 --> 28:54.808 The jargon is derived from the Greek. 28:54.808 --> 28:57.638 This thing is called a synapomorphy. 28:57.640 --> 29:01.600 Syn means shared; apo means derived; 29:01.599 --> 29:03.119 and morph means trait. 29:03.118 --> 29:08.098 So a shared-derived trait is a synapomorphy. 29:08.098 --> 29:12.518 Now there's a bunch of important stuff here. 29:12.519 --> 29:19.669 Just looking the same isn't very helpful. 29:19.670 --> 29:22.300 The informative traits are the ones that are shared and 29:22.301 --> 29:22.791 derived. 
29:22.788 --> 29:25.918 And what is shared and what is derived, and therefore what is 29:25.923 --> 29:27.913 informative, depends on the context; 29:27.910 --> 29:29.810 it depends on the part of the tree that you're sitting in. 29:29.809 --> 29:33.059 29:33.058 --> 29:37.008 Okay, now a little bit about building trees. 29:37.009 --> 29:41.079 You can think--evolution is real complex and a lot of stuff 29:41.077 --> 29:42.057 is going on. 29:42.058 --> 29:47.068 But let's suppose that A has the ancestral state of a trait, 29:47.065 --> 29:51.725 and that between A and B and C, a new state of the trait 29:51.730 --> 29:52.750 evolves. 29:52.750 --> 29:57.460 So Ancestral is blue and New is red. 29:57.460 --> 30:01.440 And this marks the point where that trait started to change, 30:01.440 --> 30:05.560 and then it spread through the population and by this point it 30:05.555 --> 30:06.495 was fixed. 30:06.500 --> 30:10.720 Well this is a cartoon, and this is a cartoon of a 30:10.718 --> 30:11.578 cartoon. 30:11.578 --> 30:11.928 Okay? 30:11.926 --> 30:16.166 We're just marking where that happened, and the only important 30:16.165 --> 30:20.465 thing about this mark is that it is between this point and this 30:20.474 --> 30:22.174 point; other than that, 30:22.165 --> 30:24.595 we don't really know exactly where it is. 30:24.599 --> 30:26.709 Okay? 30:26.710 --> 30:32.010 Now A, B and C would normally be species. 30:32.009 --> 30:32.249 Okay? 30:32.247 --> 30:33.767 But they could be other things. 30:33.769 --> 30:36.599 They could be genes, or they could be genera or 30:36.597 --> 30:38.687 families or something like that. 30:38.690 --> 30:41.380 And 1,2, and 3 are characters. 30:41.380 --> 30:45.250 They could be morphological; they could be molecular. 
30:45.250 --> 30:49.210 And, as a convention, the ancestral state of that 30:49.205 --> 30:53.895 character will be denoted 0, and the derived state will be 30:53.904 --> 30:57.334 denoted 1; and the arrow indicates going 30:57.333 --> 30:59.453 from ancestral to derived. 30:59.450 --> 31:03.550 And basically what this picture is telling you is that Trait 1 31:03.545 --> 31:06.025 changed from ancestral to derived, 31:06.028 --> 31:09.628 between A and B, and then after B split off from 31:09.626 --> 31:13.626 C, two more things changed in C. 31:13.630 --> 31:17.940 Okay, so that's the information that picture is trying to give 31:17.944 --> 31:18.374 you. 31:18.368 --> 31:23.128 And if you put down a real interesting vertebrate 31:23.130 --> 31:26.790 phylogeny, like this, a vertebrate 31:26.787 --> 31:31.417 phylogeny which has some unfortunate names, 31:31.420 --> 31:36.140 for certain groups on it--okay?; like reptiles aren't real--and 31:36.144 --> 31:39.244 we use these characters, having a vertebral column, 31:39.243 --> 31:41.413 having lungs, having an amniotic egg, 31:41.414 --> 31:43.834 having lactation, eggshell absent, 31:43.832 --> 31:48.052 chorioallantoic placenta, hooves on distal phalanges, 31:48.051 --> 31:50.601 a petrosal bulla, in the skull, 31:50.597 --> 31:54.617 then what we can see from plotting those along the line 31:54.623 --> 31:58.943 basically is which characters here are distinguishing which 31:58.944 --> 32:01.694 groups; what's going on when you go out 32:01.694 --> 32:04.394 into the ungulates, the horses and the cows. 32:04.390 --> 32:10.080 The vertebral column is a synapomorphy of the vertebrates, 32:10.076 --> 32:12.766 of this whole bunch here. 
32:12.769 --> 32:17.099 But if we just look at the mammals, which are from here on 32:17.103 --> 32:21.143 out, it's an ancestral trait; it's not a derived trait, 32:21.143 --> 32:25.123 it's a shared-primitive trait, and that's a symplesiomorphy, 32:25.117 --> 32:26.597 not a synapomorphy. 32:26.598 --> 32:31.118 The lungs, which come in right here, at 2, are a synapomorphy 32:31.117 --> 32:34.277 of the tetrapods, but they are an ancestral 32:34.280 --> 32:37.670 trait, a symplesiomorphy of the amniotes; 32:37.670 --> 32:42.600 okay, so the amniotes are back here--excuse me, 32:42.596 --> 32:45.486 the amniotes are up here. 32:45.490 --> 32:48.340 So whether you call a trait one or the other depends on where it 32:48.343 --> 32:50.883 sits in the tree and what it allows you to do with it. 32:50.880 --> 32:54.820 32:54.818 --> 32:59.348 This is a bit of the relationship between trees and 32:59.354 --> 33:00.084 names. 33:00.078 --> 33:03.888 Ideally we would like that relationship to be totally clear 33:03.894 --> 33:07.384 and unambiguous so that if I just give you a name, 33:07.380 --> 33:10.330 you know where that thing sits in the tree. 33:10.329 --> 33:12.559 This is a hard thing to do. 33:12.558 --> 33:16.978 It is so hard to do that there is now a complete overturning of 33:16.978 --> 33:20.468 the Linnaean System of Classification going on. 33:20.470 --> 33:23.760 It's being led by a couple of guys here, Michael Donoghue and 33:23.758 --> 33:24.798 Jacques Gauthier. 33:24.798 --> 33:28.018 Michael is in our department, and he's also vice-president of 33:28.017 --> 33:30.427 the university, and Jacques Gauthier is a prof 33:30.430 --> 33:32.200 in G&G - in Paleontology. 33:32.200 --> 33:36.550 And they have worked a lot on how to get a way to name things 33:36.551 --> 33:40.981 that actually tells you their complete position on the Tree of 33:40.977 --> 33:41.627 Life. 33:41.630 --> 33:41.890 Okay? 
33:41.885 --> 33:43.835 It's going to be a big computer code. 33:43.838 --> 33:47.648 It's not going to be something nice like Homo sapiens. 33:47.650 --> 33:48.090 Okay? 33:48.085 --> 33:51.125 Homo sapiens is the Linnaean name. 33:51.130 --> 33:54.630 The new method will contain a lot more information in it, 33:54.630 --> 33:57.970 and will probably only be something that you can store on 33:57.973 --> 34:00.303 your iPhone, or whatever else it is that you 34:00.304 --> 34:01.604 use to replace your memory. 34:01.599 --> 34:04.969 34:04.970 --> 34:10.970 But ideally these terms would map onto natural relationships; 34:10.969 --> 34:13.449 and these terms actually do map onto these natural 34:13.449 --> 34:14.259 relationships. 34:14.260 --> 34:14.570 Okay? 34:14.572 --> 34:17.832 So this is the primates, and these are the ungulates, 34:17.831 --> 34:21.091 and they are nested within the eutherian mammals. 34:21.090 --> 34:24.450 The therian mammals include the marsupials, and the mammals 34:24.449 --> 34:27.579 include the monotremes, which are the spiny echidna and 34:27.577 --> 34:29.197 the duckbilled platypus. 34:29.199 --> 34:32.519 And between them and the marsupials is where the female 34:32.523 --> 34:35.783 reproductive tract evolved, because the monotremes are 34:35.784 --> 34:37.144 still laying eggs. 34:37.139 --> 34:42.149 And then the amniotes would include the so-called reptiles; 34:42.150 --> 34:45.340 that would actually be crocodiles, lizards, 34:45.342 --> 34:46.712 snakes, turtles. 34:46.710 --> 34:49.560 The tetrapods include the amphibians, and then the 34:49.561 --> 34:51.891 vertebrates include the fish, as well. 34:51.889 --> 34:55.559 So this is a natural classification, 34:55.559 --> 35:01.219 and that's the way all taxonomic nomenclature should be 35:01.224 --> 35:05.004 related to all good systematics. 35:05.000 --> 35:09.670 Now, how do you actually infer a tree from a character matrix? 
35:09.670 --> 35:12.630 So here we're just trying to figure out if things are related 35:12.626 --> 35:13.066 or not. 35:13.070 --> 35:14.340 We don't have a tree yet. 35:14.340 --> 35:15.520 Okay? 35:15.518 --> 35:21.348 And we have three species and we have three characters; 35:21.349 --> 35:27.009 and in this case ancestral is going to be 0 and 1 is going to 35:27.012 --> 35:28.242 be derived. 35:28.239 --> 35:32.329 Well this particular character matrix is consistent with 35:32.327 --> 35:35.967 drawing a tree this way, and then marking down the 35:35.969 --> 35:37.529 transitions here. 35:37.530 --> 35:43.260 So when 1 went from ancestral to derived, it changed here in 35:43.262 --> 35:47.052 both B and C, and when 2 and 3 went from 35:47.052 --> 35:51.622 ancestral to derived, it changed only in C. 35:51.619 --> 35:56.659 So can you see how the tree relates to the matrix? 35:56.659 --> 35:58.139 Okay? 35:58.139 --> 36:00.809 Any question about that at this point? 36:00.809 --> 36:02.429 I've made the simplest one I could. 36:02.429 --> 36:05.479 36:05.480 --> 36:13.250 Now if we did it with overall similarity, A and B share five 36:13.248 --> 36:15.748 ancestral counts. 36:15.750 --> 36:16.720 Okay? 36:16.719 --> 36:21.569 In three traits, 1,2 and 3, A has got the 36:21.568 --> 36:23.748 ancestral state. 36:23.750 --> 36:25.800 In two traits, two of those, 36:25.802 --> 36:29.682 in B--B has also got the ancestral trait in 2 and 3, 36:29.677 --> 36:32.487 and that would suggest this tree. 36:32.489 --> 36:37.449 But if we go by derived similarity, then we get this 36:37.449 --> 36:38.129 tree. 36:38.130 --> 36:40.150 And you can see what it does to the tree. 36:40.150 --> 36:43.330 So this is overall similarity, which is misleading, 36:43.329 --> 36:47.329 and this is shared derived trait phylogenetics, 36:47.329 --> 36:51.819 which yields this tree, and it shifts the sister of B 36:51.820 --> 36:53.030 from A to C. 
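[Editorial sketch, not part of the lecture] The contrast above between overall similarity and shared derived similarity can be tried out on the three-species matrix as it is described in the transcript (A ancestral for all three characters, B derived only for character 1, C derived for all three). The function names and the pairwise scoring rule below are assumptions made for illustration; the lecture's own tally of "ancestral counts" may be computed differently on the slide.

```python
from itertools import combinations

# Character matrix as described in the transcript: 0 = ancestral, 1 = derived.
matrix = {
    "A": (0, 0, 0),
    "B": (1, 0, 0),
    "C": (1, 1, 1),
}

def overall_similarity(x, y):
    """Count characters where two taxa match, whether ancestral or derived."""
    return sum(a == b for a, b in zip(x, y))

def shared_derived(x, y):
    """Count characters where both taxa carry the derived state (1)."""
    return sum(a == 1 and b == 1 for a, b in zip(x, y))

def closest_pair(score):
    """Return the pair of taxa with the highest score under `score`."""
    pairs = combinations(sorted(matrix), 2)
    return max(pairs, key=lambda p: score(matrix[p[0]], matrix[p[1]]))

print(closest_pair(overall_similarity))  # ('A', 'B')
print(closest_pair(shared_derived))      # ('B', 'C')
```

Under these assumptions, overall similarity groups B with A because they share two ancestral states, while counting only shared derived states groups B with C, which is the shift in B's sister that the lecture describes.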
36:53.030 --> 36:56.160 36:56.159 --> 36:59.519 Now if life were only so simple. 36:59.519 --> 37:01.079 Life is never simple. 37:01.079 --> 37:05.409 Traits can conflict with each other in the information they 37:05.409 --> 37:07.649 give you, and they often do. 37:07.650 --> 37:10.860 So here is a character matrix with no conflict, 37:10.864 --> 37:14.084 and here's a character matrix with conflict. 37:14.079 --> 37:14.899 Okay? 37:14.900 --> 37:17.660 So everything was looking pretty good, as long as we'd 37:17.663 --> 37:19.753 only measured Trait 1 and 2 down here. 37:19.750 --> 37:24.890 But then we measured Trait 3, and Trait 3 seemed to indicate 37:24.887 --> 37:28.717 that C had this ancestral trait over here. 37:28.719 --> 37:32.909 It was beginning to look like C was going to be a highly derived 37:32.905 --> 37:33.565 species. 37:33.570 --> 37:36.250 But then we've looked at another trait and it wasn't like 37:36.253 --> 37:36.593 that. 37:36.590 --> 37:37.760 What do you do? 37:37.760 --> 37:38.320 Right? 37:38.320 --> 37:39.180 What do you do? 37:39.179 --> 37:42.979 Well you can choose the simplest tree. 37:42.980 --> 37:45.660 You can choose the one that implies the least change. 37:45.659 --> 37:51.719 So here are all trees which are consistent with this character 37:51.715 --> 37:52.605 matrix. 37:52.610 --> 37:59.110 And if you go back and try, I think you will see that you 37:59.112 --> 38:03.992 can plot all of these changes onto here. 38:03.989 --> 38:08.219 So basically this tree is saying that third trait down 38:08.215 --> 38:12.015 here, it changed twice; it changed once in A and it 38:12.023 --> 38:13.303 changed once in B. 38:13.300 --> 38:17.160 So it went from an ancestral state up to a derived state, 38:17.163 --> 38:18.753 along these branches. 38:18.750 --> 38:22.110 And these traits up here, 1 and 2, they changed between 38:22.106 --> 38:24.836 A, on the one hand, and B and C on the other, 38:24.840 --> 38:25.400 here. 
38:25.400 --> 38:28.480 So you could do that also, and you would find that all of 38:28.480 --> 38:31.450 these trees are actually consistent with that character 38:31.449 --> 38:35.159 matrix, but this one takes five changes 38:35.161 --> 38:38.411 and these only take four: 1, 2, 3, 4; 38:38.409 --> 38:40.839 1, 2, 3, 4, 5. 38:40.840 --> 38:45.400 Therefore, we come to the conclusion that one of these 38:45.404 --> 38:49.454 traits--one of these trees--is probably correct, 38:49.451 --> 38:52.811 just by the principle of parsimony. 38:52.809 --> 38:56.449 And the only way that we really resolve this kind of issue is by 38:56.445 --> 38:57.595 getting more data. 38:57.599 --> 39:00.529 The more data you get, the more likely it is that that 39:00.532 --> 39:02.362 will converge on the real tree. 39:02.360 --> 39:05.640 I would like to point out that there probably is not enough 39:05.641 --> 39:09.041 data in the world to do that, for all of the creatures on the 39:09.036 --> 39:09.656 planet. 39:09.659 --> 39:11.239 In other words, at the end of the process there 39:11.237 --> 39:12.537 will still be some unresolved stuff. 39:12.539 --> 39:15.989 39:15.989 --> 39:19.589 Now what about--let me just put that in--what about this? 39:19.590 --> 39:21.390 Where does the root come from? 39:21.389 --> 39:24.889 Well these are unrooted trees, up here, 39:24.889 --> 39:29.599 and the choice of where you decide the ancestral state is 39:29.597 --> 39:34.217 actually makes quite a bit of difference to the tree. 39:34.219 --> 39:37.049 I'm sorry that in translating this has gotten screwed up a 39:37.047 --> 39:37.937 little bit here. 39:37.940 --> 39:41.740 This is actually A over here, and then it goes D, 39:41.742 --> 39:43.092 B, C over here. 39:43.090 --> 39:46.310 Choosing this as the root, rather than this, 39:46.309 --> 39:50.129 changes the relationship from B and D to B and C; 39:50.130 --> 39:54.550 this is B and this is C; this is B and this is D.
39:54.550 --> 39:57.910 So where do you get the outgroup, how do you decide 39:57.907 --> 39:59.717 where the root should be? 39:59.719 --> 40:08.879 Well, that issue can only be decided in the context of a 40:08.878 --> 40:11.208 bigger tree. 40:11.210 --> 40:14.780 So you must have some other kind of information to suggest 40:14.777 --> 40:16.717 what your outgroup might be. 40:16.719 --> 40:19.369 And, when this is actually done, sometimes people see 40:19.373 --> 40:22.233 whether the choice of an outgroup is actually going to be 40:22.231 --> 40:24.631 changing the shape of their tree very much. 40:24.630 --> 40:26.710 They will report, "We tried this and this 40:26.706 --> 40:29.146 and this as the outgroup, and these were the results, 40:29.153 --> 40:30.173 to our tree." 40:30.170 --> 40:33.410 Okay? 40:33.409 --> 40:36.829 Now, you get your trait matrix, you want to find the simplest 40:36.829 --> 40:37.229 tree. 40:37.230 --> 40:41.020 One way is to do an exhaustive search. 40:41.018 --> 40:45.758 So here are two terminal taxa, B and C. 40:45.760 --> 40:46.670 Okay? 40:46.670 --> 40:49.410 We've chosen A as the ancestral condition; 40:49.409 --> 40:51.549 it's not an existing species now. 40:51.550 --> 40:54.170 A, we say, is going to be the outgroup; 40:54.170 --> 40:56.580 that's going to be the link to the ancestors. 40:56.579 --> 40:58.909 Only one tree possible. 40:58.909 --> 41:05.569 If we have three terminal taxa, we can have either B as the 41:05.565 --> 41:10.195 closest relative of D; C as the closest relative of D; 41:10.199 --> 41:12.469 or B and C as each other's closest relatives. 41:12.469 --> 41:14.829 So there are three possibilities there. 41:14.829 --> 41:17.769 If we have four terminal taxa, oh my goodness, 41:17.766 --> 41:19.656 suddenly we have this many. 41:19.659 --> 41:21.449 Oh that's confusing. 41:21.449 --> 41:27.449 If we have 500 terminal taxa, we have 1 times 10^1280 41:27.445 --> 41:29.365 possibilities.
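[Editorial sketch, not part of the lecture] The counts just quoted (one tree for two terminal taxa, three for three) follow a standard formula: with the outgroup fixed, the number of rooted, fully bifurcating trees on n labeled terminal taxa is the double factorial (2n-3)!! = 3 x 5 x ... x (2n-3). The function name below is an assumption for illustration. The sketch only counts trees; it does not build or score them.

```python
def rooted_tree_count(n_taxa):
    """Number of rooted, fully bifurcating trees on n_taxa labeled tips.

    Uses the standard double-factorial formula (2n - 3)!!.
    """
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 3 (empty product for n = 2).
    for k in range(3, 2 * n_taxa - 2, 2):
        count *= k
    return count

for n in (2, 3, 4, 10):
    print(n, rooted_tree_count(n))

# For 500 taxa the count is too long to print usefully, so report
# only its order of magnitude (number of decimal digits minus one).
print("500 taxa: about 10 ^", len(str(rooted_tree_count(500))) - 1)
```

For two, three and four terminal taxa this gives 1, 3 and 15 trees, and for 500 taxa the order of magnitude should land close to the figure quoted in the lecture.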
41:29.369 --> 41:33.029 This is a combinatorial explosion of possibilities. 41:33.030 --> 41:38.640 And about, oh say about 2003, 2004, 41:38.639 --> 41:44.849 it took nine months of runtime on a supercomputer to sort out 41:44.847 --> 41:51.257 all of the possible trees for a reasonable number of characters 41:51.262 --> 41:54.162 for something like this. 41:54.159 --> 41:57.049 So if you have 500 species and you wanted-- 41:57.050 --> 41:59.570 and you had a reasonable number of characters, 41:59.570 --> 42:03.470 you would only get a tiny fraction of those trees covered, 42:03.469 --> 42:05.859 and you would wait nine months to get your answer. 42:05.860 --> 42:08.420 It's gotten a little better than that recently, 42:08.422 --> 42:09.262 but not much. 42:09.260 --> 42:09.850 Okay? 42:09.849 --> 42:11.509 So there are ways to get around this problem. 42:11.510 --> 42:13.820 There are all kinds of heuristic ways to get around 42:13.824 --> 42:14.154 this. 42:14.150 --> 42:17.360 There are ways of jumping into tree space and doing local 42:17.358 --> 42:20.338 approximations and then branching things together. 42:20.340 --> 42:23.080 So that, for example, when the New York Times 42:23.077 --> 42:25.887 reported last week, on Darwin's birthday, 42:25.887 --> 42:30.377 that biologists had recently been able to publish a tree of 42:30.382 --> 42:33.372 11,000 plant species, which was done here, 42:33.367 --> 42:35.827 by Stephen Smith and Michael Donoghue's group; 42:35.829 --> 42:39.379 he did that as a super-tree, using these approximation 42:39.375 --> 42:42.915 techniques to patch together lots of smaller trees. 42:42.920 --> 42:46.440 And there are all kinds of criteria that get applied to how 42:46.438 --> 42:47.348 good that is. 42:47.349 --> 42:49.759 By the way, one of the things that Stephen turned up is that 42:49.760 --> 42:51.560 hey, the ferns are still evolving rapidly; 42:51.559 --> 42:52.599 which is kind of neat.
42:52.599 --> 42:53.589 Lots of other stuff is in that. 42:53.590 --> 42:57.010 42:57.010 --> 43:00.310 So, the Tree of Life is not given. 43:00.309 --> 43:03.239 We have to discover it. 43:03.239 --> 43:06.219 The informative characters in the Tree of Life are those that 43:06.217 --> 43:07.457 are shared and derived. 43:07.460 --> 43:09.430 So appearances deceive. 43:09.429 --> 43:13.639 Simply looking similar is insufficient information. 43:13.639 --> 43:16.819 You can make a lot of trees with the same character matrix. 43:16.820 --> 43:18.420 You would prefer either the simplest, 43:18.420 --> 43:22.150 which implies the least change, or the tree that maximizes the 43:22.150 --> 43:25.210 probability of observing what you actually see, 43:25.210 --> 43:27.230 or some combination of those criteria. 43:27.230 --> 43:28.410 Okay? 43:28.409 --> 43:31.879 Next time we'll take these methods and see how we can infer 43:31.876 --> 43:32.946 history with them. 43:32.952 --> 43:33.672 43:33.670 --> 43:40.000
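[Editorial sketch, not part of the lecture] The "simplest tree" criterion from the summary can be illustrated with a tiny exhaustive parsimony search. The character matrix below is hypothetical: it is not the matrix from the slides, which the transcript does not fully specify. The sketch assumes the root carries the ancestral state 0 (standing in for the outgroup A), counts changes with the Fitch algorithm, and enumerates every rooted tree by attaching each new tip to every branch. All function names are assumptions.

```python
# Hypothetical matrix for three terminal taxa: 0 = ancestral, 1 = derived.
# Character 4 conflicts with characters 2 and 3.
matrix = {
    "B": (1, 0, 0, 1),
    "C": (1, 1, 1, 0),
    "D": (1, 1, 1, 1),
}

def insert_everywhere(tree, tip):
    """Yield every tree made by attaching `tip` to one branch of `tree`."""
    yield (tree, tip)  # attach above the current root
    if isinstance(tree, tuple):
        left, right = tree
        for new_left in insert_everywhere(left, tip):
            yield (new_left, right)
        for new_right in insert_everywhere(right, tip):
            yield (left, new_right)

def all_rooted_trees(tips):
    """Enumerate all rooted bifurcating trees as nested 2-tuples."""
    trees = [tips[0]]
    for tip in tips[1:]:
        trees = [t for tree in trees for t in insert_everywhere(tree, tip)]
    return trees

def fitch(tree, states):
    """Return (possible_states, changes) for one character (Fitch algorithm)."""
    if not isinstance(tree, tuple):
        return {states[tree]}, 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    common = left_set & right_set
    if common:
        return common, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1

def tree_length(tree, ancestral=0):
    """Total number of changes implied by `tree`, with the root ancestor at 0."""
    total = 0
    for i in range(len(next(iter(matrix.values())))):
        states = {taxon: row[i] for taxon, row in matrix.items()}
        root_set, cost = fitch(tree, states)
        # One extra change if the ancestral state cannot sit at the root.
        total += cost + int(ancestral not in root_set)
    return total

trees = all_rooted_trees(sorted(matrix))
best = min(trees, key=tree_length)
for tree in trees:
    print(tree, tree_length(tree))
print("most parsimonious:", best)
```

With this made-up matrix the three possible trees need 5, 6 and 7 changes, and the search prefers the one that groups C with D. A search like this is only feasible for a handful of taxa, which is why the heuristic methods mentioned in the lecture are used for larger data sets.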