Professor of Anthropology

Jonathan Marks

Ruvolo, Maryellen (1995) Seeing the forest through the trees: Replies to Marks; Rogers and Commuzzie; Green and Djian. American Journal of Physical Anthropology, 98:218-232.

Maryellen Ruvolo gave an extended defense of the work in the AJPA, in response to three criticisms of her own work. Ruvolo had done DNA sequencing on the COII gene of mitochondrial DNA, which shows a human-chimp grouping and somehow proves that all other data are wrong and Sibley is right. Reasoning like that is quite primitive. At any rate, her response occupied more than twice as much space as the three criticisms combined, and again argued that (1) the Sibley-Ahlquist data really showed human-chimp; (2) it doesn't matter if the authors falsified data, because they got the right answer; (3) Caccone-Powell renders the discussion moot. Virtually everything she says about the work is false, and easily demonstrated to be such. Here is a rebuttal.

In the October 1995 issue of the American Journal of Physical Anthropology, there are three critical comments on the work of Maryellen Ruvolo on molecular evolution in the apes, arguing that her published interpretations of her own work, and of the extant literature, are extravagant. Ruvolo's rebuttal occupies twice the length of the three comments put together, and makes a number of false assertions concerning some fairly basic things like outgroup comparisons and the interpretation of phylogenetic trees. Some of the most egregious misstatements, however, are about that horse I thought was dead, DNA hybridization. Rather than go another round in the AJPA, I'll correct the record here for any interested readers. Most of these points were made in my paper in the Am. J. Phys. Anth., 85:207 (1991), cited in the comments themselves, but not in Ruvolo's response.

(1) Ruvolo asserts that after all is said and done, the infamous Sibley data actually do associate humans and chimpanzees, to the exclusion of gorillas. Here, she has greater confidence than do the authors themselves, who wrote in 1990, "If the corrections [sic] had not been used, it is virtually certain that Sibley and Ahlquist would have concluded that Homo, Pan, and Gorilla form a trichotomy" (J. Mol. Evol., 30:225). The only relevant question, then, is the validity of the "corrections". These were (1) substituting controls across experiments; (2) moving correlated points in to the regression line describing them; and (3) making precise numerical changes on the basis of a variable (fragment length) which was not even measured. They also neglected to report these "corrections" until their existence was discovered and their nature inferred quite serendipitously by others. The "corrections" were finally reported in the 1990 JME paper cited above, and discussed at length by me in the 1991 AJPA paper, cited above.

At any rate, for anyone to argue that these data actually do associate human and chimpanzee, they must accept the data alterations as valid. That is, in my opinion, a frightening prospect, with far-reaching implications for general scientific data analysis.

(2) Ruvolo, referring to a table in the 1990 paper, says that "for the entire uncorrected data set" there is "a fairly clear separation of the human chimpanzee lineage," which "seems to have been widely overlooked or ignored" (by me). In fact, I discussed that very table in my 1991 paper, cited above. As I showed in Table 2 of that paper (p. 217), pooling the data poses a considerable problem, for the data exhibit very poor reciprocity specifically in the experiments using human DNA -- thus, human-chimp yields values half as large as chimp-human; human-gorilla is half as large as gorilla-human; and human-orang is half as large as orang-human and all the permutations of orang vs. chimp or gorilla. In other words, the experiments using human tracer DNA were somehow technically fouled up. The pooled result that Ruvolo reports is consequently a simple artifact, and nothing more. Actually the chimp-human value is 1.9, while the corresponding chimp-gorilla and gorilla-chimp values are 1.8, which stands in strong contradiction to her point.

(3) Ruvolo calls the subsequent Caccone-Powell DNA hybridization study "the most sophisticated". In fact the major claim from this study was not so much wrong as impossible. The claim, made stridently, was that they had matched Sibley's (altered) numbers, given as T50H measurements, with their own numbers, given as Tm measurements. (Tm measures the temperature at which 50% of the DNA that formed hybrids is single-stranded; T50H measures the point at which 50% of the total input DNA is single-stranded; they are only expected to be identical when hybridization is exactly 100%.)
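The distinction between the two statistics can be illustrated with a toy calculation. What follows is only a minimal sketch under stated assumptions, not anything taken from the papers under discussion: it assumes the hybrids melt along a logistic curve centered on Tm, that all DNA which failed to hybridize counts as single-stranded at every temperature, and it uses a hypothetical Tm of 85 degrees and a hypothetical curve width of 2 degrees. The function name `t50h_from_tm` is likewise invented for illustration.

```python
import math


def t50h_from_tm(tm, hybridized_fraction, width=2.0):
    """Return the T50H implied by a given Tm under a toy melting model.

    ASSUMPTIONS (illustrative only, not from the source):
      - the hybridized DNA melts along a logistic curve with midpoint `tm`
        and steepness parameter `width` (in degrees);
      - DNA that never formed hybrids is single-stranded at all temperatures.

    Total single-stranded fraction at temperature T is
        (1 - h) + h / (1 + exp(-(T - tm) / width)),
    and T50H is the T at which that total equals 0.5, which solves to
        T50H = tm + width * ln(2h - 1).
    """
    h = hybridized_fraction
    if not 0.5 < h <= 1.0:
        # With 50% or less hybridization, half the input is already
        # single-stranded before any melting, so T50H is undefined here.
        raise ValueError("hybridized_fraction must be in (0.5, 1.0]")
    return tm + width * math.log(2.0 * h - 1.0)


if __name__ == "__main__":
    hypothetical_tm = 85.0  # arbitrary value chosen for illustration
    for h in (1.0, 0.9, 0.7):
        t50h = t50h_from_tm(hypothetical_tm, h)
        print(f"hybridization={h:.0%}  Tm={hypothetical_tm}  T50H={t50h:.2f}")
```

In this toy model the two values coincide only at 100% hybridization; at any lower percentage T50H falls below Tm, and the gap widens as hybridization drops. That is the sense in which a Tm series and a T50H series are not expected to agree number for number.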

However, when Powell's Tms are compared to Sibley's unaltered Tms (finally given in the 1990 JME paper, p. 221) they do not in fact match at all. Matching altered and non-comparable numbers frankly requires far more than just technical sophistication.

(4) Ruvolo claims that "the data quality are high" for the Powell corpus, but as has been noted since its publication in 1989, they did not publish any melting profiles, which was the basis of the original criticism of the Sibley work, and is the only basis on which one can actually judge the quality of the data. She further claims that "these data have been freely available on request from the authors," which is directly contrary to my experience. I asked Powell, in person and in writing, several times in 1989 and 1990 for a look at his melting curves or the data on which they were based, and never received any. Vince Sarich received three representative experiments, which he copied for me. They are in fact far cruder than the comparable DNA melting profiles from the Sibley series. They are not at all of high quality, and exactly how data of poorer quality can possibly provide more precise phylogenetic resolution is quite a mystery to me. Ruvolo provides no evidence to the contrary, nor has any such evidence ever seen the light of day.

(5) Ruvolo's criteria for judging the Powell corpus are the "reciprocals and the outgroup-ingroup homogeneity" -- in other words, the final numbers, rather than the basis on which each number was generated. Since each number in this work was generated from a DNA melting profile with the aid of a "correction" for DNA fragment length, it is possible to ask what effect this "correction" had on the resolution reported. The data given in the Caccone-Powell paper (Evolution, 43:942, 1989) actually permit this calculation. It turns out that this "correction" (based on a measurement of mean DNA length in an agarose gel, to the nearest single nucleotide, of whole genomic DNA, after sonication, labelling, incubation, and S1 nuclease digestion) is entirely responsible for the reciprocity, the resolution, and the claimed match to the (published, altered, non-comparable) Sibley numbers.

Ruvolo asserts that I merely said this correction might affect the result, but in fact I made the calculations and published them in Table 2 of the very paper she cited. They do affect the result; not only that, but they determine the result; I said so, and I say so. Anyone is free to make the calculations themselves from the data in the appendix to the Caccone-Powell paper.

The numbers I get are:

 

                 Before           After

Human-Chimp      2.4 +/- 1.1      1.6 +/- 0.2

Chimp-Gorilla    3.1 +/- 0.6      2.5 +/- 0.2

Human-Gorilla    4.1 +/- 1.1      2.5 +/- 0.1

Before "correction", there is no consistency to the numbers, no match to the Sibley numbers, and no resolution; after correction, the numbers are perfectly consistent, they now match the altered/published delta-T50H values of Sibley/Ahlquist, and now resolve the trichotomy into human-chimp. Ruvolo does not provide any evidence to support her contention that the length "correction" was not responsible for generating the results, but rather she cites only unpublished "work in preparation".
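The change in resolution can be checked with simple arithmetic on the six values in the table above. The following is a minimal sketch, not the calculation from the 1991 paper: it assumes each "+/-" figure can be read as a symmetric interval around the mean (the page does not say whether it is a standard deviation or a standard error), and it asks only whether the human-chimp interval overlaps either of the other two. The helper names are invented for illustration.

```python
# Values transcribed from the before/after table: (mean, +/- spread).
# ASSUMPTION: each "+/-" is treated as a symmetric interval around the mean;
# the source does not state whether it is a standard deviation or an error.
BEFORE = {
    "Human-Chimp": (2.4, 1.1),
    "Chimp-Gorilla": (3.1, 0.6),
    "Human-Gorilla": (4.1, 1.1),
}
AFTER = {
    "Human-Chimp": (1.6, 0.2),
    "Chimp-Gorilla": (2.5, 0.2),
    "Human-Gorilla": (2.5, 0.1),
}


def interval(mean, spread):
    """Return the (low, high) bounds of mean +/- spread."""
    return (mean - spread, mean + spread)


def overlaps(a, b):
    """True if two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def separated_from_others(table, pair):
    """True if `pair`'s interval overlaps no other comparison in `table`."""
    target = interval(*table[pair])
    return not any(
        overlaps(target, interval(*values))
        for name, values in table.items()
        if name != pair
    )


if __name__ == "__main__":
    for label, table in (("Before", BEFORE), ("After", AFTER)):
        resolved = separated_from_others(table, "Human-Chimp")
        print(f"{label}: human-chimp separated from the others? {resolved}")
```

On this crude reading, the human-chimp interval overlaps the chimp-gorilla interval before the length "correction" and stands clear of both other comparisons only after it. An interval-overlap check is a blunt instrument and no substitute for a proper statistical test, but it is enough to show where the resolution comes from.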

Given that the precision of the reported length estimates themselves is difficult to justify on strictly a priori grounds, and that those estimates appear to be responsible for both the claim of phylogenetic resolution, and the reported match to the altered non-comparable published numbers of Sibley, I think it is difficult to sustain much faith in the general integrity and scientific value of this work.

(6) It is quite sad that these still have the appearance of being open issues. They should have been formally investigated, adjudicated, and settled years ago, having been discussed in the pages of Science (241:1598, 1988; 241:1756, 1988) and Scientific American (March 1989: 24). Informally, the work has largely been exorcised from the primary literature, and there are hardly any people even using the technique any more, since it appears capable of solving phylogenetic questions only when considerable liberties are taken with the data analysis. (With such liberties in data analysis, after all, any technique will work!) Powell himself quietly abandoned it a few years ago. Ruvolo is nearly alone in clinging to the work, loyal to it (if uncritically so) to the end. That she can make such egregious misstatements in print at this late date is not of great credit to anyone.

(7) Lastly, of particular significance is the fact that the senior author of the original work (Sibley) is a member of the National Academy of Sciences, and I think it is extraordinary that that body never investigated the matter. I raised this publicly in the American Scientist (81: 382, 1993). In the following issue, Sibley indignantly responded that he would "suggest such an investigation to the National Academy of Sciences Home Secretary." Whether he actually did or not, I cannot say. However, Vince Sarich and I immediately wrote separately to the Home Secretary (Peter Raven), detailing and documenting the accusations, which were already in the public record. An investigation would seem to have been called for, given the apparent agreement of both sides on the matter. It did not happen, however.

This is Raven's full response to me, dated October 25, 1993:

************
Dear Dr. Marks:

Thank you very much indeed for your letter and the enclosures. I was extremely interested in what you had to say in reading the enclosures. It is obviously a very complex case and, as I am sure you understand, the National Academy of Sciences would not undertake to conduct a formal review of the activities of its members as a matter of general principle, lacking the judicial machinery to do so properly. I would add, however, that no one is elected to the Academy for a single piece of work, and thus it is incorrect, as a matter of principle, to say that "this is the work that ultimately resulted in Sibley's election to the National Academy of Sciences.....". In summary, I was very interested in the material that you sent. We will be conducting no investigation.

Yours sincerely,
Peter H. Raven
Home Secretary

*****************

That has always struck me as inadequate to the circumstances.

One more thing: the work was cited with colossal ignorance in a widely-read book called Darwin's Dangerous Idea by philosopher Daniel Dennett.

Here's what he said, with my annotations.

Not investigating, of course, is the ideal bureaucratic solution for making a potentially embarrassing and egregious case of data falsification resemble a "you say potayto, I say potahto" scientific controversy. In other words, calling it a draw -- which is a victory for the original investigators, permitting them to retain the status they attained through the work in question. Some people in the field continued to cite the Sibley work as if it were fine (like the late Morris Goodman); others simply substituted the Caccone-Powell citation for the Sibley citation. After all, it is right, isn’t it?

And last of all, here is a bonus for persevering this long: A paper circulated for a conference in 2008 on biomedical fraud. Sibley revisited.

Jonathan Marks
Department of Anthropology
UNC-Charlotte

email: jmarks@uncc.edu
phone: (704) 687-5097
fax: (704) 687-1678