Contributed by: Jianzhi Zhang
Most molecular biologists would agree that a gene tends to
be more similar in function to its orthologs than to its paralogs. This fundamental tenet, recently termed the ortholog
conjecture, is a cornerstone of phylogenomics and is used by both computational
and experimental biologists in predicting, interpreting, and understanding gene
functions. But is this conjecture wishful
thinking or empirically founded?
In a pioneering study, Nehrt et al. (3) attempted to test
the ortholog conjecture using Gene Ontology (GO) annotations that were based on
experimental data. Contrary to
everyone’s expectation, they found that the functional similarity between
orthologs is lower than that between paralogs, when the level of sequence
divergence is controlled. Based on this
and other findings, the authors proposed that protein function evolution
is primarily determined by “the cellular context in which proteins act”. This would explain why within-species paralogs,
which are always in the same organism, were found functionally more similar than
orthologs, which by definition reside in different organisms.
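To make "controlled" concrete, here is a minimal sketch of one way to compare the two kinds of pairs at matched sequence divergence: group pairs into bins of percent sequence identity and average functional similarity separately for orthologs and paralogs within each bin. The function name, the 10% bin width, and the toy records are all assumptions made for illustration; they are not the procedure or the data of Nehrt et al.

```python
from collections import defaultdict
from statistics import mean

BIN_WIDTH = 10  # percent identity per bin; an arbitrary choice for this sketch


def mean_similarity_by_bin(pairs, bin_width=BIN_WIDTH):
    """Average functional similarity per (identity bin, pair kind).

    `pairs` holds (kind, percent_identity, functional_similarity) records.
    """
    grouped = defaultdict(list)
    for kind, identity, similarity in pairs:
        # Pairs at exactly 100% identity are folded into the top bin.
        bin_start = min(int(identity // bin_width) * bin_width, 100 - bin_width)
        grouped[(bin_start, kind)].append(similarity)
    return {key: mean(values) for key, values in grouped.items()}


# Hypothetical toy records, invented for illustration only.
toy_pairs = [
    ("ortholog", 92.0, 0.50),
    ("ortholog", 95.0, 0.70),
    ("paralog", 91.0, 0.80),
    ("paralog", 55.0, 0.40),
]

for (bin_start, kind), value in sorted(mean_similarity_by_bin(toy_pairs).items()):
    print(f"{bin_start}-{bin_start + BIN_WIDTH}% identity, {kind}: {value:.2f}")
```

Only comparisons within the same bin are meaningful here; the toy numbers say nothing about which kind of pair is actually more similar.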
Nehrt et al.’s (3)
finding stirred considerable controversy online when it was published in the
summer of 2011, as evidenced by numerous blog discussions. The last 10 months have seen three papers
that challenged Nehrt et al.’s conclusion from different angles, although the
three papers do not completely agree with one another either.
First, Thomas and
colleagues, representing the group that annotated GO, claimed that GO
annotation differences between homologous genes “do not reflect differences in
biological function, but rather complementarity in experimental approaches” (4). That is, gene function data are so sparse at
present that GO annotations reflect ascertainment biases in experiments rather
than true functional differences.
Second, Altenhoff
et al. (1) identified a number of biases in GO.
After correcting these biases, they found weak but significant evidence
for the ortholog conjecture.
Most recently, Chen
and Zhang (2) reanalyzed GO annotations and confirmed some of the biases
identified by Altenhoff and colleagues.
Most disturbing, however, was the finding of many errors in GO
annotation. Even in so-called
experiment-based annotations, across-species functional inferences were
frequently made. For example, an
experiment was conducted on a monkey gene, but the function was annotated in GO
for its human ortholog, based ironically on the ortholog conjecture.
In one part of
their study, Chen and Zhang (2) focused on pairs of orthologs or paralogs that have
identical protein sequences and were studied in the same papers. Surprisingly, while all nine such paralogous
pairs have 100% GO-based functional similarity, only nine of 31 such orthologous
pairs have 100% functional similarity.
More extremely, eight of the 31 orthologous pairs show 0% functional
similarity, yet none of the papers that studied them explicitly mentioned their
functional dissimilarity. Apparently,
these zero scores reflect ascertainment biases rather than true functional differences. The authors also noted an upward trend in the
functional similarity of orthologs, relative to that of paralogs, when analyzing
the time series data of GO over the last five years.
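A small sketch may help show how two genes with identical protein sequences can score 0% on a GO-based measure. It assumes a simple set-overlap (Jaccard) score, which is a stand-in chosen for illustration and not necessarily the measure used by Chen and Zhang; the function name and the gene annotations are hypothetical. The last two lines simply restate the counts above as percentages.

```python
def go_similarity(terms_a, terms_b):
    """Jaccard overlap of two GO term sets, as a fraction between 0 and 1."""
    terms_a, terms_b = set(terms_a), set(terms_b)
    if not terms_a and not terms_b:
        raise ValueError("at least one gene must carry an annotation")
    return len(terms_a & terms_b) / len(terms_a | terms_b)


# Hypothetical annotations for two orthologs with identical protein sequences.
# The GO identifiers are real terms, but their assignment here is invented.
gene_x = {"GO:0005515"}  # protein binding, say from a binding assay
gene_y = {"GO:0016301"}  # kinase activity, say from an enzymatic assay

print(go_similarity(gene_x, gene_y))  # 0.0: no shared term
print(go_similarity(gene_x, gene_x))  # 1.0: identical annotation sets

# Counts reported in the text for the 31 identical-sequence orthologous pairs.
print(f"{9 / 31:.0%} of pairs at 100% similarity")  # 29%
print(f"{8 / 31:.0%} of pairs at 0% similarity")    # 26%
```

If each gene was assayed for a different property, the two annotation sets do not overlap and the score is zero, even though nothing suggests the proteins behave differently.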
These and other findings
led Chen and Zhang (2) to conclude that the current GO is unsuitable for
testing the ortholog conjecture. They
thus turned to RNA-Seq gene expression data, which would be relatively immune to
ascertainment bias and annotation error.
They reported that, in gene expression, orthologs are more similar to each other than
paralogs are. But regarding gene
function, the jury is still out. The
sheer difficulty of proving or rejecting the ortholog conjecture, one of the
most widely assumed principles of molecular evolution, was completely unexpected,
and it still amazes me to this day.
References
1. Altenhoff AM, Studer RA, Robinson-Rechavi M, Dessimoz C (2012) Resolving the Ortholog Conjecture: Orthologs Tend to Be Weakly, but Significantly, More Similar in Function than Paralogs. PLoS Comput Biol 8(5): e1002514.
2. Chen X, Zhang J (2012) The Ortholog Conjecture Is Untestable by the Current Gene Ontology but Is Supported by RNA Sequencing Data. PLoS Comput Biol 8(11): e1002784.
3. Nehrt NL, Clark WT, Radivojac P, Hahn MW (2011) Testing the Ortholog Conjecture with Comparative Functional Genomic Data from Mammals. PLoS Comput Biol 7(6): e1002073.
4. Thomas PD, Wood V, Mungall CJ, Lewis SE, Blake JA, et al. (2012) On the Use of Gene Ontology Annotations to Assess Functional Similarity among Orthologs and Paralogs: A Short Report. PLoS Comput Biol 8(2): e1002386.