Thursday, June 28, 2012

wrap-up of the orthology, paralogy, and function symposium at SMBE 2012

I promised some people that I would write a short summary of the symposium that Matthew Hahn, Marc Robinson-Rechavi, Iddo Friedberg, and I co-organized at SMBE 2012. I particularly enjoyed the symposium, and the room was pretty full all the time despite running in parallel with other interesting topics. I will just write an overall summary without going into too much detail on each of the talks, and at the end I will list a number of papers that were mentioned in the various talks. I have to clarify that this informal wrap-up only contains my own views and has not been agreed upon among the organizers. I invite any of the attendees to add comments highlighting important aspects that I may have missed.

I’ll start with a summary of how all this started... which was in a rather unusual way, I believe. Indeed, the idea of the symposium was born in the blogosphere, in Jonathan Eisen’s popular Tree of Life blog, where he invited Matthew Hahn to write a special guest post on the “history behind” his paper on testing the orthology conjecture. One of the conclusions of that paper was that paralogous sequences were more similar in function (and in expression patterns) than orthologs, which contradicted one of the major expectations (and assumptions) behind the theories of duplication-driven functional divergence and the strategies for inferring function from orthologous sequences. That paper had already caused a bit of turmoil in the orthology community (I remember this was a hot discussion during the last Quest for Orthologs meeting, at Cambridge), and several concerns were being raised about the suitability of comparing functional annotations from different species, and about the conclusions derived in the paper. Rather rapidly, several people commented on Matt’s post and a lively discussion started (more than 40 comments in total!). The discussion was so interesting that Marc Robinson-Rechavi suggested we should bring this scientific debate, in the form of a symposium, to one of the upcoming conferences, and that is how some of us started to work on this idea. For me, it was the first time I met the other organizers in person.

The symposium started with Eugene Koonin, who nicely introduced the topic of what conjectures could be implied by the definition of orthology, a purely evolutionary one as introduced by Walter Fitch in 1970. He then showed results from his lab indicating that these conjectures tend to hold, but that there may be exceptions. For instance, the conjecture that orthologs should be best reciprocal hits can be broken by accelerated evolution in one of the true orthologs. He then showed work from other groups (Sali, Sonnhammer) on the higher conservation of structure and domain architecture in orthologs as compared to paralogs. He criticized the use of GO terms by Hahn and others and argued that one should look at a variety of data on function to test the conjecture. He presented results from his own group which show higher conservation of expression across species. He concluded that the functional conjecture still holds, although he observed that the differences may not be spectacular.

Catherina Gushanski was next, talking about changes in gene expression following segmental duplications in mammals. They have produced an impressive dataset of expression from different tissues in various mammal species. She used that set to ask whether duplication contributes more to divergence than time alone, and showed that expression levels were decreased in younger duplicates and that the changes differed across tissues. She observed no differences between one-to-one orthologs and old duplicate pairs, and she also found no differences in tissue specificity between orthologs and paralogs.

Next on stage was Nicholas Furnham, who presented new implementations in FunTree that allow exploring functional evolution on trees. He warned that EC classification is not unambiguous, which can also cause problems for functional comparisons. They have developed “EC-Blast”, which directly measures distances between enzymatic reactions based on the molecular structures of substrates and products.

Christophe Dessimoz presented results from his recent paper, in which they show important biases in GO term annotations: genes from the same species and families tend to be annotated with more similar terms because of experimental biases and author biases. When correcting for these biases, the conjecture still holds. However, he admitted that the differences were not very big, although still significant.

Romain Studer came next. He measured selection and changes in structural stability in orthologs and duplicated genes. He showed that selected sites tend to be more clustered in the structure in paralogs than in orthologs; however, he observed no differences in the evolution of stability between orthologs and paralogs. He concluded that differences between paralogs may be smaller than previously thought.

After the coffee break, Jianzhi Zhang told us about his work towards probing the orthology conjecture. After giving them a try, he gave up on using GO terms because of the many inconsistencies and the biases observed. He thus turned to examining the conservation of protein-protein interactions, using experimentally determined interactions in various yeast species. Unfortunately, the large number of interactions to test experimentally in duplicated proteins prevented him from showing a comparison of orthologs and paralogs in this talk. Nevertheless, he found that all PPIs tested for orthologs were conserved; even the apparent exceptions turned out to be caused by possible errors in previous large-scale yeast two-hybrid experiments.

Alex Nguyen also showed results on budding yeast gene duplications. They focused on a more specific aspect of function: the presence of short conserved linear motifs in proteins. They found that these were more likely to disappear or diverge after the duplication event, consistent with neo- or sub-functionalization models.

We moved to Drosophila with our next speaker, Lev Yamplosky, who exploited expression and genomic data from the 12 Drosophila genomes. They showed larger differences in rates of divergence in paralogs than in orthologs, and these rates were also more asymmetrical in paralogs. They also found that these differences varied between fast- and slow-evolving families. Finally, they also found larger differences in paralogs in terms of expression.

Then it was my turn, and I mainly showed our results on the comparison of expression patterns in human and mouse. Our experimental design differs from others in two ways: first, we use topological dating (not sequence divergence) to establish orthologs and paralogs of a similar age, and, second, we always compare orthologs to inter-species paralogs to get rid of species-specific biases in the comparisons. Our results support a larger divergence of paralogs as compared to orthologs in tissue expression patterns. Thanks to our experimental design, we could also assess that most of the differences between paralogs were gained shortly after the duplication, linking the duplication event to a big fraction of the divergence.

Our last speaker was Paul Thomas, who gave an overview of what you can and cannot expect from GO annotations. He also showed progress on how the consortium is trying to model functional evolution through gene families, and how these models can help in the study of the relationship between orthology, paralogy and gene function.
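For readers who want a concrete picture of the kind of comparison described for the human and mouse expression data, here is a minimal sketch. It is not the pipeline used in any of the talks: the gene names, the expression values and the choice of Pearson correlation as the similarity measure are all assumptions made for illustration. The only point it tries to capture is that both classes of pairs are compared across the same two species.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equally long expression profiles."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical expression levels across the same four tissues (made-up numbers).
human = {"A1": [10, 2, 5, 1], "A2": [1, 8, 2, 9]}
mouse = {"a1": [9, 3, 6, 1], "a2": [2, 7, 1, 10]}

# Every pair is human-vs-mouse, so orthologs and paralogs share the same
# species contrast and species-specific biases affect both classes equally.
ortholog_pairs = [("A1", "a1"), ("A2", "a2")]
paralog_pairs = [("A1", "a2"), ("A2", "a1")]


def mean_similarity(pairs):
    """Average expression-profile correlation over a list of (human, mouse) pairs."""
    return mean(pearson(human[h], mouse[m]) for h, m in pairs)


ortholog_similarity = mean_similarity(ortholog_pairs)
paralog_similarity = mean_similarity(paralog_pairs)
print(f"orthologs: {ortholog_similarity:.2f}, paralogs: {paralog_similarity:.2f}")
```

In a real analysis the pairs would come from a phylogeny-based orthology prediction, and the paralog pairs would be restricted to duplications of an age comparable to the speciation event.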

Thus we had a diverse set of talks, most of them focusing on the comparison of different aspects of functional evolution (GO annotations, expression, functional motifs, interactions, divergence, structure) and using varying experimental designs and species. I would say one of the main conclusions is that GO annotations (and even EC numbers) can be misleading in our assessment of functional evolution. My personal view is that most talks showed results consistent with the conjecture, although the level of difference between paralogs and orthologs was sometimes small. Function can be described at multiple levels, and I would expect that functional divergence after duplication may affect only one or a few of these. Thus, if an experimental design focuses on one such level, it may be expected to miss divergence at the others. In addition, designs that average over all levels will inevitably dilute small but important aspects of functional divergence. In conclusion, this is an exciting topic, and with the number and variety of groups that are now interested in it, I am sure that we will get closer and closer to understanding the complex relationships between orthology, paralogy and functional divergence.

Some links and papers mentioned during the symposium (I have probably missed some):

Abstracts from oral presentations in SMBE, including our symposium

Another post on the orthology conjecture 

Announcement of our symposium

FunTree: a resource for exploring the functional evolution of
structurally defined enzyme superfamilies.
Furnham N, Sillitoe I, Holliday GL, Cuff AL, Rahman SA, Laskowski RA,
Orengo CA, Thornton JM.
Nucleic Acids Res. 2012 Jan;40(Database issue):D776-82

Brawand, D., et al. The evolution of gene expression levels in mammalian organs.

Forslund et al. Domain architecture conservation in orthologs.

Huerta-Cepas and Gabaldón. Assigning duplication events to relative temporal scales in genome-wide studies.

Nehrt et al. Testing the Ortholog Conjecture with Comparative Functional Genomic Data from Mammals.

Nguyen et al. Proteome-Wide Discovery of Evolutionary Conserved Sequences in Disordered Regions. Sci Signal. 5(215):rs1.

Peterson et al. Evolutionary constraints on structural similarity in orthologs and paralogs.

Thomas et al. On the Use of Gene Ontology Annotations to Assess Functional Similarity among Orthologs and Paralogs: A Short Report.

Large-scale analysis of orthologs and paralogs under covarion-like and
constant-but-different models of amino acid evolution.
Studer RA, Robinson-Rechavi M.
Mol Biol Evol. 2010 Nov;27(11):2618-27.

How confident can we be that orthologs are similar, but paralogs differ?
Studer RA, Robinson-Rechavi M.
Trends Genet. 2009 May;25(5):210-6.

Pervasive positive selection on duplicated and nonduplicated vertebrate
protein coding genes.
Studer RA, Penel S, Duret L, Robinson-Rechavi M.
Genome Res. 2008 Sep;18(9):1393-402.

1 comment:

  1. Thanks for the nice write-up, it was a great summary and very interesting to read. Seems like it was a great and fruitful symposium!