Treevolution: Biology through the evolutionary lens<br />
Blog of Toni Gabaldón, evolutionary biologist, working on comparative genomics at the Centre for Genomic Regulation (Barcelona, Spain)<br />
<br />
Response to Late Mitochondrial Origin is Pure Artefact<br />
Posted by Toni on 2016-05-31 (updated 2016-06-04)<br />
<br />
<div style="text-align: justify;">
We <span style="color: black;"><a href="http://www.nature.com/nature/journal/v531/n7592/full/nature16941.html">recently</a> published a study showing that alpha-proteobacterial-derived proteins in the Last Eukaryotic Common Ancestor (LECA) tend to have shorter phylogenetic distances to their bacterial counterparts than LECA proteins originating from other Bacteria or from Archaea. We interpreted this as evidence for a late acquisition of mitochondria by a host that already contained bacterial- and archaeal-derived protein families. Our work has been heavily criticized by William Martin, one of the main proponents of mito-early hypotheses, and colleagues. The critique was first submitted to Nature, assessed by editors and independent reviewers, and eventually rejected. The authors have decided to publish a slightly modified version of the letter in <a href="http://biorxiv.org/content/early/2016/05/25/055368">bioRxiv</a>. In my opinion the tone of the letter is unacceptable for an open scientific discussion. In any case, the bottom line is that their arguments do not support the claim that our results are artefactual, nor do they show how the purported artefact would produce the observed trend. For the sake of scientific discussion we have decided to publish our original response to their letter. We tried to post it in bioRxiv, but it was declined because it "<i>is a rebuttal to a criticism not a research paper</i>". Therefore I have decided to post it here. </span></div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<style type="text/css">p { margin-bottom: 0.1in; direction: ltr; color: rgb(0, 0, 0); line-height: 120%; text-align: left; text-decoration: none; page-break-before: auto; page-break-after: auto; }p.western { font-family: "Arial",sans-serif; font-size: 11pt; font-style: normal; font-weight: normal; }p.cjk { font-family: "Arial",sans-serif; font-size: 11pt; font-style: normal; font-weight: normal; }p.ctl { font-family: "Arial",sans-serif; font-size: 11pt; font-style: normal; font-weight: normal</style></div>
<div align="justify" class="western" style="line-height: 115%; margin-bottom: 0in; page-break-after: auto; page-break-before: auto;">
<span style="color: maroon;"><span style="font-size: small;">Martin
et al. criticize several methodological aspects of our study. We
first want to note that none of the points raised affects the core of
our conclusions, <i>i.e.</i> that differences in stem lengths relate
to the phylogenetic origin of LECA families, so that stems are shorter
in bacterial-derived, and particularly alpha-proteobacterial-derived,
families. This is because i) the observed relationships are
independent of the clustering performed in Figure 1 of Pittis and
Gabaldón (2016), and ii) their criticism focuses on a single
comparison in a single dataset, whereas the differences are present
across several datasets and approaches, including the very dataset of
the authors mentioned in their letter (Ku et al. 2015), as we show
below. Secondly, their interpretation of our stem length measurement,
and the way they extrapolate it to branches subtending eukaryotic
clades, is conceptually flawed, as we also demonstrate below. Thus
none of their arguments compromises in any way the main conclusions
of our article. We nevertheless want to discuss their points. </span></span>
</div>
<div align="justify" class="western" style="line-height: 115%; margin-bottom: 0in; page-break-after: auto; page-break-before: auto;">
<br /></div>
<div align="justify" class="western" style="line-height: 115%; margin-bottom: 0in; page-break-after: auto; page-break-before: auto;">
<span style="color: black;"><span style="font-size: small;">Contrary
to what Martin et al. claim, we do not assume that the global
distribution of stem lengths is normal. The claim that our
statistical analyses are inappropriate is simply not true: we clearly
explain all the methods used, and the tests performed to support the
observed differences are all nonparametric, without any assumption of
normality. In Figure 1 we did use a probabilistic clustering method
that fits a Gaussian mixture model (a mixture of normal
distributions), which assumes multimodality in the data. Martin et
al. show that a unimodal log-normal distribution fits the data better
when the number of parameters is penalized. Does this demonstrate
that the underlying distribution is not a composite of five
Gaussians? No, because when data are drawn from a mixture of five
Gaussian distributions with the obtained parameters, a log-normal
distribution is (wrongly) preferred under the BIC in 81% of the
cases. Also, the fact that any randomly sampled log-normal
distribution can be fitted by a mixture model is by no means a
surprise. In fact, any distribution of data can be fitted by a finite
number of mixture components, and this is precisely why mixture
models are commonly used as universal function approximators and as a
tool to partition various kinds of data. Finally, overfitting is
defined not by BIC inflation but by a lack of predictive power. Thus
other criteria have to be considered when assessing whether a model
provides a reasonable representation of the data. The use of the EM
algorithm is justified as a method for partitioning the data because
i) we may expect a composite of signals from a proteome (LECA) with
at least two ancestral components (archaeal host and bacterial
endosymbiont), and ii) prior studies have suggested that normalized
branch length measurements such as the ones used here are
approximately normal (Rasmussen and Kellis, 2007). The assumption of
a unimodal distribution such as the one proposed by Martin et al.
does not capture the mixed origins expected for a chimeric proteome
and does not fit the observation that differences in stem lengths
relate to non-homogeneous phylogenetic origins. In any case, our
results are independent of this clustering exercise, as the
differences in stem lengths are apparent when simply grouping the
LECA families according to their sister clades (Fig. 2 and Extended
Data Fig. 1b of Pittis and Gabaldón, 2016), or when partitioning the
data in other ways such as equal binning (results not shown). </span></span>
</div>
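<div style="text-align: justify;">
The kind of simulation described above can be sketched as follows. This is a minimal illustration, not the code used for the 81% figure: the mixture weights, means and standard deviations are placeholders rather than the fitted values, the EM routine is a bare-bones one written only for this example, and the function names are hypothetical. It draws data from a five-component Gaussian mixture, fits both a log-normal and a five-component mixture, and counts how often the BIC prefers the log-normal.</div>

```python
import math
import random

def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion; the model with the lower value is preferred."""
    return n_params * math.log(n_obs) - 2.0 * log_lik

def lognormal_loglik(xs):
    """Maximised log-likelihood of a log-normal (closed-form MLE on log x)."""
    logs = [math.log(x) for x in xs]
    mu = sum(logs) / len(logs)
    var = sum((v - mu) ** 2 for v in logs) / len(logs)
    return sum(-v - 0.5 * math.log(2 * math.pi * var) - (v - mu) ** 2 / (2 * var)
               for v in logs)

def normal_pdf(x, mu, sd):
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def gmm_loglik(xs, k, n_iter=50):
    """Log-likelihood of a k-component 1-D Gaussian mixture fitted by plain EM."""
    n = len(xs)
    srt = sorted(xs)
    mean = sum(xs) / n
    sd_all = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    mus = [srt[int((j + 0.5) * n / k)] for j in range(k)]  # start at quantiles
    sds = [sd_all / k] * k
    ws = [1.0 / k] * k
    floor = 1e-3 * sd_all  # keeps a component from collapsing onto one point
    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation
        resp = []
        for x in xs:
            p = [w * normal_pdf(x, m, s) for w, m, s in zip(ws, mus, sds)]
            tot = sum(p) or 1e-300
            resp.append([v / tot for v in p])
        # M-step: re-estimate weights, means and standard deviations
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            mus[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - mus[j]) ** 2 for r, x in zip(resp, xs)) / nj
            sds[j] = max(math.sqrt(var), floor)
            ws[j] = nj / n
    return sum(math.log(max(sum(w * normal_pdf(x, m, s)
                                for w, m, s in zip(ws, mus, sds)), 1e-300))
               for x in xs)

def simulate_mixture(n, weights, means, sds, rng):
    """Draw n positive values from a Gaussian mixture (non-positive draws are redrawn)."""
    xs = []
    while len(xs) < n:
        j = rng.choices(range(len(weights)), weights=weights)[0]
        x = rng.gauss(means[j], sds[j])
        if x > 0:  # a log-normal is only defined for positive values
            xs.append(x)
    return xs

def fraction_lognormal_preferred(n_rep, n, weights, means, sds, seed=0):
    """Fraction of simulated datasets for which BIC prefers the log-normal."""
    rng = random.Random(seed)
    k = len(weights)
    wins = 0
    for _ in range(n_rep):
        xs = simulate_mixture(n, weights, means, sds, rng)
        bic_lognormal = bic(lognormal_loglik(xs), 2, n)
        bic_mixture = bic(gmm_loglik(xs, k), 3 * k - 1, n)  # k means, k sds, k-1 weights
        wins += bic_lognormal < bic_mixture
    return wins / n_rep

# Placeholder parameters for illustration only, NOT the values fitted in the paper.
WEIGHTS = [0.10, 0.25, 0.30, 0.25, 0.10]
MEANS = [0.4, 0.8, 1.2, 1.8, 2.6]
SDS = [0.15, 0.25, 0.35, 0.50, 0.70]

if __name__ == "__main__":
    print(fraction_lognormal_preferred(10, 200, WEIGHTS, MEANS, SDS))
```

<div style="text-align: justify;">
The fraction returned depends entirely on the parameters and sample size supplied, so it should not be expected to match the 81% quoted above unless the fitted parameters and the real sample size are used.</div>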
<div align="justify" class="western" style="line-height: 115%; margin-bottom: 0in; page-break-after: auto; page-break-before: auto;">
<br /></div>
<div align="justify" class="western" style="line-height: 115%; margin-bottom: 0in; page-break-after: auto; page-break-before: auto;">
<span style="color: black;"><span style="font-size: small;">Their
purported extrapolation of our analyses to eukaryotic clades, and the
dates they derive from it, are totally flawed and misleading. First
of all, we explicitly say that we do not assume constant rates
(<i>i.e.</i> a molecular clock). Our normalized branch length is a
measurement that is proportional to time but multiplied by the ratio
between the rates preceding and postdating LECA, so their timing
exercise, which provides date estimates, is completely ungrounded.
Secondly, Martin et al. consider the normalized <i>sl</i> to yield
arbitrary values, resulting in a log-normal distribution. This openly
contradicts the observation that families of different prokaryotic
origins show significant differences in <i>sl</i> and also in
<i>rsl</i> values. All our analyses robustly show the opposite: there
are differences, and these differences reflect the relative
divergence times. The cases of the cyanobacterial signal in
Archaeplastida (Extended Data Fig. 3, Pittis and Gabaldón 2016) and
of the Lokiarchaeota signal in LECA (Extended Data Fig. 7, Pittis and
Gabaldón 2016) nicely illustrate the validity of the measurement.
Expecting some extreme <i>ebl</i> values to reflect radical
adaptations and fast rates in some lineages, we used the median
because of its robustness to extreme outliers (see Methods). We also
tried excluding fast-evolving taxonomic groups from the calculations,
without any change in our main results. None of these observations is
explained by the interpretation of the data provided by Martin et al.
Furthermore, Martin et al. show that the normalized branch lengths
subtending each eukaryotic clade follow log-normal distributions, and
conclude that this demonstrates natural variation for branches meant
to represent a single time interval (e.g. the divergence of fungi
from metazoans). By adopting this assumption they surprisingly ignore
that eukaryotic families are also subject to differential gene loss
and other processes, which result in multiple underlying patterns in
the subtending branches (e.g. the subtending branch of a fungal
family that was lost in metazoans does not derive from the divergence
between fungi and metazoans, but from the deeper divergence between
fungi and other unikonts). This becomes apparent when controlling for
the relationship of the normalized branch lengths with the
phylogenetic affiliation of the sister branch, a key step in our
analyses which they ignore. Indeed, applying an EM-based clustering
to the eukaryotic clades and measuring enrichments in phylogenetic
affiliations, as we did in our previous analysis (Pittis and
Gabaldón, 2016), reveals major underlying distributions related to
the nature of the sister group (Figure 1). Thus, in this case also,
the variation of <i>sl</i> values, interpreted by the authors as
“vividly documenting abundant branch length variation”, is clearly
shown to carry the signal of different divergence times. So yes, the
<i>sl</i> values in eukaryotic groups do imply phases of early and
late divergence due to gene loss or other biological events, as they
do in the case of LECA. Of note, this is a new, independent
demonstration that variation in stem lengths relates to underlying
variation in phylogenetic distribution, and it provides additional
support for our approach. </span></span></div>
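<div style="text-align: justify;">
The point about rates can be written out explicitly. The following is only a sketch under stated assumptions: it takes <i>sl</i> to be the raw stem length <i>rsl</i> divided by the median eukaryotic branch length <i>ebl</i>, and it introduces, purely for illustration, average substitution rates before and after LECA (r_pre, r_post) and the corresponding time spans (t_stem, t_euk). These symbols are not taken from the paper.</div>

```latex
\begin{align}
rsl &\approx r_{\mathrm{pre}}\, t_{\mathrm{stem}}, &
ebl &\approx r_{\mathrm{post}}\, t_{\mathrm{euk}}, \\
sl &= \frac{rsl}{ebl} \approx
      \frac{r_{\mathrm{pre}}}{r_{\mathrm{post}}} \cdot
      \frac{t_{\mathrm{stem}}}{t_{\mathrm{euk}}}.
\end{align}
```

<div style="text-align: justify;">
Under these assumptions, <i>sl</i> values can be compared between families to rank stem durations, but turning them into absolute dates would additionally require the rate ratio to equal one, which is the molecular clock assumption that is not made.</div>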
<div align="justify" class="western" style="line-height: 115%; margin-bottom: 0in; page-break-after: auto; page-break-before: auto;">
<br /></div>
<div align="justify" class="western" style="line-height: 115%; margin-bottom: 0in; page-break-after: auto; page-break-before: auto;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhiImL8RedMEEkIkd3iG9AWjlFuv8eRkjzeWsQzBateutgpMGH0aFbuUhWLF9vqyI2rNVHJfgmsfejKdiGuz7-ZD0gVol6XDLiZJqP0cnqJwiF9kRzFVo3psUImlo95LKM1YN2keaki-6k/s1600/Figure1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="205" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhiImL8RedMEEkIkd3iG9AWjlFuv8eRkjzeWsQzBateutgpMGH0aFbuUhWLF9vqyI2rNVHJfgmsfejKdiGuz7-ZD0gVol6XDLiZJqP0cnqJwiF9kRzFVo3psUImlo95LKM1YN2keaki-6k/s320/Figure1.png" width="320" /></a></div>
<div align="justify" class="western" style="line-height: 115%; margin-bottom: 0in; page-break-after: auto; page-break-before: auto;">
<span style="color: black;"><span style="font-size: small;">Figure 1 | Ascomycota stem length analysis. Different phylogenetic sister groups show significant differences in stem lengths according to their divergence times from Ascomycota. Gene losses in the sister group lineage can explain the alternative tree topologies and differences in estimated stem lengths.</span></span>
</div>
<div align="justify" class="western" style="line-height: 115%; margin-bottom: 0in; page-break-after: auto; page-break-before: auto;">
<br /></div>
<div align="justify" class="western" style="line-height: 115%; margin-bottom: 0in; page-break-after: auto; page-break-before: auto;">
<span style="color: black;"><span style="font-size: small;">Finally,
Martin et al. focus their criticism on only one of our comparisons
and on only one of the datasets used. For that dataset, they wrongly
claim that we reused eukaryotic sequences in different trees. This is
false. Given the multidomain nature of eukaryotic protein sequences,
the source of that dataset (Powell et al. 2014) may assign a given
protein to more than one orthologous cluster. However, we made sure
we only used the orthologous sequence regions in a given analysis,
thus never re-using a given eukaryotic sequence. Our analyses use
standard filtering approaches, but they claim that statistical
significance for one of our comparisons (alpha-proteobacterial versus
other bacterial families) is lost when applying additional ad hoc
filtering on top of our previous filtering steps. We must note that,
even when applying their filters and using a permutation test like
the one used in our paper, the alpha-proteobacterial <i>sl</i> values
remain significantly lower than those of other bacteria
(<i>P</i>=1e-2 when considering only families with eukaryotic
sequence lengths ≥ 100, and <i>P</i>=3.7e-2 when considering only
alignments with gaps ≤ 50%; 10<sup>6</sup> permutations). The loss of
significance in some of the tests when artificially reducing the data
is unsurprising. We are focusing on very ancient events, so the
signal we are measuring is necessarily weak, and the number of LECA
families that can be traced back to specific ancestries is limited.
Indeed, the statistical significance of a Mann-Whitney <i>U</i>-test
is often lost (in more than 60-70% of cases) when randomly reducing
the data to sizes similar to those of their filtered dataset, which
suggests that the mere reduction in size, rather than the particular
additional filtering used, has a major effect. This is why we made
sure the signal was robust across different datasets, always using
state-of-the-art filtering approaches. Given the suggestion by Martin
et al. that a recent phylogenetic analysis of theirs (which appeared
after we had submitted the paper) represents a more careful dataset
(Ku et al., 2015), we repeated our analyses using this dataset, which
confirmed our results (650 eukaryotic clades; archaeal vs bacterial
families, <i>P</i>=1.2e-41, two-tailed Mann-Whitney <i>U</i>-test;
α-proteobacterial families' <i>sl</i> significantly smaller within
bacterial families, <i>P</i>=4.7e-2, permutation test,
10<sup>6</sup> permutations). Again, this result lends further
support to our findings. </span></span>
</div>
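<div style="text-align: justify;">
The two checks discussed in the paragraph above can be sketched in a few lines. This is an illustration under explicit assumptions, not the code used in the paper: the permutation statistic is taken here to be the difference in medians (the text does not specify it), the Mann-Whitney <i>P</i>-value uses a normal approximation without tie or continuity correction, and all function names are hypothetical.</div>

```python
import math
import random
from statistics import median

def permutation_pvalue(a, b, n_perm=10000, seed=0):
    """One-sided P that median(a) - median(b) is at least as low as observed
    when group labels are shuffled (assumed statistic: difference in medians)."""
    rng = random.Random(seed)
    observed = median(a) - median(b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = median(pooled[:len(a)]) - median(pooled[len(a):])
        hits += diff <= observed
    return (hits + 1) / (n_perm + 1)  # add-one correction avoids P = 0

def mann_whitney_p(a, b):
    """Two-sided Mann-Whitney U P-value (normal approximation, no tie correction)."""
    pooled = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    n = len(pooled)
    rank_sum_a = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # tied values share the average of ranks i+1..j
        rank_sum_a += avg_rank * sum(1 for t in range(i, j) if pooled[t][1] == 0)
        i = j
    n1, n2 = len(a), len(b)
    u = rank_sum_a - n1 * (n1 + 1) / 2.0
    mu = n1 * n2 / 2.0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mu) / sigma
    return math.erfc(abs(z) / math.sqrt(2))

def fraction_significance_lost(a, b, size_a, size_b, n_rep=1000, alpha=0.05, seed=0):
    """Fraction of random subsamples (without replacement) in which the
    Mann-Whitney test is no longer significant at level alpha."""
    rng = random.Random(seed)
    lost = 0
    for _ in range(n_rep):
        sub_a = rng.sample(a, size_a)
        sub_b = rng.sample(b, size_b)
        lost += mann_whitney_p(sub_a, sub_b) >= alpha
    return lost / n_rep
```

<div style="text-align: justify;">
Calling <code>fraction_significance_lost</code> with the two groups of <i>sl</i> values and the group sizes that remain after a given filter estimates how often significance would be lost through size reduction alone. That estimate is the baseline against which the effect of a specific filter has to be judged.</div>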
<div align="justify" class="western" style="line-height: 115%; margin-bottom: 0in;">
<br /></div>
<div align="justify" class="western" style="line-height: 115%; margin-bottom: 0in;">
<span style="font-size: small;">Altogether, we show that the
criticisms raised by Martin et al. do not compromise the main results
and conclusions of our paper. Furthermore, we would like to stress
that the new dataset and analyses brought about by this discussion
lend additional support to our approach and conclusions. </span>
</div>
<div align="justify" class="western" style="line-height: 115%; margin-bottom: 0in;">
<br /></div>
<ol>
<li><div align="justify" class="western" style="line-height: 115%; margin-bottom: 0in; page-break-after: auto; page-break-before: auto;">
<span style="color: black;"><span style="font-family: arial, sans-serif;"><span style="font-size: small;">Ku, C. et al. Endosymbiotic origin and differential loss of eukaryotic genes. <i>Nature</i> <b>524</b>, 427–432 (2015).</span></span></span></div>
</li>
<li><div align="justify" class="western" style="line-height: 115%; margin-bottom: 0in; page-break-after: auto; page-break-before: auto;">
<span style="color: black;"><span style="font-family: arial, sans-serif;"><span style="font-size: small;">Rasmussen, M. D. & Kellis, M. Accurate gene-tree reconstruction by learning gene- and species-specific substitution rates across multiple complete genomes. <i>Genome Res.</i> <b>17</b>, 1932–1942 (2007).</span></span></span></div>
</li>
<li><div align="justify" class="western" style="line-height: 115%; margin-bottom: 0in; page-break-after: auto; page-break-before: auto;">
<span style="color: black;"><span style="font-family: arial, sans-serif;"><span style="font-size: small;">Pittis, A. A. & Gabaldón, T. Late acquisition of mitochondria by a host with chimaeric prokaryotic ancestry. <i>Nature</i> <b>531</b>, 101–104 (2016).</span></span></span></div>
</li>
<li><div align="justify" class="western" style="line-height: 115%; margin-bottom: 0in;">
<span style="color: black;"><span style="font-family: arial, sans-serif;"><span style="font-size: small;">Powell, S. et al. eggNOG v4.0: nested orthology inference across 3686 organisms. <i>Nucleic Acids Res.</i> <b>42</b>, D231–D239 (2014).</span></span></span></div>
</li>
</ol>
A genetic cartography of humans<br />
Posted by Toni on 2012-11-01<br />
<br />
<div style="text-align: justify;">
The Phase I paper of the <a href="http://www.1000genomes.org/">1000 Genomes Project</a> has been published in <a href="http://www.nature.com/nature/journal/v491/n7422/full/nature11632.html">Nature</a>. Like the completion of the first draft of the human genome sequence, this work constitutes a milestone on the path to understanding the complex relationships between genotype and phenotype in our species. With the first human sequence we had, for the first time, a broad view of the genetic constituents of our species, and there is no doubt that this has advanced our understanding in many fields related to human biology and disease. <b>What, then, is the significance of having 999 more genomes?</b> Several journalists have asked me this question in recent days. </div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
If one wanted to <b>describe our species purely in genetic terms</b>, a single genome could be a good approximation, but only that, an approximation. We know that we all differ from each other genetically, and that some of these differences explain part of the observable differences (the phenotype). What is the extent and nature of the genetic differences that currently exist in the human population, or that existed in the past? Which of these differences are important in terms of phenotypic variability, including the propensity to suffer from certain diseases? What fraction of these differences have no important effect and can vary freely? None of these questions can be answered from the analysis of a single genome; only the comparison of a large set of genomes can give us a better idea of what the genome of our species is. </div>
<div style="text-align: justify;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgIzxvaga0ywpABq3Ofw1h1cMT01A0Li0zGl4S8lICT6i4G_FEtAFJG9e73jJFszjeQjvJ9uAwnmNISE7GrJ_0Pde6y0cspA171tFbaadpIVgOkoVmSN_Wn35gZhsX3SFTUa54RZIdWSYs/s1600/early_map.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="231" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgIzxvaga0ywpABq3Ofw1h1cMT01A0Li0zGl4S8lICT6i4G_FEtAFJG9e73jJFszjeQjvJ9uAwnmNISE7GrJ_0Pde6y0cspA171tFbaadpIVgOkoVmSN_Wn35gZhsX3SFTUa54RZIdWSYs/s320/early_map.jpg" width="320" /></a></div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
The analogy of a map has been used several times to illustrate how the genome sequence has helped us navigate the genome and has enabled dramatic improvements in how we address questions related to human biology. I think the analogy is very good: a map in itself has only limited scientific value, since it is basically a description. However, just as ancient maps dramatically affected the course of history, having these maps enables unanticipated scientific discoveries. These first 1000 (1092 to be exact) genomes constitute a first cartography of human genetic variability, providing detailed information on which mutations occur in different populations. This map is not complete, of course, but it offers a good level of resolution. <b>The authors estimate that we now have a catalogue of more than 98% of the mutations that occur at a frequency of at least 1%. </b>Continuing with the analogy, what we still miss are the specific details of the coastal areas, as if we were seeing them from very far away. This missing variability may be important, since variants involved in deleterious phenotypes (disease) are expected to be at very low frequencies. Thus the effort to improve this cartography will continue, and 1500 additional genomes are planned within the consortium. In parallel, many other projects, and even some private individuals, are producing more individual genome sequences. It will be important to ensure that all this information ends up in public repositories, so that it is efficiently exploited by the scientific community. </div>
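<div style="text-align: justify;">
A back-of-the-envelope calculation shows why common variants are well covered while rare ones are not. The sketch below assumes unrelated diploid individuals and that a variant is detected whenever it is present in the sample. This is a simplification that ignores sequencing depth and variant-calling power, so it gives an upper bound and not the authors' estimate. The function name is hypothetical.</div>

```python
def prob_variant_sampled(freq, n_individuals):
    """Probability that an allele at population frequency `freq` appears at
    least once among the 2 * n_individuals chromosomes sampled."""
    return 1.0 - (1.0 - freq) ** (2 * n_individuals)

if __name__ == "__main__":
    for freq in (0.01, 0.001, 0.0001):
        print(freq, prob_variant_sampled(freq, 1092))
```

<div style="text-align: justify;">
With 1092 genomes, a variant at 1% frequency is almost certain to be present in the sample, whereas one at 0.01% is present in only about one sample in five. This is the sense in which the fine details of the map are still missing.</div>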
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhqkP6xfvUt_-l2Y-KaUbvycRyU8wIdYp4tyfuMzw7Hp4UfKUQVFD9YQvLwT2YeGfNrWWHl8Tam3OwxyKMNsFVLqb8cQ_7rmwxSNUwcNL8j1-hskD6Vx_wBXMb4Mf0YTKhaBC_2es6I-RE/s1600/250px-NHGRI_human_male_karyotype.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhqkP6xfvUt_-l2Y-KaUbvycRyU8wIdYp4tyfuMzw7Hp4UfKUQVFD9YQvLwT2YeGfNrWWHl8Tam3OwxyKMNsFVLqb8cQ_7rmwxSNUwcNL8j1-hskD6Vx_wBXMb4Mf0YTKhaBC_2es6I-RE/s1600/250px-NHGRI_human_male_karyotype.png" /></a></div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
The 1000 Genomes paper is very descriptive but already shows some important results that have an impact on how we think about the relationships between genotypes and phenotypes. <b>They report that an individual carries on average 200-300 variants that affect conserved residues in non-coding sequences, and even 2-4 that have been associated with disease in other studies.</b> All the individuals sequenced are healthy, so this result tells us about the capacity of the genome to tolerate mutations that may be deleterious in other genetic backgrounds. There is much to learn from this, and the 1000 genomes will be a useful resource for studies trying to associate genetic backgrounds with disease propensity. In addition, the genome sequences carry the footprints of the<b> recent evolution of human populations</b>, and the level of observable variability at a site can be informative about its potential functionality. Thus the possible applications of these data are many and, as I put it to a journalist, the main scientific discoveries enabled by this article are yet to come. </div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
Finally, there is one important aspect that journalists do not pay much attention to. Putting together this project has been a gigantic effort and has required the development <b>of new tools and algorithms</b> to work with this massive amount of data. Only the coordinated efforts of many groups have made this possible. This comes at a time in which such tools are desperately needed, given the growing impact of individual genome sequencing in medicine and other fields. Similar to how an ambitious mission to bring a rover to Mars impacts scientific development beyond the particular purpose of the mission, the tools developed by the 1000 Genomes Project are already playing a role in hundreds of other genomics projects. Thus the merit of big consortium projects like this one lies not only in the immediate scientific discoveries, at times disappointing because they are inevitably only descriptive, but also in their <b>catalytic effect</b> on a scientific field. </div>
<div style="text-align: justify;">
<br /></div>
Can genomics save endangered species?<br />
Posted by Toni on 2012-09-22
<br />
<div align="JUSTIFY" style="margin-bottom: 0in;">
<span style="font-style: normal;"><span style="font-weight: normal;">Nowadays
genomics is pervading many research fields in biology, and
conservation biology is no longer an exception. The <a href="http://www.nature.com/nature/journal/v463/n7279/full/nature08696.html">giant panda</a> was
perhaps the first organism selected for sequencing primarily because of
its status as an endangered species. Since then,
other species have been selected for sequencing, in an effort to
contribute to their conservation. To name a few, the California
condor, the tiger, the Tasmanian devil and
the Iberian lynx are also entering the genomic era. Our group is
contributing to the efforts to sequence and analyze the Iberian
lynx genome. This emblematic predator of our peninsula has the
dubious honor of being the most endangered feline species on the planet.
With a population below 400, a fragmented and restricted distribution
area and a dangerously low level of genetic diversity, its situation
is rather critical. Two years ago a <a href="http://lynxgenomics.org/">consortium of Spanish research groups</a> joined forces to sequence this species' genome. </span></span></div>
<div align="JUSTIFY" style="margin-bottom: 0in;">
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjhp4Yjm9DKzr6DR1ev3CJQ-YOHc3jbBlDciFdqr7ChgoVxzu8vkWWSiICBzC0E1PWjbH70_6S4Mvt2hyphenhyphen3uugHLg9WeipCFjzCf9UNmm5BoiTtsCO4xTT71R_SIkAVm9LzsZx6j9zkXMQ4/s1600/Candiles3_2C.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjhp4Yjm9DKzr6DR1ev3CJQ-YOHc3jbBlDciFdqr7ChgoVxzu8vkWWSiICBzC0E1PWjbH70_6S4Mvt2hyphenhyphen3uugHLg9WeipCFjzCf9UNmm5BoiTtsCO4xTT71R_SIkAVm9LzsZx6j9zkXMQ4/s320/Candiles3_2C.jpg" width="264" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">"Candiles" the sequenced Iberian Lynx male </td></tr>
</tbody></table>
<div align="JUSTIFY" style="margin-bottom: 0in;">
<span style="font-style: normal;"><span style="font-weight: normal;"> </span></span>
</div>
<div align="JUSTIFY" style="margin-bottom: 0in;">
<br /></div>
<div align="JUSTIFY" style="margin-bottom: 0in;">
<span style="font-style: normal;"><span style="font-weight: normal;">I
have been asked many times whether this effort will definitely save the
species, or even whether the money would not be better invested in
other efforts. How can a genome help in saving an endangered
species? Are we feeding unreasonable expectations about the possible
role of genomics in species conservation? Although only time will
tell whether such efforts will pay off, I consider that genomics can
certainly provide a new, very useful angle on species conservation.
In any case, genomics should be considered just another tool
for species conservation, rather than the definitive solution.
Species are endangered for various reasons, mostly territory
loss and degradation, overexploitation, and alteration of their
ecological networks. It is obvious that the main focus should be
on fighting the causes that triggered population drops and creating
the necessary conditions for the populations to recover safely. As a
powerful tool to understand a species' biology, and as a way to
investigate past and current population dynamics, the availability of
a genome can greatly help in understanding some of the factors that
may have been decisive in population decline. Having a reference
genome opens the door to closer genetic monitoring of wild
populations, not only because it enables the selection of new marker
genes that can be sampled in many individuals but also because it
paves the way for obtaining whole genome-level population data by
re-sequencing strategies. Indeed, our project already includes
re-sequencing of additional individuals from the main fragmented
territories occupied by the species. </span></span>
</div>
<div align="JUSTIFY" style="margin-bottom: 0in;">
<br /></div>
<div align="JUSTIFY" style="margin-bottom: 0in;">
<span style="font-style: normal;"><span style="font-weight: normal;">Having
this kind of data is key to understanding gene flow among the different
populations, since it will provide a better picture of the genetic
pools of the different populations. This will help to better plan crosses among captive individuals (mainly those
with permanent injuries, which cannot be successfully released into the
wild) and future releases of their progeny. This will have a direct
impact in the case of the Iberian lynx, where high levels of
inbreeding and low genetic diversity expose fragmented populations
to a higher rate of diseases with a genetic basis (particularly a
renal disease) and to a reduced capacity to overcome
infectious diseases. A better knowledge of the genetic pool of both
wild and captive populations will undoubtedly help in guiding
strategies to help them recover. In addition, individuals and their
territories could be tracked from materials such as faeces or hairs. Other applications may be more
specific to a particular endangered species. For instance, in the
Tasmanian devil, genomics has been used to track a transmissible
cancer that causes a <a href="http://www.bbc.co.uk/news/science-environment-17062091">facial tumor disease</a> and is transmitted by
biting. </span></span><br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgociXW-f7lJ-wfilCkxWVnInRXqdDgoYQwGYkAXYuvh6ZRnxTQPqkVTQg9jlGhhn0g_ewMdp279I4tZrlywPhauVQf6Dlq20zbIBvhhcV6WPhhcPAWkiBFhj1X2Ox71GHMRRVDpaaiPls/s1600/tasmanian.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgociXW-f7lJ-wfilCkxWVnInRXqdDgoYQwGYkAXYuvh6ZRnxTQPqkVTQg9jlGhhn0g_ewMdp279I4tZrlywPhauVQf6Dlq20zbIBvhhcV6WPhhcPAWkiBFhj1X2Ox71GHMRRVDpaaiPls/s320/tasmanian.jpg" width="243" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Tasmanian devil with transmissible facial tumor</td></tr>
</tbody></table>
<br />
<br />
<span style="font-style: normal;"><span style="font-weight: normal;">Other applications of conservation genomics go beyond
the sequencing of the endangered species itself and involve
monitoring, with similar genomics tools, important pathogens or
symbionts of endangered species. Of course, all these efforts will
be of little help if the causes that drove the decline are
still around. Thus there is a growing number of promising possible
applications of genomics to the conservation of endangered species,
some of them already at work. I expect this field to grow fast in the
coming years. As a concerned scientist, I am proud that my particular
corner of expertise can contribute to the noble cause of helping to
preserve the biodiversity of our planet. </span></span>
</div>
<div align="JUSTIFY" style="margin-bottom: 0in;">
</div>
Tonihttp://www.blogger.com/profile/03001880189572009935noreply@blogger.com0tag:blogger.com,1999:blog-4270702979950789999.post-31156329321973994792012-06-28T01:31:00.000-07:002012-06-28T05:25:38.410-07:00wrap-up of the orthology, paralogy, and function symposium at SMBE 2012<div dir="ltr" style="margin-bottom: 0pt; margin-top: 0pt; text-align: justify;">
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 15px; font-style: normal; font-variant: normal; font-weight: normal; text-decoration: none; vertical-align: baseline;">I
promised some people that I would write a short summary of the symposium that <b>Matthew Hahn, Marc Robinson-Rechavi, Iddo Friedberg, and I</b> co-organized at SMBE 2012. I
particularly enjoyed the symposium, and the room was pretty full all the
time, despite running in parallel to other interesting topics. I will
just write an overall summary without going into too much detail on
each of the talks, and at the end I will list a number of papers that
were mentioned in the various talks. I have to clarify that this
informal wrap-up only contains my own views and has not been
agreed upon by the organizers. I invite any of the attendees to add
comments to highlight important aspects that I may have missed. </span></div>
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 15px; font-style: normal; font-variant: normal; font-weight: normal; text-decoration: none; vertical-align: baseline;"></span><br />
<div dir="ltr" style="margin-bottom: 0pt; margin-top: 0pt; text-align: justify;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj-bOITVX-EIC3YzbxaqweFUYnRqZ524aSkWMOilNEBfN3Svr969LxYZPFZQn5FXMmfFgnhPGPpSA4vCTOX5ERNPKc_r3WHFTLwBbScGJ0PExdqitzeB6_PqwXXnR8U1fzPdEIVrYvh4xc/s1600/duplication.jpg" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="237" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj-bOITVX-EIC3YzbxaqweFUYnRqZ524aSkWMOilNEBfN3Svr969LxYZPFZQn5FXMmfFgnhPGPpSA4vCTOX5ERNPKc_r3WHFTLwBbScGJ0PExdqitzeB6_PqwXXnR8U1fzPdEIVrYvh4xc/s400/duplication.jpg" width="400" /></a><span style="background-color: transparent; color: black; font-family: Arial; font-size: 15px; font-style: normal; font-variant: normal; font-weight: normal; text-decoration: none; vertical-align: baseline;">I’ll
start by summarizing how all this started, which was in a
rather unusual way, I believe. Indeed, the idea of the symposium was born
in the blogosphere, in Jonathan Eisen’s popular Tree of Life blog,
where he invited Matthew Hahn to write a special guest post on the
“history behind” his paper on testing the orthology conjecture. One of
the conclusions from that paper was that paralogous sequences were more
similar in function (and in expression patterns) than orthologous ones, which
contradicted one of the major expectations (and assumptions) behind the
theories of duplication-driven functional divergence and the strategies
for inferring functions from orthologous sequences. That paper had
already caused a bit of turmoil in the orthology community (I remember
this was a hot discussion during the last Quest for Orthologs meeting,
at Cambridge), and several concerns were being raised about the
suitability of comparing functional annotations from different
species, and about the conclusions derived in the paper. Rather rapidly,
several people commented on Matt’s post and a lively discussion started
(more than 40 comments in total!). The discussion was so interesting
that Marc Robinson-Rechavi suggested we should turn this scientific
debate into a symposium at one of the upcoming conferences, and
that is how some of us started to work on this idea. For me, it was the first time that I met the other organizers in person. </span></div>
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 15px; font-style: normal; font-variant: normal; font-weight: normal; text-decoration: none; vertical-align: baseline;"></span><br />
<div dir="ltr" style="margin-bottom: 0pt; margin-top: 0pt; text-align: justify;">
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 15px; font-style: normal; font-variant: normal; font-weight: normal; text-decoration: none; vertical-align: baseline;">The
symposium started with <b>Eugene Koonin</b>, who nicely introduced the topic
of what conjectures could be implied by the definition of orthology, a
purely evolutionary one as introduced by Walter Fitch in 1970. He then
showed results from his lab indicating that these conjectures tend to hold,
but that there may be exceptions. For instance, the conjecture that
orthologs should be best reciprocal hits can be broken by accelerated
evolution in one of the true orthologs. He then showed work from other
groups (Sali, Sonnhammer) on the higher conservation of structure and
domain architecture in orthologs as compared to paralogs. He criticized
the use of GO terms by Hahn and others and argued that one should look at a
variety of data on function to test the conjecture. He presented results
from his own group which show higher conservation of expression across
species. He concluded that the functional conjecture still holds,
although he observed that differences may not be spectacular. <b>Catherina
Gushanski</b> spoke next, on changes in gene expression following
segmental duplications in mammals. They have produced an impressive
dataset of expression from different tissues in various mammal species.
She used that set to ask whether duplication was
contributing more to divergence than time alone, and showed that levels
of expression were decreasing in younger duplicates and that changes
differed across tissues. She observed no differences between
one-to-one orthologs and old duplicate pairs; she also found no
differences in terms of tissue specificity in orthologs vs paralogs.
Next on stage was <b>Nicholas Furnham</b>, who presented new implementations in <a href="http://www.ebi.ac.uk/thornton-srv/databases/FunTree/">FUNTREE</a> that allow exploring functional evolution on trees. He
warned that EC classification is not unambiguous and can also pose
problems for functional comparisons. They have developed “EC-Blast”,
which directly measures distances between enzymatic reactions based on
the molecular structures of substrates and products. <b>Christophe Dessimoz</b>
presented results from his recent paper, in which they show important
biases in GO term annotations: genes from the same species and families
tend to be annotated with more similar terms because of experimental
biases and author biases. When correcting for these biases, the
conjecture still holds. However, he admitted that the differences were not
very big, although still significant.<b> Romain Studer</b> came next. He measured
selection and changes in structural stability in orthologs and
duplicated genes. He showed that selected sites in paralogs tend to be
more clustered in the structure than in orthologs; however, he observed
no differences in the evolution of stability between orthologs and
paralogs. He concluded that differences between paralogs may be
smaller than previously thought. </span></div>
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 15px; font-style: normal; font-variant: normal; font-weight: normal; text-decoration: none; vertical-align: baseline;"></span><br />
<div dir="ltr" style="margin-bottom: 0pt; margin-top: 0pt; text-align: justify;">
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 15px; font-style: normal; font-variant: normal; font-weight: normal; text-decoration: none; vertical-align: baseline;">After
the coffee break <b>Jianzhi Zhang</b> told us about his work towards probing
the orthology conjecture. After giving them a try, he gave up on using GO
terms because of the many inconsistencies and the biases observed. He
thus turned to examining the conservation of protein-protein
interactions using experimentally determined interactions in various
yeast species. Unfortunately, the large number of interactions that must be
tested experimentally for duplicated proteins prevented him from showing a comparison
of orthologs and paralogs in this talk. Nevertheless, he found that all
PPIs tested for orthologs were conserved; even those that seemed not to
be turned out to reflect possible errors in previous large-scale yeast
two-hybrid experiments. <b>Alex Nguyen</b> also showed results on budding yeast
gene duplications. They focused on a more specific aspect of function:
the presence of short conserved linear motifs in proteins. They found
that these were more likely to disappear or diverge after the duplication
event, consistent with neo- or sub-functionalization models. We moved to
Drosophila with our next speaker, <b>Lev Yamplosky</b>, who exploited
expression and genomic data from the 12 Drosophila genomes. They showed
larger differences in rates of divergence in paralogs as compared to
orthologs, and these differences were also more asymmetrical. They also found that
these differences varied for fast- or slow-evolving families. Finally,
they could also find larger differences in paralogs in terms of
expression. Then it was <b>my turn,</b> and I mainly showed our results on
the comparison of expression patterns in human and mouse. Our experimental
design differs from others in two ways: first, we use topological dating (not
sequence divergence) to establish orthologs and paralogs of a similar
age; second, we always compared orthologs to inter-species paralogs
to get rid of species-specific biases in the comparisons. Our results
support a larger divergence of paralogs as compared to orthologs in
tissue expression patterns. Thanks to our experimental design we could
also establish that most of the differences between paralogs were gained
shortly after the duplication, linking the duplication event to a big
fraction of the divergence. Our last speaker was<b> Paul Thomas</b>, who gave an
overview of what you can and cannot expect from GO
annotations. He also showed progress on how the consortium is trying to
model functional evolution through gene families, and how these models
can help in the study of the relationship between orthology, paralogy
and gene function.</span></div>
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 15px; font-style: normal; font-variant: normal; font-weight: normal; text-decoration: none; vertical-align: baseline;"></span><br />
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 15px; font-style: normal; font-variant: normal; font-weight: normal; text-decoration: none; vertical-align: baseline;"></span><br />
<div dir="ltr" style="margin-bottom: 0pt; margin-top: 0pt; text-align: justify;">
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 15px; font-style: normal; font-variant: normal; font-weight: normal; text-decoration: none; vertical-align: baseline;">Thus
we had a diverse set of talks, most of them focusing on the comparison
of different aspects of functional evolution (GO annotations,
expression, functional motifs, interactions, divergence, structure) and
also using varying experimental designs and species. I would say one of
the main conclusions is that GO (and even EC number) annotations can be
misleading in our assessment of functional evolution. My personal
view is that most talks showed results consistent with the conjecture,
although the level of differences between paralogs and orthologs was
sometimes small. Function can be described at multiple levels, and I
would expect that functional divergence after duplications may affect
only one or a few of these. Thus, if an experimental design focuses on one
of these levels, it can be expected to miss divergence in the other ones.
In addition, those designs that average over all levels will inevitably
dilute small but important aspects of functional divergence. In
conclusion, this is an exciting topic, and with the number and variety of
groups that are now interested in it, I am sure that we will get
closer and closer to understanding the complex relationships between
orthology, paralogy and functional divergence. </span></div>
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 15px; font-style: normal; font-variant: normal; font-weight: normal; text-decoration: none; vertical-align: baseline;"></span><br />
<div dir="ltr" style="margin-bottom: 0pt; margin-top: 0pt; text-align: justify;">
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 15px; font-style: normal; font-variant: normal; font-weight: normal; text-decoration: none; vertical-align: baseline;">Some links and papers mentioned during the symposium (I have probably missed some):</span></div>
<div dir="ltr" style="font-family: inherit; margin-bottom: 0pt; margin-top: 0pt; text-align: justify;">
<span style="font-size: small;"><br /></span></div>
<div dir="ltr" style="font-family: inherit; margin-bottom: 0pt; margin-top: 0pt; text-align: justify;">
<span style="background-color: transparent; color: black; font-size: small; font-style: normal; font-variant: normal; font-weight: normal; text-decoration: none; vertical-align: baseline;">Abstracts from oral presentations at SMBE, including our symposium: <a href="http://imgpublic.mci-group.com/ie/PCO/OralAbstracts_Final.pdf">http://imgpublic.mci-group.com/ie/PCO/OralAbstracts_Final.pdf</a></span></div>
<div dir="ltr" style="font-family: inherit; margin-bottom: 0pt; margin-top: 0pt; text-align: justify;">
<span style="font-size: small;"><br /></span></div>
<div style="font-family: inherit;">
<span style="background-color: transparent; color: black; font-size: small; font-style: normal; font-variant: normal; font-weight: normal; text-decoration: none; vertical-align: baseline;">The blog post that initiated this: </span><span style="font-size: small;"><a href="http://phylogenomics.blogspot.com.es/2011/09/special-guest-post-discussion.html">http://phylogenomics.blogspot.com.es/2011/09/special-guest-post-discussion.html</a></span></div>
<div style="font-family: inherit;">
<span style="font-size: small;"><br /></span></div>
<div style="font-family: inherit;">
<span style="font-size: small;"><span style="font-size: x-small;">Another post on the orthology<a href="http://treevolution.blogspot.com.es/2011/09/on-orthology-conjecture.html"> conjecture </a></span></span></div>
<div style="font-family: inherit;">
<span style="font-size: small;"><br /></span></div>
<div style="font-family: inherit;">
<span style="font-size: small;">Announcement of our symposium </span></div>
<div style="font-family: inherit;">
<span style="font-size: small;"><br /></span></div>
<div style="font-family: inherit;">
<span style="font-size: small;">Altenhoff et al. <a href="http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1002514">Resolving the Ortholog Conjecture: Orthologs Tend to Be Weakly, but Significantly, More Similar in Function than Paralogs</a></span></div>
<div style="font-family: inherit;">
<span style="font-size: small;"><br /></span></div>
<div style="font-family: inherit;">
<span style="font-size: x-small;">FunTree: a resource for exploring the functional evolution of <br />
structurally defined enzyme superfamilies.<br />
Furnham N, Sillitoe I, Holliday GL, Cuff AL, Rahman SA, Laskowski RA, <br />
Orengo CA, Thornton JM.<br />
Nucleic Acids Res. 2012 Jan;40(Database issue):D776-82<br />
<a href="http://nar.oxfordjournals.org/content/40/D1/D776.long" target="_blank">http://nar.oxfordjournals.org/content/40/D1/D776.long</a><br />
</span></div>
<div style="font-family: inherit;">
<span style="font-size: small;"><span style="background-color: transparent; color: black; font-style: normal; font-variant: normal; font-weight: normal; text-decoration: none; vertical-align: baseline;">Brawand et al. The evolution of gene expression levels in mammalian organs. </span><a href="http://www.ncbi.nlm.nih.gov/pubmed/22012392"><span style="background-color: transparent; color: #1155cc; font-style: normal; font-variant: normal; font-weight: normal; text-decoration: underline; vertical-align: baseline;">URL</span></a></span> </div>
<div style="font-family: inherit;">
<span style="font-size: small;"><br /></span></div>
<div dir="ltr" style="font-family: inherit; margin-bottom: 0pt; margin-top: 0pt; text-align: justify;">
<span style="font-size: small;"><span style="background-color: transparent; color: black; font-style: normal; font-variant: normal; font-weight: normal; text-decoration: none; vertical-align: baseline;">Forslund et al. Domain architecture conservation in orthologs</span></span></div>
<div dir="ltr" style="font-family: inherit; margin-bottom: 0pt; margin-top: 0pt; text-align: justify;">
<span style="font-size: small;"><a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3215765/"><span style="background-color: transparent; color: #1155cc; font-style: normal; font-variant: normal; font-weight: normal; text-decoration: underline; vertical-align: baseline;">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3215765/</span></a></span></div>
<div style="font-family: inherit;">
<span style="font-size: small;"><br /></span></div>
<div style="font-family: inherit;">
<span style="font-size: small;">Huerta-Cepas et al. <a href="http://bib.oxfordjournals.org/content/12/5/442.long">Evidence for short-time divergence and long-time conservation of tissue-specific expression after gene duplication.</a></span></div>
<div style="font-family: inherit;">
<span style="font-size: small;">Huerta-Cepas and Gabaldón <a href="http://bioinformatics.oxfordjournals.org/content/27/1/38.long">Assigning duplication events to relative temporal scales in genome-wide studies.</a></span></div>
<div style="font-family: inherit;">
<span style="font-size: small;">Nehrt et al. Testing the Ortholog Conjecture with Comparative Functional Genomic Data from Mammals <a href="http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1002073">http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1002073</a></span></div>
<div style="font-family: inherit;">
<span style="font-size: small;">Nguyen et al.</span><span style="font-size: small;"> Proteome-Wide Discovery of Evolutionary Conserved Sequences in Disordered Regions</span><span style="font-size: small;"> <a href="http://stke.sciencemag.org/cgi/content/abstract/sigtrans;5/215/rs1">http://stke.sciencemag.org/cgi/content/abstract/sigtrans;5/215/rs1</a></span></div>
<div dir="ltr" style="font-family: inherit; margin-bottom: 0pt; margin-top: 0pt; text-align: justify;">
<span style="font-size: small;"><span style="background-color: transparent; color: black; font-style: normal; font-variant: normal; font-weight: normal; text-decoration: none; vertical-align: baseline;"> </span></span></div>
<div style="font-family: inherit;">
<span style="background-color: transparent; color: black; font-size: small; font-style: normal; font-variant: normal; font-weight: normal; text-decoration: none; vertical-align: baseline;">Peterson et al. Evolutionary constraints on structural similarity in orthologs and paralogs</span></div>
<div dir="ltr" style="font-family: inherit; margin-bottom: 0pt; margin-top: 0pt; text-align: justify;">
<span style="font-size: small;"><a href="http://onlinelibrary.wiley.com/doi/10.1002/pro.143/full"><span style="background-color: transparent; color: #1155cc; font-style: normal; font-variant: normal; font-weight: normal; text-decoration: underline; vertical-align: baseline;">http://onlinelibrary.wiley.com/doi/10.1002/pro.143/full</span></a></span></div>
<div dir="ltr" style="font-family: inherit; margin-bottom: 0pt; margin-top: 0pt; text-align: justify;">
<span style="font-size: small;"><br /></span></div>
<div dir="ltr" style="font-family: inherit; margin-bottom: 0pt; margin-top: 0pt; text-align: justify;">
<span style="background-color: transparent; color: black; font-size: small; font-style: normal; font-variant: normal; font-weight: normal; vertical-align: baseline;">Thomas et al. </span><span style="font-size: small;"><a href="http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1002386">On the Use of Gene Ontology Annotations to Assess Functional Similarity among Orthologs and Paralogs: A Short Report</a></span><br />
<br />
<span style="font-size: x-small;"><br />
Large-scale analysis of orthologs and paralogs under covarion-like and <br />
constant-but-different models of amino acid evolution.<br />
Studer RA, Robinson-Rechavi M.<br />
Mol Biol Evol. 2010 Nov;27(11):2618-27.<br />
<a href="http://mbe.oxfordjournals.org/content/27/11/2618.short" target="_blank">http://mbe.oxfordjournals.org/content/27/11/2618.short</a><br />
<br />
How confident can we be that orthologs are similar, but paralogs differ?<br />
Studer RA, Robinson-Rechavi M.<br />
Trends Genet. 2009 May;25(5):210-6.<br />
<a href="http://www.sciencedirect.com/science/article/pii/S0168952509000559" target="_blank">http://www.sciencedirect.com/science/article/pii/S0168952509000559</a><br />
<br />
Pervasive positive selection on duplicated and nonduplicated vertebrate <br />
protein coding genes.<br />
Studer RA, Penel S, Duret L, Robinson-Rechavi M.<br />
Genome Res. 2008 Sep;18(9):1393-402.<br />
<a href="http://genome.cshlp.org/content/18/9/1393.short" target="_blank">http://genome.cshlp.org/content/18/9/1393.short</a><br />
</span><span style="font-size: small;"><br /></span></div>
<div dir="ltr" style="margin-bottom: 0pt; margin-top: 0pt; text-align: justify;">
<span style="background-color: transparent; color: #1155cc; font-family: Arial; font-size: 15px; font-style: normal; font-variant: normal; font-weight: normal; text-decoration: underline; vertical-align: baseline;"> </span><span style="background-color: transparent; color: black; font-family: Arial; font-size: 15px; font-style: normal; font-variant: normal; font-weight: normal; text-decoration: none; vertical-align: baseline;"></span></div>
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 15px; font-style: normal; font-variant: normal; font-weight: normal; text-decoration: none; vertical-align: baseline;"></span>Tonihttp://www.blogger.com/profile/03001880189572009935noreply@blogger.com1tag:blogger.com,1999:blog-4270702979950789999.post-4608837499554200852012-06-01T04:11:00.003-07:002012-06-01T07:02:04.984-07:00Publicly available or not?<title></title>
<style type="text/css">
<!--
@page { margin: 0.79in }
P { margin-bottom: 0.08in }
-->
</style>
<br />
<div align="JUSTIFY" style="margin-bottom: 0in;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi79fs7p5GMnjOXJ1EiYt2boUMitpAZUkqduuNUlNR_ma24Jw0HpP7EMk1VCTu1LQxrr31oSEEMOZxEeG5iflz7-Z8Qo2Lq8jSJ8FwvMbNwL0xFVsJPKp_0gkCagkeU45XSxxpl2BwOTZw/s1600/no_access.jpg" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="236" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi79fs7p5GMnjOXJ1EiYt2boUMitpAZUkqduuNUlNR_ma24Jw0HpP7EMk1VCTu1LQxrr31oSEEMOZxEeG5iflz7-Z8Qo2Lq8jSJ8FwvMbNwL0xFVsJPKp_0gkCagkeU45XSxxpl2BwOTZw/s320/no_access.jpg" width="320" /></a>I have always had the
naive understanding that databases such as GenBank were public, and
that one was free to do research on data accessed from there, and
eventually publish the results. However, nothing is as simple
as that, since many of the genomes deposited there have not been
published yet. I have experienced myself, and heard about from many
colleagues, problematic situations regarding the use of genome data
taken from public databases but not yet published. Current
guidelines are open to different interpretations, and different
stakeholders (editors, reviewers, users, data producers) may have
entirely different and conflicting views. With the current trend we
will soon have more unpublished than published genomes in public
databases, so I think it is worth re-assessing the policies. Here I
share some views.
</div>
<div align="JUSTIFY" style="margin-bottom: 0in;">
<br /></div>
<div align="JUSTIFY" style="margin-bottom: 0in;">
Policy guidelines
regarding the use of genomic sequences prior to publication are
available (see NHGRI rapid data release policy
<a href="http://www.genome.gov/10506376">http://www.genome.gov/10506376</a>), and set reasonable rules. For
instance, data producers should deposit the data publicly and
should produce a citable paper as the source of the data within a
short period of time. This could precede a full genome paper in which
a more thorough analysis is produced. Users should not take the
public data to publish an analysis focused on that genome. But this
situation should not be prolonged too much. The underlying idea is to
reserve the opportunity to describe the main characteristics and
findings for the researchers who make the effort of sequencing,
assembling, and annotating a genome, while ensuring that the data
serves the advancement of science by allowing other groups to perform
research on the genome data as soon as it is produced. However, there
are many interpretations of which uses of the data should be
allowed. Moreover, although indicative time-frames for the
preferential exploitation of the data are given (e.g. 6 months),
these are only indications. In the absence of clear-cut rules, the
situation invites conflict. With the current flow of
sequencing data, we will increasingly face the situation that data
produced for public use and accessible through public databases are
not associated with a paper, so that it is unclear whether their use
requires permission. In such situations there may be different
interpretations of the existing rules: that of the leader of the
sequencing project, that of the researcher who is accessing the
data, that of the agency that financed the sequencing, and even that
of the editors and reviewers of papers using available but
unpublished data. Below I list some undesirable situations that
highlight the contradictions of the current system. These situations
are not hypothetical but rather correspond to real cases that I
experienced or heard about from colleagues.</div>
<div align="JUSTIFY" style="margin-bottom: 0in;">
<br /></div>
<ul>
<li><div align="JUSTIFY" style="margin-bottom: 0in;">
Users of public
databases may inadvertently download unpublished data, especially when
they do this at large scale. After all, they are using a
public repository, and it is contradictory that public databases
provide data that are not usable.
</div>
</li>
<li><div align="JUSTIFY" style="margin-bottom: 0in;">
Most genome
sequencing projects are financed with public money or by agencies
that require the data to be made publicly available as soon as it
is produced, but this leads to the situation above, making it
difficult for sequencing project leaders to know what use is being
made of their data.</div>
</li>
<li><div align="JUSTIFY" style="margin-bottom: 0in;">
Referees may
specifically ask authors to use genomes that are in databases, or
simply reject a paper because it does not use this or that “publicly
available” genome in the comparative analyses. In addition
referees or editors may ask for evidence of a specific permission to
use unpublished data.
</div>
</li>
<li><div align="JUSTIFY" style="margin-bottom: 0in;">
Authors wishing to
ask for the use of an unpublished genome may be required to explain
the exact use of the data, which exposes their ideas to possible
direct competitors.</div>
</li>
<li><div align="JUSTIFY" style="margin-bottom: 0in;">
Leaders of genome
projects may feel entitled to ask for authorship in exchange for
data that is available in public databases.
</div>
</li>
<li><div align="JUSTIFY" style="margin-bottom: 0in;">
Leaders of genome
projects may intentionally delay the publication of the genome paper
to extend the period of preferential use. They may even decide to
publish partial analysis before the genome paper.</div>
</li>
<li><div align="JUSTIFY" style="margin-bottom: 0in;">
Some unpublished
genomes have been in public databases for several years, and still
different interpretations are possible of whether these data can
be freely used.
</div>
</li>
<li><div align="JUSTIFY" style="margin-bottom: 0in;">
Some genomes may
never be published in the form of a genome paper, because they were
sequenced with a very particular purpose.
</div>
<div align="JUSTIFY" style="margin-bottom: 0in;">
</div>
</li>
</ul>
<div align="JUSTIFY" style="margin-bottom: 0in;">
In my opinion the current
situation is too ambiguous, generates conflicts and ultimately
jeopardizes the advance of science. We need clear rules, rather than
guidelines, and below I propose four simple rules that would simplify
the process.
</div>
<div align="JUSTIFY" style="margin-bottom: 0in;">
<br /></div>
<ul>
<li><div align="JUSTIFY" style="margin-bottom: 0in;">
Granting agencies
and sequencing centers should specify a reasonable time-frame for
preferential use (6-12 months) before the data is released. This
should suffice to give the upper hand to the research team that
is making the sequencing effort, but will also force them to focus on
publishing a genome paper as soon as possible.</div>
</li>
<li><div align="JUSTIFY" style="margin-bottom: 0in;">
During this period,
sequencing projects may announce the availability of the data for
restricted use, through a specific repository that can be accessed
only after a specific permission is granted. This will enable use of
the data from time 0.</div>
</li>
<li><div align="JUSTIFY" style="margin-bottom: 0in;">
Data is released to
the public repositories (at least in the form of bulk download) only
after that period.
</div>
</li>
<li><div align="JUSTIFY" style="margin-bottom: 0in;">
All data in public
repositories should thus be free to be used for any purpose,
regardless of whether a genome paper is published.</div>
</li>
</ul>
<div align="JUSTIFY" style="margin-bottom: 0in;">
<br /></div>
<div align="JUSTIFY" style="margin-bottom: 0in;">
Personally, as far as the
activity in my lab is concerned, I have decided that we will
use any data publicly deposited in GenBank for more than a year, for
any purpose other than doing a “genome paper” (of course!). I think this is in
perfect agreement with the NHGRI recommendations and will definitely
save us time and worries.
</div>
<br />Tonihttp://www.blogger.com/profile/03001880189572009935noreply@blogger.com3tag:blogger.com,1999:blog-4270702979950789999.post-32248448286960022682012-03-25T10:20:00.000-07:002012-03-25T10:20:04.010-07:00Challenges in phylogenetic tree visualization<div style="text-align: justify;">I recently read an excellent <a href="http://www.sciencedirect.com/science/article/pii/S0169534711003545">review by Roderic Page</a> on the challenges in phylogenetic tree representation and visualization. It provides an overview of existing software and tools (although he missed our <a href="http://cgenomics.blogspot.com/">ETE</a> package; see the image below for an example of ETE's visualization features). The number and diversity of <a href="http://bioinfo.unice.fr/biodiv/Tree_editors.html">existing tools</a> is overwhelming, but probably matches the diversity of interests and possible applications of phylogenetic trees. One may be interested in overlaying sequence information (see below), while others would be interested in displaying information on the geographical distribution of the species. Some may need to represent uncertainty and overlay different topologies, or use networks to represent transfers of genetic material. The possibilities are unlimited. </div><div style="text-align: justify;"><br />
</div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg1DP9eodS0qfzA0MOihIBqpHHYse2xcbd68OA5-7hEsUCXgJluwpqWkKtc1CZ0lihtjyfpu3A0L08NqvEUZ-Pxoexvb7ZCeNslJXNwN30e84yD0AVWgsW2x36SIxllkbrJ3eh3keHmgOc/s1600/tree_and_histogram.preview.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="240" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg1DP9eodS0qfzA0MOihIBqpHHYse2xcbd68OA5-7hEsUCXgJluwpqWkKtc1CZ0lihtjyfpu3A0L08NqvEUZ-Pxoexvb7ZCeNslJXNwN30e84yD0AVWgsW2x36SIxllkbrJ3eh3keHmgOc/s320/tree_and_histogram.preview.png" width="320" /></a></div><div style="text-align: justify;"><br />
</div><div style="text-align: justify;"><br />
</div><div style="text-align: justify;"> Most importantly, he mentions some of the challenges facing tree visualization software, such as the ability to represent huge trees and to allow interactive behavior with the user. In our group we have encountered such needs, and this is the reason behind implementing more visualization features in ETE. Fortunately, new technologies are offering new opportunities as well, and I enjoyed imagining the possibilities that 3D visualization and touchscreen technologies will provide to researchers. It is definitely a field to follow. </div><div style="text-align: justify;"><br />
</div><div style="text-align: justify;"> If you are interested in the topic, I recommend <a href="http://vizbi.org/Videos/26210611">this video</a>. </div>Tonihttp://www.blogger.com/profile/03001880189572009935noreply@blogger.com1tag:blogger.com,1999:blog-4270702979950789999.post-62117771397995806472012-03-10T09:52:00.000-08:002012-03-10T09:52:58.888-08:00Open Letter for Research in SpainAs you surely have heard, Spain is facing a serious crisis in the context of a globalized market economy (yes, there used to be a time when economic crises were related to something more tangible, such as a serious drought or a plague, but now one can only blame abstract fluxes of financial speculation). The new government is preparing a new budget which is predicted to include the most dramatic cuts in our history. Researchers here, who have already been hit by previous cuts (<a href="http://www.nature.com/news/spanish-changes-are-scientific-suicide-1.10027">see this letter</a>), are now bracing for the worst. <br />
<br />
<div style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj_zfX2BTopyWb3cz473SlxPrSTUezc_I1u3s3QxhsDxLbzlz2FoWinc8RM_Tp9_se1WrLsACN9wN2SXO2bFZDQhqcUtMa-F_yxjWl4FI5UaqqPWr8_2vJesJr3x7Nc3lUF37k7ExpuKmc/s1600/OEP_OPIS_eng.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="248" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj_zfX2BTopyWb3cz473SlxPrSTUezc_I1u3s3QxhsDxLbzlz2FoWinc8RM_Tp9_se1WrLsACN9wN2SXO2bFZDQhqcUtMa-F_yxjWl4FI5UaqqPWr8_2vJesJr3x7Nc3lUF37k7ExpuKmc/s320/OEP_OPIS_eng.jpg" width="320" /></a> </div><br />
In this context, an <a href="http://www.investigaciondigna.es/wordpress/sign">open letter</a> has been put together by the Confederation of Spanish Scientific Societies, the Federation of Young Researchers and others. I recommend that you read it (some of the cited figures and data are very revealing) and, if you agree with it, sign it, as I just did.<br />
<br />
<a href="http://www.investigaciondigna.es/wordpress/sign">Open letter for research in Spain.</a>Tonihttp://www.blogger.com/profile/03001880189572009935noreply@blogger.com0tag:blogger.com,1999:blog-4270702979950789999.post-66823207399519966242012-03-04T00:26:00.002-08:002016-06-01T19:38:08.037-07:00Darwin's h-index<div style="text-align: justify;">
I guess most scientists are nowadays familiar with the term "h-index", which is a metric of citations to your published articles. More specifically, the <a href="http://en.wikipedia.org/wiki/H-index">h-index</a> corresponds to the largest number h such that h of your articles have at least h citations each. Given that this index is used by many funding agencies and by peers who evaluate you for a position or competitive grant, we all hope to see it grow year by year.</div>
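<div style="text-align: justify;">
For readers who prefer to see the definition spelled out, here is a minimal Python sketch of how an h-index can be computed from a list of citation counts. The function name <code>h_index</code> and the example counts are made up for illustration; they are not anybody's real numbers, and this is not necessarily how Google Scholar implements it.</div>

```python
def h_index(citations):
    """Return the largest h such that h articles have at least h citations each."""
    # Rank articles from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            # The top `rank` articles all have at least `rank` citations.
            h = rank
        else:
            # Counts only decrease from here on, so h cannot grow further.
            break
    return h


# Hypothetical citation counts for six articles (illustration only).
example = [10, 8, 5, 4, 3, 0]
print(h_index(example))
```

<div style="text-align: justify;">
Note that one very highly cited article barely moves the result: a list with a single article cited 100 times gives h = 1, which is part of why the index says little about individual landmark works.</div>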
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvKxF3dcKjaBWtFxM-DPtaUsmuCgHaduyxJHP3TeLGdUrvUElJKTXm2dOUMB8y5Pvx94NxWSIIs-hj6wdENqShSS8FnIksVYJOitj6OpZtVBE97hTv2tdfyHgHkDs5BpPxkKglhDQBUtI/s1600/200px-Charles_Darwin_01.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvKxF3dcKjaBWtFxM-DPtaUsmuCgHaduyxJHP3TeLGdUrvUElJKTXm2dOUMB8y5Pvx94NxWSIIs-hj6wdENqShSS8FnIksVYJOitj6OpZtVBE97hTv2tdfyHgHkDs5BpPxkKglhDQBUtI/s1600/200px-Charles_Darwin_01.jpg" /></a></div>
<br />
<br />
<div style="text-align: justify;">
Charles Darwin lived in completely different times: he had no need to apply for grants or positions every few years, and there was no system to track citations or give a "number" to the supposed "impact" of his research. He has nevertheless been absorbed by the current metrics obsession and already has an <a href="http://scholar.google.com/citations?user=bLg9maoAAAAJ&hl=en">h-index</a>, computed by Google Scholar. </div>
<br />
<div style="text-align: justify;">
His magic number is 63. Will this change in any way our idea of how important Darwin's impact on science was? Or will it rather help us to put the h-index into context, and highlight the difficulty of measuring true impact?</div>
Tonihttp://www.blogger.com/profile/03001880189572009935noreply@blogger.com2tag:blogger.com,1999:blog-4270702979950789999.post-16344863372244623842012-02-22T00:06:00.001-08:002012-02-22T00:08:47.302-08:00Phylogenetic Tree Challenge in Encyclopedia Of Life The <a href="http://eol.org/">Encyclopedia of Life</a> initiative aims at providing an open, digital resource providing comprehensive information about the diversity of life. It has recently opened a call for teams that can provide a phylogeny-aware organization of as many scientific names as possible. This text is from the call:<br />
<br />
<i>A prize is offered to the individual or team that can provide a very large, phylogenetically-organized set(s) of scientific names suitable for ingestion into the Encyclopedia of Life as an alternate browsing hierarchy. </i><br />
<i><br />
[...]</i><br />
<br />
<div class="MsoNormal"><i>Among other factors, the total number of uniquely named nodes, node/leaf ratios and tree height may be used to compare entries so contestants should consider how they wish to trade off strict consensus versus other methods of reflecting the state of phylogenetic knowledge.</i></div><div class="MsoNormal"><i>Problems to solve include 1) how to assign labels to unnamed nodes, 2) how to fill in gaps so that the set of taxa included is as comprehensive as possible, even if trees are not fully resolved or all taxa have not been analyzed, 3) how to handle competing hypotheses, 4) how to update the hierarchy at least annually. </i></div><i>The winning submission must be available to EOL and others under an acceptable CC license if it is under copyright. The tree need not be previously published in peer-reviewed form.</i><br />
<i> </i><br />
<i> </i>and more information is available <a href="http://eol.org/info/tree_challenge">here</a>.<br />
<br />
<i><br />
</i>Tonihttp://www.blogger.com/profile/03001880189572009935noreply@blogger.com0tag:blogger.com,1999:blog-4270702979950789999.post-58764745411651064972012-02-08T09:15:00.000-08:002012-02-08T09:15:14.496-08:00Getting more complex and gaining.... nothing<div style="text-align: justify;"> The origin of complexity is a highly debated issue in biology. For instance, many functions in the cell are carried out by intricate macro-molecular complexes formed of a multitude of subunits. When tracing the evolution of such complexes, <a href="http://www.sciencedirect.com/science/article/pii/S0022283605002378">as we did with mitochondrial Complex I</a>, one often finds that the number of subunits has increased through time. However, the addition of subunits does not always seem to correlate with the acquisition of novel functions, which would provide a selective advantage for the increase in complexity. Can we think of a mechanism promoting a trend for increasing complexity in the absence of a selective advantage provided by a novel function? </div><br />
<div style="text-align: justify;"> A <a href="http://www.nature.com/nature/journal/v481/n7381/fig_tab/nature10724_ft.html">recent paper</a> by Finnigan and colleagues shows a plausible mechanism and presents evidence that this may have been responsible for the acquisition of a novel subunit in fungal vacuolar ATPases (depicted below). </div><br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj11P27ULO_ZVTr5t1byKd9TvqHXQUHqqjkeKcV4oeFzpDWv6e3pB3nsQhEnGVIxR0xcjeOlLujUYJkoMG72zpuYgSUHn1Gc1p3MnUS6WM5rOAc0ga8p5pwoZt5oFBm53FkNWCPieM6A8w/s1600/VATAPASE.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj11P27ULO_ZVTr5t1byKd9TvqHXQUHqqjkeKcV4oeFzpDWv6e3pB3nsQhEnGVIxR0xcjeOlLujUYJkoMG72zpuYgSUHn1Gc1p3MnUS6WM5rOAc0ga8p5pwoZt5oFBm53FkNWCPieM6A8w/s320/VATAPASE.jpg" width="320" /></a></div><div style="text-align: justify;"> These molecular machines, which pump protons across membranes, have a membrane ring (in green in the figure) formed by 6 units. In vertebrates two different subunits (originating from the duplication of an ancestral gene) form the 6-unit ring in a 1:5 stoichiometry. In fungi a more recent duplication brought about one more subunit type, so that the ring is formed by the products of three different genes in a 1:1:4 organization. Using <i>ancestral sequence resurrection</i> (I love that name!), a technique that consists of reconstructing the most likely ancestral sequences and then synthesizing them in the lab, they show that a single mutation acquired early in each paralogue was sufficient to make the two of them indispensable. Thus, such a model could explain a trend towards increased complexity in multi-paralogue complexes (those that include subunits derived from duplicated genes) without a requirement for an initial selective advantage. </div><div style="text-align: justify;"><br />
</div><div style="text-align: justify;"> In a way, I see this model as a special type of sub-functionalization. That is, the two new paralogues together perform the same function that was performed by the ancestral gene. In the absence of more examples we do not know how widespread this mechanism is. However, given that it requires only a few likely events and that it actually constitutes a "ratchet" (<a href="http://www.nature.com/nature/journal/vaop/ncurrent/full/nature10816.html">as noted by W Ford Doolittle</a>), that is, once you gain that complexity you don't go back, one would expect it to have occurred in several of the many multi-paralogue complexes, at least in some lineages. </div><div style="text-align: justify;"><br />
</div><div style="text-align: justify;"> Perhaps this could explain an interesting finding we made some years ago when looking at the evolution of the <a href="http://www.biomedcentral.com/1471-2148/9/295">mitochondrial electron transport chain in fungi</a> (mostly formed by multi-protein complexes): the amount of duplication in members of these complexes was of the same level as in other proteins. This is in contrast to the gene-dosage effect hypothesis, which states that complexes would tend to duplicate only when the stoichiometry is conserved (that is, when the whole complex duplicates, e.g. in whole-genome duplications). </div><div style="text-align: justify;"><br />
</div><div style="text-align: justify;"> Finally, another remark that I always make when seeing ancestral sequence resurrection working is that the fact that ancestral reconstructions display the expected biochemical activities (e.g. by complementing extant sequences) is an indication that the models of evolution we use are not that wrong after all. </div><div style="text-align: justify;"><br />
</div><div style="text-align: justify;"><br />
</div><div style="text-align: justify;"><br />
</div><div style="text-align: justify;"><br />
</div><div style="text-align: justify;"><br />
</div><div style="text-align: justify;"><br />
</div><div style="text-align: justify;"><br />
</div><div style="text-align: justify;"> </div>Tonihttp://www.blogger.com/profile/03001880189572009935noreply@blogger.com0tag:blogger.com,1999:blog-4270702979950789999.post-48138138827756140262012-01-29T04:29:00.000-08:002012-01-30T03:45:12.162-08:00Interview with Nick Lane<div style="font-family: inherit; text-align: justify;"><span style="font-size: small;">As I reported in an <a href="http://treevolution.blogspot.com/2011/12/sesbe-spanish-society-for-evolutionary.html">earlier post</a>, I had the opportunity to meet <a href="http://www.nick-lane.net/">Nick Lane</a> during the Spanish Evolutionary Society meeting. We had a very interesting discussion about mitochondrial endosymbiosis and the origin of eukaryotes over a couple of beers. Some days after the meeting, Andrés Moya, the President of the society, suggested that I interview him for the Society's Bulletin <a href="http://www.sesbe.org/eVolucion">eVolución</a>. You can find this interview translated into Spanish in the <a href="http://www.sesbe.org/sites/sesbe.org/files/eVOLUCION-7%281%29.pdf">current issue</a> of eVolución 7(1); however, I think the interview might be of interest to a broader audience, and thus I paste here the original English version. <b><br />
</b></span></div><div style="font-family: inherit; text-align: justify;"><br />
</div><div style="font-family: inherit; text-align: justify;"><br />
</div><div style="font-family: inherit; text-align: justify;"><span style="font-size: small;"><b>TG- After your recent visit to Spain as an invited speaker to the III SESBE congress (Madrid, November 2011), what is your opinion about the field of Evolutionary Biology in Spain?</b><br />
</span></div><div style="font-family: inherit; text-align: justify;"><span style="font-size: small;">NL- Well, I thoroughly enjoyed the few talks I attended, but my Spanish is poor and I could hardly judge many of them; and unfortunately I missed much of the conference. But I liked the great range of themes that were being discussed. And in general I am impressed with a lot of evolutionary research going on in Spain. There is a tendency to consider comparative physiology in evolution more than there is in England, for example, and I find that a very insightful approach. One thing that has struck me over the years is that Spanish researchers are not cited as frequently as they ought to be. This does not reflect the quality of the research, but rather the US-dominated English-language citation bias.<br />
</span></div><div class="separator" style="clear: both; font-family: inherit; text-align: center;"><span style="font-size: small;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiavKERwHWacTbk9P5YNeoeYIfL_EhdwC1bZjlFtjsOkcqGb40hRLs-NNEY7Fn_EP-vhiHfsomcu7Q8S7UqDqpUirOuFqdr5fhD3Dxx89xfaSpfoc4bacnav2vswSo-CWlD3a_cfjyZ8EM/s1600/Nick_Lane_study.jpg" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiavKERwHWacTbk9P5YNeoeYIfL_EhdwC1bZjlFtjsOkcqGb40hRLs-NNEY7Fn_EP-vhiHfsomcu7Q8S7UqDqpUirOuFqdr5fhD3Dxx89xfaSpfoc4bacnav2vswSo-CWlD3a_cfjyZ8EM/s320/Nick_Lane_study.jpg" width="315" /></a></span></div><div style="font-family: inherit; text-align: justify;"><span style="font-size: small;"><br />
<b>TG- Your career has been quite unconventional. Can you summarize for our readers which have been the major steps in your career path?</b><br />
</span></div><div style="font-family: inherit; text-align: justify;"><span style="font-size: small;">NL- It sure has! I had a medical research background, and my PhD was on mitochondrial function and oxygen free radicals in transplanted organs. But I was getting nowhere with that, and couldn’t see a way of getting from there into what was really an interest for me: evolutionary biology. So I took to writing instead, for several independent agencies doing medical education for pharmaceutical companies. That was an eye opener, and I learnt to write clearly and quickly, but it was also a frustration. After quite a lot of hard work I finally got a contract to write Oxygen, which was initially conceived as a book about free radicals, mitochondria and medicine, but ended up reflecting my interests in evolutionary biology to a much greater extent. That was the beginning of a decade spent writing books on evolutionary biochemistry, drawing heavily on my background in bioenergetics but ranging widely over any material that interested me. It was fantastic fun but no way to make a living. And ultimately frustrating too, in that in writing on that scope, you can’t help but come up with new ideas, essentially a broad synthesis with gaps, that you sketch in with speculations, which can be reframed as testable hypotheses. That’s what drew me back into research – the frustrated desire to test some of these hypotheses.<br />
<b><br />
TG- Thus, you have been active as a science writer and as a researcher, and now you seem to combine both aspects. Do these two tasks reinforce or rather interfere with each other?</b></span></div><div style="font-family: inherit; text-align: justify;"><span style="font-size: small;"><b> </b><br />
NL- Both. I think I’ve benefited tremendously as a researcher from the decade I spent thinking and writing. I now have a coherent set of hypotheses that are testable in one way or another – experimentally or by some kind of mathematical modeling, or just by empirical analysis of existing data. So I’m drawing heavily on this ‘credit’ now. At the same time it is hard to think synthetically or to write books while in research: there are so many demands on time. So on a daily basis, writing and research interfere with each other, but I think if you are able to focus on one or the other for periods then they can, and should, reinforce each other. The trick is to balance each so that they reinforce each other over time. I’m not sure I’ve mastered that trick yet, but it is my long term goal: for me, it is the best way to understand the most interesting evolutionary questions, and that is what I want to do.<br />
<b><br />
TG- In your view, where does the main responsibility for communicating science to society at large lie (e.g. scientists, funding agencies, scientific societies, science journalists)?</b></span></div><div style="font-family: inherit; text-align: justify;"><span style="font-size: small;"><b> </b><br />
NL- Good question. There is certainly a responsibility, but being responsible counts for nothing if nobody listens to what you have to say: as a writer, you must be interesting to be noticed at all. And society is rarely interested in responsible but boring views. So there is a balance that you have to wrestle with every sentence, between interest and accuracy. That’s another reason I’m happy to be back in research: to write accurately (in precise scientific language) is at least as much pleasure for me as to write interestingly. Frankly it is the questions themselves that interest me. I think that the real challenge in writing for the public is to find ways of phrasing questions in an interesting way, which draws attention to the problem, without sacrificing the accuracy. That is the ideal: responsible (boring) and interesting at the same time.</span></div><div style="font-family: inherit; text-align: justify;"><br />
</div><div class="separator" style="clear: both; font-family: inherit; text-align: center;"><span style="font-size: small;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh-IoYl9PVFzAwmuykf1L0RzoB3TDDXKXEeh_nzaWEE1_5zbbVraVg_RaK_HR1dBwdZp0uI-m2Zm-Wey-haauomVm6rYpDElWclFOIYjVlayScODsS80FiEugYf6F6i5qkH6UxG7pX8ZZo/s1600/diez_grandes_inventos.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh-IoYl9PVFzAwmuykf1L0RzoB3TDDXKXEeh_nzaWEE1_5zbbVraVg_RaK_HR1dBwdZp0uI-m2Zm-Wey-haauomVm6rYpDElWclFOIYjVlayScODsS80FiEugYf6F6i5qkH6UxG7pX8ZZo/s320/diez_grandes_inventos.jpg" width="192" /></a></span></div><div style="font-family: inherit; text-align: justify;"><span style="font-size: small;">With respect to which group has the responsibility, I don't think that one group alone can be considered responsible for communicating science to society at large. Each group can address different needs, and each has its own responsibility. Scientists are responsible for sculpting new ideas, for conveying the excitement and intellectual thrust of science. The best ideas in science are still driven by individuals with passion, insight and ingenuity, and there is nobody better to convey this intensity to the general reader, although it is rare. Journalists are responsible for balanced reporting, explaining ideas clearly and intelligibly, providing context for the reader, ideally some commentary from other scientists. It is unusual for journalists to drive the scientific agenda, but serious journalists have a broader perspective and can sometimes see things that scientists can't. <br />
<br />
Scientific societies can provide very helpful consensus statements on difficult issues, from global warming to the effectiveness of chemotherapy. It's not really for them to give a sense of the cut and thrust of science, more the strength of the conclusions that emerge from the uncertainty. <br />
<br />
Finally, funding agencies. In my view, funding agencies have a duty to explain to the public and to politicians that research is open-ended and unpredictable. Research that appears to have little immediate societal impact can have immense and unimagined benefits in the future. Most major scientific breakthroughs, with the greatest economic benefits, came from unexpected quarters, and could not have been anticipated by either the scientists themselves or the funders. This perspective is being lost in a political drive to justify spending by societal impact. As with so much, short-term political cycles are trumping long term good sense. It is up to funding agencies to explain why research should be funded on its own merits, without constant recourse to some hoped-for and probably illusory impact. </span><span style="font-size: small;"> <b><br />
TG- In one of your articles, written to commemorate the 150th anniversary of “The Origin of Species”, you discuss what Darwin would love to know about the origin of the eye if he were still alive. Darwin is credited with being the first to use a “tree of life” to describe the evolutionary relationships of species and their shared ancestry. What do you think he would love to know in this respect if he were still alive?</b></span></div><div style="font-family: inherit; text-align: justify;"><span style="font-size: small;"><b> </b><br />
NL- Well I think he’d love what’s going on in microbial genomics. The picture that has emerged over the last couple of decades of lateral gene transfer and endosymbiosis in microbes is radically different to the idea of gene sequence divergence between populations. Having said that, I see all this as a juxtaposition to standard Neodarwinian population genetics. He would have loved that too, although it is old hat to us now; but given that Darwin knew nothing about genes, he would have been thrilled by the Neodarwinian synthesis, and what amounted to a genetic basis for a tree of life. All of this means that variation is more complex than any of us imagined; and in this sense, Darwin’s coyness on the mechanisms of variation was well placed: it really is wild and fascinating.<br />
<br />
<b>TG- In one of your recent books, you mention 10 major transitions in the evolution of life on earth. Which of them is, in your view, the most enigmatic or difficult to explain?</b><br />
</span></div><div style="font-family: inherit; text-align: justify;"><span style="font-size: small;">NL- Consciousness, without a doubt. Frequently the origin of life and consciousness are put forward as the twin pinnacles, the two big unanswered questions in biology. I think we’re actually quite close to understanding the origin of life in conceptual terms, but I personally can’t understand consciousness well at all. I read a lot on the subject and came to the conclusion that nobody really does. We still can’t answer the simple question: how does the depolarization of a neuron give rise to a feeling or sensation of anything at all? They are two different languages, and we don’t seem to have any kind of Rosetta stone at the moment.<br />
<br />
<b>TG- Some of these transitions seem to have happened only once in the history of life. If they were so advantageous, why have they been restricted to a single lineage?</b><br />
</span></div><div class="separator" style="clear: both; font-family: inherit; text-align: center;"><span style="font-size: small;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhxzLG9f1EVVkI4ePJ3xBAK1tOICzq4gMNLFzNBbJTNz73RRikCQiJBYwxpRdjJEsGIR0Qk4Tu8Y2JWN0rgdp2LhQdE8MlN0apc6eg5xxopXTO4qjBQwaJKWZ97AJx-VLTw8o4dQ4xY9nE/s1600/mitochondria.jpg" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="216" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhxzLG9f1EVVkI4ePJ3xBAK1tOICzq4gMNLFzNBbJTNz73RRikCQiJBYwxpRdjJEsGIR0Qk4Tu8Y2JWN0rgdp2LhQdE8MlN0apc6eg5xxopXTO4qjBQwaJKWZ97AJx-VLTw8o4dQ4xY9nE/s320/mitochondria.jpg" width="320" /></a></span></div><div style="font-family: inherit; text-align: justify;"><span style="font-size: small;">NL- I think each transition has to be taken on its own terms. These are tremendously difficult questions and you will find diametrically opposed answers to each question from very insightful researchers. The answers reflect temperament more than anything else. Christian de Duve actually wrote a book called ‘Singularities’, and my reading of that is that there isn’t a single answer that would apply to the origin of life, the origin of photosynthesis, the origin of the eukaryotic cell, the origin of animals, and the origin of consciousness. Obviously for some reason, each was improbable or it would have happened more than once (like eyes), but the reasons for improbability differ and are very dependent on context. In the case of eukaryotes, I would say their unique origin was based on an improbable endosymbiosis between prokaryotes, followed by a problematic reconciliation of selfish interests between two entities that had to live in intimate union. There were no advantages at all until they had come out of that tight bottleneck; on the contrary, all the advantages were with the bacteria that just kept on doing their bacterial thing. 
From that point of view, the difficult question is why did it happen at all?<br />
<br />
<b>TG- Some of your research interests concern very ancient events (e.g. the origin of eukaryotes, of life itself). This is a field in which different hypotheses are difficult to prove right or wrong given the difficulty of direct experimentation. What are the criteria used by scientists in your area to reach a consensus over which is the support for the different scenarios?</b></span></div><div style="font-family: inherit; text-align: justify;"><span style="font-size: small;"><b> </b><br />
NL- There is a consensus on quite a lot: cell structure, behavior (phagocytosis or sex) genome sequences (albeit with disputes over methodology), the existence of introns in certain positions and so on. Where consensus breaks down is when different methods give different answers. That happens all the time. I’m actually focusing a lot of my attention now on the origin of life itself, because this seems to me to be more experimentally tractable: we can ask specific experimental questions that involve chemistry and thermodynamics, which are much more reliable than biology and genes, so although the event was the most ancient of all, it is not necessarily the most inaccessible. I think we’re making progress on many questions, but in the case of the origin of eukaryotes a lot of the evidence is oblique and disputable. The reasoning is often equivalent to historical reconstruction in that you need to weigh the evidence: there’s no doubt that it happened, and there’s plenty of evidence, it’s just that some of it is unreliable and some is irrelevant, so there’s plenty of scope for argument still.<br />
<br />
<b>TG- In this respect, what is the impact on your field of the ever-growing number of genome sequencing projects? Which species or environments would you like to see sampled to help answer important questions about the origin and evolution of complex life?</b></span></div><div style="text-align: justify;"><span style="font-size: x-small;"><span style="font-family: inherit; font-size: small;"><b> </b><br />
NL- Genome sequences have made a tremendous difference, the only trouble being that they tend to reflect pathogens or industrially interesting bugs, rather than those most relevant to, say, the origin of eukaryotes. I would love to see more genomes from anoxic or anaerobic deep ocean environments, or the deep hot biosphere. I’m especially interested in two questions: the variation in eukaryotic genomes, and the variation in mitochondrial genomes. There is a brilliant and bold hypothesis that the origin of the eukaryotic cell was an endosymbiosis between two prokaryotes, an archaeon host cell and an alpha-proteobacterium (or somesuch). The prediction is that all eukaryotes should have mitochondria or organelles derived from them like hydrogenosomes or mitosomes; and that in terms of mitochondrial genomes we should find more overlap between bacterial metabolic capacity and metabolically versatile mitochondria. This is a wonderful prediction because it is so easy to falsify, and yet all the genome sequencing so far has failed to disprove it. The places most likely to disprove – or prove – it are precisely those anaerobic environments that have been undersampled so far.<br />
<br />
<b>TG- Carbon has always been considered a hallmark of life on earth, but life elsewhere based on other elements (e.g. silicon) has been speculated about. You seem to favor the idea that oxygen was the molecule that enabled the appearance of complex life on earth. Could you speculate on the theoretical possibility of other molecules playing a similar role in other forms of life?</b></span></span></div><div style="text-align: justify;"><span style="font-size: x-small;"><span style="font-family: inherit; font-size: small;"><b> </b><br />
NL- I think it is most likely that life elsewhere would be constrained by much the same issues that constrain life here. I doubt very much that there will be silicon based life forms. There are two important properties of carbon: it is much better than silicon at organic chemistry; but equally important, it is available in the form of a gaseous oxide, a Lego brick if you will. There are no gaseous silicon oxides, only sand, which is vast and unwieldy in comparison. You can’t build a house on sand and you can’t build an organism from sand. My feeling is that not only is carbon especially useful, it is also more abundant than silicon. Likewise, water is more abundant than methane and a much better solvent (you can’t dissolve carbon chains of more than about 5 carbon atoms in methane). And so on. On the basis of usefulness and abundance, I would argue that life would mostly be carbon based. I would go further to argue that it is likely to require proton gradients over membranes for thermodynamic reasons. When I say that oxygen is necessary for complex life, I mean large active animals. I doubt that anything else could do the job: nothing else could accumulate to the appropriate level in an atmosphere and at the same time be sufficiently reactive to provide the power needed. So I’d say that in terms of their broad biochemistry, alien life won’t be all that different. In terms of morphology or the specifics of their biochemistry, they could be very different, of course.<br />
<br />
<b>TG- Are you already working on your next book? Can you tell us something about its subject?</b></span></span></div><div style="text-align: justify;"><span style="font-size: x-small;"><span style="font-family: inherit; font-size: small;"><b> </b><br />
NL- I’m not writing yet, but I do have a contract… and it will be about everything I have talked about here. The origin of complex life, and why it was a unique event here on Earth.</span> <br />
</span></div>Tonihttp://www.blogger.com/profile/03001880189572009935noreply@blogger.com0tag:blogger.com,1999:blog-4270702979950789999.post-28781546992225832022012-01-24T00:19:00.000-08:002012-01-24T00:19:06.235-08:00RECOMB 2012 (Barcelona): one week left for early registration As I reported in an <a href="http://treevolution.blogspot.com/2011/10/recomb-2012-barcelona.html">earlier post</a>, <a href="http://recomb2012.crg.cat/">RECOMB 2012 </a>will be held in Barcelona and CRG's Bioinformatics and Genomics program is part of the local organizing committee. <br />
This post is a reminder that the deadline for early registration at a reduced rate is approaching: it expires on 31 January. More information <a href="http://recomb2012.crg.cat/">here</a>. <br />
<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjU8VJpEGqSbRdc2LogdxXwW7lCpyiXmYfxVBhguso0bb8xQjZPpdAvXbeXYAxj_JPh2GVcmUQ5DTusltieDchqtWq2jF4PFopxewe5CM9AeEj6Ihy9_mU2PMNkT9-8xnJxi4Rdnn18Ph8/s1600/RECOMB.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="50" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjU8VJpEGqSbRdc2LogdxXwW7lCpyiXmYfxVBhguso0bb8xQjZPpdAvXbeXYAxj_JPh2GVcmUQ5DTusltieDchqtWq2jF4PFopxewe5CM9AeEj6Ihy9_mU2PMNkT9-8xnJxi4Rdnn18Ph8/s320/RECOMB.png" width="320" /></a></div> See you there!Tonihttp://www.blogger.com/profile/03001880189572009935noreply@blogger.com0tag:blogger.com,1999:blog-4270702979950789999.post-71942270759630718032012-01-13T09:09:00.000-08:002012-01-13T09:10:28.935-08:00SMBE 2012 early registration deadline and symposium on orthology<span style="font-size: x-small;"> <span style="font-family: inherit; font-size: small;">For those who don't know, the deadline for abstract to the next <a href="http://www.smbe2012.org/">Society for Molecular Biology and Evolution meeting</a> (Dublin 23-26 June) is approaching. I am co-organizing a workshop on orthology/paralogy and function in collaboration with Marc Robinson-Rechavi, </span></span><span style="font-family: inherit; font-size: small;"> Matthew Hahn, and Iddo Friedberg. Find below an invitation to submit to SMBE2012 and more info on this workshop. </span><br />
<div style="font-family: inherit;"><span style="font-size: small;"><br />
</span></div><div style="font-family: inherit;"><span style="font-size: small;"> Hope we can meet in Dublin. </span></div><div style="font-family: inherit;"><span style="font-size: small;"><br />
</span></div><div style="font-family: inherit;"><span style="font-size: small;"><br />
</span></div><div class="separator" style="clear: both; font-family: inherit; text-align: center;"><span style="font-size: small;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh5TEDE5SCcGiU31UwMj-r1kdT1ax4fFpF5cwSS__rJeQJPq-eXTvyRaB-iZOMN32_Qhe70heOYsZxaniGaoG1mJ4oA44vyqAplNrB9cjHXRjpQPiFHEyZxk-6HD_xbq8jY7RmJiHmfaPI/s1600/smbe2012.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="113" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh5TEDE5SCcGiU31UwMj-r1kdT1ax4fFpF5cwSS__rJeQJPq-eXTvyRaB-iZOMN32_Qhe70heOYsZxaniGaoG1mJ4oA44vyqAplNrB9cjHXRjpQPiFHEyZxk-6HD_xbq8jY7RmJiHmfaPI/s320/smbe2012.png" width="320" /></a></span></div><div style="font-family: inherit;"><span style="font-size: small;"><br />
</span></div><div style="font-family: inherit;"><span style="font-size: small;"><br />
</span></div><div style="font-family: inherit;"><span style="font-size: small;">Dear colleague,<br />
<br />
We invite you to submit an abstract to the symposium "The complex relationship between orthology, paralogy, and function" to take place at the meeting of the Society for Molecular Biology and Evolution in Dublin (23rd-26th June, 2012).<br />
<br />
The deadline to submit an abstract is the 27th of January 2012, for more details please visit:<br />
<br />
<a href="https://owa.crg.es/owa/redir.aspx?C=92015e949e4e40639328cae9476c11de&URL=http%3a%2f%2fwww.smbe2012.org%2fscientific-content%2fcall-for-abstracts.html" target="_blank">http://www.smbe2012.org/scientific-content/call-for-abstracts.html</a><br />
<br />
<br />
Symposium "The complex relationship between orthology, paralogy, and function"<br />
<br />
Orthology and paralogy have been central concepts in molecular evolution since the distinction was first proposed by Fitch in 1970. A long-standing interpretation of this distinction has been that orthologs would be more similar in function than paralogs. Until recently, this interpretation was rarely tested, and in fact rarely explicitly articulated in a testable manner. Yet it has been widely used, from undergraduate teaching to the practical application of orthology searches for genome annotation. Recent research has increasingly sought to define and test this "ortholog conjecture". Notably, a recent paper (Nehrt et al. 2011, PLoS Comput. Biol.) has reported a higher functional similarity of paralogs than of orthologs. This paper has generated much attention and debate, while at the same time recent work on orthologs has shown the vitality and importance of this field to a broad range of applications and questions. Our symposium will feature speakers addressing the fundamental relationships between molecular evolution and biological function, focusing especially on the role of orthology and paralogy in modulating such relationships.<br />
<br />
Confirmed speakers: Eugene V. Koonin, Jianzhi Zhang<br />
<br />
If you have any questions regarding this symposium, please do not hesitate to contact us:<br />
<br />
Toni Gabaldon , Matthew Hahn , Iddo Friedberg, Marc Robinson-Rechavi </span></div>Tonihttp://www.blogger.com/profile/03001880189572009935noreply@blogger.com0tag:blogger.com,1999:blog-4270702979950789999.post-47165590532681022362012-01-10T11:30:00.000-08:002012-01-11T01:52:15.920-08:00Diversity arises whenever, wherever, and at whatever rate is advantageous This is the conclusion from a <a href="http://www.nature.com/nature/journal/v479/n7373/full/nature10516.html">recent paper </a>from the group of Mark Pagel, in which they analyzed a dataset of body sizes of 3,185 extant mammals in a phylogenetic context.<br />
<br />
<div style="text-align: justify;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEixin43r6zId8SLMMSJVhrxy9CHvN4VaPZy9tNkZVBp_17b-iLnVegCnRySvuNTzktfIYZNqSDIgiL5brBwOQSXhzYVqCK8nkUIg2X2fddCrLgqTmZXavf-oDY2m42nRdOc-mP-uoO9E00/s1600/mammal_info_graphic.gif" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEixin43r6zId8SLMMSJVhrxy9CHvN4VaPZy9tNkZVBp_17b-iLnVegCnRySvuNTzktfIYZNqSDIgiL5brBwOQSXhzYVqCK8nkUIg2X2fddCrLgqTmZXavf-oDY2m42nRdOc-mP-uoO9E00/s320/mammal_info_graphic.gif" width="253" /></a> They modeled the evolution of body sizes across the phylogeny using a Bayesian approach that allows evolutionary rates to vary at every branch. </div><div style="text-align: justify;"><br />
</div><div style="text-align: justify;">This gave them an idea of where bursts of evolution (big shifts in size) had occurred. The main aim was to test a long-held hypothesis that the early radiation of mammals was accompanied by increased rates of body-size variation (i.e. that bursts in species diversity coincided with bursts in body-size change). This was explained by the idea that mammals expanded into a largely unoccupied niche, which provided opportunities for diversification. Once the niche was filled, diversification and evolutionary rates decreased. </div><br />
<div style="text-align: justify;"> The results from this team are in stark contrast with this view, since they see bursts at many different places in the phylogeny, uncoupled from the early radiation of mammals. </div><div style="text-align: justify;"><br />
</div><div style="text-align: justify;"> Reading this paper was very useful to me, since at the time I was preparing the evaluation of a PhD thesis by Victor Soria-Carrasco (see a related paper <a href="http://www.ncbi.nlm.nih.gov/sites/entrez?cmd=Retrieve&db=PubMed&list_uids=21955145&dopt=AbstractPlus">here</a>) on, precisely, mammalian diversification. The thesis found that most mammalian orders showed a decline in the rate of diversification (in terms of the formation of new species), which may seem compatible with the idea of a niche being filled up. This highlights the importance of clearly specifying which evolutionary rates we refer to (sequence variation, variation in some morphological character, speciation rate...), since otherwise we may reach apparently contradictory conclusions. Complicating the issue further, one does not know whether niche limitation may select for or against diversification. </div><div style="text-align: justify;"></div><div style="text-align: justify;"> In any case, it is comforting to see that the increasing amount of genetic, phylogenetic, and other types of data, together with sophisticated models, enables us to explore such interesting issues at the interface between evolution, phylogenetics and ecology. I was really impressed by the work mentioned. </div><div style="text-align: justify;"></div><div style="text-align: justify;"><br />
</div>Tonihttp://www.blogger.com/profile/03001880189572009935noreply@blogger.com0tag:blogger.com,1999:blog-4270702979950789999.post-56565847406080985832011-12-10T08:12:00.000-08:002011-12-10T08:17:09.386-08:00Sequencing species.... by the thousands<div style="text-align: justify;"> When I was taking my first steps in the field of comparative genomics, there was not much to think about when deciding which genomic datasets to use: one would just take them all. With only a few dozen genomes, mostly bacterial, one could keep everything at hand on a local disk and simply update it every couple of months by adding one or two more...</div><br />
<div style="text-align: justify;"> Those times have definitely passed, and now the flow of newly sequenced genomes is... well, overwhelming (see figure below, taken from <a href="http://www.genomesonline.org/">Genomes Online</a>). This is both a blessing and a curse for those of us doing comparative genomics: we have an unprecedented amount of data, which enables greater resolution, but we increasingly face novel technical and analytical challenges.</div><br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhap3i2KanSw3POeio7ZcGvgXgJsInWqSCYvHnQs-5o21MXfWIwCUqIkYuGRWiWWPU2eZJAe4qXgmofb2qJ0zFTejsnTymgD6rR-i_nXITJKysyOXHLQPB1p8VStbpkyyV6aKctqZIdSzE/s1600/gold_s2.gif" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="180" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhap3i2KanSw3POeio7ZcGvgXgJsInWqSCYvHnQs-5o21MXfWIwCUqIkYuGRWiWWPU2eZJAe4qXgmofb2qJ0zFTejsnTymgD6rR-i_nXITJKysyOXHLQPB1p8VStbpkyyV6aKctqZIdSzE/s320/gold_s2.gif" width="320" /></a></div><br />
<div style="text-align: justify;"> Just to give a taste of the coming avalanche of genomes from different species (projects that sequence many genomes from a single species, such as the 1000 Genomes Project, are another story), I list here some of the projects I am aware of that aim to sequence thousands of genomes from a given taxonomic group.</div><ul><li><a href="http://www.blogger.com/%20http://arthropodgenomes.org/wiki/i5K" target="_blank">i5K: 5000 arthropod genomes</a> </li>
<li><a href="http://www.blogger.com/%20http://1000.fungalgenomes.org/home/" target="_blank">1000 fungal genomes</a></li>
<li><a href="http://www.genome10k.org/">genome 10K: 10000 vertebrate genomes</a> </li>
<li><a href="http://en.genomics.cn/navigation/show_navigation.action?navigation.id=225">10000 microbial genome project</a></li>
</ul><br />
<div style="text-align: justify;">As expected, in projects of this kind it is much easier to come up with a bold number than to define the list of species that will actually be sequenced. At least this is what I can tell from my involvement in the i5K initiative, in which prioritising the species to be sequenced is not simple, since one usually wants to weigh different criteria (phylogenetic relevance; biological, economic, and clinical importance; etc.). </div><br />
<div style="text-align: justify;"> I'm sure I have missed some and, in addition, there is a growing flow of genomes sequenced by independent groups, including my own modest group. One common weakness of these initiatives, large and small, is that they sometimes come with funding for the genome sequencing itself but do not account for the bioinformatics analyses needed to actually make sense of the data. With sequencing costs dropping and the potential analyses becoming more complex, the real costs of sequencing projects will increasingly lie in the analyses beyond the assembly and annotation phases. As a result, many bioinformatics groups are stretching their resources to contribute to genomics projects without receiving any specific funding.</div><br />
<div style="text-align: justify;">In my opinion, the planning of a sequencing project should account for all the downstream phases and their associated costs. With such an approach we may end up with a handful fewer genomes, but we will definitely learn more from them. </div>Tonihttp://www.blogger.com/profile/03001880189572009935noreply@blogger.com0tag:blogger.com,1999:blog-4270702979950789999.post-4787674312973602362011-12-05T07:27:00.000-08:002011-12-05T07:28:08.266-08:00Watch the talks from the CRG Symposium: Computational Biology of Molecular Sequences.<div class="x_MsoNormal"><br />
</div><div class="x_MsoNormal"><span class="x_apple-style-span"><span lang="EN-GB" style="background: none repeat scroll 0% 0% white; color: #7f7f7f; font-family: "Arial","sans-serif"; font-size: 10pt;"><span style="color: black;"> If you missed the opportunity to attend physically our past symposium on "Computational Biology of molecules" (</span><a href="http://treevolution.blogspot.com/2011/09/crg-symposium-computational-biology-of.html" style="color: black;" target="_blank">see this past post</a><span style="color: black;">), you can now watch the videos of the talks (read message below). </span></span></span></div><div class="x_MsoNormal"></div><div class="x_MsoNormal"><br />
</div><div class="x_MsoNormal"><span class="x_apple-style-span"><span lang="EN-GB" style="background: none repeat scroll 0% 0% white; color: #7f7f7f; font-family: "Arial","sans-serif"; font-size: 10pt;">***************** </span></span></div><div class="x_MsoNormal"><span class="x_apple-style-span"><span lang="EN-GB" style="background: none repeat scroll 0% 0% white; color: #7f7f7f; font-family: "Arial","sans-serif"; font-size: 10pt;">Dear all,</span></span></div><div class="x_MsoNormal"><br />
</div><div class="x_MsoNormal" style="line-height: 150%; text-align: justify;"><span class="x_apple-style-span"><span lang="EN-GB" style="background: none repeat scroll 0% 0% white; color: #7f7f7f; font-family: "Arial","sans-serif"; font-size: 10pt; line-height: 150%;">All contents of the </span></span><span class="x_apple-style-span"><b><span lang="EN-GB" style="background: none repeat scroll 0% 0% white; color: #e36c0a; font-family: "Arial","sans-serif"; font-size: 10pt; line-height: 150%;">10<sup>th</sup> CRG Annual Symposium on Computational Biology of Molecular Sequences</span></b></span><span class="x_apple-style-span"><span lang="EN-GB" style="background: none repeat scroll 0% 0% white; color: #7f7f7f; font-family: "Arial","sans-serif"; font-size: 10pt; line-height: 150%;">, celebrated last 10<sup>th</sup> and 11<sup>th</sup> of November, are now available online.</span></span><span class="x_apple-style-span"><span style="background: none repeat scroll 0% 0% white; color: #7f7f7f; font-family: "Arial","sans-serif"; font-size: 10pt; line-height: 150%;"></span></span></div><div class="x_MsoNormal" style="line-height: 150%; text-align: justify;"><br />
</div><div class="x_MsoNormal" style="line-height: 150%; text-align: justify; text-autospace: none;"><span lang="EN-GB" style="color: #7f7f7f; font-family: "Arial","sans-serif"; font-size: 10pt; line-height: 150%;">Leading scientists in computational biology came together in Barcelona on the occasion of the tenth edition of the CRG Annual Symposium, which focused on computational biology of molecular sequences, organized by the <b>Centre for Genomic Regulation (CRG)</b>. The auditorium of the Barcelona Biomedical Research Park (PRBB) hosted the event, celebrated from Thursday 10 to Friday 11 November 2011.</span></div><div class="x_MsoNormal" style="line-height: 150%; text-align: justify; text-autospace: none;"></div><div class="x_MsoNormal" style="line-height: 150%; text-align: justify; text-autospace: none;"><span lang="EN-GB" style="color: #7f7f7f; font-family: "Arial","sans-serif"; font-size: 10pt; line-height: 150%;">In the </span><a href="http://2011symposium.crg.es/%20" target="_blank"><span lang="EN-GB" style="color: #e36c0a;">microsite</span></a><span lang="EN-GB" style="color: #7f7f7f; font-family: "Arial","sans-serif"; font-size: 10pt; line-height: 150%;"> you can find the inaugural video of the Symposium, videos of the talks, interviews with some of the speakers, participants and organizers of the event and two summary videos that capture the major points of all sessions. There are also available two articles that summarize the talks and news related to the field of computational biology of sequencing.</span></div><div class="x_MsoNormal" style="line-height: 150%; text-align: justify;"><br />
</div><div class="x_MsoNormal" style="line-height: 150%; text-align: justify;"><span lang="EN-GB" style="color: #7f7f7f; font-family: "Arial","sans-serif"; font-size: 10pt; line-height: 150%;">We hope that these resources are useful for you! </span></div><div class="x_MsoNormal" style="line-height: 150%; text-align: justify;"><br />
</div><div class="x_MsoNormal" style="line-height: 150%; text-align: justify;"><span lang="EN-GB" style="color: #7f7f7f; font-family: "Arial","sans-serif"; font-size: 10pt; line-height: 150%;">Click<a href="http://www.blogger.com/goog_706738370"> </a></span><a href="http://2011symposium.crg.es/%20" target="_blank"><span style="color: #e36c0a; font-family: "Arial","sans-serif"; font-size: 10pt; line-height: 150%;"><span lang="EN-GB" style="color: #e36c0a;">here</span></span></a><span style="color: #7f7f7f; font-family: "Arial","sans-serif"; font-size: 10pt; line-height: 150%;"> <span lang="EN-GB">to visit the </span></span><b><span lang="EN-GB" style="color: #e36c0a; font-family: "Arial","sans-serif"; font-size: 10pt; line-height: 150%;">10<sup>th</sup> CRG Annual Symposium</span></b><span lang="EN-GB" style="color: #7f7f7f; font-family: "Arial","sans-serif"; font-size: 10pt; line-height: 150%;"> web.</span></div>Tonihttp://www.blogger.com/profile/03001880189572009935noreply@blogger.com0tag:blogger.com,1999:blog-4270702979950789999.post-35237585780038650942011-12-03T08:46:00.000-08:002011-12-03T10:07:20.942-08:00SESBE: Spanish Society for Evolutionary Biology<div style="text-align: justify;"> Last week I went to Madrid to attend the 3rd congress of the <a href="http://www.sesbe.org/" target="_blank">Spanish Society for Evolutionary Biology</a> (SESBE). This is a relatively new (7 years) society that embraces evolutionary biology as a whole, from palaeontology and systematics, to evolutionary genomics and darwinian medicine. Thus, the meetings are very diverse and one can listen to the most diverse talks, always with the common ground of evolutionary theory as a framework of analysis.</div><br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhBRqw9fCywlIDLsF-F-mD4dK8uvoUDqVqp8z6Y_o4L7UITYskE7TEne7uItJIjjPkBwuX07PKJjweQ-FAC7LF_Ul5_fBYXih1oF-JvP6hNo8-Cx-s7BZB4ifY1SoCm6Yb36tJNSsFet3s/s1600/SESBE_logo.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhBRqw9fCywlIDLsF-F-mD4dK8uvoUDqVqp8z6Y_o4L7UITYskE7TEne7uItJIjjPkBwuX07PKJjweQ-FAC7LF_Ul5_fBYXih1oF-JvP6hNo8-Cx-s7BZB4ifY1SoCm6Yb36tJNSsFet3s/s1600/SESBE_logo.png" /></a></div><br />
<div style="text-align: justify;">Due to other commitments, I could only stay two days, but it was worth it: I enjoyed most of the talks and, most of all, meeting colleagues from around Spain. I would highlight here the talk by <a href="http://www.nick-lane.net/" target="_blank">Nick Lane</a> on the evolution of eukaryotes and the role played by mitochondrial endosymbiosis. Nick, who is also a prolific writer of popular science books, gave a very nice talk that captivated the whole audience, including me. I had the opportunity to talk with him, and it was nice to discuss once again the big theories on the evolution of eukaryotes, a topic I am passionate about. </div><br />
<div style="text-align: justify;">This year, the SESBE elected a new board, on which I will serve as secretary. Not that I am very keen on holding such a position, but I was asked, and I think one should be prepared to contribute one's two cents to noble causes, such as this society's mission of promoting the study of evolution and its communication to society in our country. </div>Tonihttp://www.blogger.com/profile/03001880189572009935noreply@blogger.com0Madrid, Spain40.4166909 -3.7003454000000640.2509674 -3.88584290000006 40.5824144 -3.5148479000000603tag:blogger.com,1999:blog-4270702979950789999.post-36056260690669883312011-11-20T00:01:00.000-08:002011-12-03T10:08:05.584-08:00XI Jornadas de Bioinformatica in Barcelona (23-25 January)<div style="text-align: justify;"> A short note to spread the word about the joint <a href="http://sgu.bioinfo.cipf.es/jbi2012/" target="_blank">Spanish and Portuguese Meeting on Bioinformatics</a>. This is a yearly meeting that gains momentum every year, and it is a great opportunity to meet most of the groups doing bioinformatics in the region. Talks are in English and everybody is welcome to attend.</div><br />
<div style="text-align: justify;"> As in other years, a regional (Spain, Portugal and North Africa) <a href="http://sgu.bioinfo.cipf.es/jbi2012/?page_id=165" target="_blank">ISCB student symposium</a> is associated with this meeting. This year the symposium is co-organized by Salvador Capella-Gutierrez, one of the members of my lab. </div><br />
If you plan to submit a communication, there is time till the end of November.<br />
<br />
See you there.Tonihttp://www.blogger.com/profile/03001880189572009935noreply@blogger.com1Barcelona, Spain41.387917 2.169918700000039341.312308 2.0830957000000394 41.463526 2.2567417000000392tag:blogger.com,1999:blog-4270702979950789999.post-4161605942197381032011-11-08T02:42:00.000-08:002011-11-08T02:43:49.395-08:00ALPHY 2012: French-Spanish meeting on Bioinformatics and Evolutionary Genomics (March 19 -21, Banyuls-sur-Mer) I am glad to announce<a href="http://lbbe.univ-lyon1.fr/alphy/"> ALPHY 2012</a>, which for the first time is jointly organized by French and Spanish researchers. I was very glad to be invited by my French colleagues to sit on the organizing committee. I think it is a great opportunity to bring together two communities with ample experience in phylogenetics-related research. <br />
<br />
ALPHY is an annual meeting, organized in France since 1995, dedicated to the field of Bioinformatics and Comparative Genomics (ALPHY = ALignments and PHYlogeny). The main goal of this meeting is to promote informal exchanges in this highly multidisciplinary field, and to encourage young scientists to present their work. The official invitation follows, plus a very tempting picture of the location. <br />
<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjkD7tr3t46U0k_d02bAPDHlnY2fpECaY-RkvR-NO-zGd2tWakK2JdvvxjzZ_j1XRSdlFREUur8CaMRkwfkvy4DcaaDCXda755yqcQuGSYvZvMABiS9Z_XCcdSVu2_h2_SuP0mdBfaWA3U/s1600/banyuls1.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="208" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjkD7tr3t46U0k_d02bAPDHlnY2fpECaY-RkvR-NO-zGd2tWakK2JdvvxjzZ_j1XRSdlFREUur8CaMRkwfkvy4DcaaDCXda755yqcQuGSYvZvMABiS9Z_XCcdSVu2_h2_SuP0mdBfaWA3U/s320/banyuls1.jpg" width="320" /></a></div><br />
This year, ALPHY is co-organized by Spanish and French scientists, in the nice city of Banyuls. There will be two invited speakers (<a class="spip_out" href="http://www.unil.ch/cig/page7858_en.html" rel="external">Henrik Kaessmann</a> and <a class="spip_out" href="http://molevol.cmima.csic.es/castresana/" rel="external">Jose Castresana</a>), and the program will be open to contributions for 20’ talks.<br />
Registration for the meeting is free, but mandatory. Please use the link (top left of this page) to register. If you wish to present your work, submit your abstract in the registration form.<br />
<b>Important dates:</b><br />
<ul class="spip"><li> Deadline for abstract submission: January 10 2012</li>
<li> Deadline for registration : February 1st 2012</li>
</ul><b>Hasta pronto – A bientôt – fins aviat - see you in Banyuls!</b>Tonihttp://www.blogger.com/profile/03001880189572009935noreply@blogger.com0tag:blogger.com,1999:blog-4270702979950789999.post-6101374341210614832011-11-03T02:19:00.000-07:002011-11-03T04:14:54.097-07:00Sad news from CIPF: the rise and fall of the "flagship" of valencian researchFor those who don't know, I am originally from <a href="http://en.wikipedia.org/wiki/Valencia,_Spain">Valencia</a>. There, one of the deepest traditions and the main festivity are the so-called <a href="http://en.wikipedia.org/wiki/Falles">"Falles"</a>, which in part consist of building huge temporary cardboard sculptures that are displayed for little more than a week and then burned in a big fire. For some people it is hard to understand how so much time and money is invested in something that is then left to the flames.<br />
<br />
Apparently, something similar is happening with a <b>research centre</b>!!!<br />
<br />
The <a href="http://www.cipf.es/">"<i><b>Centro de Investigación Príncipe Felipe</b></i></a>" was created in 2005 by the local Valencian government with the idea of making it the "flagship" of research in the region. It came with a strong investment from the regional and central governments and soon attracted many scientists. I was one of the seduced scientists: originally from Valencia and at that time working in the Netherlands, I was enthusiastic about a move aiming to put biomedical research in Valencia at the forefront.<br />
<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjZElCphjaphv7ds4jSzBte0knCFgu0_4j6QrNlx09vvGIg6eucIZaOCVFgxfLEGiLf94E3CLLGheuwN6eNnRKWCIpF_s9AK5CgoQ0bxft5F8qfRkZbchRl2A6svMWvu4RGMW4tyZkEanY/s1600/centroCIPF.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="82" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjZElCphjaphv7ds4jSzBte0knCFgu0_4j6QrNlx09vvGIg6eucIZaOCVFgxfLEGiLf94E3CLLGheuwN6eNnRKWCIpF_s9AK5CgoQ0bxft5F8qfRkZbchRl2A6svMWvu4RGMW4tyZkEanY/s320/centroCIPF.jpg" width="320" /></a></div><br />
<br />
Five years after its creation, the cuts started. The crisis had hit the Spanish economy and many local governments had big debts, particularly that of Valencia, which has been famous for investing in huge events such as the America's Cup or the Formula 1 competition. When things got complicated, research was seen as one of the most superfluous things in which a government could invest, and thus cuts were announced. This year the centre is firing 40% of the personnel, including PhD candidates in the middle of their PhD. I guess many of the remaining researchers will leave this downsized centre for a better life elsewhere. The flagship is now sinking, "burned" after so much investment and effort. The comparison to our "Falles" is unavoidable. <br />
<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgdu4O6PZ-DQv0NXPmxQnzEqa8BpU0UfU8jSACxydswWR7lwYy01hMW2VCDP1RjHTKHln8DxkEkYRM1FPsdaRz0XFREk4iQYMH7vmbpdxWuyNxaRjlDCbGa-7kaCOCryRo6nw8CcLDHU84/s1600/ofrena.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="214" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgdu4O6PZ-DQv0NXPmxQnzEqa8BpU0UfU8jSACxydswWR7lwYy01hMW2VCDP1RjHTKHln8DxkEkYRM1FPsdaRz0XFREk4iQYMH7vmbpdxWuyNxaRjlDCbGa-7kaCOCryRo6nw8CcLDHU84/s320/ofrena.jpg" width="320" /></a></div><br />
<br />
The whole story is reported by <a href="http://www.nature.com/news/2011/111101/full/news.2011.623.html">Nature </a>and by many articles in the Spanish press. As Juli Peretó <a href="http://blocs.mesvilaweb.cat/node/view/id/207896">reports,</a> the local government is letting CIPF fall while continuing to invest in other types of events, such as an international golf tournament in Castelló, or increasing the funds for a motorbike circuit. This is most ironic, and deeply sad. <br />
<br />
I just wish the best for my many ex-colleagues who are still at CIPF, and I hope this is not the kind of science policy that the future government of Spain (which, according to polls, is likely to be the same conservative party that is now governing in Valencia) is planning.Tonihttp://www.blogger.com/profile/03001880189572009935noreply@blogger.com2tag:blogger.com,1999:blog-4270702979950789999.post-71405481784349646872011-11-01T03:46:00.000-07:002011-11-01T04:06:43.969-07:00Educational video on the Tree of Life<div class="separator" style="clear: both; text-align: center;"></div><div class="separator" style="clear: both; text-align: center;"></div><div class="separator" style="clear: both; text-align: center;"></div><div class="separator" style="clear: both; text-align: center;"></div><div class="separator" style="clear: both; text-align: left;"><br />
</div><div class="separator" style="clear: both; text-align: justify;">In the blog of <a href="http://jonoave.blogspot.com/">Jun-Hoe Lee</a>, a former visiting student in my lab, I found this interesting video from Yale University on the Tree of Life and the efforts to reconstruct it.</div><div class="separator" style="clear: both; text-align: justify;"><br />
</div><div class="separator" style="clear: both; text-align: justify;"><br />
</div><div class="separator" style="clear: both; text-align: center;"><iframe allowfullscreen='allowfullscreen' webkitallowfullscreen='webkitallowfullscreen' mozallowfullscreen='mozallowfullscreen' width='320' height='266' src='https://www.youtube.com/embed/mD94D0KAn2U?feature=player_embedded' frameborder='0'></iframe> </div><div class="separator" style="clear: both; text-align: justify;"><br />
</div><div class="separator" style="clear: both; text-align: justify;"><br />
</div><div class="separator" style="clear: both; text-align: justify;">I think it is a good piece of popular science communication and conveys the problem reasonably well. Of course, there are simplifications, and some important aspects, such as horizontal gene transfer, symbioses, and their effects on the tree, are not covered, but it provides an attractive and educational introduction to the problem of assembling the tree of life. </div>Tonihttp://www.blogger.com/profile/03001880189572009935noreply@blogger.com2tag:blogger.com,1999:blog-4270702979950789999.post-23752402776435917582011-10-24T06:36:00.000-07:002011-10-24T06:36:58.188-07:00RECOMB 2012 (Barcelona) The next <a href="http://recomb2012.crg.eu/">RECOMB</a> meeting will be held in Barcelona. Our department is part of the local organizing committee and the list of confirmed speakers looks very promising.<br />
<br />
Submission opened in September, and you still have time to submit papers until the end of the week. Do not miss the deadline.<br />
<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjU8VJpEGqSbRdc2LogdxXwW7lCpyiXmYfxVBhguso0bb8xQjZPpdAvXbeXYAxj_JPh2GVcmUQ5DTusltieDchqtWq2jF4PFopxewe5CM9AeEj6Ihy9_mU2PMNkT9-8xnJxi4Rdnn18Ph8/s1600/RECOMB.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="50" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjU8VJpEGqSbRdc2LogdxXwW7lCpyiXmYfxVBhguso0bb8xQjZPpdAvXbeXYAxj_JPh2GVcmUQ5DTusltieDchqtWq2jF4PFopxewe5CM9AeEj6Ihy9_mU2PMNkT9-8xnJxi4Rdnn18Ph8/s320/RECOMB.png" width="320" /></a></div>Tonihttp://www.blogger.com/profile/03001880189572009935noreply@blogger.com0tag:blogger.com,1999:blog-4270702979950789999.post-10965254871053579042011-09-24T10:32:00.000-07:002011-09-24T10:32:51.237-07:00Special BiB issue on "Orthology and Applications" A special issue on<a href="http://bib.oxfordjournals.org/content/12/5.toc"> "Orthology and Applications" </a>is out in the journal <i>Briefings in Bioinformatics</i>.<br />
<br />
This special issue has been edited by Christophe Dessimoz and comprises a number of interesting papers, including several comprehensive reviews and also original research articles. Some of the papers emerge from efforts on orthology benchmarking and standardization of datasets that were initiated during the first "Quest for Orthologs" meeting in 2009. See this <a href="http://www.ncbi.nlm.nih.gov/pubmed/19785718">letter </a>reporting from that meeting. We contributed an <a href="http://www.ncbi.nlm.nih.gov/pubmed/21515902">article</a> reporting on the comparison of expression patterns between across-species orthologs and paralogs of a similar evolutionary age.Tonihttp://www.blogger.com/profile/03001880189572009935noreply@blogger.com0tag:blogger.com,1999:blog-4270702979950789999.post-46806281472705045892011-09-21T09:04:00.000-07:002011-09-21T09:17:32.461-07:00On the "orthology conjecture" Hi,<br />
<br />
Jonathan Eisen has opened a thread <a href="http://phylogenomics.blogspot.com/2011/09/special-guest-post-discussion.html">in his blog </a>to discuss the recent paper by <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1002073">Hahn and colleagues on the "ortholog conjecture"</a>. You can read more about the discussions raised by this paper <a href="http://phylogenomics.blogspot.com/2011/09/some-links-on-ortholog-conjecture-paper.html">here</a>. <br />
<br />
This is what I wrote, a text which I had to split into three pieces in Eisen's blog given the word limit for comments!<br />
<br />
Hi<br />
<br />
I appreciate the effort by Matthew Hahn to explain the story behind his paper on the so-called "ortholog conjecture" and to face some of the criticism. This paper attracted my interest, as it did that of many others who work on, or just use, orthology. For instance, it was chosen by one of my postdocs for our "Journal Club" meeting, and it was discussed during our last "<a href="http://www.ebi.ac.uk/training/onsite/110617_Quest_for_Orthologs.html">Quest for Orthologs</a>" meeting in Cambridge. I think it is raising a necessary discussion and therefore I think it is a good paper. This does not mean that I fully agree with the interpretation and conclusions ;-). I hope to modestly contribute to this debate with the following post. <br />
<br />
I think one of the reasons this paper has caused so much debate is that the conclusions seem to challenge common practice (inferring function from orthologs), and could be interpreted as a call to change the strategies of genome annotation. I think, however, that one should interpret these results carefully before starting to annotate based on paralogous proteins. As I will discuss below, one of the problems is that we need to agree on what the conjecture is in order to agree on how to test it. I see three main points that can be a source of confusion: i) the issue of what is actually stated by this conjecture, ii) the issue of annotation, and iii) the issue of time.<br />
<br />
i) What is the "ortholog conjecture"?<br />
Or, in other terms, when should we expect orthologs to be more likely to share function than paralogs? Always? Of course not. All of us would agree that two recently duplicated paralogs are likely to be more similar in function than two distant orthologs, so it is obvious that the conjecture is not simply "orthologs are more similar in function than paralogs". In reality, the expectation that orthologs are more likely to be similar in function than paralogs, at least as I interpret it, is directly related to the effect that duplication has on functional divergence. If gene duplication has some effect on functional divergence (even if not in 100% of the cases), then, all other things being equal (divergence time, history of speciation/duplication events, except for the duplication defining the paralogs), one would expect orthologs to be more likely to conserve function. <br />
<br />
I think this complexity is not well considered (by many authors, in general). Hahn refers to the famous review of orthology by Koonin (2005) as the source for the term "ortholog conjecture". However, in that paper the conjecture is always discussed within the context of genes across two particular species, whereas in Hahn's paper it is extended to other contexts as well. Thus, the proper context in which to test this conjecture is only between orthologs and between-species paralogs. As we can see, the red and purple lines in Figure 2 of Hahn's paper do not show any clear difference. <br />
<br />
Secondly, Koonin was very cautious in his paper, stating that he was referring to "equivalent functions" and not exactly the same "function", correctly implying that the functional contexts would be different in the two different species. This brings me to the next point.<br />
<br />
ii) annotation<br />
If the expectation of functional conservation of orthologs refers to a given pair of species, then it makes no sense to test that expectation between paralogs within the same species and orthologs in different species. We were interested in this issue and it took us some effort to control for this "species" influence on the comparison. If you are interested, you can read our paper on divergence of expression profiles between orthologs and paralogs (<a href="http://www.ncbi.nlm.nih.gov/pubmed/21515902">http://www.ncbi.nlm.nih.gov/pubmed/21515902</a>).<br />
<br />
As Hahn finds, and as was anticipated by Koonin in that review, there is a huge influence of the "species context", which strongly constrains what fraction of the function is shared. Indeed, I think it is the dominant signal in Hahn's paper. Why is that? One possibility is that the functional context determines the function, and I agree. However, we should not discard biases in how the different communities working around a model species define processes and function, and in the types of experiments that are usually done. For instance, experimental inference from KO mutants might be common in mouse, but I guess this is not the case in humans (!!). I think this may be having a big influence and might even be the dominant signal in Hahn's paper. <br />
<br />
Finally, function has many levels and I expect subfunctionalization to mostly affect the lower (i.e. more specific) levels. Biases may also exist in the level of annotation between species or between families of different size (contributing more or less to the orthologs/paralogs class).<br />
<br />
Microarray data are less likely to be subject to biases (although some may exist); at least they should be expected to be free of "human interpretation biases", and so Hahn and colleagues did well, in my opinion, to test that dataset. It is important to note that for microarrays, and for orthologs and between-species paralogs (which I think is the right frame for testing the conjecture), orthologs are more likely to share an expression context. This is compatible with what we found in the paper mentioned above, and compatible with the ortholog conjecture as stated by Koonin (across species).<br />
<br />
<br />
iii) time<br />
Finally, one aspect which I think is fundamental is the notion of "divergence time". Since paralogs can emerge at different time-scales, they comprise a heterogeneous set of protein pairs. Most comparisons of orthologs and paralogs (Hahn's as well) use sequence divergence as a proxy for time. However, this is only a poor estimate, especially when duplications (as here) are involved (we explored this issue in the past: <a href="http://www.ncbi.nlm.nih.gov/pubmed/21075746">http://www.ncbi.nlm.nih.gov/pubmed/21075746</a>). This means that, for a given divergence time, paralogs may have larger sequence divergence than orthologs, or the opposite (if gene conversion is playing a role). Is the conjecture based on sequence divergence or on divergence time? I think the original rationale for using orthology to annotate across species is based on the notion of comparing things at the same evolutionary distance. Thus basing our conclusions on divergence times might not be the proper way of doing it. <br />
<br />
CONCLUSIONS AND PROPOSAL FOR RE-STATEMENT<br />
<br />
To conclude, and with the intention of going beyond this particular paper, I would finish by saying that the key to the problem lies in how we interpret the so-called "ortholog conjecture", that is, in what our expectations are about how function evolves. What I get from re-reading Eugene Koonin's paper, and how I am using that "assumption" in my day-to-day work, is the following: <br />
<br />
"Orthologs in two given species are more likely to share equivalent functions than paralogs between these two species"<br />
<br />
Therefore the notion of "across the same pair of species" is important, and thus only part of the comparisons made by Hahn and colleagues could directly test this. Looking at the microarray and between-species comparison data, the conjecture may even hold true!!<br />
<br />
I do think, however, that the conjecture as stated above is limited and does not capture the complexity of orthology relationships. Indeed we, and many other researchers, tune the confidence of orthology-based annotation depending on whether the orthologs are one-to-one, one-to-many or many-to-many, and even on whether they are "super-orthologs" (with no duplication event in the lineages separating the two orthologs). <br />
<br />
Since the underlying assumption of the ortholog conjecture is that duplication may (not necessarily always) promote functional shifts, many-to-many orthology relationships will tend to include orthologous pairs with different functions.<br />
<br />
Thus I would re-state the conjecture (or expectation) as follows:<br />
<br />
"In the absence of additional duplication events in the lineages separating them, two orthologous genes from two given species are more likely to share equivalent functions than two paralogs between these two species"<br />
<br />
This would be a more conservative expectation, which is closer to the current use of orthology-based annotation that tends to identify one-to-one orthologs, rather than any type. <br />
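<br />
As an illustration only, here is a minimal Python sketch of how orthologous pairs between two species could be sorted into the one-to-one, one-to-many and many-to-many classes mentioned above. The function name `classify_orthology` and the gene names are hypothetical, and the sketch assumes the input already lists every predicted orthologous pair between the two species.<br />

```python
from collections import defaultdict


def classify_orthology(pairs):
    """Label each orthologous pair by relationship type.

    `pairs` is an iterable of (gene_in_species_a, gene_in_species_b)
    tuples. Assumption: the list is complete, i.e. every co-ortholog
    of a gene appears in some pair.
    """
    pairs = list(pairs)
    partners_of_a = defaultdict(set)  # species-A gene -> its orthologs in B
    partners_of_b = defaultdict(set)  # species-B gene -> its orthologs in A
    for a, b in pairs:
        partners_of_a[a].add(b)
        partners_of_b[b].add(a)

    labels = {}
    for a, b in pairs:
        n_in_b = len(partners_of_a[a])
        n_in_a = len(partners_of_b[b])
        if n_in_a == 1 and n_in_b == 1:
            labels[(a, b)] = "one-to-one"
        elif n_in_a == 1 or n_in_b == 1:
            labels[(a, b)] = "one-to-many"
        else:
            labels[(a, b)] = "many-to-many"
    return labels


# Hypothetical human (h*) and mouse (m*) genes, for illustration only.
example_pairs = [
    ("hA", "mA"),
    ("hB", "mB1"), ("hB", "mB2"),
    ("hC1", "mC1"), ("hC1", "mC2"), ("hC2", "mC1"), ("hC2", "mC2"),
]
labels = classify_orthology(example_pairs)
```

Under the restated expectation, only the pairs labelled "one-to-one" would be treated as the most reliable candidates for transferring annotation.<br />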
<br />
When duplications start appearing in subsequent lineages, thus creating one-to-many or many-to-many orthology relationships, the situation is less clear. Following the assumption that duplications may promote functional divergence, one could expand the conjecture as follows: "the more duplications in the evolutionary history separating two genes, the lower the expectation that these two genes would share equivalent functions".<br />
<br />
I wrote this contribution on the fly, and surely there are ways of expressing this in more appropriate terms. In any case, I hope I made clear the idea that the conjecture emerges from the notion of duplications causing functional shifts, and that our expectations will be clearer if expressed in those terms. This goes along the lines of what Jonathan Eisen mentioned on considering the whole phylogenetic history to annotate genes. <br />
<br />
From this perspective, the really important hypothesis is that "duplications tend to promote functional shifts". I think this is based on solid grounds and has been tested intensively in the past. <br />
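<br />
The expanded statement can be made concrete with a small sketch, offered only as an illustration under stated assumptions. It assumes a gene tree whose internal nodes are already labelled as speciation ("S") or duplication ("D") events, stored as nested dictionaries with leaves as gene names. The helper names `path_to_leaf` and `duplications_between`, the tree layout and the gene names are all hypothetical.<br />

```python
def path_to_leaf(node, leaf, path=()):
    """Return the internal nodes from `node` down to `leaf`, or None."""
    if isinstance(node, str):  # leaves are plain gene names
        return path if node == leaf else None
    for child in node["children"]:
        found = path_to_leaf(child, leaf, path + (node,))
        if found is not None:
            return found
    return None


def duplications_between(tree, gene_x, gene_y):
    """Return (relation, n_dup) for two different genes in the tree.

    relation is "orthologs" if their last common ancestor node is a
    speciation and "paralogs" if it is a duplication; n_dup counts the
    duplication nodes on the two lineages below that ancestor.
    """
    path_x = path_to_leaf(tree, gene_x)
    path_y = path_to_leaf(tree, gene_y)
    if path_x is None or path_y is None:
        raise ValueError("gene not found in tree")
    shared = 0  # number of ancestors the two genes have in common
    for node_x, node_y in zip(path_x, path_y):
        if node_x is not node_y:
            break
        shared += 1
    last_common = path_x[shared - 1]
    below = path_x[shared:] + path_y[shared:]
    n_dup = sum(1 for node in below if node["event"] == "D")
    relation = "orthologs" if last_common["event"] == "S" else "paralogs"
    return relation, n_dup


# Hypothetical tree: an old duplication, then a human/mouse speciation in
# each copy, then a mouse-specific duplication in the second copy.
tree = {"event": "D", "children": [
    {"event": "S", "children": ["h1", "m1"]},
    {"event": "S", "children": [
        "h2",
        {"event": "D", "children": ["m2a", "m2b"]},
    ]},
]}

result_simple = duplications_between(tree, "h1", "m1")
result_one_to_many = duplications_between(tree, "h2", "m2a")
result_paralogs = duplications_between(tree, "h1", "m2a")
```

In this toy tree, h1 and m1 are orthologs with no intervening duplication, whereas h2 and m2a are orthologs separated by one later duplication, so the expanded expectation would give less confidence to the second pair.<br />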
<br />
Cheers,<br />
<br />
Toni Gabaldón<br />
<br />
http://treevolution.blogspot.comTonihttp://www.blogger.com/profile/03001880189572009935noreply@blogger.com0tag:blogger.com,1999:blog-4270702979950789999.post-75329679560175336372011-09-14T07:48:00.000-07:002011-09-14T07:49:49.702-07:00CRG Symposium: Computational Biology of Molecular Sequences. 10-11 November<div class="separator" style="clear: both; text-align: center;"><br />
</div><div class="separator" style="clear: both; text-align: justify;">Registration is open for the <a href="http://pasteur.crg.es/portal/page/portal/Internet/04_EVENTS/HIDE-EVENTS/A4B6730881D36EA2E04012AC0E01759B">CRG symposium</a> organized by our <a href="http://big.crg.es/">Bioinformatics and Genomics programme</a>. This meeting will host internationally renowned scientists in the Bioinformatics field. Just to cite some: Smith, Tramontano, Ponting, Sankoff, Koonin, Bairoch, Brunak... Below you'll find the symposium overview and the complete list of speakers. </div><div class="separator" style="clear: both; text-align: justify;"><br />
</div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjLKym4QKLH45FozFHUcmzt5AVDo6gWWtilTNas2b3XEiWBug84BsUkUcRDbSONHuZjAdjQTqFa1whnvzCAKnHPaqyRQvd9_uU9sT2dAqc-amhVKS_ns1-LQHrZU_MJol1i1tY0ljOWiIE/s1600/CRGsymp.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="143" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjLKym4QKLH45FozFHUcmzt5AVDo6gWWtilTNas2b3XEiWBug84BsUkUcRDbSONHuZjAdjQTqFa1whnvzCAKnHPaqyRQvd9_uU9sT2dAqc-amhVKS_ns1-LQHrZU_MJol1i1tY0ljOWiIE/s200/CRGsymp.png" width="200" /> </a></div><div class="separator" style="clear: both; text-align: center;"><br />
</div><div style="text-align: justify;">Advances in methods to sequence nucleic acids, coupled with more general advances in automation, robotization, and multiplexing, have resulted in the capacity to survey the phenomena of life in a global manner and with unprecedented resolution. As a result, Biology, traditionally an analytic science in which the natural world is dissected into its elemental components in order to be comprehended, is becoming a synthetic science, in which the phenomena of life are approached in a more systemic way. In parallel, Biology, a science in which human effort has been directed until very recently towards data acquisition, is increasingly becoming a discipline in which data is obtained with almost no human intervention, and the effort is being directed towards data analysis. Computational systems to store, analyze and model biological data have thus become an essential part of research in Biology. The connection between Biology and Computation, however, runs much deeper, as we are coming to realize that the unfolding of the instructions in the genome is, <i>stricto sensu</i>, a computation on the DNA sequence. Biology, thus, cannot be understood without Computation. The two-day CRG symposium on <b>“Computational Biology of Molecular Sequences” </b>will bring together renowned Computational Biologists from around the world, including both pioneers in the field, as well as promising young scientists. Presentations, discussions and dialogue during the Symposium will contribute to survey the status of a discipline that, at the intersection of Biology and Computation, will have an enormous impact on the world of the XXIst century.</div><b>Confirmed Speakers</b><br />
<i><b>Amos BAIROCH</b></i> Swiss Institute of Bioinformatics (SIB) and University Geneva, Geneva CH<br />
<i><b>Mathieu BLANCHETTE</b></i><b> </b>McGill University, Montréal CA<br />
<i><b>Søren BRUNAK</b></i><b> </b>Technical University of Denmark, Kongens Lyngby DK<br />
<i><b>Philipp BUCHER</b></i><b> </b>Swiss Institute for Experimental Cancer Research (ISREC), Lausanne CH<br />
<i><b>Brendan FREY</b></i> University of Toronto, Toronto CA<br />
<i><b>Mark GERSTEIN</b></i><b> </b>Yale University, New Haven US<br />
<b><i>Nick GOLDMAN </i></b>European Bioinformatics Institute, Hinxton UK<br />
<i><b>Tim HUBBARD</b></i><b> </b>Wellcome Trust Sanger Institute, Hinxton UK<br />
<i><b>Eugene V. KOONIN</b></i> National Center for Biotechnology Information, Bethesda US<br />
<i><b>Gene MYERS</b></i> Janelia Farm Research Campus, Ashburn US<br />
<i><b>Chris PONTING</b></i><b> </b>University of Oxford, Oxford UK<br />
<i><b>David SANKOFF</b></i> University of Ottawa, Ottawa CA<br />
<i><b>Ron SHAMIR</b></i><b> </b>Tel-Aviv University, Tel-Aviv IL<br />
<i><b>Temple F. SMITH</b></i><b> </b>BioMolecular Engineering Resource Center, Boston US<br />
<i><b>Terry SPEED</b></i> Walter &amp; Eliza Hall Institute of Medical Research, Parkville AU<br />
<i><b>Peter STADLER</b></i><b> </b>Universität Leipzig, Leipzig DE<br />
<span style="font-style: italic;"><span style="font-weight: bold;">Gary STORMO</span></span><b> </b>Washington University School of Medicine, Saint Louis US<br />
<i><b>Ana TRAMONTANO</b></i> Sapienza University, Rome IT<br />
<i><b>Michele VENDRUSCOLO</b></i><b> </b>University of Cambridge, Cambridge UK<br />
<i><b>Martin VINGRON</b></i><b> </b>Max Planck Institute for Molecular Genetics, Berlin DE<br />
<br />
<br />
<div class="separator" style="clear: both; text-align: center;"><br />
</div><div style="text-align: center;"></div>Tonihttp://www.blogger.com/profile/03001880189572009935noreply@blogger.com2