ANR Evoluthon

ANR Evoluthon: Artificial life as a benchmark for molecular evolutionary studies

[2019-2023]

Context

Methods in molecular evolution, genomics and phylogenetics are applied widely across the biological sciences. For example, they are used for uncovering the functional importance of genes in species of interest (Liu et al, 2015), predicting the seasonal viral strains against which a vaccine needs to be developed (Luksza et al, 2014), understanding human migrations on Earth (Slatkin et al, 2016), managing agro-systems (Thrall et al, 2011), annotating medically relevant variants (Cooper and Shendure, 2011), or even in judicial inquiries (Scaduto et al, 2010).

Because these methods perform inferences of a historical nature, they face a validation issue: it is not possible to travel back in time and verify hypotheses and predictions that concern events up to 4 billion years old. One possible experimental validation is to evolve organisms in the lab (Randall et al, 2016), but such experiments are short-term, costly, and have never led to instances able to discriminate among different methods. Cross-validation can be performed by comparing with the fossil record (Szöllősi et al, 2012; Romiguier et al, 2012) or with ancient DNA (Duchemin et al, 2015), but samples are rare, in particular in the microbial world, and ancient DNA is not preserved beyond one million years. Predicted ancestral proteins can be synthesized to verify that they are still functional (Groussin et al, 2017), but even simplistic methods with known shortcomings seem to produce functional ancestral proteins. Theoretical considerations about models and methods can also help to choose among competing approaches (e.g. statistical consistency, computational complexity), and there are ways to assess the robustness of results (e.g. by data re-sampling), but these say nothing about the validity of the underlying modeling choices (Felsenstein, 2003).

Throughout the scientific literature, the most popular validation approach remains computer simulation. Genome evolution can be simulated in silico over many more generations than in experimental evolution, and at a much lower cost. The results of the simulations can then be used as input instances for inference methods.
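
As a purely illustrative sketch of this validation workflow (the tree representation, function name and toy trees below are our own assumptions, not part of any existing tool), the true phylogeny produced by a simulation can serve as ground truth against which an inferred tree is scored, here with a Robinson-Foulds-style count of disagreeing clades:

```python
# Minimal sketch of simulation-based validation (illustrative only): the
# simulator's true phylogeny is the ground truth, and an inferred tree is
# scored against it. Trees are reduced to sets of clades (frozensets of
# single-character leaf names) so the example stays self-contained.
def rf_distance(true_clades, inferred_clades):
    """Robinson-Foulds-style distance: clades found in one tree but not the other."""
    return len(true_clades ^ inferred_clades)

# Hypothetical four-leaf example: the simulation produced ((A,B),(C,D)),
# while the method under test inferred ((A,C),(B,D)).
true_tree = {frozenset("AB"), frozenset("CD")}
inferred_tree = {frozenset("AC"), frozenset("BD")}

print(rf_distance(true_tree, inferred_tree))  # -> 4: both internal splits disagree
```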

Performing simulations for validation requires epistemological and organizational thinking. Indeed, an individual method is very often tested with an ad hoc simulation, i.e. a simulation designed specifically to test it. In that situation, some elements of the method are inevitably integrated in the simulator, which is then likely to generate only easy instances for this method and has no chance of reaching the complexity of real data. Even when simulations are based on general software that was not designed for a specific study (Dalquen et al, 2011; Sjostrand et al, 2013; Arenas et al, 2014; Mallo et al, 2015; Edgar et al, 2018), some important underlying principles remain shared between simulation and inference methods, simply because they are widely accepted (and often implicit) in the bioinformatics community. These principles are the “Natural Interpretations” of the community (Feyerabend, 1975): for example, genes are considered as evolutionary units and intragenic rearrangements are neglected; simulations are performed at the inter-specific level and ignore population-level processes where mutation, drift and selection occur; sites evolve independently of each other and of higher-level (e.g. structural) constraints; extinct or unsampled species are ignored and not simulated. In such a situation, methods are only tested in a world designed for them, which does not assess their performance in the real world.
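
To make one of these implicit assumptions concrete, the following minimal sketch shows the kind of site-independent substitution simulation alluded to above: each site evolves on its own under a Jukes-Cantor-like change probability, ignoring structural constraints, intragenic rearrangements and population-level processes. The function and parameter names are illustrative assumptions, not components of any cited simulator.

```python
import math
import random

NUCLEOTIDES = "ACGT"

def evolve_sites_independently(sequence, branch_length):
    """Each site mutates independently of all others, Jukes-Cantor style,
    ignoring structural constraints, rearrangements and population processes."""
    # Jukes-Cantor probability that a site differs after `branch_length`
    # expected substitutions: 3/4 * (1 - exp(-4d/3)).
    p_change = 0.75 * (1.0 - math.exp(-4.0 * branch_length / 3.0))
    descendant = []
    for base in sequence:
        if random.random() < p_change:
            descendant.append(random.choice([n for n in NUCLEOTIDES if n != base]))
        else:
            descendant.append(base)
    return "".join(descendant)

ancestor = "".join(random.choice(NUCLEOTIDES) for _ in range(40))
print(evolve_sites_independently(ancestor, branch_length=0.2))
```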

There is a need for a cooperative effort to organize and standardize benchmarks, as acknowledged for example by the addition of a section dedicated to benchmarking in PLoS Computational Biology, or by the special issue of Genome Biology on benchmarking studies due in 2019.

Proposition

We propose an original, principled way of benchmarking models and methods for molecular evolution studies with computer simulations. We are inspired by the “double-blind” principle that governs test studies in science in general, as well as some software development practices (Pugh, 2011), where development and test teams are separated and work independently. The principles are the following:

  1. Inference methods and simulated benchmarks should not be built by the same team. Moreover, the benchmark and inference teams, while sharing a common biological culture, should be “methodologically blind” to each other, meaning that principles from inference methods should not be included in simulations and, conversely, that principles specific to simulations should not be used by inference methods. To this end, the simulations and the inference methods should be produced by teams belonging to different scientific communities.
  2. The simulated benchmarks are produced by a model which has not been designed to be used as a benchmarking tool. While this may seem hardly feasible and somewhat contradictory, we argue that it is the way to approach the double-blind principle, and that it is possible for molecular evolution because of the existence of disjoint scientific communities around the modeling of genome evolution.
  3. As much as possible, processes, and not patterns, should be simulated. This means that instead of tuning parameters to resemble empirical data in some arbitrary sense, we should uncover the processes that produce these empirical data and implement them in a mechanistic model (a minimal illustration is sketched after this list). Although it is desirable to produce simulated data that look like empirical data, the definitions of the similarity measures can themselves be ad hoc design choices, dependent on a particular inference method.
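
As announced in point 3, here is a minimal illustration of simulating a process rather than a pattern: a toy Wright-Fisher population with explicit mutation, selection and drift, in which the observed pattern (an allele-frequency trajectory) emerges from the process instead of being tuned to match data. All parameter values are arbitrary, and the code is not part of the Beagle platforms.

```python
import random

def wright_fisher(pop_size=200, generations=500, mutation_rate=1e-3, s=0.02):
    """Toy Wright-Fisher model: the allele-frequency pattern emerges from the
    explicit processes of mutation, selection (advantage s) and drift."""
    freq = 0.0  # frequency of the beneficial allele
    trajectory = []
    for _ in range(generations):
        # selection: the beneficial allele is sampled in proportion to its fitness
        mean_fitness = freq * (1.0 + s) + (1.0 - freq)
        p = freq * (1.0 + s) / mean_fitness
        # mutation: wild-type copies mutate to the beneficial allele
        p = p + (1.0 - p) * mutation_rate
        # drift: binomial resampling of the next generation
        freq = sum(random.random() < p for _ in range(pop_size)) / pop_size
        trajectory.append(freq)
    return trajectory

print(wright_fisher()[-1])  # final frequency: an emergent pattern, not a tuned one
```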

We will implement these principles by gathering two teams with two different backgrounds and by organizing an original mode of collaboration between them. The first is the Inria “Beagle” team, specialized in in silico experimental evolution, and the second is the CNRS “Cocoon” team from the Biometry and Evolutionary Biology Lab (LBBE) of the University of Lyon, specialized in molecular evolution inference methods. The scientific background of Beagle is bio-inspired computer science: complex systems, genetic algorithms, genetic programming and multi-agent models. The scientific background of Cocoon is molecular biology, evolutionary biology and bioinformatics. The Beagle team has developed several in silico evolutionary platforms. Importantly, they were not devised to be used as benchmarks for inference methods, which paradoxically creates an ideal situation for using them for this purpose.

The two teams will work in close collaboration, exchanging results, expectations and challenges, but, importantly, not exchanging ideas on computational models. The Beagle team will construct a benchmark useful for a large variety of bioinformatics models and methods, including clustering of genes into homologous families, orthologous gene detection, reconstruction of multiple alignments, phylogenies, ancestral genomes and demographic history, and detection of selection, adaptation and convergent genomic evolution. The Cocoon team will organize the application of state-of-the-art methods to these benchmarks, by inviting scientific labs around the world to participate in a benchmarking challenge. It will also propose improvements based on the results, particularly in its area of expertise: multi-scale interactions in evolution, namely the nano-scale of genes, the micro-scale of microbes, the visible scale of animals and plants, and the macro-scale of the global environment. Indeed, the team has a renowned record in integrative phylogenetics at several scales, and part of the benchmarking activity will tend towards modeling the processes of such interactions.
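
As a hypothetical example of how such tasks could be scored against the simulated ground truth (this scoring scheme is an assumption for illustration, not a specification of the benchmark), an inferred clustering of genes into homologous families can be compared to the true families by pairwise precision and recall:

```python
from itertools import combinations

def same_family_pairs(families):
    """All unordered gene pairs placed in the same family."""
    return {frozenset(pair) for fam in families for pair in combinations(sorted(fam), 2)}

def pairwise_precision_recall(true_families, inferred_families):
    """Fraction of inferred co-family pairs that are true (precision) and of
    true co-family pairs that are recovered (recall)."""
    true_pairs = same_family_pairs(true_families)
    inferred_pairs = same_family_pairs(inferred_families)
    true_positives = len(true_pairs & inferred_pairs)
    return true_positives / len(inferred_pairs), true_positives / len(true_pairs)

# Hypothetical example: families known from the simulation vs. an inferred clustering.
true_fams = [{"g1", "g2", "g3"}, {"g4", "g5"}]
inferred_fams = [{"g1", "g2"}, {"g3", "g4", "g5"}]
print(pairwise_precision_recall(true_fams, inferred_fams))  # -> (0.5, 0.5)
```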

The results of this benchmarking will go far beyond the two teams involved. Indeed, we will maintain open access to all data and expect the benchmark to become an international reference for validating models and methods. We will organize an “Evoluthon” contest to compare different methods. All teams in molecular evolution will be welcome to submit their methods, and we hope the benchmark will gain the reputation of a standard for any new method that can be tested with simulations.

Jonathan Rouzaud-Cornabas
Associate Professor of Computer Science
