Agrigenomics

De novo assembly and annotation of complex plant and animal genomes

De novo genome assembly plays a key role in computational biology, because a contiguous and accurate genome reconstruction is the starting point for subsequent functional analyses. Long-read platforms, complemented by short reads for polishing residual errors in the assembled contigs, have made genome assembly feasible at reasonable cost even for non-model organisms. However, long repetitive or duplicated regions still limit the assembly to contigs, so scaffolding with an additional long-range platform is required to increase contiguity, possibly up to chromosome-level assemblies. De novo genome assembly therefore remains a step-by-step process that requires the iterative integration of additional data layers. We start from PacBio HiFi or Oxford Nanopore (ONT) long reads to produce the contigs, then we use Illumina reads for polishing. Despite the improving quality of long reads, we still consider polishing a necessary step to remove InDels, which are the major cause of imprecise gene annotation because they introduce frameshift errors. Finally, we use Hi-C technology and/or optical (Bionano Genomics) and electronic (Nabsys) maps for scaffolding.
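
To illustrate why a residual InDel matters for gene annotation, the following minimal Python sketch translates a toy coding sequence before and after a single-base deletion. The sequence and the helper names (`translate`, `delete_base`) are hypothetical and chosen only for illustration; this is not part of our assembly or annotation pipeline.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds):
    """Translate a coding sequence codon by codon ('*' marks a stop codon).

    Trailing bases that do not fill a complete codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def delete_base(seq, position):
    """Return seq with the base at the given 0-based position removed."""
    return seq[:position] + seq[position + 1:]


# Hypothetical toy coding sequence (not taken from any real genome).
true_cds = "ATGGCCAAATAA"
# Simulate an unpolished 1-bp deletion in the assembled contig.
assembled_cds = delete_base(true_cds, 4)

print(translate(true_cds))       # MAK*
print(translate(assembled_cds))  # MAN
```

In this toy case a single missing base changes every codon downstream of the deletion and removes the stop codon, which is the kind of frameshift that misleads gene prediction.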

Pan-genomes

The pan-genome represents the entire set of genes within a species, consisting of a core genome - containing sequences shared between all individuals of the species - and the ‘dispensable’ genome. The idea of a pan-genome was first conceived for bacterial species in 2005, when the genomes of six strains of Streptococcus agalactiae were sequenced, revealing a core genome containing 80% of S. agalactiae genes.
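
The split between core and dispensable genes can be sketched in a few lines of Python, assuming the gene content of each individual is already available as a set of gene identifiers. The accession names, gene identifiers, and the function `partition_pan_genome` are hypothetical; a real analysis would first have to define gene presence or absence from assemblies and annotations.

```python
def partition_pan_genome(gene_sets):
    """Split gene content into pan-genome, core genome and dispensable genome.

    gene_sets maps each individual (accession, strain, cultivar) to the set
    of gene identifiers found in it.
    """
    if not gene_sets:
        raise ValueError("at least one individual is required")
    sets = [set(genes) for genes in gene_sets.values()]
    pan = set().union(*sets)         # every gene seen in any individual
    core = set.intersection(*sets)   # genes shared by all individuals
    dispensable = pan - core         # genes missing from at least one individual
    return pan, core, dispensable


# Hypothetical toy data: three accessions with made-up gene identifiers.
accessions = {
    "accession_A": {"g1", "g2", "g3", "g4"},
    "accession_B": {"g1", "g2", "g3", "g5"},
    "accession_C": {"g1", "g2", "g6"},
}

pan, core, dispensable = partition_pan_genome(accessions)
print(sorted(core))         # ['g1', 'g2']
print(sorted(dispensable))  # ['g3', 'g4', 'g5', 'g6']
```

Note that the core set can only shrink, and the pan-genome can only grow, as more individuals are added.
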
By de novo assembly of RNA-seq data, in 2013 we reported that the high polyphenol content of the grapevine cultivar Tannat is conferred primarily by genes that are not shared with the reference genome, paving the way to pan-genome studies in plants and helping to replace the term ‘dispensable’ with ‘accessory’. The discovery that plant varieties/ecotypes can be characterised by sets of proprietary genes, and not only by a proprietary combination of different alleles of the same set of genes, required a tremendous effort. Today, assembling a pan-genome for complex genomes is facilitated by improvements in genome sequencing technologies, particularly long-read sequencing. In collaboration with Roberto Papa, we are currently constructing the bean pan-genome.

15 selected publications

The French–Italian Public Consortium for Grapevine Genome Characterization. (2007). The grapevine genome sequence suggests ancestral hexaploidization in major angiosperm phyla. Nature 449: 463-467
Denoeud F. et al (2008). Annotating genomes with massive-scale RNA-sequencing. Genome Biology 9: R175
Zenoni S. et al (2010). Characterization of transcriptional complexity during berry development in Vitis vinifera using RNA-Seq. Plant Physiol 152: 1787-1795
Fasoli M. et al (2012). The grapevine expression atlas reveals a deep transcriptome shift driving the entire plant into a maturation program. Plant Cell 24: 3489-3505
Dal Santo S. et al (2013). The plasticity of the grapevine berry transcriptome. Genome Biol. 14: R54
Da Silva C. et al (2013). The high polyphenol content of Vitis vinifera cv. Tannat berries is conferred mostly by genes that are not shared with the reference genome. Plant Cell 25:4777-478
Bellucci E. et al (2014). Decreased Nucleotide and Expression Diversity and Modified Coexpression Patterns Characterize Domestication in the Common Bean. Plant Cell 26: 1901–1912
Venturini L. et al (2015). Symbiotic plant-fungi interactions stripped down to the root. Nature Genetics 47: 309-310
Aversano R. et al (2015). The Solanum commersonii Genome Sequence Provides Insights into Adaptation to Stress Conditions and Genome Evolution of Wild Potato Relatives. Plant Cell 27: 954–968
Bombarely A. et al (2016). Insight into the evolution of the Solanaceae from the parental genomes of Petunia hybrida. Nature Plants. 2: 16074
Chin C.S. et al (2016). Phased diploid genome assembly with single-molecule real-time sequencing. Nat Methods. 13: 1050-1054.
Rau D et al (2018). Genomic dissection of pod shattering in common bean: mutations at nonorthologous loci at the basis of convergent phenotypic evolution under domestication of leguminous species. Plant J. 2018 Nov 13. doi: 10.1111/tpj.14155.
Cecchin M et al (2019). Chlorella vulgaris genome assembly and annotation reveals the molecular basis for metabolic acclimation to high light conditions. Plant J. 100(6):1289-1305
Massonnet M. et al (2020). The genetic basis of sex determination in grapes. Nat Commun. 2020 Jun 9;11(1):2902. doi: 10.1038/s41467-020-16700-z.
Bellucci E. et al (2021). The INCREASE project: Intelligent Collections of food-legume genetic resources for European agrofood systems. Plant J. doi: 10.1111/tpj.15472.