Massimo Delledonne

Agrigenomics

De novo assembly and annotation of complex plant and animal genomes

De novo genome assembly plays a key role in computational biology, because a contiguous and accurate genome reconstruction is the starting point for subsequent functional analyses. Long-read platforms, complemented by short reads for polishing residual errors in the assembled contigs, have made genome assembly feasible for non-model organisms at reasonable cost. However, long repetitive or duplicated regions still limit the assembly to contigs and require scaffolding with an additional long-range platform to increase contiguity, possibly reaching chromosome-level assemblies. De novo genome assembly is therefore still a step-by-step process that requires the iterative integration of additional data layers. We start from PacBio HiFi or Oxford Nanopore (ONT) long reads to produce the contigs, then we use Illumina reads for polishing. Despite the improving quality of long reads, we still consider polishing a necessary step to remove InDels, which are the major cause of imprecise gene annotation because they introduce frameshift errors. Then, we use Hi-C technology and/or optical (Bionano Genomics) and electronic (Nabsys) maps for scaffolding.
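Why a residual InDel matters for annotation can be shown with a toy example. The sketch below is only an illustration, not part of our pipeline: the sequence is invented, and `translate` is a hypothetical helper that reads codons with the standard genetic code. Deleting a single base shifts the reading frame, so every downstream codon changes and the original stop codon is lost.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon, stopping at the first stop codon.

    Hypothetical helper for illustration; trailing bases that do not
    form a complete codon are ignored.
    """
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Invented toy coding sequence: ATG GCT GAA ACC TTG TAA
reference = "ATGGCTGAAACCTTGTAA"

# Simulate an uncorrected 1-bp deletion (the C at 0-based position 4).
with_deletion = reference[:4] + reference[5:]

print(translate(reference))      # MAETL
print(translate(with_deletion))  # MVKPC
```

Only the first amino acid survives the deletion, which is why a gene model predicted on an unpolished contig can be truncated, extended, or missed entirely.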

Pan-genomes

The pan-genome represents the entire set of genes within a species. It consists of a core genome, containing sequences shared by all individuals of the species, and the ‘dispensable’ genome, containing sequences found in only some individuals. The idea of a pan-genome was first conceived for bacterial species in 2005, when the genomes of six strains of Streptococcus agalactiae were sequenced, revealing a core genome containing 80% of S. agalactiae genes.
By de novo assembly of RNA-seq data, in 2013 we reported that the high polyphenol content of the grapevine cultivar Tannat is conferred primarily by genes that are not shared with the reference genome, paving the way for pan-genome studies in plants and contributing to the replacement of the term ‘dispensable’ with ‘accessory’. The discovery that plant varieties/ecotypes can be characterised by sets of proprietary genes, and not only by a proprietary combination of different alleles of the same set of genes, required a tremendous effort. Today, putting together a pan-genome for complex genomes is facilitated by improvements in genome sequencing technologies, particularly long-read sequencing. In collaboration with Roberto Papa, we are currently constructing the bean pan-genome.
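The partition into core and accessory genes can be sketched with plain set operations. This is a minimal illustration under simplifying assumptions: each genome is reduced to a set of gene identifiers, the accession and gene names are invented, and `partition_pan_genome` is a hypothetical helper. Real pan-genome construction first has to decide which genes in different assemblies are the same gene, which this sketch takes as given.

```python
def partition_pan_genome(gene_sets: dict[str, set[str]]):
    """Split the genes of several genomes into pan, core and accessory sets.

    gene_sets maps a genome name to the set of gene identifiers found in it.
    Hypothetical helper for illustration only.
    """
    genomes = list(gene_sets.values())
    pan = set().union(*genomes)        # every gene seen in at least one genome
    core = set.intersection(*genomes)  # genes present in all genomes
    accessory = pan - core             # genes missing from at least one genome
    return pan, core, accessory


# Invented presence/absence data for three hypothetical accessions.
gene_sets = {
    "accession_1": {"geneA", "geneB", "geneC"},
    "accession_2": {"geneA", "geneB", "geneD"},
    "accession_3": {"geneA", "geneB", "geneC", "geneE"},
}

pan, core, accessory = partition_pan_genome(gene_sets)
print(sorted(core))       # ['geneA', 'geneB']
print(sorted(accessory))  # ['geneC', 'geneD', 'geneE']
```

In this toy data set the core genome holds two of the five genes in the pan-genome, and the remaining three are accessory because each is absent from at least one accession.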

15 selected publications
