Scholarship code CU5.290

Genomic tools for DNA-informed breeding for fruit crops resilience

  • Reference person
  • Host University/Institute
    Fondazione Edmund Mach
  • Internship
  • Research Keywords
    DNA-informed breeding
    Haplotype-resolved genomic sequence
    Breeding information management tools
  • Reference ERCs
  • Reference SDGs
    GOAL 2: Zero Hunger
    GOAL 3: Good Health and Well-being
    GOAL 13: Climate Action


Climate change is affecting crop production, and both industry and consumers continually demand new and improved cultivars. Genome-informed breeding can provide effective solutions to market demands and production challenges. The transition from ‘classical’ breeding to DNA-informed breeding is often hampered by the limited availability of the genomic tools and information on which this enhanced breeding process necessarily rests. At the Fondazione Edmund Mach (FEM) in San Michele all’Adige (Trento, Italy), large germplasm collections (particularly of grapevine, apple and small fruits) are maintained; these could serve as a source of superior alleles to be introgressed into the breeding material (parental lines and selections) in response to abiotic or biotic stresses and to climate change. At the same time, several genomic tools, particularly simple sequence repeat (SSR) assays and single-nucleotide polymorphism (SNP) arrays, have been developed to assess the genetic potential of the available material and correlate it with the phenotypic traits of interest. Although these tools alone provide very useful information, they do not give a comprehensive picture of the allelic state of each trait of interest, and additional analyses are needed to unravel the complex mosaic structure of elite materials. All this information is essential to bring genome-informed breeding to the next level, but exploiting its full potential requires high-quality, haplotype-resolved genome assemblies, at least for the main founders of the breeding programs. Once produced, this allele-specific information must be integrated with all the genotypic and phenotypic data available for the breeding programs, and a breeding-information system must be built to accommodate all this knowledge.
The aim of this Ph.D. project is twofold.
On the one hand, the project aims to obtain high-quality, chromosome-scale and possibly haplotype-resolved genomes of some of the most widely used genotypes in the apple and grapevine breeding programs at FEM. A combination of recent sequencing technologies such as PacBio HiFi, Omni-C and Illumina will be used to provide a state-of-the-art genome sequence and annotation of the selected genotypes. On the other hand, the student will implement bioinformatic tools to mine and visualize this high-resolution information and combine it with all the available breeding information (including pedigrees), providing a useful toolkit to support breeding decisions.
The project has a strong bioinformatics component and will allow the student to gain hands-on experience with genome assembly using the latest technologies and most recent software. The student will also strengthen their programming skills by developing computational breeding tools that can effectively support the breeding activities at FEM.

Suggested skills:

Good knowledge of and experience in computational biology, both as a user of bioinformatics tools (to analyze genetic data) and as a developer of software/scripts for data analysis. Preferred programming languages include Python, R and C. Prior experience with the analysis of genotypic data from plants is required.

Research team and environment

The Research and Innovation Centre (CRI) of the Fondazione Edmund Mach pursues scientific research, develops biotechnologies, and promotes innovation for agriculture, bioeconomy, ecology, biodiversity, the environment and food. The Centre focuses on basic and applied research on: (i) strategic supply chains of the Trentino agrosystem; (ii) forest and alpine ecology; (iii) biodiversity evolution and conservation; (iv) effects of climate change on natural and agro-ecosystems; (v) bioeconomy; (vi) agrobiotechnology. The Centre's multidisciplinary character is ensured by its matrix organization and the cross-cutting integration of its 21 Units and 21 Technological Facilities across 4 thematic areas, namely Agrosystems and Bioeconomy; Biodiversity, Ecology and Environment; Food and Nutrition; and Computational Biology. The technological platforms are operated by highly qualified personnel and cover Plant Phenotyping, Sequencing and Genotyping, and Metabolomics. The Centre is equipped with a High-Performance Computing Facility with 376 cores, over 7.5 TB of RAM (up to 2 TB per node) and over 100 TB of dedicated storage. The Centre hosts three major germplasm banks, namely:
  • Grapevine germplasm collection, which includes species of the genus Vitis, cultivars of V. vinifera subsp. sativa, V. vinifera subsp. sylvestris and interspecific hybrids;
  • Apple germplasm collection, which includes species of the genus Malus, cultivars of M. x domestica, M. sylvestris, M. sieversii, M. orientalis and interspecific hybrids;
  • Berries germplasm collection, which comprises species of the genus Vaccinium, such as V. corymbosum, V. angustifolium, V. virgatum, V. myrtillus, V. vitis-idaea, V. macrocarpon and hybrids, as well as Rubus, Fragaria x ananassa, Ribes and other minor crops.