Molecular tools for the macadamia industry: Genetic fingerprinting, genome assembly and annotation, and characterization of fatty-acid biosynthesis-associated genes

Ranketse, M.*1,2, Pierneef, R.1,4,5, Fourie, G.2, Hefer, C. A.3, Myburg, A. A.2

1 Biotechnology Platform, Agricultural Research Council – Onderstepoort Veterinary Institute, Pretoria, South Africa
2 Department of Biochemistry, Genetics and Microbiology, Forestry and Agriculture Biotechnology Institute (FABI), University of Pretoria, Pretoria, South Africa
3 AgResearch Ltd, Lincoln, New Zealand
4 Department of Biochemistry, Genetics and Microbiology, University of Pretoria, Pretoria, South Africa
5 Centre for Bioinformatics and Computational Biology, University of Pretoria, Pretoria, South Africa

Macadamia is a valuable tree crop with a high-quality nut product; however, genomic and molecular breeding resources are still limited. Macadamia nuts are the most expensive nuts in the world and are also nutritious and healthy, with high fat and low sugar content. South Africa is one of the world’s largest producers, along with Australia, where macadamia species are native. The macadamia cultivars grown in South Africa were mainly imported from Australia and Hawaii, where macadamia was first commercialised more than 100 years ago. Commercial macadamia cultivars are mainly hybrids derived from two species, Macadamia integrifolia and M. tetraphylla. Genome assemblies for these species and their hybrids can provide valuable references for developing genomic tools for genetic resource management, breeding and crop improvement. The aim of this study was to perform genetic fingerprinting of macadamia cultivars present in South Africa, to generate high-quality genome assemblies of key cultivars, and to characterize fatty-acid biosynthesis-associated genes in macadamia. To this end, we used 13 microsatellite markers to fingerprint and differentiate 110 cultivars present in South Africa. We assembled the genomes of the HAES 695/Beaumont, HAES 791 and Santa Anna cultivars using short-read data from the Illumina HiSeq 2500 and long-read data from the Oxford Nanopore PromethION. We then annotated the three genomes using RNA-Seq data generated from the HAES 695 cultivar. Finally, we used the annotations to characterize four key gene families (SAD, FAD, KAS and FAT) associated with fatty-acid biosynthesis. This study will add to the existing genomic resources for macadamia, help researchers decipher the genomic composition of elite macadamia accessions, and support the development of tools for accelerated breeding.

Keywords: genome assembly, genetic fingerprinting, fatty-acid biosynthesis, macadamia, molecular breeding