Zass, L.1*, Ghedira, K.2, Fakim, Y. J.3, Panji, S.1, Mulder, N.1
1 Computational Biology Division, Department of Integrative Biomedical Sciences, IDM, CIDRI Africa Wellcome Trust Centre, University of Cape Town, South Africa.
2 Laboratory of Bioinformatics, Biomathematics and Biostatistics, LR16IPT09, Institut Pasteur de Tunis, University of Tunis El Manar, Tunis, Tunisia.
3 Faculty of Agriculture, Faculty of Information, Communication and Digital Technologies, University of Mauritius, Reduit 80837, Mauritius.
African genomic data are underrepresented, and the data that do exist are hard to find because of superficial annotation. The African Genomics Data Hub (AGDH) is a new project that maintains and further develops H3ABioNet resources to ensure that existing African data are findable and available in an accessible format for different users. The hub provides a set of related resources for African genomics research that are globally applicable. These resources aim to fill gaps in international efforts and improve the processing and analysis of African data. AGDH encourages and facilitates the submission of multi-omics African data to public repositories and will provide an African data catalogue to enable searching of metadata for African datasets stored locally and in public databases. AGDH complements resources such as gnomAD and serves data in the African Genomic Variation Database that are not otherwise available (due to data sharing limitations). These data include African population-level allele frequencies at a more granular level than is currently available. Further information on clinically actionable variants that have been identified or verified in African populations is served from the African Genomic Medicine Portal (AGMP: https://agmp.h3abionet.org/). AGDH will extract and curate additional African genotype-phenotype data from literature and public databases to include in the AGMP. Additional H3ABioNet tools that are transitioning to AGDH include the H3Africa imputation service and an African reference graph and pangenome for variant calling. Together, these resources provide African data in a more accessible form and a tool suite to analyze and interpret the data.
Keywords: genomics, African data, variation, database