In the recruitment of unrelated hematopoietic stem cell donors, HLA haplotype frequencies of specific populations are used to optimize both donor searches for individual patients and strategic donor registry planning. However, estimating haplotype frequencies from unphased HLA genotyping data, as typically found in donor registries, is complicated by the large volume of genotype data, the complex HLA nomenclature, and the heterogeneous and ambiguous nature of typing records.
To meet these challenges, DKMS has developed the publicly available, open-source software Hapl-o-Mat. It uses an expectation–maximization algorithm to estimate haplotype frequencies from population data for an arbitrary number of loci. Its key features are the processing of different HLA typing resolutions within a given population sample and the handling of ambiguities recorded as multiple allele codes or genotype list strings. Implemented in C++, Hapl-o-Mat enables efficient haplotype frequency estimation from large amounts of genotype data.
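To illustrate the general idea of expectation–maximization on unphased genotypes, the following is a minimal Python sketch. It is not Hapl-o-Mat's code and makes simplifying assumptions: every allele is already given at a single, unambiguous resolution (no multiple allele codes or genotype list strings), the population is assumed to be in Hardy-Weinberg equilibrium, and the function names `compatible_diplotypes` and `estimate_haplotype_frequencies` as well as the allele labels are hypothetical.

```python
from collections import defaultdict
from itertools import product


def compatible_diplotypes(genotype):
    """List the unordered haplotype pairs consistent with one unphased genotype.

    genotype: tuple of loci, each locus a pair of allele names,
    e.g. (("A1", "A2"), ("B1", "B2")). Hypothetical input format.
    """
    first, rest = genotype[0], genotype[1:]
    diplotypes = set()
    # Fix the orientation of the first locus; try both orientations of the others.
    for flips in product((False, True), repeat=len(rest)):
        hap1, hap2 = [first[0]], [first[1]]
        for (a, b), flip in zip(rest, flips):
            hap1.append(b if flip else a)
            hap2.append(a if flip else b)
        # Sorting the pair makes (h1, h2) and (h2, h1) the same set element.
        diplotypes.add(tuple(sorted((tuple(hap1), tuple(hap2)))))
    return sorted(diplotypes)


def estimate_haplotype_frequencies(genotypes, max_iter=1000, tol=1e-9):
    """Estimate haplotype frequencies with a basic EM loop (illustrative only)."""
    expansions = [compatible_diplotypes(g) for g in genotypes]
    haplotypes = sorted({h for dips in expansions for pair in dips for h in pair})
    # Start from uniform frequencies over all haplotypes that could occur.
    freq = {h: 1.0 / len(haplotypes) for h in haplotypes}
    total = 2.0 * len(genotypes)  # two haplotypes per individual

    for _ in range(max_iter):
        # E-step: split each genotype over its diplotypes in proportion to
        # the diplotype probability under the current frequencies.
        counts = defaultdict(float)
        for dips in expansions:
            weights = [
                freq[h1] * freq[h2] * (1.0 if h1 == h2 else 2.0)
                for h1, h2 in dips
            ]
            norm = sum(weights)
            for (h1, h2), w in zip(dips, weights):
                counts[h1] += w / norm
                counts[h2] += w / norm
        # M-step: new frequencies are the normalized expected counts.
        new_freq = {h: counts[h] / total for h in haplotypes}
        change = max(abs(new_freq[h] - freq[h]) for h in haplotypes)
        freq = new_freq
        if change < tol:
            break
    return freq


if __name__ == "__main__":
    # Toy sample: homozygous individuals pin down two haplotypes, while the
    # double heterozygotes are phase-ambiguous and are resolved by the EM loop.
    sample = (
        [(("A1", "A1"), ("B1", "B1"))] * 3
        + [(("A2", "A2"), ("B2", "B2"))] * 1
        + [(("A1", "A2"), ("B1", "B2"))] * 2
    )
    for haplotype, f in estimate_haplotype_frequencies(sample).items():
        print("~".join(haplotype), round(f, 4))
```

The sketch enumerates all phasings of every genotype, which grows exponentially with the number of heterozygous loci. A tool designed for registry-scale data and ambiguous typings would need more careful handling of that expansion than shown here.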