Review of Sundaram et al., Nature Genetics 2018

In the practice of precision medicine, the accurate filtering of common and rare missense variants of benign consequence remains an unsolved problem. Sundaram et al., using an identity-by-state model, attempt to infer a variant's clinical importance from its prevalence in the genomes of our close relatives, the non-human primate (NHP) species. In humans, a common empirical practice is to dismiss the importance of missense variants that occur at a frequency above 0.10%. In this study, a deep neural network, aided by the prevalence of NHP variants, was developed to identify pathogenic variants in rare-disease patients. The authors succeeded in finding new variants of predicted clinical significance among patients in the Deciphering Developmental Disorders study, and overall their method scored better than other popular tools that rely on evolutionary conservation, e.g., CADD and REVEL. The ClinVar database was used to benchmark this outcome. By training the neural network on variants from six NHP species, the authors catalog 70 million human missense variants for further interpretation and validation. Interestingly, they find that the improved ascertainment of the benign state of human variants was due equally to the use of NHP variants and to the deep learning modules, when assessed on benchmark variant sets. This study is a promising new use of comparative species; however, the small NHP sample sizes, the uncertain accuracy of lower-frequency NHP alleles, and the lack of clinical validation for newly discovered variants will need further consideration before this approach is used for large-scale filtering of human missense variants.