page under construction

Check out: the ASHG 2017 poster

The website collects information about inborn errors of metabolism, their molecular basis, and the related variants found in the global (healthy) population.

Taking the phenylketonuria page as an example: we pool data from gnomAD and combine it with protein structural data. The structural part involves extracting information about the enzyme’s natural substrates from MetaCyc and finding PDB (protein structure) files that have a similar peptide sequence and have been crystallized with a ligand closely resembling the substrate. The latter serves to pinpoint the location of the catalytic pocket, the enzyme’s most sensitive part.
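As a rough illustration of how a co-crystallized ligand can point to the catalytic pocket, the sketch below flags residues that have at least one atom within a distance cutoff of any ligand atom. The function name, the toy coordinates and the 5 Å cutoff are assumptions made for this example; they are not the values or the procedure used by the site.

```python
import math

def residues_near_ligand(protein_atoms, ligand_atoms, cutoff=5.0):
    """Return sorted residue numbers with an atom within `cutoff` angstroms of the ligand.

    protein_atoms: list of (residue_number, (x, y, z)) tuples
    ligand_atoms:  list of (x, y, z) tuples
    The 5.0 default is an illustrative assumption, not a recommended value.
    """
    near = set()
    for resnum, xyz in protein_atoms:
        # One close ligand atom is enough to count the residue as pocket-lining.
        if any(math.dist(xyz, lig) <= cutoff for lig in ligand_atoms):
            near.add(resnum)
    return sorted(near)

# Made-up coordinates: three residues and a two-atom ligand.
protein_atoms = [
    (10, (0.0, 0.0, 0.0)),
    (10, (1.5, 0.0, 0.0)),
    (11, (12.0, 0.0, 0.0)),
    (12, (3.0, 4.0, 0.0)),
]
ligand_atoms = [(3.0, 0.0, 0.0), (4.0, 0.0, 0.0)]

pocket = residues_near_ligand(protein_atoms, ligand_atoms)
print(pocket)
```

In practice the coordinates would be read from the PDB file of the homologous structure, and the flagged positions would have to be carried over to the human enzyme through a sequence alignment.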

Visualization of protein structure is provided by NGL.

The protein structural information, in combination with protein-level conservation across vertebrates, is used to estimate the significance of variants: both the known ones from gnomAD and those from an exhaustively enumerated list. Providing the background for the estimate involves several mapping steps: genome address (UCSC) to coding DNA sequence (Ensembl) to protein (Ensembl, UniProt).
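The mapping chain from genome address to coding sequence to protein can be sketched with plain coordinate arithmetic. The example below is a minimal sketch that assumes a gene on the plus strand, 1-based inclusive coordinates, and made-up exon boundaries; the function names are hypothetical, and a real pipeline would take the coding intervals from the UCSC and Ensembl annotation.

```python
def genomic_to_cds(pos, coding_exons):
    """Map a 1-based genomic position to a 1-based CDS position.

    coding_exons: list of (start, end) coding intervals, 1-based inclusive,
    sorted in transcription order on the plus strand (an assumption of this sketch).
    Returns None if the position is not inside any coding interval.
    """
    offset = 0
    for start, end in coding_exons:
        if start <= pos <= end:
            return offset + (pos - start) + 1
        offset += end - start + 1
    return None

def cds_to_protein(cds_pos):
    """Map a 1-based CDS position to (1-based codon number, 0-based index within the codon)."""
    if cds_pos < 1:
        raise ValueError("CDS positions are 1-based")
    return (cds_pos - 1) // 3 + 1, (cds_pos - 1) % 3

# Made-up coding exons: 10 bases, then 15 bases.
exons = [(100, 109), (200, 214)]

cds_pos = genomic_to_cds(203, exons)   # fourth base of the second exon
codon, index_in_codon = cds_to_protein(cds_pos)
print(cds_pos, codon, index_in_codon)
```

A gene on the minus strand would need the exon order and the within-exon offset reversed, which this sketch deliberately leaves out.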

The goal of this exercise is to estimate the frequency of inborn errors of metabolism from the population frequencies of known as well as undescribed but possibly harmful variants.
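One simple way to turn variant frequencies into a disease frequency, shown here only as a sketch, is the Hardy-Weinberg approximation for an autosomal recessive disorder: if q is the combined frequency of harmful alleles in a gene, the expected fraction of affected individuals is about q squared. The assumptions are random mating, rare and fully penetrant variants, and harmful variants that do not share haplotypes (so their frequencies can be summed). The allele frequencies below are made up, and this is not necessarily the model used by the site.

```python
def expected_incidence(harmful_allele_frequencies):
    """Hardy-Weinberg estimate of recessive disease incidence.

    Sums the per-variant allele frequencies into a combined frequency q
    and returns q**2, the expected fraction of individuals carrying
    two harmful alleles (homozygous or compound heterozygous).
    """
    q = sum(harmful_allele_frequencies)
    if not 0.0 <= q <= 1.0:
        raise ValueError("combined allele frequency must lie in [0, 1]")
    return q * q

# Made-up allele frequencies for four harmful variants in one gene.
frequencies = [0.004, 0.003, 0.001, 0.002]

incidence = expected_incidence(frequencies)
print(f"about 1 in {round(1 / incidence):,} births")
```

With a combined frequency of 0.01, the estimate comes out at roughly 1 in 10,000. Because the result scales with the square of q, including or excluding a few relatively common variants of uncertain effect can change the estimate considerably.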

The overview page shows where our predictions currently stand. The chart is provided courtesy of Highcharts. It also compares our estimate with the prediction based on known disease-causing variants (ClinVar and HGMD, public version as available in Ensembl BioMart), and with the incidence reported by newborn screening programs in the US, 2001-2010.

Inquiries: ivana dot mihalek at childrens harvard edu