Review Of Relevant Ontologies - Ntp.niehs.nih.gov

A Review of Relevant Ontologies and Application of Reasoners. Melissa Haendel, PhD, @ontowonka

Outline: Using ontologies and reasoners for classification; Anatomy and Stage Ontologies; Example of ontologies and reasoning at work: diagnosing diseases; Environmental ontologies; How to exchange data better

What is an Ontology? Definition: A formal conceptualization of a specified domain. Key Features: Terms are defined. Relationships between terms are defined, allowing logical inference and sophisticated data queries. Terms are arranged in a hierarchy. Expressed in a knowledge representation language such as RDFS, OBO, or OWL. Examples: SNOMED, Foundational Model of Anatomy, Gene Ontology, Linnaean Taxonomy of species

Example taxonomy

Ontologies enable queries to “just work” as you would hope. Without ontological “subsumption” reasoning and synonym formalism, the user would either need to do 17 different queries or get an incomplete set of results.
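
As a rough illustration of the subsumption reasoning described on this slide, the sketch below expands a query term to all of its subclasses before matching annotations, so one query stands in for many. The small fin hierarchy, the record ids, and the helper names (`descendants`, `query`) are assumptions made up for this example and are not taken from any ontology release.

```python
from collections import defaultdict

# Hypothetical is_a edges (child -> parent), invented for illustration.
IS_A = {
    "pectoral fin": "paired fin",
    "pelvic fin": "paired fin",
    "paired fin": "fin",
    "median fin": "fin",
    "dorsal fin": "median fin",
}

# Hypothetical annotations: record id -> annotated term.
ANNOTATIONS = {
    "rec1": "pectoral fin",
    "rec2": "dorsal fin",
    "rec3": "fin",
    "rec4": "tail",
}


def descendants(term, is_a=IS_A):
    """Return the term plus every term that reaches it through is_a edges."""
    children = defaultdict(set)
    for child, parent in is_a.items():
        children[parent].add(child)
    found, stack = {term}, [term]
    while stack:
        for child in children[stack.pop()]:
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def query(term):
    """Records annotated to the term itself or to any of its subclasses."""
    wanted = descendants(term)
    return sorted(rec for rec, t in ANNOTATIONS.items() if t in wanted)


print(query("fin"))         # a single query covers every kind of fin
print(query("paired fin"))
```

Without the expansion step, a search for "fin" would return only the record annotated directly to "fin" and miss those annotated to more specific terms.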

...cking, inferred classification along different axes, and powerful graph-based applications

Ontologies are formal classifications. Terms shown: Appendage, Tail, Median fin, Paired fin, Pectoral fin, Pelvic fin

Relationships also support classification. ‘pectoral fin radial’ SubClassOf part of some ‘fin’. Diagram labels: part of some ‘fin’; pectoral fin radial; pelvic adductor profundus

Necessary and sufficient conditions. Any sense organ that functions in the detection of smell is an olfactory sense organ. Diagram labels: sense organ; olfactory sense organ; capable of some detection of smell

Classifying. Asserted: nose is a sense organ; nose capable of some detection of smell. Inferred: nose is an olfactory sense organ. These are necessary and sufficient conditions, also called an equivalent class axiom.
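
The classification step on this slide can be sketched in a few lines. This is a toy stand-in for what an OWL reasoner does with an equivalent class axiom, assuming a hand-written knowledge base; the dictionaries, the "eye" entry, and the function names are hypothetical, and a real reasoner also handles inherited restrictions and much richer logic.

```python
# Hypothetical asserted facts; labels follow the slide, "eye" is added for contrast.
IS_A = {"nose": "sense organ", "eye": "sense organ"}
CAPABLE_OF = {
    "nose": {"detection of smell"},
    "eye": {"detection of light"},
}

# Equivalent class axiom, written as (genus, required function):
# 'olfactory sense organ' == 'sense organ' and capable_of some 'detection of smell'
DEFINED_CLASSES = {
    "olfactory sense organ": ("sense organ", "detection of smell"),
}


def superclasses(term):
    """Asserted superclasses of a term, following is_a upward."""
    found = set()
    while term in IS_A:
        term = IS_A[term]
        found.add(term)
    return found


def infer_classes(term):
    """Defined classes whose necessary and sufficient conditions the term meets."""
    ancestors = superclasses(term)
    functions = CAPABLE_OF.get(term, set())
    return sorted(
        name
        for name, (genus, function) in DEFINED_CLASSES.items()
        if genus in ancestors and function in functions
    )


print(infer_classes("nose"))  # ['olfactory sense organ']
print(infer_classes("eye"))   # []
```

Nobody asserted that a nose is an olfactory sense organ; the placement follows from the definition alone, which is the point of stating conditions as both necessary and sufficient.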

Using reasoners to detect errors. Diagram nodes: UBERON: bone; UBERON: tibia; Fruit fly FBbt ‘tibia’; Human FMA ‘tibia’; Vertebrata; Homo sapiens; Drosophila melanogaster. Edge labels: is a, part of

Using reasoners to detect errors. Diagram nodes: UBERON: bone; UBERON: tibia; Fruit fly FBbt ‘tibia’; Human FMA ‘tibia’; Vertebrata; Homo sapiens; Drosophila melanogaster. Edge labels: is a, part of. Added edge: UBERON: bone only in taxon Vertebrata

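Using reasoners to detect errors. Diagram nodes: UBERON: bone; UBERON: tibia; Fruit fly FBbt ‘tibia’; Human FMA ‘tibia’; Vertebrata; Homo sapiens; Drosophila melanogaster. Edge labels: is a, part of, only in taxon, disjoint with. UBERON: bone only in taxon Vertebrata; Vertebrata disjoint with Drosophila melanogaster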
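
The error-detection idea in these three slides can be sketched as a taxon-constraint check. The edges below are one reading of the diagram and should be treated as assumptions; the function names are hypothetical, and the check is a simplified stand-in for an OWL reasoner, not how any particular reasoner is implemented.

```python
# Assumed edges, read off the slide diagram.
IS_A = {
    "FBbt:tibia": "UBERON:tibia",   # the suspect link between fly and vertebrate tibia
    "FMA:tibia": "UBERON:tibia",
    "UBERON:tibia": "UBERON:bone",
}
TAXON_IS_A = {"Homo sapiens": "Vertebrata"}
PART_OF_TAXON = {
    "FBbt:tibia": "Drosophila melanogaster",
    "FMA:tibia": "Homo sapiens",
}
ONLY_IN_TAXON = {"UBERON:bone": "Vertebrata"}
DISJOINT = {frozenset({"Vertebrata", "Drosophila melanogaster"})}


def lineage(term, edges):
    """The term followed by everything reached by walking the edges upward."""
    out = [term]
    while out[-1] in edges:
        out.append(edges[out[-1]])
    return out


def violations(term):
    """Taxon constraints on superclasses that the term's own taxon contradicts."""
    taxon = PART_OF_TAXON.get(term)
    if taxon is None:
        return []
    taxon_line = lineage(taxon, TAXON_IS_A)
    problems = []
    for cls in lineage(term, IS_A):
        required = ONLY_IN_TAXON.get(cls)
        if required is None:
            continue
        if any(frozenset({t, required}) in DISJOINT for t in taxon_line):
            problems.append((term, cls, required, taxon))
    return problems


print(violations("FBbt:tibia"))  # the fly tibia inherits a vertebrate-only constraint
print(violations("FMA:tibia"))   # the human tibia is consistent
```

The fly tibia only becomes detectably wrong once both the "only in taxon" and the "disjoint with" statements are present, which matches the way the slides add one axiom at a time.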

A compendium of interoperable ontologies
Functional Genomics: Gene function. Gene Ontology
Transcriptomics, proteomics: Gene expression. Anatomy and Stage Ontologies
Phenomics and assays: Effects of gene mutations and environment and their measurement. Phenotype and Trait Ontology, Ontology of Biomedical Investigations
Environments: drugs, exposures, life history. ENVO, MRE, ZECO, ECTO
Disease: Effects of gene mutations, phenotypes, environment, staging. Numerous nosologies, MonDO

Anatomy and stage ontologies

The Zebrafish Anatomy and stage ontologies

Uberon: bridging semantics for anatomy. Mungall et al. (2012). Genome Biology, 13(1), R5. doi:10.1186/gb-2012-13-1-r5. Köhler et al. (2014) F1000Research 2:30. Haendel et al. (2014) JBMS 5:21. doi:10.1186/2041-1480-5-21

A merger of disease ontologies

The challenge of multiple perspectives: how can we bridge these? International Classification of Diseases; Online Mendelian Inheritance in Man; MedGen; National Cancer Institute Thesaurus. Disease classifications and lists: there are a lot of them

4 disease resources plus mappings: Hemolytic anemia. Legend: ORDO/Orphanet (yellow), DOID (blue), OMIM (brown), MESH (grey), SubClassOf (solid line), Xref (dashed grey line). Mungall. Harmonizing disease vocabularies: http://bit.ly/Monarch-Disease

[Figure: disease graph around hemolytic-uremic syndrome. Legible node labels: Nephrotic Syndrome, Type 7; atypical hemolytic-uremic syndrome with H factor anomaly; atypical hemolytic-uremic syndrome with anti-factor H antibodies; atypical hemolytic-uremic syndrome with B factor anomaly; atypical hemolytic-uremic syndrome with DGKE deficiency]

kBOOM. monarchinitiative.org. Mungall, C. J., bioRxiv, 048843. doi:10.1101/048843

MonDO: Merged Ontology of Disease Entities
Columns: “Ontology”; Classes (before, after merge); SubClass axioms; Mappings
DOID: 6878 6012708236656
MESH (D): 11314 415219036
OMIM (D): 7783 7783031242
Orphanet (D): 8740 46831518220326
OMIA: 4833 48333120355
DC: 209 208310316
Medic: 08630343539757
narch-initiative/monarch-disease-ontology

Phenotype ontologies

Different communities use different ... eratosis; Thick hand skin; Ulcerated paws

Challenge: Each data source uses their own vocabulary/ontology. MP, HP, MGI, ZFA, ZFIN, HPOA

Challenge: Each data source uses their own phenotype ... C, QTLdb, HPOA, OMIM, WB, ZFA, PB, FYPO, APO, SGD, ZFIN, EHR, SNOMED

Decomposition of complex concepts allows interoperability. Human phenotype “Palmoplantar hyperkeratosis”: PATO: increased; GO: keratinization; Uberon: Stratum corneum layer of skin; Uberon: Autopod. Species neutral ontologies, homologous concepts
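
A minimal sketch of why decomposition helps: once a phenotype is broken into a quality, a process, and anatomical entities drawn from species-neutral ontologies, two phenotypes from different vocabularies can be compared part by part. The human decomposition uses the labels on the slide; the mouse phenotype, the `Decomposition` class, and `shared_components` are hypothetical and exist only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decomposition:
    quality: str         # PATO term label
    process: str         # GO term label
    anatomy: frozenset   # Uberon term labels


# Labels from the slide for "Palmoplantar hyperkeratosis".
human = Decomposition(
    quality="increased",
    process="keratinization",
    anatomy=frozenset({"stratum corneum layer of skin", "autopod"}),
)

# Hypothetical mouse phenotype, invented for comparison.
mouse = Decomposition(
    quality="increased",
    process="keratinization",
    anatomy=frozenset({"autopod"}),
)


def shared_components(a, b):
    """Return the parts two decomposed phenotypes have in common."""
    shared = {}
    if a.quality == b.quality:
        shared["quality"] = a.quality
    if a.process == b.process:
        shared["process"] = a.process
    common = a.anatomy & b.anatomy
    if common:
        shared["anatomy"] = sorted(common)
    return shared


print(shared_components(human, mouse))
```

The two source labels share no words, yet the decomposed forms overlap on quality, process, and one anatomical entity.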

Semantic similarity of phenotypes for disease discovery. FMA PATO; MP; ZFA PATO; FBbt PATO. Washington NL, Haendel MA, Mungall CJ, Ashburner M, Westerfield M, Lewis SE. "Linking Human Diseases to Animal Models Using Ontology-Based Phenotype Annotation." PLoS Biol 7(11): e1000247. doi:10.1371/journal.pbio.1000247

The Human Phenotype Ontology for deep phenotyping

Ontologies at work: Data integration and disease diagnosis

[Figure: A: Data types covered by Monarch data sources. B: Monarch data sources and ontology annotations. C: Mappings to bridging ontologies. Legend: annotated to many sources: GO; annotated to all sources: ECO, RO; Contributed, Maintained, Created]

Harmonizing diseases, phenotypes, anatomy, and genotypes. 91% of our 2.2 million G2P associations require integrating 2 or more data sources

Phenotypic matchmaking for disease diagnostics. [Figure labels: Patient A phenotype profile; closest term in common]
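
To make "closest term in common" concrete, the sketch below finds the most specific shared ancestor of two phenotype terms and scores two profiles by how much of their ancestry overlaps. The hierarchy and term labels are hypothetical, and plain Jaccard overlap is used as a simplification; it is not claimed to be the measure used in the work presented here.

```python
# Hypothetical phenotype hierarchy (term -> parents), invented for illustration.
PARENTS = {
    "palmoplantar hyperkeratosis": ["hyperkeratosis"],
    "thick paw skin": ["hyperkeratosis"],
    "hyperkeratosis": ["abnormal skin"],
    "abnormal skin": ["phenotype"],
    "thrombocytopenia": ["abnormal platelet"],
    "abnormal platelet": ["phenotype"],
}


def ancestors(term):
    """The term together with all of its ancestors."""
    seen, stack = {term}, [term]
    while stack:
        for parent in PARENTS.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def closest_common(a, b):
    """Most specific shared ancestor, using ancestor count as a depth proxy."""
    common = ancestors(a) & ancestors(b)
    return max(common, key=lambda t: (len(ancestors(t)), t))


def profile_similarity(profile_a, profile_b):
    """Jaccard overlap of the ancestor closures of two phenotype profiles."""
    closure_a = set().union(*(ancestors(t) for t in profile_a))
    closure_b = set().union(*(ancestors(t) for t in profile_b))
    return len(closure_a & closure_b) / len(closure_a | closure_b)


patient = {"palmoplantar hyperkeratosis"}
model_1 = {"thick paw skin"}
model_2 = {"thrombocytopenia"}

print(closest_common("palmoplantar hyperkeratosis", "thick paw skin"))
print(profile_similarity(patient, model_1))
print(profile_similarity(patient, model_2))
```

The patient profile matches the first model more closely because the two terms meet at a specific shared ancestor, while the second model only shares the root.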

Combining genotype and phenotype data for variant prioritization. Whole exome. Remove off-target and common variants. Variant score from allele freq and pathogenicity. Mendelian filters. Phenotype score from phenotypic similarity. PHIVE score to give final candidates. http://bit.ly/exomiser
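
The pipeline on this slide can be sketched as filter, score, and rank. Everything numeric here is an assumption: the frequency cutoff, the way the variant score is formed, and the equal weighting of variant and phenotype scores are placeholders, not the formula used by the tool on the slide. The gene names are placeholders as well, and the Mendelian filter step is omitted.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str
    allele_freq: float     # population allele frequency, 0..1
    pathogenicity: float   # predicted pathogenicity, 0..1
    on_target: bool        # inside the captured exome regions


def variant_score(v):
    """Assumed score: rarer and more damaging variants score higher."""
    return v.pathogenicity * (1.0 - v.allele_freq)


def prioritize(variants, phenotype_scores, max_freq=0.01):
    """Drop off-target and common variants, then rank by a combined score."""
    kept = [v for v in variants if v.on_target and v.allele_freq <= max_freq]
    ranked = [
        (0.5 * variant_score(v) + 0.5 * phenotype_scores.get(v.gene, 0.0), v.gene)
        for v in kept
    ]
    return sorted(ranked, reverse=True)


variants = [
    Variant("GENE_A", 0.0001, 0.90, True),
    Variant("GENE_B", 0.2000, 0.95, True),    # common, filtered out
    Variant("GENE_C", 0.0010, 0.60, True),
    Variant("GENE_D", 0.0001, 0.99, False),   # off-target, filtered out
]

# Hypothetical phenotype similarity between the patient and each gene's known phenotypes.
phenotype_scores = {"GENE_A": 0.4, "GENE_C": 0.9}

for score, gene in prioritize(variants, phenotype_scores):
    print(f"{gene}: {score:.3f}")
```

In this toy data the variant in GENE_C looks less damaging on its own than the one in GENE_A, yet it ranks first once phenotype similarity is taken into account.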

Putting all that data to use to diagnose a rare platelet ... iser. Ranked STIM-1 variant maximally pathogenic based on cross-species G2P data, in the absence of traditional data sources. MGI ... gous, missense mutation STIM-1. Heterozygous, missense mutation STIM-1. N/A

What about environment and exposure ontologies?

Earth

Earth. “The environment is everything that isn’t me.” Albert Einstein

Earth. Can we sensibly make an ontology of everything that isn’t me?

Occupational diseases caused by exposure to agents arising from work activities
Diseases caused by chemical agents
Diseases caused by beryllium or its compounds
Diseases caused by cadmium or its compounds
Diseases caused by phosphorus or its compounds
Diseases caused by chromium or its compounds
Diseases caused by manganese or its compounds
Diseases caused by arsenic or its compounds
Diseases caused by mercury or its compounds
Diseases caused by lead or its compounds
Diseases caused by fluorine or its compounds
Diseases caused by carbon disulfide
Diseases caused by halogen derivatives of aliphatic or aromatic hydrocarbons
Diseases caused by benzene or its homologues
Diseases caused by nitro- and amino-derivatives of benzene or its homologues
Diseases caused by nitroglycerine or other nitric acid esters
Diseases caused by alcohols, glycols or ketones
Diseases caused by asphyxiants like carbon monoxide, hydrogen sulfide, hydrogen cyanide or its derivatives
Diseases caused by acrylonitrile
Diseases caused by oxides of nitrogen
Diseases caused by vanadium or its compounds

Can we make these lists computable? Translate them into a form a machine can understand and reason over?

We have a precise machine-readable language for describing some environmental exposures. CCOC(=O)CC(SP(=S)(OC)OC)C(=O)OCC. CHEBI:6651. ChEBI is a chemical ontology

But others are harder to define. Image: Zol87 CC by/nc

The Zebrafish Environmental Conditions Ontology. https://github.com/ybradford/zebrafish experimental-conditions-ontology

The Environment Ontology. Originally created for metagenome samples. Characterize microbial environments. Extended for ecological science. The “Earth Phenotype Ontology”. Being adapted for ... sion. Pollution. Biological. Algal bloom. Buttigieg, P. L., Morrison, N., Smith, B., Mungall, C. J., & Lewis, S. E. (2013). The environment ontology: contextualising biological and biomedical entities. Journal of Biomedical Semantics, 4(1), 43. doi:10.1186/2041-1480-4-43

Biome: Food desert. Feature: Store (alcohol, sugar-rich food). Material: Air, high particulate matter. Process: decreased investment in infrastructure. Image: Zol87 CC by/nc

CHEBI: chemical classification. monarchinitiative.org

Environmental conditions, treatments and exposures ontology. ... ons. monarchinitiative.org

The Ontology for Biomedical Investigations. (2016) The Ontology for Biomedical Investigations. PLOS ONE 11(4): e0154556. os.org/plosone/article?id 10.1371/journal.pone.0154556

Recording and exchanging phenotype and environmental data better

WebPhenote and Noctua. A causal/spatiotemporal network curation ... rchinitiative.org/ noctua.berkeleybop.org

Computable encodings are essential. Genes: base pairs, variant notation (e.g. HGVS). Environment: medical procedure coding. Phenotypes: Human Phenotype Ontology

Standard exchange formats exist for genes, but for phenotypes? Environment? Genes: GFF, VCF, BED. Phenotypes: PXF. Environment: no format listed

If it is alive, it can be PhenoPackaged. Patients & Cohorts; Personalized Medicine; Rare Disease Diagnosis; Disease vectors; Epidemiological Monitoring; Model Organisms; Mechanistic Discovery; Biodiversity; Domestic Animals; Crops; Environmental Monitoring; Drug discovery & Development; Genetic Engineering. Some biodiversity images adapted from http://i.vimeocdn.com/video/417366050 1280x720.jpg

A semantic vision for environmental health research. Laying a Community-Based Foundation for Data-Driven Semantic Standards in Environmental Health Sciences. https://ehp.niehs.nih.gov/15-10438/

NICEATM News. For updates on the SEAZIT project and other activities related to in vitro alternatives, subscribe to the NICEATM News email list. To subscribe to the NICEATM News email list, go main/formViewer/form id/361. Check the NICEATM News box and click submit

Acknowledgements
OHSU: Matt Brush, Kent Shefchek, Julie McMurry, Tom Conlin, Nicole Vasilevsky, Dan Keith
Genomics England/Queen Mary: Damian Smedley, Jules Jacobson
Jackson Laboratory: Peter Robinson
ZFIN: Ceri VanSlyke, Yvonne Bradford
CTD: Carolyn Mattingly
Lawrence Berkeley: Chris Mungall, Suzanna Lewis, Jeremy Nguyen, Seth Carbon, Nicole Washington
Garvan, Charité, Max Planck, RTI, EBI, Cyverse: Tudor Groza, Pier Buttigieg, David Osumi-Sutherland, Sebastian Kohler, Jim Balhoff, Becky Boyles, Ramona Walls
FUNDING: NIH Office of Director: 2R24OD011883; NIH-UDP: HHSN268201300036C, HHSN268201400093P, NSF-DEB0956049, NCINCI/Leidos #15X143, BD2K U54HG007990-S2 (Haussler) & BD2K PA-15-144-U01 (Kesselman)
With special thanks to Julie McMurry for excellent graphic design

www.monarchinitiative.org. PDs: Melissa Haendel, Chris Mungall, Peter Robinson. Funding: NIH Office of Director: 1R24OD011883; NIH-UDP: HHSN268201300036C, HHSN268201400093P; NCINCI/Leidos #15X143, BD2K U54HG007990-S2 (Haussler) & BD2K PA-15-144-U01 (Kesselman)