Crowdsourcing ontology verification [electronic resource]
- Jonathan M. Mortensen.
- Physical description
- 1 online resource.
- Mortensen, Jonathan M.
- Musen, Mark A., primary advisor.
- Khatri, Purvesh, advisor.
- Noy, Natalya F., advisor.
- Stanford University. Program in Biomedical Informatics.
- Biomedicine and healthcare rely heavily on ontologies, with ontology development and use increasing rapidly in the domain. However, as the scale and complexity of these ontologies increase, so too do errors and engineering challenges. There are both automated and manual methods that provide ontology quality assurance and identify ontology errors. However, these methods do not readily scale as ontology size increases, and they do not necessarily identify the most salient errors. Recently, crowdsourcing has enabled solutions to complex problems that computers alone cannot solve. Crowdsourcing presents an opportunity to develop methods for ontology quality assurance that overcome the current limitations of scalability and applicability. Toward that end, the work described in this dissertation has the following aims: (1) to examine the effect of ontology errors in biomedical applications, (2) to develop and test a scalable framework for ontology verification via crowdsourcing that overcomes current ontology quality assurance method limitations, (3) to apply this framework to ontologies in use, and (4) to evaluate the methodology and its effect in the context of the biomedical domain. In the preliminary studies, I found that crowd workers perform best when answering questions about concrete (not abstract) ontology concepts, when presented with a simply stated natural language representation of an ontology axiom, and when provided textual definitions of ontology concepts. After completing these early studies, I refined and applied the crowd-based methodology to biomedical ontologies in use. On SNOMED CT, the crowd identified 39 errors in a set of 200 expert-verified relationships; it was indistinguishable from any single expert by inter-rater agreement, and it performed on par with any single expert when compared against the consensus standard that five subject-matter experts developed, with a mean AUC of 0.83.
On the Gene Ontology, a different set of subject-matter experts identified 16 errors, generally in relationships referencing acids and metals. The crowd performed poorly in identifying those errors, with an AUC ranging from 0.44 to 0.73, depending on the method's configuration. However, when the crowd verified what experts considered to be easy relationships with useful definitions, they performed reasonably well. The results of the crowd's performance in verifying SNOMED CT and GO suggest that the crowd can indeed assist with ontology engineering tasks and, rather than serving as a complete replacement, the crowd can serve as an assistant, helping experts with ontology verification by completing the easy tasks and allowing experts to focus on the difficult tasks.
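The evaluation above scores the crowd's verification judgments against an expert consensus standard using AUC. As an illustrative sketch only (not the dissertation's actual evaluation pipeline), AUC over binary error labels can be computed directly from the crowd's confidence scores via the rank-based (Mann-Whitney) formulation; the function name and data layout here are assumptions:

```python
def auc(labels, scores):
    """Rank-based (Mann-Whitney) AUC.

    labels: 1 if the consensus standard marks the relationship erroneous, else 0.
    scores: crowd confidence that the relationship is erroneous (e.g., the
            fraction of workers who flagged it).
    """
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    if not pos or not neg:
        raise ValueError("AUC requires at least one positive and one negative")
    # Count pairwise wins: 1 when a true error outranks a correct relationship,
    # 0.5 on ties.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# A crowd that always ranks true errors above correct relationships scores 1.0;
# uninformative (all-tied) scores give 0.5, i.e., chance performance.
print(auc([1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1]))  # → 1.0
print(auc([1, 0], [0.5, 0.5]))                  # → 0.5
```

An AUC of 0.83, as reported for SNOMED CT, thus means a randomly chosen true error receives a higher crowd confidence than a randomly chosen correct relationship about 83% of the time.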
- Submitted to the Program in Biomedical Informatics.
- Thesis (Ph.D.)--Stanford University, 2015.