An ontology, in the sense in which the term is used in informatics, is ``a representation of the shared background knowledge for a community. Very broadly, it is an implementable model of the entities that need to be understood in common in order for some group of software systems and their users to function and communicate at the level required for a set of tasks''.
An ontology describes the categories of objects described in a body of data, the relationships between those objects, and the relationships between those categories. In doing so, an ontology describes those objects and sometimes defines what needs to be known in order to recognise one of those objects within the information being processed by an application. An ontology should be distinguished from thesauri, classification schemes and other simple knowledge organisation systems. By controlling the labels given to the categories in an ontology, a controlled vocabulary can be delivered.
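As a minimal sketch of these ideas, the Python below models categories, a relationship between categories (one category being a kind of another), objects placed in categories, and a relationship between objects. Every name in it (`Category`, `Individual`, `Kinase`, `binds` and so on) is hypothetical and chosen only for illustration; none is drawn from a particular ontology or library.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Category:
    """A category of objects, carrying one controlled label."""
    label: str
    parents: list[Category] = field(default_factory=list)

    def ancestors(self) -> set[str]:
        """Labels of every category that this category is a kind of."""
        found: set[str] = set()
        stack = list(self.parents)
        while stack:
            cat = stack.pop()
            if cat.label not in found:
                found.add(cat.label)
                stack.extend(cat.parents)
        return found


@dataclass
class Individual:
    """An object described in the data, placed in one category."""
    name: str
    category: Category
    relations: list[tuple[str, Individual]] = field(default_factory=list)

    def is_a(self, label: str) -> bool:
        """True if the object belongs to the labelled category or a descendant of it."""
        return label == self.category.label or label in self.category.ancestors()


# Relationships between categories (hypothetical example hierarchy).
protein = Category("Protein")
enzyme = Category("Enzyme", parents=[protein])
kinase = Category("Kinase", parents=[enzyme])
small_molecule = Category("Small molecule")

# Objects and a relationship between objects.
atp = Individual("ATP", small_molecule)
my_kinase = Individual("example kinase", kinase)
my_kinase.relations.append(("binds", atp))

# Controlling the category labels yields a controlled vocabulary.
controlled_vocabulary = sorted(
    c.label for c in (protein, enzyme, kinase, small_molecule)
)
```

Here `my_kinase.is_a("Protein")` holds because of the relationships between categories, not because of anything stored on the object itself, which is one way an application could recognise an object as a member of a category.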
Ontology is a term that originates with Aristotle, in his Metaphysics, IV, 1, written in the fourth century BCE… The goal of ontology in this philosophical sense is to achieve a complete and true account of reality. Computer scientists have taken the term and somewhat redefined it, removing the more philosophical aspects and concentrating upon the notion of a shared understanding or specification of the concepts of interest in a domain of information, one that can be used by both computers and humans to describe and process that information. The goal of a computer science ontology is to make knowledge of a domain computationally useful. There is less concern with a true account of reality, as it is information that is being processed, not reality.
We live in a world of instances, individuals or objects. There are trees, flowers, the sky, stones, animals, etc. As well as these material objects, there are also immaterial objects, such as ideas, spaces, representations of real things, etc.
As human beings, we put these objects into categories or classes. These categories are a description of that which is described in a body of data. The categories themselves are a human conception. We live in a world of objects, but the categories into which humans put them are merely a way of describing the world; they do not themselves exist; they are a conceptualisation. The categories in an ontology are a representation of these concepts. The drive to categorise is not restricted to scientists; all human beings seem to indulge in the activity. If a community agrees upon which categories of objects exist in the world, then a shared understanding has been created.
In order to communicate about these categories, as we have already seen, we need to give them labels. A collection of labels for the categories of interest forms a vocabulary or lexicon. Human beings can give multiple labels to each of these categories. This habit of giving multiple labels to the same category (synonymy), and the converse of giving the same label to different categories (polysemy), leads to grave problems.
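The two labelling problems can be made concrete with a small sketch. Assuming a list of (label, category) pairs, the hypothetical function `index_labels` below reports categories that carry more than one label and labels that are attached to more than one category. The category names (`Platelet`, `CellNucleus`, `AtomicNucleus`) are invented identifiers used only for this illustration.

```python
from collections import defaultdict


def index_labels(pairs):
    """Find categories with several labels and labels with several categories."""
    labels_of = defaultdict(set)      # category -> labels given to it
    categories_of = defaultdict(set)  # label -> categories it is given to
    for label, category in pairs:
        labels_of[category].add(label)
        categories_of[label].add(category)

    # Several labels for one category.
    synonyms = {c: sorted(ls) for c, ls in labels_of.items() if len(ls) > 1}
    # One label for several categories.
    polysemes = {l: sorted(cs) for l, cs in categories_of.items() if len(cs) > 1}
    return synonyms, polysemes


# Hypothetical labellings, for illustration only.
pairs = [
    ("platelet", "Platelet"),
    ("thrombocyte", "Platelet"),
    ("nucleus", "CellNucleus"),
    ("nucleus", "AtomicNucleus"),
]

synonyms, polysemes = index_labels(pairs)
```

A search for ``platelet'' over data labelled ``thrombocyte'' would miss relevant records, while a search for ``nucleus'' would return records about two unrelated categories.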
As well as agreeing on the categories in which we will place the objects of interest described in our data, we can also agree upon the labels for these categories.
Due to human nature, the autonomous way in which these resources develop, the time span over which they develop, etc., the categories into which biologists put their objects and the labels used to describe those categories are highly heterogeneous. This heterogeneity makes the knowledge component of biological resources very difficult to use.
There is, therefore, a need to have a common understanding of the categories of objects described in biology's data and the labels used for those categories. In response to this need, biologists have begun to create ontologies that describe the biological world. The initial move came from computer scientists who used ontologies to create knowledge bases that described the domain with high fidelity; an example is EcoCyc. Ontologies were also used in projects such as TAMBIS to describe molecular biology and bioinformatics, in order to reconcile diverse information sources and allow the creation of rich queries over those resources. The explosion in activity came, however, in the post-genomic era with the advent of the Gene Ontology (GO). The GO describes the major functional attributes of gene products: molecular function, biological process and cellular component. Some twenty or more genomic resources now use GO to describe these aspects of the gene products of their respective organisms. Similarly, the Sequence Ontology describes sequence features; PATO (the Phenotype Attribute and Trait Ontology) describes the qualities necessary to describe an organism's phenotype. All these and more are part of the Open Biomedical Ontologies project.
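A rough sketch of why a shared ontology helps across resources: if two independent databases annotate their gene products with the same GO term identifiers, a single query can span both. In the Python below, the resources (`resource_A`, `resource_B`), the gene product names and the annotations themselves are all invented for illustration. The three term identifiers are intended to be genuine GO terms, but they should be checked against a current GO release before being relied upon.

```python
# GO term id -> (label, aspect). The aspect is one of GO's three branches.
GO_TERMS = {
    "GO:0016301": ("kinase activity", "molecular_function"),
    "GO:0006915": ("apoptotic process", "biological_process"),
    "GO:0005634": ("nucleus", "cellular_component"),
}

# Hypothetical annotations held by two independent resources.
annotations = {
    "resource_A": {"geneA1": ["GO:0016301", "GO:0005634"]},
    "resource_B": {"geneB7": ["GO:0016301"], "geneB9": ["GO:0006915"]},
}


def products_annotated_with(term_id):
    """Return (resource, gene product) pairs annotated with the given term."""
    hits = []
    for resource, products in annotations.items():
        for product, terms in products.items():
            if term_id in terms:
                hits.append((resource, product))
    return sorted(hits)


def terms_by_aspect(resource, product):
    """Group one gene product's annotations by GO aspect."""
    grouped = {}
    for term_id in annotations[resource][product]:
        label, aspect = GO_TERMS[term_id]
        grouped.setdefault(aspect, []).append(label)
    return grouped


kinases = products_annotated_with("GO:0016301")
```

Because both resources use the same identifier for the same category, `kinases` collects gene products from each of them without any reconciliation of local labels.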