In many areas of computer science, one faces the task of representing and communicating knowledge: for example, about facts, circumstances, or rules in a technical application area, a business process, or a legal procedure, or about the contents of documents or Web pages.
Humans can make use of stored knowledge by drawing on their basic and contextual knowledge of the respective field, using textbooks, rulebooks, encyclopedias, and keyword indexes, and linking these to the stored content. If, on the other hand, automata are to perform search, communication, and decision tasks on the stored knowledge, or to exchange data that themselves carry information about how they are to be structured and interpreted (so-called metadata), they require a representation of the underlying concepts and their interrelations. For this purpose, the term ontology has become common in several branches of computer science in recent years.
Probably the best-known definition comes from T. Gruber, who calls an ontology an explicit formal specification of a shared conceptualization.
In this sense, an ontology describes a knowledge domain by means of a standardized terminology together with relations and, where necessary, derivation rules between the concepts defined in it. The common vocabulary is usually given in the form of a taxonomy whose initial elements (modeling primitives) are classes, relations, functions, and axioms. Since there are many fields of knowledge, each with its own (or even several competing) terminologies, the use of the plural ("ontologies") also makes sense here, in contrast to philosophy.
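To make the modeling primitives concrete, here is a minimal sketch of a toy ontology in plain Python. All class and relation names (Invoice, BusinessDocument, issuedBy, and so on) are invented examples, not taken from any standard ontology:

```python
# Classes (concepts) arranged in a taxonomy via is-a (subclass) links.
# All names here are hypothetical illustration only.
subclass_of = {
    "Invoice": "BusinessDocument",
    "Contract": "BusinessDocument",
    "BusinessDocument": "Document",
}

# Relations between concepts: (domain class, relation name) -> range class
relations = {
    ("Invoice", "issuedBy"): "Organization",
    ("Contract", "signedBy"): "Organization",
}

# A simple axiom: the is-a relation is transitive.
def is_a(cls, ancestor):
    """True if cls is (transitively) a subclass of ancestor."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = subclass_of.get(cls)
    return False

print(is_a("Invoice", "Document"))  # True
```

The taxonomy carries the vocabulary; the `is_a` function encodes one axiom that any program sharing this ontology can rely on without further negotiation.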
In addition to being assigned to areas of application, ontologies can also be classified according to their scope. Thus Ron Weber (following Guarino) distinguishes three levels of ontologies: (1) general, cross-domain "top-level ontologies"; (2) "domain ontologies" related to specific application areas; (3) well-known conceptual data and class models that are merely dressed up with the fashionable name "ontology". In the following, only levels (1) and (2) are considered.
What are ontologies used for in computer science? Gruninger and Lee distinguish three fields of application: communication, automatic reasoning, and the representation and reuse of knowledge. If two programs (e.g., web search engines or software agents) are to communicate with each other, they must either carry the interpretation rules for their data themselves (making them data-dependent) or obtain them as metadata from an ontology accessible to both. In automatic reasoning, programs can draw logical inferences directly from the derivation rules defined in the ontology, so these rules need not be communicated anew each time. The situation is similar for knowledge representation and reuse (cf. the detailed presentation by Staab).
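The reasoning idea can be illustrated with a small, hypothetical sketch: two programs share the same rule base (the "ontology"), so only the facts need to be exchanged, and each side derives the consequences locally. The rule and fact names below are invented placeholders:

```python
# Shared derivation rules (part of the ontology): each rule maps a set
# of premise facts to one derivable fact. All names are hypothetical.
rules = [
    ({"is_invoice(d)"}, "is_business_document(d)"),
    ({"is_business_document(d)"}, "is_document(d)"),
]

def infer(facts):
    """Naive forward chaining: apply rules until a fixpoint is reached."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if body <= derived and head not in derived:
                derived.add(head)
                changed = True
    return derived

# A communicated fact plus the shared rules yields derived knowledge
# that never had to be transmitted:
print(infer({"is_invoice(d)"}))  # includes "is_document(d)"
```

This is of course only a caricature of a reasoner, but it shows why shared rules reduce what must be communicated.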
Ontologies are thus important in all areas of computer science concerned with knowledge – such as artificial intelligence, databases, and information systems (in the broadest sense, including the global information system WWW). In addition, there are related areas such as software engineering and multimedia communication, as well as application areas such as medicine, law, and business informatics.
The term ontology
The term "ontology" gives rise to further terms such as "ontology design" and "ontological engineering". Both make sense only in the computer scientists' understanding; in a philosophical interpretation they could at most denote the workings of a demiurge, i.e., a metaphysical higher being. Ontological engineering comprises, analogously to software engineering, everything that serves to support the ontology life cycle. Ontology design can basically proceed either inductively (forming larger ontologies by merging several small "lightweight" ones) or deductively (definition of general concepts and rules by a committee or consortium, followed by review, standardization, and subsequent specialization for subdomains).
The value of an ontology stands or falls with the degree of recognition and approval ("ontological commitment") it receives in the relevant professional community. In general, the more decision makers and stakeholders are involved in the design process, the easier it is to achieve this approval; on the other hand, the effort usually grows with the number of people involved.
Ontologies have recently attracted special worldwide interest through the Semantic Web initiative of WWW creator Tim Berners-Lee and his colleagues. Its basic idea is to equip Web documents (of any size) with "semantics" in the form of metadata ("tags") that describe their content in more detail, and to link them by inference rules. This is intended to help search engines and other electronic mechanisms, such as agents, find and combine the required information in a targeted and efficient manner. Ontologies provide the necessary basis of metadata and inference rules; for example, two agents can then communicate about their tasks and results via an ontology available to both.
A number of languages, methods, and tools for developing and testing ontologies have emerged or been made available in recent years. In the context of the Semantic Web approach, XML (Extensible Markup Language) and RDF (Resource Description Framework) are particularly noteworthy: XML for annotating and structurally describing data and documents, RDF for describing resources by properties and assigning values to them, including references to other resources. This approach builds on the well-known idea of understanding semantic networks as graphs.
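The RDF idea of resources described by properties can be sketched as a set of (subject, predicate, object) triples in which objects may themselves be resources, that is, a labeled graph. The URIs below are invented placeholders; only the prefixes dc: (Dublin Core) and foaf: (FOAF) allude to real vocabularies:

```python
# RDF-style triples: (subject resource, property, value-or-resource).
# All URIs are hypothetical examples.
triples = [
    ("http://example.org/page1", "dc:creator", "http://example.org/alice"),
    ("http://example.org/page1", "dc:title", "Ontology Primer"),
    ("http://example.org/alice", "foaf:name", "Alice"),
]

def properties_of(resource):
    """Collect all property/value pairs describing one resource."""
    return {p: o for s, p, o in triples if s == resource}

# Following a property value that is itself a resource traverses the graph:
creator = properties_of("http://example.org/page1")["dc:creator"]
print(properties_of(creator))  # {'foaf:name': 'Alice'}
```

In practice one would use an RDF library and serialize such a graph in RDF/XML or Turtle; the point here is only the graph model underlying RDF.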