The need to classify and categorize corporate information has never been greater. The proliferation of information channels, sources and delivery platforms makes managing information a complex business challenge. The key to managing information is to develop a way to identify, classify and categorize enterprise information. This categorization allows for effective management of content throughout the information lifecycle: capture, storage, retrieval, archival and disposal. The most common way to give information structure is to develop and implement a taxonomy. More often than not, multiple taxonomies are created in order to understand the relationships between the information in a particular domain and across multiple domains.
What is Taxonomy?
A taxonomy is a way to classify and assign a structure to information. The structure can consist of many levels and sublevels, referred to as nodes and sub-nodes, each aligned with a specific type or category of information. It is not surprising that we all work with taxonomies in our daily lives. On our computers and mobile devices we organize our icons and create folder structures to store and locate our content quickly, and we rely on taxonomies whenever we shop on the internet or search online libraries.
Developing a corporate taxonomy requires a well-thought-out approach that addresses not only how information will be categorized but, more importantly, how information units relate to each other and how they will be accessed and retrieved at various points in time. There are several types of taxonomies, including:
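The node and sub-node structure described above can be sketched as a small tree. This is only an illustrative sketch: the class name TaxonomyNode, its methods, and the category labels are assumptions made for the example, not part of any particular taxonomy tool.

```python
from dataclasses import dataclass, field


@dataclass
class TaxonomyNode:
    """One category in the taxonomy; children are its sub-nodes."""
    label: str
    children: list["TaxonomyNode"] = field(default_factory=list)

    def add(self, label: str) -> "TaxonomyNode":
        """Create a sub-node under this node and return it."""
        child = TaxonomyNode(label)
        self.children.append(child)
        return child

    def paths(self, prefix: tuple[str, ...] = ()) -> list[tuple[str, ...]]:
        """Return the path from the root to every node, depth first."""
        here = prefix + (self.label,)
        result = [here]
        for child in self.children:
            result.extend(child.paths(here))
        return result


# Hypothetical hierarchy, similar to a folder structure on a computer.
root = TaxonomyNode("Documents")
finance = root.add("Finance")
finance.add("Invoices")
finance.add("Budgets")
root.add("Policies")

for path in root.paths():
    print(" > ".join(path))
```

Each printed line is the full path to one node, which is the same information a folder structure or a site's category menu exposes.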
Functional Taxonomy: attempts to represent the business model and organizes information around the services and/or functions the company performs.
Organizational Taxonomy: mirrors departmental functions and is therefore aligned operationally by department, such as Marketing, Accounts Payable, Procurement and so forth.
Topic-based Taxonomy: attempts to categorize and label content by its nature. Examples include financial, policies and procedures, images, contracts, and applications.
Ontologies are explicit formal specifications of the terms in a domain and the relations among them. An ontology defines a common vocabulary for researchers who need to share information in a domain. It includes machine-interpretable definitions of basic concepts in the domain and the relationships among them. Ontologies serve as the basis for knowledge graphs, which in turn provide the foundation for gathering insights from data.
Ontologies have four major uses:
- A shared common understanding of the structure of information among people
- An enabler to reuse domain knowledge and make domain assumptions explicit
- A way to analyze domain knowledge (especially through the use of knowledge graphs)
- A way to institute semantic search through ontological structures
Ontologies provide the declarative specification of the terms, which serves as the mechanism to analyze domain knowledge. The formal analysis of terms is extremely valuable when reusing existing ontologies and extending them.
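To make the idea of terms and relations concrete, a very small ontology can be written as a list of subject, relation, object statements and then queried. This is a minimal sketch, not a formal ontology language: the terms, the relation names (is_a, issued_by and so on) and the helper functions are all hypothetical.

```python
from collections import defaultdict

# Each statement is a (subject, relation, object) triple.
# All terms and relation names are made up for illustration.
TRIPLES = [
    ("Invoice", "is_a", "FinancialDocument"),
    ("FinancialDocument", "is_a", "Document"),
    ("Invoice", "issued_by", "Supplier"),
    ("Invoice", "approved_by", "AccountsPayable"),
    ("Contract", "is_a", "Document"),
    ("Contract", "signed_by", "Supplier"),
]


def build_index(triples):
    """Group the outgoing (relation, object) pairs by subject."""
    index = defaultdict(list)
    for subject, relation, obj in triples:
        index[subject].append((relation, obj))
    return index


def ancestors(term, index):
    """Follow is_a links upward and collect every broader term."""
    found = []
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for relation, obj in index.get(current, []):
            if relation == "is_a" and obj not in found:
                found.append(obj)
                frontier.append(obj)
    return found


def related_to(target, triples):
    """Return every (subject, relation) pair that points at target."""
    return [(s, r) for s, r, o in triples if o == target]


index = build_index(TRIPLES)
print(ancestors("Invoice", index))
print(related_to("Supplier", TRIPLES))
```

Because the relations are explicit, the same statements answer both a hierarchical question (what kind of thing is an invoice?) and a non-hierarchical one (what is connected to a supplier?), which is the kind of analysis a knowledge graph builds on.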
A content model is a representation of the various types of content in a domain and any relationships between them. A content model documents all the different types of content (e.g., PDF, Word, Excel, PowerPoint, web page, graphic), including detailed definitions of each content type’s elements (metadata). The level of detail in the model is determined by the purposes you need it to serve.
A content model provides the framework for organizing content for reuse, search and retrieval (findability). It also enables content to be labeled and to be structured for authoring (via templates, style guides and style sheets).
Content types are determined by distinct, reusable elements. Defining a content type is necessary if the content needs to be associated with any other piece of content. Functional requirements determine the “what” and “why” of a particular content type. Organizational requirements determine how the content should be organized, using the content model to assist in the decision.
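A content type and its elements (metadata) can be sketched as a simple data structure. The example below is an assumption-laden illustration: the class names Element and ContentType, the "Policy" type, and its fields are hypothetical and do not come from any specific content management system.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Element:
    """One metadata element of a content type."""
    name: str
    data_type: str
    required: bool = False


@dataclass
class ContentType:
    """A content type: its elements plus the types it relates to."""
    name: str
    elements: list[Element] = field(default_factory=list)
    related_types: list[str] = field(default_factory=list)

    def missing_required(self, metadata: dict) -> list[str]:
        """Names of required elements that the given metadata lacks."""
        return [
            e.name
            for e in self.elements
            if e.required and e.name not in metadata
        ]


# Hypothetical content type with three elements and one relationship.
policy = ContentType(
    name="Policy",
    elements=[
        Element("title", "text", required=True),
        Element("owner", "text", required=True),
        Element("review_date", "date"),
    ],
    related_types=["Procedure"],
)

print(policy.missing_required({"title": "Travel Policy"}))
```

Keeping the elements as reusable objects means the same element definition can be shared across content types, and a check such as missing_required gives authors immediate feedback on incomplete metadata.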
Content Model Lifecycle Stages
- Conceptual: The initial content model aims to capture the names and high level relationships between content types.
- Design: Adds the descriptive elements to each content type and further refines the structural relationships between them.
- Implementation: Models the content within the context of the target technology, e.g. CMS, KMS, Search Engines, Semantic Tools, etc.
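The three stages above can be read as successive refinements of the same model. The sketch below is one hypothetical way to picture that progression; the type names, field names and the shape of the final structure are assumptions, and a real CMS or search engine would define its own schema format.

```python
import json

# Conceptual stage: only type names and high-level relationships.
conceptual = {
    "types": ["Policy", "Procedure"],
    "relationships": [("Procedure", "implements", "Policy")],
}

# Design stage: descriptive elements added to each content type.
design = {
    "Policy": {"title": "text", "owner": "text"},
    "Procedure": {"title": "text", "steps": "list"},
}


def to_implementation(conceptual, design):
    """Implementation stage: merge both into one loadable structure."""
    schema = {}
    for name in conceptual["types"]:
        schema[name] = {"fields": design.get(name, {}), "links": []}
    for source, relation, target in conceptual["relationships"]:
        schema[source]["links"].append(
            {"relation": relation, "target": target}
        )
    return schema


schema = to_implementation(conceptual, design)
print(json.dumps(schema, indent=2))
```

The point of the separation is that the conceptual and design stages stay independent of the target technology; only the last function would change if the model were deployed to a different system.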
Relationship between Taxonomy, Ontology and Content Model
A content model, like an ontology, is a representation of various types of “things” in a domain and any relationships between them. In a content model the things are artifacts such as PDF, Word, Excel, PowerPoint, web page and graphic; in an ontology they are formal specifications of terms. An ontology can be considered another way to classify content (like a taxonomy), one that allows you to relate content based on the information in it as opposed to a term describing it. A taxonomy formalizes the hierarchical relationships among concepts. A taxonomy is often seen as a precursor to an ontology, as the basis for an ontology, or as a less complex ontology.
In the order of detail (from least detail to most) I see (1) taxonomy; (2) ontology; and (3) content model. The content model provides the most detail because it includes metadata. However, in the order of complexity (from least complex to most) I see (1) taxonomy; (2) content model; and (3) ontology. An ontology can become quite complex given the potential number of layers within it. This provides a brief comparison of taxonomies, ontologies and content models. In practice it is good to start by building a taxonomy before moving forward with an ontology. Content modeling utilizes concepts of both taxonomies and ontologies in its development.
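One way to picture a taxonomy as the simpler, hierarchical part of an ontology is to keep only the hierarchical relation from a set of ontology statements. This is a rough sketch under that assumption; the statements, the is_a relation name and the taxonomy_view helper are hypothetical.

```python
# Hypothetical ontology statements: (subject, relation, object).
ONTOLOGY = [
    ("Invoice", "is_a", "FinancialDocument"),
    ("Budget", "is_a", "FinancialDocument"),
    ("FinancialDocument", "is_a", "Document"),
    ("Invoice", "issued_by", "Supplier"),
    ("Budget", "owned_by", "Department"),
]


def taxonomy_view(triples, hierarchical="is_a"):
    """Keep only the hierarchical relation; return parent -> children."""
    tree = {}
    for child, relation, parent in triples:
        if relation == hierarchical:
            tree.setdefault(parent, []).append(child)
    return tree


print(taxonomy_view(ONTOLOGY))
```

The non-hierarchical statements (issued_by, owned_by) are exactly what the taxonomy view leaves out, which is why the ontology carries more complexity than the taxonomy built from the same terms.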