Oct 31 2017
 

Information Architecture (IA) is an enabler for Big Data Analytics. You may be asking why I would say this, or how IA enables Big Data Analytics. We need to remember that Big Data includes all data (i.e., unstructured, semi-structured, and structured). The primary characteristics of Big Data (Volume, Velocity, and Variety) challenge your existing architecture and how you will effectively, efficiently, and economically process data to achieve operational efficiencies.

In order to derive the maximum benefit from Big Data, organizations must be able to handle the rapid delivery and extraction of huge volumes of data of varying types. This data can then be integrated with the organization’s enterprise data and analyzed. Information Architecture provides the methods and tools for organizing, labeling, building relationships (through associations), and describing (through metadata) your unstructured content, adding this source to your overall pool of Big Data. In addition, information architecture enables rapid exploration and analysis of any combination of structured, semi-structured, and unstructured sources. Big Data requires information architecture to exploit relationships and synergies between your data. This infrastructure enables organizations to make decisions utilizing the full spectrum of their Big Data sources.

Information Architecture Elements and the Big Data Components (Volume, Velocity, Variety)

Content Consumption
  • Volume: Provides an understanding of the universe of relevant content by performing a content audit. This contributes directly to the volume of available content.
  • Velocity: Contributes directly to the speed at which content is accessed by establishing the initial volume of available content.
  • Variety: Identifies the initial variety of content that will be a part of the organization’s Big Data resources.

Content Generation
  • Volume: Fills gaps identified in the content audit by gathering the requirements for content creation/generation, which contributes directly to increasing the amount of content available in the organization’s Big Data resources.
  • Velocity: Directly affects the speed at which content is accessed, because volumes are increasing.
  • Variety: Contributes to the creation of a variety of content (documents, spreadsheets, images, video, voice) to fill identified gaps.

Content Organization
  • Volume: Provides business rules to identify relationships between content and creates a metadata schema to assign content characteristics to all content. This contributes to increasing the volume of data available and, in some ways, leverages existing data to assign metadata values.
  • Velocity: Contributes directly to improving the speed at which content is accessed by applying metadata, which in turn gives context to the content.
  • Variety: The variety of Big Data will often drive the relationships and organization between the various types of content.

Content Access
  • Volume: Content Access is about search and establishing the standard types of search (i.e., keyword, guided, and faceted). This contributes to the volume of data, because establishing the search parameters often adds metadata fields and values to enhance search.
  • Velocity: Contributes to the ability to access content and to the speed and efficiency with which content is accessed.
  • Variety: Contributes to how the variety of content is accessed. The variety of Big Data will often drive the search parameters used to access the various types of content.

Content Governance
  • Volume: The focus here is on establishing accountability for the accuracy, consistency, and timeliness of content, content relationships, metadata, and taxonomy within areas of the enterprise and the applications that are being used. Content Governance will often “prune” the volume of content available in the organization’s Big Data resources by allowing access only to pertinent/relevant content, while either deleting or archiving other content.
  • Velocity: When the volume of content available in the organization’s Big Data resources is trimmed through Content Governance, velocity improves because a smaller, more pertinent universe of content is made available.
  • Variety: When the volume of content available in the organization’s Big Data resources is trimmed through Content Governance, the variety of content available may be affected as well.

Content Quality of Service
  • Volume: Content Quality of Service focuses on the security, availability, scalability, and usefulness of content, and improves the overall quality of the volume of content in the organization’s Big Data resources by:
      – defending content from unauthorized access, use, disclosure, disruption, modification, perusal, inspection, recording, or destruction
      – eliminating or minimizing disruptions from planned system downtime
      – making sure that the content that is accessed is from and/or based on the authoritative or trusted source, reviewed on a regular basis (based on the specific governance policies), modified when needed, and archived when it becomes obsolete
      – enabling the content to behave the same no matter what application/tool implements it, and keeping it flexible enough to be used at an enterprise level as well as a local level without changing its meaning, intent of use, and/or function
      – tailoring the content to the specific audience and ensuring that the content serves a distinct purpose, is helpful to its audience, and is practical
  • Velocity: Content Quality of Service will eliminate or minimize delays and latency in your content and business processes by speeding up analysis and decision making, directly affecting the content’s velocity.
  • Variety: Content Quality of Service will improve the overall quality of the variety of content in the organization’s Big Data resources through aspects of security, availability, scalability, and usefulness of content.

The table above aligns key information architecture elements to the primary components of Big Data. This alignment provides a consistent structure for effectively applying analytics to your pool of Big Data. The Information Architecture Elements are Content Consumption, Content Generation, Content Organization, Content Access, Content Governance, and Content Quality of Service. It is this framework that will align all of your data so that business value can be gained from your Big Data resources.

Note: This table originally appeared in the book Knowledge Management in Practice (ISBN: 978-1-4665-6252-3) by Anthony J. Rhem.
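
To make the Content Organization and Content Access rows a little more concrete, here is a minimal Python sketch of content items that carry metadata, with keyword and faceted search over them. The class name, function names, metadata fields, and sample items are all assumptions made for illustration; they do not come from the book or from any particular search product.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ContentItem:
    """A piece of content plus the metadata assigned during Content Organization."""
    title: str
    metadata: dict[str, str] = field(default_factory=dict)


def keyword_search(items, term):
    """Return items whose title contains the term (case-insensitive)."""
    term = term.lower()
    return [item for item in items if term in item.title.lower()]


def facet_filter(items, **facets):
    """Return items whose metadata matches every requested facet value."""
    return [
        item for item in items
        if all(item.metadata.get(key) == value for key, value in facets.items())
    ]


def facet_counts(items, facet):
    """Count how many items carry each value of one metadata field."""
    return Counter(item.metadata[facet] for item in items if facet in item.metadata)


# Hypothetical sample content covering a variety of types.
items = [
    ContentItem("Q3 sales report", {"type": "spreadsheet", "department": "sales"}),
    ContentItem("Sales training video", {"type": "video", "department": "sales"}),
    ContentItem("Security policy", {"type": "document", "department": "it"}),
]

sales_hits = keyword_search(items, "sales")           # keyword search
sales_videos = facet_filter(sales_hits, type="video")  # narrowed by a facet
by_type = facet_counts(items, "type")                  # counts for a facet menu
```

The point of the sketch is only that the metadata applied during organization is what the search layer later filters and counts on; a real deployment would use a search engine rather than in-memory lists.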

Mar 31 2017
 

There are approximately 22,000 new cases of lung cancer each year, with an overall 5-year survival rate of only ~18 percent (American Cancer Society). The economic burden of lung cancer, based on per-patient cost alone, is estimated at $46,000 per patient (lung cancer journal). Treatment with drugs and chemotherapy is effective for some; however, more effective treatment has been hampered by the inability of clinicians to better target treatments to patients. Big Data may hold the key to giving clinicians the ability to develop more effective patient-centered cancer treatments.

Analysis of Big Data may also improve drug development by allowing researchers to better target novel treatments to patient populations. Providing the ability for clinicians to harness Big Data repositories to develop better targeted lung cancer treatments and to enhance the decision-making process to improve patient care can only be accomplished through the use of cognitive computing. However, having a source or sources of data available to “mine” for answers to improve lung cancer treatments is a challenge!

There is also a lack of available applications that can take advantage of Big Data repositories to recognize patterns of knowledge and extract that knowledge in any meaningful way. The extraction of knowledge must be presented in a way that researchers can use to improve patient centric diagnosis and the development of patient centric treatments. Having the ability to use cognitive computing and KM methods to uncover knowledge from large cancer repositories will provide researchers in hospitals, universities, and pharmaceutical companies with the ability to use Big Data to identify anomalies, discover new treatment combinations and enhance diagnostic decision making.

Content Curation

An important aspect of cognitive computing and Big Data is the ability to perform a measure of content curation. The lung cancer Big Data environment to be analyzed should include both structured and unstructured data (unstructured being documents, spreadsheets, images, video, etc.). Before data can be ingested from the Big Data resource, it needs to be prepared. This data preparation includes applying Information Architecture (IA) to the unstructured data within the repository. Understanding the organization and classification schemes relating to the data, both structured and unstructured, is essential to unifying the data into one consistent ontology.

Are We Up for the Challenge?

Even if a Big Data source were available and content curation were successful, the vast amount of patient data is governed by HIPAA, which makes it difficult for researchers to gain access to clinical and genomic data shared across multiple institutions or firms, including research institutions and hospitals. According to Dr. Tom Coburn in his January 14th Wall Street Journal article, “A Cancer ‘Moonshot’ Needs Big Data”, gaining access to a big data repository inclusive of all patient-specific data is essential to offering patient-centered cancer treatments. Besides the technology challenges, there are data and regulation challenges. I’m sure that many of these challenges are being addressed. Thus far, there have been no solutions. Are we up for the challenge? Big Data analysis could help tell us which cancer patients are most likely to be cured with standard approaches, and which need more aggressive treatment and monitoring. It is time we solve these challenges to make a moonshot a certain reality!

Jun 30 2016
 

Content modeling is a powerful tool for fostering communication and alignment between User Experience (UX) design, editorial, and technical resources on an Information Architecture effort. By clearly defining the content domains, content types, content attributes (metadata), and relationships, we can make sure that the envisioned content strategy becomes a reality for the content creators.

The Content Model is a logical depiction of what an organization knows about things of interest to the business; it shows graphically how they relate to each other in an entity relationship (ER) diagram or class diagram. An entity relationship diagram is an abstract conceptual representation of structured data. It uses standard symbols to denote the things of interest to the business (entities), the relationships between entities, and the cardinality and optionality of those relationships. The Content Model contains detailed characteristics of the content types or concepts, their attributes or properties, and their definitions. It is the result of detailed analysis of the business requirements.

When starting a content modeling effort, it is important to begin with a high-level (conceptual) content model. The conceptual content model is the first output from content modeling. After some initial work identifying, naming, and agreeing on which content domains and content types are important within your problem domain, you are ready to structure them together into a conceptual content model.

It is essential that content strategists, information architects, and business stakeholders engage with content modeling early in the process. These are the people best positioned to find and classify content types that make sense for the business. They bring an understanding of why content needs to be structured, named, and related in a certain way. In addition, the business subject matter experts bring knowledge of the rules about content that drive the naming of content types and the relationships between them.

Finding Content Types

Content types live in existing web sites, customer call centers (call logs), product documentation, and communications, as inputs and outputs of processes and functions, and in the minds of the people performing various tasks. The mission is to find, document, and define them. Here are other reasons to make something a separate type of content:

  1. Distinct, reusable elements. You might decide to create an Author content type that contains the name, bio and photo of each author. These can then be associated with any piece of content that person writes.
  2. Functional requirements. A Video might be a different type of content because the presentation layer needs to be prepared to invoke the video player.
  3. Organizational requirements. A Press Release may be very similar to a general Content Page, but only the Press Release is going to appear in an automatically aggregated Newsroom. It’s easier for these to be filtered out if they’re a unique type of content.
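
The three reasons above can be sketched in a few lines of Python. This is only an illustration under assumed names: the classes, fields, and the `newsroom` helper are hypothetical and are not tied to any particular CMS.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Author:
    """Distinct, reusable element: defined once, linked from many pieces of content."""
    name: str
    bio: str
    photo_url: str


@dataclass
class ContentPage:
    """General-purpose page; the relationship to an Author is optional."""
    title: str
    body: str
    author: Optional[Author] = None


@dataclass
class PressRelease(ContentPage):
    """Nearly the same shape as ContentPage, but a separate type so it can be aggregated."""
    release_date: str = ""


@dataclass
class Video(ContentPage):
    """Separate type because the presentation layer must invoke a video player."""
    stream_url: str = ""


def newsroom(content):
    """Organizational requirement: pick out only press releases for the Newsroom."""
    return [item for item in content if isinstance(item, PressRelease)]


def needs_video_player(item):
    """Functional requirement: the presentation layer checks the content type."""
    return isinstance(item, Video)


# Hypothetical sample site content.
jane = Author("Jane Doe", "Staff writer", "https://example.com/jane.jpg")
site = [
    ContentPage("About us", "...", author=jane),
    PressRelease("New product launch", "...", author=jane, release_date="2016-06-30"),
    Video("Product demo", "...", stream_url="https://example.com/demo.mp4"),
]
newsroom_titles = [item.title for item in newsroom(site)]
```

Because `Author` is its own type, the same author object is associated with both the page and the press release rather than being copied into each.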

Content models progress along a continuum of constant refinement. There are three important stages in the content modeling lifecycle:

  • Conceptual: The initial content model which aims to capture the content domains, content types and high level relationships between content types.
  • Design: Adds the descriptive elements (metadata) to each content type and further refines the structural relationships between them.
  • Implementation: Models the content within the context of the target technology, e.g. CMS, Search Engines, Semantic Tools, etc.

Remember Content is KING!

Apr 18 2015
 

A lot has been said about the next big movement … the Internet of Things (IoT). Simply put, IoT is a massive network of connected devices and/or objects (which also includes people). The relationships will be people-to-people, people-to-devices, and devices-to-devices. These devices will have network connectivity, allowing them to send and receive data.

The IoT will lead to Smart Grids and Smart Cities, and Information Architecture (IA) will enable a “smart architecture” for delivering content in the right context to the right device (or object)!

So where does IA come into this scenario?

IA is all about connecting people to content (information and knowledge) and it is this ability that is at the core of enabling a myriad of devices and/or objects to connect and to send and receive content. It delivers that “smart architecture”.

The larger amounts of data brought in through the internet need a viable and clear information architecture to deliver consistency to a wide variety of devices. IA offers a viable option in which content (information and knowledge) can be represented in a flexible, object-oriented fashion. However, any option used for representing content would have to define the “base” structure for all human content, everywhere. This, of course, is impossible.

It’s impossible because we simply cannot comprehend the extent of all content that is or will be available. This fully flexible object-oriented structure will need to be built similarly to how the human genome project scientists map DNA. This will allow the structure to continue to evolve and grow, which will continue to enable the delivery of content to devices and objects as they become connected to the internet.
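
As a rough illustration of a structure that can keep evolving, the Python sketch below keeps content attributes open-ended and lets new kinds of devices be registered after the fact. Everything here (the `ContentObject` class, the `RENDERERS` registry, the device names, and the sample alert) is a hypothetical example, not a description of any existing IoT platform.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ContentObject:
    """Content with an open-ended attribute set, so new properties can be added later."""
    kind: str
    attributes: dict[str, str] = field(default_factory=dict)


# Hypothetical registry: device kind -> function that shapes content for that device.
RENDERERS: dict[str, Callable[[ContentObject], str]] = {}


def register_device(device_kind):
    """Decorator that adds a renderer when a new kind of device becomes connected."""
    def decorator(func):
        RENDERERS[device_kind] = func
        return func
    return decorator


def deliver(content, device_kind):
    """Deliver content in the form suited to the device, falling back to plain text."""
    renderer = RENDERERS.get(device_kind, lambda c: c.attributes.get("text", ""))
    return renderer(content)


@register_device("phone")
def render_phone(content):
    # A phone has room for the full text.
    return content.attributes.get("text", "")


@register_device("smart_display")
def render_smart_display(content):
    # A small screen gets only the first 20 characters.
    return content.attributes.get("text", "")[:20]


alert = ContentObject("alert", {"text": "Water main closed on Elm Street until 5 pm"})
phone_view = deliver(alert, "phone")
display_view = deliver(alert, "smart_display")
unknown_view = deliver(alert, "thermostat")  # not registered yet: plain-text fallback
```

Nothing about the structure has to be known up front: a new device type is one more registered function, and a new content property is one more attribute key.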