XML Information Management

XML (Extensible Markup Language) is a W3C standard that is widely adopted for managing information in messaging, service description, process definition, and single-sourced documentation. XML has a wide variety of standard, proprietary, and public-domain dialects, each governed by a schema. XML is not a markup language in its own right; rather, it is a simple set of grammatical rules for defining more specialized grammars (markup languages), and hence is considered a meta-language. The main technologies associated with XML for information management include:

  • - XSD (XML Schema Definition language), a W3C standard
  • - XSLT (Extensible Stylesheet Language Transformations), a W3C standard
  • - XHTML 1.1 (eXtensible HTML), a W3C standard, now giving way to HTML5
  • - CSS 3 (Cascading Style Sheets), a W3C standard for styling (X)HTML
  • - XSL-FO (XSL Formatting Objects), a W3C standard for formatting paginated output such as PDF documents
  • - DocBook, an OASIS standard XML language for authoring conventional (linear) technical books
  • - DITA, an OASIS standard XML language for structured topic-based authoring

DITA Information Management

Information management with DITA involves collaborative authoring of topics, managed as an integral part of enterprise content management (ECM) systems, within a controlled-vocabulary environment. Vocabulary can be controlled by developing taxonomies and specializations that limit or extend the use of the standard DITA info types. In a managed information development framework, controlling authoring vocabularies, authoring topics, and styling and publishing maps are separate functions.
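To illustrate how a map stays separate from the topics it organizes, the following is a minimal Python sketch that reads a small DITA map and lists the referenced topics in publishing order. The map content, the file names, and the helper name outline are hypothetical; a real publishing toolchain does considerably more than this.

```python
import xml.etree.ElementTree as ET

# Hypothetical map: the title and file names are illustrative only.
MAP = """<map>
  <title>User Guide</title>
  <topicref href="overview.dita">
    <topicref href="install.dita"/>
    <topicref href="configure.dita"/>
  </topicref>
  <topicref href="troubleshooting.dita"/>
</map>"""


def outline(element, depth=0):
    """Return (depth, href) pairs for nested topicref elements, in document order."""
    entries = []
    for ref in element.findall("topicref"):
        entries.append((depth, ref.get("href")))
        entries.extend(outline(ref, depth + 1))
    return entries


toc = outline(ET.fromstring(MAP))
for depth, href in toc:
    print("  " * depth + href)
```

The map carries only structure and references, so the same topics could be arranged differently in another map without being edited.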

XML/DITA Information Design

Information Design refers to the activities of modeling, authoring, publishing, and evaluating the usability of information. Modeling information involves developing taxonomies, info types (vocabularies), and grammars that govern the occurrence of these info types; the last two are defined in a schema. It is recommended to use a standard schema, such as DITA, and to develop specializations when necessary, rather than developing a schema from scratch, in order to take advantage of the wealth of authoring, collaboration, and publishing tools that support the standardized info types. Authoring involves developing minimalist content within the best-fitting info types, according to pre-defined practices, to maximize the uniformity and re-usability of information. Publishing involves applying default or customized style sheets to information topics or maps to generate renditions in various standard formats, such as (X)HTML, WebHelp, PDF, and EPUB. Variables may be defined to allow certain content parts to be published, or styled, conditionally, depending on the values of these variables. Usability evaluation involves the qualitative assessment of information readability, uniformity, organization, and usefulness through expert analysis and readership feedback.
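Conditional publishing can be sketched as a filter over attribute values. The Python example below is a rough illustration, not how any particular DITA processor is implemented: the sample topic, the audience values, and the helper name filter_topic are assumptions made for this sketch. It follows the common DITA convention that an element is dropped only when every value in the filtering attribute is excluded.

```python
import xml.etree.ElementTree as ET

# Hypothetical topic; the audience values are illustrative only.
TOPIC = """<topic id="install">
  <title>Installing the product</title>
  <body>
    <p>Download the installer.</p>
    <p audience="admin">Run the installer with elevated rights.</p>
    <p audience="novice expert">Follow the on-screen prompts.</p>
  </body>
</topic>"""


def filter_topic(root, attribute, excluded):
    """Remove each element whose `attribute` values are all in `excluded`."""
    for parent in list(root.iter()):
        for child in list(parent):
            values = child.get(attribute, "").split()
            if values and all(value in excluded for value in values):
                parent.remove(child)
    return root


filtered = filter_topic(ET.fromstring(TOPIC), "audience", {"admin", "novice"})
remaining = [p.text for p in filtered.iter("p")]
print(remaining)
```

Here the second paragraph is removed because its only audience value is excluded, while the third is kept because "expert" is still allowed.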

Structured Documentation with DITA

DITA offers the following benefits for structured technical and business documentation:

  • - Topic-based authoring promotes minimalism (conciseness)
  • - Controlled vocabulary authoring promotes uniformity throughout an organization
  • - Information re-use by reference, at topic, element, or even phrase levels
  • - Single sourcing: the same source is used in multiple contexts
  • - Multi-channel publishing: a single map is published to several output formats
  • - Effective content management of topics, maps, style sheets, and resources
  • - A widely supported XML-based standard, which facilitates default styling and transformation across tools
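Re-use by reference can be illustrated with DITA's conref attribute, which points at an element to be pulled in at publishing time. The Python sketch below resolves same-document references of the form #topicid/elementid; the sample content and the helper name resolve_conrefs are hypothetical, and real processors also handle cross-file references, attribute merging, and validation that are omitted here.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical content: one shared note, referenced from a second topic.
DOC = """<dita>
  <topic id="shared">
    <title>Shared notes</title>
    <body><note id="backup">Back up your data first.</note></body>
  </topic>
  <topic id="upgrade">
    <title>Upgrading</title>
    <body><note conref="#shared/backup"/></body>
  </topic>
</dita>"""


def resolve_conrefs(root):
    """Replace the content of each conref element with a copy of its target."""
    # Index every identified element as "topicid/elementid".
    targets = {}
    for topic in root.iter("topic"):
        for element in topic.iter():
            if element is not topic and element.get("id"):
                targets[f"{topic.get('id')}/{element.get('id')}"] = element

    for element in list(root.iter()):
        ref = element.get("conref", "")
        target = targets.get(ref[1:]) if ref.startswith("#") else None
        if target is not None:
            element.text = target.text
            element[:] = [copy.deepcopy(child) for child in target]
            del element.attrib["conref"]
    return root


resolved = resolve_conrefs(ET.fromstring(DOC))
notes = [note.text for note in resolved.iter("note")]
print(notes)
```

The note is written once and appears in both topics, so a later correction to the shared element reaches every place that references it.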

Content map

  • » DITA XML functions
    •    - Structured information management
    •    - Structured documentation
    •    - Structured knowledge development
    •    - Structured content management
  • » DITA XML applications
    •    - DITA/XML for ePublishing
    •    - DITA/XML for eLearning
    •    - DITA/XML for eBusiness

Case study

Methods and motives of combining MS Word (OOXML) and DITA