Subscribe to Dr. Macro's XML Rants

NOTE TO TOOL OWNERS: In this blog I will occasionally make statements about products that you will take exception to. My intent is to always be factual and accurate. If I have made a statement that you consider to be incorrect or inaccurate, please bring it to my attention and, once I have verified my error, I will post the appropriate correction.

And before you get too exercised, please read the post, dated 9 Feb 2006, titled "All Tools Suck".

Thursday, March 22, 2007

CMS Requirements for DITA

At last night's Central Texas DITA User's Group we had a nice presentation from France Baril of Ixiasoft on some of the challenges that authoring DITA documents can pose, in particular the need to be able to find topics and know what the dependencies among topics are as you revise the topics through the life cycle of the documentation set.

This sparked a discussion of some basic requirements for CMSes that provide DITA-specific features. In addition, one of my colleagues is doing a DITA CMS project for one of our clients, and he and I got to talking about what the CMS they're implementing did and didn't do with the DITA data. That conversation revealed that the CMS vendor was perhaps not displaying as much insight into, and imagination about, how DITA should be managed as it could.

So I thought I would try to outline the key non-obvious DITA content management features that I think any CMS claiming to provide DITA support should provide. I will not state what should be obvious requirements related to the creation and management of links, the ability to search on content and metadata, and so on.

See my earlier posts tagged XCMTDMW for a discussion of general XML content management requirements. Those requirements are the base from which these DITA requirements start. Therefore I won't state obvious things like XML-aware query, basic link management, and so on.

Map-Related Requirements

Maps are a key feature of DITA and the management of maps is essential to productive use of DITA. Key map-related features are:

M.1 - Import of an entire map as a single action. Given a map, the system should be able to import the map and all maps and topics it directly or indirectly links to as a single action. The system should provide options for how imported maps and topics are organized into whatever the CMS' organization mechanism is (folders, cabinets, whatever). The system should provide options for how to handle the import of link targets that are not of scope "local", including the creation of proxy topics for locally unavailable targets. Following import, all links in the imported content must resolve correctly to their targets as imported.

M.2 - Export of an entire map as a single action. Given a map in the CMS, the system should be able to export to some location outside the CMS (the filesystem, a Zip file, a WebDAV repository, etc.) the map and all of its direct or indirect dependencies. Following export, all the links in the exported content should resolve correctly to their targets as exported.

M.3 - Map-based views. All CMS operations that involve access to topics should require or allow the selection of a map that establishes the "map context" for the operation such that the operation only reflects those subordinate maps and topics that are within the direct or indirect scope of the map. For example, you should be able to select a map and then do queries that only return topics in the map's scope or, when creating direct links, only provide as candidate targets those topics that are in the scope of the map.

M.4 - Map view of everything. The CMS should be closed over maps such that all DITA-related content is presented in a map (for example, the system synthesizes a map that includes all topics in the repository or all topics within the scope of a particular CMS-specific organizing structure). By the same token, the results of queries should be viewable as DITA maps (that is, given a query result, it is either literally returned as a map to which the normal CMS map functionality is applied or the CMS provides a "save as map" option).

M.5 - Support for compound maps. CMS must support the use of topicref format="ditamap" to construct "compound maps". Any CMS functionality that modifies existing maps must preserve any pre-existing map-to-map relationships. Any CMS functionality that creates new maps should provide features for creating subordinate maps.
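Requirements M.1 through M.3 and M.5 all depend on being able to compute the set of resources a map reaches directly or indirectly. What follows is a minimal sketch of that computation, not any vendor's implementation. The `REPO` dictionary and the `map_scope` function are hypothetical names invented for illustration, hrefs are treated as flat names rather than resolved relative URIs, only literal `topicref` elements are followed (a real system would match on the DITA class attribute so that specializations are included), and links inside topics are ignored.

```python
import xml.etree.ElementTree as ET
from collections import deque

# Hypothetical stand-in for CMS storage: resource name -> map source.
# Only maps are parsed in this sketch, so topics do not need entries.
REPO = {
    "product.ditamap": """<map>
        <topicref href="intro.dita"/>
        <topicref href="install.ditamap" format="ditamap"/>
        <topicref href="http://example.com/faq.dita" scope="external"/>
    </map>""",
    "install.ditamap": """<map>
        <topicref href="setup.dita">
            <topicref href="verify.dita"/>
        </topicref>
    </map>""",
}


def map_scope(root_map, repo):
    """Return (local resources in the map's scope, non-local hrefs)."""
    in_scope, non_local = set(), set()
    queue = deque([(root_map, True)])  # (resource name, is it a map?)
    while queue:
        name, is_map = queue.popleft()
        if name in in_scope:
            continue  # already visited; also guards against cycles
        in_scope.add(name)
        if not is_map:
            continue  # topics are not descended into here
        for ref in ET.fromstring(repo[name]).iter("topicref"):
            href = ref.get("href")
            if href is None:
                continue  # a grouping topicref with no target
            if ref.get("scope", "local") != "local":
                non_local.add(href)  # candidate for a proxy topic on import
            else:
                # format="ditamap" marks a subordinate map (compound maps)
                queue.append((href.split("#")[0], ref.get("format") == "ditamap"))
    return in_scope, non_local


scope, non_local = map_scope("product.ditamap", REPO)
print(sorted(scope))
print(sorted(non_local))
```

The same closure could serve as the work list for an import or export, and as the "map context" that restricts queries and link-target candidates to what is in the selected map's scope.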

Specialization-Related Features

S.1 - All processing applied to specializations automatically. Any CMS functionality that is specific to a DITA-defined base type should be automatically applied to specialized elements. For example, if the CMS provides a feature for importing maps, it should automatically provide that feature for importing any specialization of map. If the CMS provides some form of configuration for mapping from specific element types to generic CMS functionality (for example, indicating which elements are links), such a mapping defined for the DITA base types should automatically be applied to all specialized documents without additional user or configuration effort.

That is, one of the main points of specialization is that DITA-specific processing just works for any specialized element. This is how the DITA Open Toolkit works and all other DITA-aware tools should too.

Or said another way, DITA awareness means "specialization awareness" in addition to whatever else it might mean.
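One way a tool can get this behavior is to key its processing on the DITA class attribute rather than on element names, since the class value lists every type an element is specialized from. A minimal sketch, under some assumptions: the `faq` specialization and the `is_a` and `find_by_base` helpers are hypothetical, and the class attributes are written inline here, whereas in real documents they are normally defaulted by the DTD or schema.

```python
import xml.etree.ElementTree as ET

# A hypothetical "faq" specialization of concept, with class values written
# out explicitly (normally the DTD or schema supplies them as defaults).
DOC = """<faq class="- topic/topic concept/concept faq/faq ">
  <title class="- topic/title ">Question</title>
  <answer class="- topic/body concept/conbody faq/answer ">
    <p class="- topic/p ">Answer text.</p>
  </answer>
</faq>"""


def is_a(element, base_type):
    """True if element is, or is specialized from, base_type ('topic/body')."""
    # Splitting on whitespace compares whole module/type tokens only.
    return base_type in element.get("class", "").split()


def find_by_base(root, base_type):
    """All elements in the tree that are of, or specialized from, base_type."""
    return [el for el in root.iter() if is_a(el, base_type)]


root = ET.fromstring(DOC)
print(is_a(root, "topic/topic"))                            # True
print([el.tag for el in find_by_base(root, "topic/body")])  # ['answer']
```

Configuration written against `topic/topic` or `topic/body` then applies to `faq` and `answer` with no extra setup, which is the point of S.1.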

S.2 - Capture and maintain the dependency relationships among shell document types and the base and specialized modules they use. For example, if I import a shell document type that includes a specialization module I created, the CMS should capture the dependency between the shell and the specialization module as well as the dependencies from the specialization module to the base DITA-provided modules.

S.3 - Specialization project management. The system should provide features for managing the components of specialization modules as "projects" such that there is a clear binding between the specialization module name and the specific implementation components that make it up. This project manager should reflect and, as appropriate, enforce (or at least encourage and reward) the implementation design patterns defined by the DITA architecture. This management should include tracking dependencies among specialization schema components (that is, from local specializations to the DITA-provided modules they depend on).

S.4 - Generalized views. System should provide ability to see, on demand, a generalized view of a given map or topic. It should provide a way to select the level of generalization desired. This view should be read-only by default but should allow for saving the generalized view as a new object.
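A generalized view with a selectable level could be derived from the class attribute. The sketch below is only an illustration under stated assumptions, not the DITA Open Toolkit's generalization transform: the `generalize` function, its `keep_modules` parameter, and the `faq` specialization are hypothetical, and class attributes are written inline rather than defaulted from a DTD.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical "faq" specialization of concept, class values written inline.
DOC = """<faq class="- topic/topic concept/concept faq/faq ">
  <title class="- topic/title ">Question</title>
  <answer class="- topic/body concept/conbody faq/answer ">
    <p class="- topic/p ">Answer text.</p>
  </answer>
</faq>"""


def generalize(root, keep_modules):
    """Return a copy in which each element is renamed to its most
    specialized type that belongs to one of keep_modules."""
    view = copy.deepcopy(root)  # the stored object is left untouched
    for el in view.iter():
        tokens = el.get("class", "").split()[1:]  # drop the leading "-" or "+"
        for token in reversed(tokens):  # the most specialized type is last
            module, _, type_name = token.partition("/")
            if module in keep_modules:
                el.tag = type_name
                break
    return view


root = ET.fromstring(DOC)
as_topic = generalize(root, {"topic"})
as_concept = generalize(root, {"topic", "concept"})
print([el.tag for el in as_topic.iter()])    # ['topic', 'title', 'body', 'p']
print([el.tag for el in as_concept.iter()])  # ['concept', 'title', 'conbody', 'p']
```

Because the function works on a copy and leaves the class values in place, the result can be shown as a read-only view or saved as a new object, and the original types remain recoverable from the class attribute.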


Thursday, March 15, 2007

Tutorial: Specializing DITA Conditional Attributes

[5 April 2007: This tutorial has been incorporated in my more complete and formal DITA Specialization Tutorial hosted here: http://www.xiruss.org/tutorials/dita-specialization/.]

A new feature in DITA 1.1 is the ability to specialize from the base= and props= attributes. For conditional processing, this lets you add your own attributes rather than using otherprops=, which can be clearer to authors and implementors. [NOTE: at the time of writing the DITA Open Toolkit does not implement support for specializations of props=, but it should be added soon.]

This form of specialization is fairly easy to implement. This tutorial shows you how to do it using DTDs (the mechanism using Schemas is essentially the same, and if you've stepped up to using the DITA 1.1 schemas I'm going to presume you can figure this out on your own).

The specialization requires two things:

1. Modification of any shell DTDs that need to reflect the specialized attribute (e.g., topic.dtd, reference.dtd, or your own specialized topic types' shell DTDs). You integrate the specialization attribute domain through the shell DTDs.

2. For each specialization of props=, a .ent declaration set that defines the attribute and a corresponding domain declaration. This is the "attribute domain specialization module".

Note that as a rule, any production use of DITA will likely require local versions of the DITA-provided shell DTDs, if only to do configuration of the domains you need, so unless you are using DITA very informally, you should already have local copies of all the DITA-provided shell DTDs. Just saying.

For this tutorial we want to create a specialization of "props=" called "phase-of-moon" that takes as its value one or more moon phase names (e.g., "full", "new", "waning", "waxing", etc.). We will call our domain "moonPhaseProp". (Domains must have unique names within the scope of the shell DTDs or schemas that use them.)

For organizing the files, I like to create a separate directory to put my local shell DTDs and specializations in. For this tutorial assume we're putting everything in the directory dtd/myspecs within the normal DITA Open Toolkit distribution structure. (It can go anywhere as long as you configure the entity resolution catalogs appropriately, but for initial development and testing I find it convenient to use relative paths to the various declaration components, as that eliminates a variable from the configuration (resolution via catalogs) that can lead to confusing errors.) Once you've established that the declarations are correct you should replace all relative paths with absolute URLs or (if you must) PUBLIC IDs that are resolved via catalogs. For my development work I use the OxygenXML editor, which makes it easy to set up catalog configurations for testing resolution via catalogs (and generally testing the correctness of all the parts). Similar tools like XML Spy are probably comparable (but I don't use them).

Step 1 is to create the attribute domain declaration:

1.a. Create a file named moonPhasePropsDomain.ent

1.b. In that file, create these two declarations:

<!ENTITY % moon-phase-props-d-attribute
  "phase-of-moon
     CDATA
     #IMPLIED
  "
>

<!ENTITY moon-phase-props-d-att
  "a(props phase-of-moon)"
>

The first declaration declares the "phase-of-moon=" attribute and puts it in a parameter entity so we can add it to the DITA-defined %selection-atts parameter entity via the %props-attribute-extensions configuration parameter entity.

The second declaration is the domain declaration string for the attribute domain. It will be added to the value of the "domains=" attribute declared for each topic-type element type.

You should of course add an appropriate descriptive header to the file as well as a little documentation for the attribute itself.

This is all that is required for the attribute domain module.

Step 2 is to integrate the domain into your local copy of each shell DTD. The pattern is the same for each shell. For this tutorial I'm using a copy of the topic.dtd shell.

2.a. Find the comment that reads "DOMAIN ATTRIBUTE DECLARATIONS". Following that comment, add this declaration:
<!ENTITY % moon-phase-props-d-dec
  SYSTEM "moonPhasePropsDomain.ent"
>
%moon-phase-props-d-dec;

This pulls in the attribute domain module.

2.b. Find the comment that reads "DOMAIN ATTRIBUTE EXTENSIONS". Following that comment you should see a declaration for the %props-attribute-extensions parameter entity. It will probably be declared as an empty string.

Modify the entity replacement text to include a reference to the %moon-phase-props-d-attribute parameter entity:
<!ENTITY % props-attribute-extensions
  "%moon-phase-props-d-attribute;"
>

This adds the "phase-of-moon=" attribute to the %selection-atts parameter entity which is then included in the %univ-atts parameter entity, making this new attribute available on most elements (some elements, such as title, are not selection candidates).

2.c. Find the comment that reads "DOMAINS ATTRIBUTE OVERRIDE". Following that you should see the declaration of the text entity included-domains and it should include references to a number of "x-d-att" text entities.

To this entity add a reference to the moon-phase-props-d-att text entity:
<!ENTITY included-domains
  "&hi-d-att;
   &ut-d-att;
   &moon-phase-props-d-att;
  "
>

This formally declares your props= attribute specialization so that DITA 1.1 processors will know that "phase-of-moon=" is in fact a conditional attribute and that they should filter on it as appropriate.
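To make concrete what a processor can do with that declaration, here is a minimal sketch of reading the domains= value to discover props= specializations and then filtering on them. It is an illustration under assumptions, not how the DITA Open Toolkit does it: the `props_specializations` and `filter_tree` functions and the `excluded` structure are hypothetical, the domains= value is written inline on the sample topic (normally it is defaulted from the DTD), and an element is dropped only when every value in the attribute is excluded.

```python
import re
import xml.etree.ElementTree as ET

TOPIC = """<topic id="tides" domains="(topic hi-d) (topic ut-d) a(props phase-of-moon)">
  <title>Tides</title>
  <body>
    <p phase-of-moon="full">Spring tide.</p>
    <p phase-of-moon="full new">Strong tide.</p>
    <p>Always shown.</p>
  </body>
</topic>"""


def props_specializations(domains):
    """Names of attributes declared as a(props ...) in a domains= value."""
    # The last name inside the parentheses is the specialized attribute.
    return [m.group(1).split()[-1]
            for m in re.finditer(r"a\(props\s+([^)]+)\)", domains)]


def filter_tree(root, excluded):
    """Remove elements whose conditional attribute values are all excluded.

    excluded maps an attribute name to the set of values to filter out.
    """
    cond_attrs = props_specializations(root.get("domains", ""))
    for parent in list(root.iter()):
        for child in list(parent):
            for name in cond_attrs:
                values = child.get(name, "").split()
                if values and all(v in excluded.get(name, ()) for v in values):
                    parent.remove(child)
                    break
    return root


root = ET.fromstring(TOPIC)
print(props_specializations(root.get("domains")))   # ['phase-of-moon']
filter_tree(root, {"phase-of-moon": {"full"}})
print([p.text for p in root.iter("p")])  # ['Strong tide.', 'Always shown.']
```

Without the a(props phase-of-moon) token in domains=, a processor following this approach would have no way to know that the attribute is conditional and would leave all three paragraphs in place.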

That's all there is to it. Now just repeat Step 2 for each shell DTD you use and you're done.

Step 3 is to test your declarations to make sure they work. This is simply a matter of creating an XML document that uses your local shell DTD as its DTD and verifying that the "phase-of-moon=" attribute is now available on all elements that allow the selection attributes.


Wednesday, March 07, 2007

Tagging the Old Posts

Now that Blogger lets you tag your posts with descriptive tags I'm going through and tagging all my old posts to help with retrieval (which is pretty bad right now--not sure how to address that without creating some sort of hand-crafted index over the posts).

If you're subscribed to this blog as a feed this may cause you to get re-fed all the old posts.

If this does happen, I apologize for any stuffing of feed reader inboxes this causes.

Nothing To Say?

I haven't posted in a long while, what with the holidays and work and vacation and being sick and ...

Since the first of the year I've been very busy with client work, most of it DITA-related (creating sophisticated specializations, doing data analysis of documents that don't have an obvious mapping to DITA, etc.). Very interesting stuff--I've learned a lot about DITA and the Open Toolkit and XSD schemas but nothing that translates directly into pithy blog posts (although I do plan to write a tutorial on creating DITA specializations, which turns out to be remarkably easy once you get the pattern down).

In the meantime, I haven't really been doing much with any interesting technology nor have I seen much interesting coming down the pike (although Mike Kay's recent posting about assertions in XML Schema is pretty interesting--that could be very powerful if the Working Group can get it right). [And let me say that all the WS-* and identity standards stuff just bores me so totally to tears that I can't stand it--I'm sure it's important stuff but I just don't see how at the end of the day it's really going to matter much to our day to day and if it does I'm certainly not going to be anything other than a naive user of it....]

So I thought I should post something just to remind people that I'm still out here.

Some of the topics that are on my list to talk about, but that will require a good bit of time to discuss clearly and cogently, include:

- Why Norm just doesn't get what's wrong with DocBook and right with DITA, namely specialization

- So much more the DITA Open Toolkit could do with relationship tables

- Using DITA maps to model time-specific versions and similar configurations

- Reforming DITA's linking semantics and addressing infrastructure (a road map for DITA 2.0)

So here's hoping I have a little more time to write about these things in the future...
