XCMTDMW: Element to Element Linking: Overview
The most general definition of a link is "a semantic object that establishes a set of one or more relationships among uniquely addressable XML components". This definition is reflected by the XLink and HyTime standards, which provide syntax and semantics for establishing arbitrarily complex relationships between arbitrarily addressable things. (XLink is limited to the domain of linking among components for which a URL-compatible fragment addressing syntax and semantics have been defined; HyTime provides generic facilities for making anything generically addressable and therefore enables linking anything to anything via a single standard representation mechanism (groves).) [See the comments to this post for more good discussion around my original misstatement of XLink's limitations. The distinction between XLink and HyTime in this area is subtle but important: HyTime provides a generic mechanism by which you can define the "in-memory" addressing representation of anything (the downside is somebody has to define it and somebody has to implement the instantiation of the representation). By contrast, XLink depends on each data format (XML, HTML, CGM, MS Word, whatever) defining, as an IETF or W3C specification, what its addressable components are and what the syntax for addressing them is. If there is no such specification, XLink can't link it. This is one reason it often seems like a good idea to use XML to encode everything: it makes it universally addressable. It's true but it's not the only way it could be done. The Web world could easily define the functional equivalent of HyTime groves, and XLink/XPointer could then be defined in terms of addresses of things in that generic grove-like thing. But I don't see that happening any time soon.
In addition, XLink addressing is done via URIs exclusively, which should not be a limitation in practice but is another difference--HyTime is so flexible that almost any reasonable way of writing down addresses using SGML or XML markup can be made recognizable to a HyTime processor--a degree of flexibility that made HyTime difficult for many people to understand, but I digress.]
For example, using XLink you could semantically link words in one section of PCDATA to words in another section just as easily as you can link one element to another.
While that level of generality is sometimes useful and it's important for standards like XLink and HyTime and XPointer to both enable it and standardize it clearly and completely, for the purposes of both discussing the issues inherent in linking and in doing workaday technical documentation, we can narrow our focus to a key subset of the general case, linking one element to another element or set of elements.
First, let me explain my repeated stress on the word "semantic". A link is a semantic relationship whose meaning is independent of how the relationship is established. Think of it like marriage: it doesn't matter whether you're married in a church or by a justice of the peace, in Austin or Amsterdam, in English or in Chinese, the resulting relationship is the same: A is married to B.
By the same token, it doesn't matter how a link is expressed syntactically in your data: XLink, XInclude, HyTime, HTML, your own 20-year-old link markup, the relationships will be the same: Element A is linked to Element B for some reason.
By contrast, addressing, on which semantic linking depends, is entirely syntactic. Addressing is the plumbing or mechanics that let you physically connect things together: the pointers. The addressing syntax you use has many practical implications, including the availability of implementations, the cost of implementation and processing, the opportunities for interoperation, and so on, but the specific syntax you use doesn't affect the meaning of the relationships established by the links that do the addressing.
That is, it doesn't matter whether you arrive at your wedding via train or car or pontoon boat, the end result is the same. The cost and speed and availability are different but as long as you get there on time, it doesn't matter which one you use.
This is very important. Clear thinking about linking requires that you be able to make a complete and clear distinction between the syntax-independent and syntax-specific parts of linking.
And clear thinking is the only way we will be able to find our way safely to a generalized approach to XML document and link management that can both satisfy all our requirements and not require crazy optimizations.
My promise to you is that if you stick with me in this exploration of the intricacies and pitfalls of linking, when we come out the other end you will have at your disposal a general architectural approach to link management that can be implemented simply or sophisticatedly as you require and that will do everything you need it to do at a cost exactly proportional to your specific needs in terms of scale, performance, and completeness. That is, if you need to do simple element-to-element linking there is a simple solution that is completely compatible with and upgradable to the most sophisticated system that lets you link anything to anything. We already saw this with the Woodward Governor system. You do not need a hugely-expensive, overoptimized XML-aware CMS as the cost of entry to doing sophisticated linking. You may need one eventually if your scale and performance requirements are high. But it is likely that in fact you need something less daunting and expensive.
OK, back to linking.
Here are some facts that help us narrow our problem statement while preserving our ability to do more sophisticated things in the future:
- You can always change the addressing syntax without changing the semantics of the links. For example, you can change from using only ID references to using full XPointers without changing the meaning of any links as long as the addressed result is the same.
- Any link expressed as an element that points directly to another element (an "inline link") can be replaced with an "out-of-line" link that points to the two original elements without changing the semantics of the relationship expressed. The reverse is also true.
- Doing one-to-many linking or many-to-many linking is no harder than doing one-to-one linking in a generalized system. It mostly becomes a user interface issue (which is why HTML doesn't directly allow it).
- The nature of the things linked doesn't change the general nature of the issues inherent in doing semantic linking and addressing.
- Most of the complexity of link processing and management is in the addressing.
- Most of the complexity of addressing comes from managing addresses within a body of information under revision. If your data is static and unchanging, addressing is easy, just a simple matter of programming. It's when your data changes over time that things get interesting.
- The core requirements for linking and addressing in authoring support repositories and in delivery support repositories are fundamentally different. In particular, authoring repositories must provide sophisticated mechanisms for doing indirect addressing while delivery repositories need not do any indirection and would rather not do any (in order to keep things as simple and quick as possible).
Taken together, these facts mean the following for us:
- We can focus on the simplest case, element-to-element links and know that the same issues and principles will apply to less common cases, such as element-to-text links.
- Our choice of addressing method will be the primary determiner of the cost of our system in terms of both cost to implement and cost to use.
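The first fact in the list above, that the addressing syntax can change without changing the link, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample document, its `id` values, and the two helper functions are invented here, and ElementTree's path syntax stands in for a real ID reference or XPointer.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: element names and id values are invented
# for this illustration.
DOC = ET.fromstring(
    "<doc>"
    "<section id='intro'><p>Introduction</p></section>"
    "<section id='safety'><p>Safety rules</p></section>"
    "</doc>"
)


def resolve_by_id(root, id_value):
    """Address an element by the value of its id attribute."""
    return root.find(f".//*[@id='{id_value}']")


def resolve_by_position(root, path):
    """Address an element by its position in the tree (ElementTree path syntax)."""
    return root.find(path)


by_id = resolve_by_id(DOC, "safety")
by_position = resolve_by_position(DOC, "./section[2]")

# Two different addresses, one addressed element: a link built on either
# address relates the same pair of elements.
same_target = by_id is by_position
print(same_target)
```

The two addresses differ in cost and fragility (the positional one breaks if a section is inserted ahead of it), but as long as they resolve to the same element the link means the same thing.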
I will also observe that in most technical documentation linking is limited to element-to-element links and usually to strictly binary links of one element to one element, for the simple reason that doing anything more sophisticated is challenging for writers from a rhetorical standpoint and is complicated by the inherent lifecycle management challenges posed by linking in general. That is, doing more than simple links is just too hard in most cases.
[NOTE: In examples that follow I will omit namespace declarations just to keep the examples simple but my policy is that all elements should be in a namespace other than the no-namespace namespace. Just so we're clear.]
OK, let's pull the covers off a link and see what makes it tick. Let's start with one we've seen, an XInclude link:
<?xml version="1.0"?>
<doc>
  ...
  <xi:include href="../common/warnings/dont_run_scissors.xml"/>
  ...
</doc>
Here we have a simple XInclude "include" link. This link is establishing a relationship between itself, the <xi:include> element, and the element that is the document element of the XML document named by the href= attribute. The semantics of the relationship are defined by the XInclude specification and are "transclude" or "use-by-reference".
Note that this is not a link between the xi:include element and document entity "dont_run_scissors.xml". It is also not a link between the document that contains the xi:include element and the document entity. It is a link from one element, the xi:include element, to another element, the document element of the document entity named. This is very important and if you aren't seeing the distinction we need to stop now and make sure you do see it because this is crucial to our understanding going forward. To make it clearer, let's look at dont_run_scissors.xml:
<?xml version="1.0"?>
<warning>
  <p>Don't run with scissors.</p>
</warning>
The relationship established by the xi:include element is between itself and the <warning> element that happens to be the document element of dont_run_scissors.xml.
Why is this? It's because XInclude defines a useful shortcut which is that, by definition (not just by convention), a reference to a document entity with no explicit XPointer is a reference to that document entity's document element.
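As a rough sketch of that shortcut, the following Python resolves an xi:include link to its target element. The `STORAGE` dictionary (an in-memory stand-in for the file system) and the `resolve_target` function are hypothetical names introduced for this illustration, and the XInclude namespace declaration is written out because a real parser requires it.

```python
import xml.etree.ElementTree as ET

# In-memory stand-in for the file system, so the sketch runs anywhere.
STORAGE = {
    "../common/warnings/dont_run_scissors.xml":
        "<warning><p>Don't run with scissors.</p></warning>",
}


def resolve_target(href, xpointer=None):
    """Return the element addressed by an href plus an optional xpointer.

    With no xpointer, the target is the document element of the addressed
    document: the link is to an element, not to the storage object.
    """
    root = ET.fromstring(STORAGE[href])
    if xpointer is None:
        return root
    raise NotImplementedError("XPointer resolution is outside this sketch")


link = ET.fromstring(
    '<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" '
    'href="../common/warnings/dont_run_scissors.xml"/>'
)
target = resolve_target(link.get("href"), link.get("xpointer"))
print(link.tag, "->", target.tag)
```

Note that what comes back is an element, never a file name: the href only says where to find the element.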
Let's make this clear by changing our data a bit. Let's aggregate all our standard warnings into a single document for convenience:
<?xml version="1.0"?>
<warning_set>
  <warning>
    <p>Don't run with scissors.</p>
  </warning>
  <warning>
    <p>Don't stand on the top rung of a step ladder</p>
  </warning>
</warning_set>
Now let's create a new version of our linking document to reflect this new organization of warnings [NOTE: I'm pretty sure my xpointer syntax is not complete. I'm keeping it simple for example purposes. See the spec for the exactly correct syntax]:
<?xml version="1.0"?>
<doc>
  ...
  <xi:include href="../common/warnings/warnings.xml"
    xpointer="xpointer(/*/warning[1])"
  />
  ...
</doc>
What have we changed? Because the warning we want is no longer a document element (it's no longer the root element of its containing document), we can't use just an href=--we have to add an xpointer= in order to address the element we want. So we've added an xpointer= attribute with an XPointer that addresses the first warning in the new warning_set document.
The relationship is still the same: the xi:include element is pointing to the don't run with scissors warning. The addressing has changed (because the data changed) but the semantics are the same and the processing result will be the same.
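To check the claim that the processing result is the same, here is a small Python sketch that transcludes both versions of the linking document and compares the output. It is not a conforming XInclude processor: `STORAGE`, `resolve_target`, and `transclude` are hypothetical names, and the pointer handling is a deliberate simplification that only understands pointers of the form `xpointer(/*/PATH)`.

```python
import copy
import xml.etree.ElementTree as ET

XI_INCLUDE = "{http://www.w3.org/2001/XInclude}include"
XI_NS = 'xmlns:xi="http://www.w3.org/2001/XInclude"'

# In-memory stand-in for the two storage organizations of the warning.
STORAGE = {
    "../common/warnings/dont_run_scissors.xml":
        "<warning><p>Don't run with scissors.</p></warning>",
    "../common/warnings/warnings.xml":
        "<warning_set>"
        "<warning><p>Don't run with scissors.</p></warning>"
        "<warning><p>Don't stand on the top rung of a step ladder</p></warning>"
        "</warning_set>",
}


def resolve_target(href, xpointer=None):
    """Resolve an address to an element (simplified, see lead-in)."""
    root = ET.fromstring(STORAGE[href])
    if xpointer is None:
        return root  # no pointer: the document element is the target
    prefix, suffix = "xpointer(/*/", ")"
    if not (xpointer.startswith(prefix) and xpointer.endswith(suffix)):
        raise ValueError(f"unsupported pointer: {xpointer}")
    # "/*" selects the document element; the rest is relative to it.
    return root.find("./" + xpointer[len(prefix):-len(suffix)])


def transclude(doc_source):
    """Replace each xi:include element with a copy of its target element."""
    doc = ET.fromstring(doc_source)
    for parent in list(doc.iter()):
        for index, child in enumerate(list(parent)):
            if child.tag == XI_INCLUDE:
                target = resolve_target(child.get("href"), child.get("xpointer"))
                replacement = copy.deepcopy(target)
                replacement.tail = child.tail
                parent[index] = replacement
    return ET.tostring(doc, encoding="unicode")


V1 = (f'<doc {XI_NS}>'
      '<xi:include href="../common/warnings/dont_run_scissors.xml"/></doc>')
V2 = (f'<doc {XI_NS}>'
      '<xi:include href="../common/warnings/warnings.xml" '
      'xpointer="xpointer(/*/warning[1])"/></doc>')

result_v1 = transclude(V1)
result_v2 = transclude(V2)
print(result_v1 == result_v2)
```

The two link documents differ only in their address, and the transcluded results are identical.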
And note that it doesn't matter how we address the target warning. Here I made the smallest possible change to the warning data (added a wrapper warning_set element) but I didn't change the target warning at all. In particular I didn't do what a lot of people would either assume is required or do instinctively: add an ID to the warning.
This is to make the point that how you do addressing doesn't frickin' matter as regards the semantics of the links. The only questions are "how hard is it to create the pointer in the first place and how hard will it be to resolve?" As it happens, with XPointer, most of it is pretty easy and you can do it in XSLT 1 (and it's really easy with XSLT 2). I've done it and I make that XSLT code freely available (I believe an older version is somewhere on the XSL FAQ site--I have a newer version that supports XSLT 2 but I need to post it somewhere). In any case, it's not that hard and it gets easier every day.
Have I made my point about addressing vs semantics? I hope so because it's crucial to making everything work. In particular, if you can't change the form of address without changing the meaning of your links, link management would be very hard indeed.
Having said that, it's also the case that the form of address you choose will affect many practical aspects of the system. In particular, if you choose a form of address that is not standards based (that is, is not XPointer or some form of schema-defined key/keyref) then you are at a minimum increasing the cost of implementation because you'll be on the hook for all the code components that have to work with those addresses (both to create them in new documents and to resolve them during processing). If the addressing mechanism is specific to a product (for example, references to object IDs in some proprietary repository) then you've tied yourself to that repository at the data level which I think is a very dangerous thing to do and should only be done when there is no alternative (and there's always an alternative).
Note too that if your address is to object IDs in a repository you are doing exactly what we did above when we used just the href= to point to dont_run_scissors.xml: you're addressing a storage object in order to address its root element. That is, any system that decomposes documents at the element level is making individual documents out of each of those elements. That's not necessarily bad (and we'll see later where having the ability to do that as needed is a good thing) but let's not pretend that you are addressing elements directly. You are not. A lot of the incorrect behavior of these systems (such as synthesizing invalid documents on export) comes from not realizing or admitting that their objects are documents and not elements in some element tree reflecting a single document (which is what they usually claim or the appearance they expose through UIs and APIs). Just saying.
OK, let's look at what we've done and what we've got so far:
- We started with a very simple link, an XInclude from inside one document to an element in another document. Our intent was to relate the xi:include element to a single warning element and we did that by pointing to the document entity that contained the warning element and for which the warning element was the root element.
In terms of our storage-management framework, this created a system of two documents with a dependency between the first document and the warning document of type "component of".
- We decided to put all our warnings into one document (for example, because they all go through a single approval workflow and must all be approved by the same deadline or because they're created and managed by one author). This required us to create a new document, warnings.xml. Into this document we copied the original warning from dont_run_scissors.xml as well as other warnings. We committed this new document into our system.
- By some means as yet unrevealed, we, the authors of the original document, came to know that the authoritative version of our warning is now in warnings.xml and that we need to create a new version of our document that reflects this new location. So we checked out our doc (let's call it doc_01.xml), added the necessary xpointer= attribute to the xi:include element, and committed this new version into the repository.
There's some interesting stuff going on here that I need to point out:
- The original version of doc_01.xml continues to irrevocably point to the original warning in dont_run_scissors.xml. The creation of warnings.xml did not change anything about this. If you were to process version 1 of doc_01.xml right now you would get the same result you got before we created warnings.xml--that is, the warning we would use would be the one in dont_run_scissors.xml, not the one in warnings.xml.
- There are two versions of the don't run with scissors warning that we, as humans doing this work, know are versions in time. However, the information we have seen so far does not explicitly relate the two versions in any way and only weakly implies it through the two versions of doc_01.xml, which differ only in the form of address used for the xi:include (but note that could be because we decided to use an entirely different warning--there's nothing about the link that says we were linking to a new version of the same warning resource (in SnapCM terms)). And note that making each element its own document wouldn't help us here because the whole point was we wanted all the warnings in one document. If we want that reasonable level of storage organization flexibility then we have to step up to being able to both address elements that are not document elements and provide some way of tracking the version history of elements regardless of their storage locations. Fortunately it's not too hard to do.
- The change to the warning, in this case a change to its physical location, required us to react by creating a new version of our document doc_01.xml even though the content of the warning itself did not change and therefore we had no other reason to need to change doc_01.xml. This is very important. This is the essential problem in the management of versioned hyperdocuments. Think about the implications here for a large body of documents all of which use this standard warning.
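To put a number on those implications, here is a toy Python model of a repository in which every document addresses the warning directly. Everything in it is hypothetical: the repository structure, the `retarget` function, and the count of 500 documents are invented for illustration and are not how SnapCM or any particular product stores versions.

```python
# Direct addresses as (href, xpointer) pairs, taken from the example above.
OLD_ADDRESS = ("dont_run_scissors.xml", None)
NEW_ADDRESS = ("warnings.xml", "xpointer(/*/warning[1])")


def make_repository(document_count):
    """Each document name maps to its list of versions, oldest first."""
    return {
        f"doc_{n:02d}.xml": [{"link_address": OLD_ADDRESS}]
        for n in range(1, document_count + 1)
    }


def retarget(repository, old_address, new_address):
    """Commit a new version of every document whose latest version uses old_address.

    Earlier versions are left untouched, so they keep pointing at the
    original target. Returns the names of the documents that had to change.
    """
    changed = []
    for name, versions in repository.items():
        if versions[-1]["link_address"] == old_address:
            versions.append({"link_address": new_address})
            changed.append(name)
    return changed


repository = make_repository(500)
changed = retarget(repository, OLD_ADDRESS, NEW_ADDRESS)
print(len(changed), "documents needed a new version")
```

Every document that uses the warning gets a new version, even though not one word of any of them, or of the warning, changed.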
From this simple use case, which is pretty much the simplest use case, you should start to see a few things with some clarity:
- Moving from addressing elements indirectly via reference to the XML documents of which they are the root to addressing elements anywhere inside their containing documents complicates things a good bit (mostly for address creation, which really means for authoring user interfaces).
- There is a need to track the version history of elements, not just storage objects. It would really be nice to know where our warning, as a unit of managed information in a non-trivial workflow, has been over its lifetime.
I picked warnings on purpose because they are the most obvious example of information for which there could be severe legal and safety implications and for which you therefore need to know what you said when and where you said it and what documents used which version at what time in the past. That is, when ScissorCo gets sued you need to be able to prove that your authors used the right warning in the right documents and therefore the plaintiff should have known not to run with them. I also chose warnings because they are an obvious target of re-use and they tend to go through an authoring and revision workflow separate from any documents that use them. Keep that in mind as we go forward. Warnings are just an obvious instance of a more common general case in use-by-reference, which is using information among publications or data sets with different workflows that have no necessary or natural synchronizations. For example, where core content is developed on a per-engine basis but is used in publications whose workflow schedule is driven by specific product development and release cycles.
- During authoring (that is, during the revision life cycle of the information) there is a strong requirement for various forms of indirect addressing in order to avoid the very problem we ran into here: change to a link target requires changing the link source even though the semantics of the link were otherwise not affected.
The SnapCM model provides one form of indirect addressing, the dependency link, but that alone is not sufficient if we want to enable direct addressing of elements regardless of how they are stored (because SnapCM dependencies are only between storage objects). If your requirements can be met by only doing linking and addressing of document root elements then it is sufficient (although the implication is sometimes that you end up with a lot of very small documents). But it's not that hard to step up to doing indirect addressing of elements anywhere.
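One way to picture indirect addressing of elements is a lookup table that sits between link sources and direct addresses. The Python below is only a sketch of that general idea, not SnapCM's dependency mechanism or any standard's: the `address_table`, the key `warning.scissors`, and the `resolve` function are all hypothetical, and the sketch ignores the question of versioning the table itself.

```python
# Hypothetical indirection table: a stable key maps to the current
# direct address (href, xpointer) of the element it names.
address_table = {
    "warning.scissors": ("dont_run_scissors.xml", None),
}

# Links in documents refer to the key, never to the direct address.
documents = {f"doc_{n:02d}.xml": "warning.scissors" for n in range(1, 501)}
documents_before = dict(documents)


def resolve(document_name):
    """Follow the document's link through the table to a direct address."""
    return address_table[documents[document_name]]


before = resolve("doc_01.xml")

# The warning moves into warnings.xml: one table entry changes,
# and no link source is touched.
address_table["warning.scissors"] = ("warnings.xml", "xpointer(/*/warning[1])")

after = resolve("doc_01.xml")
print(before, "->", after)
print(documents == documents_before)
```

The move that previously forced a new version of every using document now costs a single update, which is the kind of saving indirection is meant to buy during authoring.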
Finally, I'll leave you with one question: what W3C or OASIS or IETF standard provides a mechanism for doing indirect addressing of XML elements that are not document elements? [I left out ISO because we already know the answer: HyTime (ISO/IEC 10744:1996).]
Next time: Why indirection is so important for authoring