XCMTDMW: Element to Element Linking: Overview
      What do we mean by "linking" in the context of XML document processing?
The most general definition is "a semantic object that establishes a set of one or more relationships among uniquely addressable XML components". This definition is reflected by the XLink and HyTime standards, which provide syntax and semantics for establishing arbitrarily complex relationships between arbitrarily addressable things. (XLink is limited to the domain of linking among components for which a URL-compatible fragment addressing syntax and semantics have been defined; HyTime provides generic facilities for making anything generically addressable and therefore enables linking anything to anything via a single standard representation mechanism (groves).) [See the comments to this post for more good discussion around my original misstatement of XLink's limitations. The distinction between XLink and HyTime in this area is subtle but important: HyTime provides a generic mechanism by which you can define the "in-memory" addressing representation of anything (the downside is somebody has to define it and somebody has to implement the instantiation of the representation). By contrast, XLink depends on the different data formats (XML, HTML, CGM, MS Word, whatever) defining, as IETF or W3C specifications, what their addressable components are and what the syntax for doing that addressing is. If there is no such specification, XLink can't link it. This is one reason it often seems like a good idea to use XML to encode everything: it makes it universally addressable. That's true, but it's not the only way it could be done. The Web world could easily define the functional equivalent of HyTime groves, and XLink/XPointer could then be defined as addressing things in terms of that generic grove-like thing. But I don't see that happening any time soon.
In addition, XLink addressing is done via URIs exclusively, which should not be a limitation in practice but is another difference--HyTime is so flexible that almost any reasonable way of writing down addresses using SGML or XML markup can be made recognizable to a HyTime processor--a degree of flexibility that made HyTime difficult for many people to understand, but I digress.]
For example, using XLink you could semantically link words in one section of PCDATA to words in another section just as easily as you can link one element to another.
While that level of generality is sometimes useful and it's important for standards like XLink and HyTime and XPointer to both enable it and standardize it clearly and completely, for the purposes of both discussing the issues inherent in linking and in doing workaday technical documentation, we can narrow our focus to a key subset of the general case, linking one element to another element or set of elements.
First, let me explain my repeated stress on the word "semantic". A link is a semantic relationship whose meaning is independent of how the relationship is established. Think of it like marriage: it doesn't matter whether you're married in a church or by a justice of the peace, in Austin or Amsterdam, in English or in Chinese; the resulting relationship is the same: A is married to B.
By the same token, it doesn't matter how a link is expressed syntactically in your data: XLink, XInclude, HyTime, HTML, or your own 20-year-old link markup. The relationships will be the same: Element A is linked to Element B for some reason.
Addressing, on which semantic linking depends, is by contrast entirely syntactic. Addressing is the plumbing or mechanics that let you physically connect things together: the pointers. The addressing syntax you use has many practical implications, including the availability of implementations, the cost of implementation and processing, the opportunities for interoperation, and so on, but the specific syntax you use doesn't affect the meaning of the relationships established by the links that do the addressing.
That is, it doesn't matter whether you arrive at your wedding via train or car or pontoon boat, the end result is the same. The cost and speed and availability are different but as long as you get there on time, it doesn't matter which one you use.
This is very important. Clear thinking about linking requires that you be able to make a complete and clear distinction between the syntax-independent and syntax-specific parts of linking.
And clear thinking is the only way we will be able to find our way safely to a generalized approach to XML document and link management that can both satisfy all our requirements and not require crazy optimizations.
My promise to you is that if you stick with me in this exploration of the intricacies and pitfalls of linking, when we come out the other end you will have at your disposal a general architectural approach to link management that can be implemented as simply or as sophisticatedly as you require, and that will do everything you need it to do at a cost exactly proportional to your specific needs in terms of scale, performance, and completeness. That is, if you need to do simple element-to-element linking there is a simple solution that is completely compatible with and upgradable to the most sophisticated system that lets you link anything to anything. We already saw this with the Woodward Governor system. You do not need a hugely expensive, overoptimized XML-aware CMS as the cost of entry to doing sophisticated linking. You may need one eventually if your scale and performance requirements are high. But it is likely that in fact you need something less daunting and expensive.
OK, back to linking.
Here are some facts that help us narrow our problem statement while preserving our ability to do more sophisticated things in the future:
- You can always change the addressing syntax without changing the semantics of the links. For example, you can change from using only ID references to using full XPointers without changing the meaning of any links as long as the addressed result is the same.
- Any link expressed as an element that points directly to another element (an "inline link") can be replaced with an "out-of-line" link that points to the two original elements without changing the semantics of the relationship expressed. The reverse is also true.
- Doing one-to-many linking or many-to-many linking is no harder than doing one-to-one linking in a generalized system. It mostly becomes a user interface issue (which is why HTML doesn't directly allow it).
- The nature of the things linked doesn't change the general nature of the issues inherent in doing semantic linking and addressing.
- Most of the complexity of link processing and management is in the addressing.
- Most of the complexity of addressing comes from managing addresses within a body of information under revision. If your data is static and unchanging, addressing is easy, just a simple matter of programming. It's when your data changes over time that things get interesting.
- The core requirements for linking and addressing in authoring support repositories and in delivery support repositories are fundamentally different. In particular, authoring repositories must provide sophisticated mechanisms for doing indirect addressing, while delivery repositories need not do any indirection and would rather not do any (in order to keep things as simple and quick as possible).
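The second fact in the list above, that an inline link can be restated as an out-of-line link without changing the relationship, can be sketched in a few lines of Python. This is only an illustration under assumed markup: the `xref`, `link`, and `anchor` element names and the `id`, `target`, and `ref` attributes are hypothetical and are not taken from XLink or HyTime.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: an inline link (xref points at its target directly)
# and an equivalent out-of-line link that points at both ends.
DOC = """
<doc>
  <p id="p1">See <xref id="x1" target="w1"/> before cutting.</p>
  <warning id="w1"><p>Don't run with scissors.</p></warning>
  <link type="see-also"><anchor ref="x1"/><anchor ref="w1"/></link>
</doc>
"""

def by_id(root, id_value):
    """Resolve an ID reference to the single element carrying that id."""
    matches = [e for e in root.iter() if e.get("id") == id_value]
    if len(matches) != 1:
        raise LookupError(f"expected exactly one element with id={id_value!r}")
    return matches[0]

def inline_ends(root, xref_id):
    """Ends of the inline link: the linking element itself and its target."""
    xref = by_id(root, xref_id)
    return (xref, by_id(root, xref.get("target")))

def out_of_line_ends(root):
    """Ends of the out-of-line link: whatever its anchors address."""
    link = root.find("link")
    return tuple(by_id(root, a.get("ref")) for a in link.findall("anchor"))

root = ET.fromstring(DOC)
inline = inline_ends(root, "x1")
out_of_line = out_of_line_ends(root)

# Both forms relate the very same pair of element objects.
same_relationship = all(a is b for a, b in zip(inline, out_of_line))
print(same_relationship)
```

Both functions return the same two elements, which is the sense in which the two link forms are interchangeable; only the place where the pointers are written down differs.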
Taken together, these facts mean the following for us:
- We can focus on the simplest case, element-to-element links and know that the same issues and principles will apply to less common cases, such as element-to-text links.
- Our choice of addressing method will be the primary determiner of the cost of our system in terms of both cost to implement and cost to use.
I will also observe that in most technical documentation linking is limited to element-to-element links and usually to strictly binary links of one element to one element, for the simple reason that doing anything more sophisticated is challenging for writers from a rhetorical standpoint and is complicated by the inherent lifecycle management challenges posed by linking in general. That is, doing more than simple links is just too hard in most cases.
[NOTE: In examples that follow I will omit namespace declarations just to keep the examples simple but my policy is that all elements should be in a namespace other than the no-namespace namespace. Just so we're clear.]
OK, let's pull the covers off a link and see what makes it tick. Let's start with one we've seen, an XInclude link:
Note that this is not a link between the xi:include element and document entity "dont_run_scissors.xml". It is also not a link between the document that contains the xi:include element and the document entity. It is a link from one element, the xi:include element, to another element, the document element of the document entity named. This is very important, and if you aren't seeing the distinction we need to stop now and make sure you do see it, because this is crucial to our understanding going forward. To make it clearer, let's look at dont_run_scissors.xml:
Why is this? It's because XInclude defines a useful shortcut which is that, by definition (not just by convention), a reference to a document entity with no explicit XPointer is a reference to that document entity's document element.
Let's make this clear by changing our data a bit. Let's aggregate all our standard warnings into a single document for convenience:
The relationship is still the same: the xi:include element is pointing to the don't run with scissors warning. The addressing has changed (because the data changed) but the semantics are the same and the processing result will be the same.
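To make "the processing result will be the same" concrete, here is a minimal Python sketch that resolves both forms of the include against in-memory copies of the two warning documents. It is not a conforming XInclude processor: the `FILES` table, `make_include`, and `resolve_include` are hypothetical names, and the only pointer form handled is the simplified `xpointer(/*/...)` path used in this post, evaluated with ElementTree's limited path support.

```python
import re
import xml.etree.ElementTree as ET

XI = "{http://www.w3.org/2001/XInclude}"

# In-memory stand-ins for the two storage objects discussed in the text.
FILES = {
    "../common/warnings/dont_run_scissors.xml":
        "<warning><p>Don't run with scissors.</p></warning>",
    "../common/warnings/warnings.xml":
        "<warning_set>"
        "<warning><p>Don't run with scissors.</p></warning>"
        "<warning><p>Don't stand on the top rung of a step ladder</p></warning>"
        "</warning_set>",
}

def make_include(href, xpointer=None):
    """Build an xi:include element like the ones in the linking document."""
    element = ET.Element(XI + "include", {"href": href})
    if xpointer is not None:
        element.set("xpointer", xpointer)
    return element

def resolve_include(include):
    """Return the element an xi:include addresses (simplified)."""
    root = ET.fromstring(FILES[include.get("href")])
    pointer = include.get("xpointer")
    if pointer is None:
        # No XPointer: the address is the document element.
        return root
    match = re.fullmatch(r"xpointer\(/\*/(.+)\)", pointer)
    if match is None:
        raise ValueError(f"unsupported pointer: {pointer}")
    # "/*" selects the document element, so evaluate the rest relative to it.
    return root.find(match.group(1))

v1 = resolve_include(make_include("../common/warnings/dont_run_scissors.xml"))
v2 = resolve_include(make_include("../common/warnings/warnings.xml",
                                  "xpointer(/*/warning[1])"))

# Different addresses, same addressed content.
same_result = ET.tostring(v1) == ET.tostring(v2)
print(same_result)
```

The two calls use different addresses and different storage objects, yet the element handed back for transclusion serializes identically.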
And note that it doesn't matter how we address the target warning. Here I made the smallest possible change to the warning data (added a wrapper warning_set element) but I didn't change the target warning at all. In particular I didn't do what a lot of people would either assume is required or do instinctively: add an ID to the warning.
This is to make the point that how you do addressing doesn't frickin' matter as regards the semantics of the links. The only questions are "how hard is it to create the pointer in the first place and how hard will it be to resolve?" As it happens, with XPointer, most of it is pretty easy and you can do it in XSLT 1 (and it's really easy with XSLT 2). I've done it and I make that XSLT code freely available (I believe an older version is somewhere on the XSL FAQ site--I have a newer version that supports XSLT 2 but I need to post it somewhere). In any case, it's not that hard and it gets easier every day.
Have I made my point about addressing vs semantics? I hope so because it's crucial to making everything work. In particular, if you couldn't change the form of address without changing the meaning of your links, link management would be very hard indeed.
Having said that, it's also the case that the form of address you choose will affect many practical aspects of the system. In particular, if you choose a form of address that is not standards based (that is, is not XPointer or some form of schema-defined key/keyref) then you are at a minimum increasing the cost of implementation because you'll be on the hook for all the code components that have to work with those addresses (both to create them in new documents and to resolve them during processing). If the addressing mechanism is specific to a product (for example, references to object IDs in some proprietary repository) then you've tied yourself to that repository at the data level which I think is a very dangerous thing to do and should only be done when there is no alternative (and there's always an alternative).
Note too that if your address is to object IDs in a repository you are doing exactly what we did above when we used just the href= to point to dont_run_scissors.xml: you're addressing a storage object in order to address its root element. That is, any system that decomposes documents at the element level is making individual documents out of each of those elements. That's not necessarily bad (and we'll see later where having the ability to do that as needed is a good thing) but let's not pretend that you are addressing elements directly. You are not. A lot of the incorrect behavior of these systems (such as synthesizing invalid documents on export) comes from not realizing or admitting that their objects are documents and not elements in some element tree reflecting a single document (which is what they usually claim or the appearance they expose through UIs and APIs). Just saying.
OK, let's look at what we've done and what we've got so far:
- We started with a very simple link, an XInclude from inside one document to an element in another document. Our intent was to relate the xi:include element to a single warning element and we did that by pointing to the document entity that contained the warning element and for which the warning element was the root element.
In terms of our storage-management framework, this created a system of two documents with a dependency between the first document and the warning document of type "component of".
- We decided to put all our warnings into one document (for example, because they all go through a single approval workflow and must all be approved by the same deadline or because they're created and managed by one author). This required us to create a new document, warnings.xml. Into this document we copied the original warning from dont_run_scissors.xml as well as other warnings. We committed this new document into our system.
- By some means as yet unrevealed, we, the authors of the original document, came to know that the authoritative version of our warning is now in warnings.xml and that we need to create a new version of our document that reflects this new location. So we checked out our doc (let's call it doc_01.xml), added the necessary xpointer= attribute to the xi:include element, and committed this new version into the repository.
There's some interesting stuff going on here that I need to point out:
- The original version of doc_01.xml continues to irrevocably point to the original warning in dont_run_scissors.xml. The creation of warnings.xml did not change anything about this. If you were to process version 1 of doc_01.xml right now you would get the same result you got before we created warnings.xml--that is, the warning we would use would be the one in dont_run_scissors.xml, not the one in warnings.xml.
- There are two versions of the don't run with scissors warning that we, as humans doing this work, know are versions in time. However, the information we have seen so far does not explicitly relate the two versions in any way and only weakly implies it through the two versions of doc_01.xml, which differ only in the form of address used for the xi:include (but note that could be because we decided to use an entirely different warning--there's nothing about the link that says we were linking to a new version of the same warning resource (in SnapCM terms)). And note that making each element its own document wouldn't help us here because the whole point was we wanted all the warnings in one document. If we want that reasonable level of storage organization flexibility then we have to step up to being able to both address elements that are not document elements and provide some way of tracking the version history of elements regardless of their storage locations. Fortunately it's not too hard to do.
- The change to the warning, in this case a change to its physical location, required us to react by creating a new version of our document doc_01.xml even though the content of the warning itself did not change and therefore we had no other reason to need to change doc_01.xml. This is very important. This is the essential problem in the management of versioned hyperdocuments. Think about the implications here for a large body of documents all of which use this standard warning.
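To get a feel for those implications, here is a rough Python sketch of the "where used" question a repository would have to answer when a link target moves: which documents hold a direct address to the old location and therefore need a new version? Only doc_01.xml comes from the discussion above; `doc_02.xml`, `doc_03.xml`, `ladder.xml`, the `REPOSITORY` table, and the `where_used` function are invented for illustration.

```python
import xml.etree.ElementTree as ET

XI_NS = "http://www.w3.org/2001/XInclude"
OLD = "../common/warnings/dont_run_scissors.xml"

def doc_with_include(href):
    """Source text of a small document that includes one warning."""
    return (f'<doc xmlns:xi="{XI_NS}"><p>...</p>'
            f'<xi:include href="{href}"/></doc>')

# Hypothetical repository: doc_01.xml is from the text, the rest are invented.
REPOSITORY = {
    "doc_01.xml": doc_with_include(OLD),
    "doc_02.xml": doc_with_include(OLD),
    "doc_03.xml": doc_with_include("../common/warnings/ladder.xml"),
}

def where_used(repository, target_href):
    """Names of documents holding an xi:include that points at target_href."""
    users = []
    for name, source in sorted(repository.items()):
        root = ET.fromstring(source)
        for inc in root.iter(f"{{{XI_NS}}}include"):
            if inc.get("href") == target_href:
                users.append(name)
                break  # one hit is enough to require a new version
    return users

# Every document listed here must be checked out, edited, and committed
# again, even though the warning text itself did not change.
must_be_revised = where_used(REPOSITORY, OLD)
print(must_be_revised)
```

With direct addressing, the size of that list is the cost of moving one warning, which is why the problem grows with the number of documents that reuse it.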
From this simple use case, which is pretty much the simplest use case, you should start to see a few things with some clarity:
- Moving from addressing elements indirectly via reference to the XML documents of which they are the root to addressing elements anywhere inside their containing documents complicates things a good bit (mostly for address creation, which really means for authoring user interfaces).
- There is a need to track the version history of elements, not just storage objects. It would really be nice to know where our warning, as a unit of managed information in a non-trivial workflow, has been over its lifetime.
I picked warnings on purpose because they are the most obvious example of information for which there could be severe legal and safety implications and for which you therefore need to know what you said, when and where you said it, and what documents used which version at what time in the past. That is, when ScissorCo gets sued you need to be able to prove that your authors used the right warning in the right documents and therefore the plaintiff should have known not to run with them. I also chose warnings because they are an obvious target of re-use and they tend to go through an authoring and revision workflow separate from any documents that use them. Keep that in mind as we go forward. Warnings are just an obvious instance of a more common general case in use-by-reference, which is using information among publications or data sets with different workflows that have no necessary or natural synchronizations. For example, where core content is developed on a per-engine basis but is used in publications whose workflow schedule is driven by specific product development and release cycles.
- During authoring (that is, during the revision life cycle of the information) there is a strong requirement for various forms of indirect addressing in order to avoid the very problem we ran into here: change to a link target requires changing the link source even though the semantics of the link were otherwise not affected.
The SnapCM model provides one form of indirect addressing, the dependency link, but that alone is not sufficient if we want to enable direct addressing of elements regardless of how they are stored (because SnapCM dependencies are only between storage objects). If your requirements can be met by only doing linking and addressing of document root elements then it is sufficient (although the implication is sometimes that you end up with a lot of very small documents). But it's not that hard to step up to doing indirect addressing of elements anywhere.
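As a rough illustration of what indirect addressing of an element buys you, here is a Python sketch in which the linking document refers to a stable key and a separate table records where the keyed element currently lives. This is one possible form of indirection made up for this example, not SnapCM's dependency mechanism and not taken from any standard: the `keyref` attribute, the key name `warning.scissors`, the two tables, and `resolve_key` are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical key tables: the only place that records where a target lives.
# Each entry maps a stable key to (storage object, address within it).
keys_v1 = {
    "warning.scissors": ("../common/warnings/dont_run_scissors.xml", None),
}
keys_v2 = {
    "warning.scissors": ("../common/warnings/warnings.xml",
                         "xpointer(/*/warning[1])"),
}

# The linking document refers to the key, never to the location.
DOC_01 = '<doc><include keyref="warning.scissors"/></doc>'

def resolve_key(key, table):
    """Turn a key into a direct address using the given table."""
    if key not in table:
        raise KeyError(f"no address registered for key {key!r}")
    return table[key]

keyref = ET.fromstring(DOC_01).find("include").get("keyref")

# Same, unchanged DOC_01 resolved before and after the warning moved.
before_move = resolve_key(keyref, keys_v1)
after_move = resolve_key(keyref, keys_v2)
print(before_move)
print(after_move)
```

The move is recorded by changing the table alone; DOC_01 needs no new version. The cost is that the table itself now has to be managed and versioned.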
Finally, I'll leave you with one question: what W3C or OASIS or IETF standard provides a mechanism for doing indirect addressing of XML elements that are not document elements? [I left out ISO because we already know the answer: HyTime (ISO/IEC 10744:1996).]
Next time: Why indirection is so important for authoring
<?xml version="1.0"?>
<doc>
...
<xi:include href="../common/warnings/dont_run_scissors.xml"/>
...
</doc>
Here we have a simple XInclude "include" link. This link establishes a relationship between itself, the <xi:include> element, and the element that is the document element of the XML document named by the href= attribute. The semantics of the relationship are defined by the XInclude specification and are "transclude" or "use-by-reference".
Note that this is not a link between the xi:include element and document entity "dont_run_scissors.xml". It is also not a link between the document that contains the xi:include element and the document entity. It is a link from one element, the xi:include element, to another element, the document element of the document entity named. This is very important, and if you aren't seeing the distinction we need to stop now and make sure you do see it, because this is crucial to our understanding going forward. To make it clearer, let's look at dont_run_scissors.xml:
<?xml version="1.0"?>
<warning>
<p>Don't run with scissors.</p>
</warning>
The relationship established by the xi:include element is between itself and the <warning> element that happens to be the document element of dont_run_scissors.xml.
Why is this? It's because XInclude defines a useful shortcut which is that, by definition (not just by convention), a reference to a document entity with no explicit XPointer is a reference to that document entity's document element.
Let's make this clear by changing our data a bit. Let's aggregate all our standard warnings into a single document for convenience:
<?xml version="1.0"?>
<warning_set>
<warning>
<p>Don't run with scissors.</p>
</warning>
<warning>
<p>Don't stand on the top rung of a step ladder</p>
</warning>
</warning_set>
Now let's create a new version of our linking document to reflect this new organization of warnings [NOTE: I'm pretty sure my xpointer syntax is not complete. I'm keeping it simple for example purposes. See the spec for the exactly correct syntax]:
<?xml version="1.0"?>
<doc>
...
<xi:include href="../common/warnings/warnings.xml"
xpointer="xpointer(/*/warning[1])"
/>
...
</doc>
What have we changed? Because the warning we want is no longer a document element (it's no longer the root element of its containing document), we can't use just an href=--we have to add an xpointer= in order to address the element we want. So we've added an xpointer= attribute with an XPointer that addresses the first warning in the new warning_set document.
The relationship is still the the same: the xi:include element is pointing to the don't run with scissors warning. The addressing has changed (because the data changed) but the semantics are the same and the processing result will be the same.
And note that it doesn't matter how we address the target warning. Here I made the smallest possible change to the warning data (added a wrapper warning_set element) but I didn't change the target warning at all. In particular I didn't do what a lot of people would either assume is required or do instinctively: add an ID to the warning.
This is to make the point that how you do addressing doesn't frickin' matter as regards the semantics of the links. The only questions are "how hard is it to create the pointer in the first place and how hard will it be to resolve?" As it happens, with XPointer, most of it is pretty easy and you can do it in XSLT 1 (and it's really easy with XSLT 2). I've done it and I make that XSLT code freely available (I believe an older version is somewhere on the XSL FAQ site--I have a newer version that supports XSLT 2 but I need to post it somewhere). In any case, it's not that hard and it gets easier every day.
Have I made my point about addressing vs semantics? I hope so because it's crucial to making everything work. In particular, if you can't change the form of address without changing the meaning of your links, link management would be very hard indeed.
Having said that, it's also the case that the form of address you choose will affect many practical aspects of the system. In particular, if you choose a form of address that is not standards based (that is, is not XPointer or some form of schema-defined key/keyref) then you are at a minimum increasing the cost of implementation because you'll be on the hook for all the code components that have to work with those addresses (both to create them in new documents and to resolve them during processing). If the addressing mechanism is specific to a product (for example, references to object IDs in some proprietary repository) then you've tied yourself to that repository at the data level which I think is a very dangerous thing to do and should only be done when there is no alternative (and there's always an alternative).
Note too that if your address is to object IDs in a repository you are doing exactly what we did above when we used just the href= to point to dont_run_scissors.xml: you're addressing a storage object in order to address its root element. That is, any system that decomposes documents at the element level is making individual documents out of each of those elements. That's not necessarily bad (and we'll see later where having the ability to do that as needed is a good thing) but let's not pretend that you are addressing elements directly. You are not. A lot of the incorrect behavior of these systems (such as synthesizing invalid documents on export) comes from not realizing or admitting that their objects are documents and not elements in some element tree reflecting a single document (which is what they usually claim or the appearance they expose through UIs and APIs). Just saying.
OK, let's look at what we've done and what we've got so far:
- We started with a very simple link, an XInclude from inside one document to an element in another document. Our intent was to relate the xi:include element to a single warning element and we did that by pointing to the document entity that contained the warning element and for which the warning element was the root element.
In terms of our storage-management framework, this created a system of two documents with a dependency between the first document and the warning document of type "component of".
- We decided to put all our warnings into one document (for example, because they all go through a single approval workflow and must all be approved by the same deadline or because they're created and managed by one author). This required us to create a new document, warnings.xml. Into this document we copied the original warning from dont_run_scissors.xml as well as other warnings. We committed this new document into our system.
- By some means as yet unrevealed, we, the authors of the original document, came to know that the authoritative version of our warning is now in warnings.xml and that we need to create a new version of our document that reflects this new location. So we checked out our doc (let's call it doc_01.xml), added the necessary xpointer= attribute to the xi:include element, and committed this new version into the repository.
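The two versions of doc_01.xml described above can be sketched as follows. This is a rough Python illustration under stated assumptions, not a real XInclude processor: the `repository` dictionary, the `include_target` and `resolved_warning` helpers, the `doc` wrapper element, and the `element(/1/1)` pointer value are all hypothetical stand-ins for whatever your system actually uses.

```python
import xml.etree.ElementTree as ET

XI = "{http://www.w3.org/2001/XInclude}"

# Hypothetical repository: storage object name -> content.
repository = {
    "dont_run_scissors.xml": "<warning><p>Do not run with scissors.</p></warning>",
    "warnings.xml": (
        "<warning_set><warning><p>Do not run with scissors.</p></warning></warning_set>"
    ),
}

# Two versions of doc_01.xml that differ only in how the include addresses its target.
doc_01_v1 = (
    '<doc xmlns:xi="http://www.w3.org/2001/XInclude">'
    '<xi:include href="dont_run_scissors.xml"/></doc>'
)
doc_01_v2 = (
    '<doc xmlns:xi="http://www.w3.org/2001/XInclude">'
    '<xi:include href="warnings.xml" xpointer="element(/1/1)"/></doc>'
)

def include_target(include):
    """Return the element an xi:include points at (element() child sequences only)."""
    root = ET.fromstring(repository[include.get("href")])
    pointer = include.get("xpointer")
    if pointer is None:
        # href alone addresses the storage object, and so its root element.
        return root
    steps = [int(s) for s in pointer[len("element("):-1].split("/") if s]
    node = root
    for n in steps[1:]:  # the first step is the document element itself
        node = list(node)[n - 1]
    return node

def resolved_warning(doc_source):
    """Serialize the warning that the document's first xi:include resolves to."""
    include = ET.fromstring(doc_source).find(XI + "include")
    return ET.tostring(include_target(include), encoding="unicode")

print(resolved_warning(doc_01_v1) == resolved_warning(doc_01_v2))
```

Version 1 still resolves against dont_run_scissors.xml no matter what has since been added to the repository; version 2 reaches the copy in warnings.xml only because somebody edited the link.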
There's some interesting stuff going on here that I need to point out:
- The original version of doc_01.xml continues to irrevocably point to the original warning in dont_run_scissors.xml. The creation of warnings.xml did not change anything about this. If you were to process version 1 of doc_01.xml right now you would get the same result you got before we created warnings.xml--that is, the warning we would use would be the one in dont_run_scissors.xml, not the one in warnings.xml.
- There are two versions of the don't run with scissors warning that we, as humans doing this work, know are versions in time. However, the information we have seen so far does not explicitly relate the two versions in any way and only weakly implies it through the two versions of doc_01.xml, which differ only in the form of address used for the xi:include (but note that could be because we decided to use an entirely different warning--there's nothing about the link that says we were linking to a new version of the same warning resource (in SnapCM terms)). And note that making each element its own document wouldn't help us here because the whole point was we wanted all the warnings in one document. If we want that reasonable level of storage organization flexibility then we have to step up to being able to both address elements that are not document elements and provide some way of tracking the version history of elements regardless of their storage locations. Fortunately it's not too hard to do.
- The change to the warning, in this case a change to its physical location, required us to react by creating a new version of our document doc_01.xml even though the content of the warning itself did not change and therefore we had no other reason to need to change doc_01.xml. This is very important. This is the essential problem in the management of versioned hyperdocuments. Think about the implications here for a large body of documents all of which use this standard warning.
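The missing relationship between the two copies of the warning can be sketched as a small data structure. This is a sketch only, in Python, and it is not SnapCM's actual model or any product's API: the `ElementResource` and `ElementVersion` names, the fingerprinting approach, and the pointer strings are all hypothetical. It records each version of the warning as a storage object plus a pointer, all under one resource, so the two copies are explicitly related and we can tell that only the location changed.

```python
import hashlib
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field

@dataclass(frozen=True)
class ElementVersion:
    document: str      # storage object holding this version, e.g. "warnings.xml"
    pointer: str       # address of the element inside it ("" means the root element)
    fingerprint: str   # hash of the element's serialized content

@dataclass
class ElementResource:
    """Hypothetical record of one element's versions, wherever they are stored."""
    name: str
    versions: list = field(default_factory=list)

    def add_version(self, document, pointer, element):
        # Note: ET.tostring() includes any trailing whitespace after the element,
        # so a real system would want a proper canonical form here.
        digest = hashlib.sha256(ET.tostring(element)).hexdigest()
        self.versions.append(ElementVersion(document, pointer, digest))

    def content_changed(self):
        """True if the latest version differs in content from the one before it."""
        if len(self.versions) < 2:
            return False
        return self.versions[-1].fingerprint != self.versions[-2].fingerprint

# Assumed content: the same warning, first as a document root, then inside a set.
v1_root = ET.fromstring("<warning><p>Do not run with scissors.</p></warning>")
v2_root = ET.fromstring(
    "<warning_set><warning><p>Do not run with scissors.</p></warning></warning_set>"
)

scissors = ElementResource("dont-run-with-scissors")
scissors.add_version("dont_run_scissors.xml", "", v1_root)
scissors.add_version("warnings.xml", "element(/1/1)", v2_root[0])

for v in scissors.versions:
    print(v.document, v.pointer or "(root)")
print("content changed:", scissors.content_changed())
```

With something like this in place, a system could at least answer "is this a new location for the same warning or a different warning entirely?" without relying on what the authors happen to remember.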
From this simple use case, which is pretty much the simplest use case, you should start to see a few things with some clarity:
- Moving from addressing elements indirectly via reference to the XML documents of which they are the root to addressing elements anywhere inside their containing documents complicates things a good bit (mostly for address creation, which really means for authoring user interfaces).
- There is a need to track the version history of elements, not just storage objects. It would really be nice to know where our warning, as a unit of managed information in a non-trivial workflow, has been over its lifetime.
I picked warnings on purpose because they are the most obvious example of information for which there could be severe legal and safety implications and for which you therefore need to know what you said, when and where you said it, and which documents used which version at what time in the past. That is, when ScissorCo gets sued you need to be able to prove that your authors used the right warning in the right documents and therefore the plaintiff should have known not to run with them. I also chose warnings because they are an obvious target of re-use and they tend to go through an authoring and revision workflow separate from any documents that use them. Keep that in mind as we go forward. Warnings are just an obvious instance of a more common general case in use-by-reference, which is using information among publications or data sets with different workflows that have no necessary or natural synchronizations. For example, where core content is developed on a per-engine basis but is used in publications whose workflow schedule is driven by specific product development and release cycles.
- During authoring (that is, during the revision life cycle of the information) there is a strong requirement for various forms of indirect addressing in order to avoid the very problem we ran into here: change to a link target requires changing the link source even though the semantics of the link were otherwise not affected.
The SnapCM model provides one form of indirect addressing, the dependency link, but that alone is not sufficient if we want to enable direct addressing of elements regardless of how they are stored (because SnapCM dependencies are only between storage objects). If your requirements can be met by only doing linking and addressing of document root elements then it is sufficient (although the implication is sometimes that you end up with a lot of very small documents). But it's not that hard to step up to doing indirect addressing of elements anywhere.
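One way to picture element-level indirection, as a minimal sketch in Python rather than SnapCM or any standard mechanism: the link source names a stable key, and a separate binding table maps that key to wherever the element currently lives. The `bindings` table, the `warning:scissors` key, and the `resolve` helper are hypothetical, and the pointer handling is reduced to a bare child sequence.

```python
import xml.etree.ElementTree as ET

# Hypothetical repository contents, keyed by storage object name.
documents = {
    "dont_run_scissors.xml": "<warning><p>Do not run with scissors.</p></warning>",
    "warnings.xml": (
        "<warning_set>"
        "<warning><p>Do not immerse in water.</p></warning>"
        "<warning><p>Do not run with scissors.</p></warning>"
        "</warning_set>"
    ),
}

# The link source stores only a key; it never mentions a file or a pointer.
doc_01_links = ["warning:scissors"]

# Binding table: key -> (storage object, child steps below the root element).
bindings = {"warning:scissors": ("dont_run_scissors.xml", [])}

def resolve(key):
    """Follow key -> binding -> element. Child steps count from 1."""
    doc_name, child_steps = bindings[key]
    node = ET.fromstring(documents[doc_name])
    for n in child_steps:
        node = list(node)[n - 1]
    return node

before = resolve(doc_01_links[0]).find("p").text

# The warning moves into warnings.xml: only the binding is updated.
bindings["warning:scissors"] = ("warnings.xml", [2])

after = resolve(doc_01_links[0]).find("p").text

print(before == after)  # doc_01_links was never touched
```

The design choice is simply where the change lands: the move is recorded once, in the binding, instead of in a new version of every document that uses the warning.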
Finally, I'll leave you with one question: what W3C or OASIS or IETF standard provides a mechanism for doing indirect addressing of XML elements that are not document elements? [I left out ISO because we already know the answer: HyTime (ISO/IEC 10744:1997).]
Next time: Why indirection is so important for authoring
Labels: XCMTDMW "xml content management" indirection xinclude xlink linking hytime snapcm xpointer



9 Comments:
Hi Eliot-- You were the one who taught me the really hard linking concepts (indirection!) many years ago, so I hate to point out one small inaccuracy in your excellent disquisition. But... "XLink is limited to the domain of linking among XML components" -- not.
XLink gives you a horizontally applicable way to embed hyperlink information into XML documents, but the link might be from that XML document to a non-XML resource (in the case where the URI reference is to some other kind of thing), and/or the link structure might be "out of band" and reflect a relationship of two -- or more -- resources that aren't even present in that XML document.
FWIW,
Eve
Good point. I think what I was really thinking of was "XPointer", which is limited to addressing XML-based resources.
I will correct the post when I finally come back to continuing this thread (which should be soon...)
You are both oversimplifying the situation, understandably. It was a long time ago, and the memories must be painful....
XLink allows you to point to resources of any media-type, both XML and non-XML. However, you can only point to a fragment of a media-type to the extent that the IETF registration of that media-type permits.
Thus there is no way, for example, to point into a plain text file, because there is no standard definition of fragment identifiers for text/plain resources. Erik Wilde proposed one a long time ago, but it went nowhere.
On the other hand, you can point into HTML files at elements that have names-or-ids, and various other recently developed file types support different kinds of fragment identifiers.
In the case of generic application/xml (and friends) resources, subresource addressing is restricted to bare IDs, numeric element paths, or both. So there is no standards-compliant way using XLink to point to anything but an element.
(Technically, RFC 3023 doesn't even allow that, but it has been in process of being superseded (whew!) for quite a while now, and I think we can safely assume the XPointer framework + namespaces() + element() behavior that was agreed on years ago.)
I'll be picky because I've thought a lot about this myself. My first weblog was devoted to the topic.
As a general comment, I found that for link markup everyone agrees that a link should have some indication of what it's linking to (addressing) and some indication that a link is there (in markup, a link anchor). Everyone also agrees that rich linking should have more than that, but no one can agree on what else. XLink obviously didn't hit even an 80/20 point for anyone besides the XBRL folk, and it amply demonstrated that the conflating of UI and non-UI issues is an easy trap to fall into with advanced linking architectures.
"Doing one-to-many linking or many-to-many linking is no harder than doing one-to-one linking in a generalized system." [my emphasis] Your qualification takes a lot out of your assertion, but your next XCMTDMW post does demonstrate that generalization is difficult. (For the implementor, at least, and I never liked arguments that the fact that the implementation will hide the complexity from the user justifies a standard's complexity.)
"If you choose a form of address that is not standards based (that is, is not XPointer or some form of schema-defined key/keyref) then you are at a minimum increasing the cost of implementation." I get the impression from your post that you're going to talk more about implementation, and I look forward to it. I need to be convinced that XPointer has been implemented enough that using it will save someone money.
HyTime: You perhaps inadvertently pinpointed the reason for HyTime's failure when you wrote "somebody has to implement the instantiation of the representation." The benefits of a standard that everyone has to implement themselves are a little too abstract to inspire much adoption. This was of course compounded by the complexity of HyTime's many levels of abstraction. This is all expressed much better in the "How many HyTime consultants does it take to screw in a lightbulb?" near the end of the Not the SGML FAQ.
Re: Implementation
I will certainly talk more about it, both in terms of general implementation issues and in the context of my ongoing activities with XIRUSS-T (and now, thanks to new work-related activities, other products as well). It's also no secret that I have a number of issues with how DITA 1.x does addressing (and to a lesser degree, how it does linking, mostly quibbles with some of its use-by-reference semantics). I will talk about DITA and addressing more as the DITA TC moves into the 2.0 time frame. I am explicitly not saying anything about it until 1.1 is done so as not to distract the TC from the very important task of getting 1.1 defined and published.
Re: HyTime
Yes, as you can imagine, I have mixed feelings. On the one hand, I'm proud of the technical achievement of HyTime 2 (ISO/IEC 10744:1997), which defined a very solid, internally consistent, mathematically complete, completely generic system for representing hyperlinks and addresses. There were several HyTime-based systems and at least one I know of (Woodward Governor's) is still in production use.
Nevertheless, HyTime reflected a big-iron, big-standards way of thinking that was already in the process of being obsoleted by the Web when HyTime 1 was published in 1992. I completely realize that HyTime, as it was formulated, was simply not appropriate for the Web world irrespective of its technical merits. I have no problem with that--it's just a fact. I also observe that when the XLink activity started I had both become disillusioned with the W3C (I'm over it now) and had run out of energy and patience with standards in general and would have been physically incapable of participating in the development of the XLink spec. It's one reason I don't go out of my way to criticize the design even though I might not agree with all the design decisions--I never provided my opinions when they could have mattered. That and I just don't find XLink directly useful for authoring, for reasons I've discussed and can discuss in more detail some other time. I think the market will do more than I ever could to tell us whether or not XLink is sound and/or useful (just as it did for HyTime).
So I prefer to see HyTime as both lessons learned and as a design model from which we can pull the good parts for new things that are more appropriate for the Web world and the reality at large.
This is one reason it doesn't really bother me that XPointer is limited in the types of things it can address--in practice the things you need to address are accounted for and if, for some reason, they weren't, you have several options, from simply XMLifying what you want to point to, to stepping up and defining the fragment addressing support yourself (which is what you would have had to do with HyTime anyway). I think this approach is a reasonable balance between generality and practicality and it obviously works for most people in practice.
Re: XPointer
I don't think there's any problem with using XPointer--it's well supported in a number of generic XML processing libraries. In addition, the 80% of XPointer that you really need day to day is trivial to implement with XSLT 1 or 2, i.e., the ability to select elements via a tree location or an attribute value such as "//*[@id = 'foo']".
For XIRUSS-T I use an open-source library I found a few years ago that did as much XPointer as I needed in Java. I haven't bothered to see if there's a more authoritative implementation now (i.e., in Xerces or a related Apache package or in Java 5).
For XSLT I implemented as much of XPointer as I needed and made that code generally available (and yes I still need to package up and post my XSLT 2 version of that code).
I read your post and I found it interesting but I have a few questions.
What exactly is the document element? The document's root element?
Could you provide a brief description of HyTime? From the context I gather that it's a pre-web standard for creating links... but who was its intended audience? And what exactly is a Groove?
Also, FYI... in your examples you've used an incorrectly written less-than entity in the xi:include element's start tag in a couple places (you used a colon instead of a semi-colon).
You lost me when you started talking about references to object IDs.
You will find the HyTime standard itself here: http://www1.y12.doe.gov/capabilities/sgml/wg8/document/n1920/html/n1920.html
The starting point for general information on HyTime is www.hytime.org. However, the link to the materials linked to above is broken from the HyTime.org site and I'm not sure who maintains those pages any more.
The HyTime standard addresses a number of different requirements and use cases. It can be broken down into these general areas:
- Linking and addressing. It defines a generic SGML/XML-based syntax and semantics for representing hyperlinks and addresses that allow you to link anything to anything given an appropriate abstraction of the address targets ("groves").
- Space and time-based scheduling. A generic way to represent the position of an object in one or more spatial dimensions and, optionally, a time dimension, within some defined abstract space. This was originally developed as an abstraction for music, which can be viewed as a one-dimensional physical space (pitch) with a time dimension.
Given such a space, you can create time-based links by addressing a region of the space using the generic addressing mechanism, for example, to say "any notes with this pitch that fall within this time period".
- Projection and rendition--the mapping of objects in one abstract space (i.e., the "score") into another abstract space in order to present a specific performance or rendition (i.e., the "performance").
- "SGML extended facilities". The infrastructure part of the specification that defined all sorts of useful things that the other parts needed, such as SGML Architectures, queries, the whole grove mechanism, and so on. Pretty much everything in the extended facilities now has a corresponding W3C-defined analog, except for architectures, which are more or less provided by DITA's specialization mechanism.
In practice, the only part of HyTime that got much attention or use was the linking and addressing. Steve Newcomb and I worked on completing the Standard Music Description Language (ISO/IEC 10743) as an application of HyTime but Steve got too caught up in his Topic Map pursuits and I had other work to do and there was no constituency crying out for SMDL so it has lain fallow lo these many years.
The audience for HyTime was developers of SGML applications and systems implementors.
A "grove" (not "groove", although we certainly had our fun with the whole grove/groove thing) is an abstract data structure ("in-memory") that is the ultimate target of any HyTime-based address. The grove facility defines a generic node-with-properties mechanism that you then use to define the specific node-and-property representation of any particular data type. The HyTime standard provides a formal grove definition for SGML itself.
All HyTime addressing functions are formally defined as operations on groves. This makes HyTime addressing closed over groves (so there is no handwaving) and allows anything to be meaningfully linked to anything else because you can always define a grove representation for anything you've got.
The closest we have to groves in the W3C XML world is the XML Info Set and defining an XML representation for everything. Which works but isn't quite as architecturally satisfying as groves. But then even groves weren't as satisfying as they could be so maybe it was ultimately wasted effort. Certainly the market has spoken pretty clearly on this point.
The fact is that the requirement to actually link anything to anything wasn't nearly as strong as we thought it was--we solved a problem most people just didn't need to have solved, at least not in the mathematically complete way we solved it. C'est la vie.
The "document element" is the root element of the document.
The notion of "object ID" is a general notion within content management or any process where you are processing objects. An object, by definition, has identity such that every object can be reliably distinguished from every other object in the same storage or processing space. By the same token, given two object references, you can unambiguously determine whether the two references refer to the same object or to different objects.
Therefore, any system that manages data in any sort of objecty way, whether it's a normal file system or a content management system or a generic object database, will have some notion of "object ID" by which you can refer to objects within that system.
In the context of managing XML data, given a CMS that, for example, makes every element an object (and therefore provides it with some sort of object ID), you might be tempted to use those object IDs directly in your addressing syntax within your XML data. You can always make this work, but the problem is that your data is now dependent on that specific content management system and will not work outside of it without complete rewriting of the addresses (and possibly of target identifiers).
Thus it's much safer and more general to do all your addressing using standard mechanisms that rely only on the inherent properties and content of the XML data itself. Of course you can still apply under-the-covers optimizations that take advantage of whatever facilities your content management system provides, but the data itself should remain pure if at all possible.
Of course, having said that, it is also the case that, as a rule, the storage object identifier part of addresses always has to be rewritten on import and export so maybe it's not such a big deal to use object IDs in the data as stored. Hmmm.
Sir! I just thanked EEKim on the HS link for having pointed me/us to your blog.
A short snapper: you write "This definition is reflected by the XLink and HyTime standards" ... and I note that XTM is absent, though in C34/WG3 I find it alongside HyTime, but no reference to XLink. Am I committing a category error in thinking that XTM should be in that list?
I think you are: XTM is an application of XLink and therefore depends on its generic linking and addressing mechanisms.
By the same token, ISO Topic Maps are an application of HyTime.
Note that the core semantics to XTM and ISO Topic Maps are identical, demonstrating conclusively that the mechanics of how you represent links and addresses is totally irrelevant to the application-level semantics of those links.
Historical note: ISO Topic Maps started as one of the activities of the Committee for the Application of HyTime (CApH). I was at the meeting where the term "topic map" was coined. My memory is that I first suggested the term but I really don't remember for sure. But that meeting, back in 1992, was the first time that the concept under that name was written down. The original goal was to define a generic way to represent back-of-the-book indexes and thesauri. I think things got a little out of control after that.