Subscribe to Dr. Macro's XML Rants

NOTE TO TOOL OWNERS: In this blog I will occasionally make statements about products that you will take exception to. My intent is to always be factual and accurate. If I have made a statement that you consider to be incorrect or inaccurate, please bring it to my attention and, once I have verified my error, I will post the appropriate correction.

And before you get too exercised, please read the post, dated 9 Feb 2006, titled "All Tools Suck".

Friday, January 05, 2007

Specializing xi:include

I've posted before about how useful it is to specialize the XInclude include element--it makes authoring easier, it lets you define constraints on what can be referenced, etc.

But until now I'd not really appreciated another serious benefit: It avoids ambiguous content models.

I ran into this in the process of modifying the DocBook 5.0RC1 XSD schemas to add xincludes. The obvious approach of just adding xi:include wherever something that could be included is allowed did not work because it created all sorts of ambiguity problems. Doh!

Consider this content model from the DocBook schemas (somewhat modified by me for my local use):
<xs:sequence>
  <xs:choice minOccurs="0" maxOccurs="unbounded">
    <xs:element ref="docbook:glossary"/>
    <xs:element ref="docbook:bibliography"/>
    <xs:element ref="docbook:index"/>
    <xs:element ref="docbook:toc"/>
  </xs:choice>
  <xs:choice>
    <xs:sequence>
      <xs:group ref="dbparms:all_blocks" maxOccurs="unbounded"/>
      <xs:element minOccurs="0" maxOccurs="unbounded" ref="docbook:section"/>
    </xs:sequence>
    <xs:sequence>
      <xs:element maxOccurs="unbounded" ref="docbook:section"/>
    </xs:sequence>
  </xs:choice>
  <xs:choice minOccurs="0" maxOccurs="unbounded">
    <xs:element ref="docbook:glossary"/>
    <xs:element ref="docbook:bibliography"/>
    <xs:element ref="docbook:index"/>
    <xs:element ref="docbook:toc"/>
  </xs:choice>
</xs:sequence>
The intuitive thing would be to allow xi:include in each place where section or section-like things are allowed.

But this creates a horribly ambiguous content model. Now, I happen to think that the ambiguity rules are completely bogus; nevertheless, having chosen to live in XSD land, I'm stuck with them (at least for now).
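
To make the problem concrete, here is a rough sketch of what the naive change might look like for the middle choice of the model above. It assumes that xi:include is allowed both among the blocks and alongside docbook:section, and that the xi prefix is bound to http://www.w3.org/2001/XInclude; the exact placement is an illustration, not the actual edit made to the schemas.

```xml
<!-- Sketch only: xi:include added wherever an includable thing is allowed. -->
<xs:choice>
  <xs:sequence>
    <!-- blocks, or an include that stands in for a block -->
    <xs:choice maxOccurs="unbounded">
      <xs:group ref="dbparms:all_blocks"/>
      <xs:element ref="xi:include"/>
    </xs:choice>
    <!-- sections, or an include that stands in for a section -->
    <xs:choice minOccurs="0" maxOccurs="unbounded">
      <xs:element ref="docbook:section"/>
      <xs:element ref="xi:include"/>
    </xs:choice>
  </xs:sequence>
  <xs:sequence>
    <xs:choice maxOccurs="unbounded">
      <xs:element ref="docbook:section"/>
      <xs:element ref="xi:include"/>
    </xs:choice>
  </xs:sequence>
</xs:choice>
```

In this sketch a leading xi:include could match any of three particles (a block include, a section include in the first branch, or a section include in the second branch), and the validator cannot tell which without looking ahead. That is exactly the kind of thing the XSD ambiguity rules (Unique Particle Attribution) reject.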

But it should be immediately obvious that if we specialize xi:include to reflect the specific element types of the things we want to include, for example docbook:section_include, then the ambiguity problem goes away because you'll be adding tokens with the same distinction as the existing tokens, so you can never create an ambiguity that wasn't already there.
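
As a minimal sketch of the specialized approach, the same middle choice might look like this. It assumes a docbook:section_include element has been declared and that dbparms:all_blocks does not itself contain it; any block-level includes would get their own distinct names.

```xml
<!-- Sketch only: docbook:section_include is allowed exactly where
     docbook:section is, so each particle still starts with distinct
     element names. -->
<xs:choice>
  <xs:sequence>
    <xs:group ref="dbparms:all_blocks" maxOccurs="unbounded"/>
    <xs:choice minOccurs="0" maxOccurs="unbounded">
      <xs:element ref="docbook:section"/>
      <xs:element ref="docbook:section_include"/>
    </xs:choice>
  </xs:sequence>
  <xs:sequence>
    <xs:choice maxOccurs="unbounded">
      <xs:element ref="docbook:section"/>
      <xs:element ref="docbook:section_include"/>
    </xs:choice>
  </xs:sequence>
</xs:choice>
```

Because docbook:section_include appears only where docbook:section already does, a validator can decide which branch it is in from the first element it sees, just as it could before the change.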

I also observe that since xi:include's complex type is named, you can do the specialization formally using substitution groups at the XSD level. Hmmm.
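
A sketch of what that declaration might look like, assuming the XInclude schema names the type xi:includeType (as the schema published by the W3C does) and that a copy of it is available locally. The schemaLocation value here is a hypothetical path.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:xi="http://www.w3.org/2001/XInclude"
           targetNamespace="http://docbook.org/ns/docbook"
           elementFormDefault="qualified">

  <!-- Hypothetical location of a local copy of the XInclude schema -->
  <xs:import namespace="http://www.w3.org/2001/XInclude"
             schemaLocation="XInclude.xsd"/>

  <!-- Reuses xi:include's named type and declares the new element
       a member of xi:include's substitution group -->
  <xs:element name="section_include"
              type="xi:includeType"
              substitutionGroup="xi:include"/>
</xs:schema>
```

The content models would still refer to docbook:section_include by name; the substitution group only records, in the schema itself, that it is a kind of xi:include. Whether a given XInclude processor acts on the specialized element is a separate question, since processors normally recognize includes by element name.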

4 Comments:

Blogger John Cowan said...

So validate against the RNG schemas (thoughtfully provided for you by Norm) with xi:include added (through the customizations), and then validate with XSD *after* including.

1:15 AM  
Blogger Unknown said...

The sad thing about XML Schema is that it is not possible to declare that elements in a foreign namespace are allowed everywhere. This would provide for true modular XML vocabularies. Processors could be conceived to look only at elements in the namespaces of their interest and "look through" everything else.

On the other hand, XML Schema is expressed in XML itself, so it is possible to transform it in order to allow foreign namespaces. This could be done on the fly when an XML Schema is loaded.

7:32 AM  
Blogger Chandu... said...

How do I add an xi:include tag to an XML document in a VC++ environment?

I need an MSXML API that adds an xi:include tag to an XML file. I am working in a VC++ environment and my back end is XML.

2:17 AM  
