Dr. Macro's XML Rants

NOTE TO TOOL OWNERS: In this blog I will occasionally make statements about products that you will take exception to. My intent is always to be factual and accurate. If I have made a statement that you consider to be incorrect or inaccurate, please bring it to my attention and, once I have verified my error, I will post the appropriate correction.

And before you get too exercised, please read the post, dated 9 Feb 2006, titled "All Tools Suck".

Friday, July 20, 2007

InDesign CS3 and XML Authoring: Could be Good

In my new job at Really Strategies I have started digging pretty deeply into how to get XML into and out of Adobe InDesign CS3. This has turned out to be pretty interesting.

In InDesign CS2 the XML support was somewhat weak. While you could import an XML document into InDesign and then associate styling with it, it was very simplistic in that you had no direct way to do context-based associations and no easy way to script it, either on import or inside the editor.

In CS3 that has largely changed. CS3 adds several new XML support features that appear to make InDesign quite a powerful XML rendering tool, one that could be integrated loosely or tightly with any XML authoring tool to create an interesting environment. (You could, in theory, use InDesign to author the XML as well, but it wasn't really designed for that, and I don't think it's a good use of resources to try to make it an XML editor, not when the process I outline here is so easy to implement.)

Here's the general mechanism I'm working toward:

1. Using InDesign, you create a template document that will accept your XML. This requires setting up all the usual styling stuff (page masters, frames, named styles) as well as creating instances of the markup structures that will populate different text frames (InDesign's XML import works by matching imported elements to existing elements and replacing the existing ones with matching structures on import, more or less).

2. You create an XSLT that takes your XML source and "augments" it with Adobe-specific attributes that specify the per-element-instance mapping to InDesign paragraph and character styles, as well as generated elements for any generated text that needs to be rendered as a separate paragraph (analogous to the gentext pseudo-elements Arbortext Editor uses to manage generated text display).

This XSLT can be pretty simple--it's just an identity transform with a little bit of per-element-type logic to define the mapping (and it could be further parameterized through some sort of more direct mapping specification, although I'm not sure it's worth the effort). This script could also re-order things as needed, generate TOCs, etc. But the minimum required is pretty small. There are a few more things you need to handle, but they can be generalized easily enough.
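To make the "identity transform plus a per-element mapping" idea concrete, here is a minimal sketch. It uses Python's standard ElementTree module rather than XSLT, purely so that it can be run without an XSLT processor. The element names (doc, title, p, b, i) and the style names are hypothetical. The aid:pstyle and aid:cstyle attribute names and the namespace URI are given from memory of InDesign's tagged-XML conventions, so treat them as assumptions and check them against Adobe's documentation before relying on them.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for InDesign's style-mapping attributes (verify against Adobe docs).
AID_NS = "http://ns.adobe.com/AdobeInDesign/4.0/"
ET.register_namespace("aid", AID_NS)

# Hypothetical mapping from element types to InDesign style names.
PARAGRAPH_STYLES = {"title": "Heading 1", "p": "Body Text"}
CHARACTER_STYLES = {"b": "Bold", "i": "Italic"}


def augment(root):
    """Leave the document as it is, adding only the style attributes."""
    for elem in root.iter():
        if elem.tag in PARAGRAPH_STYLES:
            elem.set(f"{{{AID_NS}}}pstyle", PARAGRAPH_STYLES[elem.tag])
        elif elem.tag in CHARACTER_STYLES:
            elem.set(f"{{{AID_NS}}}cstyle", CHARACTER_STYLES[elem.tag])
    return root


source = "<doc><title>Intro</title><p>Some <b>bold</b> text.</p></doc>"
augmented = augment(ET.fromstring(source))
result = ET.tostring(augmented, encoding="unicode")
print(result)
```

Keeping the mapping in two plain dictionaries is one way to get the "more direct mapping specification" mentioned above: the traversal never changes, only the tables do.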

The main gotcha here is that InDesign is sensitive to newlines in the XML data, because newlines trigger the application of paragraph styles. What I've found so far is that you have to manage the text content very carefully so that you only emit newlines at true paragraph boundaries. This also means that you only apply paragraph styles to the lowest-level elements that will become paragraphs in InDesign--you can't just blindly apply styles at higher levels in the XML hierarchy (InDesign is not XSL-FO).
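The whitespace handling can be sketched in the same spirit. The following is an illustration under stated assumptions, not a tested InDesign recipe: it is written in Python rather than XSLT, the set of paragraph-level element names is hypothetical, and whether the last paragraph in a story should carry a trailing newline is something to check against your own template. Inside a paragraph every run of whitespace collapses to one space; outside paragraphs, pretty-printing whitespace is dropped and a single newline is emitted after each paragraph element.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical set of element types that become paragraphs in InDesign.
PARAGRAPH_ELEMENTS = {"title", "p"}


def clean(text, inside_paragraph):
    """Collapse whitespace inside paragraphs; drop pure indentation elsewhere."""
    if not text:
        return text
    if inside_paragraph:
        return re.sub(r"\s+", " ", text)
    return text.strip() or None


def trim_edges(para):
    """Remove leading and trailing space from a paragraph's own content."""
    if para.text:
        para.text = para.text.lstrip()
    children = list(para)
    if children:
        if children[-1].tail:
            children[-1].tail = children[-1].tail.rstrip()
    elif para.text:
        para.text = para.text.rstrip()


def normalize(elem, inside=False):
    is_para = elem.tag in PARAGRAPH_ELEMENTS and not inside
    within = inside or is_para
    elem.text = clean(elem.text, within)
    for child in elem:
        normalize(child, within)
        child.tail = clean(child.tail, within)
        if not within and child.tag in PARAGRAPH_ELEMENTS:
            # The only place a newline is emitted: a true paragraph boundary.
            child.tail = "\n" + (child.tail or "")
    if is_para:
        trim_edges(elem)
    return elem


source = "<doc>\n  <title>Intro</title>\n  <p>Some\n    <b>bold</b>\n    text.</p>\n</doc>"
result = ET.tostring(normalize(ET.fromstring(source)), encoding="unicode")
print(repr(result))
```

Note that the newline is attached only after elements in PARAGRAPH_ELEMENTS, which mirrors the point above: style and break at the lowest paragraph-level elements, never at their containers.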

3. You run this transform outside of InDesign. InDesign lets you apply a transform as part of the import process, but we don't want to do that for reasons that will become clear in a moment (unless I've missed a feature of InDesign, which is quite possible--I'm still coming up to speed on its intricacies).

I use OxygenXML for most XML editing and it provides a very convenient mechanism for applying a transform to a document and saving the result wherever you want. But any good XML editor should provide a way to do this so that you have some sort of "run the transform" button or menu item. The key is that the result (the XML with the InDesign augmentations) is always put to some consistent place.

4. Import the augmented XML (not the XML you're authoring in your XML editor) into InDesign using InDesign's XML import (without applying the XSLT) but being sure to check the "link to XML" check box and select "merge" not "append"--this is the key.

5. Go back to your XML editor, make changes to the XML and push the "transform" button again.

6. Switch to InDesign and bring up the Links palette. There you'll find your XML document listed. Select it and click the "update link" button. Magically, your XML changes are re-imported into InDesign and the styles applied.

Hey presto! Immediate, easy, convenient pagination of XML using InDesign. Something that was not immediate, easy, or convenient with CS2.

I haven't looked into it but it should be possible to script the triggering of the link update as well, although that might require a little C code, I'm not sure. But it's clear that by this mechanism you can use InDesign as a "page preview" mechanism from any XML editor with very little work.

Beyond the simple element-to-style mapping you can do on import, CS3 also provides scripting support for working with XML in the form of XPath-based functions that allow you to easily apply any script to elements in context. I haven't used this yet but a brief look at the docs suggests that it's just the thing to take your XML to the next level.

It's still not going to give you what products like Typefi give you, which is complete, complex layout heuristics, but it should be sufficient for relatively simple layouts such as those that typify technical documentation. It occurred to me, for example, that it wouldn't be very hard to create a process that would allow you to use InDesign to create nice books from DITA source using this mechanism. Hmmm.

Note that you can download a one-month eval of InDesign from Adobe's Web site.

11 Comments:

Anonymous Michael Friedman said...

Eliot,

Have you found the mechanism that calls the various page masters that are available? Are these related to content - for example, a specific matched element in the XSL creates a new page using a different page master?

And that got me thinking to some features like "first page", "last page" in sequences of pages within a given flow. Is this kind of capability available?

Michael

1:54 PM  
Blogger Eliot Kimber said...

There's no direct mechanism that I've seen for associating content to page masters automatically.

Either you already have the pages set up and you just flow content into it or you'll need to use scripting to create new pages and populate them based on element types in context.

One could imagine using something like FO's conditional page sequence masters to declaratively define page sequences (in terms of InDesign master page names) and then apply them via scripts. That would be instead of just hard coding the logic in your script. You could establish a naming convention for master pages to indicate things like first, last, etc. (you can already get even/odd from the page properties).

You could also use script labels or extended labels to further annotate page masters and master spreads, if you needed more distinction than you can get from page master names or if you wanted to allow local names but have a separate classification mechanism for use with a more generic script.
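As a rough illustration of the declarative idea, here is a sketch in Python of rule-based master selection, loosely modeled on FO's conditional page master references. Everything in it is hypothetical: the Rule type, the master names, and the assumption that the first matching rule wins. It only picks a master page name for each page; applying that name to a page would still be the job of an InDesign script, which is not shown.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    master: str             # hypothetical InDesign master page name
    position: str = "any"   # "first", "last", "rest", or "any"
    parity: str = "any"     # "odd", "even", or "any"


def position_of(page_number, page_count):
    """Classify a 1-based page number within its sequence."""
    if page_number == 1:
        return "first"
    if page_number == page_count:
        return "last"
    return "rest"


def select_master(rules, page_number, page_count):
    """Return the master name of the first rule that matches this page."""
    position = position_of(page_number, page_count)
    parity = "odd" if page_number % 2 else "even"
    for rule in rules:
        if rule.position in ("any", position) and rule.parity in ("any", parity):
            return rule.master
    raise LookupError(f"no rule matches page {page_number}")


# Hypothetical naming convention for a chapter's master pages.
CHAPTER_RULES = [
    Rule("Chapter-First", position="first"),
    Rule("Chapter-Last", position="last"),
    Rule("Chapter-Left", parity="even"),
    Rule("Chapter-Right", parity="odd"),
]

masters = [select_master(CHAPTER_RULES, n, 4) for n in range(1, 5)]
print(masters)
```

Because the rules are data, the same generic script could serve different templates simply by swapping the rule list.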

9:27 PM  
Blogger Silver Arrow said...

Hey
nice article :)
Do you know if there is a way to convert back an XSL:FO into INX format so that we could eventually re-work the layout/content when the fo is coming from another source?
Thx a lot

8:36 AM  
Blogger bme said...

Eliot,

I'm very interested in this issue. Actually, I have tried to achieve this but found it difficult. It would be nice if someone could publish/develop a DITA2InDesign application. Have you had the time to take this issue further?

///bme

2:12 PM  
Blogger Eliot Kimber said...

I have not yet had a chance to do anything with DITA to InDesign, unfortunately. It's still high on my list of things to do if time permits but at the moment I can't predict when or if time will permit.

7:27 PM  
Anonymous Manny said...

Eliot,
I've come across your blog and have watched your introduction to DITA videos, which have helped me learn a lot. I work for a company that specializes in printing multi-national instruction manuals. Our clients and we use FrameMaker and InDesign for the bulk of this work. We are looking at getting into a DITA workflow with our clients. I've been tasked with getting a conventional InDesign file into DITA topics so it could then be handed off to the DITA CMS system we are contemplating. My first stab at this was to individually tag each item on the page for XML, but it's extremely laborious as it's a 48-page document in 3 languages.
Since this article on InDesign CS3 and XML was published last July, have you had any more time to look at what an InDesign to DITA conversion would take to accomplish?

12:44 PM  
Blogger Eliot Kimber said...

WRT InDesign-to-DITA conversion: I think that's pretty much the same as converting any other non-XML format into DITA. What I do have some more practical experience with is automating the conversion of unstructured InDesign into structured XML. The key is having consistent paragraph and character style names and application. Given that, you can use the automatic "map styles to tags" feature to generate an XML version of the InDesign data that you can then attack with traditional XML tools (e.g., XSLT).

Trying to do the tagging directly in InDesign would be too hard, I think--InDesign simply wasn't designed for that sort of activity.

As for DITA-to-InDesign, I have gotten approval to start an open-source project for developing a DITA-to-InDesign plug-in for the Open Toolkit. I'll have a formal announcement as soon as I get the project set up, which should be in the next week or so.

1:45 PM  
Anonymous Anonymous said...

I'm just starting down this path, but do you suppose InDesign could be the ticket for taking an XML file off the web (in my case an XML file of used cars from our site's database) and importing that content into a template that would be bound for the newspaper? Typically we would have to copy and paste content for each car, price, description, etc. into each little box on the template.

Any ideas?

2:29 PM  
Blogger nigel said...

Hi, I'm trying to locate an XSLT that will add carriage returns to exported FileMaker Pro data, so that I can apply Paragraph Styles in InDesign. I have no experience with XSLT, so having to write one myself is a bit daunting. Can you point me in the right direction?
Thanks, Nigel

4:30 PM  
Blogger Eliot Kimber said...

Would have to see what the Frame output looks like, but the basic approach is demonstrated in the (not at all complete) XSLTs that are in the DITA2InDesign source repository (dita2indesign.sourceforge.net).

Essentially it's an identity transform that processes all the PCDATA content to adjust the whitespace and newlines appropriately.

11:43 PM  
Anonymous Anonymous said...

Eliot,

Basically I just want to know whether it is at all possible for InDesign to distinguish between one record and the next in a data merge and apply a particular master to a flagged record. Is that possible?

Jon

7:06 PM  
