Dr. Macro's XML Rants

NOTE TO TOOL OWNERS: In this blog I will occasionally make statements about products that you will take exception to. My intent is to always be factual and accurate. If I have made a statement that you consider to be incorrect or inaccurate, please bring it to my attention and, once I have verified my error, I will post the appropriate correction.

And before you get too exercised, please read the post, dated 9 Feb 2006, titled "All Tools Suck".

Friday, March 24, 2006

Old Home Week at DITA 2006 and Eve's Blog

I've just discovered Eve Maler's blog and have added a link to it in my list of blogs by people whose opinions I respect. I've worked with Eve since my very early days doing SGML stuff back when I was at IBM and Eve was at Digital Equipment. She impressed me in a number of ways from the first moment we met and she continues to impress. I'm eager to see what she has to say about stuff. I also observe that she has a band (Mudcats, at least that's their name as of 8 March). I also have a band (Toothache Ballerinas), although we have yet to actually perform in public. So now I'm even more impressed.

I discovered Eve's blog through the blog of Scott Hudson. I discovered Scott's blog because when I did a vanity google of "eliot kimber utilikilt" Scott's entry of September 13, 2004 was in the first results. Turns out I inspired Scott to get a Utilikilt when I wore one while presenting at Extreme Markup (in 2004? I lose track). Turns out Scott is also an XML practitioner and a generally interesting guy. I got to hang out with Scott last night when we drove together, along with DocBook master Bob Stayton, to the dinner that the XMetaL team from Blast Radius hosted for the DITA TC (Thanks Jeff).

DITA 2006 is the first XML-related conference (or conference of any kind) that I've attended in a couple of years and I'd forgotten how much fun it is to hobnob with brother (and sister) wizards for a few days. You get to have intense and uber-geeking conversations about technical stuff, something I don't get to do much these days, and of course catch up on people's families, jobs, life experiences and whatnot.

DITA 2006 is being held in the Research Triangle Park area of North Carolina, where I worked for IBM for 10 years. I was surprised and pleased to see some of my IBM colleagues from those times. The startling thing to me was that, in appearance, they had not changed at all. They didn't even seem to have aged. It was almost like I'd never been gone and we were sitting down for lunch in the cafeteria just like we did back then.

This DITA conference reminds me of the early SGML conferences where you had a small number of really enthusiastic people who were all excited about new technology and new possibilities. For the last few years the XML conferences have been by contrast quite dull, with little excitement and few truly new things. XML will be 10 years old in February of next year, although I really date it to December of 1996, which is when we announced its existence to the world at SGML '96. So it's pretty well settled down into being the almost invisible, boring infrastructure it is. Which is good, but it makes the conferences more like plumbing supplier conventions than the geek fests they used to be.

So it's fun to be here and get to breathe some of that heady air again.

Oh, and check out DocZone. They're providing a "hosted XML content management solution", built on top of XHive's Docato product and also integrating a translation support tool (with which I'm not familiar at all). Two of the North American members of DocZone are former colleagues of mine from Innodata Isogen and I'm pretty sure that what they're selling is legit. I haven't worked personally with Docato or XHive, but a number of my colleagues have and from their reports it sounds like it doesn't suck too bad, so that bodes well for DocZone. DocZone also claims to provide DITA support out of the box, although I don't really know what that means. But I think this product bears following. The company is very new but XHive and Docato are reasonably mature for XML-aware content management tools.

Dr. Macro says check it out.


Thursday, March 23, 2006

DITA 2006: Questions from the floor, Part 1

I'm at the DITA 2006 conference today. I was in the first slot after the plenary, speaking about "Are you ready for DITA?" Most of the people in the audience were still trying to decide whether or not DITA would be appropriate.

One of the questions from the floor was "if you are a small writing group, how do you decide whether to use DocBook or DITA?". During my talk I had focused on the concerns of larger enterprises, where you can probably get the budget needed to design and implement a specialized DITA-based solution and where the ROI is probably pretty clear (because, for example, you have lots of opportunity for re-use because you have lots of products or whatever).

But for small groups, the one- to six-person writing groups that are primarily focused on producing manuals--where things like online or Web-based delivery and re-use are a secondary or minor consideration--the choice is less clear.

While the initial cost of entry for using DITA is very low, the cost of doing even basic specialization and customization of the supporting tools (for example, a custom print style sheet) is not free and may be beyond the very limited resources that a small group has, where your budget may be defined as "how many hours can you work on this without missing your deadline?".

By contrast, for doing books, DocBook's cost of specialization may be quite low, for the simple reason that it's a very mature system (it's been around for about 10 years now) and is focused specifically on doing books. At the same time, you can get a reasonable amount of modularity and reuse with DocBook by applying some discipline and by using the new XInclude support for including book components. For many small writing teams, this will be sufficient to meet their modularity and re-use requirements.
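To make the XInclude point a bit more concrete, here is a minimal sketch of a book document that pulls in its chapters by reference and is then resolved into a single tree. It uses Python's standard-library xml.etree.ElementInclude. The chapter file names, the in-memory CHAPTERS dictionary standing in for files on disk, and the helper names load_chapter and assemble_book are all made up for illustration, and the markup is simplified, DTD-less DocBook-style XML rather than a valid DocBook instance.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical chapter "files", kept in memory so the sketch is self-contained.
CHAPTERS = {
    "install.xml": "<chapter><title>Installing</title><para>Unpack it.</para></chapter>",
    "usage.xml": "<chapter><title>Using</title><para>Run it.</para></chapter>",
}

# The book itself holds only a title and references to reusable chapters.
BOOK = """<book xmlns:xi="http://www.w3.org/2001/XInclude">
  <title>Widget Guide</title>
  <xi:include href="install.xml"/>
  <xi:include href="usage.xml"/>
</book>"""


def load_chapter(href, parse, encoding=None):
    """Loader callback: return the parsed chapter for an xi:include href."""
    if parse != "xml":
        raise ValueError("this sketch only handles parse='xml'")
    return ET.fromstring(CHAPTERS[href])


def assemble_book(book_source):
    """Parse the book and replace each xi:include with the referenced chapter."""
    book = ET.fromstring(book_source)
    ElementInclude.include(book, loader=load_chapter)
    return book


book = assemble_book(BOOK)
chapter_titles = [chapter.findtext("title") for chapter in book.findall("chapter")]
print(chapter_titles)  # ['Installing', 'Using']
```

The discipline part is entirely on the authors: nothing here stops a second book from including install.xml too, which is about as much re-use machinery as many small teams need.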

In my talk I focused on the need to make sober, well-informed business decisions about whether or not to use DITA. For larger enterprises it's pretty much a no-brainer, because the benefits are pretty clear and the initial cost of implementation is relatively low compared to the existing documentation costs and potential savings.

For small enterprises the business analysis is driven much more by the cost of implementation because the value from re-use and modularity may not be as compelling simply because there's a much smaller potential scope for re-use and the added effort involved in doing modular authoring might exceed available resources.

So for small groups the answer will often be that in fact it makes more sense to use DocBook and apply modularity approaches as needed through discipline.

This is certainly the case today. However, one thing to keep in mind is that DITA is very new and its infrastructure is still being developed, so it doesn't have the same degree of maturity and completeness as DocBook's. But this will change over the next couple of years. In addition, DITA 1.1, currently being developed by the DITA Technical Committee, will add a number of features needed to make doing books with DITA much easier. So over the next couple of years the cost difference between DITA and DocBook for small groups focused on books will be reduced. Also, we can expect a formal and concerted convergence, or at least an alignment, of DITA and DocBook such that it should be more a choice based on whether your focus is books or modular delivery rather than DITA or DocBook, because both will provide the same core features of modularity and specialization, both will have similarly functional free toolkits, and both will share the same core element types.


Roomba Update

I love Roomba. I want to marry Roomba. Roomba is my precious.

I've been moving out of my old house and getting it ready to rent. I took Roomba with me (now battle scarred from vacuuming under our old couch with the sharp bits) and as I cleared out a room, I'd put Roomba in to clean up while I dealt with the next room. By the time I was done, the first room was clean (or at least clean enough that the cleaning crew that would be sent in could focus on the really hard stuff).

It continues to amaze me just how cool Roomba is. Get one today.


Thursday, March 09, 2006

Cool Geek Toy: iRobot Roomba Vacuum

For my birthday my father got me an iRobot Roomba vacuum. This is something I would have never bought for myself but now that I have it I wonder how I ever lived without it.

This is not, strictly speaking, an XML-related tool but it's such an interesting bit of technology that I feel compelled to talk about it. And I think it has something to teach us about tools that don't suck (in the metaphorical sense--obviously the Roomba literally sucks as it is in fact a vacuum).

First the Roomba is practical: it's a darn good vacuum (my only complaint is that its dirt container is small so you have to empty it often). I have been responsible for the vacuuming in my home since I was a lad and I take it pretty seriously (even if I'm a total slacker about actually doing it). Growing up we had a Kirby with all the attachments and I loved that machine--it was so versatile and it was a good vacuum. But the Roomba has something I never had with respect to vacuuming: tenacity and focus. The thing has sensors that tell it how clean the floor is and it will not stop until it feels the floor is clean enough or its battery runs out (it's supposed to last about 2 hours). It's sort of the Terminator of vacuums. By contrast, I'll vacuum thoroughly but not that thoroughly--after all I have a life.

I've only had a chance to run it once so far, but by the time I had to turn it off so I could go to bed, it had actually started to restore the nap to the high-traffic area of our worn-out Berber carpet. We have two basset hounds, so I had to empty it about every 10 minutes (because, being a slacker, it had been a while since I had last run the vacuum). But I was happy to service the little guy if it meant really getting the floors clean for once.

Second: it's clearly a very sophisticated bit of artificial intelligence. When you watch it work a room you can see that there are some sophisticated algorithms and strategies that it uses to follow walls and work around obstacles. It is also remarkably determined when it gets stuck, acting in a way that seems very lifelike. It reminded me of nothing so much as a crustacean when it got stuck under the couch--it would try to back out, wiggle around, stop, move a bit side to side, then try again before I finally got it unstuck. It's clearly not just randomly roaming around.

Third: it's pretty flat so it can go under higher furniture, get into the kick space under cabinets, and so on. It's fun to watch it go under a chair, do its thing, and emerge again.

Fourth: it has an API so you can hack it.

So what can we learn about tools from the Roomba?

First, while it's cool, it's not about being cool, it's about being a good vacuum and a practical tool. Its coolness is a side effect of how it goes about being a good vacuum. I think that's important for any tool--tools that set out to be cool tend to not have much staying power. But tools that set out to do something useful in a new and better way tend to end up being cool. I've had many conversations over the years along the lines of "this would be a really cool thing to do with XML" and I don't remember any of those coming to much. Note how this is different from "this does this useful thing with XML in a cool way". Those tools tend to have staying power.

Second, it's well executed--the engineering quality is very high. It has to be just to survive in the harsh environment of the typical home. To be more than a toy it has to be rugged and durable and easy to use--all the things one looks for in a production tool. The Roomba can serve as inspiration and example to those of us who create tools.

Third: while it does a very mundane job it serves as a platform for testing concepts and approaches that have much wider application. The same attributes that make a good vacuum also make a good battlefield robot or construction robot or whatever. The iRobot guys clearly understood that by tackling the relatively narrow task of building a useful, practical, affordable vacuum robot they would learn lots of important things that they would need to know to do more challenging things. They also knew that a cute robot that vacuums is something everybody can understand and relate to, making it an excellent marketing tool for robots in general and their company in particular.

In XML you often have a similar marketing challenge: you have a very general, very powerful, very abstract technology with lots of potential applications but most people have a hard time seeing it in that way--they need concrete examples they can relate to. I've seen tools and created tools (and standards) that were very powerful but very abstract and it became difficult to sell them because there was no easy-to-relate-to concrete application. Again, the Roomba can serve as an example of how to do it right. The Roomba, while focused and concrete and easy to relate to, is not trivial--it's useful in its own right and so should be any demonstration or foot-in-the-door tool you create in order to market a more general or abstract technology.

At the same time the Roomba has an API that lets you change and extend its behavior. I haven't had a chance to study its API in detail but a quick review of the docs indicates that it has the features I look for in an API:

- It's complete over the features of the tool. That is, you have access to all the primitives

- It's well documented

- It's accessible through a variety of languages (their API spec has Python examples of how to use the API--yeah! [Did I mention that I'm a Python fan? Did I mention that I will dive under a train before I'll ever write another line of Perl? Does that sound like another rant waiting to happen? Can you already see where that rant would end up? Should I apologize in advance to all those well-meaning people who still think Perl is a good language? I suppose so. Is this irresponsible flame bait? Do I care?])
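For a flavor of what "access to the primitives" can look like, here is a minimal sketch that builds the raw bytes for a drive command without talking to any hardware. Big caveats: the opcode values (128 for start, 130 for control, 137 for drive) and the packet layout (opcode followed by velocity and turn radius as signed 16-bit big-endian values) are my reading of iRobot's serial command interface document and should be checked against it before anything is sent to a real robot. The function names drive_command and wake_and_drive are invented for this example and are not part of any iRobot library.

```python
import struct

# Opcode values ASSUMED from iRobot's serial command interface document;
# verify them against the spec before sending anything to a real robot.
OP_START = 128    # assumed: starts the command interface
OP_CONTROL = 130  # assumed: enables control of the robot
OP_DRIVE = 137    # assumed: drive with a velocity and a turn radius


def drive_command(velocity_mm_s, radius_mm):
    """Build a 5-byte drive packet: opcode, then two signed 16-bit big-endian values."""
    if not -500 <= velocity_mm_s <= 500:
        raise ValueError("velocity must be between -500 and 500 mm/s")
    if not -2000 <= radius_mm <= 2000:
        raise ValueError("radius must be between -2000 and 2000 mm")
    return struct.pack(">Bhh", OP_DRIVE, velocity_mm_s, radius_mm)


def wake_and_drive(velocity_mm_s, radius_mm):
    """Prefix the drive packet with the (assumed) start and control opcodes."""
    return bytes([OP_START, OP_CONTROL]) + drive_command(velocity_mm_s, radius_mm)


# Back up at 200 mm/s while turning on a 500 mm radius.
packet = drive_command(-200, 500)
print(list(packet))  # [137, 255, 56, 1, 244]
```

Actually sending the bytes would need a serial connection to the robot, which is deliberately left out here.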

As an integrator, while I look for good user interfaces and solid engineering, what I really want is a solid API that is complete, well documented, well designed (consistent method names and signatures, sensible object models, etc.), and so on. At the same time it should not be bigger than it needs to be. Also, poor APIs tend to reflect generally poor engineering quality and vice versa--a good API is usually evidence of overall good engineering (although this is not always true--sometimes a good API is just lipstick on a pig, hiding bad code underneath it).

So as tool creators and integrators we can take the Roomba as an inspiration and example of how to do it right. Among the XML tools you use are there any Roombas? Are there any "man this is so not a Roomba"s?

As tool users and evaluators we can take the Roomba as an exemplar of what to look for in a good tool: has a clear purpose, performs it well, is durable and reliable, has a good user interface, has a good API, is extensible, is appropriately priced (provides good value), and is a joy to use (for a certain value of joy).

Dr. Macro says check out the Roomba.