Dr. Macro's XML Rants

NOTE TO TOOL OWNERS: In this blog I will occasionally make statements about products that you will take exception to. My intent is to always be factual and accurate. If I have made a statement that you consider to be incorrect or inaccurate, please bring it to my attention and, once I have verified my error, I will post the appropriate correction.

And before you get too exercised, please read the post, dated 9 Feb 2006, titled "All Tools Suck".

Thursday, August 24, 2006

Office Open XML: Good or Evil?

In response to a prospect's alleged comment that "Word will save tables as CALS tables" (or something to that effect--I got the comment second or third hand) I downloaded the Office 2007 Beta and started looking into the whole Office Open XML thing.

First, as far as I can tell from a little hands-on testing and Google searches, there's no built-in support for CALS or OASIS Exchange tables in Office 2007, at least in the beta. The table markup in Office Open XML is definitely the same Word ML stuff from Office 2003.
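
To make the gap concrete, here is a minimal sketch of the kind of conversion an integrator would have to write to get from Word's table markup to a CALS table. It is an illustration only, not something Office provides. The element names (w:tbl, w:tr, w:tc) are the WordprocessingML ones; the namespace URI is the one from the final ECMA standard and is an assumption here, since the beta may use a different one. The sample table and the function name wordml_table_to_cals are made up for the example, and spans (gridSpan, vMerge), column widths, and formatting are all ignored.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI from the final ECMA-376 standard; the beta may differ.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = f"{{{W_NS}}}"

# Hypothetical sample: a 2x2 table in WordprocessingML-style markup.
SAMPLE_TBL = f"""
<w:tbl xmlns:w="{W_NS}">
  <w:tr>
    <w:tc><w:p><w:r><w:t>Part</w:t></w:r></w:p></w:tc>
    <w:tc><w:p><w:r><w:t>Qty</w:t></w:r></w:p></w:tc>
  </w:tr>
  <w:tr>
    <w:tc><w:p><w:r><w:t>Bolt</w:t></w:r></w:p></w:tc>
    <w:tc><w:p><w:r><w:t>4</w:t></w:r></w:p></w:tc>
  </w:tr>
</w:tbl>
""".strip()


def wordml_table_to_cals(tbl):
    """Convert a w:tbl element into a bare-bones CALS <table> element.

    Only the row/cell structure and the text are carried over.
    """
    rows = tbl.findall(f"{W}tr")
    ncols = max((len(tr.findall(f"{W}tc")) for tr in rows), default=0)

    table = ET.Element("table")
    tgroup = ET.SubElement(table, "tgroup", cols=str(ncols))
    for i in range(1, ncols + 1):
        ET.SubElement(tgroup, "colspec", colname=f"c{i}")
    tbody = ET.SubElement(tgroup, "tbody")

    for tr in rows:
        row = ET.SubElement(tbody, "row")
        for tc in tr.findall(f"{W}tc"):
            entry = ET.SubElement(row, "entry")
            # Join the text runs of each paragraph, then join paragraphs with a space.
            entry.text = " ".join(
                "".join(t.text or "" for t in p.iter(f"{W}t"))
                for p in tc.findall(f"{W}p")
            )
    return table


cals = wordml_table_to_cals(ET.fromstring(SAMPLE_TBL))
print(ET.tostring(cals, encoding="unicode"))
```

Even this toy version shows that the two models line up reasonably well for simple grids; the real work in any such transform is in the spans and formatting that the sketch skips.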

I also read some of the commentary on both sides of this issue and found it amusing. It's amusing because it's just so typical of everyone involved.

My feelings about Microsoft as an enterprise are no secret, but I'll outline them here:

- As a true blue legacy IBMer I was raised with a built-in hatred of Microsoft (I lived through the whole Windows vs. OS/2 times, having started at IBM about the time the IBM XT was released). I try not to let this youthful indoctrination color my objective analyses too much.

- I feel strongly that enterprises should compete on value not proprietary lock-in and therefore have many objections to Microsoft's core business practices. This is particularly frustrating to me because of my next opinion.

- Microsoft has lots of smart people who can and do create excellent software. That is, Microsoft is more than capable of competing on value alone, at least now that it has established market dominance. Of course there is the issue of free vs. licensed software that does throw a wrinkle into this equation--if OpenOffice is free how does Microsoft get legitimate revenue in a value-only competition? They would have to offer enough extra value to make it worth paying for. In fact they probably do but it would be a leap of faith for them to go that route (although Office Open XML may in fact represent an unavoidable move in that direction anyway, see below).

- Microsoft also has lots of people, smart or not, who make totally boneheaded design and implementation decisions that then get baked into products forever. I'm thinking specifically of the fact that Word has not been able to manage the auto-numbering of nested numbered lists since Version 2 (and maybe not before then). Some of this is just people not thinking it through, as always happens in software development, but I think a lot of it is a corporate culture of "get it out quick, we'll fix it in the next release", that is, not valuing engineering quality quite as much as I think they should (which really means caring more about maximizing revenue than about providing the best possible solutions to customers--which if you're a stockholder is a good thing but if you're a user is a bad thing [that being one of the essential problems with Capitalism as an economic system]).

To a large degree this makes Microsoft no different from most software companies. The difference of course is Microsoft's monopoly position in both operating systems and office software--it paints a big target on their backs. But Microsoft isn't doing anything that IBM didn't do for 20 years before the PC came out.

I used to rant about how evil MS Office (and in particular MS Word) was as a proprietary format--it locked your data into a format you didn't own and over which you had no control. That was definitely bad and anyone who accepted that agreement was a dupe and a fool. This led of course to discussions of why (at the time) SGML was A Better Way. And what tool were the slides for those presentations done in almost without exception? Of course it was PowerPoint. [I did try on occasion to hack my own SGML-based presentation systems but I never had the time or tools to make it really work and I had to be able to interoperate with my less-enlightened colleagues.]

I must also confess that after years of resisting I got an XBox and actually subscribe to Official XBox Magazine (and Lego Star Wars II is going to ROCK). So clearly when they want to do it right Microsoft can: they're big, they've got lots of talent at their disposal. In short, they can choose to do things however they want to.

And I'll just add that for all the rantings I've spewed about Bill Gates and his evil business practices, the Bill and Melinda Gates Foundation demonstrates that he's actually got a heart and is actively trying to do serious good for the world, so full props to Mr. Bill for putting his billions to use.

Oh, and I hate MS Word with the fiery passion of a thousand burning suns. I'd sooner chew off my own arm than spend any time actually authoring words in Word. I've spent so many years authoring XML that having to deal with $*%&# like a backspace at the end of a paragraph destroying its formatting with no good way to get it back, or the complete inability to do autonumbering, or any number of other just stupid things that people tolerate day after day for reasons that I can't understand, and the egregious waste of productivity that I've observed in my own XML-steeped colleagues who are literally sitting next to me, just makes me want to SCREAM. But that's just me.

So what about Office 2007 and Office Open XML?

I'm not going to bother to form a technical opinion about the relative merits of, for example, ODF and OOX because it just doesn't matter. I mean really. At the end of the day the people who create Word documents (poor bastards) or spreadsheets or presentations are the ones who care and they only care about whether they can get the work done reasonably quickly and does it look right? They don't care about formats or XML data islands or how metadata is stored relative to the core content. They also don't, by and large, care about interoperation because everybody uses Word don't they?

Microsoft has consistently demonstrated that their policy is to use standards only when it suits their interests. They were dragged into XML kicking and screaming (despite being founding members of the XML Working Group) because they knew it would be a chink in their proprietary armor that would allow wedges to be driven in. But then XML took hold and they had no choice so they embraced it, which is to their credit. That they embraced it by just XMLifying RTF is no surprise but at least they did it. And they documented it, something they never did completely with RTF (I'm sure there are those of you who remember when alternating versions of Word would fail to parse RTF that was valid per the RTF spec in different ways).

And Office 2003 even let you edit XML documents in arbitrary schemas (as long as they were in a namespace and defined using XSD schemas, a decision which is too strict but since it's my preferred policy for XML usage generally I can't really fault them). Of course this feature is largely useless for lots of reasons but hey they did it, so good for them. It demonstrated that they weren't just giving lip service to XML--they took the trouble to design and build a working arbitrary XML editor. [Now if they would just make it useful I would be happy.]

But it's hard not to see Office Open XML as a cynical attempt to satisfy the European Union and fight OpenOffice in the standards arena. All the arguments about "backward compatibility" and "we have to support all the features" are really not germane: if they really cared about there being a single universal standard for office documents they would have started with ODF and gone from there, since it already existed and is certainly close enough to what they need to be a starting point. They could have chosen to eat the cost of using MathML instead of their own math presentation markup. They could have chosen to use SVG instead of their own vector graphic language. It would have cost more, both in development time and application migration, but they could have easily said "As a company we are fully committed to open standards and are willing to do what it takes to make it work." But they didn't, for whatever reason. This saddens me a little, because there was an opportunity here that would have had some real benefit, probably, but it doesn't surprise me at all (in fact, if they had done it that would have surprised me).

I don't think any of this will materially change the day-to-day situations of people who use office software (whether MS Office or OpenOffice).

I do think it's a good thing that Office 2007 now stores its data exclusively in XML by default and I think the use of Zip files to organize the different parts which are stored as individual documents is the right thing to do and I applaud Microsoft for that decision.
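
For anyone who wants to poke at this packaging, here is a minimal sketch using nothing but Python's standard zipfile and ElementTree modules. To stay self-contained it first builds a tiny stand-in package in memory and then reads it back; the part names ([Content_Types].xml, _rels/.rels, word/document.xml) and the namespace URIs follow the final ECMA standard and are assumptions with respect to the beta. The stand-in is not claimed to be a file Word will open, and the function names are made up for the example.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Assumption: namespace URIs from the final ECMA-376 standard; the beta may differ.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
PKG_NS = "http://schemas.openxmlformats.org/package/2006"
DOC_REL = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument"
DOC_TYPE = "application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"


def build_sample_package():
    """Build a tiny stand-in for a .docx package, entirely in memory."""
    parts = {
        "[Content_Types].xml": (
            f'<Types xmlns="{PKG_NS}/content-types">'
            '<Default Extension="rels" '
            'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
            f'<Override PartName="/word/document.xml" ContentType="{DOC_TYPE}"/>'
            "</Types>"
        ),
        "_rels/.rels": (
            f'<Relationships xmlns="{PKG_NS}/relationships">'
            f'<Relationship Id="rId1" Type="{DOC_REL}" Target="word/document.xml"/>'
            "</Relationships>"
        ),
        "word/document.xml": (
            f'<w:document xmlns:w="{W_NS}"><w:body>'
            "<w:p><w:r><w:t>Hello, package</w:t></w:r></w:p>"
            "</w:body></w:document>"
        ),
    }
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, xml in parts.items():
            zf.writestr(name, xml)
    return buf.getvalue()


def list_parts(package):
    """Return the names of the parts stored in the Zip package."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        return zf.namelist()


def read_paragraphs(package):
    """Return the text of each w:p paragraph in the main document part."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    return [
        "".join(t.text or "" for t in p.iter(f"{{{W_NS}}}t"))
        for p in root.iter(f"{{{W_NS}}}p")
    ]


package = build_sample_package()
print(list_parts(package))
print(read_paragraphs(package))
```

To try the same thing on a real file, read the saved document's bytes and pass them to list_parts and read_paragraphs instead of the stand-in. The point is that no Office-specific library is involved: a Zip reader and an XML parser are enough to get at the content.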

And even though the ECMA standardization of Office Open XML is driven by cynical business motives, it's still a standard which means that it is truly open (in the sense that there is no license cost or exposure for using the format or implementing support for it) which will be to our benefit. I suspect that it will have the same effect that using XML did: it will force Microsoft to compete more on value than on lock-in, to engineer things a bit more carefully, and to be more consistent in their implementations from release to release.

For integrators it definitely makes it easier for us to connect things to Office (i.e., creating an X-to-OOX transform or adapter) with some assurance that the code we write today will still work five years from now.

So while I think it's pretty clear that Office Open XML was driven almost entirely by self-serving business needs, I can't see how it's a bad thing in general and it looks like it's actually a good thing if you recognize the reality that most office documents are in fact created in MS Office.

Now as for the new user interface--that's going to take some getting used to, but since I don't use Word it doesn't really matter to me, does it?



Anonymous said...

I think MS Word's autonumbering is a scheme by MS to take over the world. Hate it! It's got more bugs than my shorts. But alas, I am stuck....

Why do the documents I receive from everybody else seem to execute auto-numbering without problems? Can anyone out there direct me to a fix?

Jim In Louisville...

10:00 PM  
