Recently Alex Brown of the ISO made a blog post about a test in which he validated a Word 2007 DOCX (OOXML) file against the final OOXML standard as ratified by the ISO. The infamous Groklaw promptly took up the results, blew them out of proportion and came up with the sensational verdict that "Office 2007 itself fails the OOXML standard with 122,000 errors". Other news sites quickly picked this up and started reporting the same. I too was quite surprised by this result till I went over to the actual blog post and read through it myself, instead of trusting what Groklaw had reported. Let's analyze what's been said.
Very clearly, Alex states that:
The STRICT conformance model is quite a bit different from Ecma 376, essentially because most of that format's most notorious features (non ISO dates, compatibility settings like autospacewotnot, VML, etc.) have been removed. Thus the expectation is that existing Office 2007 documents might be some distance away from being valid according to the strict schemas [My emphasis --Vinod]
This basically means that since the format's specification has changed (due to the changes requested by many countries in the first round of voting after it was submitted), it can be expected that these changes haven't been implemented in the product yet. It's obvious if you think about it. Microsoft submitted the original specification for its OOXML format to the ISO. When countries decided that the specification required a large number of changes, Microsoft went back, worked hard and incorporated those changes into the specification. Obviously they didn't spend time and effort building those changes into the product itself before the specification was accepted - since for all they knew it might get rejected again or more changes could have been asked for.
So when there are 122K errors on the STRICT conformance model, it is to be expected - as Alex Brown very clearly states above. Somehow people tend to skip over that part for their own convenience. The really great part comes a little further down:
TRANSITIONAL conformance model is quite a bit closer to the original Ecma 376. Countries at the BRM (rather more than Ecma, as it happened) were very keen to keep compatibilty with Ecma 376 and to preserve XML structures at which legacy Office features could be targetted. The expectation is therefore that an MS Office 2007 document should be pretty close to valid according to the TRANSITIONAL schema.
Sure enough (again) the result is as expected: relatively few messages (84) are emitted and they are all of the same type ... [My emphasis again --Vinod]
Reading this tells you that a second conformance model, TRANSITIONAL, also exists. It covers a superset of what STRICT allows, and Office 2007 is expected to be compatible with it. And surprise, surprise, it sure is. There were 84 warnings generated for the same document using the TRANSITIONAL model - and they were all for an element
which according to the specification should have been using "true" instead of "on" (and "false" instead of "off"). That's it - a simple little thing to fix, isn't it? Now that the OOXML spec is becoming a standard, MS can go ahead and change the product to conform to it, and ship the fix in any major Office 2007 update as well as in the next version of Office. And this is what the entire hullabaloo was about.
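To give a sense of how small that kind of fix is, here is a minimal Python sketch that rewrites "on"/"off" attribute values to "true"/"false" in an XML fragment. Everything in it is an assumption made for illustration: the `doc` and `flag` element names, the `val` attribute and the `normalize_on_off` helper are hypothetical, and this is not how Office or Alex's validator actually works.

```python
import xml.etree.ElementTree as ET

# Legacy spellings mapped to the boolean spellings the post says the spec expects.
LEGACY_TO_BOOLEAN = {"on": "true", "off": "false"}


def normalize_on_off(xml_text):
    """Rewrite on/off attribute values to true/false.

    Returns the rewritten XML as a string and the number of values changed.
    """
    root = ET.fromstring(xml_text)
    fixed = 0
    for element in root.iter():
        # Copy the items so the attributes can be updated while looping.
        for name, value in list(element.attrib.items()):
            if value in LEGACY_TO_BOOLEAN:
                element.set(name, LEGACY_TO_BOOLEAN[value])
                fixed += 1
    return ET.tostring(root, encoding="unicode"), fixed


# Hypothetical fragment: element and attribute names are made up for this sketch.
SAMPLE = '<doc><flag val="on"/><flag val="off"/><flag val="true"/></doc>'

fixed_xml, fixed_count = normalize_on_off(SAMPLE)
print(fixed_count)
print(fixed_xml)
```

A real document would use namespaced elements and only specific attributes would be affected, so a production fix would target those by name rather than rewriting every matching value.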
Basically I think it is time that news sites read the original source of any "news" and make their own interpretations, rather than rely on obviously biased reports from sites like Groklaw or Slashdot. Anti-OOXML fanatics also need to get their act together and, when they make a claim, substantiate it with actual facts rather than spewing fire and brimstone over nothing. You can also read a much more detailed analysis of this over at Doug Mahugh's blog and discuss it in different forums.