- From: James Clark <jjc@jclark.com>
- Date: Sat, 15 Sep 2012 10:25:44 +0700
- To: Uche Ogbuji <uche@ogbuji.net>
- Cc: public-microxml@w3.org
- Message-ID: <CANz3_Eb3C5PCJ60P+Gu=sDdJV20qMxY5BexCT2d6nSrkbCccGQ@mail.gmail.com>
On Sat, Sep 15, 2012 at 3:31 AM, Uche Ogbuji <uche@ogbuji.net> wrote:

> And I want to reiterate that for me the compatibility goal means all
> MicroXML docs are WF XML. I do not think we should be constrained to have
> a fully backward compatible data model, though I agree we should carefully
> consider each DM incompatibility we introduce (as we're doing in this case).

I would put things slightly differently. I think compatibility with the data model is a goal, but it's not a hard constraint: we might not achieve it fully when it conflicts with other goals.

This issue is very difficult because it involves conflicting goals.

(a) normalize newlines/tabs to spaces: this conflicts with our goal to minimize the ugliness/weirdness in MicroXML (we ought to formulate this as an explicit design goal)

(b) no literal tabs and newlines in attribute values: this conflicts with our goal of supporting authoring in plain text editors; forbidding tabs also conflicts (to a lesser extent) with our goal to minimize ugliness/weirdness

(c) newlines in attribute values allowed and left as newlines: this conflicts with the goal of data model compatibility

How bad is (c)?

- it's an uncorrectable difference in parsing; you cannot fix it up with a post-parsing stage, because newlines/tabs that are entered as numeric character references are preserved in XML, and the data model does not tell you which characters came from numeric character references

- any difference in parsing, however slight, does make a difference for some applications like digital signatures

- with our hardline approach in excluding XML Namespaces, this is currently the only data model incompatibility: there's a big difference between being 100% compatible and being anything less; in the former case users don't have to think about the issue at all

- being 100% compatible would make the "marketing" message much crisper and easier to understand

On the other hand

- I don't think any users would actually want the XML behaviour

- I suspect applications that deal with token lists will handle newlines and tabs just fine (anything that follows either the XML Schema rules on lists or the RELAX NG rules will handle them)

The tab issue makes (b) less attractive to me. As for (a), it also seems really bad to me: this would be the only thing in the rules of how to construct the data model from a WF MicroXML document that would be totally surprising to somebody unfamiliar with XML.

I think my preference is (c).

James
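
For readers unfamiliar with the XML behaviour discussed above, here is a minimal sketch of it, assuming Python's standard xml.etree.ElementTree module (which uses the expat parser). The helper name attr_value and the sample documents are made up for illustration. It shows that a literal newline or tab in an attribute value is normalized to a space, while the same characters written as numeric character references survive, and that once parsing is done nothing records which spelling was used. It also shows that whitespace-separated token lists come out the same either way.

```python
import xml.etree.ElementTree as ET


def attr_value(xml_text, name="b"):
    """Parse a small XML document and return one attribute of its root.

    Hypothetical helper for illustration only.
    """
    return ET.fromstring(xml_text).get(name)


# The Python escapes \n and \t put a literal newline and tab into the
# document text, so the XML parser sees raw whitespace in the attribute.
literal = attr_value('<a b="x\ny\tz"/>')

# Here the document text contains the character references themselves.
char_refs = attr_value('<a b="x&#10;y&#9;z"/>')

# XML attribute-value normalization turns the literal newline and tab
# into spaces, but leaves the referenced characters untouched.
print(repr(literal))    # 'x y z'
print(repr(char_refs))  # 'x\ny\tz'

# A token-list consumer that splits on any whitespace sees no difference.
print(literal.split() == char_refs.split())  # True
```

Under option (c), a MicroXML parser would report 'x\ny\tz' for the first document as well, which is the data model difference in question.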
Received on Saturday, 15 September 2012 03:26:32 UTC