Re: Proposal: @parsing="loose | strict"

From: Felix Sasaki <felix.sasaki@fh-potsdam.de>
Date: Wed, 15 Jul 2009 19:59:19 +0900
Message-ID: <ba4134970907150359n79a4002av296cbd9d9fa102e2@mail.gmail.com>
To: Larry Masinter <masinter@adobe.com>
Cc: Doug Schepers <schepers@w3.org>, "www-tag@w3.org WG" <www-tag@w3.org>
2009/7/15 Larry Masinter <masinter@adobe.com>

> (I'm cc'ing www-tag rather than public-html since my
> comments are relative to the TAG versioning issue, and
> I don't have much confidence that the HTML-WG wants to
> do more about versioning than get a report from the TAG.)
>
>
>
> Doug, I like this approach. I think it fits into the general
> direction we're taking in the TAG to deal with "versioning".
> It's not clear to me whether the global indicator that the
> language "version" you want has strict parsing belongs as a
> separate attribute. It's also not clear whether old user agents,
> which would ignore parsing="strict" and parse non-strict content
> anyway, would lead to the rise of material that is marked "strict"
> but in fact only works with "loose".
>
> I also wonder whether there are more kinds of strictness than
> just "parsing" strictness that are also useful, and how many
> strict/non-strict version indicators we would need to capture
> them.



Another kind of strictness might concern whether distributed extensibility
is allowed or disallowed, e.g. with the values "extensible" or "nonextensible".

Felix



>
>
> For example, are there "strict" versions of JavaScript APIs that
> don't allow access to document.root, in order to allow
> for mashups?
>
> Just trying to get the issues out on the table, with the idea that
> a holistic approach to versioning might also deal with strict vs
> loose parsing.
>
> Larry
> --
> http://larry.masinter.net
>
>
> -----Original Message-----
> From: tag-request@w3.org [mailto:tag-request@w3.org] On Behalf Of Doug
> Schepers
> Sent: Tuesday, July 14, 2009 12:16 AM
> To: public-html@w3.org
> Subject: Proposal: @parsing="loose | strict"
>
> Hi, HTML WG-
>
> There are advantages and disadvantages to both the strict ("draconian")
> and error-correcting parsing of markup.  HTML evolved to have loose
> parsing with undefined and browser-specific error correction, and XML
> was designed with well-defined strict parsing (probably as a
> reaction to the chaotic HTML approach).
>
> We have come full circle on the matter, and the HTML5 spec marries many
> of the advantages of both approaches, by offering a well-defined
> error-correction model.  This has the advantage that it is sometimes
> easier to author (though it can make debugging more difficult), the more
> profound advantage that it hides problems from the reader, and the even
> more important advantage that it is more or less how browsers already
> parse HTML documents.
>
> However, it cannot gracefully address all the situations in which strict
> parsing is an advantage:
>
> * For authoring, it is often useful to know when you have validity or
> well-formedness errors, which helps debug script and CSS, and doing this
> on the fly in the browser is faster and easier while developing than
> reiterative validation with a separate tool;
>
> * Strict markup works predictably for mashups and mixtures of different
> markup languages;
>
> * Draconian error handling enforces structure and content models for
> mission-critical applications, such as the canonical "financial
> transactions" example, where the reader *wants* to know about problems
> in the markup [1], and for use cases with a low tolerance for
> potential errors (such as government and some industries).
>
> To meet this need, I propose a new attribute, 'parsing', which, when
> placed on the document root, defines the type of parsing which a UA must
> use when parsing the document.  The values would be "loose" and
> "strict", with loose parsing as the default (an omitted @parsing
> attribute would result in loose parsing).
>
> When the parsing is loose, the error-correction algorithms defined in
> HTML5 must be applied; when the parsing is strict, there must be no
> error-correction (as is commonly the case for XHTML in most browsers).
>
> This way, authors could optionally enforce strictness when they want or
> need to, and then change/remove the value when they are ready for
> publication, or when the needs change.  It is possible that there would
> be instances where strict parsing makes it out of development and into
> production code, but this would have relatively few negative
> consequences (the kind of author who uses this would probably produce
> strict code anyway, and would know it if they didn't), and would be
> easily corrected.  And, quite frankly, some people simply prefer
> stricter parsing for aesthetic or other reasons, and this would provide
> them with that option while not imposing it on others.
>
>
> Had this option been available in XML from the beginning, many problems
> and community schisms might have been avoided.  I believe that presenting
> the option for strict parsing may change how the various communities
> approach HTML5, and avoid further schisms.  I see this as having
> relatively low costs for the specification, and very little
> implementation cost, since browsers will already have both modes (even
> IE has a built-in XML parser, though it doesn't use it for XHTML).
> Please correct me if my assumption here is wrong.
>
> I also believe that this is backwards-compatible, since the default will
> be loose parsing as is already applied, and forwards-compatible, since
> any alternate future parsing models (such as the proposed XML2 or XML5,
> or some use case we don't see today) can be specified as the value for
> @parsing in a later specification without changing how it would be used
> as defined in HTML5.  It may lay the groundwork for a new formulation of
> error-correcting XML, as Anne proposed.
>
>
> I'm hoping that the dust has sufficiently settled about the parsing
> debate that we can hold a logical discussion of this proposal on its
> merits.
>
>
> (Meta: I chose the keywords of the attribute and values for brevity, and
> I'm not at all married to them; treat them as placeholders for the
> purposes of discussing this proposal; another option might be something
> like @error-correction="true | false".  Please don't suggest different
> names quite yet unless they represent a functional difference to this
> proposal.  Also, I've BCC'ed the TAG just so they know.)
>
> [1] http://www.tbray.org/ongoing/When/200x/2004/01/11/PostelPilgrim
>
> Regards-
> -Doug Schepers
> W3C Team Contact, SVG and WebApps WGs
>
>
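
For illustration only, the dispatch that Doug's proposal describes (look at
the 'parsing' attribute on the document root, default to "loose", and refuse
to error-correct when it is "strict") could be sketched roughly as below.
Everything in it is an assumption made for the sketch, not part of the
proposal: the function names declared_mode and parse_document are
hypothetical, Python's xml.etree.ElementTree stands in for a strict parser,
and html.parser stands in for an error-tolerant one (it is not the HTML5
error-correction algorithm).

```python
import re
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical helper pattern: finds parsing="strict" or parsing="loose"
# on the <html> start tag. A real UA would use its tokenizer, not a regex.
_ROOT_PARSING = re.compile(
    r"<html\b[^>]*?\sparsing\s*=\s*[\"']?(strict|loose)\b", re.IGNORECASE
)


def declared_mode(markup):
    """Return the mode declared on the root; an omitted attribute means loose."""
    match = _ROOT_PARSING.search(markup)
    return match.group(1).lower() if match else "loose"


class _TagCollector(HTMLParser):
    """Tolerant stand-in parser: records start tags and never raises on bad nesting."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def parse_document(markup):
    """Parse according to the declared mode and return (mode, start tags seen)."""
    mode = declared_mode(markup)
    if mode == "strict":
        # No error correction: malformed markup raises ET.ParseError.
        root = ET.fromstring(markup)
        return mode, [element.tag for element in root.iter()]
    collector = _TagCollector()
    collector.feed(markup)
    collector.close()
    return mode, collector.tags
```

With this sketch, a document whose root carries parsing="strict" and contains
an unclosed <p> raises ET.ParseError, while the same content without the
attribute is accepted and its tags are reported. That is the author-visible
difference the proposal is after.
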
Received on Wednesday, 15 July 2009 11:00:03 GMT
