Re: Who is the Intended Audience of the Markup Spec Proposal? from Lachlan Hunt on 2009-01-23 (public-html@w3.org from January 2009)

From: Lachlan Hunt <lachlan.hunt@lachy.id.au>
Date: Fri, 23 Jan 2009 15:05:14 +0100
To: "Michael(tm) Smith" <mike@w3.org>
Cc: public-html <public-html@w3.org>
Message-ID: <4979CE9A.8010103@lachy.id.au>
Michael(tm) Smith wrote:
> Lachlan Hunt <lachlan.hunt@lachy.id.au>, 2008-11-19 19:39 +0100:
>>    Regarding the draft of the Markup Spec [1], it is unclear to me who the 
>>  intended audience of the document is. [...]
>>  While it is clear that the scope is limited to markup producers, there are 
>>  still a wide variety of such producers, each with many overlapping and 
>>  distinct needs.  In order to review the proposal in the proper context, it is 
>>  necessary to clearly define who the intended audience is so that it can be 
>>  evaluated in regards to whether or not it adequately addresses the needs of 
>>  that audience.
> 
> I agree that clearly defining somewhere the intended audience for
> any spec is in principle potentially quite useful.
> 
> But would you say that it's also necessary or useful for the 
> existing HTML5 draft to have its intended audience clearly defined 
> somewhere? Or that it also can't be evaluated without such a clear 
> definition of its intended audience?

The HTML5 spec already includes a section that seems to define its 
audience, and, although there's always room for improvement, it seems 
fairly clear to me already.

http://www.whatwg.org/specs/web-apps/current-work/#audience

> I would certainly not want to make the mistake of attempting to 
> define the intended audience but failing to articulate it broadly 
> enough or accurately, and so end up prematurely constraining 
> evaluations about its usefulness.

The problem is that the current scope is too broad and contains somewhat 
contradictory statements about who is and is not included.

For instance, it states:

   "This specification limits its scope to providing the details
    necessary for HTML producers to create conformant documents."

where _HTML producers_ is defined as:

   "HTML authors (that is, people) [...] that produce HTML content."

However, the Out of Scope section contradicts this by stating:

   "does not attempt to be a “tutorial” or “how to” authoring guide"

The issue is that the scope statement doesn't adequately limit itself to 
the subset of authors who are comfortable with reading and understanding 
formal grammars and who have some use for them.  However, such authors 
are few and far between.  Based on the content of the document, a more 
accurate scope would include developers of conformance checking tools, 
including online services like v.nu, HTML editors that provide a 
validation feature (like Dreamweaver), and CMSs that use conformance 
checking as part of their input validation.  Thus I believe this 
document is more suitable for authoring tool developers than it is for 
authors.

>>  The set of markup producers include authors of various skill levels, 
>>  authoring tool vendors (like Dreamweaver), tutorial writers, CMS vendors, a 
>>  wide range of other tool vendors whose products provide some form of 
>>  automatically generated HTML output.
> 
> I would say that all of the above are potentially part of the 
> intended audience for the document...
> 
>>  Each of those groups requires different kinds of information from different 
>>  sections of the spec, depending on their specific needs and abilities.  For 
>>  instance, an author who's just writing a blog needs the syntax, semantics 
>>  and content models, but needs it in a way that is very reader friendly.
> 
> I will concede that a casual author of that sort is not likely to 
> be relying directly on this document for information,  but more
> likely instead to rely on an authoring guide

In that case, I think we can agree that the scope needs to be more 
restricted in this area.

>>  Whereas an authoring tool vendor would be able to handle a more formal 
>>  grammar to explain the conforming syntax, but also very likely needs the 
>>  parsing
> 
> I would think (hope) that most authoring-tool vendors are not 
> going to be writing their own HTML5 parsers from scratch,

While it's certainly possible for authoring tools to reuse existing 
parsing libraries, historically, some such tools have included their own 
parsing, and in some cases, rendering.  For example, Dreamweaver has 
traditionally relied on its own custom parsing, rendering and validation 
modules, rather than reusing existing libraries.

> I can't see that authoring-tool developers necessarily need to 
> know or care what the parsing rules are. I can imagine some of 
> them might rather just not know at all and would prefer a separate 
> spec that provided just the document-conformance criteria.

While basic source code editors that provide basic syntax highlighting 
may not need full parsing and generally get by with basic pattern 
matching, WYSIWYG editors at least need some form of parsing and any 
editor that includes a validation feature — something that the formal 
Relax NG grammars found in this document are well suited to — will need 
parsing.  Whether they implement the parsing themselves or use an 
off-the-shelf library is a different question.
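
To illustrate the difference between pattern matching and parsing, here is 
a minimal sketch in Python.  It assumes the standard library's html.parser 
module as a stand-in for a real parser (it is not an HTML5-conformant 
parser), and the sample markup, the pattern, and the names TAG_PATTERN, 
TagCollector, tags_by_pattern and tags_by_parser are all hypothetical, 
chosen only for this example.

```python
import re
from html.parser import HTMLParser

# Hypothetical sample: the attribute value contains a ">" character.
SAMPLE = '<a title="a > b">link</a>'

# Naive pattern of the kind a simple syntax highlighter might use.
TAG_PATTERN = re.compile(r"<[^>]+>")


def tags_by_pattern(source):
    """Return everything the naive pattern considers a tag."""
    return TAG_PATTERN.findall(source)


class TagCollector(HTMLParser):
    """Collect start tags and their attributes as the parser reports them."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append((tag, attrs))


def tags_by_parser(source):
    """Return the start tags found by an actual parser."""
    collector = TagCollector()
    collector.feed(source)
    collector.close()
    return collector.start_tags


if __name__ == "__main__":
    print(tags_by_pattern(SAMPLE))  # the pattern cuts the start tag short
    print(tags_by_parser(SAMPLE))   # the parser keeps the attribute intact
```

The pattern stops at the first ">" and so splits the start tag in the 
middle of the attribute value, which may be tolerable for highlighting but 
not for validation or WYSIWYG editing.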

>>  Based on the way the draft is written, it's not clear that the draft 
>>  adequately addresses the needs of any particular group, nor provides all the 
>>  required information.
> 
> Whether or not it provides all the required information depends on
> what you think all the required information is. This spec is an
> attempt at defining only the required information that is
> necessary for determining whether a particular serialized instance
> of the HTML language is a conformant instance of the language or not.

It would help if that was stated clearly in the scope section.

>>  It also purports to be a normative document, which indicates that it's 
>>  supposed to be more than just an informative syntax guide.
> 
> Yes, it does currently, though it didn't initially and it's not 
> outside of the realm of possibility that we could eventually end 
> up deciding that it should only be informative and not normative.
> 
> The main reason I added the text that's in there now that asserts 
> it's normative was that I want it to be evaluated using the same 
> expectations for rigor and precision and lack of ambiguity that 
> readers have (or should have) for the existing HTML5 spec or any 
> other spec that attempts to be rigorous and precise and 
> unambiguous.

All documents, whether they are normative or informative, should be held 
to the same rigour and precision.  That in itself is not a satisfactory 
reason for this to be a normative document.

> I could see that without some explicit statement in 
> there making it clear it attempts to be normative, some people 
> were already describing it as an authoring guide and evaluating it in 
> terms of their expectations for what an authoring guide should be, 
> and not in terms of what the spec actually attempts to be.

I believe this has more to do with the scope describing it as being for 
authors, rather than appropriately limiting it to being primarily for 
developers of conformance checking utilities.

>> This is a problem because it duplicates and restructures a lot of 
>> information from the spec itself, but not always by copying it
>> verbatim. Even if it did copy everything verbatim and eliminated the 
>> possibility of conflict, I don't understand why any of it needs to 
>> be normatively defined twice.
> 
> If we were to move forward with publishing this spec as a 
> Recommendation-track deliverable, I would expect that ideally it 
> would be as the single normative definition of what a conformant 
> HTML document instance is. That is, we would not be normatively 
> defining anything twice -- whatever we published in this document 
> would necessarily supersede any definition anywhere else of what a 
> conformant HTML document instance is.

Previous threads have already dealt with the idea of splitting the 
spec along the lines of authoring and implementation conformance 
requirements, and it has already been explained why doing so isn't 
workable.  Given that I've established above that this document isn't 
suitable for most authors, attempting to use it to do so will be even 
less workable.

> But I suppose that even if we were not to make it normative, this
> document could also have some value as an informative source. And
> if we were to decide to publish it as such, the current assertions
> within it about it being the normative definition for the language
> would need to be changed to make it clear that it's not.

Agreed.

>> Since both would be normative, what would happen in the event of a conflict? 
> 
> If it were published as a normative spec, it would necessarily be
> the only normative definition of what a conformant document
> instance of the HTML language is, so there would be no conflicts.

Splitting out the authoring conformance requirements from the main spec 
isn't an option, for the reasons outlined in previous threads.

>>  There is also some overlap with the information provided in the authoring guide, 
>>  such as providing the element content models, describing the syntax, etc.
> 
> I suppose that's an unavoidable overlap. It's not possible for it
> to normatively define what a conformant document is without 
> providing the content models or the syntax rules.

Fair enough.

>>  However, unlike the authoring guide, it's written in what appears to be some 
>>  type of formal grammar, which isn't really reader friendly, and seems less 
>>  suitable for authors than the way the spec itself is written.
> 
> The choice of using a formalisms to define the content models is
> not entirely consistent with the approach used in specs defining
> other document formats.

I don't understand what you're trying to say here.

> That spec uses both normative prose and informative formalisms.
> And the content models in my draft are expressed in the same
> formalism (RELAX NG Compact Schema) as what that spec uses.

The major problem with having formal grammars like Relax NG in a 
normative spec is that we should not be endorsing any one set of formal 
grammars over any other, regardless of whether that's DTDs, Relax NG, or 
any other.  Even including them as informative sections within a 
normative spec is likely to be problematic.

As you're aware, in the past, HTML 4 and XHTML 1 have endorsed official 
DTDs, yet these days, the problems and limitations with DTDs for 
expressing conformance are well known.  But despite these limitations, 
DTD-based validation is widely seen as the One True Way of evaluating 
conformance of HTML4 and XHTML1.  We should not repeat this same mistake 
with HTML5.  Conformance checking tools should be free and encouraged to 
use alternative methods, based on their own needs, just like any other 
implementers.

However, describing and publishing the Relax NG schemas in an 
informative note will likely serve as a useful guide for people, even if 
they're using alternative techniques.  Additionally, if any other 
schemas are produced, they too could potentially be published informatively.

-- 
Lachlan Hunt - Opera Software
http://lachy.id.au/
http://www.opera.com/
Received on Friday, 23 January 2009 14:05:56 UTC