- From: Karl Dubost <karl@w3.org>
- Date: Thu, 14 Jun 2007 15:22:39 +0900
- To: Henri Sivonen <hsivonen@iki.fi>
- Cc: HTML WG <public-html@w3.org>
On 14 June 2007 at 14:16, Henri Sivonen wrote:
> The applicable conformance criteria are [machine-checkable]
> criteria for document conformance both in the spec itself and in
> normatively referenced other specs.

Yes, we are on the same line here. My trouble is with what [machine-checkable] means. I can see people arguing over what is automatically checkable and what is not. That is why I said an objective list of criteria to include or exclude would make it easier.

>> "A conformance checker must check for the first two criteria.
>>    1. Criteria that can be expressed in a DTD.
>>    2. Criteria that cannot be expressed by a DTD, but
>>       can still be checked by a machine.
>>    3. Criteria that can only be checked by a human."
>
> I think it is a bad idea to formulate by mentioning "DTD", because
> it wrongly implies an implementation where non-DTD checks augment
> DTD-based validation.

Good point. The same goes for DTD/XML Schema/RelaxNG/whatever schema language. Leaving it out will make the conformance criteria easier to read. What about:

    "An HTML 5 conformance checker must implement all the
    machine-checkable criteria of this specification.

    Note: There are criteria that can only be checked by a
    human and therefore do not affect an HTML5 conformance
    checker. Some of the machine-checkable criteria can't be
    expressed in the current schema languages. You should not
    rely only on schema languages to create an HTML5
    conformance checker."

>> Then there is work to do to know what we consider checkable by
>> machine or by a human.
>
> If something is checkable algorithmically without a probabilistic
> heuristic (i.e. without guess about the author's intent or about
> the meaning of natural-language text), it is machine checkable. In
> my experience, at least with a computer science background, it is
> obvious whether a given conformance criterion is machine-checkable
> when reading the spec.

Let's try with a concrete example: the q element for quotes.

    The q element represents a part of a paragraph quoted
    from another source.

Does that mean q must be contained in a paragraph (address, aside, nav, footer, li, dd, figure, and p)?
http://dev.w3.org/cvsweb/~checkout~/html5/spec/Overview.html#paragraph
Or does it mean that the content of the q element is a part of a paragraph?

If it's the former, it means div > q fails. If it's the latter, we can't check that it is indeed part of a paragraph from another source, except in a closed system.
- Let's say human criterion.

    Content inside a q element must be quoted from another
    source, whose URI, if it has one, should be cited in the
    cite attribute.

- Not verifiable. Human criterion, except in a closed system, but it might be a repetition of the first sentence, depending on how we interpret it.

    If the cite attribute is present, it must be a URI (or IRI).

- Checkable. It means that the HTML5 conformance checker has to check that it is a valid URI or IRI.
  http://www.ietf.org/rfc/rfc3986.txt
  http://www.ietf.org/rfc/rfc3987.txt

    User agents should allow users to follow such citation
    links.

- N/A. Conformance checkers are not user agents.

    If a q element is contained (directly or indirectly) in a
    paragraph that contains a single cite element and has no
    other q element descendants, then, the citation given by
    that cite element gives the source of the quotation
    contained in the q element.

- Here it is tricky. Take the association in

    <p><q cite="urn:isbn:2-07010-579-2">Plus vague et plus soluble dans l'air,</q> est un vers de l'<cite>Art Poétique, Œuvres poétiques complètes, Paul Verlaine.</cite></p>

  Here a tool can extract

    "Plus vague et plus soluble dans l'air,"
    Art Poétique, Œuvres poétiques complètes, Paul Verlaine
    urn:isbn:2-07010-579-2

  I can have a process which uses only machines and tries to match the ISBN against the title and/or the author, using for example services like http://worldcatlibraries.org/wcpa/isbn/2-07010-579-2. It is even easier in the case of HTTP URIs. All of that can be done by machine only.
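
To make the machine-only part concrete, here is a minimal sketch in Python of the extraction step described above. It is an illustration under stated assumptions, not a conformance checker: the names QuoteCollector, extract_quotations and has_uri_scheme are made up for this example, only q and cite elements inside p are considered, and the URI test merely looks for an RFC 3986 scheme prefix rather than validating the whole URI or IRI. The lookup against an external catalogue is left out.

```python
import re
from html.parser import HTMLParser

# RFC 3986 scheme: ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ), then ":".
# Assumption: a scheme prefix is treated as "looks like a URI"; this is
# much weaker than full RFC 3986 / RFC 3987 validation.
SCHEME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9+.\-]*:")


def has_uri_scheme(value):
    """Return True if value starts with a syntactically valid URI scheme."""
    return bool(SCHEME_RE.match(value))


class QuoteCollector(HTMLParser):
    """Collect, for each <p>, its q elements (text, cite attribute) and cite elements (text)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.paragraphs = []   # finished paragraphs
        self._current = None   # paragraph being read, if any
        self._target = None    # q/cite entry currently receiving text

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._current = {"q": [], "cite": []}
        elif self._current is not None and tag in ("q", "cite"):
            entry = {"text": "", "cite_attr": dict(attrs).get("cite")}
            self._current[tag].append(entry)
            self._target = entry

    def handle_endtag(self, tag):
        if tag in ("q", "cite"):
            self._target = None
        elif tag == "p" and self._current is not None:
            self.paragraphs.append(self._current)
            self._current = None

    def handle_data(self, data):
        if self._target is not None:
            self._target["text"] += data


def extract_quotations(html):
    """Return one record per q element found inside a p element."""
    parser = QuoteCollector()
    parser.feed(html)
    parser.close()
    records = []
    for para in parser.paragraphs:
        # The cite element names the source only when the paragraph
        # holds exactly one cite element and exactly one q element.
        paired = len(para["cite"]) == 1 and len(para["q"]) == 1
        source = para["cite"][0]["text"] if paired else None
        for q in para["q"]:
            uri = q["cite_attr"]
            records.append({
                "quote": q["text"],
                "uri": uri,
                "uri_ok": uri is None or has_uri_scheme(uri),
                "source": source,
            })
    return records
```

Fed the Verlaine paragraph above, extract_quotations returns a single record holding the quoted text, the urn:isbn value and the text of the cite element. Whether the three actually belong together is the part that still needs an outside service or a human.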

>>     Conformance checkers must check that the input
>>     document conforms when scripting is disabled, and
>>     should also check that the input document conforms
>>     when scripting is enabled. (This is only a "SHOULD"
>>     and not a "MUST" requirement because it has been
>>     proven to be impossible. [HALTINGPROBLEM])
>>
>> Is the intended purpose of this to define two levels of
>> conformance?
>
> What would the other level of conformance be? If it involves
> executing scripts, would it be OK for conformance to the other
> level to be undecidable by machine in a general case?

# must and should
* Conformance checkers must check all the must and should.
* Conformance checkers must check all the must only.

If the script is not executed, and because it is only a should, does the conformance checker silently ignore it and say "conforms"? Then, if *another* (more capable) conformance checker succeeds in running the scripts and sees that the document does not conform, what does it say? The same document is conformant one time and not conformant the next.

> A snapshot of the DOM in a browser at a user-chosen point in time
> could be checked for conformance, though. This would, again, not
> involve executing scripts during the conformance checking process.

A user-chosen point in time? No human interaction, we said above.

>>     The term "HTML5 validator" can be used to refer to a
>>     conformance checker that itself conforms to the
>>     applicable requirements of this specification.
>>
>> The way it is written here would mean that the piece of software
>> has to be written in HTML 5, which doesn't make sense in many cases.
>
> No, the *applicable* conformance criteria for whether a conformance
> checker itself is a conforming conformance checker aren't the
> conformance criteria for documents.
>
>> Suggestion: "The term HTML5 validator can be used to refer to
>> software that meets the Conformance Checker requirements of this
>> specification."
>
> Yeah, it is better to say what "applicable" means.

Indeed.

--
Karl Dubost - http://www.w3.org/People/karl/
W3C Conformance Manager, QA Activity Lead
QA Weblog - http://www.w3.org/QA/
*** Be Strict To Be Cool ***

Received on Thursday, 14 June 2007 06:22:52 UTC