- From: Jason White <jasonw@ariel.ucs.unimelb.edu.au>
- Date: Fri, 15 Nov 2002 13:20:03 +1100
- To: Web Content Guidelines <w3c-wai-gl@w3.org>
To resolve various outstanding issues surrounding checkpoints 1.1, 1.3, 1.6, 4.2 and 4.3, I propose to add a new technical term to the document and to use it consistently throughout. I am open to suggestions as to what the term should be; my initial proposal is that we use the phrase "explicit encoding", which should be carefully defined in the guidelines document and in the glossary.

This move is best justified by describing the problem it is intended to resolve. Early in the development of WCAG 2.0, it was realized that the terms "markup" and "markup language" are too restrictive to characterize all of the various mechanisms by which logical structure and other information can be represented in a well-defined form that is amenable to automated processing. The main limitation of these terms is that they can be read as designating certain kinds of syntactical constructs (e.g., XML or SGML syntax) rather than the more abstract concept of a well-defined data structure that permits retrieval of the required information.

The W3C has recognized this, hence the development of the XML Information Set, which separates the tree structure of an XML document (and the various information items it contains) from the standard syntax defined in the XML 1.0 specification. Other relevant examples include metadata, which, although often expressed in markup languages, are not part of the XML or (X)HTML document with which they are associated, and, in non-W3C technologies, tagged PDF, which is analogous to XML in many respects (it comprises a tree of elements with associated attributes) but is represented in an entirely different syntax. Of course, if content is stored in a database and retrieved from there by the user agent, then there may be no markup language at all, merely, perhaps, an XML information set or equivalent.

Thus I think we need a technical term that expresses what is conceptually common to all of the foregoing examples.
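To make the distinction between syntax and data structure concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the element names ("doc", "heading", "para"), the JSON layout, and the helper functions are all hypothetical and are not drawn from WCAG or from any W3C specification. The same small document is written in two different syntaxes, both are reduced to one abstract tree, and a deterministic function then retrieves the headings from that tree without reference to the original syntax.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical content: the same small document in two different syntaxes.
XML_SOURCE = '<doc><heading level="1">Intro</heading><para>Hello</para></doc>'
JSON_SOURCE = json.dumps({
    "name": "doc", "attrs": {}, "text": "",
    "children": [
        {"name": "heading", "attrs": {"level": "1"}, "text": "Intro", "children": []},
        {"name": "para", "attrs": {}, "text": "Hello", "children": []},
    ],
})


def from_xml(element):
    """Reduce a parsed XML element to a syntax-neutral tree of dicts."""
    return {
        "name": element.tag,
        "attrs": dict(element.attrib),
        "text": (element.text or "").strip(),
        "children": [from_xml(child) for child in element],
    }


def headings(node):
    """Deterministically collect (level, text) for every heading in the tree."""
    found = []
    if node["name"] == "heading":
        found.append((node["attrs"].get("level"), node["text"]))
    for child in node["children"]:
        found.extend(headings(child))
    return found


# Two syntaxes, one abstract structure.
xml_tree = from_xml(ET.fromstring(XML_SOURCE))
json_tree = json.loads(JSON_SOURCE)

print(xml_tree == json_tree)
print(headings(xml_tree))
print(headings(json_tree))
```

The point of the sketch is that `headings` never inspects angle brackets or braces; it works on the tree alone, which is the sense in which the information is independent of any particular markup syntax.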
"Explicit encoding" appears to be reasonably suggestive of what is meant, though of course its precise signification would have to be specified in a definition. The checkpoint 1.3 success criteria would then be rewritten to take this new term into account, e.g., at level 1 criterion 2: Each of the following is provided in an explicit encoding Likewise for checkpoint 1.1 ("Non-text content that can be expressed in words has a text equivalent associated with it in an explicit encoding" - or we could separate this into two criteria: first that there is a text equivalent, and secondly that it is associated with the non-text content via an explicit encoding). This terminology would also be employed with respect to acronyms and abbreviations, concept codes (under checkpoint 4.2, as Lisa has proposed) and in other suitable contexts. If necessary, instead of writing simply "explicit encoding" (under checkpoint 1.3 for example) one could write "explicit encoding, for example a markup language) to clarify the point for the casual or inattentive reader, while conformance would still be judged by reference to the technical term and its corresponding definition. On the subject of the definition itself, further work is required. Here is a preliminary attempt, however. An explicit encoding includes, but is not limited to 1. a markup language 2. metadata 3. an information set. Information is said to be encoded explicitly if it can be programmatically derived (by a deterministic algorithm) from the format or data structure in which it is represented. Note that the reference to programmatic derivation comes from the existing wording of checkpoint 1.3. Obviously the definition can, and ought to be, tightened, but I think the general idea is clear. Concept codes, on this view, are merely an extension of the idea of an explicit encoding. Comments? Suggestions? Counter-proposals?
Received on Thursday, 14 November 2002 21:20:10 UTC