Re: Checkpoint 2.2: Proposal for revised provision regarding XML/SGML content from Jon Gunderson on 2002-09-30 (w3c-wai-ua@w3.org from July to September 2002)

From: Jon Gunderson <jongund@uiuc.edu>
Date: Mon, 30 Sep 2002 13:30:31 -0500
To: Al Gilman <asgilman@iamdigex.net>, w3c-wai-ua@w3.org
Message-Id: <5.1.0.14.2.20020930132139.020dcab0@staff.uiuc.edu>
Hi Al,

Thanks for your comments on the requirements of Checkpoint 2.2.  At this 
point I would like to only require the use of header information (media 
type) in determining if the format is composed of text.  I suggest we add a 
informative note saying that when header (media type) information is not 
available the user agent can do other things to determine if a format is 
composed of text and should provide a source view.  But I do not want to 
make the no media type case a requirement, since it would be expanding the 
scope of of the checkpoint at this point in the process.

If you feel strongly we can add it the issues list for the next version of 
the document.

Jon


At 08:44 PM 9/28/2002 -0400, Al Gilman wrote:

>At 01:23 PM 2002-09-28, Ian B. Jacobs wrote:
>>Dear UAWG,
>>
>>This is a proposal to fix a provision of checkpoint 2.2
>>(text view) in UAAG 1.0. Al, I'm addressing you explicitly
>>to ask whether the proposal is properly worded.
>
>Does this situation perhaps fit your concept of "normative inclusions"?
>
>In other words, the proposal quoted below reads as though it defines what
>is a text format for the purposes of this checkpoint, although it is probably
>safer to say that these two conditions define normative inclusions.  Any
>media object satisfying either of these conditions MUST be treated as a text
>object for the purposes of this checkpoint.  But it is not true that all
>media objects the User Agent SHOULD treat as a text media object meet these
>criteria.  For example, a media object of type message/rfc-822 should be
>viewable as text.  I don't think we need to put that in as a normative
>inclusion, but I think it better to state the two conditions as given as
>normative inclusions (clarifying, but not exhausting, the cases to be
>included).
>
>I would also add
>
>   c) [In the absence of authoritative MIME-type information to the contrary,]
>any media object identified as XML or SGML by a DOCTYPE indication conforming
>to the rules of those formats.
>
>It is overly onerous to check a media object for text-ness in general.  But
>it is not overly onerous to check the head of a file for a legal DOCTYPE
>indication.
>
>Note that User Agents receive media objects in many ways.  The HTTP protocol
>is somewhat reliable as regards providing MIME-type information for the data
>so delivered.  The file: and ftp: URI schemes, on the other hand, do not
>provide this kind of information.  In cases like these, using the built-in
>markings of XML and SGML objects is appropriate and should be expected.
>
>Al
>
>PS:
>
>Yes, I think that the group over-reacted before and that this approach
>should remove the developer concern while better preserving the substance of
>the provision.
>
>PPS: The bit about 'in the absence of authoritative information to the
>contrary' could be omitted above.  It's pretty picky.  But User Agents that
>receive data identified in the HTTP headers as of type
>application/octet-stream, for example, should *not* be *required* to check
>the contents at all, even if there is a DOCTYPE inside the file and the
>whole thing is validatable XML or SGML.  There is a principle related to
>this in the working papers [findings] of the TAG.  I still think that User
>Agents should not be discouraged from offering a text view of anything that
>may make some sense in that view.  On the other hand, they should not be
>required to do so when there is strong evidence against this expectation.
>
>>In the Candidate Recommendation, the checkpoint included:
>>
>>   "all SGML and XML applications, regardless of
>>    Internet media type (e.g., HTML 4.01, XHTML 1.1, SMIL, SVG,
>>    etc.)."
>>
>>Some developers during CR indicated that without Internet media type 
>>information, one cannot reliably sniff content to determine that a piece 
>>of content is an instance of an XML application. So in the last call 
>>draft, we deleted this provision.
>>
>>Steven Pemberton's last call comments [1] include:
>>
>>   "I would beef up this definition to at least include XML."
>>
>>We resolved for issue 548 [3] to include the media type
>>application/xhtml+xml in this checkpoint. But upon reflection,
>>I think that's the wrong solution.
>>
>>I think that my proposal to delete the XML provision was the wrong 
>>solution to the problem the developers raised in CR.
>>
>>I now understand the ambiguity in the phrase "any XML application, 
>>regardless of Internet media type". One could interpret this
>>to mean "the user agent has to figure out whether this is XML content, 
>>whatever the media type." I think this ambiguity led
>>to the comments in CR; I have not confirmed with the reviewers.
>>
>>I believe the intention of the second provision was to
>>provide a text view for content identified by media type as
>>XML and SGML content, even if UAAG 1.0 did not identify
>>all such media types (hence "regardless of media type").
>>
>>I would like to reinstate a revised provision as a response
>>to Steven Pemberton's comments.
>>
>><PROPOSAL>
>>For the purposes of this checkpoint, a text format is either:
>>
>>    1) any media object given an Internet media type of
>>      "text" (e.g., "text/plain", "text/html", or "text/*"),
>>       as defined in RFC 2046 [RFC2046], section 4.1., or
>>
>>    2) any media object identified by Internet media type to
>>       be an instance of an XML or SGML application.
>>
>>   Note: As an example, RFC 3023 [2] defines some Internet media
>>   types for XML content.
>></PROPOSAL>
>>
>>I think this proposal:
>>
>>  a) Would satisfy Steven Pemberton's suggestion to "beef up" the
>>     checkpoint.
>>
>>  b) Is in line with our original intent (pre-last call) to provide
>>     a text view of XML content. We deleted that requirement thinking
>>     it was technically unsound, but that discussion was based on
>>     misunderstanding.
>>
>>Thank you,
>>
>>  - Ian
>>
>>[1] http://lists.w3.org/Archives/Public/w3c-wai-ua/2002JulSep/0149
>>[2] http://www.ietf.org/rfc/rfc3023.txt
>>[3] http://www.w3.org/WAI/UA/issues/issues-linear-lc4.html#548
>>
>>--
>>Ian Jacobs (ij@w3.org)   http://www.w3.org/People/Jacobs
>>Tel:                     +1 718 260-9447

Jon Gunderson, Ph.D., ATP
Coordinator of Assistive Communication and Information Technology
Division of Rehabilitation - Education Services
MC-574
College of Applied Life Studies
University of Illinois at Urbana/Champaign
1207 S. Oak Street, Champaign, IL  61820

Voice: (217) 244-5870
Fax: (217) 333-0248

E-mail: jongund@uiuc.edu

WWW: http://www.staff.uiuc.edu/~jongund
WWW: http://www.w3.org/wai/ua
Received on Monday, 30 September 2002 14:24:47 UTC