Re: ACTION-308 (part 2) Updates to 'The Self-Describing Web'

Hi Noah,

On Jan 7, 2010, at 5:21 PM, noah_mendelsohn@us.ibm.com wrote:

> John Kemp wrote:
> 
>> On Jan 6, 2010, at 4:52 PM, noah_mendelsohn@us.ibm.com wrote:
>> 
>> [...]
>> 
>>> Furthermore, the draft text really doesn't explain how allowance for 
>>> sniffing would change the rest of the SDW story.
>> 
>> And that was deliberate. I am not "allowing sniffing" so much 
>> as saying, "if you are going to sniff then do it this way". I 
>> didn't intend to change the meaning of the SDW story at all, or
>> its relationship to the use of authoritative metadata.
> 
> I know, but that's what I'm unhappy about.  I think that once we even 
> bring up the possibility, we should explain the implications.

Fair enough.

> 
> 
> Your draft text is:
> 
> <original>
> As noted above, and for other reasons (such as content aggregation), it 
> may not be possible for a browser to reliably determine, via inspection of 
> a Content-Type HTTP header or other external metadata alone, the intended 
> interpretation of Web content. In such cases, a browser may inspect the 
> content directly (commonly known as "sniffing"). The consequences of such 
> an action are described in [AuthoritativeMetadata]. In particular, 
> sniffing Web content should only be done using an accepted and secure 
> algorithm, such as [BarthSniff].
> </original>
> 
> I would probably be happier with something close to:
> 
> <proposed>
> For the Web to have the desirable properties described in this finding, 
> it's essential that content be served with a media-type that correctly 

I'd prefer 'accurately' to 'correctly' in the above.

> labels its content, and likewise it's essential that user agents such as 
> browsers interpret the received data per the specifications for that 
> media-type.
> Unfortunately, there are many servers on the Web that are not properly 
> configured, and which serve incorrect Content-types.

Isn't it the case that some servers accept content labelled incorrectly (or not at all) by another author (or server) and then simply serve it with the Content-type supplied by the original author (or possibly attempt to "sniff" the content themselves)? Should they then be configured to always send an empty Content-type as a more honest admission that they don't know whether the Content-type associated with the content by some other party is accurate?
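
To make the question concrete, here is a minimal sketch of the policy being asked about. It is an illustration only, not a description of how any real server behaves: the function name `response_headers` and the `verified` flag are hypothetical, and "empty Content-type" is read here as omitting the header altogether, which is an assumption.

```python
from typing import Dict, Optional


def response_headers(claimed_type: Optional[str], verified: bool) -> Dict[str, str]:
    """Build headers for content that was originally labelled by a third party.

    Hypothetical policy: the server only asserts a Content-Type it can
    vouch for. `verified` stands in for whatever check the server trusts.
    """
    headers: Dict[str, str] = {}
    if claimed_type and verified:
        # The server is willing to be accountable for this label.
        headers["Content-Type"] = claimed_type
    # Otherwise no Content-Type is sent, rather than repeating an
    # unverified claim made by another author or server.
    return headers
```

Under this reading, a server that cannot check an uploader's label would call `response_headers("text/html", verified=False)` and send no Content-Type at all.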

>  In particular,
> content intended to be interpreted as text/html, image/jpeg or other 
> common types is sometimes served as text/plain.
> Such incorrect labeling of content is contrary to Web architecture, and it 
> undermines many of the valuable Web characteristics described by this 
> finding.
> 
> Nonetheless, in part because such mislabeled content is common, certain 
> browsers and other user agents have been coded to guess or "sniff" the 
> intended content type, particularly for responses that are explicitly 
> typed as text/plain.  Such sniffing breaks the chain of accountability 
> described in this finding, making it more difficult for a user to hold the 
> publisher responsible for a document's contents.
> 
> Other negative consequences of sniffing are described in 
> [AuthoritativeMetadata].  For example, "sniffing" can also expose the user 
> agent to security vulnerabilities;  these can to some degree be minimized 
> by using more secure algorithms, such as the ones described in 
> [BarthSniff].
> </proposed>

I'm happy making these changes modulo my comments above.
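
As a rough illustration of the behaviour the proposed text describes (guessing a type only for responses explicitly typed as text/plain), a toy sketch follows. It is deliberately naive and is NOT the [BarthSniff] algorithm; the function name `sniff_if_text_plain` and the two signatures it checks are assumptions chosen for the example.

```python
def sniff_if_text_plain(declared_type: str, body: bytes) -> str:
    """Return the media type a sniffing user agent might act on.

    Toy example: only responses declared as text/plain are second-guessed,
    and only two signatures (JPEG magic bytes, an HTML prologue) are checked.
    """
    # Drop parameters such as "; charset=utf-8" and normalise case.
    declared = declared_type.split(";")[0].strip().lower()
    if declared != "text/plain":
        # Any other declared type is treated as authoritative.
        return declared
    if body.startswith(b"\xff\xd8\xff"):
        # JPEG files begin with these three bytes.
        return "image/jpeg"
    head = body[:512].lstrip().lower()
    if head.startswith(b"<!doctype html") or head.startswith(b"<html"):
        return "text/html"
    return "text/plain"
```

Note that the second and third branches are exactly where the publisher's stated intent is overridden: a document served as text/plain that merely resembles HTML would be treated as text/html, which is the accountability problem raised in this thread.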

> 
> This might actually go in as a new, short Chapter 7 in SDW, I think.  That 
> would bump the conclusions section to become #8.

That sounds fine to me.

Regards,

- johnk

> 
> 
>>> After all, we give 
>>> examples in which providers of data are held legally accountable for 
>>> having published certain content, precisely because the chain of normative 
>>> specifications makes clear their correct interpretation.  In a world where 
>>> people start to "sniff", am I accountable for the (mis) interpretation of 
>>> something served as text/plain that just happens to resemble some other 
>>> media type?  The whole point of SDW is to tell stories like that.
>>> 
>>> So, I agree with Larry that we should steer clear of elevating sniffing to 
>>> being even a good practice at the architecture level (it's not a 
>>> "principle" in the sense of AWWW principles in any case); even if we do 
>>> want to acknowledge the widespread use of sniffing in practice in a 
>>> revised SDW, I think it behooves us to carefully explain how the core 
>>> stories about accountability and lack of ambiguity are affected.
>> 
>> I agree that it would be good to explain the ambiguity 
>> introduced by sniffing.
> 
> See above for a rough proposal
> 
>>> I think 
>>> we have two choices:  1) leave SDW alone -- it tells a quite coherent 
>>> story at the architecture level, and we can view instances of sniffing as 
>>> deviations from the architecture
>>> or 2) do a very careful job of explaining 
>>> just what does and doesn't change in the SDW story given that sniffing 
>>> happens.
>> 
>> I have roughly attempted your choice 1) with the understanding 
>> that this was the will of the group. As you note though, we 
>> could do a much more careful job of explaining what changes 
>> given that sniffing happens.
>> 
>> Regards,
>> 
>> - johnk
> --------------------------------------
> Noah Mendelsohn 
> IBM Corporation
> One Rogers Street
> Cambridge, MA 02142
> 1-617-693-4036
> --------------------------------------
> 
> John Kemp <john@jkemp.net>
> Sent by: www-tag-request@w3.org
> 01/07/2010 10:15 AM
> 
>        To:     noah_mendelsohn@us.ibm.com
>        cc:     Larry Masinter <masinter@adobe.com>, "www-tag@w3.org WG" 
> <www-tag@w3.org>
>        Subject:        Re: ACTION-308 (part 2) Updates to 'The 
> Self-Describing Web'
> 
> 
> 
> On Jan 6, 2010, at 4:52 PM, noah_mendelsohn@us.ibm.com wrote:
> 
> [...]
> 
>> Furthermore, the draft text really doesn't explain how allowance for 
>> sniffing would change the rest of the SDW story.
> 
> And that was deliberate. I am not "allowing sniffing" so much as saying, 
> "if you are going to sniff then do it this way". I didn't intend to change 
> the meaning of the SDW story at all, or its relationship to the use of 
> authoritative metadata.
> 
>> After all, we give 
>> examples in which providers of data are held legally accountable for 
>> having published certain content, precisely because the chain of normative 
>> specifications makes clear their correct interpretation.  In a world where 
>> people start to "sniff", am I accountable for the (mis) interpretation of 
>> something served as text/plain that just happens to resemble some other 
>> media type?  The whole point of SDW is to tell stories like that.
>> 
>> So, I agree with Larry that we should steer clear of elevating sniffing to 
>> being even a good practice at the architecture level (it's not a 
>> "principle" in the sense of AWWW principles in any case);  even if we do 
>> want to acknowledge the widespread use of sniffing in practice in a 
>> revised SDW, I think it behooves us to carefully explain how the core 
>> stories about accountability and lack of ambiguity are affected.
> 
> I agree that it would be good to explain the ambiguity introduced by 
> sniffing.
> 
>> I think 
>> we have two choices:  1) leave SDW alone -- it tells a quite coherent 
>> story at the architecture level, and we can view instances of sniffing as 
>> deviations from the architecture
>> or 2) do a very careful job of explaining 
>> just what does and doesn't change in the SDW story given that sniffing 
>> happens.
> 
> I have roughly attempted your choice 1) with the understanding that this 
> was the will of the group. As you note though, we could do a much more 
> careful job of explaining what changes given that sniffing happens.
> 
> Regards,
> 
> - johnk
> 
>> 
>> Noah
>> 
>> --------------------------------------
>> Noah Mendelsohn 
>> IBM Corporation
>> One Rogers Street
>> Cambridge, MA 02142
>> 1-617-693-4036
>> --------------------------------------

Received on Friday, 8 January 2010 13:20:22 UTC