Re: Review of latest RDFa Core 1.1

My implementation looks at the file extension if no media type is available. I am not entirely sure about these things, but most media type registrations include suggested file extensions. Since we are talking about a very small number of media types that we have to recognize (strictly speaking, we have to recognize HTML5 and XHTML, and that is it; if we want to be nice, we would also recognize SVG and, if Toby really writes the note, Atom), this seems to be quite enough in practice. I do not do sniffing on the DOCTYPE.
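
To make the fallback order concrete, here is a rough Python sketch. It is only an illustration, not the code of any actual processor: the function name, the extension table, and the decision to treat an unsupported media type as application/xml without consulting the extension are all assumptions made for the example.

```python
import os

# Hypothetical table mapping lower-cased file extensions to the media
# types a processor might recognize. The set of entries is an assumption.
EXTENSION_TO_MEDIA_TYPE = {
    ".html": "text/html",
    ".htm": "text/html",
    ".xhtml": "application/xhtml+xml",
    ".svg": "image/svg+xml",
    ".atom": "application/atom+xml",
}

KNOWN_MEDIA_TYPES = set(EXTENSION_TO_MEDIA_TYPE.values())

# Used when the Host Language is unknown or unsupported.
FALLBACK_MEDIA_TYPE = "application/xml"


def guess_host_media_type(content_type=None, path=None):
    """Pick the media type used to select a Host Language.

    content_type: value of a Content-Type header, if one was reported.
    path: local file name, used only when no media type is available.
    """
    if content_type:
        # Drop parameters such as "; charset=utf-8".
        media_type = content_type.split(";", 1)[0].strip().lower()
        if media_type in KNOWN_MEDIA_TYPES:
            return media_type
        # A media type was reported but is not supported.
        return FALLBACK_MEDIA_TYPE
    if path:
        # No media type available: fall back to the file extension.
        extension = os.path.splitext(path)[1].lower()
        return EXTENSION_TO_MEDIA_TYPE.get(extension, FALLBACK_MEDIA_TYPE)
    return FALLBACK_MEDIA_TYPE
```

For example, guess_host_media_type(None, "test.xhtml") yields application/xhtml+xml, while a file with an unrecognized extension is treated as application/xml. The sketch deliberately does no DOCTYPE or root element sniffing.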

Ivan


On Mar 6, 2011, at 03:18 , Shane McCarron wrote:

> Gregg,
> 
> Thanks for your comments.  My replies are inline:
> 
> On 3/5/2011 3:05 PM, Gregg Kellogg wrote:
>> I've been doing my own review and updating my processor at the same time. Here are some of my notes:
>> 
>> Notes on review of 3/1 Editors Draft of RDFa Core 1.1:
>> http://www.w3.org/2010/02/rdfa/sources/rdfa-core/Overview.html
>> 
>> Section 4.1 RDFa Processor Conformance
>> 
>> Here it states that the processor *must* examine the media type and process the document as application/xml if it is unable to determine the media type. However, there is no description of how the processor should make this determination. Presumably, for a document retrieved over HTTP, the media type can be determined by examining the Content-Type header. However, what about documents not retrieved via a medium which reports Content-Type? For example, can a file extension (e.g., .html, .xhtml, .svg) be used to determine the media type of a file on a local file system? I don't see that IANA defines associated file extensions for media types.
>> 
>> Is it acceptable to examine the root element name to determine the document type?
>> 
>> This section requires some clarification.
>> 
> 
> The working group debated this, and decided that it was impossible to specify anything beyond media type.  I personally agree that the DOCTYPE is a fine way to announce the type of a document (hence the name).  But this sort of 'sniffing' seemed beyond the will of the working group. For similar reasons, defining specific suffixes is something that is certainly not going to happen.
> 
> Personally, I would be open to changing the text to read like the following:
> A conforming RDFa Processor must examine the media type of a document it is processing to determine the document's Host Language. If the media type is unavailable, a conforming RDFa Processor MAY look at the document's DOCTYPE to determine if its Formal Public Identifier matches that of a known Host Language.  If the RDFa Processor is unable to determine the document's Host Language, or does not support the Host Language, the RDFa Processor must process the document as if it were media type application/xml. See XML+RDFa Document Conformance.
> However, note that the text above PRECLUDES the HTML5-style doctype with no FPI.  There might be a way to encompass that in this clause... but I personally feel that the absence of an FPI really can't be taken as announcement that a document is written in a specific Host Language.
> 


----
Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
PGP Key: http://www.ivan-herman.net/pgpkey.html
FOAF: http://www.ivan-herman.net/foaf.rdf

Received on Monday, 7 March 2011 09:15:38 UTC