- From: Shane McCarron <shane@aptest.com>
- Date: Sun, 24 May 2009 12:48:51 -0500
- To: Philip Taylor <pjt47@cam.ac.uk>
- CC: Julian Reschke <julian.reschke@gmx.de>, Sam Ruby <rubys@intertwingly.net>, RDFa Community <public-rdfa@w3.org>, "public-rdf-in-xhtml-tf.w3.org" <public-rdf-in-xhtml-tf@w3.org>, HTML WG <public-html@w3.org>
Philip Taylor wrote:
>
> Hmm, I think I'm not clear on the context of your statements, so I may
> be misunderstanding... As I see things now:
>
> In the context of RDFa-in-XHTML, any XML parser will preserve the case
> of attributes, and as far as I'm aware (though I haven't tested it
> extensively) all current RDFa-in-XHTML implementations do
> case-sensitive comparisons of prefixes, and the spec requires that, so
> it's all self-consistent and fine.

I don't think I can speak for all current implementations, but yes - that's how it is supposed to work.

>
> In the context of RDFa-in-text/html, all current implementations treat
> attribute names as lowercase and then do case-sensitive prefix
> comparisons. So e.g. <div xmlns:vCard="..." property="vCard:..."> will
> fail to extract any triples (because the only defined prefix is
> "vcard", not "vCard").

Again, I can't speak for all current implementations. However, certainly *MY* implementation is broken in this way.

>
> The lowercasing of attribute names is not an issue restricted to
> legacy UAs - it's a part of the way HTML works, and (very likely) the
> way HTML will always work, and any current or future RDFa-in-text/html
> processor that uses an HTML parser will work this way.

Well... it's not the way HTML works. HTML just says they are case-insensitive. Or rather, the SGML declaration for the HTML 4 DTD says this. In the HTML DOM, my reading is that element and attribute names are returned in uppercase [1]. So no, I don't think they are supposed to be lowercased. I think they are supposed to be uppercased, and I think that some implementations do it wrong.
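To make the text/html failure Philip describes concrete, here is a minimal sketch in Python. It is not taken from any of the implementations named in this thread. It relies on the fact that the standard library's html.parser lowercases attribute names while leaving attribute values untouched. The namespace URI http://example.org/vcard# is a placeholder (the original example elides it), and PrefixCollector and resolve are hypothetical names used only for illustration.

```python
from html.parser import HTMLParser


class PrefixCollector(HTMLParser):
    """Collects xmlns:* prefix declarations and @property values."""

    def __init__(self):
        super().__init__()
        self.prefixes = {}      # prefix -> namespace URI, as seen by the parser
        self.properties = []    # raw @property values (CURIEs)

    def handle_starttag(self, tag, attrs):
        # html.parser hands us attribute names already lowercased;
        # attribute values keep their original case.
        for name, value in attrs:
            if name.startswith("xmlns:"):
                self.prefixes[name[len("xmlns:"):]] = value
            elif name == "property":
                self.properties.append(value)


def resolve(curie, prefixes, case_sensitive=True):
    """Expand prefix:reference to a URI, or return None if the prefix is unknown."""
    prefix, _, reference = curie.partition(":")
    if not case_sensitive:
        # Hypothetical case-insensitive profile: fold both sides before comparing.
        prefix = prefix.lower()
        prefixes = {k.lower(): v for k, v in prefixes.items()}
    base = prefixes.get(prefix)
    return None if base is None else base + reference


collector = PrefixCollector()
# Placeholder namespace URI; the markup shape follows the example quoted above.
collector.feed(
    '<div xmlns:vCard="http://example.org/vcard#" property="vCard:fn">Shane</div>'
)

print(collector.prefixes)    # {'vcard': 'http://example.org/vcard#'}
print(collector.properties)  # ['vCard:fn']
print(resolve("vCard:fn", collector.prefixes))                        # None
print(resolve("vCard:fn", collector.prefixes, case_sensitive=False))  # http://example.org/vcard#fn
```

The declared prefix reaches the processor as "vcard" while the CURIE still says "vCard", so a case-sensitive lookup finds nothing and no triple would be produced. Folding case on both sides is one way to make the lookup succeed; whether a profile should do that is a separate question.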

>
> In particular, I tested
> http://philip.html5.org/demos/rdfa/case-sensitivity-nonwf.html with
> recent versions of http://www.w3.org/2006/07/SWD/RDFa/impl/js/ and
> rdfQuery in Firefox 3.0 (which die with exceptions but otherwise do
> things in lowercase); and pyRdfa, and Swignition, and
> http://developer.search.yahoo.com/help/objectfinder?url=..., and all
> appear to work in the same way. (Are there any others that support
> text/html input that I'm missing?)

SPREAD does, but it is not super easy to find or use. Try
http://htmlwg.mn.aptest.com/rdfa/extract_rdfa.pl?format=n3&type=html&uri=

>
> Given that all these implementations work the same, and it would be
> very difficult to change them to preserve attribute name case (because
> they could no longer use a standard text/html parser), it seems to me
> that the specification must specify this behaviour, so that all the
> RDFa-in-text/html processors can extract the same triples from the
> same documents and so that they can all conform to the spec.
>
> Am I missing something here?

No, I don't think you are missing anything. After looking at this a bit, I think it might make sense to indicate that in RDFa in HTML, prefix names are case-insensitive. My implementation currently does not work this way, but it would be a relatively easy change.

Obviously this is something we would need consensus on in the community. As Julian rightly points out in another mail, it is possible that someone writing a tag-soup based parser would not have this problem and would be opposed to "dumbing down" the HTML profile to accommodate the HTML DOM. But I am inclined to agree that you (and others) are correct - that HTML element and attribute names are inherently case-insensitive, and so a profile for HTML needs to take this into account.

[1] http://www.w3.org/TR/DOM-Level-2-HTML/html.html#ID-5353782642

--
Shane P. McCarron    Phone: +1 763 786-8160 x120
Managing Director    Fax: +1 763 786-8180
ApTest Minnesota     Inet: shane@aptest.com
Received on Sunday, 24 May 2009 17:58:57 UTC