- From: Niklas Lindström <lindstream@gmail.com>
- Date: Fri, 15 Apr 2011 11:51:14 +0200
- To: Mark Birbeck <mark.birbeck@webbackplane.com>
- Cc: public-rdfa-wg <public-rdfa-wg@w3.org>
Hi Mark!

That the interpretation of the lexical representation within an RDFa
attribute depends on context -- i.e. the base URI and the prefixes in
scope -- is perfectly fine by me. But mixing CURIEs and URIs in one
value space, so that a prefix declaration irrevocably overrides what
could very well be intended as the scheme of a URI, is quite
problematic.

Consider this example, with the premise that the value of @about is
provided by someone (e.g. the news group owner) who *does not* control
the template behind the RDFa markup:

    <html prefix="news: http://example.org/def/news# rdfs: http://www.w3.org/2000/01/rdf-schema#">
      <body rel="news:hasGroup">
        <div about="news://news.server.example/example.group.this/" property="rdfs:label">The Example News Group</div>

Because "news" is declared as a prefix, the @about value is read as a
CURIE and expands to
<http://example.org/def/news#//news.server.example/example.group.this/>
instead of the intended news: URI.

This can be fixed, while retaining the feature of not always having to
use SafeCURIEs, by not allowing @about and @resource to contain unsafe
CURIEs, i.e. CURIEs containing characters that make them confusable
with absolute URIs.

I therefore suggest that the use of SafeCURIEorCURIEorURI is changed to
RestrictedCURIEOrSafeCURIEorURI, where RestrictedCURIE means QName, or
"isegment-nz-nc", or "reference ::= ipath-absolute / ipath-noscheme /
ipath-empty" (which Nathan suggested).

Best regards,
Niklas


2011/4/13 Mark Birbeck <mark.birbeck@webbackplane.com>:
> Hi Niklas,
>
> Everything you say is true. :)
>
> However, the big change in the working group's thinking came when we
> decided that it was impossible to guarantee correct interpretation of
> strings of text based solely on their format, and so instead we should
> rely on the strings' contexts.
>
> By using context to aid in the interpretation of a string we get a lot
> more flexibility, and we can unambiguously work out what things like
> this mean:
>
>   foaf:Agent
>
> Without context it *looks* like all of the following:
>
>   * a string of text with no particular meaning;
>   * a QName;
>   * a CURIE;
>   * a relative URI using the 'foaf' scheme.
>
> However, we decided in the working group that if no prefix mapping for
> 'foaf' was defined in the context for this string, then the string was
> *by definition* not a CURIE.
>
> Whether it therefore becomes a string of text or a URI is a separate
> processing step, and nothing to do with CURIE processing, but by
> taking the approach we did in the CURIE processing layer we at least
> made it possible for 'foaf:Agent' to be interpreted as a URI.
>
> The converse also holds; if a mapping for 'foaf' is defined, then the
> string above is *by definition* a CURIE. Now whether some host
> language decides to interpret the string as a CURIE above a URI is up
> to that host language, but RDFa does so.
>
> Personally I was very pleased when we took the step to take context
> into account when interpreting strings. Until that point we were
> trying to achieve the impossible -- imagining that a string on its own
> could tell you everything about what it was. Now it's very easy to
> interpret both of these strings correctly:
>
>   foaf:Agent
>   http://www.w3.org/
>
> simply by using the context.
>
> Best regards,
>
> Mark
>
>
> 2011/4/11 Niklas Lindström <lindstream@gmail.com>:
>> Hi all!
>>
>> Is it correct that the RDFa WG is currently recommending letting
>> CURIEs share the same value space as regular URIs, and so that any
>> prefix defined with the same value as a scheme, like "http", "https",
>> "news" etc. will change the URI for any absolute URI using those
>> schemes?
>>
>> I remember worrying about this last year, but I haven't followed the
>> decision process in detail since then. It just worries me that letting
>> these things collide will blow up for anyone who happens to use at
>> least "http" or "https" as prefixes (perhaps rendering prefixes using
>> a tool, or getting them from a profile out of their control).
>> Or perhaps worse, people believing it safe to use anything but
>> "http(s)" as prefixes, which will work until something other than
>> those two comes along in the next 10 years or so. It might happen; and
>> if it does, it may quite probably be beyond the control of RDFa specs
>> and tools.
>>
>> (An example: some vocabulary "Wide Exceptional Graphs" becomes
>> popular, using "wxg" as a prefix. Then Google comes along with a new
>> wxg scheme ("Web Extended by Google"), and soon lots of resources are
>> linked with that instead of old "http". Or for that matter, that some
>> other scheme [3] becomes popular again for whatever reason.)
>>
>> I vaguely recall the WG saying something about defining "http" as a
>> prefix being bad practise. But this turns up here and there, not least
>> since the HTTP Vocabulary Draft [1] (<http://www.w3.org/2006/http#>)
>> recommends it as a prefix. And I just ran across "http" as a prefix in
>> the Tabulator source as well [2].
>>
>> While I understand that it is confusing to use it as a prefix, I am
>> not convinced that it is safe to combine the CURIE and URI value space
>> like this. At least not without a limit on the CURIEs allowed in the
>> joint CURIEorURI space. For instance, not allowing CURIEs in that
>> space to use anything after the prefix+':' other than say an
>> isegment-nz-nc from RFC 3987, or something to that effect (like a
>> "[A-Za-z0-9_.-]+" regexp).
>>
>> If there was such a restriction on the format of CURIEs that are
>> allowed in the CURIEorURI mix (and anything not matching it would be
>> considered a full URI), I would definitely sleep better. :)
>>
>> Am I missing something crucial, or overly worried about the risk of
>> collisions?
>>
>> Best regards,
>> Niklas
>>
>> [1]: http://www.w3.org/TR/HTTP-in-RDF10/
>> [2]: http://dig.csail.mit.edu/hg/tabulator/file/9a135feff10f/chrome/content/js/rdf/rdflib.js#l5644
>> [3]: http://en.wikipedia.org/wiki/URI_scheme
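
To make the collision in the thread above concrete, here is a rough
Python sketch. It is not taken from any RDFa processor. The function
names are made up for illustration, the restricted pattern is only an
ASCII approximation of "isegment-nz-nc" (essentially the regexp idea
from the earlier message), and returning None for a SafeCURIE with an
undeclared prefix is this sketch's own choice for "ignore the value".

```python
import re

# Prefix mappings taken from the example markup in the thread.
PREFIXES = {
    "news": "http://example.org/def/news#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}

# Assumption: ASCII-only stand-in for "isegment-nz-nc" (RFC 3987).
# It is non-empty and allows neither ':' nor '/'.
RESTRICTED_REFERENCE = re.compile(r"[A-Za-z0-9_.-]+")


def expand(prefix, reference, prefixes):
    """Concatenate the prefix mapping and the reference, CURIE style."""
    return prefixes[prefix] + reference


def resolve_curie_or_uri(value, prefixes):
    """Context-only rule: a declared prefix always makes the value a CURIE."""
    prefix, sep, reference = value.partition(":")
    if sep and prefix in prefixes:
        return expand(prefix, reference, prefixes)
    return value  # treated as a URI; base resolution is out of scope here


def resolve_restricted(value, prefixes):
    """Hypothetical RestrictedCURIE / SafeCURIE / URI resolution."""
    if value.startswith("[") and value.endswith("]"):
        # SafeCURIE: the brackets state the intent, so no restriction.
        prefix, sep, reference = value[1:-1].partition(":")
        if sep and prefix in prefixes:
            return expand(prefix, reference, prefixes)
        return None  # sketch's choice: undeclared prefix, ignore the value
    prefix, sep, reference = value.partition(":")
    if sep and prefix in prefixes and RESTRICTED_REFERENCE.fullmatch(reference):
        return expand(prefix, reference, prefixes)
    return value  # anything else is left alone as a URI


if __name__ == "__main__":
    about = "news://news.server.example/example.group.this/"
    print(resolve_curie_or_uri(about, PREFIXES))  # rewritten by the prefix
    print(resolve_restricted(about, PREFIXES))    # kept as a news: URI
    print(resolve_restricted("rdfs:label", PREFIXES))
```

With the context-only rule the @about value is rewritten into the
http://example.org/def/news# namespace. With the restricted rule it
stays a news: URI, while "rdfs:label" still expands as a CURIE and a
SafeCURIE such as "[news://x]" remains available for anyone who really
means the prefix.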
Received on Friday, 15 April 2011 09:52:03 UTC