- From: Manu Sporny <msporny@digitalbazaar.com>
- Date: Thu, 08 Jul 2010 12:29:46 -0400
- To: RDFa WG <public-rdfa-wg@w3.org>
A discussion about ISSUE-24 that happened after the RDFa WG telecon.

<markbirbeck> I'm trying to understand this last issue; does anyone know where in HTML 5 it says that one cannot rely on the case of the tokens in the @rel attribute?
<markbirbeck> Oscar Wilde on the version attribute: http://www.quotationspage.com/quote/26788.html
<manu> hehe :)
<manu> markbirbeck: In HTML4, rel values were case insensitive.
<manu> HTML5 is supposed to be backwards compatible with HTML4 and thus it is assumed that they're case insensitive in HTML5 as well.
<manu> However, nobody has presented any spec text or done tests to see if this is the case.
<manu> so, this may be a good example of the implementations deviating from the spec
<markbirbeck> Well, to be more precise, it's the *link types* that are case-insensitive.
<manu> yes, exactly right.
<markbirbeck> So there's no problem with allowing someone to write "Next" or "next".
<markbirbeck> But that doesn't mean all @rel values should be case-insensitive.
<manu> so, the question is - should we generate a triple for rel="LiCeNsE" in HTML+RDFa 1.1?
<markbirbeck> We only need to be backwards-compatible on the tokens referred to in HTML 4.01.
<markbirbeck> No. :)
<markbirbeck> But for rel="NeXt"...yes.
<manu> I agree - but if we say No, what's the reasoning?
<manu> ah, sorry
<manu> I keep forgetting that rel="license" isn't in HTML 4.01 as a Link Type
<manu> (I think?)
<markbirbeck> No, it's not.
<manu> So, this is what I was trying to get at on the phone... we make case insensitivity only apply to HTML 4.01 terms.
<manu> s/terms/Link Types/
<manu> but there is a parallel issue - are all @rel/@rev values treated in a case-insensitive manner?
<manu> I think the answer to that is no
<manu> and in fact, I think case is preserved for all rel/rev values.
<manu> but I haven't done the tests to see if that's the case.
<markbirbeck> Sorry...I thought HTML 5 did exact matching, but it doesn't (just re-read it).
<markbirbeck> Any link type that doesn't contain a colon must be compared case-insensitively.
<manu> right, which screws us
<manu> well, kind-of
<markbirbeck> I missed the point as to why that screws us, though...sorry. :(
<manu> it makes it confusing...
<manu> so, if we have <a vocab="http://purl.org/dc/terms/" rel="conformsTo" href="http://example.org/foo">Conforms to Foo</a>
<manu> can we retrieve the value of @rel from an HTML5-built DOM, preserving the case?
<manu> or does the HTML5-built DOM lowercase "conformsTo"?
<manu> because if it forces "conformsTo" to lowercase, we're screwed.
<markbirbeck> I don't see anything that indicates it changes the values in the attributes.
<manu> because the predicate that is generated is "http://purl.org/dc/terms/conformsto" and not "http://purl.org/dc/terms/conformsTo"
<markbirbeck> I'm reading it that authors are free to type what they like.
<manu> right, that's my understanding as well
<manu> but nobody has tested it...
<manu> (this is also confusing because there are multiple things we're talking about)
<markbirbeck> I don't see anything in the pre-processing steps that would even hint at that (just break on space boundaries).
<markbirbeck> And also, the fact that anything with a colon in *is* case-sensitive would imply that they are not going to mung the attribute values.
<manu> yes, but why would you say that if /all/ values are case-sensitive?
<manu> why mention that values with a colon are case sensitive? Does that mean that values without a colon aren't case sensitive?
<markbirbeck> I'm not with you...all values aren't case-sensitive, are they?
<manu> That's what I'm saying - I think that all values /are/ case-sensitive.
<markbirbeck> Yes, that's what I said earlier...the point that I missed when I first read it:
<markbirbeck> "The link types that contain no U+003A COLON characters (:), including all those defined in this specification, are ASCII case-insensitive values, and must be compared as such."
<markbirbeck> "Thus, rel="next" is the same as rel="NEXT"."
<manu> as far as they are /compared/ yes.
<manu> So there are two things: comparison and what's stored in the DOM.
<markbirbeck> Yes, I see that, but I'm not seeing anything that affects the DOM.
<manu> As far as comparison is concerned - rel="next" and rel="NeXt" are the same.
<markbirbeck> (In the spec...)
<markbirbeck> You realise that this only matters to us if we have this default namespace thing?
<markbirbeck> @vocab...
<manu> as far as what's stored in the DOM: Can "next" be stored in the DOM as a value for @rel, can "NeXt" be stored in the DOM as a value for @rel?
<markbirbeck> Well, the thing is that the spec doesn't say which round things are, so I really doubt that they can mung the value.
<manu> yes, exactly
<manu> I don't think that they can munge the value.
<markbirbeck> If they said that everything is equivalent to the upper-case version (for example) then you could at a push convert the value to UC.
<manu> and if they can't munge the value, both HTML5 and XHTML5 are case sensitive.
<manu> which means we're good.
<markbirbeck> But the spec only says that "abc" == "aBc" == "aBC" ....
<markbirbeck> It doesn't say that they are all equivalent to "ABC".
<manu> I don't follow... they are all equivalent to "ABC" aren't they?
<manu> Comparison: "abc" == "aBc" == "aBC" == "ABC"
<markbirbeck> Well, only insofar as they are also all equivalent to "abc", and they are all equivalent to "aBc".
<manu> but in the DOM, they're stored as: "abc", "aBc", "aBC", "ABC"
<markbirbeck> I.e., they are all equivalent to each other, but there is no lingua franca, so to speak.
<manu> So, here's what I think affects RDFa
<manu> well, both affect RDFa, but in different ways.
<markbirbeck> I.e., the spec doesn't say they are all equivalent to 'x', it just says they are all equivalent to each other.
<markbirbeck> There's a world of difference between:
<manu> This is how Comparison (equivalence) affects RDFa: We MUST generate triples for NeXT, PREV, inDEx, etc.
<markbirbeck> "abc" == "aBc" == "aBC" == "ABC"
<markbirbeck> and:
<markbirbeck> ("abc" == "ABC") + ("aBc" == "ABC") + ("aBC" == "ABC")
<markbirbeck> Ok...we can generate those triples though, can't we?
<manu> This is how DOM Storage affects RDFa: If HTML5 parsers munge the values in @rel and @rev when building the DOM, we're in trouble. If HTML5 parsers preserve case in @rel and @rev when building the DOM, we're in good shape.
<manu> yes, and the triples we generate should be all lower-case.
<manu> MUST be all lower-case.
<markbirbeck> Well, now you are mixing up a couple of things.
<markbirbeck> The triples we generate will be based on the token mappings.
<manu> that's true
<markbirbeck> "next" is a token that maps to a URI, isn't it?
<manu> yes
<manu> but remember... we're not just matching on next
<manu> We're also matching on NeXt and NExt
<markbirbeck> So we support "NexT", but it's just the same URI.
<manu> that is correct.
<markbirbeck> Right, but the algorithm is simply that when checking for HTML link-types, use case-insensitive matching.
<manu> right
<markbirbeck> Now, if HTML 5 leaves the values alone, then we're done.
<manu> but that means that it's not just simply "look it up in our list of mappings"
<markbirbeck> I don't think it ever was.
<markbirbeck> But anyway, it's only an extra function call when you are testing.
<manu> it's "determine if the term is a special HTML4.01 term, and then lowercase it, and then look it up in our list of mappings" - or something to that effect.
<markbirbeck> If you like.
<markbirbeck> :)
<manu> so this is all good, but it's the issue I'm concerned about...
<markbirbeck> Or just "compare case-insensitively".
<markbirbeck> I.e., match the HTML specs.
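
[Illustrative sketch, not part of the IRC log.] The "compare case-insensitively" approach discussed above could look roughly like the following Python. It is only a sketch under stated assumptions: the set of link types is a small hypothetical subset of HTML 4.01, the XHTML vocabulary IRI base is an assumption not named in the discussion, and the function names are made up for illustration. The point it tries to show is that the token is folded only for the comparison, while the value read from the DOM is left untouched.

```python
# Assumption: reserved link types map into the XHTML vocabulary namespace.
XHV = "http://www.w3.org/1999/xhtml/vocab#"

# Hypothetical subset of the HTML 4.01 link types, stored in lower case.
HTML4_LINK_TYPES = {"next", "prev", "index"}


def ascii_lower(value):
    """Lowercase only A-Z, leaving every other character unchanged."""
    return "".join(
        chr(ord(ch) + 32) if "A" <= ch <= "Z" else ch for ch in value
    )


def link_type_iri(token):
    """Return the IRI for a reserved link type, or None if not reserved.

    Tokens containing a colon are never treated as link types here.
    Only a folded copy of the token is compared; the caller's original
    token keeps its case.
    """
    if ":" in token:
        return None
    folded = ascii_lower(token)
    if folded in HTML4_LINK_TYPES:
        return XHV + folded
    return None


def rel_iris(rel_value):
    """Split a @rel value on whitespace and map each token."""
    return [link_type_iri(token) for token in rel_value.split()]
```

Under these assumptions, rel="NeXt" and rel="next" map to the same IRI, while rel="LiCeNsE" and rel="conformsTo" are not matched as link types and would have to be handled by some other rule.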
<manu> the issue I'm concerned about is if there is an HTML5 processor out there that would take this: rel="conformsTo" and store it in the DOM as this: rel="conformsto"
<markbirbeck> Yes, now if that *is* the case, then it doesn't affect tokens (or "terms") but it does affect @vocab.
<manu> right
<manu> and like I said, I don't think that's the case... but I'm concerned about it.
<markbirbeck> I've never liked @vocab anyway, but I see why other people do. :)
<markbirbeck> I've always felt that @profile is the big leap forward, and is what people will use more.
<manu> I agree
<markbirbeck> And that is not called into question if HTML 5 munges up @rel values.
<manu> but @vocab is useful when you're doing quick one-off snippets.
<manu> that all use the same vocab
<markbirbeck> Sure.
<manu> like OGP or Google's
<manu> @vocab is good for beginners.
<markbirbeck> But it occurred to me the other day that we should clarify what this does:
<markbirbeck> <div profile="#local-profile">
<markbirbeck> ...
<markbirbeck> </div>
<markbirbeck> So there may be other shorthands available to us.
<markbirbeck> And of course, we may not even have to worry, if @rel is untouched.
-->| tinkster (tai@81.2.120.180) has joined #rdfa
* ivan is peeking in
<ivan> mark, @profile value is defined as @href or @src; which also allows for relative URI-s afaik
<ivan> and I am a bit afraid that would lead to a mess
<ivan> ie, we may want to specify that @profile values are absolute URI-s
<ivan> sorry
<markbirbeck> No need to apologise. It's not decided yet. :)
<ivan> not absolute URIs but not fragments
<ivan> or something like that
<manu> I'm concerned that, while useful, it complicates implementations and I'm not sure it has a big up-side.
<ivan> manu, I agree
<manu> I'm straining to think of how I'd use that...
<ivan> it will screw up implementations, actually, because processors can get into an infinite cycle
<manu> use an in-page profile...
<markbirbeck> You guys are too literal. :)
<markbirbeck> I'm merely saying we might find another way to give people these quick shorthands.
<manu> Hard not to be literal when you're dealing with text on IRC :P
<manu> oh, so you mean, replace @vocab w/ something else.
<manu> ?
<markbirbeck> No...I mean I'm simply throwing out the notion that there are other mechanisms we could work on.
<markbirbeck> I'm not trying to define it here.
<ivan> let us try not to reopen closed issues, gents, we do want to finish this working group in time:-)
<markbirbeck> You've missed the point Ivan.
<ivan> ?
<manu> no harm in discussing this stuff out of WG, though...
<markbirbeck> *If* HTML 5 screws up @rel values and affects their case then the *only* thing that is affected is @vocab.
<ivan> yes
<manu> right
<markbirbeck> If HTML 5 leaves them alone, then nothing changes.
<markbirbeck> I'm not trying to get rid of @vocab.
<markbirbeck> But if HTML 5 pulls the rug from under us, I'm saying that it's the only thing that would be affected.
<markbirbeck> That's it...nothing more.
<manu> wouldn't all terms be munged in the case we're worried about... it would affect all terms in the list of mappings.
<markbirbeck> So, your scenario (on the telecon) of a profile that has all of the FOAF tokens in it is fine.
<manu> why is it fine?
<ivan> yes, I realize that
<markbirbeck> If HTML 5 munges the attributes then we simply say that tokens are compared case-insensitively.
<ivan> it is, because the exact mapping is in the @profile file
<ivan> what is screwed is @vocab="FOAFURI"
<ivan> and then using workplaceHomepage, for example
<markbirbeck> Yes, and that's what I was explaining to Manu.
<markbirbeck> Right.
<manu> ah right, I see.
<manu> I had to go back and read how we process xmlns: and @prefix
<manu> we force all keys to lowercase.
<manu> in the mapping, that is.
<markbirbeck> Do we?
<manu> all xmlns: values are forced to lowercase.
<markbirbeck> I read that in the issue but was surprised.
<manu> one sec, let me look at the latest spec.
<ivan> but that is fine, it does not affect the triples, just the way prefixes are used
<markbirbeck> That's not due to this issue is it?
<ivan> that is due to xmlns
<markbirbeck> That's to do with @xmlns processing in HTML 5, I think.
<manu> Mappings are defined via @prefix. For backward compatibility, some Host Languages may also permit the definition of mappings via @xmlns. In this case, the value to be mapped is set by the XML namespace prefix, and the value to map is the value of the attribute — a URI. Regardless of how the mapping is declared, the value to be mapped must be converted to lower case, and the URI is not...
<manu> ...processed in any way; in particular if it is a relative path it is not resolved against the current base. Authors should not use relative paths as the URI.
<markbirbeck> Right. But why do we do it for @prefix?
<markbirbeck> Anyway...I think I'm up to speed now.
<manu> I forget why we do it for @prefix - perhaps to be orthogonal.
<markbirbeck> I thought there was some place in the HTML 5 spec that was talking about converting attribute values.
<markbirbeck> And I couldn't find it...worried me.
<markbirbeck> :)
<manu> in any case, what we have in the spec now is bad.
<markbirbeck> But sounds like there isn't.
<manu> right?
<markbirbeck> Which part?
<manu> that would mean: prefix="conformsTo: http://purl.org/dc/terms/conformsTo" would lowercase "conformsTo", right?
<markbirbeck> Yes, it would.
<manu> which would mean that this wouldn't generate a triple: rel="conformsTo" ?
<manu> ah, no, I don't think that's right
<manu> we keep the 'term mappings' separate from 'prefix mappings'
<markbirbeck> Unfortunately, so. :)
<markbirbeck> I'll be raising that at last call, of course. ;)
<manu> @xmlns: and @prefix only affect 'prefix mappings'
<manu> but it saved us in this instance, Mark! :)
<manu> well, not really.
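
[Illustrative sketch, not part of the IRC log.] The distinction drawn above, between lowercased keys in the prefix mappings and case-preserving @vocab terms, might be pictured with the Python below. This is a rough sketch and not the RDFa processing algorithm: the function names and the dictionary layout are hypothetical, and treating @vocab resolution as plain string concatenation is an assumption based on the "conformsTo" example earlier in the discussion.

```python
def declare_prefix(prefix_mappings, prefix, uri):
    """Record a prefix mapping.

    Following the spec text quoted in the log, the key (the value to be
    mapped) is lowercased and the URI is stored without any processing.
    """
    prefix_mappings[prefix.lower()] = uri


def resolve_vocab_term(vocab, term):
    """Assumed @vocab behaviour: append the term exactly as written."""
    return vocab + term


prefix_mappings = {}
declare_prefix(prefix_mappings, "DC", "http://purl.org/dc/terms/")

# Case as written by the author versus case after a hypothetical
# lowercasing step while the DOM is built.
as_written = resolve_vocab_term("http://purl.org/dc/terms/", "conformsTo")
if_lowercased = resolve_vocab_term("http://purl.org/dc/terms/", "conformsto")
```

With these assumptions, the prefix key ends up as "dc" regardless of how it was typed, which only changes how prefixes are looked up. The two @vocab results differ, which is the case where a lowercasing DOM would produce a different predicate.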
<markbirbeck> Not really, because you could do case-insensitive matching on the tokens.
<markbirbeck> I.e., you could just adopt the HTML 4/5 approach and have done with it.
<markbirbeck> Interesting that everyone is critical on relative URIs in @rel, but that's exactly what @vocab+@rel is.
<markbirbeck> s/critical on/critical of/
<markbirbeck> But anyway, one approach is to simply adopt the HTML 4/5 approach, and then we're safe, regardless of what the browser does.
<markbirbeck> We then have to think of a different way of doing @vocab.
|<-- tinkster has left irc.w3.org:6665 (Ping timeout)
<manu> I don't think "everyone is critical of relative URIs"
<manu> I think "people are critical of relative URIs in certain attributes like @datatype and @property"
<manu> people are critical of relative URIs for predicates.
<manu> or using relative URIs to specify predicates via @property, @datatype, etc.
<manu> that the predicate that you're using is defined in the same document... I think that's what people are critical of
<markbirbeck> It seems more difficult to 'ban' it than to leave it in, but anyway, the thing is that @vocab is almost (but not quite) a relative URI.
<markbirbeck> It is a little confusing for people, because it gives us yet another way of creating URIs.
<markbirbeck> The tricky bit now though is that we have to decide whether to continue with case-sensitive values in @rel...
<markbirbeck> ...with the risk that some browsers might pull the rug from under us.
<markbirbeck> Or protect ourselves now from this possibility by making the values into 'token-only' values...
<markbirbeck> ...and defining token-processing as being case-insensitive.
* manu nods.

-- manu

--
Manu Sporny (skype: msporny, twitter: manusporny)
President/CEO - Digital Bazaar, Inc.
blog: Myth Busting Web Stacks - PHP is Faster Than You Think
http://blog.digitalbazaar.com/2010/06/12/myth-busting-php/2/
Received on Thursday, 8 July 2010 16:30:19 UTC