
ISSUE-24 discussion

From: Manu Sporny <msporny@digitalbazaar.com>
Date: Thu, 08 Jul 2010 12:29:46 -0400
Message-ID: <4C35FCFA.4010401@digitalbazaar.com>
To: RDFa WG <public-rdfa-wg@w3.org>

A discussion about ISSUE-24 that happened after the RDFa WG telecon.

<markbirbeck> I'm trying to understand this last issue; does anyone know
where in HTML 5 it says that one cannot rely on the case of the tokens
in the @rel attribute?
<markbirbeck> Oscar Wilde on the version attribute:
<manu> hehe :)
<manu> markbirbeck: In HTML4, rel values were case insensitive.
<manu> HTML5 is supposed to be backwards compatible with HTML4 and thus
it is assumed that they're case insensitive in HTML5 as well.
<manu> However, nobody has presented any spec text or done tests to see
if this is the case.
<manu> so, this may be a good example of the implementations deviating
from the spec
<markbirbeck> Well, to be more precise, it's the *link types* that are
<manu> yes, exactly right.
<markbirbeck> So there's no problem with allowing someone to write
"Next" or "next".
<markbirbeck> But that doesn't mean all @rel values should be
<manu> so, the question is - should we generate a triple for
rel="LiCeNsE" in HTML+RDFa 1.1?
<markbirbeck> We only need to be backwards-compatible on the tokens
referred to in HTML 4.01.
<markbirbeck> No. :)
<markbirbeck> But for rel="NeXt"...yes.
<manu> I agree - but if we say No, what's the reasoning?
<manu> ah, sorry
<manu> I keep forgetting that rel="license" isn't in HTML 4.01 as a Link
<manu> (I think?)
<markbirbeck> No, it's not.
<manu> So, this is what I was trying to get at on the phone... we make
case insensitivity only apply to HTML 4.01 terms.
<manu> s/terms/Link Types/
<manu> but there is a parallel issue - are all @rel/@rev values treated
in a case-insensitive manner?
<manu> I think the answer to that is no
<manu> and in fact, I think case is preserved for all rel/rev values.
<manu> but I haven't done the tests to see if that's the case.
<markbirbeck> Sorry...I thought HTML 5 did exact matching, but it
doesn't (just re-read it).
<markbirbeck> Any link type that doesn't contain a colon must be
compared case-insensitively.
<manu> right, which screws us
<manu> well, kind-of
<markbirbeck> I missed the point as to why that screws us,
though...sorry. :(
<manu> it makes it confusing...
<manu> so, if we have <a vocab="http://purl.org/dc/terms/"
rel="conformsTo" href="http://example.org/foo">Conforms to Foo</a>
<manu> can we retrieve the value of @rel from an HTML5-built DOM,
preserving the case?
<manu> or does the HTML5-built DOM lowercase "conformsTo"?
<manu> because if it forces "conformsTo" to lowercase, we're screwed.
<markbirbeck> I don't see anything that indicates it changes the values
in the attributes.
<manu> because the predicate that is generated is
"http://purl.org/dc/terms/conformsto" and not
<markbirbeck> I'm reading it that authors are free to type what they like.
<manu> right, that's my understanding as well
<manu> but nobody has tested it...
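
One rough way to start testing the question above is to feed manu's snippet to a parser and look at the @rel value it reports. The sketch below uses Python's standard-library html.parser, which is an assumption made for illustration: it is not an HTML5-conformant parser and says nothing about what any browser stores in its DOM. The names RelCollector and collect_rel_values are hypothetical.

```python
from html.parser import HTMLParser


class RelCollector(HTMLParser):
    """Records every @rel value exactly as the parser reports it."""

    def __init__(self):
        super().__init__()
        self.rel_values = []

    def handle_starttag(self, tag, attrs):
        # html.parser lower-cases attribute *names*; the question here is
        # whether the attribute *value* keeps the author's case.
        for name, value in attrs:
            if name == "rel" and value is not None:
                self.rel_values.append(value)


def collect_rel_values(markup):
    """Parse markup and return the raw @rel values in document order."""
    collector = RelCollector()
    collector.feed(markup)
    collector.close()
    return collector.rel_values


# The snippet from the discussion above.
SNIPPET = (
    '<a vocab="http://purl.org/dc/terms/" rel="conformsTo" '
    'href="http://example.org/foo">Conforms to Foo</a>'
)
```

Calling collect_rel_values(SNIPPET) shows whether this particular parser hands back "conformsTo" or "conformsto". The same check would need repeating against real HTML5 DOM implementations before drawing any conclusion.
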
<manu> (this is also confusing because there are multiple things we're
talking about)
<markbirbeck> I don't see anything in the pre-processing steps that
would even hint at that (just break on space boundaries).
<markbirbeck> And also, the fact that anything with a colon in *is*
case-sensitive would imply that they are not going to mung the attribute
<manu> yes, but why would you say that if /all/ values are case-sensitive?
<manu> why mention that values with a colon are case sensitive? Does
that mean that values without a colon aren't case sensitive?
<markbirbeck> I'm not with you...all values aren't case-sensitive, are they?
<manu> That's what I'm saying - I think that all values /are/
<markbirbeck> Yes, that's what I said earlier...the point that I missed
when I first read it:
<markbirbeck> "The link types that contain no U+003A COLON characters
(:), including all those defined in this specification, are ASCII
case-insensitive values, and must be compared as such."
<markbirbeck> "Thus, rel="next" is the same as rel="NEXT"."
<manu> as far as they are /compared/ yes.
<manu> So there are two things: comparison and what's stored in the DOM.
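
The distinction between comparison and storage can be sketched as follows, assuming the rule markbirbeck quotes: tokens without a colon compare ASCII case-insensitively, and tokens with a colon compare exactly. The function names are hypothetical, and splitting with str.split() is a simplification of how an attribute value is broken on space boundaries.

```python
def ascii_lower(token):
    """Lower-case only A-Z, leaving every other character untouched."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in token)


def link_types_match(a, b):
    """Comparison: exact if either token has a colon, else ASCII case-insensitive."""
    if ":" in a or ":" in b:
        return a == b
    return ascii_lower(a) == ascii_lower(b)


def split_rel(attribute_value):
    """Storage: split on whitespace only; each token keeps the author's case."""
    return attribute_value.split()
```

Under these assumptions link_types_match("next", "NeXt") holds, while split_rel("NeXt LiCeNsE") still yields the tokens exactly as typed. Nothing in the comparison step changes the stored value.
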
<markbirbeck> Yes, I see that, but I'm not seeing anything that affects
the DOM.
<manu> As far as comparison is concerned - rel="next" and rel="NeXt" are
the same.
<markbirbeck> (In the spec...)
<markbirbeck> You realise that this only matters to us if we have this
default namespace thing?
<markbirbeck> @vocab...
<manu> as far as what's stored in the DOM: Can "next" be stored in the
DOM as a value for @rel, can "NeXt" be stored in the DOM as a value for
<markbirbeck> Well, the thing is that the spec doesn't say which way round
things are, so I really doubt that they can mung the value.
<manu> yes, exactly
<manu> I don't think that they can munge the value.
<markbirbeck> If they said that everything is equivalent to the
upper-case version (for example) then you could at a push convert the
value to UC.
<manu> and if they can't munge the value, both HTML5 and XHTML5 are case
<manu> which means we're good.
<markbirbeck> But the spec only says that "abc" == "aBc" == "aBC" ....
<markbirbeck> It doesn't say that they are all equivalent to "ABC".
<manu> I don't follow... they are all equivalent to "ABC" aren't they?
<manu> Comparison: "abc" == "aBc" == "aBC" == "ABC"
<markbirbeck> Well, only insofar as they are also all equivalent to
"abc", and they are all equivalent to "aBc".
<manu> but in the DOM, they're stored as: "abc", "aBc", "aBC", "ABC"
<markbirbeck> I.e., they are all equivalent to each other, but there is
no lingua franca, so to speak.
<manu> So, here's what I think affects RDFa
<manu> well, both affect RDFa, but in different ways.
<markbirbeck> I.e., the spec doesn't say they are all equivalent to 'x',
it just says they are all equivalent to each other.
<markbirbeck> There's a world of difference between:
<manu> This is how Comparison (equivalence) affects RDFa: We MUST
generate triples for NeXT, PREV, inDEx, etc.
<markbirbeck> "abc" == "aBc" == "aBC" == "ABC"
<markbirbeck> and:
<markbirbeck> ("abc" == "ABC") + ("aBc" == "ABC") + ("aBC" == "ABC")
<markbirbeck> Ok...we can generate those triples though, can't we?
<manu> This is how DOM Storage affects RDFa: If HTML5 parsers munge the
values in @rel and @rev when building the DOM, we're in trouble. If
HTML5 parsers preserve case in @rel and @rev when building the DOM,
we're in good shape.
<manu> yes, and the triples we generate should be all lower-case.
<manu> MUST be all lower-case.
<markbirbeck> Well, now you are mixing up a couple of things.
<markbirbeck> The triples we generate will be based on the token mappings.
<manu> that's true
<markbirbeck> "next" is a token that maps to a URI, isn't it?
<manu> yes
<manu> but remember... we're not just matching on next
<manu> We're also matching on NeXt and NExt
<markbirbeck> So we support "NexT", but it's just the same URI.
<manu> that is correct.
<markbirbeck> Right, but the algorithm is simply that when checking for
HTML link-types, use case-insensitive matching.
<manu> right
<markbirbeck> Now, if HTML 5 leaves the values alone, then we're done.
<manu> but that means that it's not just simply "look it up in our list
of mappings"
<markbirbeck> I don't think it ever was.
<markbirbeck> But anyway, it's only an extra function call when you are
<manu> it's "determine if the term is a special HTML4.01 term, and then
lowercase it, and then look it up in our list of mappings" - or
something to that effect.
<markbirbeck> If you like.
<markbirbeck> :)
<manu> so this is all good, but it's not the issue I'm concerned about...
<markbirbeck> Or just "compare case-insensitively".
<markbirbeck> I.e., match the HTML specs.
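
A minimal sketch of the lookup described above: check case-insensitively whether a token is an HTML 4.01 link type, and if so emit the lower-case form. The vocabulary URI is an assumption (it is the XHTML vocabulary namespace, which this discussion does not name), and resolve_html401_link_type is a hypothetical helper, not spec text.

```python
import string

# Assumed predicate namespace for HTML link types; not stated in this thread.
XHTML_VOCAB = "http://www.w3.org/1999/xhtml/vocab#"

# Link types listed in HTML 4.01, stored in lower case.
HTML401_LINK_TYPES = frozenset({
    "alternate", "stylesheet", "start", "next", "prev", "contents",
    "index", "glossary", "copyright", "chapter", "section",
    "subsection", "appendix", "help", "bookmark",
})

# Translation table that lower-cases ASCII letters only.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def resolve_html401_link_type(token):
    """Return the predicate URI for an HTML 4.01 link type, else None."""
    lowered = token.translate(_ASCII_LOWER)
    if lowered in HTML401_LINK_TYPES:
        # The generated URI always uses the lower-case token.
        return XHTML_VOCAB + lowered
    return None
```

With this sketch rel="NeXt" and rel="next" resolve to the same URI, while rel="LiCeNsE" resolves to nothing because "license" is not an HTML 4.01 link type, matching the answers given earlier in the discussion.
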
<manu> the issue I'm concerned about is if there is an HTML5 processor
out there that would take this: rel="conformsTo" and store it in the DOM
as this: rel="conformsto"
<markbirbeck> Yes, now if that *is* the case, then it doesn't affect
tokens (or "terms") but it does affect @vocab.
<manu> right
<manu> and like I said, I don't think that's the case... but I'm
concerned about it.
<markbirbeck> I've never liked @vocab anyway, but I see why other people
do. :)
<markbirbeck> I've always felt that @profile is the big leap forward,
and is what people will use more.
<manu> I agree
<markbirbeck> And that is not called into question if HTML 5 munges up
@rel values.
<manu> but @vocab is useful when you're doing quick one-off snippets.
<manu> that all use the same vocab
<markbirbeck> Sure.
<manu> like OGP or Google's
<manu> @vocab is good for beginners.
<markbirbeck> But it occurred to me the other day that we should clarify
what this does:
<markbirbeck> <div profile="#local-profile">
<markbirbeck> ...
<markbirbeck> </div>
<markbirbeck> So there may be other shorthands available to us.
<markbirbeck> And of course, we may not even have to worry, if @rel is
-->| tinkster (tai@ has joined #rdfa
* ivan is peeking in
<ivan> mark, @profile value is defined as @href or @src; which also
allows for relative URI-s afaik
<ivan> and I am a bit afraid that would lead to a mess
<ivan> ie, we may want to specify that @profile values are absolute URI-s
<ivan> sorry
<markbirbeck> No need to apologise. It's not decided yet. :)
<ivan> not absolute URIs but not fragments
<ivan> or something like that
<manu> I'm concerned that, while useful, it complicates implementations
and I'm not sure it has a big up-side.
<ivan> manu, I agree
<manu> I'm straining to think of how I'd use that...
<ivan> it will screw up implementations, actually, because processors
can get into an infinite cycle
<manu> use an in-page profile...
<markbirbeck> You guys are too literal. :)
<markbirbeck> I'm merely saying we might find another way to give people
these quick shorthands.
<manu> Hard not to be literal when you're dealing with text on IRC :P
<manu> oh, so you mean, replace @vocab w/ something else.
<manu> ?
<markbirbeck> No...I mean I'm simply throwing out the notion that there
are other mechanisms we could work on.
<markbirbeck> I'm not trying to define it here.
<ivan> let us try not to reopen closed issues, gents, we do want to
finish this working group in time:-)
<markbirbeck> You've missed the point Ivan.
<ivan> ?
<manu> no harm in discussing this stuff out of WG, though...
<markbirbeck> *If* HTML 5 screws up @rel values and affects their case
then the *only* thing that is affected is @vocab.
<ivan> yes
<manu> right
<markbirbeck> If HTML 5 leaves them alone, then nothing changes.
<markbirbeck> I'm not trying to get rid of @vocab.
<markbirbeck> But if HTML 5 pulls the rug from under us, I'm saying that
it's the only thing that would be affected.
<markbirbeck> That's it...nothing more.
<manu> wouldn't all terms be munged in the case we're worried about...
it would affect all terms in the list of mappings.
<markbirbeck> So, your scenario (on the telecon) of a profile that has
all of the FOAF tokens in it is fine.
<manu> why is it fine?
<ivan> yes, I realize that
<markbirbeck> If HTML 5 munges the attributes then we simply say that
tokens are compared case-insensitively.
<ivan> it is, because the exact mapping is in the @profile file
<ivan> what is screwed is @vocab="FOAFURI"
<ivan> and then using workplaceHomepage, for example
<markbirbeck> Yes, and that's what I was explaining to Manu.
<markbirbeck> Right.
<manu> ah right, I see.
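
The difference between the two cases can be sketched as below, under the hypothetical scenario that a parser lower-cases @rel values. PROFILE is a plain dictionary standing in for a profile document, and both helper functions are illustrative names rather than anything defined by RDFa.

```python
FOAF = "http://xmlns.com/foaf/0.1/"


def expand_with_vocab(vocab_uri, term):
    """@vocab: the predicate is the vocabulary URI plus the term as found."""
    return vocab_uri + term


def expand_with_profile(profile_terms, term):
    """@profile: look the term up case-insensitively; the URI comes from the profile."""
    wanted = term.lower()
    for name, uri in profile_terms.items():
        if name.lower() == wanted:
            return uri
    return None


# Hypothetical stand-in for a profile document that maps tokens to URIs.
PROFILE = {"workplaceHomepage": FOAF + "workplaceHomepage"}

authored = "workplaceHomepage"
munged = authored.lower()  # what a case-munging parser would hand over
```

Here expand_with_profile(PROFILE, munged) still returns the correctly cased FOAF URI, because the exact mapping lives in the profile. expand_with_vocab(FOAF, munged) produces a URI ending in "workplacehomepage", which is a different predicate.
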
<manu> I had to go back and read how we process xmlns: and @prefix
<manu> we force all keys to lowercase.
<manu> in the mapping, that is.
<markbirbeck> Do we?
<manu> all xmlns: values are forced to lowercase.
<markbirbeck> I read that in the issue but was surprised.
<manu> one sec, let me look at the latest spec.
<ivan> but that is fine, it does not affect the triples, just the way
prefixes are used
<markbirbeck> That's not due to this issue is it?
<ivan> that is due to xmlns
<markbirbeck> That's to do with @xmlns processing in HTML 5, I think.
<manu> Mappings are defined via @prefix. For backward compatibility,
some Host Languages may also permit the definition of mappings via
@xmlns. In this case, the value to be mapped is set by the XML namespace
prefix, and the value to map is the value of the attribute - a URI.
Regardless of how the mapping is declared, the value to be mapped must
be converted to lower case, and the URI is not...
<manu> ...processed in any way; in particular if it is a relative path
it is not resolved against the current base. Authors should not use
relative paths as the URI.
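
The rule quoted above can be sketched as follows, assuming an @prefix value written as whitespace-separated "key: URI" pairs. parse_prefix_attribute is a hypothetical helper; it lower-cases the key and leaves the URI exactly as written, without resolving relative paths.

```python
def parse_prefix_attribute(value):
    """Turn 'dc: http://purl.org/dc/terms/ ...' into a {prefix: uri} mapping."""
    parts = value.split()
    mappings = {}
    # Walk the tokens two at a time: a 'key:' token followed by its URI.
    for key, uri in zip(parts[0::2], parts[1::2]):
        if key.endswith(":"):
            # The value to be mapped is lower-cased; the URI is not processed.
            mappings[key[:-1].lower()] = uri
    return mappings
```

For example, parse_prefix_attribute("DC: http://purl.org/dc/terms/") stores the key as "dc" and keeps the URI untouched. This only affects how prefixes are looked up, not the case of the URIs that end up in triples.
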
<markbirbeck> Right. But why do we do it for @prefix?
<markbirbeck> Anyway...I think I'm up to speed now.
<manu> I forget why we do it for @prefix - perhaps to be orthogonal.
<markbirbeck> I thought there was some place in the HTML 5 spec that was
talking about converting attribute values.
<markbirbeck> And I couldn't find it...worried me.
<markbirbeck> :)
<manu> in any case, what we have in the spec now is bad.
<markbirbeck> But sounds like there isn't.
<manu> right?
<markbirbeck> Which part?
<manu> that would mean: prefix="conformsTo:
http://purl.org/dc/terms/conformsTo" would lowercase "conformsTo", right?
<markbirbeck> Yes, it would.
<manu> which would mean that this wouldn't generate a triple:
rel="conformsTo" ?
<manu> ah, no, I don't think that's right
<manu> we keep the 'term mappings' separate from 'prefix mappings'
<markbirbeck> Unfortunately, so. :)
<markbirbeck> I'll be raising that at last call, of course. ;)
<manu> @xmlns: and @prefix only affect 'prefix mappings'
<manu> but it saved us in this instance, Mark! :)
<manu> well, not really.
<markbirbeck> Not really, because you could do case-insensitive matching
on the tokens.
<markbirbeck> I.e., you could just adopt the HTML 4/5 approach and have
done with it.
<markbirbeck> Interesting that everyone is critical on relative URIs in
@rel, but that's exactly what @vocab+@rel is.
<markbirbeck> s/critical on/critical of/
<markbirbeck> But anyway, one approach is to simply adopt the HTML 4/5
approach, and then we're safe, regardless of what the browser does.
<markbirbeck> We then have to think of a different way of doing @vocab.
|<-- tinkster has left irc.w3.org:6665 (Ping timeout)
<manu> I don't think "everyone is critical of relative URIs"
<manu> I think "people are critical of relative URIs in certain
attributes like @datatype and @property"
<manu> people are critical of relative URIs for predicates.
<manu> or using relative URIs to specify predicates via @property,
@datatype, etc.
<manu> that the predicate that you're using is defined in the same
document... I think that's what people are critical of
<markbirbeck> It seems more difficult to 'ban' it than to leave it in,
but anyway, the thing is that @vocab is almost (but not quite) a
relative URI.
<markbirbeck> It is a little confusing for people, because it gives us
yet another way of creating URIs.
<markbirbeck> The tricky bit now though is that we have to decide
whether to continue with case-sensitive values in @rel...
<markbirbeck> ...with the risk that some browsers might pull the rug
from under us.
<markbirbeck> Or protect ourselves now from this possibility by making
the values into 'token-only' values...
<markbirbeck> ...and defining token-processing as being case-insensitive.
* manu nods.

-- manu

Manu Sporny (skype: msporny, twitter: manusporny)
President/CEO - Digital Bazaar, Inc.
blog: Myth Busting Web Stacks - PHP is Faster Than You Think
Received on Thursday, 8 July 2010 16:30:19 GMT
