Re: Canonicalizing relative URLs seen in URL type properties?

Hi Philip:

AFAIK,

   http://schema.org/InStock 

is not an itemtype but the identifier of an individual representing a value of the type

   http://schema.org/ItemAvailability

So I am unsure how to interpret your statement:

> Note of caution: the following is about URLs in itemtype, while the original thread is about URLs in itemprop. They are not the same.

http://schema.org/InStock will only be used in patterns like

<link itemprop="availability" href="http://schema.org/InStock"/>

You also said that any client doing any kind of lax handling of this identifier would be "non-compliant"; however, I am sure that at least some major search engines will eventually tolerate

<link itemprop="availability" href="http://schema.org/instock"/>

You can see in Google's testing tool that they internally normalize all itemprop URIs to lower case.
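
To make the distinction concrete, here is a minimal sketch in Python. It assumes only the basic HTTP equivalences from RFC 2616 section 3.2.3 (case-insensitive scheme and host, default port 80, empty path equal to "/"). The function name canonicalize_http_uri is hypothetical, and this is not how Google or any other Microdata consumer is known to work:

```python
from urllib.parse import urlsplit, urlunsplit

# Default ports per scheme; only http matters for the examples in this thread.
DEFAULT_PORTS = {"http": 80, "https": 443}


def canonicalize_http_uri(uri: str) -> str:
    """Apply the basic RFC 2616 section 3.2.3 equivalences (illustrative only).

    Limitations of this sketch: userinfo is dropped, IPv6 literals are not
    handled, and percent-encoding is left untouched.
    """
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()  # host names are case-insensitive
    port = parts.port
    # An explicit default port is equivalent to no port at all.
    if port is None or port == DEFAULT_PORTS.get(scheme):
        netloc = host
    else:
        netloc = f"{host}:{port}"
    # An empty path is equivalent to "/"; the path itself stays case-sensitive.
    path = parts.path or "/"
    return urlunsplit((scheme, netloc, path, parts.query, parts.fragment))


if __name__ == "__main__":
    for variant in (
        "http://SCHEMA.ORG/InStock",
        "http://Schema.org/InStock",
        "http://schema.org:80/InStock",
        "http://schema.org/instock",
    ):
        print(variant, "->", canonicalize_http_uri(variant))
```

Under these rules the host and port variants collapse to a single identifier, while the lower-case path variant does not. Tolerating http://schema.org/instock would therefore need an additional, consumer-specific rule that goes beyond RFC 2616.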


Martin

On Oct 20, 2011, at 11:31 AM, Philip Jägenstedt wrote:

> Note of caution: the following is about URLs in itemtype, while the original thread is about URLs in itemprop. They are not the same.
> 
> On Thu, 20 Oct 2011 11:02:36 +0200, Martin Hepp <martin.hepp@ebusiness-unibw.org> wrote:
> 
>> Hi Philip:
>> 
>> (I think we started to discuss this in the old schema.org forum, but never managed to finish the discussion.)
>> 
>> What is the Microdata take on the canonicalization of property and href/object identifiers?
>> 
>> So for example
>> 
>>    http://schema.org/InStock
>> 
>> could be also expressed using
>> 
>>    http://SCHEMA.ORG/InStock
>>    http://Schema.org/InStock
>>    http://schema.org:80/InStock
>> 
>> and other variations, supported by RFC 2616 [1, section 3.2.3].
>> 
>> From an HTTP protocol perspective, they are all equivalent, but even if treating them as equal is desirable, it will be difficult for data consumers to spot the equivalence in queries when they are used as identifiers.
>> 
>> We once had a lengthy discussion on this in
>> 
>>   http://lists.w3.org/Archives/Public/public-lod/2011Jan/0134.html
>> 
>> and the general conclusion seems to have been as follows:
>> 
>> 1. When used as locators (i.e. to retrieve a representation), all variants will deliver the same representations.
>> 2. When used as identifiers (i.e. to refer to an entity), only the canonical URI is guaranteed to work.
>> 3. RDF implementations would work better if they did implicit canonicalization, at least for the basic HTTP URI variations from RFC 2616 section 3.2.3.
>> 4. TBL had a strong opinion that RDF environments should do the canonicalization, while others stressed the enormous technical difficulties given the broad range of URI schemes and their different canonicalization rules.
> 
> The spec [1] is quite explicit about this: "Item types are opaque identifiers, and user agents must not dereference unknown item types, or otherwise deconstruct them, in order to determine how to process items that use them."
> 
> While the item type is defined to be an absolute URL, it's never really treated as a URL and the exact string "http://schema.org/InStock" is the only string that can be used.
> 
> This trait is evident in the DOM API, where itemType is reflected as a string (not resolved like URL properties are) and document.getItems does a case-sensitive string match, not checking any kind of URL equivalence.
> 
> (One could argue that having two kinds of URLs is confusing and that itemtype should be resolved, but I won't, since it would make the DOM API more complicated.)
> 
> [1] http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#items
> 
> -- 
> Philip Jägenstedt
> Core Developer
> Opera Software
> 

Received on Tuesday, 1 November 2011 17:26:29 UTC