Re: The exact meaning of a 'global identifier' (itemid)

On Jun 9, 2014, at 2:37 PM, Markus Lanthaler <markus.lanthaler@gmx.net> wrote:

> On Monday, June 09, 2014 11:14 PM, Jarno van Driel wrote:
>>> "...So, if your document lives at http://example.com/document the
>>> "global identifier" will be http://example.com/document#fragment"
>> 
>> 1] So is this because this is simply how it works, or because that's
>> how schema.org treats itemid? Now I'm not being a smartass here, I
>> just really want to understand how this is treated from a schema.org
>> POV
> 
> Because it’s defined that way in the Microdata spec:
> 
>     The global identifier of an item is the value of its element's
>     itemid attribute, if it has one, resolved relative to the element
>     on which the attribute is specified. If the itemid attribute is
>     missing or if resolving it fails, it is said to have no global
>     identifier.
> 
> 
>>> "...all properties would be merged so that you end up with a single
>>> item..."
>> 
>> 2] That's what I thought as well. Which is supported by the fact the
>> structured data linter resolves it this way. But both Google's and
>> Yandex's SDTT don't and there is no info I could find on how the
>> sponsors look at it. So inconclusive data VS no documentation; What am
>> I to believe for certain?
> 
> Only Google, Yandex etc. themselves will be able to tell you what they do with such data. Maybe you should formulate your question differently. Does it affect you if they do it one way or the other way? In which way does it affect you? Does it matter?

Of course, anyone may interpret Microdata however they find useful; it would be nice to get some definitive interpretation from schema.org partners, but they've (collectively) never been forthcoming on what the real semantics of schema.org are.

That said, schema.org largely extends RDF Schema, and so, IMO, if you treat the interpretation as being consistent with RDF/RDFS, I think you'll be heading in the right direction; this is how the Structured-Data Linter interprets it. (If someone has an example where this is NOT the case, I'd be quite interested to see it.)

When interpreting as RDF, it's useful to look at the Microdata to RDF note [1]. This defines @itemid as follows:

	[[An attribute containing a URL used to identify the subject of triples associated with this item ...]]

When considered as an RDF subject, we can then infer that each use of the same @itemid value does refer, in fact, to the same Resource. This means that multiple uses of it will have their property values coalesced to be about the same subject resource.
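To illustrate the coalescing, here is a minimal Python sketch. It is only an illustration of the idea, not how any particular processor is implemented: the function name coalesce_by_itemid and the (itemid, properties) input shape are made up for the example, and the @itemid values are assumed to be already absolute URLs.

```python
from collections import defaultdict


def coalesce_by_itemid(items):
    """Merge the properties of items that share the same @itemid.

    `items` is an iterable of (itemid, properties) pairs, where `itemid`
    is assumed to be an absolute URL and `properties` maps a property
    name to a single value. This input shape is hypothetical.
    """
    merged = defaultdict(lambda: defaultdict(list))
    for itemid, properties in items:
        for name, value in properties.items():
            # Same subject URL => values accumulate on the same resource.
            merged[itemid][name].append(value)
    return {subject: dict(props) for subject, props in merged.items()}


# Two separate items in the markup that use the same @itemid.
items = [
    ("http://example.com/document#fragment", {"name": "Example"}),
    ("http://example.com/document#fragment", {"url": "http://example.com/"}),
]
merged = coalesce_by_itemid(items)
print(merged)
```

Both items end up under a single subject, which is the RDF-style reading described above; a consumer that does not merge would instead report two separate items.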

Furthermore, when interpreting @itemid, the algorithm says that if there is a global identifier which is an absolute URL, use that as the subject, otherwise use a blank node [2]. It then becomes necessary to find the content model of @itemid, which is defined in the Microdata spec [3]:

	[[
	The global identifier of an item is the value of its element's itemid attribute, if it has one, resolved relative to the element on which the attribute is specified. If the itemid attribute is missing or if resolving it fails, it is said to have no global identifier.
	]]

Thus, @itemid values are interpreted the same way as other HTML attributes taking a URL [4]. This is consistent with the way that relative IRIs are resolved in other RDF specs, such as RDFa, although with regard to HTML's definition of a URL, rather than an IRI. Basically, "foo" is resolved relative to the base URL of the document, "#foo" is considered as a fragment identifier of that document base, and "/foo" would be considered an absolute path within the same domain as the document.

Some examples:

Given the document base <http://example.com/foo/bar>

@itemid="baz" => <http://example.com/foo/baz>
@itemid="#baz" => <http://example.com/foo/bar#baz>
@itemid="/baz" => <http://example.com/baz>
@itemid="http://example.org/baz" => <http://example.org/baz>
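
The same resolutions can be reproduced with a short Python sketch. Note the assumption: urljoin from the standard library follows RFC 3986 reference resolution rather than HTML's URL algorithm [4], but the two should agree for simple cases like these. The helper name resolve_itemid is made up for the example.

```python
from urllib.parse import urljoin

# Document base taken from the examples above.
BASE = "http://example.com/foo/bar"


def resolve_itemid(itemid, base=BASE):
    """Resolve an @itemid value against the document base URL.

    Uses RFC 3986 resolution via urljoin, as a stand-in (an assumption)
    for HTML's own URL resolution.
    """
    return urljoin(base, itemid)


for itemid in ("baz", "#baz", "/baz", "http://example.org/baz"):
    print(f'@itemid="{itemid}" => <{resolve_itemid(itemid)}>')
```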

Gregg

[1] http://www.w3.org/TR/microdata-rdf/
[2] http://www.w3.org/TR/microdata-rdf/#generate-the-triples
[3] http://www.w3.org/TR/microdata/#items
[4] http://www.w3.org/TR/html5/infrastructure.html#resolving-urls

> --
> Markus Lanthaler
> @markuslanthaler
> 
> 
> 
> 

Received on Monday, 9 June 2014 22:09:34 UTC