Re: recommended pattern for markup-valued 'breadcrumb' properties in RDFa

On Nov 12, 2012, at 4:24 AM, Dan Brickley <danbri@danbri.org> wrote:

> Dear RDFa WG,
> 
> I'm looking for some advice on schema.org markup options. I hope to
> join the WG shortly but wanted to start a conversation as early as
> possible.
> 
> Schema.org's markup for breadcrumbs is both popular and (currently)
> broken. The issue at http://www.w3.org/2011/webschema/track/issues/10
> gives some backstory, but factors include Microdata's rule for
> concatenating subelements, as well as the difficulty of representing
> ordered lists of link/label pairs as simple triples without complex
> markup. For the purposes of this mail, I am only interested in the
> RDFa 1.1 possibilities.
> 
> Egor (cc:'d) has made a draft of a proposal for improving our design,
> http://www.w3.org/wiki/WebSchemas/Breadcrumbs . This draft explores an
> approach that makes explicit within the extracted graph, the ordering,
> labelling and URLs from a 'breadcrumbs' section of HTML.
> 
> I would very much like to get the RDFa WG's perspective on this issue.

Well, I can give you my perspective on this issue. From a Linked Data/RDF perspective, I would expect breadcrumbs to give me an ordered list of links to the relevant resources, not HTML markup that has meaning only to a human.

From a Microdata+RDF perspective, schema:breadcrumb is described as a property having an ordered list of values, so that parsing the following yields a list in Turtle:

<div itemscope itemtype="http://schema.org/WebPage">
    <div itemprop="breadcrumb">
      <a href="category/books.html">Books</a> >
      <a href="category/books-literature.html">Literature and Fiction</a> >
      <a href="category/books-classics">Classics</a>
    </div>
</div>

@prefix md: <http://www.w3.org/ns/md#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfa: <http://www.w3.org/ns/rdfa#> .
@prefix schema: <http://schema.org/> .

<> md:item ([ a schema:WebPage;
       schema:breadcrumb ("""
      Books >
      Literature and Fiction >
      Classics
    """)]);
   rdfa:usesVocabulary schema: .

The intention was for each link to be a URI in this list, so you could do the following, instead:

<div itemscope itemtype="http://schema.org/WebPage">
    <div>
      <a itemprop="breadcrumb" href="category/books.html">Books</a> >
      <a itemprop="breadcrumb" href="category/books-literature.html">Literature and Fiction</a> >
      <a itemprop="breadcrumb" href="category/books-classics">Classics</a>
    </div>
</div>

Which would give you:

@prefix md: <http://www.w3.org/ns/md#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfa: <http://www.w3.org/ns/rdfa#> .
@prefix schema: <http://schema.org/> .

<> md:item ([ a schema:WebPage;
       schema:breadcrumb (<category/books.html> <category/books-literature.html> <category/books-classics>)]);
   rdfa:usesVocabulary schema: .

In RDFa 1.1 (not Lite), you can do this with @inlist and @rel:

<div vocab="http://schema.org/" typeof="WebPage">
    <div rel="breadcrumb" inlist>
      <a href="category/books.html">Books</a> >
      <a href="category/books-literature.html">Literature and Fiction</a> >
      <a href="category/books-classics">Classics</a>
    </div>
</div>

Giving: 

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfa: <http://www.w3.org/ns/rdfa#> .
@prefix schema: <http://schema.org/> .

<> rdfa:usesVocabulary schema: .

 [ a schema:WebPage;
    schema:breadcrumb (<category/books.html> <category/books-literature.html> <category/books-classics>)] .

In RDFa 1.1 Lite, you'd need to use @property, repeating both @property and @inlist on each <a>, but I don't think @inlist is officially part of RDFa 1.1 Lite.
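
As a rough, untested sketch of that @property-only form (assuming a processor that accepts @inlist alongside @property, which strictly speaking takes it outside of Lite):

<div vocab="http://schema.org/" typeof="WebPage">
    <div>
      <a property="breadcrumb" inlist href="category/books.html">Books</a> >
      <a property="breadcrumb" inlist href="category/books-literature.html">Literature and Fiction</a> >
      <a property="breadcrumb" inlist href="category/books-classics">Classics</a>
    </div>
</div>

Since @property on an element with @href (and no @content or @datatype) takes the link target as its value in RDFa 1.1, I would expect this to give the same three-element list of URIs as the @rel version.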

> Looking at http://www.w3.org/TR/2012/REC-rdfa-core-20120607/#markup-fragments-and-rdfa
> and http://www.w3.org/TR/2012/REC-rdfa-core-20120607/#s-xml-literals
> it seems an alternate design might be possible with RDFa. Instead of
> trying to make the entire 'breadcrumb' structure explicit as a graph,
> we could put the whole breadcrumb into a single property value as a
> larger piece of markup. The current spec shows this example:
> 
> <h2 property="dc:title" datatype="rdf:XMLLiteral">
>  E = mc<sup>2</sup>: The Most Urgent Problem of Our Time
> </h2>
> 
> ...presumably this will be adjusted in the HTML+RDFa world. There was
> discussion in the RDF WG earlier this year towards HTMLLiteral or HTML
> as a datatype; http://lists.w3.org/Archives/Public/public-rdf-wg/2012May/0612.html
> and the latest drafts now have such a datatype:
> 
> http://dvcs.w3.org/hg/rdf/raw-file/default/rdf-concepts/index.html#section-html
> http://www.w3.org/TR/2012/WD-rdf11-concepts-20120605/#section-html
> (latest public and editor's drafts seem identical)

I'm not a fan of this use case, but I believe that our intention is to support rdf:HTML in HTML+RDFa 1.1; certainly my processor does.

> "5.2 The rdf:HTML Datatype
> 
> RDF provides for HTML content as a possible literal value. This allows
> markup in literal values. Such content is indicated in an RDF graph
> using a literal whose datatype is a special built-in datatype
> rdf:HTML. This datatype is defined as follows[...]"
> 
> Let's look at the older Microdata example we still publish at
> schema.org. Can we talk through how this might look as an HTML
> fragment?
> 
> First, the current example:
> 
> <body itemscope itemtype="http://schema.org/WebPage">
> ...
> <div itemprop="breadcrumb">
>  <a href="category/books.html">Books</a> >
>  <a href="category/books-literature.html">Literature & Fiction</a> >
>  <a href="category/books-classics">Classics</a>
> </div> ...
> </body>
> 
> Now, let's put that in RDFa 1.1, with the whole markup block as the
> value of the 'breadcrumb' property:
> 
> <body typeof="http://schema.org/WebPage">
> ...
> <div property="breadcrumb" datatype="rdf:HTML">
>  <a href="category/books.html">Books</a> >
>  <a href="category/books-literature.html">Literature & Fiction</a> >
>  <a href="category/books-classics">Classics</a>
> </div> ...
> </body>
> 
> 
> While this meets our goal of simple markup, I see a couple of
> potential problems. Firstly the name of the datatype looks a little
> odd from an HTML markup perspective.  Secondly, the RDF spec requires
> that all supporting context, declarations and base URIs be packed into
> the markup. So the relative URIs wouldn't work.
> 
> "Any language annotation (lang="…") or XML namespaces (xmlns) desired
> in the HTML content must be included explicitly in the HTML literal.
> Relative URLs in attributes such as href do not have a well-defined
> base URL and are best avoided."
> 
> My conclusion so far is that our markup would have to be either
> 
> A)
> <body typeof="http://schema.org/WebPage">
> ...
> <div property="breadcrumb" datatype="rdf:HTML">
>  <a href="http://example.com/category/books.html">Books</a> >
>  <a href="http://example.com/category/books-literature.html">Literature
> & Fiction</a> >
>  <a href="http://example.com/category/books-classics">Classics</a>
> </div> ...
> </body>
> 
> B) put base="http://example.com/" in the HTML <head>.
> 
> From http://www.w3.org/TR/2012/REC-rdfa-core-20120607/#s_curieprocessing
> I understand that an RDFa 1.1 parser will help by resolving relative
> URI paths, but only for the values of the core RDFa attributes. Am I
> correct to understand that they will not rewrite rdf:HTML markup
> blocks to make URI references absolute?

URI expansion comes from HTML semantics, and works with any attribute that takes a URL (although it is somewhat broken for @href and @src in HTML5).

> Apologies for the long mail, but both crawl data and schema.org site
> logs show that breadcrumb markup is of great interest to Web
> developers, so I would like to do everything possible to explore the
> design space while we still have some possibility to fine-tune the
> designs at schema.org and in the RDFa/HTML spec.
> 
> Does the direction I sketch make sense, from an RDFa WG perspective?
> Is there anything we can do to make the markup easier for publishers
> and developers? Would another named markup datatype that absolute-ized
> relative links be feasible at this stage? Did I miss any other design
> options? Would more formal requirements analysis be useful?

Another possibility would be to use BNodes for each element, with schema:name and schema:url, which would give you something like the following:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfa: <http://www.w3.org/ns/rdfa#> .
@prefix schema: <http://schema.org/> .

<> rdfa:usesVocabulary schema: .

 [ a schema:WebPage;
    schema:breadcrumb (
      [a schema:Breadcrumb; schema:url <category/books.html>; schema:name "Books" ]
      [a schema:Breadcrumb; schema:url <category/books-literature.html>; schema:name "Literature" ]
      [a schema:Breadcrumb; schema:url <category/books-classics>; schema:name "Classics" ]
  )] .

That could fall out of a reasonable application of @inlist and @typeof. It could work in Microdata too, with greater use of @itemscope and @itemtype.
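
For the RDFa side, an untested sketch of markup that I would expect to produce roughly that graph is below. Note that schema:Breadcrumb is a hypothetical type used only for illustration, and the choice of wrapper <span> elements is an assumption, not a recommendation:

<div vocab="http://schema.org/" typeof="WebPage">
    <div>
      <span property="breadcrumb" inlist typeof="Breadcrumb">
        <a property="url" href="category/books.html"><span property="name">Books</span></a>
      </span> >
      <span property="breadcrumb" inlist typeof="Breadcrumb">
        <a property="url" href="category/books-literature.html"><span property="name">Literature</span></a>
      </span> >
      <span property="breadcrumb" inlist typeof="Breadcrumb">
        <a property="url" href="category/books-classics"><span property="name">Classics</span></a>
      </span>
    </div>
</div>

Here each @property with @typeof (and no @href or @resource on the same element) creates a new BNode that is appended to the list, and the nested @property attributes then describe that BNode. The cost is noticeably heavier markup than the simple list of links.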

Gregg

> cheers,
> 
> Dan
> 

Received on Monday, 12 November 2012 18:40:58 UTC