Re: recommended pattern for markup-valued 'breadcrumb' properties in RDFa

On 6 Feb 2013 14:40, "Niklas Lindström" <lindstream@gmail.com> wrote:
>
> Hi Dan, all,
>
> I think these are all good examples of what's possible. It shows that
> with full RDFa, you can achieve fairly compact forms that capture a
> lot of information.
>
> In fact, even within the constraints of Lite, we can make use of an
> old, deprecated friend if we want to:
>
>     <ul property="breadcrumb" typeof="rdf:Seq">
>       <li property="rdf:_1" typeof>... url and name ...</li> >
>       <li property="rdf:_2" typeof>... url and name ...</li> >
>     </ul>
>
> But I don't think we want to.. :)

Correct, this is undeployably ugly - the last 12 years of RSS 1.0 history
make that quite clear.

> Anyway, this has been mostly about what might be palatable for an
> author, depending on taste and motives (ranging from "throw something
> out there" to "mark up as much information as possible").
>
> And the important question is another. What is the breadcrumb property
> of a page intended for, in terms of how consumers are supposed to use
> the data? I wonder what needs there are here for rich semantics? For
> instance, I can't really think of many practical cases for doing
> SPARQL queries over huge datasets of (disjoint) interlinked breadcrumb
> resources. :) And if I want to expose a rich and complex set of
> interrelated, hierarchical pages, I would name the links between the
> pages (in each distinct page) using e.g. DC or SIOC. I *may* do that
> within a breadcrumb block, but if so I'd use e.g. dc:isPartOf, and not
> the instrumental concept of the breadcrumb.

I think it's an incremental thing. For now, ordered list of links;
eventually exposing site-wide categories more explicitly...
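
To make "ordered list of links" concrete from the consumer side, here is a
rough Python sketch. It assumes the page's triples have already been
extracted by some RDFa 1.1 processor and are held as plain (subject,
predicate, object) string tuples. The read_list helper and the sample
triples are made up for illustration; they are not from any particular
toolkit.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SCHEMA = "http://schema.org/"
NIL = RDF + "nil"


def read_list(triples, head):
    """Follow rdf:first/rdf:rest from `head` and return the members in order."""
    first = {s: o for s, p, o in triples if p == RDF + "first"}
    rest = {s: o for s, p, o in triples if p == RDF + "rest"}
    items, seen, node = [], set(), head
    # Stop at rdf:nil, at a node with no rdf:first, or if the chain loops.
    while node != NIL and node in first and node not in seen:
        seen.add(node)
        items.append(first[node])
        node = rest.get(node, NIL)
    return items


# Hypothetical triples (deliberately out of order), roughly what @inlist
# markup on a WebPage might produce once relative URLs are resolved.
triples = [
    ("http://example.com/page", SCHEMA + "breadcrumb", "_:l1"),
    ("_:l2", RDF + "first", "http://example.com/category/books-literature.html"),
    ("_:l1", RDF + "first", "http://example.com/category/books.html"),
    ("_:l1", RDF + "rest", "_:l2"),
    ("_:l2", RDF + "rest", NIL),
]

head = next(o for s, p, o in triples if p == SCHEMA + "breadcrumb")
crumbs = read_list(triples, head)
print(crumbs)
```

The ordering lives entirely in the rdf:rest chain, so the order in which a
store hands back the triples does not matter.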

> So for capturing the digested, solid form of a breadcrumb, I think Dan
> is spot on: the rdf:HTML form is apt for the job. And I do think that
> @datatype="rdf:HTML" can be palatable enough for the authors to throw
> it in if prescribed by schema.org (barring that it's not RDFa Lite).

Thanks. Is there any way we can subtype the datatype to have other names
that are more breadcrummy?

Re relative links.... yeah that could be an issue..
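
For what it's worth, here is a rough consumer-side sketch of the
"resolve later" idea Niklas describes in the quoted text: keep the page URL
next to the rdf:HTML literal, then resolve each href against it when the
fragment is parsed. This uses only the Python standard library; the
CrumbParser class, the page URL and the sample fragment are invented for
illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class CrumbParser(HTMLParser):
    """Collect (absolute URL, link text) pairs from <a> elements, in order."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.crumbs = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                # Resolve relative links against the stored page URL.
                self._href = urljoin(self.page_url, href)
                self._text = []

    def handle_data(self, data):
        # Only text inside a link counts; separators between links are skipped.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            name = " ".join("".join(self._text).split())
            self.crumbs.append((self._href, name))
            self._href = None


# Hypothetical stored values: the WebPage URL and its breadcrumb literal.
page_url = "http://example.com/item42.html"
literal = (
    '<a href="category/books.html">Books</a> &gt; '
    '<a href="category/books-literature.html">Literature &amp; Fiction</a> &gt; '
    '<a href="category/books-classics">Classics</a>'
)

parser = CrumbParser(page_url)
parser.feed(literal)
parser.close()
for url, name in parser.crumbs:
    print(url, name)
```

This only works if the page URL really is stored alongside the literal,
which is the point of getting the subject of the breadcrumb statement right.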

Dan

> Then to consuming that data (e.g. parsing the HTML fragment afterwards
> for doing something special in a service). For the relative links,
> just capture the web page URL as well, so they can be resolved against
> that later on. I suggest:
>
>     <body vocab="http://schema.org/" typeof="WebPage">
>       ...
>       <nav property="breadcrumb" datatype="rdf:HTML">
>         ...
>
> By putting the @typeof on the <body> and/or adding an explicit
> @resource="", the subject of the breadcrumb statement will be the URL
> of the current page – as it should be (the page is the WebPage)! Since
> that is then stored, just resolve the @href:s later on against it, and
> you'll have the full URLs. Or if really necessary, add:
>
>     <link property="url" href="">
>
> (within the @typeof="WebPage" block) and resolve against the resulting
> base URL.
>
> As for language, either recommend an explicit @lang within the literal,
> e.g.:
>
>     <nav property="breadcrumb" datatype="rdf:HTML">
>       <ul lang="en">...
>
> Or the addition of:
>
>     <meta property="inLanguage" content="en">
>
> and an assumption that captured markup is also in that language.
>
> Regarding @datatype not being Lite, both that and @inlist come up
> from time to time. That to me begs the question: could Lite eventually
> be updated? I think that should be influenced by schema.org picking
> prudently from the full set of RDFa features. Which should be very
> much in the spirit of letting many flowers bloom. ;)
>
> Cheers,
> Niklas
>
> On Wed, Feb 6, 2013 at 5:48 AM, Dan Brickley <danbri@danbri.org> wrote:
> > On 5 February 2013 15:10, Stéphane Corlosquet <scorlosquet@gmail.com> wrote:
> >> Hi Dan,
> >>
> >> Like Gregg, I'm not a big fan of shoving the whole breadcrumb items into a
> >> single rdf:HTML value.
> >>
> >> First maybe we should define what info you want to capture in the
> >> breadcrumbs. I'm going to assume that you want to have the URL and the name
> >> of each item, and each item should be typed with
> >> http://schema.org/Breadcrumb (correct me if I'm wrong with this assumption).
> >> If you're interested in only the URLs or only the names, the markup becomes
> >> much simpler.
> >
> > We want all that, plus the ordering of the items too. Painful in RDF,
> > isn't it?
> >
> > Dan
> >
> >> ## 1. @rel and no schema:url
> >>
> >> The simpler and shorter option involves just adding a span element inside
> >> each breadcrumb item and wrapping everything with @rel and @inlist
> >> attributes:
> >>
> >> <div vocab="http://schema.org/" typeof="WebPage">
> >>     <div rel="breadcrumb" inlist="">
> >>       <a typeof="Breadcrumb" href="category/books.html"><span property="name">Books</span></a> >
> >>       <a typeof="Breadcrumb" href="category/books-literature.html"><span property="name">Literature and Fiction</span></a> >
> >>     </div>
> >> </div>
> >>
> >> which yields:
> >>
> >>  [ a schema:WebPage;
> >>     schema:breadcrumb (<category/books.html>
> >> <category/books-literature.html>)] .
> >>
> >> <category/books-literature.html> a schema:Breadcrumb;
> >>    schema:name "Literature and Fiction" .
> >>
> >> <category/books.html> a schema:Breadcrumb;
> >>    schema:name "Books" .
> >>
> >> I like this option the best because the markup is very succinct and doesn't
> >> repeat any data. In this case, you don't get an explicit schema:url, but
> >> instead you get this value from the URI of each breadcrumb resource (no
> >> blank nodes either!).
> >>
> >>
> >> ## 2. no @rel and no schema:url
> >>
> >> Like Gregg said, you can also assert @property for each item if you want to
> >> avoid a wrapping @rel. This example yields the same output as the previous
> >> one:
> >>
> >> <div vocab="http://schema.org/" typeof="WebPage">
> >>     <div>
> >>       <a property="breadcrumb" typeof="Breadcrumb" inlist="" href="category/books.html"><span property="name">Books</span></a> >
> >>       <a property="breadcrumb" typeof="Breadcrumb" inlist="" href="category/books-literature.html"><span property="name">Literature and Fiction</span></a> >
> >>     </div>
> >> </div>
> >>
> >>
> >> ## 3. @rel and schema:url
> >>
> >> If an explicit schema:url is required, it is still possible at the expense
> >> of more markup:
> >>
> >> <div vocab="http://schema.org/" typeof="WebPage">
> >>     <div rel="breadcrumb" inlist="">
> >>       <span typeof="Breadcrumb"><a property="url" href="category/books.html"><span property="name">Books</span></a></span> >
> >>       <span typeof="Breadcrumb"><a property="url" href="category/books-literature.html"><span property="name">Literature and Fiction</span></a></span>
> >>     </div>
> >> </div>
> >>
> >> which yields
> >>
> >>  [ a schema:WebPage;
> >>     schema:breadcrumb ([ a schema:Breadcrumb;
> >>         schema:name "Books";
> >>         schema:url <category/books.html>] [ a schema:Breadcrumb;
> >>         schema:name "Literature and Fiction";
> >>         schema:url <category/books-literature.html>])] .
> >>
> >>
> >> ## 4. no @rel with schema:url
> >>
> >> Finally, the previous example with explicit schema:url also works with
> >> inline @property attributes and gives the same output:
> >>
> >> <div vocab="http://schema.org/" typeof="WebPage">
> >>     <div>
> >>       <span property="breadcrumb" typeof="Breadcrumb" inlist=""><a
> >> property="url" href="category/books.html"><span
> >> property="name">Books</span></a></span> >
> >>       <span property="breadcrumb" typeof="Breadcrumb" inlist=""><a
> >> property="url" href="category/books-literature.html"><span
> >> property="name">Literature and Fiction</span></a></span> >
> >>     </div>
> >> </div>
> >>
> >>
> >> In conclusion, could you live without an explicit schema:url? It does reduce
> >> the amount of markup quite a bit, which is quite crucial in the context of
> >> breadcrumbs where there can be a lot of items.
> >>
> >> Re. Egor's proposal where each child item is wrapped into its parent, I'm
> >> not sure the HTML for that is very intuitive. I'd prefer to just have a flat
> >> bunch of elements rather than nesting them in HTML; it's less error-prone
> >> IMO. His first argument was "Current breadcrumbs cannot be stored in an
> >> unordered storage like JSON", but afaik, JSON arrays preserve order. The
> >> second argument is about multiple breadcrumb chains, and I admit it's a valid
> >> argument to have hierarchies of breadcrumb items in this scenario, but is
> >> this use case very common in practice? Could we see some examples?
> >>
> >> HTH,
> >> Steph.
> >>
> >>
> >>
> >> On Mon, Nov 12, 2012 at 1:40 PM, Gregg Kellogg <gregg@greggkellogg.net>
> >> wrote:
> >>>
> >>> On Nov 12, 2012, at 4:24 AM, Dan Brickley <danbri@danbri.org> wrote:
> >>>
> >>> > Dear RDFa WG,
> >>> >
> >>> > I'm looking for some advice on schema.org markup options. I hope to
> >>> > join the WG shortly but wanted to start a conversation as early as
> >>> > possible.
> >>> >
> >>> > Schema.org's markup for breadcrumbs is both popular and (currently)
> >>> > broken. The issue at http://www.w3.org/2011/webschema/track/issues/10
> >>> > gives some backstory, but factors include Microdata's rule for
> >>> > concatenating subelements, as well as the difficulty of representing
> >>> > ordered lists of link/label pairs as simple triples without complex
> >>> > markup. For the purposes of this mail, I am only interested in the
> >>> > RDFa 1.1 possibilities.
> >>> >
> >>> > Egor (cc:'d) has made a draft of a proposal for improving our design,
> >>> > http://www.w3.org/wiki/WebSchemas/Breadcrumbs . This draft explores an
> >>> > approach that makes explicit, within the extracted graph, the ordering,
> >>> > labelling and URLs from a 'breadcrumbs' section of HTML.
> >>> >
> >>> > I would very much like to get the RDFa WG's perspective on this issue.
> >>>
> >>> Well, I can give you my perspective on this issue. From a Linked Data/RDF
> >>> perspective, I would expect breadcrumbs to give me an ordered list of
> >>> links to the relevant resources, not HTML markup that has meaning only to a
> >>> human.
> >>>
> >>> From a Microdata+RDF perspective, schema:breadcrumb is described as a
> >>> property having an ordered list of values, so that parsing the following
> >>> yields a list in Turtle:
> >>>
> >>> <div itemscope itemtype="http://schema.org/WebPage">
> >>>     <div itemprop="breadcrumb">
> >>>       <a href="category/books.html">Books</a> >
> >>>       <a href="category/books-literature.html">Literature and Fiction</a> >
> >>>       <a href="category/books-classics">Classics</a>
> >>>     </div>
> >>> </div>
> >>>
> >>> @prefix md: <http://www.w3.org/ns/md#> .
> >>> @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
> >>> @prefix rdfa: <http://www.w3.org/ns/rdfa#> .
> >>> @prefix schema: <http://schema.org/> .
> >>>
> >>> <> md:item ([ a schema:WebPage;
> >>>        schema:breadcrumb ("""
> >>>       Books >
> >>>       Literature and Fiction >
> >>>       Classics
> >>>     """)]);
> >>>    rdfa:usesVocabulary schema: .
> >>>
> >>> The intention was for each link to be a URI in this list, so you could do
> >>> the following, instead:
> >>>
> >>> <div itemscope itemtype="http://schema.org/WebPage">
> >>>     <div>
> >>>       <a itemprop="breadcrumb" href="category/books.html">Books</a> >
> >>>       <a itemprop="breadcrumb" href="category/books-literature.html">Literature and Fiction</a> >
> >>>       <a itemprop="breadcrumb" href="category/books-classics">Classics</a>
> >>>     </div>
> >>> </div>
> >>>
> >>> Which would give you:
> >>>
> >>> @prefix md: <http://www.w3.org/ns/md#> .
> >>> @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
> >>> @prefix rdfa: <http://www.w3.org/ns/rdfa#> .
> >>> @prefix schema: <http://schema.org/> .
> >>>
> >>> <> md:item ([ a schema:WebPage;
> >>>        schema:breadcrumb (<category/books.html>
> >>> <category/books-literature.html> <category/books-classics>)]);
> >>>    rdfa:usesVocabulary schema: .
> >>>
> >>> In RDFa 1.1 (not Lite), you can do this with @inlist and @rel:
> >>>
> >>> <div vocab="http://schema.org/" typeof="WebPage">
> >>>     <div rel="breadcrumb" inlist>
> >>>       <a  href="category/books.html">Books</a> >
> >>>       <a href="category/books-literature.html">Literature and Fiction</a> >
> >>>       <a href="category/books-classics">Classics</a>
> >>>     </div>
> >>> </div>
> >>>
> >>> Giving:
> >>>
> >>> @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
> >>> @prefix rdfa: <http://www.w3.org/ns/rdfa#> .
> >>> @prefix schema: <http://schema.org/> .
> >>>
> >>> <> rdfa:usesVocabulary schema: .
> >>>
> >>>  [ a schema:WebPage;
> >>>     schema:breadcrumb (<category/books.html>
> >>> <category/books-literature.html> <category/books-classics>)] .
> >>>
> >>> In RDFa 1.1 Lite, you'd need to use @property and repeat both @property
> >>> and @inlist on each <a>, but I don't think @inlist is officially part of
> >>> RDFa 1.1 Lite.
> >>>
> >>> > Looking at
> >>> > http://www.w3.org/TR/2012/REC-rdfa-core-20120607/#markup-fragments-and-rdfa
> >>> > and http://www.w3.org/TR/2012/REC-rdfa-core-20120607/#s-xml-literals
> >>> > it seems an alternate design might be possible with RDFa. Instead of
> >>> > trying to make the entire 'breadcrumb' structure explicit as a graph,
> >>> > we could put the whole breadcrumb into a single property value as a
> >>> > larger piece of markup. The current spec shows this example:
> >>> >
> >>> > <h2 property="dc:title" datatype="rdf:XMLLiteral">
> >>> >  E = mc<sup>2</sup>: The Most Urgent Problem of Our Time
> >>> > </h2>
> >>> >
> >>> > ...presumably this will be adjusted in the HTML+RDFa world. There was
> >>> > discussion in the RDF WG earlier this year towards HTMLLiteral or HTML
> >>> > as a datatype;
> >>> > http://lists.w3.org/Archives/Public/public-rdf-wg/2012May/0612.html
> >>> > and the latest drafts now have such a datatype:
> >>> >
> >>> > http://dvcs.w3.org/hg/rdf/raw-file/default/rdf-concepts/index.html#section-html
> >>> > http://www.w3.org/TR/2012/WD-rdf11-concepts-20120605/#section-html
> >>> > (latest public and editor's drafts seem identical)
> >>>
> >>> Not a fan of this use case, but I believe that our intention is to support
> >>> rdf:HTML in HTML+RDFa 1.1; certainly my processor does.
> >>>
> >>> > "5.2 The rdf:HTML Datatype
> >>> >
> >>> > RDF provides for HTML content as a possible literal value. This allows
> >>> > markup in literal values. Such content is indicated in an RDF graph
> >>> > using a literal whose datatype is a special built-in datatype
> >>> > rdf:HTML. This datatype is defined as follows[...]"
> >>> >
> >>> > Let's look at the older Microdata example we still publish at
> >>> > schema.org. Can we talk through how this might look as an HTML
> >>> > fragment?
> >>> >
> >>> > First, the current example:
> >>> >
> >>> > <body itemscope itemtype="http://schema.org/WebPage">
> >>> > ...
> >>> > <div itemprop="breadcrumb">
> >>> >  <a href="category/books.html">Books</a> >
> >>> >  <a href="category/books-literature.html">Literature & Fiction</a> >
> >>> >  <a href="category/books-classics">Classics</a>
> >>> > </div> ...
> >>> > </body>
> >>> >
> >>> > Now, let's put that in RDFa 1.1, with the whole markup block as the
> >>> > value of the 'breadcrumb' property:
> >>> >
> >>> > <body typeof="http://schema.org/WebPage">
> >>> > ...
> >>> > <div property="breadcrumb" datatype="rdf:HTML">
> >>> >  <a href="category/books.html">Books</a> >
> >>> >  <a href="category/books-literature.html">Literature & Fiction</a> >
> >>> >  <a href="category/books-classics">Classics</a>
> >>> > </div> ...
> >>> > </body>
> >>> >
> >>> >
> >>> > While this meets our goal of simple markup, I see a couple of
> >>> > potential problems. Firstly the name of the datatype looks a little
> >>> > odd from an HTML markup perspective.  Secondly, the RDF spec requires
> >>> > that all supporting context, declarations and base URIs be packed into
> >>> > the markup. So the relative URIs wouldn't work.
> >>> >
> >>> > "Any language annotation (lang="…") or XML namespaces (xmlns) desired
> >>> > in the HTML content must be included explicitly in the HTML literal.
> >>> > Relative URLs in attributes such as href do not have a well-defined
> >>> > base URL and are best avoided."
> >>> >
> >>> > My conclusion so far is that our markup would have to be either
> >>> >
> >>> > A)
> >>> > <body typeof="http://schema.org/WebPage">
> >>> > ...
> >>> > <div property="breadcrumb" datatype="rdf:HTML">
> >>> >  <a href="http://example.com/category/books.html">Books</a> >
> >>> >  <a href="http://example.com/category/books-literature.html">Literature & Fiction</a> >
> >>> >  <a href="http://example.com/category/books-classics">Classics</a>
> >>> > </div> ...
> >>> > </body>
> >>> >
> >>> > B) put <base href="http://example.com/"> in the HTML <head>.
> >>> >
> >>> > From
> >>> > http://www.w3.org/TR/2012/REC-rdfa-core-20120607/#s_curieprocessing
> >>> > I understand that an RDFa 1.1 parser will help by resolving relative
> >>> > URI paths, but only for the values of the core RDFa attributes. Am I
> >>> > correct to understand that they will not rewrite rdf:HTML markup
> >>> > blocks to make URI references absolute?
> >>>
> >>> URI expansion comes from HTML semantics, and works with any attributes
> >>> that takes a URL (although it is somewhat broken for @href and @src in
> >>> HTML5).
> >>>
> >>> > Apologies for the long mail, but both crawl data and schema.org site
> >>> > logs show that breadcrumb markup is of great interest to Web
> >>> > developers, so I would like to do everything possible to explore the
> >>> > design space while we still have some possibility to fine-tune the
> >>> > designs at schema.org and in the RDFa/HTML spec.
> >>> >
> >>> > Does the direction I sketch make sense, from an RDFa WG perspective?
> >>> > Is there anything we can do to make the markup easier for publishers
> >>> > and developers? Would another named markup datatype that absolute-ized
> >>> > relative links be feasible at this stage? Did I miss any other design
> >>> > options? Would more formal requirements analysis be useful?
> >>>
> >>> Another possibility would be to use BNodes for each element, with
> >>> schema:name and schema:url, which would give you something like the
> >>> following:
> >>>
> >>> @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
> >>> @prefix rdfa: <http://www.w3.org/ns/rdfa#> .
> >>> @prefix schema: <http://schema.org/> .
> >>>
> >>> <> rdfa:usesVocabulary schema: .
> >>>
> >>>  [ a schema:WebPage;
> >>>     schema:breadcrumb (
> >>>       [a schema:Breadcrumb; schema:url <category/books.html>;
> >>>        schema:name "Books" ]
> >>>       [a schema:Breadcrumb; schema:url <category/books-literature.html>;
> >>>        schema:name "Literature" ]
> >>>       [a schema:Breadcrumb; schema:url <category/books-classics>;
> >>>        schema:name "Classics" ]
> >>>   )] .
> >>>
> >>> That could fall out of reasonable application of @inlist and @typeof. It
> >>> could work in Microdata too, with greater use of @itemscope and @itemtype.
> >>>
> >>> Gregg
> >>>
> >>> > cheers,
> >>> >
> >>> > Dan
> >>> >
> >>>
> >>>
> >>
> >>
> >>
> >> --
> >> Steph.
> >

Received on Thursday, 7 February 2013 00:12:09 UTC