- From: Ivan Herman <ivan@w3.org>
- Date: Tue, 9 Jan 2018 16:41:43 +0100
- To: Benjamin Young <byoung@bigbluehat.com>
- Cc: Leonard Rosenthol <lrosenth@adobe.com>, Hadrien Gardeur <hadrien.gardeur@feedbooks.com>, W3C Publishing Working Group <public-publ-wg@w3.org>
- Message-Id: <70CD6B2F-27E3-4A1A-9F31-66119EC994A2@w3.org>
> On 9 Jan 2018, at 16:19, Benjamin Young <byoung@bigbluehat.com> wrote:
>
> Given that a Web Publication MUST have a unique, identifying address, making that the value of the JSON-LD's `@id` should clear up the blank node issue.

Look at the generated Turtle. There is an @id, and that works well. But the other "half" of the data is hanging on a blank node, connected to the @id via an owl:sameAs. That is simply the way the @context works.

> Also, since schema:Book is already being used, it seems that http://schema.org/sameAs and http://schema.org/identifier would/could/should be used instead.

But that works in the schema.org environment only. schema:sameAs was created for the purpose of schema.org exactly because owl:sameAs is ignored (even in schema.org). But then we would have a kind of linked data that has very strange semantics outside of schema.org. Is this what we want?

Ivan

> Cheers,
> Benjamin
>
> --
> http://bigbluehat.com/
> http://linkedin.com/in/benjaminyoung
> From: Ivan Herman <ivan@w3.org>
> Sent: Tuesday, January 9, 2018 9:33:09 AM
> To: Leonard Rosenthol
> Cc: Hadrien Gardeur; W3C Publishing Working Group
> Subject: Re: A question on RWPM: why the 'metadata' tag?
>
>
>
> > On 9 Jan 2018, at 15:04, Leonard Rosenthol <lrosenth@adobe.com> wrote:
> >
> > Sorry for coming in late and without background here - been swamped with completely unrelated work lately.
> >
> > It appears (for reasons that aren't clear to me yet) that you are looking to serialize RDF-based data into JSON/JSON-LD, is that correct?
>
> Well… almost but not exactly.
>
> JSON-LD is a funny beast: data in JSON-LD can be looked at as pure JSON to be used by various applications, so we want to make it comfortable for that purpose. However, we should also be careful about the "quality" of the RDF it encodes. These two requirements are sometimes contradictory, and may influence design decisions to find the right balance. What I did was look at this along those lines.
>
> > If so, then I will point you to a new Work Item in ISO TC 130 WG2TF4, the committee where XMP (the industry standard RDF-based metadata model for assets) is standardized. This new work item, which is being done jointly with other groups such as the IPTC, is to standardize a JSON-LD serialization of XMP. Which sounds like *exactly* what you are looking for as well. Yes??
>
> It is obviously close. Thanks for the pointers!
>
> Ivan
>
> >
> > Leonard
> >
> > On 1/9/18, 4:18 AM, "Ivan Herman" <ivan@w3.org> wrote:
> >
> > Sorry, stupid subject line mistake, s/manifest/metadata/ :-)
> >
> > I.
> >
> >> On 9 Jan 2018, at 13:13, Ivan Herman <ivan@w3.org> wrote:
> >>
> >> Hadrien,
> >>
> >> (I did not want to put this as an issue at this moment; it may become relevant if we go down the RWPM way but, until then, this should not yet be an 'official' issue. So let us keep to the ML for now.)
> >>
> >> (My apologies to readers who do not have an RDF background; this is a fairly technical mail…)
> >>
> >> I looked at the opening example in [1]. I was curious to see what all this looks like in RDF. I used a converter that generated the following triples (I use Turtle, which is much more RDF-like…):
> >>
> >> <<<<
> >>
> >> @prefix ns1: <owl:> .
> >> @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
> >> @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
> >> @prefix schema: <http://schema.org/> .
> >> @prefix xml: <http://www.w3.org/XML/1998/namespace> .
> >> @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
> >>
> >> <urn:isbn:978031600000X> a schema:Book ;
> >>     schema:author "Herman Melville" ;
> >>     schema:dateModified "2015-09-29T17:00:00+00:00"^^xsd:dateTime ;
> >>     schema:inLanguage "en" ;
> >>     schema:name "Moby-Dick" .
> >>
> >> [] schema:hasPart [ schema:fileFormat "text/html" ;
> >>             schema:name "Chapter 2" ;
> >>             schema:url "c002.html" ],
> >>         [ schema:fileFormat "text/html" ;
> >>             schema:name "Chapter 1" ;
> >>             schema:url "c001.html" ] ;
> >>     ns1:sameAs <urn:isbn:978031600000X> .
> >>
> >> <<<<
> >>
> >> Which, mostly, looks fine, except for that trick of using owl:sameAs to identify the canonical object (the book) with a blank node. I see several issues with that:
> >>
> >> - Die hard RDF/Linked Data people really try to avoid the usage of blank nodes, because they are a source of constant problems in various RDF related routines, algorithms, etc. There are cases when they are almost necessary (the objects in the schema:hasPart construction above look perfectly fine to me), but the outer blank node (ie, the '[]') would really put many people off.
> >>
> >> - I have not looked at the RDF tool landscape lately, but, afaik, OWL is often ignored by RDF related tools. owl:sameAs _may_ be an exception here and there (there are triple stores that do an owl:sameAs reasoning against their data) but this is not universal. (E.g., the Python RDFLib tool does not do that automatically, you have to use external libraries.) This also means that, e.g., SPARQL requests may fail on querying (in the example above) the "c002.html" part if they only use the ISBN identifier although, semantically, this should be fine.
> >>
> >> - (I am not sure the schema.org tools are prepared for this, although that may not be a strong argument)
> >>
> >> So I was trying to get rid of this. The usage of owl:sameAs here is the artefact of mapping the "metadata" term against "owl:sameAs" in the context file. This is necessary because, in the RWPM, you do have this separate "metadata" term:
> >>
> >>
> >> <<<<
> >>
> >> "metadata" : {
> >>   "@type": "http://schema.org/Book",
> >>   "title": "Moby-Dick",
> >>   "author": "Herman Melville",
> >>   "identifier": "urn:isbn:978031600000X",
> >>   "language": "en",
> >>   "modified": "2015-09-29T17:00:00Z"
> >> }
> >>
> >> <<<<
> >>
> >> and if "metadata" is not mapped in the context, it will be ignored (together with the JSON object).
> >>
> >> Hence the question: what does this "metadata" term bring to the table in the first place? Why can't one have the example in [1] be simply
> >>
> >> <<<<
> >> {
> >>   "@context": "http://…",
> >>   "identifier": "urn:isbn:978031600000X",
> >>   "author": "Herman Melville",
> >>   …
> >>   "spine": [
> >>     …
> >>   ]
> >>   …
> >> }
> >> <<<<
> >>
> >> It strikes me as much more straightforward for authors/users as well. I have used this and got the much more straightforward Turtle output:
> >>
> >> <<<<
> >> @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
> >> @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
> >> @prefix schema: <http://schema.org/> .
> >> @prefix xml: <http://www.w3.org/XML/1998/namespace> .
> >> @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
> >>
> >> <urn:isbn:978031600000X> a schema:Book ;
> >>     schema:author "Herman Melville" ;
> >>     schema:dateModified "2015-09-29T17:00:00+00:00"^^xsd:dateTime ;
> >>     schema:hasPart [ schema:fileFormat "text/html" ;
> >>             schema:name "Chapter 2" ;
> >>             schema:url "c002.html" ],
> >>         [ schema:fileFormat "text/html" ;
> >>             schema:name "Chapter 1" ;
> >>             schema:url "c001.html" ] ;
> >>     schema:inLanguage "en" ;
> >>     schema:name "Moby-Dick" .
> >> <<<<
> >>
> >> I can imagine that there *are* some terms that you do not want to appear in RDF. And that is fine: you already use the trick (e.g., for resources) whereby a term that has no mapping in a JSON-LD context (or is not a URI by itself) is ignored by a JSON-LD processor, i.e., you can hide anything you want.
> >>
> >> WDYT?
> >>
> >> Ivan
> >>
> >> P.S. Note, b.t.w., that the JSON-LD 1.1 document [2], which is currently a CG draft, introduces the notion of 'nested properties' [3], which does something similar: it essentially says "ignore this term and the resulting nesting, it is semantically meaningless". I.e., if "metadata" were defined as "@nest" in the context, we would get the same simplified Turtle. There are plans to submit the JSON-LD 1.1 work to a possible WG at W3C, to be issued as a new version of JSON-LD, but that is not yet in motion.
> >>
> >> [1] https://github.com/readium/webpub-manifest
> >> [2] https://json-ld.org/spec/latest/json-ld
> >> [3] https://json-ld.org/spec/latest/json-ld/#nested-properties
> >>
> >> ----
> >> Ivan Herman, W3C
> >> Publishing@W3C Technical Lead
> >> Home: http://www.w3.org/People/Ivan/
> >> mobile: +31-641044153
> >> ORCID ID: http://orcid.org/0000-0003-0782-2704

----
Ivan Herman, W3C
Publishing@W3C Technical Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
ORCID ID: http://orcid.org/0000-0003-0782-2704
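
The SPARQL concern raised in the thread (a lookup by ISBN missing the chapters when they hang on a blank node tied to the book only by owl:sameAs) can be illustrated with a minimal sketch in plain Python, without any RDF library. Everything here is an assumption made for illustration: the triple lists are abridged by hand from the two Turtle outputs quoted in the thread, the blank-node labels (`_:b0` and so on) are invented, and `part_urls` is a hypothetical helper that stands in for a query engine doing no owl:sameAs reasoning.

```python
# Triples are (subject, predicate, object) tuples.
# Strings starting with "_:" stand for blank nodes; the labels are invented.
BOOK = "urn:isbn:978031600000X"

# Abridged shape of the first Turtle output: the parts hang on a blank
# node that is linked to the book only through owl:sameAs.
with_same_as = [
    (BOOK, "schema:name", "Moby-Dick"),
    ("_:b0", "owl:sameAs", BOOK),
    ("_:b0", "schema:hasPart", "_:b1"),
    ("_:b0", "schema:hasPart", "_:b2"),
    ("_:b1", "schema:url", "c001.html"),
    ("_:b2", "schema:url", "c002.html"),
]

# Abridged shape of the second Turtle output: no "metadata" wrapper,
# so the parts are attached to the book directly.
flattened = [
    (BOOK, "schema:name", "Moby-Dick"),
    (BOOK, "schema:hasPart", "_:b1"),
    (BOOK, "schema:hasPart", "_:b2"),
    ("_:b1", "schema:url", "c001.html"),
    ("_:b2", "schema:url", "c002.html"),
]


def part_urls(triples, subject):
    """Return the sorted schema:url values of the subject's schema:hasPart
    objects, matching triples literally (no owl:sameAs reasoning)."""
    parts = {o for s, p, o in triples if s == subject and p == "schema:hasPart"}
    return sorted(o for s, p, o in triples if s in parts and p == "schema:url")


print(part_urls(with_same_as, BOOK))  # []
print(part_urls(flattened, BOOK))     # ['c001.html', 'c002.html']
```

A store that does apply owl:sameAs reasoning would treat `_:b0` and the ISBN node as the same resource and return both chapters in the first case too; the sketch only shows what happens when that reasoning is absent.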
Received on Tuesday, 9 January 2018 15:41:59 UTC