Re: A question on RWPM: why the 'metadata' tag?

Please ignore if this is not the actual issue...

If the concern is how the information is grouped in the JSON, so that the
metadata tag exists for readability rather than for modeling purposes,
then JSON-LD 1.1 offers a solution: "nested" properties.  In effect, this
feature allows structure to be present in the JSON that is invisible to RDF.

Thus if the desired JSON looked like:

{
  "id": "isbn:...",
  "type": "Book",
  "metadata": {
    "title": "Moby Dick",
    "author": "Herman Melville"
  }
}

but the RDF should flatten the title and author back onto the node
identified by the ISBN, then the context would be:

{
 ...
 "metadata": "@nest"
 ...
}

See the specification here:
https://json-ld.org/spec/latest/json-ld/#nested-properties
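
To make the effect concrete, here is a minimal Python sketch of what "@nest"
asks a processor to do. It is not a JSON-LD processor and does no context
processing or RDF conversion; the function name lift_nested and its
nest_terms parameter are hypothetical, made up for this illustration. It
only shows the grouping level disappearing while its members stay on the
enclosing node:

```python
import json


def lift_nested(node, nest_terms):
    """Return a copy of `node` in which every key listed in `nest_terms`
    is removed and its members are merged into the enclosing object.

    Illustration only: a real JSON-LD 1.1 processor does this as part of
    expansion, driven by the context, and handles more cases than this.
    """
    flat = {}
    for key, value in node.items():
        if key in nest_terms and isinstance(value, dict):
            # The grouping key carries no meaning of its own; its members
            # are treated as properties of the enclosing node.
            flat.update(lift_nested(value, nest_terms))
        else:
            flat[key] = value
    return flat


# Same shape as the example above; the elided ISBN is kept as-is.
doc = {
    "id": "isbn:...",
    "type": "Book",
    "metadata": {
        "title": "Moby Dick",
        "author": "Herman Melville",
    },
}

flat = lift_nested(doc, nest_terms={"metadata"})
print(json.dumps(flat, indent=2))
```

The printed object has "title" and "author" sitting next to "id" and "type",
which is the shape the RDF conversion would then see.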


Hope that helps!

Rob


On Tue, Jan 9, 2018 at 6:33 AM, Ivan Herman <ivan@w3.org> wrote:

>
>
> > On 9 Jan 2018, at 15:04, Leonard Rosenthol <lrosenth@adobe.com> wrote:
> >
> > Sorry for coming in late and without background here - been swamped with
> completely unrelated work lately :(.
> >
> > It appears (for reasons that aren't clear to me yet) that you are
> looking to serialize RDF-based data into JSON/JSON-LD, is that correct?
>
> Well… almost but not exactly.
>
> JSON-LD is a funny beast: data in JSON-LD can be looked at as pure JSON to
> be used by various applications, so we want to make it comfortable for that
> purpose. However, we should also be careful about the "quality" of the RDF
> it encodes. These two requirements are sometimes contradictory, and finding
> the right balance may influence design decisions. I was looking at this
> along those lines.
>
> > If so, then I will point you to a new Work Item in ISO TC 130 WG2TF4,
> the committee where XMP (the industry standard RDF-based metadata model for
> assets) is standardized.  This new work item, which is being done jointly
> with other groups such as the IPTC, is to standardize a JSON-LD
> serialization of XMP.  Which sounds like *exactly* what you are looking for
> as well.   Yes??
>
> It is obviously close. Thanks for the pointers!
>
> Ivan
>
> >
> > Leonard
> >
> > On 1/9/18, 4:18 AM, "Ivan Herman" <ivan@w3.org> wrote:
> >
> >    Sorry, stupid subject line mistake, s/manifest/metadata/ :-)
> >
> >    I.
> >
> >> On 9 Jan 2018, at 13:13, Ivan Herman <ivan@w3.org> wrote:
> >>
> >> Hadrien,
> >>
> >> (I did not want to put this as an issue at this moment; it may become
> relevant if we go down the RWPM way but, until then, this should not yet be
> an 'official' issue. So let us keep to the ML for now.)
> >>
> >> (My apologies to readers who do not have an RDF background; this is a
> fairly technical mail…)
> >>
> >> I looked at the opening example in [1]. I was curious to see what
> all this looks like in RDF. I used a converter that generated the
> following triples (I use Turtle which is much more RDF-like…):
> >>
> >> <<<<
> >>
> >> @prefix ns1: <owl:> .
> >> @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
> >> @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
> >> @prefix schema: <http://schema.org/> .
> >> @prefix xml: <http://www.w3.org/XML/1998/namespace> .
> >> @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
> >>
> >> <urn:isbn:978031600000X> a schema:Book ;
> >>   schema:author "Herman Melville" ;
> >>   schema:dateModified "2015-09-29T17:00:00+00:00"^^xsd:dateTime ;
> >>   schema:inLanguage "en" ;
> >>   schema:name "Moby-Dick" .
> >>
> >> [] schema:hasPart [ schema:fileFormat "text/html" ;
> >>           schema:name "Chapter 2" ;
> >>           schema:url "c002.html" ],
> >>       [ schema:fileFormat "text/html" ;
> >>           schema:name "Chapter 1" ;
> >>           schema:url "c001.html" ] ;
> >>   ns1:sameAs <urn:isbn:978031600000X> .
> >>
> >> <<<<
> >>
> >> Which, mostly, looks fine, except for that trick of using owl:sameAs to
> identify the canonical object (the book) with a blank node. I see several
> issues with that:
> >>
> >> - Die-hard RDF/Linked Data people really try to avoid the usage of
> blank nodes, because they are a source of constant problems in various
> RDF-related routines, algorithms, etc. There are cases when they are almost
> necessary (the objects in the schema:hasPart construction above look
> perfectly fine to me), but the outer blank node (ie, the '[]') would really
> put many people off.
> >>
> >> - I have not looked at the RDF tool landscape lately, but, afaik, OWL
> is often ignored by RDF-related tools. owl:sameAs _may_ be an exception
> here and there (there are triple stores that do owl:sameAs reasoning
> over their data) but this is not universal. (E.g., the Python RDFLib
> tool does not do that automatically, you have to use external libraries.)
> This also means that, e.g., a SPARQL query for the "c002.html" part (in
> the example above) may return nothing if it only uses the ISBN identifier
> although, semantically, this should be fine.
> >>
> >> - (I am not sure the schema.org tools are prepared for this, although
> that may not be a strong argument)
> >>
> >> So I was trying to get rid of this. The usage of owl:sameAs here is an
> artefact of mapping the "metadata" term to "owl:sameAs" in the context
> file. This is necessary because, in the RWPM, you do have this separate
> "metadata" term:
> >>
> >>
> >> <<<<
> >>
> >> "metadata" : {
> >>   "@type": "http://schema.org/Book",
> >>   "title": "Moby-Dick",
> >>   "author": "Herman Melville",
> >>   "identifier": "urn:isbn:978031600000X",
> >>   "language": "en",
> >>   "modified": "2015-09-29T17:00:00Z"
> >> }
> >>
> >> <<<<
> >>
> >> and if "metadata" is not mapped in the context, it will be ignored
> (together with the JSON object).
> >>
> >> Hence the question: what does this "metadata" term bring to the table
> in the first place? Why can't one have the example in [1] be simply
> >>
> >> <<<<
> >> {
> >>   "@context": "http://…",
> >>   "identifier": "urn:isbn:978031600000X",
> >>   "author": "Herman Melville",
> >>   …
> >>   "spine": [
> >>      …
> >>   ]
> >>   …
> >> }
> >> <<<<
> >>
> >> It strikes me as much more straightforward for authors/users as well. I
> tried this and got the following, simpler Turtle output:
> >>
> >> <<<<
> >> @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
> >> @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
> >> @prefix schema: <http://schema.org/> .
> >> @prefix xml: <http://www.w3.org/XML/1998/namespace> .
> >> @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
> >>
> >> <urn:isbn:978031600000X> a schema:Book ;
> >>   schema:author "Herman Melville" ;
> >>   schema:dateModified "2015-09-29T17:00:00+00:00"^^xsd:dateTime ;
> >>   schema:hasPart [ schema:fileFormat "text/html" ;
> >>           schema:name "Chapter 2" ;
> >>           schema:url "c002.html" ],
> >>       [ schema:fileFormat "text/html" ;
> >>           schema:name "Chapter 1" ;
> >>           schema:url "c001.html" ] ;
> >>   schema:inLanguage "en" ;
> >>   schema:name "Moby-Dick" .
> >> <<<<
> >>
> >> I can imagine that there *are* some terms that you do not want to
> appear in RDF. And that is fine: you already use the trick (e.g., for
> resources) whereby a term that has no mapping in a JSON-LD context (or is
> not a URI by itself) is ignored by a JSON-LD processor, ie, you can hide
> anything you want.
> >>
> >> WDYT?
> >>
> >> Ivan
> >>
> >> P.S. Note, b.t.w., that the JSON-LD 1.1 document [2] introduces the
> notion of 'nested properties' [3], which does something similar: it
> essentially says "ignore this term and the resulting nesting, it is
> semantically meaningless". Ie, if "metadata" were defined as "@nest" in
> the context, we would get the same simplified Turtle. At this moment
> JSON-LD 1.1 is a CG draft; there are plans to submit that work to a
> possible WG at W3C to issue it as a new version of JSON-LD, but that is
> not yet in motion.
> >>
> >>
> >>
> >> [1] https://github.com/readium/webpub-manifest
> >> [2] https://json-ld.org/spec/latest/json-ld
> >> [3] https://json-ld.org/spec/latest/json-ld/#nested-properties
> >>
> >>
> >> ----
> >> Ivan Herman, W3C
> >> Publishing@W3C Technical Lead
> >> Home: http://www.w3.org/People/Ivan/
> >> mobile: +31-641044153
> >> ORCID ID: http://orcid.org/0000-0003-0782-2704
> >>
> >
> >
> >
> >
> >
>
>
>
>


-- 
Rob Sanderson
Semantic Architect
The Getty Trust
Los Angeles, CA 90049

Received on Wednesday, 10 January 2018 16:21:18 UTC