Re: shapes as classes from Eric Prud'hommeaux on 2014-12-28 (public-data-shapes-wg@w3.org from December 2014)

From: Eric Prud'hommeaux <eric@w3.org>
Date: Sun, 28 Dec 2014 17:16:30 -0500
To: "Johnston, Patrick - Hoboken" <pjohnston@wiley.com>
Cc: "kcoyle@kcoyle.net" <kcoyle@kcoyle.net>, "public-data-shapes-wg@w3.org" <public-data-shapes-wg@w3.org>, Arthur Ryman <ryman@ca.ibm.com>
Message-ID: <20141228221628.GB25780@w3.org>
* Johnston, Patrick - Hoboken <pjohnston@wiley.com> [2014-12-28 14:26-0500]
> I am still struggling to understand the fuss about disconnected graphs. If
> you are prepared to impose constraints then surely there is an implicit
> connection, so they aren't _really_ disconnected. The connection may not
> be realized through a class construct, but it should be realizable through
> a shape, otherwise there is nothing to constrain (I think I am agreeing
> with Holger here). The shape needs a scope on which it can act: that may
> manifest as membership of a class (rdf:type); as some other explicit
> linkage (foo:isConstrainedBy) we may define through this group; as a
> SPARQL query on a graph store; or through something as tenuous as the web
> address through which the graph is rendered. The latter is I assume what
> S35 is trying to highlight? I don't understand why Peter says this would
> be out of scope: RDF evolved from the notion of being able to connect web
> resources*, I would be disappointed if what we cook up here didn't cover
> that as a scenario.
> 
> S35
> ===
> 
> In S35, you use the JSON-LD @graph construct
> (http://json-ld.org/spec/ED/json-ld-syntax/20120522/#named-graphs) to
> create a list of resources under a named graph. Presumably, the underlying
> story is that the developer wanted to provide a means of filtering
> resources available through a given endpoint by their project grouping.
> So, I would browse to <https://a.example.com/acclist>, get back the graph
> which contained the list of projects, as shown in the example, and I could
> then go to <https://a.example.com/acclist#alpha> and get those resources?
> Nigglingly, this doesn't follow convention
> (http://www.w3.org/wiki/HashVsSlash). Or is the intention that I would get
> a fuller graph with all possible resources at
> <https://a.example.com/acclist>? There is a wee bit of context missing
> from this to make it into a functioning user story.
> 
> So, really, Arthur, it looks like the underlying requirement is that you
> want shapes to apply to JSON-LD named graphs (and lists). This I am
> personally fine with, I have never been a fan of reverting to linked lists
> in RDF. However, the resources in the graph are only 'disconnected' in the
> sense that the straight RDF representation loses the notion of the graph
> (I took the example and ran it through http://json-ld.org/playground/):
> 
>  <https://a.example.com/acclist#alpha>
> <http://purl.org/dc/terms/description> "Resources for Alpha project" .
>  <https://a.example.com/acclist#alpha> <http://purl.org/dc/terms/title>
> "Alpha" .
>  <https://a.example.com/acclist#alpha>
> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
> <http://open-services.net/ns/core/acc#AccessContext> .
>  <https://a.example.com/acclist#beta>
> <http://purl.org/dc/terms/description> "Resources for Beta project" .
>  <https://a.example.com/acclist#beta> <http://purl.org/dc/terms/title>
> "Beta" .
>  <https://a.example.com/acclist#beta>
> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
> <http://open-services.net/ns/core/acc#AccessContext> .
>  <https://a.example.com/acclist>
> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
> <http://open-services.net/ns/core/acc#AccessContextList> .
> 
> What isn't clear is an example of the kind of shape you want to apply to
> this graph. All of these resources have an RDF type, so if I wanted to
> constrain by acc:AccessContext I could, and wouldn't need to even think of
> the @graph construct. Would the intention be to constrain every resource
> to be tagged with one or more of the members of acclist, say through a
> specific predicate? What kind of shape relies specifically on the @graph
> construct?

This requirement on its own doesn't appear to require named graphs,
just the ability to match disconnected stuff in a single graph. At
least that's how I understood it in <http://w3.org/brief/NDI4>.
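
As a minimal sketch of that reading (plain Python over (subject,
predicate, object) tuples, not any real shapes engine; the function
names are hypothetical and only the type triples from the S35 example
are included), a whole-graph check for "exactly one
acc:AccessContextList and any number of acc:AccessContext nodes"
might look like this:

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
ACC = "http://open-services.net/ns/core/acc#"

# Type triples taken from the S35 example, as (subject, predicate, object).
TRIPLES = [
    ("https://a.example.com/acclist#alpha", RDF_TYPE, ACC + "AccessContext"),
    ("https://a.example.com/acclist#beta", RDF_TYPE, ACC + "AccessContext"),
    ("https://a.example.com/acclist", RDF_TYPE, ACC + "AccessContextList"),
]


def nodes_of_type(triples, cls):
    """Return the set of subjects that carry an rdf:type arc to cls."""
    return {s for (s, p, o) in triples if p == RDF_TYPE and o == cls}


def check_access_context_graph(triples):
    """Check the graph as a whole; no path between the nodes is needed.

    Returns (violations, contexts). The AccessContext nodes are found
    only because they sit in the same graph as the list node.
    """
    lists = nodes_of_type(triples, ACC + "AccessContextList")
    contexts = nodes_of_type(triples, ACC + "AccessContext")
    violations = []
    if len(lists) != 1:
        violations.append(
            "expected exactly one AccessContextList, found %d" % len(lists)
        )
    # "Zero or more" AccessContext nodes can never fail on count alone.
    return violations, sorted(contexts)


violations, contexts = check_access_context_graph(TRIPLES)
```

Nothing here depends on named graphs: the scope is simply "every
node in this one graph", which is all the requirement seems to need.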


> While by no means part of the 'new wave' of programmers, I can see that a
> more pragmatic approach to this sort of thing would make this far more
> accessible in practical application.
> 
> Another example 
> ===============
> 
> Let's say I publish news articles as HTML web pages on my site, and I mark
> each page up with the semantic tagging of my choice (RDFa or JSON-LD). I
> want to say that every news article has at least one author and a single
> copyright notice, for example. I could say that every article page is a
> manifestation of a specific class identified by a page URI on my site, and
> that leads me to the class-based approach to shaping. I could say that I
> really don't care - I am just working to a deadline and all that interests
> me is that I get the SEO tagging right and my page shows up pretty in
> search results. What I want to do is ensure that every news article I
> publish can be validated, and this is where I want to be able to invoke
> the shape concept.
> 
> Really, I am imposing additional constraints based on
> http://schema.org/Article, even though I am not saying it explicitly
> (schema.org is scared of OWL).
> 
> So all I do is state that every page on my site (a set of graphs?) adheres
> to the constraint that says every page must have at least one author and
> one, and only one, copyright notice, and I call that a shape. In JSON-LD
> terms each page is not necessarily a named graph, it is just a set of
> triples returned when I parse the page. I could represent it as a named
> graph if I set the @id property to the URI of the page of the article.
> However, I don't see that I should _need_ to declare this explicitly -
> here I would merely want to say that all articles on my site (pages tagged
> appropriately - see the examples at http://schema.org/Article) are in
> scope for my shape.
> 
> Large scale stores (S34)
> ========================
> 
> I understand that at large scale you might want to optimize the shapes to
> make them computationally efficient
> (https://www.w3.org/2014/data-shapes/wiki/User_Stories#S34:_Large-scale_dataset_validation),
> so I can see that the scope of a shape could be defined
> in terms of a SPARQL query, or some other similar construct. However, I
> see this as leading to a different set of requirements than those implied
> by S35, so I am not sure why Dimitris has commented that they are
> connected. 
> 
> The soapbox bit
> ===============
> 
> I am not sufficiently well-versed in the theory to understand why a shape
> cannot always be a class in the pure RDF schema sense. I just want a Thing

One issue is that some data doesn't have type arcs and pretty much no
data has type arcs that fully enumerate all of the shapes that it
might fit. For example, your system might use foaf:Person in a couple
of ways with different constraints on what properties must or may
appear. You can attach one schema to foaf:Person for one use but as
soon as you have two, you have mutual inconsistency and effectively
a truth maintenance system.
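
To make that concrete, here is a rough sketch in plain Python (the two
shapes, their cardinalities and the example.org node are invented for
illustration; this is not ShEx or any proposed syntax). The same
foaf:Person node is tested against two shapes, and which one applies
is decided by the caller, not by the type arc:

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical shapes: property -> (min, max) occurrences, None = unbounded.
# Both are meant for foaf:Person data, but in different parts of a system.
CONTACT_SHAPE = {FOAF + "name": (1, 1), FOAF + "mbox": (1, None)}
ANONYMOUS_SHAPE = {FOAF + "name": (1, None), FOAF + "mbox": (0, 0)}

ALICE = "http://example.org/alice"
TRIPLES = [
    (ALICE, RDF_TYPE, FOAF + "Person"),
    (ALICE, FOAF + "name", "Alice"),
    (ALICE, FOAF + "mbox", "mailto:alice@example.org"),
]


def conforms(node, triples, shape):
    """True if node satisfies every cardinality constraint in shape.

    Properties the shape does not mention are ignored (an open shape).
    """
    for prop, (lo, hi) in shape.items():
        count = sum(1 for (s, p, _) in triples if s == node and p == prop)
        if count < lo or (hi is not None and count > hi):
            return False
    return True


as_contact = conforms(ALICE, TRIPLES, CONTACT_SHAPE)      # True
as_anonymous = conforms(ALICE, TRIPLES, ANONYMOUS_SHAPE)  # False
```

If both shapes were bolted onto foaf:Person as class definitions, this
one node would be valid and invalid at once. Kept as separate shapes,
each use of the data just picks the one it cares about.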


> that can have a defined scope (as described above) and one or more
> constraints it imposes on the resources that fall within that scope. If
> that's not always a class, fine. When classes are shapes, though, I would
> like them to be able to be interoperable with OWL constructs. This is
> simply because when I define an ontology for a graph I fully control, I
> actually mean the OWL to manifest as closed world assertions on that
> graph. This is the Stardog ICV approach - call it the 'closed world
> approximation' on the open model.
> 
> I also expect shapes to be able to coexist with existing validation
> mechanisms such as schematron for XML, for example - this is actually
> pretty important as we evolve EPUB in the publishing space (I work for
> John Wiley) to incorporate semantic tagging, and as the web itself absorbs
> the notion of packages which are more than simply web pages
> (http://w3ctag.github.io/packaging-on-the-web/). In other words, RDF
> doesn't exist in isolation, and nor should shapes. Even in the world of
> graph stores, hybrid approaches are starting to take hold, and I would
> expect something post-SPARQL in our future.

I'd be interested in banging out a use case with you. Do you have some leads?


> chz
> Patrick Johnston (magyarblip)
> 
> *Please don't hit me over the head about using the word 'resource'.
> 
> 
> 
> 
> On 12/23/14, 2:06 PM, "Karen Coyle" <kcoyle@kcoyle.net> wrote:
> 
> >Thanks, Eric. The visualization really helps. I can now see that what
> >holds these two together is in the "proxy" statements, and that I wasn't
> >noticing the subtle differences in the URIs. (Also, I do wish that the
> >ORE proxy were a bit more amply defined. [1]) I'm not sure what makes a
> >package a package in Arthur's case. Arthur?
> >
> >kc
> >[1] http://www.openarchives.org/ore/1.0/datamodel.html#Proxies
> >
> >On 12/23/14 9:35 AM, Eric Prud'hommeaux wrote:
> >> * Karen Coyle <kcoyle@kcoyle.net> [2014-12-20 08:22-0800]
> >>>
> >>>
> >>> On 12/19/14 8:11 PM, Peter F. Patel-Schneider wrote:
> >>>> The narrative for S35 says "There is no path from the
> >>>> acc:AccessContextList node to either of the acc:AccessContext nodes.
> >>>> There is an implicit containment relation of acc:AccessContext nodes
> >>>>in
> >>>> the acc:AccessContextList by virtue of these nodes being in the same
> >>>> information resource."  This implicit connection is not part of RDF.
> >>>
> >>> An example would really help here. I have what may be a similar
> >>> example from the Europeana data. I'm not sure if this mailing list
> >>> takes attachments, so the (short) example is here:
> >>>
> >>> http://kcoyle.net/temp/edmtest.ttl
> >>>
> >>> I cut the data down from something with dozens of related files and
> >>> subject headings, but I think I kept the structure intact. The main
> >>> nodes of the model are edm:ProvidedCHO and ore:Aggregation. The data
> >>> is natively in RDF/XML but I have trouble reading that so I
> >>> converted it to TTL.
> >>>
> >>> Q: Is this an example of what is being discussed here?
> >>
> >> Running this through dot (attached), it seems like this includes a
> >> couple bibliographic resources (uh oh, "resources"!) which proxy for a
> >> third. This seems to be a well-connected graph. Arthur's example is of
> >> data which has no connections apart from some implied by being in the
> >> same package.
> >>
> >> <X> a <Foo> .
> >> <Y> a <Foo> .
> >> <Z> a <FooList> .
> >>
> >> The presence of something of type FooList appears to trigger some
> >> special processing which kicks off a search for <Foo>s (and possibly
> >> whines if there aren't any). Arthur, is that right?
> >>
> >> I'm not confident this is a good idea, but to try it out, I mocked up
> >> a notion of a concomitant shape:
> >>
> >> [[
> >>    start= {
> >>      a (oslc:AccessContextList),
> >>      CONCOMITANT @<ContextShape>+
> >>    }
> >>
> >>    <ContextShape> {
> >>      a (oslc:AccessContext),
> >>      dc:description xsd:string,
> >>      dc:title xsd:string
> >>    }
> >> ]]
> >> with a questionable RDF representation:
> >> [[
> >>      rs:property [
> >>          rs:name "???" ;
> >>          se:concomitantShape true ;
> >>          rs:valueShape <ContextShape> ;
> >>          rs:occurs rs:One-or-many ;
> >>      ] ;
> >> ]]
> >>
> >> http://w3.org/brief/NDI4
> >>
> >>
> >>> Thanks,
> >>> kc
> >>>
> >>>
> >>>>
> >>>>
> >>>> peter
> >>>>
> >>>>
> >>>> On 12/19/2014 06:01 PM, Karen Coyle wrote:
> >>>>> DC has at least one similar case, in use today. Can you, however, say
> >>>>> what you
> >>>>> mean by "some characteristic of two nodes"? What "characteristics"
> >>>>> would put
> >>>>> them out of scope?
> >>>>>
> >>>>> kc
> >>>>>
> >>>>> On 12/19/14 4:12 PM, Peter F. Patel-Schneider wrote:
> >>>>>> If the only connection is that they are in the same graph, then it
> >>>>>>might
> >>>>>> be in scope.  However, if there is some indication that the
> >>>>>>connection
> >>>>>> is somehow special because of some characteristic of two nodes
> >>>>>>that
> >>>>>> are both in a particular graph, then I would say that this is out of
> >>>>>> scope.
> >>>>>>
> >>>>>> It appears to me that the latter is the case.
> >>>>>>
> >>>>>> peter
> >>>>>>
> >>>>>>
> >>>>>> On 12/19/2014 12:42 PM, Arthur Ryman wrote:
> >>>>>>> "Peter F. Patel-Schneider" <pfpschneider@gmail.com> wrote on
> >>>>>>>12/19/2014
> >>>>>>> 02:40:44 PM:
> >>>>>>>
> >>>>>>>> From: "Peter F. Patel-Schneider" <pfpschneider@gmail.com>
> >>>>>>>> To: Arthur Ryman/Toronto/IBM@IBMCA, public-data-shapes-wg@w3.org
> >>>>>>>> Date: 12/19/2014 02:41 PM
> >>>>>>>> Subject: Re: shapes as classes
> >>>>>>>>
> >>>>>>>> S35 talks about an implicit connection between acc:AccessContext
> >>>>>>>> nodes
> >>>>>>> and
> >>>>>>>> acc:AccessContextList nodes.  This implicit connection appears to
> >>>>>>>> me to
> >>>>>>> be
> >>>>>>>> outside the scope of RDF.
> >>>>>>>>
> >>>>>>>> peter
> >>>>>>>>
> >>>>>>>
> >>>>>>> Peter,
> >>>>>>> I think this implicit connection is in scope because the concept
> >>>>>>>of an
> >>>>>>> RDF
> >>>>>>> graph is within the scope of RDF. The implicit connection between
> >>>>>>>the
> >>>>>>> nodes is a consequence of them being in the same RDF graph. A shape
> >>>>>>> language should let me describe a constraint such as "The graph
> >>>>>>>must
> >>>>>>> have
> >>>>>>> exactly one node of type acc:AccessContextList, and zero or more nodes
> >>>>>>>of
> >>>>>>> type
> >>>>>>> acc:AccessContext."
> >>>>>>>
> >>>>>>> -- Arthur
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>> --
> >>> Karen Coyle
> >>> kcoyle@kcoyle.net http://kcoyle.net
> >>> m: 1-510-435-8234
> >>> skype: kcoylenet/+1-510-984-3600
> >>>
> >>
> >
> >-- 
> >Karen Coyle
> >kcoyle@kcoyle.net http://kcoyle.net
> >m: 1-510-435-8234
> >skype: kcoylenet/+1-510-984-3600
> >
> 

-- 
-ericP

office: +1.617.599.3509
mobile: +33.6.80.80.35.59

(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.

There are subtle nuances encoded in font variation and clever layout
which can only be seen by printing this message on high-clay paper.
Received on Sunday, 28 December 2014 22:16:41 UTC