Re: "layers" (was Re: the term "named graphs") from Pat Hayes on 2012-04-29 (public-rdf-wg@w3.org from April 2012)

From: Pat Hayes <phayes@ihmc.us>
Date: Sat, 28 Apr 2012 22:44:55 -0500
To: Sandro Hawke <sandro@w3.org>
Cc: Dan Brickley <danbri@danbri.org>, Andy Seaborne <andy.seaborne@epimorphics.com>, public-rdf-wg@w3.org
Message-Id: <EC1F1B40-22CC-494C-BD2B-0FBE6E766EFA@ihmc.us>

I am not following this AT ALL.  Can you expand slightly? Detailed questions inline below.

On Apr 28, 2012, at 8:04 AM, Sandro Hawke wrote:

> On Sat, 2012-04-28 at 12:57 +0200, Dan Brickley wrote:
>> On 28 April 2012 11:58, Andy Seaborne <andy.seaborne@epimorphics.com> wrote:
>>> On 28/04/12 05:49, Sandro Hawke wrote:
>>>> My concern is with how people
>>>> use the term in practice, and whether that usage conflicts with the
>>>> formal definition.
>>> 
>>> General usage is sloppy, imprecise and changes as convenient. Ambiguity in
>>> spoken language is normal.  We all manage.  But we are not all managers.
>> 
>> Yes, we have the same issue with 'property', 'triple', 'statement' and
>> others; each might (if we're lucky) have a precise W3C RDF meaning,
>> but they shade into other related uses that there can't be such strict
>> standardised control over.
>> 
>> Property is probably the oddest. Sometimes in computing 'color',
>> 'size' etc. are themselves called properties, sometimes the size of
>> some particular thing is counted as one of its properties, and if it
>> had two different colours, they're each a property. So RDF properties
>> are a close cousin to
>> http://en.wikipedia.org/wiki/Property_(programming) but different too;
>> I think we gain more by neighbourhood benefits than we suffer from
>> sloppiness and confusion there.
> 
> Yeah, lots of situations distinguish between attributes and properties
> and relations; we mush them together.   If we were starting with a blank
> slate, I'd suggest that "aspect" might be the best match for what we
> mean.

Blech. In logic they are all relations with different numbers of arguments. All other names are just flimflam. "Aspect" sounds totally wrong to me. Are our fathers aspects of us? If I own a house, is it one of my aspects? Doesn't make the first cut.

> 
>> "Named graph" by contrast is pretty much our phrase to do with as we
>> will (e.g. first page of google results are all "ours") . My guess is
>> that usage will get murky if we don't have a sloppier not-so-nitpicky
>> phrase also to throw around.
> 
> I think "named graph" is already being thrown around quite sloppily.
> Trying to answer my own survey, I found even I was very comfortable with
> the sloppy usage.
> 
> I'd rather come up with some precise new terms, and allow "named graph"
> that sloppy usage (in addition to the precise (u.G) meaning it also has
> in the SPARQL spec).
> 
>> I've pretty much convinced myself that "layer" is the best metaphor
>> there, and that we could productively encourage talk of data 'layers'
>> while leaving 'named graph' as the thing that has a much more rigid
>> official meaning.
> 
> I like that term, "layer".  Excellent....

To me it suggests a vertical dimension being involved. One layer is 'above' another, and this presumably means something rather important. Deep layers are deep for a reason, I guess... What reason?  (if not, why this usage?) 

> 
> For me, it works pretty well for what the fourth term in the quad
> denotes.

?? Not to me. What is the intuitive depth dimension for quad-4ths? Is it the date the IRI was coined, maybe?  When does one layer cover or hide another? 

>  It's similar to "graph container" but suggests much more
> strongly that it functions best as part of a whole.

Wha?? What is it about "layer" that suggests that? The Cretaceous is a layer. What is it part of?  (Do we speak the same language? Come to think of it, maybe not.)

>  Different
> subgraphs are in different layers; you can look at just one layer, or
> look at the union of several layers.   It's not entirely intuitive that
> nodes and arcs (especially blank nodes) can be in multiple layers,

Indeed. FWIW, this is exactly what the 'surfaces' idea was supposed to *prevent*. Each bnode is on a single unique surface: that was the whole point. 

> but
> it's not too counter-intuitive either, I think.   I picture any node
> that occurs in the same location on two layers as being the same node.

Aaarghh. My head is exploding. The same LOCATION in two layers? What can that possibly even MEAN? That's like saying the same street in two countries.

> 
> So, let's look at the example dataset I used on the survey, without its
> default graph for now:
> 
>                @prefix :    <http://example.org/>
>                :g1 { :a :b 10 }
>                :g2 { :a :b 20 }
>                :g3 { :a :b 10 }
> 
> If you make the unique names assumption, then we have three layers.
> 
> If you make the closed world assumption (as one has to do during some
> database operations) -- that the triples we see here are all the triples
> there are in these layers, then we have either two or three layers.  We
> can't tell if g1 and g3 name the same layer.   Since they have  the same
> set of triples on them, they might be the same layer.  We can tell that
> g1 and g3 each do not name the same layer as g2, since clearly their
> layers have different triples on them.

But with incomplete semantics, each could have the ones it has and some others, so they could all be the same or indeed all different (maybe :g1 also includes :c :d 15 and :g3 also has :f :g 26)
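
A minimal Python sketch of that point, purely illustrative: the function name `may_corefer` and the tuple encoding of triples are assumptions made for this example, not anything defined by RDF or SPARQL. It asks whether two names *could* denote the same thing, given only the triples seen so far, under the UNA, the CWA, or neither.

```python
from itertools import combinations

# Triples observed under each name in the example dataset from this thread.
# Encoding triples as plain tuples is an assumption made for illustration.
observed = {
    "g1": {("a", "b", 10)},
    "g2": {("a", "b", 20)},
    "g3": {("a", "b", 10)},
}

def may_corefer(n1, n2, closed_world, unique_names=False):
    """Could names n1 and n2 denote the same thing, given what we have seen?"""
    if unique_names:
        # UNA: distinct names always denote distinct things.
        return n1 == n2
    if closed_world:
        # CWA: what we see is everything, so the triple sets must match.
        return observed[n1] == observed[n2]
    # Neither assumption: unseen triples could make up any difference.
    return True

for n1, n2 in combinations(sorted(observed), 2):
    print(n1, n2,
          "CWA:", may_corefer(n1, n2, closed_world=True),
          "open:", may_corefer(n1, n2, closed_world=False))
```

Under the closed-world reading only g1 and g3 remain candidates for being the same; with neither assumption, every pair does.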

But in any case, isn't this all about named graphs? What has this got to do with layers?

> 
> If you don't make either the UNA or the CWA, it would be possible that
> even g1 and g2 would be names for the same layer.   For example, if we
> later learned...
> 
>                :g2 { :a :b 10 }
>                :g1 { :a :b 20 }
> 
> then, as far as we knew, the layers would have the same triples on them.
> 
> I continue to like that a lot. 
> 
> What about Web Architecture?  If the name of a layer is dereferenceable,
> is it reasonable to expect/require the returned content to be a
> serialization of all the triples on that layer?   I think it probably
> is.    So we'd start to think about the different published and
> maintained foaf files, doap files, environmental quality surveys (in
> RDF), etc, each being a "layer".   The triples on the layer can change
> over time, but the layer is still a thing.  

That was one of the purposes of the surfaces idea, yes. They are an abstract model for g-boxes. But I didn't pitch this to the WG as we already had g-boxes, and that seemed to work OK.

> 
> So, "layer" is the new "g-box".   (Maybe "RDF Layer" when we need to be
> formal.)  I think it's a much better term.  One cool thing is how much
> it raises the question, "layer in what?"    And that's a great question
> to be asking; of course, a dataset is a collection of layers,
> appropriately stacked, with their names attached, and one of them
> flagged as the "default".
> 
> Preliminary +100 on "layer".

If we are going to start pitching names, I will re-open my "surfaces" idea, described in http://www.slideshare.net/PatHayes/rdf-redux. It does have the great merit of completely getting rid of the merge/union terminology. Those slides don't mention SPARQL datasets, but in that terminology they would be a set of named surfaces and a default surface, each with an RDF graph on it. Or, if we want bnodes to be shared, they would be a single surface with the named graphs and default graph on it.
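
To make the two options concrete, here is a rough Python sketch. It is not taken from the slides: the `BNode` class, the `place` function, the `"_:"` label convention and the surface identifiers are all hypothetical names invented for this illustration. The only idea it encodes is that a bnode belongs to exactly one surface, so its identity is the pair (surface, label).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BNode:
    surface: str  # hypothetical identifier of the surface the bnode sits on
    label: str    # local label, meaningful only on that surface

def place(graphs, shared_surface):
    """Put each graph's triples on a surface, scoping bnode labels to it."""
    def term(t, surface):
        # Strings starting with "_:" stand for bnode labels (an assumption).
        if isinstance(t, str) and t.startswith("_:"):
            return BNode(surface, t[2:])
        return t

    placed = {}
    for name, triples in graphs.items():
        # Option 1: one surface per graph. Option 2: one surface for all.
        surface = "dataset" if shared_surface else name
        placed[name] = {tuple(term(t, surface) for t in triple)
                        for triple in triples}
    return placed

def bnodes(triples):
    """Collect the bnodes occurring in a set of triples."""
    return {t for triple in triples for t in triple if isinstance(t, BNode)}

graphs = {
    "default": {("_:x", "knows", "alice")},
    "g1": {("_:x", "age", 42)},
}

separate = place(graphs, shared_surface=False)
shared = place(graphs, shared_surface=True)

# Separate surfaces: the two "_:x" are different bnodes.
print(bnodes(separate["default"]) & bnodes(separate["g1"]))
# Single surface: they are the same bnode.
print(bnodes(shared["default"]) & bnodes(shared["g1"]))
```

With separate surfaces no bnode can appear in two graphs, so the merge/union distinction never comes up; with a single surface, sharing is explicit.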

We don't have to stick to the name "surface". I do like words like 'sheet' (of paper), 'page', etc., however.

Pat


> 
>    -- Sandro

------------------------------------------------------------
IHMC                                     (850)434 8903 or (650)494 3973   
40 South Alcaniz St.           (850)202 4416   office
Pensacola                            (850)202 4440   fax
FL 32502                              (850)291 0667   mobile
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Sunday, 29 April 2012 03:45:37 UTC