Re: Making progress on graphs from Pat Hayes on 2012-05-16 (public-rdf-wg@w3.org from May 2012)

From: Pat Hayes <phayes@ihmc.us>
Date: Tue, 15 May 2012 23:43:41 -0500
To: Sandro Hawke <sandro@w3.org>
Cc: RDF Working Group WG <public-rdf-wg@w3.org>, Antoine Zimmermann <antoine.zimmermann@emse.fr>
Message-Id: <EA03DD77-D715-417B-9E3E-11840CDCB092@ihmc.us>
On May 15, 2012, at 8:57 PM, Sandro Hawke wrote:

> On Tue, 2012-05-15 at 19:30 -0500, Pat Hayes wrote:
>> On May 14, 2012, at 10:31 AM, Sandro Hawke wrote:
>> 
>>> 97% agreement, Pat.   Good summary of where we are.
>>> 
>>> On Mon, 2012-05-14 at 01:34 -0500, Pat Hayes wrote:
>>>> Accepting Richard's proposal for syntax, we can try moving on to some decisions for semantics. Here, I think Sandro has a point: there are some large-scale top-down decisions to be made, and until we get these clear, arguing about details will only produce muddle. 
>>>> 
>>>> I can discern three main ideas that the WG has produced for a semantics for datasets. All of these treat the default graph as being, well, just a graph, so I will ignore this. They differ in how they interpret the name-graph pairings.
>>>> 
>>>> 1. (Peter) Datasets have no special semantics. People are free to use them however they find most useful. There is no notion of asserting a dataset (or maybe: asserting a dataset is just asserting its default graph.) 
>>>> 2. (Sandro) Datasets are used to attach names to graphs, so that the name can be used to refer to the graph in other content, in particular, in other RDF. Asserting the dataset is a form of graph baptism.
>>> 
>>> I guess that's okay for a very-short summary.   :-)
>>> 
>>>> 3. (Antoine) Datasets are used to assert graphs in contexts, so that the name indicates a context of interpretation for the IRIs in the named graph. Asserting the dataset is making a statement in a kind of context-logic version of RDF.
>>>> 
>>>> If anyone knows of any others, let them speak now or forever hold their peace. 
>>>> 
>>>> I suggest that until we can decide which of these overall pictures is best, any discussion of details of semantics is irrelevant. 
>>>> 
>>>> The arguments I have heard for and against are as follows. 
>>>> 
>>>> 1. Pro: makes life simple, and allows the world to continue to experiment freely. Con: Allows the world to experiment freely, so blocking interoperability. Does not provide a way to name graphs.
>>>> 2. Pro: provides a way to name graphs, which is needed. Con: is not the way that datasets are used in current practice.
>>> 
>>> I think my proposal does align with most of current practice
>> 
>> But Antoine says the same about his idea. Someone must be wrong. Or else, there are entire islands of current practice which are working independently of one another.
> 
> I'm not sure they are contradictory.   Antoine's proposal and mine are
> both flexible enough to cover current practice.

Wow. Really? They seem to me to have almost nothing in common.  Yours gives names to graphs, Antoine's doesn't. Antoine's requires allowing IRIs to mean different things in different contexts, yours doesn't. Semantically they are completely different. I don't see how one can get true graph naming in Antoine's framework, nor how yours can provide any notion of context.
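
A toy sketch may make the contrast concrete. It is a simplification, not the formal semantics of either proposal; the example.org IRIs, the `local_meanings` table and both function names are invented for illustration.

```python
# Toy illustration only: not the formal semantics of either proposal.
# All IRIs, tables and function names here are invented.

Triple = tuple[str, str, str]
Graph = frozenset[Triple]

JAGUAR = "http://example.org/jaguar"
ZOO = "http://example.org/zoo"
GARAGE = "http://example.org/garage"

named_graphs: dict[str, Graph] = {
    ZOO: frozenset({(JAGUAR, "http://example.org/eats", "http://example.org/capybara")}),
    GARAGE: frozenset({(JAGUAR, "http://example.org/madeIn", "http://example.org/Coventry")}),
}


def graph_named(name: str) -> Graph:
    """Naming reading: the name refers to the graph it is paired with."""
    return named_graphs[name]


# Context reading: each graph name selects its own interpretation of IRIs.
local_meanings: dict[str, dict[str, str]] = {
    ZOO: {JAGUAR: "a big cat"},
    GARAGE: {JAGUAR: "a car"},
}


def meaning_in_context(iri: str, context: str) -> str:
    """Look up what an IRI denotes inside the graph named by `context`."""
    return local_meanings[context][iri]
```

In the first reading the graph name gains a referent (the graph) and nothing is said about IRIs changing meaning. In the second, the same IRI can denote different things depending on which named graph it occurs in, and the name itself is not required to denote the graph.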

>>> , and the
>>> other parts (things like using the primary-subject of a graph as its
>>> name) can be explained and handled in my framework.
>>> 
>>>> 3. Pro: corresponds to current actual usage. Con: based on a 'localist' view of IRI meanings which violates current assumptions or maybe best practice (?). Does not provide a way to name graphs. Requires change to RDF semantics.
>>>> 
>>>> FWIW, I observe that the proposal outlined in http://www.w3.org/2011/rdf-wg/wiki/AnotherSpin  allows one to express Antoine's content while using Sandro's naming semantics, which might be a way to reconcile things. 
>>> 
>>> I keep feeling like the context stuff is out of scope and we don't need
>>> to solve it now,
>> 
>> I agree it is further out than giving a name to a graph, but (to my own surprise) the NG discussion got involved with contexts right from the beginning (Antoine's idea was there almost from the start of the WG), and it hasn't gone away. So it seems to me that, like it or not, named graphs in practice involves thinking about contexts (if only to say explicitly that naming is not about contexts, because so many people think it is). 
>> 
>> Also, I am more and more coming to think that even if this is slightly outside our charter, it is worth trying to do something about it. The idea of contexts has arisen very centrally in our own discussions, and it is currently very much a topic being widely discussed in the community (e.g. the entire multi-day workshop I cited previously), and we would be doing the world a favor by doing something about it. For example, several people contributing to the ongoing http-range-14 discussions are simply taking it for granted that IRI meanings depend upon the local context, as though this was simply obvious. (See http://www.jenitennison.com/blog/node/170 : "Like the meaning of a word, the sense that a URI refers to is a social understanding which emerges from use of the URI across the web, and a given URI may be used to refer to different senses in different sources of information or over time."  (blog dated 2012-05-11 20:11)) 
>> 
>> If enough people think that the world is this way, then it actually is this way, whether we like it or not. 
> 
> It's not the number of the people, it's the ones who control the code.

I think it's the ones who control the data, actually. These days, code is usually mash-ups, but they have to use the data that is out there, and have to respect and work with the semantic assumptions that underlie the way the data is encoded. 

Pat

> Running code is more like the physical universe than the social one.  We
> only need to convince them.   (Not that convincing them is easy, but
> it's not nearly as hard.)
> 
>    -- Sandro
> 
>> Pat
>> 
>>> but perhaps I'm being naive.  I do tend to look at the
>>> Semantic Web more as a designer (seeing how it should work) than as a
>>> user (seeing how it does work, and often fails). 
>>> 
>>>  -- Sandro
>>> 
>>>> Pat
>>>> 
>>>> On May 13, 2012, at 4:29 PM, Sandro Hawke wrote:
>>>> 
>>>>> On Sun, 2012-05-13 at 23:09 +0200, Antoine Zimmermann wrote:
>>>>>> +1 to the proposal and to move forward one piece at a time.
>>>>> 
>>>>> I think our decisions should be choices between complete solutions or
>>>>> pieces of complete solutions.
>>>>> 
>>>>> Otherwise we risk having no solutions, or only bad solutions, because we
>>>>> constrained the solution space blindly.
>>>>> 
>>>>> Richard's proposal (with some minor tweaks in how he defined dataset
>>>>> [1]) happens to be in line with my proposal, but I'm rather opposed to
>>>>> it as a matter of principle; I don't see how chopping up the design
>>>>> space like this is going to produce better results, and I'm quite
>>>>> concerned it will make things worse.
>>>>> 
>>>>> Please, just paint complete pictures, showing how to address all the use
>>>>> cases, or at least some interesting ones.  Then we can look at those
>>>>> pictures and decide among them.
>>>>> 
>>>>> (Antoine, you kind of did this.  We've never talked about your
>>>>> proposal.  I happen to strongly prefer mine, but yours did make sense.)
>>>>> 
>>>>> What we can do -- and maybe this would be enough for what you want,
>>>>> Richard -- is make non-binding strawpolls to try to understand where
>>>>> people are coming from and what design features they are likely to
>>>>> support.
>>>>> 
>>>>> -- Sandro
>>>>> 
>>>>> [1] Specifically: can the IRIs occur more than once?  I assume we'd
>>>>> agree not.  More controversially, can named graphs be empty?  I'd argue
>>>>> no, in order to keep compatibility with quad stores.   SPARQL 1.1 Update
>>>>> struggles with this, saying EG you can create an empty graph but it
>>>>> might be instantly deleted.
>>>>> 
>>>>> 
>>>>>> 
>>>>>> Le 13/05/2012 22:54, Richard Cyganiak a écrit :
>>>>>>> Hi Ivan,
>>>>>>> 
>>>>>>> On 13 May 2012, at 16:15, Ivan Herman wrote:
>>>>>>>> it looks to me that Sandro's draft document:
>>>>>>>> 
>>>>>>>> https://dvcs.w3.org/hg/rdf/raw-file/d96c16480e42/rdf-spaces/index.html
>>>>>>>> 
>>>>>>>> would be a good way to 'settle' things (see [1]), too.
>>>>>>> 
>>>>>>> Sandro's draft takes an explicit position on *all* issues, many of
>>>>>>> which are highly controversial. By bundling non-controversial and
>>>>>>> controversial issues all into one big package, this blocks progress
>>>>>>> on the sub-issues where we actually seem to all agree. So I repeat:
>>>>>>> 
>>>>>>> 
>>>>>>> PROPOSAL: The abstract syntax for working with multiple graphs in RDF
>>>>>>> consists of a default graph and zero or more pairs of IRI and graph.
>>>>>>> This resolves ISSUE-5 (“no”), ISSUE-22 (“yes”), ISSUE-28 (“no”),
>>>>>>> ISSUE-29 (“yes”), ISSUE-30 (“they are isomorphic”), ISSUE-33 (“no”).
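
The abstract syntax in the proposal above can be sketched as a small data structure. This is only an illustration under assumptions: the `Dataset` class, its method names, and the use of plain strings for IRIs and terms are hypothetical, and it follows the assumption in Sandro's footnote that an IRI occurs at most once. Whether a named graph may be empty is left open here, as it is in the thread.

```python
from dataclasses import dataclass, field

# A triple is (subject, predicate, object); terms are plain strings here.
Triple = tuple[str, str, str]
Graph = frozenset[Triple]


@dataclass
class Dataset:
    """A default graph plus zero or more (IRI, graph) pairs."""

    default_graph: Graph = frozenset()
    # Keying by IRI means each IRI is paired with at most one graph.
    named_graphs: dict[str, Graph] = field(default_factory=dict)

    def add_named_graph(self, iri: str, graph: Graph) -> None:
        """Add an (IRI, graph) pair, rejecting an IRI that is already used."""
        if iri in self.named_graphs:
            raise ValueError(f"IRI already paired with a graph: {iri}")
        self.named_graphs[iri] = graph

    def pairs(self) -> list[tuple[str, Graph]]:
        """Return the (IRI, graph) pairs, sorted by IRI for stable output."""
        return sorted(self.named_graphs.items(), key=lambda pair: pair[0])
```

Nothing in this sketch says what a pair means; it only fixes the shape that all three semantic readings share.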
>>>>>>> 
>>>>>>> 
>>>>>>> So far I have heard no objections to this.
>>>>>>> 
>>>>>>> Best, Richard
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>>> At the moment it seems to collect all the various issues that we
>>>>>>>> have discussed with a fairly clear way of moving forward.
>>>>>>>> 
>>>>>>>> Ivan
>>>>>>>> 
>>>>>>>> 
>>>>>>>> [1]
>>>>>>>> http://lists.w3.org/Archives/Public/public-rdf-wg/2012May/0178.html
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On May 13, 2012, at 16:59, Richard Cyganiak wrote:
>>>>>>>> 
>>>>>>>>> All,
>>>>>>>>> 
>>>>>>>>> We've been talking our way up and down the design space for
>>>>>>>>> multigraphs for a year now, with not much to show for it. We
>>>>>>>>> still have not settled on a basic design.
>>>>>>>>> 
>>>>>>>>> Once we do settle on a basic design, the real work only starts
>>>>>>>>> since we need to nail down the details. This will take time. Our
>>>>>>>>> charter says that all documents should go to LC *this month*, and
>>>>>>>>> obviously we are nowhere near ready for this.
>>>>>>>>> 
>>>>>>>>> So I think it's time to stop exploring the design space, and
>>>>>>>>> start collapsing it by making decisions.
>>>>>>>>> 
>>>>>>>>> Obviously there is still strong disagreement on many things when
>>>>>>>>> it comes to multigraphs, but it seems to me that all proposals on
>>>>>>>>> the table accept a basic *abstract syntax* that is quite similar
>>>>>>>>> to the RDF datasets in SPARQL, and even the most adventurous
>>>>>>>>> experiments don't really stray from that formula. Therefore:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> PROPOSAL: The abstract syntax for working with multiple graphs in
>>>>>>>>> RDF consists of a default graph and zero or more pairs of IRI and
>>>>>>>>> graph. This resolves ISSUE-5 (“no”), ISSUE-22 (“yes”), ISSUE-28
>>>>>>>>> (“no”), ISSUE-29 (“yes”), ISSUE-30 (“they are isomorphic”),
>>>>>>>>> ISSUE-33 (“no”).
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> RATIONALE: All proposals on the table are based on an abstract
>>>>>>>>> syntax very similar to SPARQL's notion of an RDF dataset,
>>>>>>>>> although there is no consensus on the semantics and the
>>>>>>>>> terminology. Making a decision on the basic abstract syntax would
>>>>>>>>> unblock the work, and allow various strands of required detail
>>>>>>>>> work to proceed independently, hopefully leading to additional
>>>>>>>>> resolutions to remaining questions, such as:
>>>>>>>>> 
>>>>>>>>> • What's the formal semantics of the abstract syntax?
>>>>>>>>> • Definition of the concrete syntaxes (N-Quads, etc.)
>>>>>>>>> • Describing how to work with this in the Primer
>>>>>>>>> • What to call the pairs? “Named graphs” or something else?
>>>>>>>>> • What to call the entire thing? “RDF dataset” or something else?
>>>>>>>>> • Can blank nodes be shared among graphs?
>>>>>>>>> • What additional terminology (rdf:Graph etc) needs to be defined?
>>>>>>>>> 
>>>>>>>>> Best, Richard
>>>>>>>> 
>>>>>>>> 
>>>>>>>> ---- Ivan Herman, W3C Semantic Web Activity Lead Home:
>>>>>>>> http://www.w3.org/People/Ivan/ mobile: +31-641044153 FOAF:
>>>>>>>> http://www.ivan-herman.net/foaf.rdf
>>>>>>>> 
>>>> ------------------------------------------------------------
>>>> IHMC                                     (850)434 8903 or (650)494 3973   
>>>> 40 South Alcaniz St.           (850)202 4416   office
>>>> Pensacola                            (850)202 4440   fax
>>>> FL 32502                              (850)291 0667   mobile
>>>> phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Wednesday, 16 May 2012 04:44:36 UTC