- From: Pierre-Antoine Champin <pchampin@liris.cnrs.fr>
- Date: Fri, 25 Feb 2011 09:46:16 +0100
- To: Sandro Hawke <sandro@w3.org>
- CC: public-rdf-wg <public-rdf-wg@w3.org>
+1 to this; and I must say I kind of like the quirkiness of those terms.
I have the feeling I'll miss them if they don't make it to the spec ;)

Some comments below about blank nodes, which indeed raise some issues.

On 02/25/2011 04:25 AM, Sandro Hawke wrote:
> I'm still having trouble following the discussion due to ambiguity of
> terms.  But I don't want us to argue about terms at this stage.  So I'd
> like to propose some temporary terms.  They are intentionally a little
> quirky and not suitable for use in our final specs.  Instead, they are
> meant to be short and unambiguous and relatively memorable.  At the end
> of this email, I try to connect them to other people's terms.
>
> Here they are:
>
> 1. A "g-box" is a container, like a "set" data structure in
> programming.  It holds some RDF arcs, with their nodes.  (Alternatively,
> it holds some RDF triples.)  G-boxes can overlap, sharing some of the
> same nodes and arcs.  Two g-boxes can happen to have the same contents
> (right now) while being distinct g-boxes.  G-box contents can change:
> today a particular g-box might contain the triples { my:a my:b _:x.
> my:a my:c _:x }, and tomorrow it might instead contain { my:a my:b _:x.
> my:a my:c2 _:x }.
>
> 2. A "g-snap" is an idealized snapshot of a g-box; it's a mathematical
> set of RDF arcs, with their nodes.  (Alternatively, a mathematical set
> of RDF triples.)  Like g-boxes, g-snaps can overlap, sharing nodes and
> arcs.  Unlike g-boxes, it makes no sense to talk about g-snaps
> changing: they are defined to be exactly the collection of their
> elements.  If a g-snap were to "change" it would simply be a different
> g-snap.  If two g-snaps have the same nodes/arcs, they are really the
> same g-snap.  The contents of a g-box at any point in time are a
> g-snap.
>
> 3. A "g-text" is a particular sequence of characters or bytes which
> conveys a particular g-snap in some language (eg turtle or rdf/xml).
> If you can parse a g-text, you know what is in the g-snap it conveys
> (except blank nodes, as discussed below).  You can tell someone exactly
> what is in a particular g-box at some instant by sending them a
> g-text.  (You send them the g-text which conveys the g-snap which is
> the current state/contents of that g-box.)
>
> Are those terms and descriptions clear enough?  Are there edge cases
> they are missing?
>
> Now, about URIs:
>
>  * A g-box can exist without any name or persistent way of referring to
>    it; it can exist as a data structure in a running program, or I
>    suppose it can exist in someone's mind.  Long-lived g-boxes
>    probably SHOULD be given a preferred single working URL, but there
>    might be times when you don't want to give it any, or when you
>    want to give it several URLs.
>
>  * You can convey a g-snap with a g-text, but I don't think you usually
>    want to name them with URIs.  Sometimes you want to put a g-snap
>    into a URI, but that's rare, since in many cases g-snaps are too
>    long for most URI-handling software.  For constrained applications,
>    though, where overrun is unlikely or okay, you can embed a g-text
>    somewhere in an http URI (eg, as a query parameter), or maybe use
>    "data:" URI.
>
> And blank nodes?  I think it works like this:
>
>  * Two g-snaps can contain the same blank node.  A simple example of
>    this is to take a g-snap containing at least one blank node, then
>    construct another by adding the triple { my:a my:b my:c }.  The
>    original g-snap and the one resulting from the union both contain
>    the same blank nodes.

As g-snaps are mathematical sets, I agree they can contain the same blank
nodes. Your constructive proof makes much sense to me.

>  * By a similar argument, I believe two g-boxes can also contain the
>    same blank node, although not all software will support this.  Given
>    a g-box A, I could construct A' to contain whatever A contains and
>    also { my:a my:b my:c }.
>    This happens sometimes in real programs;
>    I'd be curious to know which RDF APIs disallow sharing blank nodes
>    between their graph-storage instances; my experience is they allow
>    it when it's not a problem (eg they are both in memory right now).

Here, I would be more cautious. From the definitions you gave above, it is
clear that two g-boxes can contain the two g-snaps described above, thus
sharing blank nodes.

However I think this should not be allowed by the RDF conceptual model. On
the contrary, we should force every blank node to appear in at most one
g-box, defined as the *scope* of the blank node. This is the way I read
Pat's mail:

>> we simply stipulate, as a part of the underlying RDF conceptual
>> model, that every blank node can occur in at most one RDF graph
>> token.

where I think "graph token" means "g-box" (although I agree with you that
the proposed definition of "graph token" makes it look like g-text
sometimes).

>  * In general, while g-texts do convey g-snaps, they do not identify
>    the blank nodes in them.  So, in fact, if you go
>
>        g-snap A --> g-text --> g-snap A'
>
>    A=A' only if it does not contain blank nodes, because parsing a
>    g-snap results in all-new blank nodes.

... which is a way to enforce the scope-limitation of blank nodes.

> We might define new RDF syntaxes which allow for several g-texts to
> be grouped in such a way that blank nodes can be shared between them.
> This is an issue for our work item, "Either [the turtle] syntax or a
> related syntax should also support multiple graphs and graph stores."

What would be the use of sharing a blank node across several g-texts if, as
you stated above, "parsing [a g-text into] a g-snap results in all-new blank
nodes"? It seems to me that the parsing would therefore lose the identity
of the blank node... I guess I need some concrete use cases to understand
your point.

> How's that sound?  Make sense?

much sense indeed. Thanks for that.
Shall we write a wiki page where we keep track of those different
terminologies and how they align? I can do that...

  pa

>
> Okay, relating to other people's terms...
>
> "Tokens", as I read today's email, seem to mostly be g-texts but
> sometimes be something that can change over time, and thus be a
> container for a g-text, something we might call a "g-text-box".  I
> think this latter meaning conflates things in a way which will cause
> problems, eg for understanding content-negotiation.
>
> "Graphs" in the RDF Semantics are g-snaps.
>
> "Named Graphs", as in SPARQL 1.0, are g-boxes which happen to each
> be assigned a URI.
>
> "Graph Literals", as suggested by N3 (and disagreeing with Nathan,
> sorry), are a feature of an RDF syntax that allows you to denote a
> g-snap by a special kind of term (a "graph literal").  In n3, it looks
> like:
>
>    { _:x my:says { _:x foaf:name "Sandro Hawke" } }.
>
> One can approximate this with every RDF syntax by using a
> suitably-defined URI scheme or datatype, such as:
>
>    { _:x my:says "_:x<http://xmlns.com/foaf/0.1/name> \"Sandro Hawke\""^^my:turtleCode }
>
> This isn't as convenient as the N3 approach, and doesn't allow
> blank nodes to be shared (in the second example, the _:x's are not
> connected), but it does work in existing RDF syntaxes.
>
> I'd better stop now.
>
>     -- Sandro
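
To make the three temporary terms concrete, here is a minimal Python sketch
of one possible reading. Everything in it is an assumption made for
illustration: the names BlankNode, GBox, to_gtext and parse_gtext are
hypothetical, terms are plain strings, and the g-text format is a toy
one-triple-per-line syntax, not Turtle or RDF/XML. It only shows that a
g-box changes while a g-snap does not, and that a g-text round trip yields
all-new blank nodes.

```python
from itertools import count

_next_id = count()


class BlankNode:
    """Hypothetical blank node: equality is object identity, not a label."""

    def __init__(self):
        self.serial = next(_next_id)  # only used to print something readable

    def __repr__(self):
        return f"_:b{self.serial}"


class GBox:
    """Hypothetical g-box: a mutable container of triples."""

    def __init__(self, triples=()):
        self.triples = set(triples)

    def snapshot(self):
        """Return the current contents as a g-snap (an immutable set)."""
        return frozenset(self.triples)


def to_gtext(gsnap):
    """Serialize a g-snap to a toy g-text; blank node labels are local to it."""
    labels = {}
    lines = []
    for triple in sorted(gsnap, key=repr):
        terms = []
        for term in triple:
            if isinstance(term, BlankNode):
                if term not in labels:
                    labels[term] = f"_:x{len(labels)}"
                terms.append(labels[term])
            else:
                terms.append(term)
        lines.append(" ".join(terms) + " .")
    return "\n".join(lines)


def parse_gtext(gtext):
    """Parse a toy g-text into a g-snap, minting a fresh node per label."""
    fresh = {}
    triples = set()
    for line in gtext.splitlines():
        s, p, o, _dot = line.split()
        triple = []
        for term in (s, p, o):
            if term.startswith("_:"):
                if term not in fresh:
                    fresh[term] = BlankNode()
                triple.append(fresh[term])
            else:
                triple.append(term)
        triples.add(tuple(triple))
    return frozenset(triples)


# The g-box example from the quoted message: same box, two different g-snaps.
x = BlankNode()
box = GBox({("my:a", "my:b", x), ("my:a", "my:c", x)})
today = box.snapshot()
box.triples.discard(("my:a", "my:c", x))
box.triples.add(("my:a", "my:c2", x))
tomorrow = box.snapshot()

# g-snap A --> g-text --> g-snap A'
round_trip = parse_gtext(to_gtext(today))
```

In this sketch, today and tomorrow are different g-snaps taken from the
same g-box, and round_trip differs from today only because its blank node
is a new object. Whether two GBox instances should be allowed to hold the
same BlankNode object is exactly the scope question raised above; the
sketch does nothing to forbid it.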
Received on Friday, 25 February 2011 08:46:53 UTC