- From: David Booth <david@dbooth.org>
- Date: Thu, 29 Nov 2018 20:07:48 -0500
- To: W3C Semantic Web IG <semantic-web@w3.org>
- Cc: Nathan Rixham <nathan@webr3.org>
On 11/28/18 11:14 AM, Nathan Rixham wrote:
> . . . if we referred to
> this address thing as an unidentified object, and looked in our
> databases, documents, code, apis, we'd find a huge portion of them are
> comprised of these unidentified objects, where the set of property value
> pairs is their identity, an identity that's good enough for purpose.

Yes, exactly. Their properties form a composite key. In my experience this is *very* common, especially because it is very advisable to use properties that uniquely identify each thing, just as it is very advisable in relational tables to have a primary key -- possibly composite -- for every table.

> Under this unidentified object scenario, to be considering identifiers
> for unidentified things seems like a strange question, as the whole
> point is that it's unidentified.

But it *is* identified, by its properties that form a composite key. The author just didn't bother to give it an explicit URI.

> Realistically, saying we require everything to have a
> name/identifier/uri is just a no go. Immediate real world first
> responses would be (a) invalid rdf as the IDs would be ommitted, or (b)
> encoding of objects in strings as string values, as in a chunk of json
> or xml frag in a string property.

I think there's a middle-way possibility here though. I agree that we don't want to burden users with explicitly creating URIs for everything. But it could be helpful for the tooling to do this under the hood.

> Now, IMHO there's merit in generating IDs for bnodes, but behind the
> interface not over wire, for use in canonicalization or storage engines
> or code - *not* in a serialized document sent between parties.

Exactly. The user should not normally see them.

On 11/29/18 4:14 PM, Nathan Rixham wrote:
> . . . as soon as we start skolemizing, throwing away redundant
> nodes becomes a great deal more complex.

Not if the RDF indicates the (composite) key for each object. If the key is known it becomes *easier* to collapse redundant nodes if URIs are used, because those nodes will have the same URI (assuming they are predictably generated based on the key).

To summarize, if conventions for n-ary relations allow the user to conveniently indicate which properties constitute a (composite) key -- perhaps defaulting to all properties -- then in theory tools could use that information to automatically collapse duplicate nodes, whether they use blank nodes or URIs. But if this is done with URIs that are predictably generated from those keys -- instead of blank nodes -- then we get the advantage that existing tools *already* will collapse them, whereas they wouldn't if blank nodes are used.

David Booth
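
For illustration only, here is a minimal sketch of how tooling might generate a URI predictably from a composite key so that duplicate nodes collapse without any extra work. Nothing here is a proposed convention: the base URI, the hashing scheme, and the names `key_uri` and `to_triples` are all assumptions made up for this example, and triples are modeled as plain Python tuples rather than with any RDF library.

```python
import hashlib

# Hypothetical base for generated URIs; any stable namespace would do.
BASE = "http://example.org/id/"


def key_uri(props, key=None):
    """Derive a URI deterministically from the key properties of an object.

    props: dict mapping property name -> value.
    key:   iterable of property names forming the composite key;
           defaults to all properties when omitted.
    """
    key_props = sorted(key) if key else sorted(props)
    # Sort so that the order in which properties were stated is irrelevant.
    canonical = "\n".join(f"{p}\t{props[p]}" for p in key_props)
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
    return BASE + digest


def to_triples(props, key=None):
    """Return the object's statements as a set of (subject, property, value)."""
    subject = key_uri(props, key)
    return {(subject, p, v) for p, v in props.items()}


# The same address stated twice, with properties in a different order.
a = {"street": "1 Main St", "city": "Boston", "zip": "02101"}
b = {"zip": "02101", "city": "Boston", "street": "1 Main St"}

# Both get the same subject URI, so a plain set union merges them.
graph = to_triples(a) | to_triples(b)

# With an explicit key, a non-key property does not change the identity.
KEY = ("street", "city", "zip")
c = dict(a, note="home")
same_identity = key_uri(c, key=KEY) == key_uri(a, key=KEY)
```

Because the subject is a function of the key values alone, the merge falls out of ordinary set semantics over triples, which is the sense in which existing tools would already collapse such nodes. With blank nodes, each occurrence would get a fresh local identifier and the duplicates would have to be detected separately.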
Received on Friday, 30 November 2018 01:08:11 UTC