Re: RDF 2 Wishlist

On Nov 2, 2009, at 2:22 PM, Sampo Syreeni wrote:

> On 2009-11-02, Pat Hayes wrote:
> (Sorry for answering indirectly. I'm a bit late to the discussion.)
>>> * Deprecate RDF reification. Issue warnings, write document to  
>>> explain problems.
> I would argue against this. Reification, in one form or another, is  
> a highly valuable part of the standard, because it lets us pose
> hypotheticals and metadata relating to them.

Only if you also ignore part of the spec. Which is a mess. I believe  
we can do this better.

> Even though Pat is likely to vehemently disagree with me on this one,
> I'd take hazy reification/quotation/whatever semantics over the lack
> of the basic mechanism, any day of the week. I mean, otherwise we're
> bound to have even *hazier* concoctions in its place.


>>> * Deprecate collections (Alt, Bag, Seq). See above.
> Another no on my part. Heavy semantic lifting is needed with these  
> as well, but the fact is, the basic concepts are extremely useful as  
> modelling primitives.

Are they, though? My sense is that the Lisp-style lists used by OWL
are used much more than bag, seq, alt. (Actually, the lists are the  
collections: these three are the containers.)
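
For concreteness, here is a rough Python sketch of the two encodings, with triples as plain tuples. The rdf: terms are the real vocabulary; the blank-node labels (_:c0, _:seq) and the function names are made up for illustration, and this is not any particular library's API.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def as_collection(items):
    """Encode items as an rdf:List: rdf:first/rdf:rest cells ending in rdf:nil."""
    triples = []
    if not items:
        # The empty list is just rdf:nil, with no cells at all.
        return RDF + "nil", triples
    cells = [f"_:c{i}" for i in range(len(items))]  # illustrative bnode labels
    for i, item in enumerate(items):
        nxt = cells[i + 1] if i + 1 < len(items) else RDF + "nil"
        triples.append((cells[i], RDF + "first", item))
        triples.append((cells[i], RDF + "rest", nxt))
    return cells[0], triples


def as_container(items, node="_:seq"):
    """Encode items as an rdf:Seq using the membership properties rdf:_1, rdf:_2, ..."""
    triples = [(node, RDF + "type", RDF + "Seq")]
    for i, item in enumerate(items, start=1):
        triples.append((node, RDF + f"_{i}", item))
    return node, triples


head, list_triples = as_collection(["ex:a", "ex:b"])
seq, seq_triples = as_container(["ex:a", "ex:b"])
```

The list is explicitly terminated by rdf:nil, whereas nothing in the container triples says that there is no further rdf:_n member.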

> Without stuff like this, what are we left with, semantically  
> speaking? Triples? They don't carry semantics at all; they're just  
> propositions, and even limited to being binary.
>>> * Serialise named graphs (although I'm not super keen in general):  
>>> [...]
> A formal syntax for named graphs would be nice, yes. Even in RDF/XML  
> (which I personally loathe as a syntax). But again, they need to  
> have proper semantics. I'd advocate the one based in epistemic modal  
> logic: treat any named graph as a bunch of assertions, define formal  
> modal operators which can be used to give metadata about the  
> referred-to graph, and then let any referring stuff flag its beliefs  
> using that common and well-tried-out formalism. All the while  
> reserving formalized judgment, so that the open world assumption  
> holds also wrt any formal logical interpretation, such that people  
> using the basic assertions can judge for themselves how to interpret  
> the source material arising from a distributed source.
> E.g. source A might assert that it believes the whole logical  
> content of the named graph imported from source B, but still, I, as  
> the end user of the data, have the full capability of choosing which  
> beliefs of A's I'm willing to trust/believe-in, when I'm building up  
> my application.

The key lack right now is any standard way to refer to a 'part' of an  
RDF graph from the outside.

> I believe examples such as these suggest that TimBL's original  
> vision of a distributed, open-world-assumption semantic net  
> necessarily entails use of epistemic modal logic to formally deal  
> with the higher, trust-related layers of the cake. That could, and  
> should, be done implicitly at first, so that all of the implications  
> needn't be hardcoded right from the start in RDF Core. But the  
> possibility of later on formally dealing with beliefs should, I  
> think, still be left open.
>>> * Simple envelope: <document name="foo" type="application/turtle">...</document>
>>> * Sparql GSPO to dump datasets
> I think this sort of thing can be standardized outside of W3C. If  
> uptake is wide enough, then it is a standard. If not, we will have
> had one failed attempt at standardization.
>>> * Make bnode unlabelled, rather than existentially quantified var.
> No. From my relational background, I tend to treat bnodes like I'd  
> deal with perfect, opaque surrogate keys. Their only semantics are  
> to connect stuff together, while shying away from exposing  
> autogenerated hogwash to the end users. In that capacity, it doesn't  
> make sense to apply the one name assumption to them; in fact they've  
> been invented to go around said restriction where available  
> information about the real world referents leads to a diffuse  
> representation of even entity identity (or to cut down on the  
> internal redundancy of identifiers, when they're visible; that's  
> then a different deal altogether; more to do with data compression  
> than normalized data representation). It'd seriously hinder  
> knowledge representation, especially in a distributed, not necessarily
> perfect-knowledge or in particular controlled vocabulary, uniformly  
> well-keyed environment.
> To make it simple, it should be possible to have a number of  
> differently (and inferentially) keyed objects in the graph. Then we  
> need a truly blank node to mediate their relationships to other  
> stuff. Once that happens, the formal semantics immediately become  
> one of existential quantification, in the absence of a one-name- 
> assumption. That's model theory 101, basically.
>> Hmm, not at all obvious to me what this distinction amounts to.  
>> Unlabelled *is* existentially quantified, to all semantic purposes.  
>> Unfortunately, RIF has muddied this water by putting in meaningless  
>> distinctions.
> I'm no expert on RIF, but I believe this is once again an instance  
> of a muddled distinction between fully logical, and fully semantic,  
> constraints.
>>> * Prefixes: warn if some standard set not 'correct'. Have 'grab  
>>> all' namespace.
> That sort of thing has been, and should be, externalized from the  
> definition. We have separate and more focused standards to deal with  
> this.
>>> * Lang _and_ type. Reason for exclusivity lost in mists of time.
> Yes. I'd ditch this sort of stuff right now. If you want metadata on  
> a literal, it shouldn't really be a literal -- it should be a named  
> entity, and the metadata should hang off it. The literal, it should  
> simply be the terminal point where all of the inferencing stops,  
> after all of the metadata has already been fully ingested. It should  
> remain a dumb literal, which is only interpreted after we're done  
> with the metadata attached to it.
> If even that... Personally I'm of the opinion that literals should  
> be removed from the model altogether.

Oh no, they are the bread and butter of all the linked data. I'm all  
for putting datatyped literals into logic itself, in fact.

>>> * Bnodes as predicates. See above. Does SPARQL allow it?
> This is useful, I think. It preserves the symmetry between subjects,  
> predicates and objects. That sort of thing rhymes well with my  
> relational background, where the symmetry is absolutely perfect, and  
> where I use that symmetry to advantage on a daily basis in my work.  
> It also rhymes well with the fact that, in a truly distributed  
> semantic web, which uses triples-only no less, it's quite probable
> that a) there are going to be multiple names for the same thing, and  
> that b) people would want to avoid referring to specific names of  
> even predicates, instead preferring to identify them by their  
> properties. In that case, it makes ample sense to use a blank node  
> as a predicate as well.


>>> * RDF/XML inverse properties. Make writing more pleasant.
> Yes. But explicitly make these syntactic sugar. Not something that  
> is part of the base data model.
>>> * Equivalence relations. Seems like every use of sameAs is  
>>> incorrect.
> No. The semantics exist in DAML/OIL/OWL. If the particular retard  
> you're referring to can't comprehend them, it ain't gonna help if  
> the definition is moved around to somewhere else, either. It'd just  
> break modularization within the framework. ;)

That's not the real issue. The problem is, people need something weaker
than sameAs to express a link, in many cases. It's not all people
misusing sameAs because they don't understand it: they misuse it
because there is no alternative, and they have to use something. It's
up to us to provide some better alternatives.
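
To see why the strength of sameAs matters, here is a minimal sketch, assuming triples are plain tuples. owl:sameAs is a real property and is an equivalence relation, so every chain of sameAs links collapses into one identity. The weaker link `http://example.org/similarTo` is purely hypothetical, invented here to stand in for whatever alternative we might provide; it merges nothing.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
SIMILAR_TO = "http://example.org/similarTo"  # hypothetical weaker link, not a real vocabulary term


def same_as_classes(triples):
    """Group nodes into identity classes, merging only across owl:sameAs links."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for s, p, o in triples:
        find(s)
        find(o)
        if p == OWL_SAME_AS:
            # sameAs is symmetric and transitive, so merging the classes is sound.
            parent[find(s)] = find(o)

    classes = {}
    for node in parent:
        classes.setdefault(find(node), set()).add(node)
    return sorted(sorted(c) for c in classes.values())


triples = [
    ("ex:a", OWL_SAME_AS, "ex:b"),
    ("ex:b", OWL_SAME_AS, "ex:c"),
    ("ex:c", SIMILAR_TO, "ex:d"),
]
print(same_as_classes(triples))
# [['ex:a', 'ex:b', 'ex:c'], ['ex:d']]
```

Everything said about ex:a now also holds of ex:c, which is exactly what goes wrong when sameAs is used for a looser kind of link; ex:d stays separate.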

>> In brief: there are at least 4 distinct notions of
>> same-but-not-sameAs I've managed to identify so far, and I'm sure
>> there will be more.
> I can just imagine. Especially since I've just been enjoying  
> Brachman's modern classic "What IS-A is and isn't: an analysis of  
> taxonomic links in semantic networks."
>> Bottom line: no single solution will work, so no RDF2 magic bullet.
>> But I'm sure we can do something useful.
> Personally I'd argue most of the things that cause opprobrium and  
> confusion at the moment are stuff that could be corrected via 1)  
> more precise and understandable documentation, 2) easier syntax, for  
> us so-called lazy people, and 3) some work on formal semantics,
> which also takes a wider perspective on the real life problems  
> people are using RDF to solve.

It's the last one that I think we are obliged to attempt.

> Fourth, it perhaps wouldn't be a bad idea to intentionally allow a  
> whole slew of logical confusion, either, as long as the core spec  
> remained clean;

That is one good strategy in the present state of the art, yes. See  
how SKOS approaches similar issues.

> that way the semantic web could develop in the unorganized manner  
> that the first web did. Without undue effort towards correctness,  
> until it bumped into the useful, necessary, third party engine which  
> actually cared about that sort of thing.

Well, linking data using not-quite-sameAs-maybe is something that very  
many people care a lot about right now. I hear more about this issue  
than any other. Most of the nasties in RDF are just ugly, or  
nuisances, but this is a real urgent problem that will get worse very  


> -- 
> Sampo Syreeni, aka decoy -,
> +358-50-5756111, 025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2


Received on Monday, 2 November 2009 21:03:59 UTC