Re: RDF 2 Wishlist

Sandro Hawke wrote:
> So, what should W3C standardize next in the area of RDF, if anything?
> OWL 2 added a bunch of stuff to OWL that users wanted and implementors
> were willing to tackle.  Are there things like that around RDF?
> 
> My own answer is in a recent blog post:
>      http://decentralyze.com/2009/10/30/rdf-2-wishlist/
> 
> What's yours?

Sorry, I'm a bit late contributing here; I missed the start of the topic
a couple of days ago.

I've been considering proposing changes to RDF as part of my PhD, the
subject of which is Trust on the Semantic Web, so some of my points here
will come from that perspective.

Reification and Literals are the first things on my list, closely
followed by work on Turtle/N3-like syntaxes.
I won't go into the latter as I think others have covered that far
better than I could.

The stance I take is that the robust extensibility of RDF should be the
foremost concern; there should not be anything which is impossible or
horrible to represent in RDF, otherwise we risk RDF becoming obsolete,
and having to review the data-model again in the future, rather than
only the core vocabularies.

* Meta-Triples

To be able to meet the challenges of implementing Semantic Web agents we
will NEED some way of asserting meta-information about individual
Triples. The existing use-cases are many: provenance, belief, trust and
quoting are but a few. We can expect any half-decent Semantic Web agent
to use all of these.

I'm not taken by Reification; it's ugly, ill-supported and over-complex.
You're forced to represent your information in both forms if you want to
use it and assert meta-information about it at the same time.
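
To make the duplication concrete, here is a minimal sketch in Python
with triples as plain tuples. The rdf:Statement, rdf:subject,
rdf:predicate and rdf:object terms are the standard reification
vocabulary; the ex: names, the blank node label and the reify() helper
are made up purely for illustration.

```python
# Triples are plain (subject, predicate, object) tuples of strings.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def reify(triple, node):
    """Return the four standard reification triples describing `triple`."""
    s, p, o = triple
    return [
        (node, RDF + "type", RDF + "Statement"),
        (node, RDF + "subject", s),
        (node, RDF + "predicate", p),
        (node, RDF + "object", o),
    ]


# Hypothetical example data, not taken from any real vocabulary.
fact = ("ex:Alice", "ex:knows", "ex:Bob")

graph = set()
graph.add(fact)                               # form 1: the assertion itself
graph.update(reify(fact, "_:s1"))             # form 2: a description of it
graph.add(("_:s1", "ex:trustLevel", "0.8"))   # the actual meta-information

print(len(graph))  # 6
```

Six triples are needed to assert one fact and attach one annotation to
it, and nothing in the data model ties the description back to the
asserted triple.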

Graph-level meta-information is good, but it won't be enough. For
example, context-based trust could presumably lead to different levels
of belief in different triples from the same graph.
To restrict assertions to graph scope, or to force the creation of
singleton graphs, would render RDF a pain to use, and thus eventually
obsolete.

My proposal is for a URN scheme for statements such that a statement
itself may be the subject or object of a triple.
The URN scheme would be defined such that it is valid only within a
given context, similar to current blank-node IDs, except that I would
consider a KB or a Triplestore query output a single context so that
queries can tie things together easily.

Obviously you wouldn't be able to state meta-triples about things people
have said in a remote document without re-stating the triple.
However, I think this is a fair compromise; a means for quoting triples
would allow you to state whether or not you also believed this triple.
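
A rough sketch of how context-scoped statement identifiers might behave,
again in Python. Everything here is an assumption for illustration: the
Context class, the urn_for() method and the "urn:x-stmt:" prefix are
hypothetical, not an existing or proposed standard. Identifiers are
minted per context, much like blank-node IDs, and a triple can only be
named once it has been stated in that same context.

```python
class Context:
    """One KB or one query result; statement URNs only mean something inside it."""

    def __init__(self):
        self.triples = set()
        self._ids = {}  # triple -> context-local statement URN

    def add(self, s, p, o):
        """Assert a triple in this context and return it."""
        triple = (s, p, o)
        self.triples.add(triple)
        return triple

    def urn_for(self, triple):
        """Mint (or reuse) a context-local URN naming an asserted triple."""
        if triple not in self.triples:
            # Remote statements must be re-stated locally before being named.
            raise KeyError("triple must be stated in this context first")
        if triple not in self._ids:
            # "urn:x-stmt:" is a made-up prefix, not a registered namespace.
            self._ids[triple] = "urn:x-stmt:%d" % (len(self._ids) + 1)
        return self._ids[triple]


kb = Context()
fact = kb.add("ex:Alice", "ex:knows", "ex:Bob")
stmt = kb.urn_for(fact)

kb.add(stmt, "ex:trustLevel", "0.8")      # a statement as subject
kb.add("ex:Carol", "ex:disputes", stmt)   # a statement as object

print(len(kb.triples))  # 3
```

Compared with reification, the fact is stated once and each piece of
meta-information costs a single extra triple.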

* Extensible Literals

Prior to reading this thread I'd not considered the case of Literals as
the subject of a triple, so my proposals here are a little shaky.

In addition to a language and a datatype, we need the ability to state
other meta-information about literals.
My example here comes from the EU requirements for a data interchange
format for research bodies (CERIF 2008).
Personally I think the standard is badly engineered, but the fact
remains that it can represent something which is incredibly ugly to
represent in RDF.
In addition to the language of a Literal, they are required to denote
the translation status of the literal; whether it is the original, or a
machine or human translation.
I can't find a way of representing this in RDF that does not make me
feel unclean.

Another example: what is the language of an Integer? Yes, there are
other numbering systems, but Language is a misnomer here.

These meta-facts about literals need to be extensible, and I'm confident
that we could do this in a backwards compatible manner.
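
As a very rough illustration of what "extensible" and "backwards
compatible" could mean here, consider this Python sketch. The Literal
class, its meta mapping, the plain() method and the
ex:translationStatus / ex:HumanTranslation names are all hypothetical;
the example title is invented and is not taken from CERIF.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Literal:
    lexical: str
    datatype: Optional[str] = None
    language: Optional[str] = None
    # Open-ended meta-facts keyed by property name (hypothetical extension).
    meta: Dict[str, str] = field(default_factory=dict)

    def plain(self):
        """Drop the extra meta-facts, leaving what RDF can express today."""
        return Literal(self.lexical, self.datatype, self.language)


# A translated title carrying its translation status as a meta-fact.
title = Literal(
    "Vertrauen im Semantic Web",
    language="de",
    meta={"ex:translationStatus": "ex:HumanTranslation"},
)

# A consumer that ignores the extension still sees an ordinary literal.
print(title.plain() == Literal("Vertrauen im Semantic Web", language="de"))  # True
```

Existing consumers would only ever see the plain() view, which is the
sense in which I'd hope such an extension could be backwards compatible.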

In terms of proposals I have nothing concrete here. Perhaps we could use
two new URN spaces as above (one for subject, one for object); I'm not
sure. Suggestions are more than welcome.

Marcus

Received on Thursday, 5 November 2009 16:52:54 UTC