Re: RDF API thoughts

Hi Toby,

Comments inline:

Toby Inkster wrote:
> Here are some thoughts on the RDF API draft which a recent discussion in
> the #swig IRC channel helped me clarify in my head.
> 
> The current RDF API draft is really a Notation 3 API in stealth.
> 
> The current RDF API draft goes beyond the RDF data model in several
> ways. I'm a big fan of the Notation 3 data model - the RDF data model
> comes with various restrictions which are seemingly arbitrary. However,
> for the RDF API it makes more sense for us to stick with the RDF data
> model.

The initial draft API was aimed less at N3, and more at an RDF model 
with the seemingly arbitrary constraints removed.

There are essentially four possibly contentious items, listed below with 
reasons for each:

1: The loosening of the Triple interface to include any node in any position
  - because this is possibly the most arbitrary restriction on the RDF 
model, primarily imposed because of serialization limitations rather 
than for model or semantic reasons.

2: The inclusion of Graph Literals
  - because they may very well be included in the next revision of RDF, 
as per the potential RDF WG draft charter.

3: The removal of the datatype /or/ language constraint on literals
  - because we may move to a place where PlainLiteral, xsd:string and 
related are joined in a single datatype (which would mean a literal 
could have both a datatype and a language).

4: The inclusion of profiles
  - because they are already part of RDFa and may be adopted by RDF 
and other serializations (again as per the RDF WG draft charter).
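
To make items 1 and 3 a bit more concrete, here is a rough TypeScript 
sketch. All names below (RDFNode, Triple, isRdfCompatible) are made up 
for illustration and are not the draft's actual WebIDL; graph literals 
and profiles are left out.

```typescript
// Hypothetical shapes for illustration only; not the draft's WebIDL.
type RDFNode =
  | { kind: "iri"; value: string }
  | { kind: "blank"; value: string }
  // Item 3: a literal may carry a datatype, a language, or both.
  | { kind: "literal"; value: string; datatype?: string; language?: string };

// Item 1: any node may appear in any position.
interface Triple {
  subject: RDFNode;
  predicate: RDFNode;
  object: RDFNode;
}

// The stricter RDF rules, as a check an implementation could opt into:
// subjects are IRIs or blank nodes, predicates are IRIs.
function isRdfCompatible(t: Triple): boolean {
  return t.subject.kind !== "literal" && t.predicate.kind === "iri";
}

const label: RDFNode = {
  kind: "literal",
  value: "Nathan",
  datatype: "http://www.w3.org/2001/XMLSchema#string",
  language: "en",
};
const nameOf: RDFNode = { kind: "iri", value: "http://example.org/isNameOf" };
const person: RDFNode = { kind: "blank", value: "b0" };

// Literal as subject: allowed by the loosened interface, not by RDF.
const generalised: Triple = { subject: label, predicate: nameOf, object: person };
// Same nodes in RDF-compatible positions.
const plain: Triple = { subject: person, predicate: nameOf, object: label };
```

Under these assumptions the loosened interface accepts both triples, 
and an implementation that wants the RDF restrictions can apply a check 
like isRdfCompatible itself.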

All of these items are in the initial draft so that the API is both 
backwards and forwards compatible, and inclusive rather than exclusive. 
If it's later agreed that exceptions should be thrown in places, or that 
some of these interfaces should move out to a note, then at least we 
know the API was designed with these considerations in mind. That avoids 
having to immediately start work on a 1.1/2 API, potentially breaking 
BC, or "hacking" them on later.

> Why? Firstly, packaging something up and calling it an RDF API when it
> goes significantly beyond the RDF API will irritate some people.

Well, tbh they can still implement the API and just impose the arbitrary 
restrictions themselves; many are imposed by serializations anyway. 
Nothing is actually lost other than a "conformance badge".

> Secondly, it will make it difficult to implement the RDF API as a layer
> on top of existing RDF toolkits.

As per above, it can still be implemented, just with some 
per-implementation restrictions, and some scope to grow into.

I'm by no means saying that this is the case, just that it's one of many 
possible approaches.

> Notation 3 support should be stripped out and worked on as an extension
> to the RDF API separate document. A Notation 3 API extension is almost
> certainly beyond the RDFa Working Group's current charter, but that is
> fine as it doesn't stop RDFa Working Group members from working on this
> extension outside of RDFa WG time. The Notation 3 API extension should
> probably not be Rec track, at least not until some time when Notation 3
> itself is.
> 
> The only nod towards Notation 3 that the RDF API itself should offer is
> to avoid making that extension difficult. In other words, don't add
> normative requirements that would preclude a conformant RDF API
> implementation from also supporting Notation 3.
> 
> In particular I'd like to see the following changes made to the RDF API:
> 
> 1. Drop the GraphLiteral interface.

Any particular reason why it needs to be dropped now? We could always 
factor it into a different note, or remove it should the RDF WG not 
define graph literals.

> 2. Allow but do not require RDFEnvironment.createTriple to throw an
> exception if the triple would not be RDF compatible. (For example, if
> the subject is a literal.)
> 
> 3. Allow but do not require Graph.add to throw an exception if the triple
> being added is not RDF compatible.

Agree that the exceptions may be a good idea, unsure on the terminology 
used, preference going to "not supported by the implementation" or such 
like.
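
As a rough TypeScript sketch of "allow but do not require": the 
function below stands in for RDFEnvironment.createTriple, and the 
`strict` flag is a made-up way of modelling an implementation that only 
supports plain RDF triples. The node shapes and the error wording are 
assumptions, not anything from the draft.

```typescript
// Hypothetical shapes; the draft's real interfaces differ.
type NodeKind = "iri" | "blank" | "literal";
interface RDFNode { kind: NodeKind; value: string }
interface Triple { subject: RDFNode; predicate: RDFNode; object: RDFNode }

// `strict` models an implementation that chooses to reject triples
// it cannot support; a lenient implementation never throws.
function createTriple(
  subject: RDFNode,
  predicate: RDFNode,
  object: RDFNode,
  strict: boolean
): Triple {
  const supported = subject.kind !== "literal" && predicate.kind === "iri";
  if (strict && !supported) {
    throw new Error("Triple not supported by the implementation");
  }
  return { subject: subject, predicate: predicate, object: object };
}

const lit: RDFNode = { kind: "literal", value: "hello" };
const pred: RDFNode = { kind: "iri", value: "http://example.org/p" };
const res: RDFNode = { kind: "iri", value: "http://example.org/s" };

// A lenient implementation accepts a literal subject.
const loose = createTriple(lit, pred, res, false);
```

Calling createTriple(lit, pred, res, true) throws instead, which is the 
behaviour an RDF-only implementation would opt into.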

> 4. Allow but do not require DataSerializer.serialize to throw an
> exception if the triple being added is not RDF compatible, or if it
> otherwise cannot be serialised. (For example, some RDF compatible
> triples cannot be serialised as RDF/XML because their predicate URIs
> cannot be represented as a valid QName.) Possibly provide a mode for
> DataSerializer objects to silently skip triples which cannot be
> serialised.

Generally agree; if we keep the interface it's a problem to address. 
Overall, though, I'm 50/50 on whether the Serializer interface is going 
to be more trouble than it's worth. Perhaps it doesn't need to be 
standardized.
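
For what the "silently skip" mode might look like, a rough TypeScript 
sketch follows. It is not the draft's DataSerializer; the function 
name, the `skipUnsupported` flag and the simplified N-Triples-like 
output (including the literal escaping) are all assumptions.

```typescript
// Hypothetical shapes; not the draft's DataSerializer interface.
type NodeKind = "iri" | "blank" | "literal";
interface RDFNode { kind: NodeKind; value: string }
interface Triple { subject: RDFNode; predicate: RDFNode; object: RDFNode }

// Simplified term syntax, loosely modelled on N-Triples.
function term(n: RDFNode): string {
  if (n.kind === "iri") { return "<" + n.value + ">"; }
  if (n.kind === "blank") { return "_:" + n.value; }
  return JSON.stringify(n.value); // simplified escaping, not spec-exact
}

// Either throw on a triple the format cannot express, or skip it.
function serialize(triples: Triple[], skipUnsupported: boolean): string {
  const lines: string[] = [];
  for (const t of triples) {
    const ok = t.subject.kind !== "literal" && t.predicate.kind === "iri";
    if (!ok) {
      if (skipUnsupported) { continue; }
      throw new Error("Triple cannot be serialised in this format");
    }
    lines.push(term(t.subject) + " " + term(t.predicate) + " " + term(t.object) + " .");
  }
  return lines.join("\n");
}

const s: RDFNode = { kind: "iri", value: "http://example.org/s" };
const p: RDFNode = { kind: "iri", value: "http://example.org/p" };
const o: RDFNode = { kind: "literal", value: "hello" };

const good: Triple = { subject: s, predicate: p, object: o };
const bad: Triple = { subject: o, predicate: p, object: s };

// Skip mode drops the literal-subject triple and keeps the other.
const output = serialize([good, bad], true);
```

Without the skip flag the same call throws on the second triple, so the 
caller has to decide up front which failure mode they want.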

> Other high-level changes unrelated to Notation 3 that should be made:
> 
> 1. Literals are currently allowed to have both a datatype and a
> language. This goes beyond even the Notation 3 data model, so I do not
> see what is gained by allowing this in the RDF API. Literals should be
> forced to pick a camp.

Simply, because we may move to a place where both datatype and language 
are used together on literals. It's a wording constraint regardless, as 
the interface must have both properties on it, and if the constraint 
were added it'd be almost impossible to actually implement (more 
accurately, possible with ECMAScript 5, but rarely implemented).
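
For reference, the ECMAScript 5 route would presumably be accessor 
properties via Object.defineProperty. A rough TypeScript sketch under 
that assumption follows; the Literal shape and createLiteral are 
hypothetical, not taken from the draft.

```typescript
// Hypothetical literal shape; both properties exist on the interface.
interface Literal {
  value: string;
  datatype: string | null;
  language: string | null;
}

// Enforce "datatype or language, never both" with ES5 accessors.
function createLiteral(value: string): Literal {
  let datatype: string | null = null;
  let language: string | null = null;
  const literal = { value: value } as Literal;

  Object.defineProperty(literal, "datatype", {
    enumerable: true,
    get: function () { return datatype; },
    set: function (v: string | null) {
      if (v !== null && language !== null) {
        throw new Error("Literal already has a language");
      }
      datatype = v;
    }
  });

  Object.defineProperty(literal, "language", {
    enumerable: true,
    get: function () { return language; },
    set: function (v: string | null) {
      if (v !== null && datatype !== null) {
        throw new Error("Literal already has a datatype");
      }
      language = v;
    }
  });

  return literal;
}

const greeting = createLiteral("hello");
greeting.language = "en";
```

After this, assigning greeting.datatype throws until language is reset 
to null. Without accessors, plain writable properties cannot reject the 
second assignment, which is why the constraint would end up being 
wording only.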

> 2. Drop special support for rdf:PlainLiteral. rdf:PlainLiteral is not
> used much in the wild and was never intended to be - it's a datatype
> used internally by OWL 2 and RIF. Adding special support for it
> complicates implementations for no reason - it should be treated the
> same as any other custom datatype.

Indeed, I didn't think there was special support for rdf:PlainLiteral, 
just a tertiary note added to the spec to be included or removed should 
it be needed - again just a potential use case to have covered should it 
be needed.

> 3. Move terms, prefixes and profiles into the RDFa API or, better yet,
> drop them altogether. They make the API harder to implement and harder
> to grok while adding very little benefit. For those people who want to
> use prefixes and terms, they can be implemented quite trivially in "user
> space".
> 
> See:
> http://lists.w3.org/Archives/Public/public-rdfa-wg/2010Oct/0131.html

Profiles are required by RDFa processors (default profile etc), may well 
be adopted by the RDF WG for turtle and RDF, hence the specification - 
generally the interfaces will need to be supported by one of the two 
APIs regardless, can't just be dropped afaict. (RDFa processors require 
Profile support, RDFa API will require terms and prefixes to be 
supported in order to interact with an RDFa document via the DOM, again 
afaict)

> 4. Move or copy PropertyGroups from the RDFa API into the RDF API. The
> Graph interface should have getItemsByType, getItemBySubject and
> getItemsByProperty methods which return PropertyGroups or arrays of
> them.
> 
> 5. Rename PropertyGroup to something like "RDFItem", "RDFResource" or
> "Description", or indeed "FartWarbler" - any name you pick out of a hat
> will be better than the status quo.

Agree re naming, as for 4, highly debatable for multiple reasons, and 
probably requires a discussion all by itself.

Best,

Nathan

Received on Tuesday, 11 January 2011 15:10:55 UTC