Re: Deprecate http://www.w3.org/1999/02/22-rdf-syntax-ns# in favour of /ns/rdf# ?? from Dan Brickley on 2013-12-02 (semantic-web@w3.org from December 2013)

From: Dan Brickley <danbri@danbri.org>
Date: Mon, 2 Dec 2013 14:08:22 +0000
To: Christophe Guéret <christophe.gueret@dans.knaw.nl>
Cc: Phil Archer <phila@w3.org>, Richard Cyganiak <richard@cyganiak.de>, Charles McCathie Nevile <chaals@yandex-team.ru>, Ruben Verborgh <ruben.verborgh@ugent.be>, SW-forum Web <semantic-web@w3.org>, "team-rdf-chairs@w3.org" <team-rdf-chairs@w3.org>
Message-ID: <CAFfrAFqdJtY_-WJCgv3rKREeXD3AoqCuYdN=LMCS_FUzG4n7Lg@mail.gmail.com>
On 2 December 2013 11:33, Christophe Guéret
<christophe.gueret@dans.knaw.nl> wrote:

> Hi Dan,
>
> That's indeed a possible scenario but I see no need for merging RDF and
> RDFS, and trigger big changes.
>
> "Say W3C announces that rdfs and rdfs will "be considered equivalent to"
>> the URI http://w3c.example.org/rdfcore#"
>>
> In fact, we can take a much more conservative approach:
> * "http://www.w3.org/1999/02/22-rdf-syntax-ns#" deprecated in favour of "
> http://www.w3.org/ns/rdf#"
> * "http://www.w3.org/2000/01/rdf-schema#" deprecated in favour of "
> http://www.w3.org/ns/rdfs#"
>
> With redirections and links. This would align RDF and RDFS with other
> vocabularies such as PROV (http://www.w3.org/ns/prov#) and let the W3C
> provide a clean and consistent offer in terms of vocabularies: all of them
> are under http://www.w3.org/ns with specific sub-namespace for every
> vocabulary.
>
>
> "heated debate during several ESWC 2014 linked data events on whether the
>> new namespace should have used # or /"
>>
> Well, I see that point being debated since I started working on SemWeb
> technologies and it does not seem it will ever really go away...
> Furthermore, both RDF and RDFS use of # could already be debated and won't
> be affected by the switch to a new namespace
>
> "The WG charter MUST address the rdf:/rdfs: version handling issue"
>>
> So far, there is no much version handling done on vocabularies. Whatever
> namespace is used just serves the most up to date version, precisely
> because people don't want to rewrite their triples. I don't see why/how
> changing the namespace would suddenly make version handling a MUST in all
> the documents. (Not saying version handling is not relevant but I don't
> think that's the point here)
>

Let me try to rephrase the concern. If all we want is to update W3C's
namespace documents, this is easy. If we want the new URIs to actually be
used and to be useful, we're in for a world of frustration.

There are a lot of interacting pieces in the ecosystem around RDF, and this
is by design: we made these standards so that data can flow between
applications. If we want to upgrade this network to understand a new
equivalence between e.g. <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
and <http://www.w3.org/ns/rdf#type> we'll need to say where in the
information flow we expect this to happen. Parsers, APIs, databases, all
have the potential to do something, or to do nothing, and will be
interacting with other components which may also be ignoring or acting on
the equivalence information. Often an application does not have total
control of the information that is passed to it, or the nature of the
storage and other components it uses. In this semi-structured chaos,
introducing URI aliases is pretty difficult since it is not clear whose job
it is to deal with them.

Broadly we can talk about publication and consumption roles, but even those
are complex when you try to break down who the actors and decision makers
are.

Consider someone building on Drupal 8. If D8 uses RDFa, then certain pieces
of RDFa 1.1 / Lite syntax (notably @typeof) will generate
<http://www.w3.org/1999/02/22-rdf-syntax-ns#>-based URIs in all compliant
RDFa 1.1 parsers, regardless of what the publisher, the Drupal team, or
Drupal extension and theme authors think or do. However, other markup
idioms are under the control of the site publisher, the Drupal software
team, or other players (themes/addons/etc.), i.e. someone gets to choose
whether the site writes <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
vs <http://www.w3.org/ns/rdf#type> etc.

People in the position to be making the decision whether to use the old,
new, or both URIs for rdf:type will try to take into account the likely
behaviour(s) of RDF consumers: whether parsers will pass on the data in
2012/2013 style, try to modernise it by turning
<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> into
<http://www.w3.org/ns/rdf#type>, or double up the data; or whether
rule/inference-capable systems are starting to ship with a built-in option
to understand both versions of rdf:type transparently. Some of the people
we put in the position of having to choose the old rdf: ns versus the new
rdf: ns will be choosing on behalf of downstream users that they don't know
much about.
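
To make the three consumer behaviours above concrete, here is a rough
sketch in plain Python, with triples modelled as tuples of strings. It is
only an illustration: the function names are made up for this example,
http://www.w3.org/ns/rdf# is the namespace proposed in this thread rather
than anything W3C has published, and a real parser would have to tell URIs
apart from literals before rewriting anything.

```python
OLD_RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
NEW_RDF = "http://www.w3.org/ns/rdf#"  # assumption: proposed in this thread only


def swap_namespace(term, source, target):
    """Rewrite a term from one namespace to another; leave other terms alone."""
    if term.startswith(source):
        return target + term[len(source):]
    return term


def pass_through(triples):
    """Hand the data on exactly as it was published."""
    return set(triples)


def modernise(triples):
    """Rewrite every old-namespace term to the proposed new namespace."""
    return {
        tuple(swap_namespace(term, OLD_RDF, NEW_RDF) for term in triple)
        for triple in triples
    }


def double_up(triples):
    """Keep the published triples and add modernised copies alongside them."""
    return set(triples) | modernise(triples)


# One triple, as an RDFa parser would emit it for @typeof today.
page = {("http://example.org/doc", OLD_RDF + "type", "http://example.org/Article")}

for policy in (pass_through, modernise, double_up):
    print(policy.__name__, sorted(policy(page)))
```

A publisher cannot tell from the outside which of these three a given
consumer applies, which is exactly what makes the choice of URI hard.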

Generally, if word gets around that lots of important tools can deal
smartly with either, we might see more publishers start to use
<http://www.w3.org/ns/rdf#type> instead of
<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> when they're in a
position to choose the exact URI. When they're using W3C's existing
shorthands for these (such as RDFa's @typeof), the decision is still out of
their control. This might start to put pressure on RDFa implementors to
generate the new flavour of triples for rdf:type instead of the old, which
in turn puts pressure on W3C to update the specs, and that rarely happens
very quickly.

Meanwhile, what about people using RDF APIs and storage systems. Consider a
Python application that uses e.g. rdflib 2.4.2 to write data into (and
later retrieve from) a remote SPARQL store, where the remote store could be
based on any version of Virtuoso, ARC2, Jena or Sesame. The data transfer
could be via toolkit-specific APIs or via
http://www.w3.org/TR/sparql11-update/. The exact nature of the data being
managed is not under the control of the application author (e.g. it might
partially come from XMP extractions from PDF, or from RDFa parsing, etc.),
and so might contain various triples using either
<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> or
<http://www.w3.org/ns/rdf#type> or both. Should the Python app expect to be
able to get back from these databases the exact set of triples it inserted?
Is it reasonable for the databases to say "that's a matter for the
application, we're not messing with the internals of user data", or "we'll
normalize all data we see to modern form"? Whose job is it to implement
the equivalences between old-rdf:type and new-rdf:type? If old-style or
partially old-style data is transmitted via programmatic or REST/SPARQL
API, should the sender or the receiver be normalizing it? Should data
access mechanisms - APIs, SPARQL or other query languages e.g. Gremlin -
reflect the actual form of data, or its idealized/modernized/normalized
form?
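
The round-trip question can be shown with a toy example. This is a sketch
under loud assumptions: ToyStore is a hypothetical in-memory stand-in, not
the API of rdflib, Virtuoso, ARC2, Jena or Sesame, and the new rdf:type URI
is the one proposed in this thread.

```python
OLD_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
NEW_TYPE = "http://www.w3.org/ns/rdf#type"  # assumption: proposed in this thread only


class ToyStore:
    """Hypothetical in-memory stand-in for a remote triple store."""

    def __init__(self, normalise):
        self.normalise = normalise  # True: rewrite old rdf:type on insert
        self.triples = set()

    def insert(self, triples):
        for s, p, o in triples:
            if self.normalise and p == OLD_TYPE:
                p = NEW_TYPE
            self.triples.add((s, p, o))

    def dump(self):
        return set(self.triples)


# Mixed data, as might arrive from XMP extraction plus RDFa parsing.
inserted = {
    ("http://example.org/a", OLD_TYPE, "http://example.org/Report"),
    ("http://example.org/b", NEW_TYPE, "http://example.org/Report"),
}

verbatim = ToyStore(normalise=False)
verbatim.insert(inserted)

rewriting = ToyStore(normalise=True)
rewriting.insert(inserted)

print("verbatim store round-trips: ", verbatim.dump() == inserted)
print("rewriting store round-trips:", rewriting.dump() == inserted)
```

The verbatim store hands back exactly what went in; the rewriting store
does not, so an application that checks for the triples it wrote will be
surprised unless it knows which kind of store it is talking to.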

As far as SPARQL is concerned,
http://www.w3.org/TR/sparql11-entailment/#entEffects seems relevant. You
can then think about querying the 'raw' data (for years, likely a mix of
old and new rdf: namespace URIs) or perhaps a virtual/abstract modernized
graph if the database exposes it as an entailment regime. But then you
have the problem of trying to second-guess what applications want, e.g.
which behaviour should be the default, what non-SPARQL data access methods
should do, etc.
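
If the store only exposes the raw graph, one client-side workaround is to
spell out the equivalence in every query. The sketch below builds such a
query string using a SPARQL 1.1 property path alternative; the helper name
is invented for illustration, the new URI is the one proposed in this
thread, and nothing here is sent to a real endpoint.

```python
OLD_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
NEW_TYPE = "http://www.w3.org/ns/rdf#type"  # assumption: proposed in this thread only


def instances_query(class_uri, accept_both=True):
    """Build a SPARQL query listing instances of class_uri.

    With accept_both=True the predicate is a property path alternative,
    so the query matches either spelling of rdf:type in the raw data.
    """
    if accept_both:
        predicate = f"<{OLD_TYPE}>|<{NEW_TYPE}>"
    else:
        predicate = f"<{OLD_TYPE}>"
    return f"SELECT DISTINCT ?s WHERE {{ ?s {predicate} <{class_uri}> }}"


print(instances_query("http://example.org/Report"))
```

This pushes the job onto every query author, which is the opposite of
having the database present a modernized graph, and shows why the choice of
default matters.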

I'm still finding it hard to be optimistic about the prospects. What's the
alternative? Maybe to try to find ways to make it easier for specialists to
remember at least the rough dates?

Maybe instead we could start to celebrate "RDF Day" as Feb 22nd 1999?

1999/02/22-rdf-syntax-ns ->
http://en.wikipedia.org/wiki/February_1999#1999_February_2

We could remember what else happened on that day?
http://articles.baltimoresun.com/1999/feb/22 ... and then help RDF
developers / publishers remember what was added via RDFS the following
January by associating it with "what happened after Y2K?". In RDFS's
namespace, from January 2000, W3C added subPropertyOf, subClassOf, Class,
range and domain to RDF(S). What was RDF before that? Something quite
minimal that had a very basic notion of structured data graphs using
properties (rdf:Property) and where some things had a relationship called
rdf:type to some other things, but which waited for RDFS before we said
"and the value of an rdf:type is an rdfs:Class; and the meaning of a
property can be described through its associations to classes (which form
hierarchies) and to other properties (which also form hierarchies), hence
rdfs:subClassOf, rdfs:subPropertyOf, rdfs:domain, rdfs:range". At some
point we still have to confess to rdf:resource being syntax and
rdfs:Resource being a type, and to rdf:datatype being syntax and
rdfs:Datatype being a type. But at least there's some rough symmetry there.

Dan
Received on Monday, 2 December 2013 14:08:49 UTC