Re: Proposal for clarification of RDF

 From: "Aaron Swartz" <me@aaronsw.com>

> > The current RDF Recommendation is almost impossible to
> > implement because the discipline of a DTD was not used.
>
> Would it be acceptable to you if some other form of rigorous
> documentation was used, such as a tree grammar, canonicalized
> BNF, or XSLT? The arguments you provide for DTDs do not seem so
> strong.

The following are the criteria I would use:
 - described using a well-known schema language which has been
   developed independently of the particular use intended, has been
   widely discussed, and is preferably standardized, preferably by ISO.
 - directly executable by a validator, with several implementations
 - widely used by practitioners in many different contexts.

The following I regard as wastes of time:
  - ad hoc or custom-made notation
  - non-executable schemas
  - used by only a niche community, e.g. academics in particular.

> > Furthermore, the advent of RDFS raises compatibility issues, in
> > that certain elements are used in RDFS, but are only general
> > names in RDF.
>
> I'm not quite sure what you mean by this, but it seems to be a
> misunderstanding of the RDF spec. RDF can take any type of
> element name and still generates triples -- RDFS simply uses
> this extensibility mechanism like any other vocabulary would.
> Why are there compatibility issues?

The point is that the RDF specification purports to define (among
other things) a transfer syntax for RDF. It even goes as far as
enumerating the elements.  But then, in another CR, suddenly
a whole lot of other elements appear in the same namespace.

The problem is the expectation (quite reasonable) that when
a specification gives names, these names are treated
as an exhaustive enumeration. If the extra rdf: elements in
RDFS represent all the possible rdf: elements, then it is
just a matter of bad layering;  that bad layering is a good enough
reason not to move RDFS to PR.

If it is the intent that the RDF/RDFS spec does not exhaustively
list all the elements in the rdf: namespace, then that should be
stated explicitly.
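
A rough sketch of why the two readings matter to an implementer. The
name list is an illustrative subset, not a transcription of either spec,
and `check_rdf_names` is a hypothetical helper, not part of any RDF tool:

```python
# Illustrative subset of rdf: element names; NOT a full list from the spec.
ENUMERATED = frozenset({"RDF", "Description", "Bag", "Seq", "Alt", "li"})


def check_rdf_names(local_names, namespace_is_closed):
    """Return the rdf: local names a reader would reject.

    namespace_is_closed=True models the "exhaustive enumeration" reading;
    False models an open namespace in which any name may appear.
    """
    if not namespace_is_closed:
        # Open reading: nothing can be rejected on the name alone.
        return []
    return sorted(name for name in local_names if name not in ENUMERATED)
```

Under the closed reading a made-up name such as `the` is rejected; under
the open reading the same document passes, which is why the spec needs
to say which reading is intended.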

To take an extreme example to show why this is a bad idea,
could I make up an element  rdf:the  and claim it means "the idea of 'the'
as used in the RDF specification"?   In other words, when RDFS takes
a term from the RDF spec, and makes it into an element name in the RDF
namespace, is this something which anyone is allowed to do,
or is it only the RDFS spec (or the RDF WG, or the W3C) which can do it?
If the rdf namespace is open like that, is it only defined terms, or any
nouns, or any words?

It is all very well to say "this is obvious: the specs from the W3C
collectively define what elements are in the rdf: namespace", but you
should not take the
extent to which I may be confused as an indication of my stupidity but
rather as an indication of the unsatisfactory state of things in the RDF
specs.  I have read a lot of specs in my time (and even drafted a few) and
the RDF specs are below standard.

> > It regularizes the use of namespaces and prefixes: when used on
> > an element which has an rdf: prefix, rdf attributes must _not_
> > themselves have a namespace prefix; when used on an element
> > which does not have an RDF prefix, the rdf attributes _must_
> > have a prefix.
>
> This goes against the WG's recent decision to always require a
> namespace prefix.

Sure, I was not aware of that when I posted: it changes several things, and
is a good step forward.  The WG should put out an RDF 2nd ed. as soon as
possible to alert people to this, as a matter of courtesy.

> > It promotes the elements named in RDFS but not RDF into
> > first-class citizens.
>
> Why would you want this? Isn't it at the expense of all the
> vocabularies which don't happen to be W3C RECs?
>
> > <!ENTITY % rdf-alt-syntax-atts
> > '        rdf:_1 CDATA #IMPLIED
>
> Your schema seems to only allow 8 rdf:_n attributes. Isn't this
> a serious limitation that is not in the spec?

There was an improperly delimited comment saying "add more as needed".

In any case, as I say above, any syntax that cannot be captured by a
standard schema should not be allowed.  No other markup language that I am
aware of uses something like _n.  It is so impractical.
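
A minimal sketch of the difficulty, assuming the ordinal attributes are
spelled rdf:_1, rdf:_2 and so on with no leading zero (the regular
expression is my assumption, not text from the spec). A DTD attribute
list has to name each attribute one by one, so it can only ever cover a
finite set; an open-ended family needs a pattern instead:

```python
import re

# What a finite declaration can express: eight names, as in the quoted remark.
DECLARED = {f"rdf:_{i}" for i in range(1, 9)}

# What the open-ended family needs: a pattern (assumed form, see lead-in).
ORDINAL = re.compile(r"rdf:_[1-9][0-9]*")


def accepted_by_finite_list(name):
    """True if the attribute name was declared individually."""
    return name in DECLARED


def accepted_by_pattern(name):
    """True if the attribute name matches the assumed rdf:_n pattern."""
    return ORDINAL.fullmatch(name) is not None
```

Here `rdf:_9` is accepted by the pattern but not by the finite list, and
adding more declarations only moves the cut-off.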

> Your DTD also seems to be extremely incomplete. I am no DTD
> expert, but there are a number of strange and wacky RDF
> syntactic constructs that you do not seem to cover, nor do you
> seem able to specify RDF's openness for all sorts of elements
> and attributes.
>
> So what are your thoughts on other forms of representation?

There are parameter entities in positions to allow other attributes.

The point is that an independent, standard, executable, mature schema
language (such as DTDs) forces a certain discipline. The specification writer
will always tend to leave things they regard as obvious tacit. Pity the poor
reader. An independent, standard, executable, mature schema language forces
the specification writer to express themselves in an unnatural way: this is
entirely good.

You would know that the reason "program proving" methodologies have
not succeeded is that one needs to debug the proof.  The more that
the program proof is executable, the more practical it is (e.g. assert(),
invariants, pre-condition/post-conditions, etc).   So merely writing some
ad hoc BNF is the worst kind of tokenism.
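
As a small illustration of what "executable" means here, the sketch
below states a pre-condition, an invariant and a post-condition as
assert() calls that are checked on every run. The function and its names
are made up for the example and have nothing to do with RDF itself:

```python
def merge_sorted(left, right):
    """Merge two sorted lists, checking the stated conditions as it goes."""
    # Pre-condition: both inputs are already sorted.
    assert all(x <= y for x, y in zip(left, left[1:])), "left not sorted"
    assert all(x <= y for x, y in zip(right, right[1:])), "right not sorted"

    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        # Invariant: merged holds exactly the items consumed so far.
        assert len(merged) == i + j
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])

    # Post-condition: the result is the sorted union of the inputs.
    assert merged == sorted(left + right)
    return merged
```

A violated condition fails immediately with an AssertionError, which is
the sense in which an executable check debugs itself where a paper-only
statement cannot.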

The point of a DTD is not to completely define a markup language.  (The
distinction between a document type declaration and a document type definition
is old and fundamental: a DTD is not a data model.) It is to specify to
some extent the representation constraints.  ANY is not bad, or some sign of
failure of modeling capability.


Cheers
Rick Jelliffe

Received on Friday, 22 June 2001 03:04:47 UTC