Re: Clarifying RDF #2: a Schematron schema

Rick Jelliffe wrote:


> I wonder how they wrote their RELAX schema (and also Michael's XML Schema
> schema).

Err, "they" means me. I simply:

1) sat down with a text editor, a copy of the RDF 1.0 M&S rec in one window
and a copy of the latest RELAXNG draft/tutorial in another window.
2) went through the M&S EBNF productions in section 6 in turn and converted
each production into RELAXNG.
3) used James Clark's validator against a few test cases and fixed the
patterns until I didn't get any errors.

This was the first RELAXNG schema I've written, and the first time I'd read
the spec or tutorial. I wanted to see if it was possible to write a tree
regular expression which captures the RDF grammar.

See http://www.openhealth.org/RDF/RDFSurfaceSyntax for a discussion of
suggested modifications of the RDF syntax (including aboutQ and resourceQ
attributes that accept QNames). Note that the schema at
http://www.openhealth.org/RDF/RDF.rng assumes a different 'rdf' namespace
(as it is a different syntax from RDF 1.0), hence test cases should be sure
to change the namespace URI attached to the 'rdf' prefix :-) The RDF 1.0
syntax schema is available at http://www.openhealth.org/RDF/RDF1.rng

> If they just rewrote my earlier DTD from 1999, they cannot be very good
> :-)

DTDs are quite inadequate for capturing RDF syntax in any meaningful way,
for a variety of reasons. The EBNF in the RDF 1.0 spec is also inadequate
because it does not capture the 'rules' of XML, i.e. that the prefix is not
important, that attribute order is not important, that empty elements and
elements with a start tag, no content and an end tag are considered
equivalent by the XML Infoset, and so on. Think of the RELAXNG schema as an
XML Infoset-aware pattern and this approach makes sense.
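
To make the Infoset point concrete, here is a small sketch in Python (my
choice of language for illustration, not anything the schema depends on).
It parses two serializations that differ only in prefix, attribute order and
empty-element form, and shows that they reduce to the same information. The
namespace URI, element name and the summary() helper are made up for the
example.

```python
import xml.etree.ElementTree as ET

# Two serializations that differ only in prefix, attribute order and
# empty-element form. The namespace URI is a hypothetical example.
DOC_A = '<a:Thing xmlns:a="http://example.org/ns#" p="1" q="2"/>'
DOC_B = '<b:Thing xmlns:b="http://example.org/ns#" q="2" p="1"></b:Thing>'


def summary(xml_text):
    """Reduce a single-element document to (expanded name, attributes, has content)."""
    elem = ET.fromstring(xml_text)
    # ElementTree reports the name as "{namespace-uri}local", so the prefix is gone.
    has_content = bool(elem.text) or len(elem) > 0
    # Attributes go into a dict, so their order does not affect comparison.
    return (elem.tag, dict(elem.attrib), has_content)


if __name__ == "__main__":
    print(summary(DOC_A) == summary(DOC_B))
```

A grammar written over character sequences has to spell out each of these
variations; a pattern written over the parsed tree never sees them.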

> And the issue of handling the dreaded abbreviated syntax looms: I think
> only Schematron could handle that (actually, the Schematron schema posted
> does not validate the abbreviated syntax thoroughly, due to my lack of
> time and patience with it.)
>
> I think both RELAX and XML Schemas should be good for modeling RDF, in
> that their abstraction mechanisms provide a type/tag distinction.  But
> they both share the same problem as DTDs: one needs to enumerate the
> element names explicitly in the schema for all intents and purposes.

Err, no. Not for RELAXNG.

You may wish to read the latest versions of the RELAXNG documents. I think
it is totally unfair to assert such a "type" vs. "pattern" distinction ...
are you saying this from a preconception you have, or from something written
in the documents?

For example, the pattern

<define name="foo">
    <element>
        <not>
            <nsName ns=""/>
        </not>
    </element>
</define>

matches _any_ element whose name is in a namespace, whatever that namespace
is, and does _not_ match elements which are not namespace qualified. Pretty
simple, isn't it?
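
For readers who do not have a validator handy, the name test that the
pattern above performs can be sketched in a few lines of Python. This is
only an illustration of the name class (any name except those in the empty
namespace), not of how a RELAXNG validator is implemented, and it ignores
element content entirely. The function name and the example URIs are
hypothetical.

```python
import xml.etree.ElementTree as ET


def matches_foo_name(elem):
    """Name test only: True for any element whose name has a namespace URI."""
    # ElementTree writes qualified names as "{uri}local"; a name in no
    # namespace is just "local", so it never starts with "{".
    return elem.tag.startswith("{")


if __name__ == "__main__":
    qualified = ET.fromstring('<p:item xmlns:p="http://example.org/a"/>')
    unqualified = ET.fromstring("<item/>")
    print(matches_foo_name(qualified), matches_foo_name(unqualified))
```

Note that no element name is enumerated anywhere: the test is on the
namespace alone.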

> An architecture like RDF which has type implication ("if this element has
> a child, it must be this thing regardless of its name") flies in the face
> of conventional markup practise (but it is the kind of pattern that I
> think crops up regularly, and it is one reason why Schematron has patterns
> as its abstraction not elements or "types".)
>

To me, a regular expression is a "pattern" which is why I was interested to
see if a tree regular expression can express XML patterns (i.e. RDF). With
the sole exception of the ability to constrain element or attribute _names_
to the pattern "rdf:_1 ... rdf:_n" it seems to work surprisingly well.
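
The one constraint the schema cannot express is easy enough to check outside
the schema. A minimal sketch in Python, assuming names arrive in the
"{namespace-uri}local" form used by ElementTree, and assuming that n is a
positive integer written without leading zeros (my reading, not something
the schema or this message settles). The function name is hypothetical, and
the namespace constant is the RDF 1.0 one, so it would need changing for the
modified syntax.

```python
import re

# RDF 1.0 namespace; the modified syntax discussed here uses a different URI.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Assumption: "_n" means an underscore followed by a positive integer
# with no leading zeros, i.e. _1, _2, ..., _10, ...
_MEMBER_LOCAL_NAME = re.compile(r"_[1-9][0-9]*")


def is_container_member(expanded_name):
    """True if expanded_name is {RDF_NS}_n for some positive integer n."""
    prefix = "{" + RDF_NS + "}"
    if not expanded_name.startswith(prefix):
        return False
    local_name = expanded_name[len(prefix):]
    return _MEMBER_LOCAL_NAME.fullmatch(local_name) is not None


if __name__ == "__main__":
    for name in ("{%s}_1" % RDF_NS, "{%s}li" % RDF_NS, "{http://example.org/}_1"):
        print(name, is_container_member(name))
```

Whether a check like this belongs in a validator at all depends on the
answer to the question below.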

The question for the WG, which is why I will cc: rdf-comments, is what the
role of such constraints on these names ought to be. For example, a
container can have properties other than those named _n. This is an issue
still under discussion. In the meantime, test cases that require such a
constraint will fail under this schema.

Another reason I like RELAXNG is that its authors (James Clark and the TC)
have produced a formal semantics for the language, which I suspect may help
in the formal definition of the RDF language.

-Jonathan

Received on Wednesday, 20 June 2001 09:49:30 UTC