Components, concepts and URIs: The case for abstraction wrt SCDs from Henry S. Thompson on 2005-06-20 (www-tag@w3.org from June 2005)

From: Henry S. Thompson <ht@inf.ed.ac.uk>
Date: Mon, 20 Jun 2005 17:34:08 +0100
To: Schema Interest Group <w3c-xml-schema-ig@w3.org>
Message-ID: <f5bmzplq96n.fsf@erasmus.inf.ed.ac.uk>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

[This note is to discharge my action [1] from the recent TAG f2f,
which says

 "HT to reflect TAG discussion on (namespaceName, sort, localName)
  problem space, as compared to schema component naming problem space,
  to XML Schema WG."]

What does this mean?  By way of an illuminating (I hope) parallel,
consider the once much-discussed notion of "The (X)HTML 'P'
element".  There is a feeling that, subject to only moderate
assumptions about good behaviour on the part of the HTML WG, this is a
coherent concept independent of the details of its definition, at
either the syntactic (DTD/Schema) or semantic (box model, etc.) level.

Roughly speaking, 'P' is a name in the XHTML namespace with a certain
use, definition, set of properties, etc., which has evolved over time,
but which has a conceptual unity of identity nonetheless.

It's _that_ level of thing which the SWBP WG want simple (barename)
names for.  This is _not_ what the SCD effort is aiming at providing
names for, hence the disconnect in our discussions with Dan Connolly
wrt his feedback [2] on the Last Call SCD draft [3].

Once this distinction is clarified, it then becomes appropriate to
discuss how one might provide URIs for what amount to sorted expanded
names, that is for points in the space of possible

 < namespaceName-or-null, sort, localName >

triples.

By 'sort' and 'sorted' here I am (ad/re)verting to terminology the
Schema WG has sometimes used wrt the varieties of schema components,
i.e. we have 33 different *sorts* of components (14 structural
components, 4 informative facets and 15 constraining facets).

Of these, only 8 are available by name at the top level of schemas:

 Attribute Declaration
 Attribute Group Definition
 Complex Type Definition
 Element Declaration
 Identity-Constraint Definition
 Model Group Definition
 Notation Declaration
 Simple Type Definition

But it's not clear that those are at _quite_ the right level -- they
are, after all, W3C XML Schema-specific.  It seems we need to look
more at the kinds of things there are (so far) in our conception of an
XML language, or indeed of anything which might have (sorted) expanded
names wrt a namespace, so that our sorts are more like

  Element
  Attribute
  Type
  Identity-Constraint
  . . .?

The crucial point is that there would be no implication of a
unique/stable/invariant mapping from such a sorted expanded name to
_any_ formal definition of syntax (e.g. in a W3C XML Schema or a BNF
or . . .) or semantics (e.g. in a spec. or an RDF graph).

Broadly speaking, there seem to be two ways to go.  All of the
following are attempts to name the 'output' element in the XSLT
namespace ('sexn' stands for "sorted expanded name"):

 1) Build on top of http: URIs, using the fragment identifier and/or
    path components:

  a)   http://www.w3.org/1999/XSL/Transform#element_output
  b)   http://www.w3.org/1999/XSL/Transform#element(output)
  c)   http://www.w3.org/1999/XSL/Transform#sexn(element::output)
  d)   http://www.w3.org/1999/XSL/Transform/element/#output

 2) Follow the model of the Orchard/Salz QName URN proposal [4]
    and define a new URN namespace:

       urn:sexn:element:output:http://www.w3.org/1999/XSL/Transform
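
To make the five spellings concrete, here is a minimal sketch that
derives each of them from a single triple.  The names Sexn and
candidate_uris are hypothetical, invented for this illustration; the
sketch assumes the name is in a namespace and says nothing about media
types or retrievable representations.

```python
from typing import Dict, NamedTuple


class Sexn(NamedTuple):
    """A sorted expanded name, restricted here to names in a namespace."""
    namespace: str
    sort: str
    local_name: str


def candidate_uris(name: Sexn) -> Dict[str, str]:
    """Spell one sorted expanded name in each of the forms (1a)-(1d) and (2)."""
    ns, sort, local = name
    return {
        "1a": f"{ns}#{sort}_{local}",          # sort and name fused in a barename
        "1b": f"{ns}#{sort}({local})",         # one XPointer scheme per sort
        "1c": f"{ns}#sexn({sort}::{local})",   # a single 'sexn' XPointer scheme
        "1d": f"{ns}/{sort}/#{local}",         # sort as an extra path component
        "2": f"urn:sexn:{sort}:{local}:{ns}",  # new URN namespace
    }


xslt_output = Sexn("http://www.w3.org/1999/XSL/Transform", "element", "output")
for label, uri in candidate_uris(xslt_output).items():
    print(f"({label}) {uri}")
```

Only the string shapes are taken from the list above; nothing here
implies that any of these URIs is registered or dereferenceable.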

All of the sub-cases of (1) require further details about media types,
conventions wrt retrievable representations, etc.

The lowest overhead would be (1c) -- a single new XPointer scheme
would have to be registered.  Some mechanism would however be needed
for managing the space of allowed sort names.

(1b) requires a bit more overhead, in that every sort has to be a
registered XPointer scheme, but that also removes the need for an
additional sort registry.

I _think_ (1a) and (1d) would require more work as regards media
types/retrievable representations etc.

AWWW recommendations [5] mean we should consider something like (2) only
if no variant of (1) is acceptable.

Only (2) straightforwardly covers the case of sorted names in no
namespace.
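
One way to see why: every variant of (1) hangs the sort and local name
off the namespace URI, so with no namespace there is nothing to attach
them to, whereas the URN form carries the namespace name as just one
more field.  The sketch below parses the URN form back into a triple.
The function name parse_sexn_urn is hypothetical, and the convention
that an empty final field means "no namespace" is an assumption made
purely for illustration; it is not taken from [4].

```python
from typing import Optional, Tuple


def parse_sexn_urn(urn: str) -> Tuple[Optional[str], str, str]:
    """Split urn:sexn:<sort>:<localName>:<namespaceName> into a triple.

    The namespace name may itself contain colons, so split at most
    four times and keep the remainder whole.
    ASSUMPTION: an empty final field stands for "no namespace".
    """
    parts = urn.split(":", 4)
    if len(parts) != 5 or [p.lower() for p in parts[:2]] != ["urn", "sexn"]:
        raise ValueError(f"not a sexn URN: {urn!r}")
    _, _, sort, local_name, namespace = parts
    # Return in the order used for the triple: namespace-or-null, sort, name.
    return (namespace or None, sort, local_name)


print(parse_sexn_urn("urn:sexn:element:output:http://www.w3.org/1999/XSL/Transform"))
print(parse_sexn_urn("urn:sexn:element:output:"))
```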

Hope this helps get the conversation started.

ht

P.S. Noah had a piece of this [aA]ction too -- I'm sure he can and will
speak for himself, he's certainly not responsible for my introduction
above.

[1] http://www.w3.org/2005/06/16-tagmem-minutes.html#action04
[2] http://lists.w3.org/Archives/Public/www-xml-schema-comments/2005JanMar/0080.html
[3] http://www.w3.org/TR/xmlschema-ref/
[4] http://www.ietf.org/internet-drafts/draft-rsalz-qname-urn-00.txt
[5] http://www.w3.org/TR/webarch/#pr-reuse-uri-schemes
- -- 
 Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
                     Half-time member of W3C Team
    2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
            Fax: (44) 131 650-4587, e-mail: ht@inf.ed.ac.uk
                   URL: http://www.ltg.ed.ac.uk/~ht/
[mail really from me _always_ has this .sig -- mail without it is forged spam]
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)

iD8DBQFCtvAFkjnJixAXWBoRAtyjAJ46vrLyrI35/STUJxi3iR01ti9nAACfV/0O
pMeoapvyWaj3ICbC6Z/VvwI=
=XoV3
-----END PGP SIGNATURE-----
Received on Monday, 20 June 2005 16:34:27 UTC