Re: XML catalog draft

Paul Grosso wrote:

>> Why is # not included in SpecialChars? 
>RestrictedLiteralChars is a synonym for 8879's "minimum data character"
>(production 78).  SpecialChars is a synonym for 8879's "Special" character
>class which 8879 defines to be the characters shown for SpecialChars above
>and which does not include the # character.
>In other words, if we allowed # in SpecialChars, an XML PublicID would
>not be a valid 8879 minimum literal and hence not a valid 8879 public id.

I knew this, but the question I wanted to raise is whether there is a real
need to restrict XML public IDs to the same rules as SGML ones. I'm not
convinced there is, despite my strong allegiance to SGML. My reasons are
stated below.

>>                                          (It might be nice to use the URL
>> fragment identifier as part of a public ID in some cases, even though this
>> might lead to incompatibilities with SGML name rules, which were done before
>> URLs became popular.) 
>It seems to me that a URL makes more sense as a system identifier than
>a public identifier.  You could use a public identifier and map it into
>a URL via the catalog, but I wouldn't say something with a URL
>fragment identifier needs to be able to be a public ID.

Let me give an example of why I think there is a case for # in XML public IDs.

My material on ODA, SGML and XML all sits in a single file. I could give
this file an ISO 9070 conformant public ID of
+//EU::ECHO::IM::INFO2000::OII//TEXT Document Standards//EN

This works fine in a catalog that is simply a catalog. The problem I have is
in reusing this information as part of an XML XLG link group. Here I need to
refer to a lower level, to the fragment IDs for ODA, SGML and XML. Now I
could do something like: 
+//EU::ECHO::IM::INFO2000::OII//TEXT Document Standards::ODA//EN
+//EU::ECHO::IM::INFO2000::OII//TEXT Document Standards::SGML//EN
+//EU::ECHO::IM::INFO2000::OII//TEXT Document Standards::XML//EN

OK, I'm still kosher SGML here, but the mapping between system IDs and
public IDs would be easier if I could adopt the following alternative names:

+//EU::ECHO::IM::INFO2000::OII//TEXT docstand.html#ODA//EN
+//EU::ECHO::IM::INFO2000::OII//TEXT docstand.html#SGML//EN
+//EU::ECHO::IM::INFO2000::OII//TEXT docstand.html#XML//EN

This form would allow automatic translation of my existing anchors into
public IDs, without having to add a manual intervention stage to define the
name that the file is to be known as. It is with this in mind that I asked
that we think carefully as to whether there would be advantages in allowing
# to be used in XML public IDs. (I have also asked that it be added to the
SGML minimum data character set when SGML is revised.)
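
To make the idea concrete, here is a rough sketch of the kind of automatic
translation I have in mind, written in Python purely for illustration. The
function names and the fixed owner prefix and language suffix are my own
assumptions, not part of any draft. The character check covers only the
8879 minimum data characters (letters, digits, space, RS, RE and the
Special class); it does not attempt to validate ISO 9070 structure.

```python
import string
from urllib.parse import urldefrag

# ISO 8879 "Special" character class; note that '#' is absent.
SPECIAL_CHARS = "'()+,-./:=?"

# Minimum data characters: letters, digits, Special, space, RS and RE.
MINIMUM_DATA_CHARS = set(
    string.ascii_letters + string.digits + SPECIAL_CHARS + " \r\n"
)

# Hypothetical fixed parts, taken from the example names in this message.
OWNER_PREFIX = "+//EU::ECHO::IM::INFO2000::OII//TEXT "
LANGUAGE_SUFFIX = "//EN"


def illegal_chars(public_id, extra_allowed=""):
    """Return the sorted characters that fall outside the permitted set."""
    allowed = MINIMUM_DATA_CHARS | set(extra_allowed)
    return sorted({ch for ch in public_id if ch not in allowed})


def anchor_to_public_id(system_id):
    """Build a public ID from a system ID such as 'docstand.html#ODA'."""
    path, fragment = urldefrag(system_id)
    filename = path.rsplit("/", 1)[-1]  # keep only the file name
    name = f"{filename}#{fragment}" if fragment else filename
    return f"{OWNER_PREFIX}{name}{LANGUAGE_SUFFIX}"


if __name__ == "__main__":
    for anchor in ("docstand.html#ODA", "docstand.html#SGML", "docstand.html#XML"):
        public_id = anchor_to_public_id(anchor)
        # Under the current rules only '#' is rejected; allowing it clears the list.
        print(public_id, illegal_chars(public_id), illegal_chars(public_id, "#"))
```

Under the rules as drafted, each generated name is rejected solely because
of the # character; passing "#" as an extra permitted character is the only
change needed for all three to be accepted.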

Martin Bryan, The SGML Centre, Churchdown, Glos. GL3 2PU, UK 
Phone/Fax: +44 1452 714029   WWW home page: http://www.u-net.com/~sgml/