
Re: XML catalog draft



Paul Grosso wrote:

>> Why is # not included in SpecialChars? 
>
>RestrictedLiteralChars is a synonym for 8879's "minimum data character"
>(production 78).  SpecialChars is a synonym for 8879's "Special" character
>class which 8879 defines to be the characters shown for SpecialChars above
>and which does not include the # character.
>
>In other words, if we allowed # in SpecialChars, an XML PublicID would
>not be a valid 8879 minimum literal and hence not a valid 8879 public id.
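
For anyone following along, the restriction described above amounts to a character-class check. Here is a rough Python sketch of it, assuming the Special class is exactly ' ( ) + , - . / : = ? and that the record start/end characters can be modelled as newline and carriage return; the function names are mine, not from any specification.

```python
import string

# ISO 8879 "Special" character class (note: no '#').
SPECIAL_CHARS = set("'()+,-./:=?")

# Minimum data characters: letters, digits, Special, space, and the
# record start/end characters (modelled as "\n" and "\r", an assumption).
MINIMUM_DATA = (
    set(string.ascii_letters + string.digits) | SPECIAL_CHARS | set(" \n\r")
)


def illegal_chars(public_id):
    """Return the characters that are not minimum data, in order of first use."""
    seen = []
    for ch in public_id:
        if ch not in MINIMUM_DATA and ch not in seen:
            seen.append(ch)
    return seen


def is_minimum_literal(public_id):
    """True if every character of public_id is a minimum data character."""
    return not illegal_chars(public_id)
```

With this check, a public ID built only from letters, digits, spaces and the characters + / : passes, while any ID containing # is reported as illegal.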

I knew this, but the question I wanted to raise is whether there is a real
need to restrict XML public IDs to the same rules as SGML ones. I'm not
convinced there is, despite my strong allegiance to SGML. My reasons are
stated below.

>>                                          (It might be nice to use the URL
>> fragment identifier as part of a public ID in some cases, even though this
>> might lead to incompatibilities with SGML name rules, which were done before
>> URLs became popular.) 
>
>It seems to me that a URL makes more sense as a system identifier than
>a public identifier.  You could use a public identifier and map it into
>a URL via the catalog, but I wouldn't say something with a URL
>fragment identifier needs to be able to be a public ID.

Let me give an example of why I think there is a case for # in XML public IDs.

My material on ODA, SGML and XML all sits in a single file. I could give
this file an ISO 9070 conformant public id of
+//EU::ECHO::IM::INFO2000::OII//TEXT Document Standards//EN

This works fine in a catalog that is simply a catalog. The problem I have is
in reusing this information as part of an XML XLG link group. Here I need to
refer to a lower level, to the fragment IDs for ODA, SGML and XML. Now I
could do something like: 
+//EU::ECHO::IM::INFO2000::OII//TEXT Document Standards::ODA//EN
+//EU::ECHO::IM::INFO2000::OII//TEXT Document Standards::SGML//EN
+//EU::ECHO::IM::INFO2000::OII//TEXT Document Standards::XML//EN

OK, I'm still kosher SGML here, but the mapping between system IDs and
public IDs would be easier if I could adopt the following alternative names:

+//EU::ECHO::IM::INFO2000::OII//TEXT docstand.html#ODA//EN
+//EU::ECHO::IM::INFO2000::OII//TEXT docstand.html#SGML//EN
+//EU::ECHO::IM::INFO2000::OII//TEXT docstand.html#XML//EN

This form would allow automatic translation of my existing anchors into
public IDs, without having to add a manual intervention stage to define the
name that the file is to be known as. It is with this in mind that I asked
that we think carefully as to whether there would be advantages in allowing
# to be used in XML public IDs. (I have also asked that it be added to the
SGML minimum data character set when SGML is revised.)
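
To make the "automatic translation" concrete, here is a minimal Python sketch of what I have in mind. It assumes the owner identifier, text class and language are fixed for the whole file, and that catalog entries take the form PUBLIC "public id" "system id"; the function names and default parameters are illustrative only.

```python
from urllib.parse import urldefrag

# Owner identifier taken from the example public IDs in this message.
OWNER = "+//EU::ECHO::IM::INFO2000::OII"


def public_id_from_url(url, owner=OWNER, text_class="TEXT", language="EN"):
    """Build a public ID whose description is the file name plus '#fragment'."""
    base, fragment = urldefrag(url)          # split off the fragment identifier
    filename = base.rsplit("/", 1)[-1]       # keep only the last path segment
    description = f"{filename}#{fragment}" if fragment else filename
    return f"{owner}//{text_class} {description}//{language}"


def catalog_entry(url):
    """Return one catalog line mapping the derived public ID back to the URL."""
    return f'PUBLIC "{public_id_from_url(url)}" "{url}"'


if __name__ == "__main__":
    for anchor in ("ODA", "SGML", "XML"):
        print(catalog_entry(f"docstand.html#{anchor}"))
```

Each existing anchor yields its public ID and catalog line mechanically, with no hand-assigned names; the only obstacle is that the resulting IDs contain #.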
----
Martin Bryan, The SGML Centre, Churchdown, Glos. GL3 2PU, UK 
Phone/Fax: +44 1452 714029   WWW home page: http://www.u-net.com/~sgml/