Re: The furore over PUBLIC

>   1 fixed encoding:  everything in UTF-8 (alternate form:  everything
>     in UCS-2)
>   2 anarchy:  no requirements, processors do what they like
>   3 required minimum which all processors must support; nothing to
>     prevent them from supporting more.  If I know that *the processor(s)
>     I care about* support the encoding I like, I can use it.  If I
>     care about having *all* processors be able to read my docs, I
>     know what to do, and I can ensure that all processors can handle
>     my docs.
> 
> Having a public identifier without a resolution mechanism is very
> like having no rules at all about character encoding.  Nothing to
> prevent you from doing what you like in the processor you write
> yourself, and nothing to prevent your *having* to write your own,
> because nothing you can count on being present in every conforming
> processor.

I think that most of us advocate a system much like 3, with a simplified
catalog -- exactly analogous to our character encoding decision.

> What I don't understand is this.  How would this picture change if
> PUBLIC were legal in XML, without a standard resolution mechanism?
> You'd still have to write all your own software, you could still share
> it with friends, your friends would still be able to use it only if they
> agreed with you that searching Altavista was an OK method of resolution.

There are a few things that would change. 

First, our documents would be valid while we wait for a public identifier
mechanism to shake out. Validity in the absence of interoperability still 
allows us to get our work done: we can use standard parsers and editors, but
might have to supplement our public identifiers with system identifiers in
the short term.

Second, I would expect that standardized mechanism to shake out almost
immediately: SOCATs (SGML Open catalogs) exist and work.  Whether we require
them or not, software will support them, *if* public identifiers are legal in
the standard. Many SGML products already support SOCATs and would continue to
if public identifiers were legal. OTOH, if they are illegal, software will
have a good excuse to ignore catalogs, and vendors may even remove SOCAT
support in order not to promote proprietary "extensions."

Third, there would be a natural, legal place to "slide in" various resolution
mechanisms in the short term and when URNs become "real". Our documents
would be able to take advantage of those automatically whereas URL-only 
documents would have to be manually "upgraded" to take advantage of URNs.
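
To make the "slide in" point concrete, here is a rough Python sketch of a
resolver chain. Everything in it is hypothetical: the names `resolve`,
`table_lookup` and `urn_lookup`, the example public identifier, and the file
paths are invented for illustration and come from no existing product. The
idea is only that the document keeps the same public identifier while the
list of resolvers changes underneath it, with the system identifier as the
short-term fallback.

```python
from typing import Callable, Optional, Sequence

# A resolver maps a public identifier to a location, or returns None.
Resolver = Callable[[str], Optional[str]]


def table_lookup(public_id: str) -> Optional[str]:
    """Hypothetical local table, standing in for today's mechanism."""
    table = {"-//EXAMPLE//DTD Memo//EN": "/usr/local/sgml/memo.dtd"}
    return table.get(public_id)


def urn_lookup(public_id: str) -> Optional[str]:
    """Placeholder for a future URN-based resolver; it knows nothing yet."""
    return None


def resolve(public_id: Optional[str],
            system_id: Optional[str],
            resolvers: Sequence[Resolver]) -> Optional[str]:
    """Try each resolver in order, then fall back to the system identifier."""
    if public_id is not None:
        for resolver in resolvers:
            location = resolver(public_id)
            if location is not None:
                return location
    return system_id


resolvers = [urn_lookup, table_lookup]
known = resolve("-//EXAMPLE//DTD Memo//EN", "memo.dtd", resolvers)
unknown = resolve("-//EXAMPLE//DTD Letter//EN", "letter.dtd", resolvers)
```

Adding a new mechanism means adding one function to `resolvers`; the
documents themselves are not touched.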

Fourth, it provides a standardized hook for experimentation, rather than
various hacks such as <!--URN:FOO--> and <?URN: FOO?>. Let me point again
to the example of the humble processing instruction: why is it so widely
accepted? Because sometimes that escape hook is necessary, as in the design
of XML itself. TEI also has many such escape hooks. So does DSSSL.

> I guess I think one point of defining standards is defining interfaces,
> and one point of defining interfaces is to ensure that products from
> different vendors can support the same interface.  (There may be more to
> it than this, but I can't think of anything else off the bat.) 

There are two parts: standardizing the parts that are to be interoperable,
and declaring the parts that are left to the system's discretion. Think of
REND or TYPE in TEI.

But all in all, I think it would be best to standardize a simple catalog 
format that would provide a baseline of interoperability around which other
systems can be built without modifications to documents.
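
As an illustration of how small such a baseline could be, here is a minimal
Python sketch that reads a drastically simplified catalog. It assumes only
entries of the form PUBLIC "public-id" "system-id", modeled loosely on SGML
Open catalog entries; it is not a full catalog parser (no comments, no other
entry types), and the function names, identifiers, and paths are all made up
for the example. Keeping the first entry for a repeated identifier is a
choice made in this sketch.

```python
import re
from typing import Dict, Optional

# Matches one simplified entry: PUBLIC "public-id" "system-id"
_ENTRY = re.compile(r'PUBLIC\s+"([^"]*)"\s+"([^"]*)"')


def _normalize(public_id: str) -> str:
    """Collapse runs of whitespace so line-wrapped identifiers still match."""
    return " ".join(public_id.split())


def parse_catalog(text: str) -> Dict[str, str]:
    """Build a public-id -> system-id map from simplified catalog text."""
    catalog: Dict[str, str] = {}
    for public_id, system_id in _ENTRY.findall(text):
        # Keep the first entry seen for a given public identifier.
        catalog.setdefault(_normalize(public_id), system_id)
    return catalog


def lookup(catalog: Dict[str, str],
           public_id: str,
           system_id: Optional[str] = None) -> Optional[str]:
    """Resolve via the catalog; fall back to the document's system id."""
    return catalog.get(_normalize(public_id), system_id)


CATALOG_TEXT = '''
PUBLIC "-//EXAMPLE//DTD Memo//EN" "dtds/memo.dtd"
PUBLIC "-//EXAMPLE//ENTITIES Symbols//EN" "ents/symbols.ent"
PUBLIC "-//EXAMPLE//DTD Memo//EN" "ignored/duplicate.dtd"
'''

catalog = parse_catalog(CATALOG_TEXT)
memo = lookup(catalog, "-//EXAMPLE//DTD   Memo//EN")
missing = lookup(catalog, "-//EXAMPLE//DTD Letter//EN", "letter.dtd")
```

A document that carries both a public and a system identifier keeps working
whether or not the catalog has an entry for it.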

 Paul Prescod

Received on Thursday, 27 March 1997 19:55:37 UTC