
From: "Ronald E. Daniel" <rdaniel@acl.lanl.gov>
Date: Thu, 5 Jan 1995 09:47:02 -0700
Message-Id: <199501051647.JAA09349@idaknow.acl.lanl.gov>
To: winograd@cs.stanford.edu
Subject: Re: Library Standards and URIs
Cc: uri@bunyip.com


Terry Winograd sez:
> I have the feeling that we are slowly moving
> towards something that is already standard in the programming domain --
> extensible collections of class or record definitions.

Yup. One of the things I would kind of like to see would be for one
attribute set to inherit from another. The stuff we are talking about
is getting very close to a declarative language for metadata. This is
OK by me, but not everyone wants the complexity that brings.

> [URC:
>   Class Corebib <url:http://mysite.net/corebibdef>
>   Class Rated <url:http://yoursite.com/ratingsdef>
>   Class MARC <urn:loc.uri/official/marcdef>
[rest of nice example deleted]

This identification of classes, and the way attributes from a class
are gathered together in the URC looks pretty nice so far. Grouping
the attributes from a class solves the name collision problem I
was worried about.
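
To make that concrete, here is a made-up sketch (not a reconstruction of
the deleted example; the attribute names and values are invented): two
classes can each define a "Title" attribute without clashing, because
every attribute is scoped to the class it came from.

  URC:
    Class Corebib <url:http://mysite.net/corebibdef>
      Title: A Guide to Metadata
      Author: J. Smith
    Class Rated <url:http://yoursite.com/ratingsdef>
      Title: PG-13
      Audience: general

A reader, human or machine, disambiguates the two Titles by looking at
the enclosing class.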

> I have not used SGML syntax here, although there is an obvious translation,
> because of the DTD problem -- since the class definitions (which is an
> open-ended extensible collection) each have their own attributes, there is
> no convenient way to map attribute names to tag names in a DTD -- we can't
> have one huge DTD for everyone's classes, but you don't want a
> mix-and-match DTD specific to every document.   From my point of view, this
> is a strike against using SGML (as opposed to some other variant of
> nested, tagged syntax which doesn't employ the same definition mechanisms).

Yes, it is a concern. I was talking about this problem yesterday
with Terry Allen and we are not convinced that SGML is the way to go.
However, for now we will go along with it until we come up against
something we need to do that it can't handle. This issue of where the
DTD stops and the attribute classes begin may be it.

However, let me play Devil's advocate for a moment and revisit your
assertion that we "don't want a mix-and-match DTD specific to every
document". What is the difference between a mix-n-match DTD that is
assembled on the fly and the use of attribute classes in your example?
There is the network retrieval aspect, so assume a slight extension to
SGML that lets us use URIs in entity references, such as
  <!ENTITY core URI "URL:http://foo.bar.com/text/core_attribs.foo">
instead of the current scheme of
  <!ENTITY core SYSTEM "core_attribs.foo">
to get the DTD from local files. What is the conceptual difference between
the two?
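
To make the Devil's-advocate position concrete, here is roughly what
such an on-the-fly DTD might look like, using parameter entities the
way the TEI does (the URI keyword and the fragment addresses are
invented for illustration):

  <!DOCTYPE urc [
    <!ENTITY % core  URI "URL:http://foo.bar.com/text/core_attribs.foo">
    <!ENTITY % rated URI "URL:http://other.site.org/rated_attribs.foo">
    %core;
    %rated;
  ]>

Structurally this is the same move as declaring attribute classes at
the top of the URC; it is just expressed in the DTD subset instead of
the document instance.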
  
> Anyone can add arbitrary attributes, by providing a class definition on the
> net.
...
> The proposal here is to make that notion fully general -- that every
> attribute is identified when used as belonging to a class, and every class
> has a definitions file on the net

Right. This is very much what I would like to see.

> there needs to be further
> discussion of what goes into these files and how much is machine-readable
> or human-readable.

Right. To push this discussion along, here is a straw-man proposal.
These files are DTD fragments, modeled along the lines of the TEI's DTD
fragments that can be mixed, matched, and overridden by user-specified
DTD fragments. They convey the syntax and are machine-readable. The
semantics are conveyed in human-readable comments. People who put the
fragments on the net without specifying the semantics get flamed :-)
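
For the sake of argument, a fragment for the Corebib class might look
something like this (the element names and the wording of the comments
are invented; only the comments carry the semantics):

  <!-- Corebib class: minimal bibliographic description.
       TITLE  is the title given to the work by its creator.
       AUTHOR is a personal or corporate name; repeatable.
       DATE   is the date of publication, if known. -->
  <!ELEMENT corebib - - (title, author+, date?)>
  <!ELEMENT (title | author | date) - - (#PCDATA)>

An SGML application gets the syntax from the declarations; a cataloger
gets the semantics from the comment.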

As the proposer of the strawman, I get to take the first couple of
hits at it. First, this requires URI support in SGML applications, which
may get true SGML people a little upset. Second, it does nothing to solve
the name collision problem (unless we use the SUBDOC stuff which I am
told is deprecated).

Anyone out there want to kick the strawman or spring to its defense?

Ron