URC spec 6/6

Ronald E. Daniel (rdaniel@acl.lanl.gov)
Fri, 9 Jun 1995 06:59:26 -0600


From: "Ronald E. Daniel" <rdaniel@acl.lanl.gov>
Date: Fri, 9 Jun 1995 06:59:26 -0600
Message-Id: <199506091259.GAA20138@idaknow.acl.lanl.gov>
To: uri@bunyip.com
Subject: URC spec 6/6


Appendix A  SGML Declaration


This appendix presents the boring SGML declaration that is needed for
specificity.  Serious consideration is being given to the use of ISO
10646; we are waiting on the HTML working group to decide what will
happen in that area.  For now we use the so-called "Latin-1" character
set encoding.

This SGML declaration was stolen from Dan Connolly's work for the HTML
Check package.  Hey, I steal from the best.



<!SGML  "ISO 8879:1986"
--
  SGML Declaration for Uniform Resource Characteristic (URC)
  DTDs and instances as used in the World Wide Web (WWW).

  This is AMAZINGLY similar to the sdecl for HTML prepared
  by Dan Connolly. Any errors introduced into it are my
  fault.

  Ron Daniel  5/24/95
--

CHARSET
         BASESET  "ISO 646:1983//CHARSET
                   International Reference Version (IRV)//ESC 2/5 4/0"
         DESCSET  0   9   UNUSED
                  9   2   9
                  11  2   UNUSED
                  13  1   13
                  14  18  UNUSED
                  32  95  32
                  127 1   UNUSED
         BASESET  "ISO Registration Number 100//CHARSET
                   ECMA-94 Right Part of Latin Alphabet Nr. 1//ESC 2/13 4/1"
         DESCSET  128 32  UNUSED
                  160 96  32

CAPACITY        SGMLREF
                TOTALCAP        150000


Ron Daniel                                                   [Page 21]


INTERNET-DRAFT          An SGML-based URC Service         June 7, 1995

                GRPCAP          150000

SCOPE    DOCUMENT
SYNTAX
         SHUNCHAR CONTROLS 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
                           19 20 21 22 23 24 25 26 27 28 29 30 31 127
         BASESET  "ISO 646:1983//CHARSET
                   International Reference Version (IRV)//ESC 2/5 4/0"
         DESCSET  0 128 0
         FUNCTION
              --  SPACE       32
                  TAB SEPCHAR  9
                  LF  SEPCHAR 10
                  FF  SEPCHAR 12
                  CR  SEPCHAR 13 --

-- The above is an accurate description of the usage of FUNCTION --
-- characters in HTML implementations; that is, there is no      --
-- Record Start or Record End character, and no occurrences of   --
-- character 10 or 13 are "ignored" by the parser.               --
-- But because few SGML implementations support this concrete    --
-- syntax, we include the one below.                             --

-- Note that in order to get correct behaviour w.r.t. newline    --
-- processing, you will have to play some tricks in constructing --
-- the document entity for parsing in order to keep the parser   --
-- from ignoring newlines in surprising ways                     --

                  RE          13
                  RS          10
                  SPACE       32
                  TAB SEPCHAR  9


         NAMING   LCNMSTRT ""
                  UCNMSTRT ""
                  LCNMCHAR ".-"
                  UCNMCHAR ".-"
                  NAMECASE GENERAL YES
                           ENTITY  NO
         DELIM    GENERAL  SGMLREF
                  SHORTREF SGMLREF
         NAMES    SGMLREF
         QUANTITY SGMLREF
                  NAMELEN  72    -- somewhat arbitrary; taken from
                                    internet line length conventions --
                  TAGLVL   100
                  LITLEN   1024
                  GRPGTCNT 150
                  GRPCNT   64




FEATURES
  MINIMIZE
    DATATAG  NO
    OMITTAG  YES
    RANK     NO
    SHORTTAG NO
  LINK
    SIMPLE   NO
    IMPLICIT NO
    EXPLICIT NO
  OTHER
    CONCUR   NO
    SUBDOC   NO
    FORMAL   YES
  APPINFO    NONE
>
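
The CHARSET portion of the declaration above can be read as a table of
(first character number, count, base number or UNUSED) rows.  As a rough
illustration only, the Python sketch below expands those rows and
reports which characters of a string the declaration leaves UNUSED.  The
names DESCSET_ROWS, declared_code_points and undeclared_characters are
made up for this example and are not part of any SGML tool.

```python
# Hypothetical helper, not part of any SGML parser: it expands the
# CHARSET DESCSET rows from the declaration above.
# Each row is (first described number, count, base number or None for UNUSED).
DESCSET_ROWS = [
    (0, 9, None),
    (9, 2, 9),
    (11, 2, None),
    (13, 1, 13),
    (14, 18, None),
    (32, 95, 32),
    (127, 1, None),
    (128, 32, None),
    (160, 96, 32),   # maps onto the right half of Latin-1
]


def declared_code_points(rows=DESCSET_ROWS):
    """Return the set of document character numbers that are not UNUSED."""
    declared = set()
    for start, count, base in rows:
        if base is not None:
            declared.update(range(start, start + count))
    return declared


def undeclared_characters(text, declared=None):
    """List (index, code point) pairs for characters declared UNUSED."""
    if declared is None:
        declared = declared_code_points()
    return [(i, ord(ch)) for i, ch in enumerate(text)
            if ord(ch) not in declared]


if __name__ == "__main__":
    # Tab (9) is declared, form feed (12) is not.
    print(undeclared_characters("Smith,\tFred\x0c"))
```

Note that form feed (12) falls in an UNUSED row of CHARSET, so a
document containing it would not conform to this declaration.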




Appendix B  DTD for Default Attribute Set


This appendix presents the default SGML DTD.  This is what would be
returned as a result of resolving the default AID.  (Just what that
default AID will be depends on the result of the URN effort; we will
assume <urn:x-dns-2:uri.acl.lanl.gov:default-dtd> for now.)

This started out as a shameless ripoff of work by Eric Miller of OCLC.
I have mutated it to add location information (the <URL>  element) and
a construct for grouping information that is related to one particular
<instance> of a resource.  I also butchered the <coverage> element.



<!--
    This is the ISO8879:1986 document type definition for the
    default URC attribute set.  It is strongly based on the DTD
    developed for the Dublin Metadata set by Eric Miller of OCLC.
    This DTD is subject to discussion and change by the members
    of the IETF's URI working group, or by anyone who cares to
    express an opinion to me.


    Ron Daniel  5/26/95
-->


<!-- ============ Parameter Entities ===============

This DTD makes extensive use of parameter entities so that the
elements can easily be overridden by people wishing to extend or
modify the default attribute set. People creating new base attribute
sets are encouraged to use the same style.  -->

<!-- Parameter entities for Elements
    These define the content models for the elements in the default
    set. The convention for those names is n.xxx, where xxx is the
    generic identifier of the element. -->

<!ENTITY   % n.URC "(Author | Title | Subject | Identifier | URL |
                    Instance | Form | Publisher | Date | ObjectType |
                    OtherAgent | Relation | Source | Language |
                    Coverage)" >
<!ENTITY   %  n.Instance "(Author | Title | Subject | Identifier |
                    URL | Form | Publisher | Date | ObjectType |
                    OtherAgent | Relation | Source | Language |
                    Coverage)"
  -- Everything in n.URC except Instance -- >


<!ENTITY   %  n.Author      "(#PCDATA)" >
<!ENTITY   %  n.Title       "(#PCDATA)" >
<!ENTITY   %  n.Subject     "(#PCDATA)" >
<!ENTITY   %  n.Identifier  "(#PCDATA)" >
<!ENTITY   %  n.URL         "(#PCDATA)" >
<!ENTITY   %  n.Form        "(#PCDATA)" >
<!ENTITY   %  n.Publisher   "(#PCDATA)" >
<!ENTITY   %  n.Date        "(#PCDATA)" >
<!ENTITY   %  n.ObjectType  "(#PCDATA)" >
<!ENTITY   %  n.OtherAgent  "(#PCDATA)" >
<!ENTITY   %  n.Relation    "(#PCDATA)" >
<!ENTITY   %  n.Source      "(#PCDATA)" >
<!ENTITY   %  n.Language    "(#PCDATA)" >
<!ENTITY   %  n.Coverage    "(Spatial | Temporal | #PCDATA)*" >
<!ENTITY   %  n.Spatial     "(#PCDATA)" >
<!ENTITY   %  n.Temporal    "(#PCDATA)" >

<!-- Parameter entities for Attributes
    Almost all of the elements can have a "scheme" attribute that
    can be used to more precisely indicate their semantics. The
    URL and INSTANCE elements do not have a scheme at this time.
 -->

<!ENTITY   % Subject.Scheme
             "LCSH | MeSH | Sears | Abstract | OtherScheme" >

<!ENTITY   % Title.Type
             "Main | SubTitle | PartTitle | Alternate |
              Abbrev | OtherType" >
<!ENTITY   % Title.Scheme
             "AACR2 | OtherScheme" >


<!ENTITY   % Author.Type
             "Name | Email | OtherType" >
<!ENTITY   % Author.Scheme
             "AACR2 | OtherScheme" >

<!ENTITY   % OtherAgent.Type
             "Editor | Sponsor | Principal | Compiler | Funder |
              Composer | Cataloger | Illustrator | Translator |
              OtherType"  >
<!ENTITY   % OtherAgent.Scheme
             "AACR2 | OtherScheme" >

<!ENTITY   % Form.Scheme
             "IMT | X.400 | OtherScheme">

<!ENTITY   % Identifier.Scheme
             "URN | URL | LCCN | ISBN | ISSN | SICI | MessageID |
              FPI | OtherScheme" >
<!-- URL stays in here for backward compatibility, but people should
     start to think about URNs as identifiers and URLs as retrieval
     strings.  -->

<!ENTITY   % Date.Scheme
             "RFC822 | YYYY | YYYY-MM-DD | OtherScheme" >

<!ENTITY   % Relationship.Scheme
             "URN | URL | LCCN | ISBN | ISSN | SICI | MessageID |
              FPI | OtherScheme" >
<!ENTITY   % Relationship.Type
             "Supersedes | Continues | Continued.From |
              Contained.In | Superseded.By | Cites | Extracted.From |
              Is.Part.Of | Contains | IsIndexOf | IsIndexedBy |
              GlossaryOf | Predecessor | Successor | IsDerivativeOf |
              Child | Parent | Sibling | OtherType" >



<!-- Element list: Subject to change as this thing gets refined. Some
     elements for meta-metadata (version info on the URC itself)
     are the most likely candidates for addition. -->

<!ELEMENT   URC        - -  (%n.URC;)*
 -- Note that unlike the Dublin DTD, the URC DTD does not define an
    EXTENSION element. Instead, people extending the default URC DTD
    are expected to create a new DTD with a new Attribute set ID. -- >


<!ELEMENT   Author      - -     (%n.Author;)
 -- Names of the persons and organizations primarily responsible for
    the intellectual content of the resource. Encode one name per
    element. For personal names use Last, First (or whatever
    the cultural norm is for sorted lists of names). -- >
<!ATTLIST Author     Type     (%Author.Type;)         #IMPLIED
                     Scheme   (%Author.Scheme;)       #IMPLIED >

<!ELEMENT   Title       - -     (%n.Title;)
 -- The name of the object, if it has one. -- >
<!ATTLIST Title      Type     (%Title.Type;)          #IMPLIED
                     Scheme   (%Title.Scheme;)        #IMPLIED >

<!ELEMENT   Subject     - -     (%n.Subject;)
 -- The field of knowledge to which the resource belongs. The default
    content of the subject element is simple keywords. The scheme
    attribute can be used to indicate the use of a controlled
    vocabulary. -- >
<!ATTLIST Subject    Scheme   (%Subject.Scheme;)      #IMPLIED >

<!ELEMENT   Identifier  - -     (%n.Identifier;)
 -- String or number used to uniquely identify this resource.
    Typically this will be the URN of the resource, in which case
    the scheme attribute should be used to indicate that. Other
    identification schemes may also be used. -- >
<!ATTLIST Identifier Scheme   (%Identifier.Scheme;)   #IMPLIED >

<!ELEMENT   URL         - -     (%n.URL;)
 -- A Uniform Resource Locator for an instance of the resource. -- >

<!ELEMENT   Instance    - -     (%n.Instance;)*
 -- Instance is a grouping construct to bind together information that
    relates to one particular instance of a resource. Typically it
    will bind a URL to its format, language, etc. However, examples
    can be constructed that use so many other elements that we allow
    anything to go in there.

    We want to say things like
    URC
      author
      title
      subject
      instance
         URL
         form
      instance
         URL
         form

 -- >

<!ELEMENT   Form        - -     (%n.Form;)
 -- The particular data representation of the resource. Typically
    this will be an Internet Media Type (formerly known as MIME
    content type). In such a case the SCHEME attribute should be used
    to identify it. -- >
<!ATTLIST Form       Scheme   (%Form.Scheme;)         #IMPLIED >

<!ELEMENT   Publisher   - -     (%n.Publisher;)
 -- The agent or agency responsible for making the resource
    available. The value of this element should follow the
    guidelines for the AUTHOR element. -- >

<!ELEMENT   Date        - -     (%n.Date;)
 -- The date of publication. The scheme attribute can be used to
    indicate the particular format of the date string. -- >
<!ATTLIST Date       Scheme   (%Date.Scheme;)         #IMPLIED >

<!ELEMENT   ObjectType  - -     (%n.ObjectType;)
 -- The abstract category of the resource, such as article, image,
    dictionary, etc. -- >

<!ELEMENT   OtherAgent  - -     (%n.OtherAgent;)
 -- Other person(s) and/or organization(s) who have made a
    significant contribution to the resource. The value of this
    element should follow the guidelines for the AUTHOR element. -- >
<!ATTLIST OtherAgent Type     (%OtherAgent.Type;)     #IMPLIED
                     Scheme   (%OtherAgent.Scheme;)   #IMPLIED >

<!ELEMENT   Relation    - -     (%n.Relation;)
 -- Relationship of this resource to another resource. This
    element should specify what the relationship is, as well as
    the target of the relationship. The TYPE attribute names the
    relationship; the SCHEME attribute indicates how the target
    is encoded.  -- >
<!ATTLIST Relation   Type     (%Relationship.Type;)   #IMPLIED
                     Scheme   (%Relationship.Scheme;) #IMPLIED >

<!ELEMENT   Source      - -     (%n.Source;)
 -- Objects, either electronic or printed, from which this
    resource was derived. This is a special case of the RELATION
    element. -- >

<!ELEMENT   Language    - -     (%n.Language;)
 -- The predominant natural language of the resource. -- >

<!ELEMENT   Coverage    - -     (%n.Coverage;)
 -- The spatial extent and/or temporal duration characteristic of
    the resource, e.g. "19th Century France". -- >

<!ELEMENT   Spatial     - -     (%n.Spatial;)
 -- For more precise indication of the spatial extent characteristic
    of the resource. -- >

<!ELEMENT   Temporal    - -     (%n.Temporal;)
 -- For more precise indication of the temporal duration
    characteristic of the resource. -- >




Appendix C  DTD for Meta Attribute Set


The attribute set  definition given in  Appendix B  is believed to  be
widely applicable, but  is certainly not universally  applicable.   We
encourage others to create their own attribute sets that  more closely
meet their needs.

As described in section 3, since the  AID is a URN, there needs  to be
a URC that talks  about the attribute  set definition.   That URC  has
particular needs not  addressed by the  default attribute  set.   This
appendix presents an attribute set definition that is more  useful for
describing attribute set definitions.   When  people create their  own
attribute sets, they are strongly encouraged to use this attribute set
to describe them.

The AID of this attribute set is
<urn:x-dns-2:uri.acl.lanl.gov:meta-dtd> for now.  IANA should assign a
URN once that has been settled.  An example of how to use this to
create a new attribute set is given in Appendix D.



<!--
    This is the ISO8879:1986 document type definition for the
    URC meta-attribute set. In other words, when people prepare
    alternative attribute sets, this is the attribute set they
    use to talk about their new attribute sets.

    This is actually a small modification to the default attribute
    set, adding the "Parent" relation.

    !!!! Talk about infinite regress and the need for entity manager
    to understand the "root" string. Where does that go? !!!

    Ron Daniel  6/7/95
-->


<!ENTITY  % Relationship.Type  "Parent | Base | OtherType"
 -- The Parent and Base relationships are to give us information on
    the single inheritance hierarchy. Parent points to the attribute
    set from which the resource was derived. Base points to the
    attribute set at the root of the single inheritance hierarchy.
    If only one is specified, they are assumed to be equal. -- >


<!-- No new elements are defined, and the content models are not
     changed, so just suck in the default attribute set DTD. -->

<!ENTITY  % default-as  SYSTEM
          "urn:x-dns-2:uri.acl.lanl.gov:default-1-dtd">
%default-as;




Appendix D  Custom Attribute Set Example


The role of  the meta  attribute set  definition given  in Appendix  C
is somewhat confusing.   Furthermore, we  anticipate that most  custom
attribute sets will  actually be  small modifications  of the  default
attribute set  - merely  adding  an element  or two.    This  appendix
presents a  sample DTD  to  illustrate how  to  perform such  a  minor
enhancement.   It shows  a couple  of ways  of adding  a new  element,
<AccessControl>, to the default attribute set.

The first way is the simplest.   It adds parameter  entity definitions
to the DOCTYPE declaration in the URC  itself.  It does not  require a
new attribute set to be widely published, but has limitations.



<!DOCTYPE URC SYSTEM "urn:x-dns-2:uri.acl.lanl.gov:default-1-dtd"
[
<!ENTITY   % n.URC "(Author | Title | Subject | Identifier | URL |
                    Instance | Form | Publisher | Date | ObjectType |
                    OtherAgent | Relation | Source | Language |
                    Coverage | AccessControl)" >
<!ENTITY   %  n.AccessControl "(#PCDATA)" >
<!ELEMENT   AccessControl    - -     (%n.AccessControl;)
 -- Provides a system-specific listing of the groups and individuals
    with read and write access to the resource. This is intended only
    for the purposes of an example in this specification. Any real
    access control list will need more work than this.  -- >
]
>
<urc>
<identifier scheme="URN">
urn:dns:pchs.k-12.okc.ok.us:student-papers-1995/geo3
</identifier>
<author>Smith, Fred</author>
<title>
Wanker! : A vicious, seditious, and tendentious history of George III
</title>
<Subject>American Revolution</subject>
<Subject>(In)famous crackpots of history</subject>
<instance>
<URL>
http://www.pchs.k-12.okc.ok.us/student-papers/1995/smith/geo3.html
</URL>
<form scheme="IMT">text/html</form>
<AccessControl>
write: fsmith, admin
read: any
</AccessControl>
</instance>
</URC>



The disadvantage of this approach is  that it does not create  a named
attribute set.  The  new AS is anonymous.   Furthermore, text/sgml  is
the only syntax that can use this method.
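
The first method works because an SGML parser reads the internal
declaration subset before the external DTD, and when an entity is
declared more than once the first declaration is the one used.  The
Python sketch below only imitates that rule with regular expressions; it
is not an SGML parser.  The function names and the shortened content
models are assumptions made up for this illustration.

```python
import re

# Toy patterns for quoted parameter entity declarations and references.
ENTITY_DECL = re.compile(r'<!ENTITY\s+%\s+([\w.-]+)\s*"([^"]*)"\s*>')
ENTITY_REF = re.compile(r'%([\w.-]+);')


def collect_parameter_entities(*subsets):
    """Collect parameter entities from subsets given in processing order
    (internal subset first).  The first declaration of a name is kept."""
    entities = {}
    for subset in subsets:
        for name, value in ENTITY_DECL.findall(subset):
            # setdefault keeps an earlier declaration and ignores later ones
            entities.setdefault(name, " ".join(value.split()))
    return entities


def expand(text, entities):
    """Replace %name; references using the collected entities."""
    return ENTITY_REF.sub(lambda m: entities[m.group(1)], text)


# Shortened, hypothetical content models; the real ones are much longer.
INTERNAL = '<!ENTITY % n.URC "(Title | URL | AccessControl)" >'
EXTERNAL = '''
<!ENTITY % n.URC "(Title | URL)" >
<!ENTITY % n.Title "(#PCDATA)" >
'''

if __name__ == "__main__":
    entities = collect_parameter_entities(INTERNAL, EXTERNAL)
    print(expand("<!ELEMENT URC - - (%n.URC;)* >", entities))
```

Because the internal subset is listed first, its n.URC wins and the
expanded element declaration admits AccessControl.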

The second  way  to  create a  new  attribute  set is  to  create  and
publish a new  DTD that extends  another DTD. The  DTD below adds  the
<AccessControl> element to the default attribute set.



<!--
    This is an ISO8879:1986 document type definition for a small
    extension to the default URC attribute set.

    It is quite possible that this is bug-infested.  I have not
    had this checked by man or machine; I hacked it up right before
    sending the first draft of this specification to the URI-WG
    mailing list.

    The general idea is to specify the changes to the default AS,
    and then to include the default AS by means of a SYSTEM entity.

    Ron Daniel  6/7/95
-->

<!ENTITY   % n.URC "(Author | Title | Subject | Identifier | URL |
                    Instance | Form | Publisher | Date | ObjectType |
                    OtherAgent | Relation | Source | Language |
                    Coverage | AccessControl)" >

<!ENTITY  % default-as  SYSTEM
          "urn:x-dns-2:uri.acl.lanl.gov:default-1-dtd">

<!ENTITY   %  n.AccessControl      "(#PCDATA)" >
<!ELEMENT   AccessControl    - -     (%n.AccessControl;)
 -- Provides a system-specific listing of the groups and individuals
    with read and write access to the resource. This is intended only
    for the purposes of an example in this specification. Any real
    access control list will need more work than this.  -- >

<!-- Suck in the default attribute set's DTD -->
%default-as;




If we assume this attribute set has an AID of
<urn:x-dns-2:uri.acl.lanl.gov:access-ctl-dtd>, a simple URC prepared
using it and transmitted in the text/sgml syntax might look like:


<!DOCTYPE URC SYSTEM "urn:x-dns-2:uri.acl.lanl.gov:access-ctl-dtd">
<urc>
<identifier scheme="URN">
urn:dns:pchs.k-12.okc.ok.us:student-papers-1995/geo3
</identifier>
<author>Smith, Fred</author>
<title>
Wanker! : A vicious, seditious, and tendentious history of George III
</title>
<Subject>American Revolution</subject>
<Subject>(In)famous crackpots of history</subject>
<instance>
<URL>
http://www.pchs.k-12.okc.ok.us/student-papers/1995/smith/geo3.html
</URL>
<form scheme="IMT">text/html</form>
<AccessControl>
read:any
write:fsmith, admin
</AccessControl>
</instance>
</URC>


If we were to resolve the URN of the new attribute set and  get back a
URC, that URC might look like:


<!DOCTYPE URC SYSTEM "urn:x-dns-2:uri.acl.lanl.gov:meta-dtd">
<urc>
<identifier scheme="URN">
urn:x-dns-2:uri.acl.lanl.gov:access-ctl-dtd
</identifier>
<author>Daniel, Ron</author>
<Subject scheme="abstract">
This attribute set adds a stupid access control element to the
default attribute set. Don't use this, use something better.
</subject>
<relation type="parent" scheme="urn">
urn:x-dns-2:uri.acl.lanl.gov:default-1-dtd
</relation>
<instance>
<URL>
http://www.acl.lanl.gov/attribute-sets/access-ctl-dtd.sgml
</URL>
<form scheme="IMT">text/sgml</form>
</instance>
</URC>
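
The Parent relation in the URC above, together with the Base relation
from Appendix C, describes a single inheritance hierarchy of attribute
sets.  A client that is given only Parent links could find the base
attribute set by following them to the root.  The Python sketch below
shows one way to do that, under the assumption that the Parent relations
have already been resolved into an in-memory table.  The function
find_base and the table PARENT_OF are hypothetical names used only for
this illustration; a real client would obtain each link by resolving the
URN and reading the returned URC.

```python
def find_base(aid, parent_of):
    """Follow Parent links from `aid` to the root of the single
    inheritance hierarchy.  Raises ValueError if the links form a cycle."""
    seen = set()
    current = aid
    while current in parent_of:
        if current in seen:
            raise ValueError("cycle in Parent relations at " + current)
        seen.add(current)
        current = parent_of[current]
    # An attribute set with no recorded Parent is treated as the base.
    return current


# Hypothetical table built from <relation type="parent"> elements.
PARENT_OF = {
    "urn:x-dns-2:uri.acl.lanl.gov:access-ctl-dtd":
        "urn:x-dns-2:uri.acl.lanl.gov:default-1-dtd",
}

if __name__ == "__main__":
    print(find_base("urn:x-dns-2:uri.acl.lanl.gov:access-ctl-dtd",
                    PARENT_OF))
```

With only one level of derivation the Parent and the Base are the same
attribute set, which matches the rule that the two are assumed to be
equal when only one is specified.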



    This Internet Draft expires ??  ??, 199?.