- From: Ronald E. Daniel <rdaniel@acl.lanl.gov>
- Date: Fri, 9 Jun 1995 06:59:26 -0600
- To: uri@bunyip.com
Appendix A SGML Declaration
This appendix presents the boring SGML declaration that is needed for
specificity. Serious consideration is being given to the use of ISO
10646; we are waiting on the HTML working group to decide what will
happen in that area. For now we use the so-called "Latin-1" character
set encoding.
This SGML declaration was stolen from Dan Connolly's work for the HTML
Check package. Hey, I steal from the best.
<!SGML "ISO 8879:1986"
--
SGML Declaration for Uniform Resource Characteristic (URC)
DTDs and instances as used in the World Wide Web (WWW).
This is AMAZINGLY similar to the sdecl for HTML prepared
by Dan Connolly. Any errors introduced into it are my
fault.
Ron Daniel 5/24/95
--
CHARSET
BASESET "ISO 646:1983//CHARSET
International Reference Version (IRV)//ESC 2/5 4/0"
DESCSET 0 9 UNUSED
9 2 9
11 2 UNUSED
13 1 13
14 18 UNUSED
32 95 32
127 1 UNUSED
BASESET "ISO Registration Number 100//CHARSET
ECMA-94 Right Part of Latin Alpha-
bet Nr. 1//ESC 2/13 4/1"
DESCSET 128 32 UNUSED
160 96 32
CAPACITY SGMLREF
TOTALCAP 150000
Ron Daniel [Page 21]
INTERNET-DRAFT An SGML-based URC Service June 7, 1995
GRPCAP 150000
SCOPE DOCUMENT
SYNTAX
SHUNCHAR CONTROLS 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
19 20 21 22 23 24 25 26 27 28 29 30 31 127
BASESET "ISO 646:1983//CHARSET
International Reference Version (IRV)//ESC 2/5 4/0"
DESCSET 0 128 0
FUNCTION
-- SPACE 32
TAB SEPCHAR 9
LF SEPCHAR 10
FF SEPCHAR 12
CR SEPCHAR 13 --
-- The above is an accurate description of the usage of FUNCTION --
-- characters in HTML implementations; that is, there is no --
-- Record Start or Record End character, and no occurrences of --
-- character 10 or 13 are "ignored" by the parser. --
-- But because few SGML implementations support this concrete --
-- syntax, we include the one below. --
-- Note that in order to get correct behaviour w.r.t. newline --
-- processing, you will have to play some tricks in constructing --
-- the document entity for parsing in order to keep the parser --
-- from ignoring newlines in surprising ways --
RE 13
RS 10
SPACE 32
TAB SEPCHAR 9
NAMING LCNMSTRT ""
UCNMSTRT ""
LCNMCHAR ".-"
UCNMCHAR ".-"
NAMECASE GENERAL YES
ENTITY NO
DELIM GENERAL SGMLREF
SHORTREF SGMLREF
NAMES SGMLREF
QUANTITY SGMLREF
NAMELEN 72 -- somewhat arbitrary; taken from
internet line length conventions --
TAGLVL 100
LITLEN 1024
GRPGTCNT 150
GRPCNT 64
FEATURES
MINIMIZE
DATATAG NO
OMITTAG YES
RANK NO
SHORTTAG NO
LINK
SIMPLE NO
IMPLICIT NO
EXPLICIT NO
OTHER
CONCUR NO
SUBDOC NO
FORMAL YES
APPINFO NONE
>
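The CHARSET section of the declaration above can be read as a table of
(first code, number of codes, mapping) rows. As a rough illustration only,
the Python sketch below copies the two DESCSET tables by hand and computes
which character codes a document may use. The names ASCII_PART, LATIN1_PART
and usable_codes are assumptions made for this example; nothing here is part
of the URC specification or of any SGML tool.

```python
# Each DESCSET row is (first code, number of codes, mapping).
# "UNUSED" is represented as None; an integer is the first code in the base set.
ASCII_PART = [
    (0, 9, None),
    (9, 2, 9),
    (11, 2, None),
    (13, 1, 13),
    (14, 18, None),
    (32, 95, 32),
    (127, 1, None),
]
LATIN1_PART = [
    (128, 32, None),
    (160, 96, 32),
]


def usable_codes(rows):
    """Return the set of document character codes that are not UNUSED."""
    codes = set()
    for start, count, mapping in rows:
        if mapping is not None:
            codes.update(range(start, start + count))
    return codes


usable = usable_codes(ASCII_PART + LATIN1_PART)
print(len(usable))                           # 194
print(sorted(c for c in usable if c < 32))   # [9, 10, 13]
```

Under this reading, the only control characters left usable are tab, line
feed and carriage return, together with the printable ASCII range and the
upper half of Latin-1.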
B DTD for Default Attribute Set
This appendix presents the default SGML DTD. This is what would be
returned as a result of resolving the default AID. (Just what that
default AID will be depends on the result of the URN effort; we will
assume <urn:x-dns-2:uri.acl.lanl.gov:default-dtd> for now.)
This started out as a shameless ripoff of work by Eric Miller of OCLC.
I have mutated it to add location information (the <URL> element) and
a construct for grouping information that is related to one particular
<instance> of a resource. I also butchered the <coverage> element.
<!--
This is the ISO8879:1986 document type definition for the
default URC attribute set. It is strongly based on the DTD
developed for the Dublin Metadata set by Eric Miller of OCLC.
This DTD is subject to discussion and change by the members
of the IETF's URI working group, or by anyone who cares to
express an opinion to me.
Ron Daniel 5/26/95
-->
<!-- ============ Parameter Entities ===============
This DTD makes extensive use of parameter entities so that the
elements can easily be overridden by people wishing to extend or
modify the default attribute set. People creating new base attribute
sets are encouraged to use the same style. -->
<!-- Parameter entities for Elements
These define the content models for the elements in the default
set. The convention for those names is n.xxx, where xxx is the
generic identifier of the element. -->
<!ENTITY % n.URC "(Author | Title | Subject | Identifier | URL |
Instance | Form | Publisher | Date | ObjectType |
OtherAgent | Relation | Source | Language |
Coverage)" >
<!ENTITY % n.Instance "(Author | Title | Subject | Identifier |
URL | Form | Publisher | Date | ObjectType |
OtherAgent | Relation | Source | Language |
Coverage)"
-- Everything in n.URC except Instance -- >
<!ENTITY % n.Author "(#PCDATA)" >
<!ENTITY % n.Title "(#PCDATA)" >
<!ENTITY % n.Subject "(#PCDATA)" >
<!ENTITY % n.Identifier "(#PCDATA)" >
<!ENTITY % n.URL "(#PCDATA)" >
<!ENTITY % n.Form "(#PCDATA)" >
<!ENTITY % n.Publisher "(#PCDATA)" >
<!ENTITY % n.Date "(#PCDATA)" >
<!ENTITY % n.ObjectType "(#PCDATA)" >
<!ENTITY % n.OtherAgent "(#PCDATA)" >
<!ENTITY % n.Relation "(#PCDATA)" >
<!ENTITY % n.Source "(#PCDATA)" >
<!ENTITY % n.Language "(#PCDATA)" >
<!ENTITY % n.Coverage "(Spatial | Temporal | #PCDATA)*" >
<!ENTITY % n.Spatial "(#PCDATA)" >
<!ENTITY % n.Temporal "(#PCDATA)" >
<!-- Parameter entities for Attributes
Almost all of the elements can have a "scheme" attribute that
can be used to more precisely indicate their semantics. The
URL and INSTANCE elements do not have a scheme at this time.
-->
<!ENTITY % Subject.Scheme
"LCSH | MeSH | Sears | Abstract | OtherScheme" >
<!ENTITY % Title.Type
"Main | SubTitle | PartTitle | Alternate |
Abbrev | OtherType" >
<!ENTITY % Title.Scheme
"AACR2 | OtherScheme" >
<!ENTITY % Author.Type
"Name | Email | OtherType" >
<!ENTITY % Author.Scheme
"AACR2 | OtherScheme" >
<!ENTITY % OtherAgent.Type
"Editor | Sponsor | Principal | Compiler | Funder |
Composer | Cataloger | Illustrator | Translator |
OtherType" >
<!ENTITY % OtherAgent.Scheme
"AACR2 | OtherScheme" >
<!ENTITY % Form.Scheme
"IMT | X.400 | OtherScheme">
<!ENTITY % Identifier.Scheme
"URN | URL | LCCN | ISBN | ISSN | SICI | MessageID |
FPI | OtherScheme" >
<!-- URL stays in here for backward compatibility, but people should
start to think about URNs as identifiers and URLs as retrieval
strings. -->
<!ENTITY % Date.Scheme
"RFC822 | YYYY | YYYY-MM-DD | OtherScheme" >
<!ENTITY % Relationship.Scheme
"URN | URL | LCCN | ISBN | ISSN | SICI | MessageID |
FPI | OtherScheme" >
<!ENTITY % Relationship.Type
"Supersedes | Continues | Continued.From |
Contained.In | Superseded.By | Cites | Extracted.From |
Is.Part.Of | Contains | IsIndexOf | IsIndexedBy |
GlossaryOf | Predecessor | Successor | IsDerivativeOf |
Child | Parent | Sibling | OtherType" >
<!-- Element list: Subject to change as this thing gets refined. Some
elements for meta-metadata (version info on the URC itself)
are the most likely candidates for addition. -->
<!ELEMENT URC - - (%n.URC;)*
-- Note that unlike the Dublin DTD, the URC DTD does not define an
EXTENSION element. Instead, people extending the default URC DTD
are expected to create a new DTD with a new Attribute set ID. -- >
<!ELEMENT Author - - (%n.Author;)
-- Names of the persons and organizations primarily responsible for
the intellectual content of the resource. Encode one name per
element. For personal names use Last, First (or whatever
the cultural norm is for sorted lists of names). -- >
<!ATTLIST Author Type (%Author.Type;) #IMPLIED
Scheme (%Author.Scheme;) #IMPLIED >
<!ELEMENT Title - - (%n.Title;)
-- The name of the object, if it has one. -- >
<!ATTLIST Title Type (%Title.Type;) #IMPLIED
Scheme (%Title.Scheme;) #IMPLIED >
<!ELEMENT Subject - - (%n.Subject;)
-- The field of knowledge to which the resource belongs. The default
content of the subject element is simple keywords. The scheme
attribute can be used to indicate the use of a controlled
vocabulary. -- >
<!ATTLIST Subject Scheme (%Subject.Scheme;) #IMPLIED >
<!ELEMENT Identifier - - (%n.Identifier;)
-- String or number used to uniquely identify this resource.
Typically this will be the URN of the resource, in which case
the scheme attribute should be used to indicate that. Other
identification schemes may also be used. -- >
<!ATTLIST Identifier Scheme (%Identifier.Scheme;) #IMPLIED >
<!ELEMENT URL - - (%n.URL;)
-- A Uniform Resource Locator for an instance of the resource. -- >
<!ELEMENT Instance - - (%n.Instance;)
-- Instance is a grouping construct to bind together information that
relates to one particular instance of a resource. Typically it
will bind a URL to its format, language, etc. However, examples
can be constructed that use so many other elements that we allow
anything to go in there.
We want to say things like
URC
author
title
subject
instance
URL
form
instance
URL
form
-- >
<!ELEMENT Form - - (%n.Form;)
-- The particular data representation of the resource. Typically
this will be an Internet Media Type (formerly known as MIME
content type). In such a case the SCHEME attribute should be used
to identify it. -- >
<!ATTLIST Form Scheme (%Form.Scheme;) #IMPLIED >
<!ELEMENT Publisher - - (%n.Publisher;)
-- The agent or agency responsible for making the resource
available. The value of this element should follow the
guidelines for the AUTHOR element. -- >
<!ELEMENT Date - - (%n.Date;)
-- The date of publication. The scheme attribute can be used to
indicate the particular format of the date string. -- >
<!ATTLIST Date Scheme (%Date.Scheme;) #IMPLIED >
<!ELEMENT ObjectType - - (%n.ObjectType;)
-- The abstract category of the resource, such as article, image,
dictionary, etc. -- >
<!ELEMENT OtherAgent - - (%n.OtherAgent;)
-- Other person(s) and/or organization(s) who have made a
significant contribution to the resource. The value of this
element should follow the guidelines for the AUTHOR element. -- >
<!ATTLIST OtherAgent Type (%OtherAgent.Type;) #IMPLIED
Scheme (%OtherAgent.Scheme;) #IMPLIED >
<!ELEMENT Relation - - (%n.Relation;)
-- Relationship of this resource to another resource. This
element should specify what the relationship is, as well as
the target of the relationship. The TYPE attribute names the
relationship; the SCHEME attribute indicates how the
destination is encoded. -- >
<!ATTLIST Relation Type (%Relationship.Type;) #IMPLIED
Scheme (%Relationship.Scheme;) #IMPLIED >
<!ELEMENT Source - - (%n.Source;)
-- Objects, either electronic or printed, from which this
resource was derived. This is a special case of the RELATION
element. -- >
<!ELEMENT Language - - (%n.Language;)
-- The predominant natural language of the resource. -- >
<!ELEMENT Coverage - - (%n.Coverage;)
-- The spatial extent and/or temporal duration characteristic of
the resource, e.g. "19th Century France". -- >
<!ELEMENT Spatial - - (%n.Spatial;)
-- For more precise indication of the spatial extent characteristic
of the resource. -- >
<!ELEMENT Temporal - - (%n.Temporal;)
-- For more precise indication of the temporal duration
characteristic of the resource. -- >
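As an illustration of the content models above, the following Python sketch
records which child elements each element may contain and checks a nested
outline like the one shown in the Instance comment. It is not an SGML parser
and ignores attributes and character data. The names COMMON, ALLOWED and
check are assumptions made for this example, names are compared exactly
(unlike SGML with NAMECASE GENERAL YES), and the sketch uses the element
name OtherAgent as given in the ELEMENT declaration.

```python
# Elements that may appear directly inside both URC and Instance.
COMMON = ["Author", "Title", "Subject", "Identifier", "URL", "Form",
          "Publisher", "Date", "ObjectType", "OtherAgent", "Relation",
          "Source", "Language", "Coverage"]

ALLOWED = {
    "URC": set(COMMON) | {"Instance"},
    "Instance": set(COMMON),            # everything in n.URC except Instance
    "Coverage": {"Spatial", "Temporal"},
}


def check(name, children):
    """Return a list of (parent, child) pairs that violate ALLOWED.

    `children` is a list of (element name, list of children) tuples.
    Elements missing from ALLOWED are treated as holding character data only.
    """
    errors = []
    for child, grandchildren in children:
        if child not in ALLOWED.get(name, set()):
            errors.append((name, child))
        errors.extend(check(child, grandchildren))
    return errors


outline = [
    ("Author", []), ("Title", []), ("Subject", []),
    ("Instance", [("URL", []), ("Form", [])]),
    ("Instance", [("URL", []), ("Form", [])]),
]
print(check("URC", outline))                              # []
print(check("URC", [("Instance", [("Instance", [])])]))   # [('Instance', 'Instance')]
```

The second call shows the one restriction that n.Instance adds: an Instance
may not contain another Instance.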
C DTD for Meta Attribute Set
The attribute set definition given in Appendix B is believed to be
widely applicable, but is certainly not universally applicable. We
encourage others to create their own attribute sets that more closely
meet their needs.
As described in section 3, since the AID is a URN, there needs to be
a URC that talks about the attribute set definition. That URC has
particular needs not addressed by the default attribute set. This
appendix presents an attribute set definition that is more useful for
describing attribute set definitions. When people create their own
attribute sets, they are strongly encouraged to use this attribute set
to describe them.
The AID of this attribute set is <urn:x-dns-2:uri.acl.lanl.gov:meta-dtd>
for now. IANA should assign a URN once that has been settled.
An example of how to use this to create a new attribute set is given
in Appendix D.
<!--
This is the ISO8879:1986 document type definition for the
URC meta-attribute set. In other words, when people prepare
alternative attribute sets, this is the attribute set they
use to talk about their new attribute sets.
This is actually a small modification to the default attribute
set, adding the "Parent" relation.
!!!! Talk about infinite regress and the need for entity manager
to understand the "root" string. Where does that go? !!!
Ron Daniel 6/7/95
-->
<!ENTITY % Relationship.Type "Parent | Base | OtherType"
-- The Parent and Base relationships are to give us information on
the single inheritance hierarchy. Parent points to the attribute
set from which the resource was derived. Base points to the
attribute set at the root of the single inheritance hierarchy.
If only one is specified, they are assumed to be equal. -- >
<!-- No new elements are defined, and the content models are not
changed, so just suck in the default attribute set DTD. -->
<!ENTITY % default-as SYSTEM "urn:x-dns-2:uri.acl.lanl.gov:default-1-dtd">
%default-as;
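The Parent and Base relationships describe a single inheritance hierarchy
of attribute sets. The Python sketch below is one possible reading of that
rule, not part of the specification: it applies the "if only one is
specified, they are assumed to be equal" default and then follows Parent
links to the root. The function names, the dictionary layout of `catalog`,
and the idea of stopping on a repeated AID are all assumptions made for this
illustration. The access-ctl-dtd URN is the example AID used in Appendix D.

```python
def parent_and_base(relations):
    """Apply the rule that a missing Parent or Base defaults to the other.

    `relations` maps a relation type ("parent", "base") to a target AID.
    Returns (parent, base); both are None when neither is given.
    """
    parent = relations.get("parent")
    base = relations.get("base")
    if parent is None:
        parent = base
    if base is None:
        base = parent
    return parent, base


def ancestry(aid, catalog):
    """Follow Parent links from `aid` until an attribute set has no parent."""
    chain = [aid]
    while True:
        parent, _ = parent_and_base(catalog.get(chain[-1], {}))
        if parent is None or parent in chain:   # stop at the root or on a cycle
            return chain
        chain.append(parent)


DEFAULT = "urn:x-dns-2:uri.acl.lanl.gov:default-1-dtd"
ACCESS = "urn:x-dns-2:uri.acl.lanl.gov:access-ctl-dtd"

# Hypothetical relation data, as it might be extracted from meta URCs.
catalog = {ACCESS: {"parent": DEFAULT}, DEFAULT: {}}

print(parent_and_base(catalog[ACCESS]))   # parent and base are both DEFAULT
print(ancestry(ACCESS, catalog))          # [ACCESS, DEFAULT]
```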
D Custom Attribute Set Example
The role of the meta attribute set definition given in Appendix C
is somewhat confusing. Furthermore, we anticipate that most custom
attribute sets will actually be small modifications of the default
attribute set - merely adding an element or two. This appendix
presents a sample DTD to illustrate how to perform such a minor
enhancement. It shows a couple of ways of adding a new element,
<AccessControl>, to the default attribute set.
The first way is the simplest. It adds parameter entity definitions
to the DOCTYPE declaration in the URC itself. It does not require a
new attribute set to be widely published, but has limitations.
<!DOCTYPE URC SYSTEM "urn:x-dns-2:uri.acl.lanl.gov:default-1-dtd"
[
<!ENTITY % n.URC "(Author | Title | Subject | Identifier | URL |
Instance | Form | Publisher | Date | ObjectType |
OtherAgent | Relation | Source | Language |
Coverage | AccessControl)" >
<!ENTITY % n.AccessControl "(#PCDATA)" >
<!ELEMENT AccessControl - - (%n.AccessControl;)
-- Provides a system-specific listing of the groups and individuals
with read and write access to the resource. This is intended only
for the purposes of an example in this specification. Any real
access control list will need more work than this. -- >
]
>
<urc>
<identifier scheme="URN">
urn:dns:pchs.k-12.okc.ok.us:student-papers-1995/geo3
</identifier>
<author>Smith, Fred</author>
<title>
Wanker! : A vicious, seditious, and tendentious history of George III
</title>
<Subject>American Revolution</subject>
<Subject>(In)famous crackpots of history</subject>
<instance>
<URL>
http://www.pchs.k-12.okc.ok.us/student-papers/1995/smith/geo3.html
</URL>
<form scheme="IMT">text/html</form>
<AccessControl>
write: fsmith, admin
read: any
</AccessControl>
</instance>
</URC>
The disadvantage of this approach is that it does not create a named
attribute set. The new AS is anonymous. Furthermore, text/sgml is
the only syntax that can use this method.
The second way to create a new attribute set is to create and
publish a new DTD that extends another DTD. The DTD below adds the
<AccessControl> element to the default attribute set.
<!--
This is an ISO8879:1986 document type definition for a small
extension to the default URC attribute set.
It is quite possible that this is bug-infested. I have not
had this checked by man or machine; I hacked it up right before
sending the first draft of this specification to the URI-WG
mailing list.
The general idea is to specify the changes to the default AS,
and then to include the default AS by means of a SYSTEM entity.
Ron Daniel 6/7/95
-->
<!ENTITY % n.URC "(Author | Title | Subject | Identifier | URL |
Instance | Form | Publisher | Date | ObjectType |
OtherAgent | Relation | Source | Language |
Coverage | AccessControl)" >
<!ENTITY % default-as SYSTEM "urn:x-dns-2:uri.acl.lanl.gov:default-1-dtd">
<!ENTITY % n.AccessControl "(#PCDATA)" >
<!ELEMENT AccessControl - - (%n.AccessControl;)
-- Provides a system-specific listing of the groups and individuals
with read and write access to the resource. This is intended only
for the purposes of an example in this specification. Any real
access control list will need more work than this. -- >
<!-- Suck in the default attribute set's DTD -->
%default-as;
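Both the DOCTYPE approach and the extension DTD above rely on an SGML rule:
when an entity is declared more than once, the first declaration is binding
and later ones are ignored. Declaring n.URC before the default DTD is read
therefore overrides the default content model. The Python sketch below
mimics only that rule with a regular expression; it is not an SGML parser,
the content models are abbreviated for the example, and the names DECL and
entity_table are assumptions.

```python
import re

# Matches simple declarations of the form <!ENTITY % name "value" >.
DECL = re.compile(r'<!ENTITY\s+%\s+(\S+)\s+"([^"]*)"\s*>')


def entity_table(*dtd_texts):
    """Collect parameter entities, keeping the first declaration of each name."""
    table = {}
    for text in dtd_texts:
        for name, value in DECL.findall(text):
            table.setdefault(name, value)   # later redeclarations are ignored
    return table


# Abbreviated stand-ins for the extension DTD and the default DTD.
extension = '<!ENTITY % n.URC "(Author | Title | AccessControl)" >'
default = ('<!ENTITY % n.URC "(Author | Title)" >\n'
           '<!ENTITY % n.Author "(#PCDATA)" >')

table = entity_table(extension, default)
print(table["n.URC"])      # (Author | Title | AccessControl)
print(table["n.Author"])   # (#PCDATA)
```

Reading the texts in the opposite order would leave the default n.URC in
force, which is why the overriding declarations must come first.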
If we assume this attribute set has an AID of
<urn:x-dns-2:uri.acl.lanl.gov:access-ctl-dtd>, a simple URC prepared using
it and transmitted in the text/sgml syntax might look like:
<!DOCTYPE URC SYSTEM "urn:x-dns-2:uri.acl.lanl.gov:access-ctl-dtd">
<urc>
<identifier scheme="URN">
urn:dns:pchs.k-12.okc.ok.us:student-papers-1995/geo3
</identifier>
<author>Smith, Fred</author>
<title>
Wanker! : A vicious, seditious, and tendentious history of George III
</title>
<Subject>American Revolution</subject>
<Subject>(In)famous crackpots of history</subject>
<instance>
<URL>
http://www.pchs.k-12.okc.ok.us/student-papers/1995/smith/geo3.html
</URL>
<form scheme="IMT">text/html</form>
<AccessControl>
read:any
write:fsmith, admin
</AccessControl>
</instance>
</URC>
If we were to resolve the URN of the new attribute set and get back a
URC, that URC might look like:
<!DOCTYPE URC SYSTEM "urn:x-dns-2:uri.acl.lanl.gov:meta-dtd">
<urc>
<identifier scheme="URN">
urn:x-dns-2:uri.acl.lanl.gov:access-ctl-dtd
</identifier>
<author>Daniel, Ron</author>
<Subject scheme="abstract">
This attribute set adds a stupid access control element to the
default attribute set. Don't use this, use something better.
</subject>
<relation type="parent" scheme="urn">
urn:x-dns-2:uri.acl.lanl.gov:default-1-dtd
</relation>
<instance>
<URL>
http://www.acl.lanl.gov/attribute-sets/access-ctl-dtd.sgml
</URL>
<form scheme="IMT">text/sgml</form>
</instance>
</URC>
This Internet Draft expires ?? ??, 199?.
Received on Friday, 9 June 1995 08:59:35 UTC