- From: Ronald E. Daniel <rdaniel@acl.lanl.gov>
- Date: Fri, 9 Jun 1995 06:58:40 -0600
- To: uri@bunyip.com
3 Attribute Sets

The primary purpose of the URC service is to resolve URNs into URLs for the purpose of resource retrieval. However, the URC makes a very convenient place to store metadata - data about the resource.

Ron Daniel [Page 5] INTERNET-DRAFT An SGML-based URC Service June 7, 1995

Frequently this will be bibliographic information, but [1] requires that there be no restrictions on the data that can be placed in the URC.

The URC is intended to be a container for metadata about a wide variety of Internet resources. Satellite images, poems, scientific datasets, fine art images, gene sequences, ... are all reasonable candidates for publication in the Internet's Uniform Resource Architecture. All of these resource types will need different sorts of metadata. Other attributes, such as ``subject'', may be used in different fashions. Because of this diversity, we make the fundamental assumption:

   There are no metadata elements (such as author, title, subject, etc.) that are applicable to all resources.

Because of this assumption, we need a means of specifying what attributes are being used in a particular URC, as well as their syntax and semantics. This need brings up the notion of an attribute set and the attribute set identifier.

Definitions:

   An Attribute Set (AS) is the particular collection of elements that may appear in a particular URC.

   An Attribute Set Definition (ASD) is a machine-parsable specification of the elements in an attribute set.

   An Attribute Set Identifier (AID) is a URN that can be resolved to obtain the attribute set definition.

Using a URN to identify the attribute set of a URC has two advantages. First, URNs are unambiguous, so we can tell if the contents of one ``subject'' field are comparable to another. Second, using a URN lets us retrieve the attribute set definition if we need to. The definition is a machine-parsable grammar specification for the URCs.
This allows us to parse novel URCs, although dealing with the semantics of novel elements is still an unsolved problem.

A further enhancement to this model is that an AS can be a modification of an existing AS. The child AS would specify only the additions and changes to the parent AS. Thus, attribute sets can form a single-inheritance scheme back to some presumably well-known base attribute set. Multiple Inheritance (MI) of attribute sets was considered and explicitly rejected for reasons of complexity, robustness, complexity, poor behavior in distributed systems, complexity, lack of universal language support, and complexity. Furthermore, the author believes that MI is just too complex. Dig?

The attribute set definition shall be an SGML DTD. Parameter entities shall be used to allow element definitions to be overridden in a single-inheritance scheme. Such an approach is illustrated in Appendices B and D. The AS definition specifies the syntax of a URC in a machine-usable fashion.

There are three complications to this model. First, we must also provide a specification of the semantics of the elements. At this time, we are unaware of any machine-usable semantic specification scheme with the generality needed for the URC task. Therefore, we rely on human-readable specification of the semantics of the elements. The semantics of the elements in the attribute set shall be indicated by comments in the DTD.

Note: check with comp.text.sgml types on schemes for automatically extracting comments for documentation purposes.

Another mechanism is available for locating machine-parsable semantic definitions once they become available. But before we can describe that, we must talk about the other complications.

A second complication concerns the URC syntax. Having the attribute set defined as an SGML DTD only allows us to automatically parse URCs that are conveyed in an SGML transfer syntax.
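
The single-inheritance scheme for attribute sets described earlier in this section, in which a child AS states only its additions and changes relative to its parent, can be sketched roughly as follows. This is an illustration only, not the DTD mechanism of Appendices B and D: the AttributeSet class, the effective_elements function, the urn:example:... identifiers, and the content-model strings are all hypothetical, and an in-memory dictionary stands in for actual URN resolution.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class AttributeSet:
    """Hypothetical in-memory stand-in for an attribute set definition."""
    aid: str                      # AID (a URN) naming this attribute set
    parent: Optional[str] = None  # AID of the parent AS, or None for the base AS
    elements: Dict[str, str] = field(default_factory=dict)  # element name -> content model


def effective_elements(aid: str, registry: Dict[str, AttributeSet]) -> Dict[str, str]:
    """Merge element definitions from the base AS down to `aid`.

    Each AS has at most one parent, so the ancestry is a simple chain.
    Definitions in a child override those of the same name in its ancestors.
    """
    chain: List[AttributeSet] = []
    seen = set()
    current: Optional[str] = aid
    while current is not None:
        if current in seen:
            raise ValueError(f"inheritance cycle at {current}")
        seen.add(current)
        chain.append(registry[current])
        current = registry[current].parent

    merged: Dict[str, str] = {}
    for attribute_set in reversed(chain):  # base first, most-derived last
        merged.update(attribute_set.elements)
    return merged


# Hypothetical example data: a base AS and a child that changes one
# element and adds another.
registry = {
    "urn:example:as:base": AttributeSet(
        "urn:example:as:base", None,
        {"title": "(#PCDATA)", "subject": "(#PCDATA)"}),
    "urn:example:as:image": AttributeSet(
        "urn:example:as:image", "urn:example:as:base",
        {"subject": "(keyword+)", "resolution": "(#PCDATA)"}),
}

print(effective_elements("urn:example:as:image", registry))
```

Because each AS names at most one parent, the merge is a single pass over a linear chain; no ordering rule for competing parents is needed, which is one practical consequence of rejecting multiple inheritance.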
Note that other syntaxes are explicitly allowed as a feature of this proposal. Thus, if a request is made for a text/plain syntax, the result is not parsable using the AS definition. This is not a great problem. First, it is easy enough to request the URC in a text/sgml syntax, which is required to be conformant with the AS DTD. Second, we rarely care about parsing according to ISO 8879. Because the primary use for the URC service is URN-to-URL resolution, we will usually parse the URC in a heuristic fashion, rather than retrieve all the inherited DTD fragments. The default AS is provided to simplify the task of heuristic parsing.

A third complication arises as a result of using a URN for the AID. Assume we have retrieved a URC, call it URC-1, that specifies its AID (AID-1). Also assume that we wish to retrieve the attribute set definition. We resolve AID-1, which is a URN, and get back a URC (URC-2) that lists locations for the AS definition. What is the AID in URC-2? How do we avoid infinite regress?

This standard defines a basic meta-attribute set definition that is suitable for the URC of an attribute set (see Appendix C). To avoid infinite regress, an AID can be either a URN or the distinguished string "root".

Providing a URC for the AS definition is a complication, but it also provides us with a natural extension mechanism for dealing with the semantics of an attribute set. Just as a normal document might have ASCII and PostScript representations, the AS definition might have SGML and KQML representations. These alternate representations are how we can provide versions of an AS definition with machine-readable semantic definitions.
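
The way the distinguished string "root" stops the regress can be sketched as a short chain walk. This is a minimal illustration under stated assumptions: it assumes that a URC whose AID is "root" is described by the built-in meta-attribute set and therefore needs no further resolution, which the text implies but does not spell out. The dictionary-based URC, the definition_chain function, the resolve callback, the max_depth guard, and the example URN and URL are all hypothetical.

```python
from typing import Callable, Dict, List

ROOT_AID = "root"      # the distinguished string from the text

Urc = Dict[str, str]   # hypothetical minimal URC: an "aid" field plus other fields


def definition_chain(urc: Urc, resolve: Callable[[str], Urc],
                     max_depth: int = 16) -> List[str]:
    """Follow AIDs starting at `urc` until one is "root".

    `resolve` is an assumed callback that maps an AID (a URN) to the URC
    describing that attribute set definition. Returns the AIDs visited.
    """
    visited: List[str] = []
    aid = urc["aid"]
    while aid != ROOT_AID:
        if aid in visited or len(visited) >= max_depth:
            raise RuntimeError(
                f"AID chain does not terminate at {ROOT_AID!r}: {visited + [aid]}")
        visited.append(aid)
        aid = resolve(aid)["aid"]  # the AID of the URC for that AS definition
    visited.append(ROOT_AID)
    return visited


# Hypothetical data mirroring the URC-1 / URC-2 example in the text.
fake_store = {
    "urn:example:aid-1": {"aid": "root",
                          "location": "http://example.org/as/aid-1.dtd"},
}
urc_1 = {"aid": "urn:example:aid-1"}

print(definition_chain(urc_1, fake_store.__getitem__))
```

In this sketch URC-2 carries the AID "root", so the walk ends after one resolution. The cycle check and depth limit are defensive additions for misconfigured data and are not part of the draft.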
Received on Friday, 9 June 1995 08:58:39 UTC