- From: David G. Durand <dgd@cs.bu.edu>
- Date: Thu, 5 Dec 1996 22:37:05 -0500
- To: Michael Sperberg-McQueen <U35395@UICVM.UIC.EDU>, w3c-sgml-wg@w3.org
I am CCing my response to a query from Michael about 9070 identifiers in case it is of interest to the rest of the list. Of course the only reference of record is the standard: I do have a copy, though. I've attached some very old URN ramblings about FPIs below.

In brief, and by example, the basic syntax for ISO 9070 Object Identifiers is

    a::b::c//d::e::f::g

The field to the left of the // is the name-issuing authority. This is confusingly called the "Object owner", but it identifies the owner of the _name_ (not the object). Any "owner" of a prefix can delegate a subspace of their namespace to others, so if I own "a::b" I could have assigned "a::b::c" to you for your Object Identifiers.

The "Object name" (the part following the //) has similar hierarchical identifiers, but no semantics or administrative procedures attach to them.

The bug in the 9070 standard is that SGML FPIs map all the fields other than the authority into successive positions of the Object ID. So if I use an FPI like:

    -//SUN::SUNSOFT//DTD My weird DTD//EN

It becomes:

    SUN::SUNSOFT//DTD::-::My Weird DTD::EN

This is fine, but if I use :: separated items in organizing the object IDs of an SGML FPI, things get messed up:

    -//SUN::SUNSOFT//DTD dgd::dtds::weird-1//EN

becomes:

    SUN::SUNSOFT//DTD::-::dgd::dtds::weird-1::EN

And I can't extract the Language and version specifications dependably: they may be the last item, or the last 2 items -- there is no marker as to which. And, since the count of fields is no longer fixed, I can't do it by counting either.

Now this conversion is defined in an informative annex, so we need not be bothered by compatibility with it for XML, should we decide that 9070 has any relevance to XML.

One thing I really like about the 9070 syntax is that it is simple and general, lacking many of the odd mandatory fields in 8879 FPIs. On the other hand it is a different syntax, not in wide use.

I've attached some old notes on 9070 I made for the URN list, years ago.
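As a rough illustration of the problem described above, here is a minimal Python sketch. The function names parse_9070 and fpi_to_9070 are hypothetical. The flattening rule is inferred only from the two worked examples in this message, and the handling of an optional trailing version field is an assumption; neither is taken from the text of ISO 9070 itself.

```python
def parse_9070(identifier):
    """Split 'a::b::c//d::e::f::g' into (owner_parts, object_parts)."""
    owner, sep, name = identifier.partition("//")
    if not sep:
        raise ValueError("missing '//' between owner and object name")
    return owner.split("::"), name.split("::")


def fpi_to_9070(fpi):
    """Flatten an SGML FPI into a 9070-style string.

    Assumed rule (inferred from the examples in the message): the owner
    is kept as-is, and every other field is placed in successive '::'
    positions of the object name: class, registration indicator,
    description, language, and (assumption) an optional version.
    """
    fields = fpi.split("//")
    if len(fields) not in (4, 5):
        raise ValueError("expected 4 or 5 '//'-separated FPI fields")
    indicator, owner, text = fields[:3]
    trailing = fields[3:]  # language, then the optional version
    text_class, _, description = text.partition(" ")
    parts = [text_class, indicator, description, *trailing]
    return owner + "//" + "::".join(parts)


simple = fpi_to_9070("-//SUN::SUNSOFT//DTD My weird DTD//EN")
nested = fpi_to_9070("-//SUN::SUNSOFT//DTD dgd::dtds::weird-1//EN")

# The object name has 4 parts in one case and 6 in the other, so the
# language cannot be located by counting from the front.
print(len(parse_9070(simple)[1]), len(parse_9070(nested)[1]))

# Two different (made-up) FPIs that flatten to the same string: in the
# first, EN is the language and V1 the version; in the second, EN is
# part of the description and V1 sits in the language position.
with_version = fpi_to_9070("-//O//DTD x//EN//V1")
without_version = fpi_to_9070("-//O//DTD x::EN//V1")
print(with_version == without_version)
```

The last comparison is the point: once the description may itself contain "::", the flattened form no longer records where the description ends, so the reverse mapping cannot be done reliably.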
Those notes duplicate some of the above, but touch on a few points not given above.

  -- David

Attached archival material:

As an "SGML guy" (though not the sort of SGML-zealot one sometimes sees, I hope), I have also been wondering about this. Particularly given the limited use of SGML in WWW, the ISO naming stuff which is integrated with SGML seems a natural (at least for that application). Here's some possibly relevant info:

The ISO FPIs are defined in ISO 9070. They are based on a two part structure:

  + naming authority
  + object identifier

Each of these can be split into multiple hierarchical parts (allowing for complex object names and delegation of naming authorities). Root authorities can be assigned based on ISBN publisher numbers at the moment.

The character set that can be used is case-insensitive and also restricted to be highly portable across national character sets. The syntax is character-based, rather than simply describing a sequence of octets.

There are only two objections to the FPI standard as far as I can tell, based on the URN requirements doc. The first is a somewhat ugly syntax: "//" to delimit the two major parts, and "::" to delimit fields within each item. The second is an arbitrary length restriction: 100 chars for the owner name and 100 chars for the object name, seemingly chosen so that the corresponding SGML identifiers come to less than 250 characters. This last restriction is something that could be changed pretty easily through the ISO, I think, especially as the harmonization with the internet would appeal to the part of ISO that developed the FPI standards (i.e. it was _not_ developed as part of OSI).

It seems that the syntactic flexibility of the FPIs is sufficient to handle any naming needs I've seen proposed here, and the hierarchical authority assignment should be pretty scalable and decentralizable.

9070 has a provision for the use of ISBNs themselves as a root authority. This was a revision to the standard in the second edition (ISO/IEC 9070:1991(E)).
This means that there is a non-ISO based authority to assign names. This is important since the ISO 9070 naming authority (to be administered by ANSI) is not yet in operation. I have been told that the ISBN people are prepared to issue ISBN publisher numbers to anyone wishing to pursue electronic publication.

The ISO 9070 character set is upper/lower case letters, digits and "'()+,-.:=?/". The standard is defined in terms of a "character repertoire", not a particular encoding, so that national issues are a protocol issue rather than a naming issue. This lowest common denominator may make the "name" aspect of formal public identifiers less meaningful to Europeans and non-roman script users.

Comparison rules (to determine sameness of named objects) are defined by 9070. I am not sure if they are case-sensitive or not, since I'm lacking access to some reference materials at the moment.

>2) Public identifiers have a separation between owner-name components and
>object-name components which has no equivalent in object identifiers.
>(This separation may well prove artificial and lead to errors.)

Rather than being a drawback, this separation may prove critical to enabling easy support for different methods of encoding object names, distinct from the issuing authorities.

I am not a number. I am an undefined character.
_________________________________________
David Durand              dgd@cs.bu.edu  \  david@dynamicDiagrams.com
Boston University Computer Science        \  Sr. Analyst
http://www.cs.bu.edu/students/grads/dgd/   \  Dynamic Diagrams
--------------------------------------------\  http://dynamicDiagrams.com/
MAPA: mapping for the WWW                    \__________________________
Received on Thursday, 5 December 1996 22:31:02 UTC