- From: Ron Daniel Jr. <rdaniel@lanl.gov>
- Date: Fri, 14 Feb 1997 10:45:29 -0700
- To: Jim Whitehead <ejw@ics.uci.edu>, w3c-dist-auth@w3.org
At 03:38 PM 2/13/97 -0800, Jim Whitehead wrote:
[and I said]
>>It doesn't talk about why people
>>might want to break metainformation into packages, but if someone
>>really wants me to say why that is (IMHO) vital, press my hot button.
>
>Press. This issue is very relevant to WEBDAV right now.

OK. Sorry for the length of the response, but I did say it was a hot button issue. But let me answer your last concern first since I think we may have a misunderstanding:

>Packaging also seems to have the drawback of
>requiring a more complex metadata data model than links and resources due
>to the complexity of the package data structure.

Packages *are* resources. A package has a media type, can have a URI, etc. I think you are the one who cited Tim's page on metadata principles where he said that "metadata is data". That is exactly what I am arguing for.

Clients and servers will need to exchange metadata for different purposes. Sometimes we need to create or update a bibliographic description of a web page. Sometimes we need to view the revision history or send commands to the version control system. Specialized document types will have specialized descriptions. Over time, user sites will impose new requirements (e.g. provenance tracking).

In my considered technical opinion, these different purposes are better met by having purpose-specific media types defined, and allowing sets of such purpose-specific "packages" to be sent back and forth inside MIME multipart containers. (Later in the message I will provide my reasoning for this statement).

The application/relations stuff is not a package container format. It is one particular package - one that talks about the relations between other packages. We can say things like:

    (is-bibliographic-description-of uri1 uri2)
    (is-digital-signature-for uri1 uri3)
    (is-revision-history-of uri1 uri4)

where uri1 identifies the target resource and the other URIs identify the packages that contain particular forms of metadata.
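
As an illustration of how a receiver might use relation statements like the three above, here is a minimal Python sketch. It assumes the parenthesized, whitespace-separated form shown in this message; the real application/relations syntax may differ, and the names Relation, parse_relations, and packages_for are invented for this example.

```python
import re
from typing import NamedTuple


class Relation(NamedTuple):
    """One relation statement: a relation name followed by its URI arguments."""
    name: str
    args: tuple


# Assumed syntax: "(name arg1 arg2 ...)" with whitespace-separated tokens.
_RELATION = re.compile(r"\(\s*([^\s()]+)((?:\s+[^\s()]+)+)\s*\)")


def parse_relations(text):
    """Return every relation statement found in text, in order of appearance."""
    return [
        Relation(match.group(1), tuple(match.group(2).split()))
        for match in _RELATION.finditer(text)
    ]


def packages_for(relations, target):
    """Map relation name -> package URIs, for binary relations about target."""
    found = {}
    for relation in relations:
        if len(relation.args) == 2 and relation.args[0] == target:
            found.setdefault(relation.name, []).append(relation.args[1])
    return found
```

Given the three statements above, packages_for(parse_relations(text), "uri1") would list uri2, uri3, and uri4 under their respective relation names. Relations with more than two arguments are parsed but skipped by packages_for, which only models the simple binary case.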

application/relations gets more complex because it does not restrict relations to being binary. What the group might do is define a simple format (perhaps only allowing binary relations) that all WEB-DAV compliant systems must support, but allow other formats to be used between consenting parties. Similarly, the group might define one common set of instructions for checking resources in and out of a version control system, but allow other formats for that information so that proprietary capabilities can be exploited.

=============

OK, the preceding was to try and be more clear on what I mean by "packages" and provide a bit of motivation for why I think they are important. The rest of this message is to say more on why I think they are important, and it may help clarify exactly what I mean by packages.

First, I assert that it is impossible in theory, let alone practice, to define the one true all-encompassing metadata set. Anything that purported to be comprehensive would be obsolete the instant it was declared complete, and would be FAR too large to be usable. Then of course there is the practical difficulty in getting all communities to coordinate their naming practices.

What happens now is that particular communities who wish to share their resources define a community-specific metadata standard, and they expend great effort to come up with standards that work well for the needs of their community. Museums have a standard, newspaper photo editors and archivists have a standard, the Geospatial communities have a standard, libraries have a standard, etc. (Actually, all these communities have several standards, but I digress).

When resources are created in a community for the use of others in the community, they are most naturally, AND USEFULLY, described using the community's existing metadata standard. These standards will typically define a single syntax, certain required elements, and the order of appearance of particular elements.
All of this is easily handled by allowing the description to be conveyed as a legal instance of the exchange format for whatever standard is being followed, and using a media type to let receiving applications know what they are receiving. This is a "package".

Preventing communities from using their already-established metadata standards is unacceptable. Trying to define mappings between such standards and the notion of independent, textual, name:value pairs is ...(pause) difficult. :-) Furthermore, it's not necessary. Send those community-specific and purpose-specific "packages" around as typed MIME entities and let helper applications deal with them.

Third, I think that the "ease of use" some people see in having attributes as stand-alone elements is illusory. Consider the attribute "resolution". From dealing with images in web pages, we might agree that "resolution" means the width x height of an image in pixels. Satellite observation folks think resolution is the size of the smallest resolvable feature, measured in meters (or fractions thereof). Legislative folks think that a "resolution" attribute should carry an identifier for a particular legislative motion. Mediators think that "resolution" is a big text field containing a summary of the outcome of a conflict. etc. These interpretations don't even share a data type, never mind a meaning.

The only way I know to disambiguate such uses of "resolution" is to augment the attribute name with the name of the schema from which it is drawn and which defines its meaning. You can either do that in-line for every element, or you can do it once for all the elements that are drawn from the same schema. The latter offers a considerable savings in terms of redundant information - especially once you start allowing attribute names and values to have language and character set specifications. But doing it once typically requires grouping the elements into clumps that are 1/2 back to packages.
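
To make the "name the schema once per clump" option concrete, here is a small Python sketch. The schema identifiers, attribute values, and the names Package and lookup are all invented for illustration; nothing here comes from an actual metadata standard.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Package:
    """A clump of attributes that all draw their meaning from one schema."""
    schema: str                       # hypothetical schema identifier
    attributes: dict = field(default_factory=dict)


def lookup(packages, schema, name) -> Optional[str]:
    """Return the value of name as defined by schema, or None if absent."""
    for package in packages:
        if package.schema == schema:
            return package.attributes.get(name)
    return None


# Two packages use the same attribute name with unrelated meanings.
# The schema is stated once per package, not once per attribute.
packages = [
    Package("example/web-image", {"resolution": "640x480", "format": "GIF"}),
    Package("example/satellite", {"resolution": "30", "units": "meters"}),
]
```

Here lookup(packages, "example/web-image", "resolution") and lookup(packages, "example/satellite", "resolution") return different values without any clash, while a lookup against a schema that is not present returns None.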

Fourth, if I don't know the schema for an attribute, all I can reasonably do with it is throw it out. If you put them into packages with media types, it's easy to tell if you know the format or not and you can drop it on the floor if you don't. You can put different formats with the same general purpose into multipart/alternative. (e.g. Dublin Core, SOIF, USMARC)

>As near as I can tell, by packaging you sacrifice some ability to perform
>content negotiation (e.g., the case where the natural language used for the
>value of attributes varies within a schema -- if a book has a title which
>is available in Russian/Cyrillic and Russian/Roman, and my browser can only
>display roman character sets, while the rest of the information on the book
>is in English).

First, content negotiation can happen at the package level. A library catalog system could emit its records as MARC, or could use that info to generate a Dublin Core or SOIF or GILS or ... description, whatever the client has asked for. Or I could return them all to the client in a multipart/alternative wrapper and let the client figure out what to do.

Second, the particular example you cite can be handled if the package format is defined in a capable enough manner. We might have something like

    --#####
    Content-type: application/foo; charset=ISO10646

    <title lang="Russian/Cyrillic">.....</>
    <title lang="Russian/Roman">...</>
    etc.
    --####

Not all packages will be defined in such a way that they can support this, but I contend that they don't all need to. Some packages will carry very simple information. Let them stay simple. Use multipart/alternative to allow simple and complex metadata packages for the same general purpose to be carried around.
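
The multipart/alternative idea can be sketched with Python's standard email package. This is only an illustration under stated assumptions: the media subtypes below are made-up placeholders, not registered types, and build_alternatives and pick_known are names invented here. Alternatives are ordered from simplest to richest, so the receiver keeps the last part whose type it recognizes and drops the rest.

```python
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart


def build_alternatives(descriptions):
    """Wrap (subtype, payload bytes) pairs in one multipart/alternative entity.

    Callers list the descriptions from simplest to richest.
    """
    container = MIMEMultipart("alternative")
    for subtype, payload in descriptions:
        container.attach(MIMEApplication(payload, _subtype=subtype))
    return container


def pick_known(container, known_types):
    """Return the last (richest) part whose media type is in known_types.

    Parts with unrecognized media types are simply ignored. Returns None
    when no part is understood.
    """
    chosen = None
    for part in container.get_payload():
        if part.get_content_type() in known_types:
            chosen = part
    return chosen
```

A client that only understands the simple format would call pick_known(container, {"application/x-example-simple"}) and read the result with get_payload(decode=True); a client that understands both formats gets the richer one.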

>However, packaging does have some accounting advantages
>(it's easier to remove a package of metadata when the resource it describes
>has been removed rather than a slew of individual metadata resources) and
>packaged metadata may be easier to search by attribute value than
>individual resource metadata.

What I perceive as the major advantages for packages are:

You can use existing metadata formats as-is. This is likely to turn up in a number of places in WEB-DAV. For example, I might want to show the revision history of a document. That can be sent as application/cvs-revison-report or some such thing.

You can provide a simple, common metadata package to allow some level of interoperation between communities. (Something like the Dublin Core or SOIF)

Specialized communities can use their already-standardized metadata formats. This is very important. *very*

If the user needs to edit the metadata package, media types and helper applications provide the means for doing it in a manner that is eminently extensible.

Ron Daniel Jr.              voice:+1 505 665 0597
Advanced Computing Lab        fax:+1 505 665 4939
MS B287                     email:rdaniel@lanl.gov
Los Alamos National Lab      http://www.acl.lanl.gov/~rdaniel
Los Alamos, NM, USA, 87545
Received on Friday, 14 February 1997 12:46:07 UTC