Re: Attributes: Warwick Framework from Ron Daniel Jr. on 1997-02-13 (w3c-dist-auth@w3.org from January to March 1997)

From: Ron Daniel Jr. <rdaniel@lanl.gov>
Date: Thu, 13 Feb 1997 16:14:15 -0700
To: Judith Slein <slein@wrc.xerox.com>, Jim Whitehead <ejw@ics.uci.edu>
Cc: w3c-dist-auth@w3.org
Message-Id: <3.0.32.19970213161350.00987100@acl.lanl.gov>
At 07:07 AM 2/13/97 PST, Judith Slein wrote:

>There is also a set of more formal specifications, intended to become
>Internet Drafts, that I don't want to distribute before checking with the
>author (Ron Daniel at Los Alamos).  They look very rough right now, but
>include a description of a new application/relations MIME type that might be
>interesting to us in its own right, as a way of describing and transporting
>relationships between resources.

I've tidied up one of those documents slightly; it is appended. It defines
a media type for expressing the relations between the body parts in a
MIME multipart/related message (or between any URI-addressable resources).
There is an example in there that shows why people might want
to use multipart/related for WEB-DAV. It doesn't talk about why people
might want to break metainformation into packages, but if someone
really wants me to say why that is (IMHO) vital, press my hot button.

[Last-minute note. Judy just sent a message that talks about our phone
call and provides some of the reasons why multipart/related is a good
idea for splitting up "attribute sets" into packages. One minor nit
concerns media types. Currently media types are associated with formats,
not schemas.]

Regards,
Ron

========



proto-internet-draft                                            Ron Daniel
                                            Los Alamos National Laboratory
                                                              anyone else?


                    The application/relations Media Type

          This will be an Internet Draft when it grows up.
          The full disclaimer will then go here.


Abstract
========

The wide variety of resources that are accessible over the Internet,
and the diversity of uses to which they can be put, make it impossible
to define one schema for describing the resources. Instead, different
descriptive schemes will be used to meet different needs. The multipart
MIME types can be used to aggregate related descriptions. However, the
receiver of such a message has a difficult task. It must determine the
relations of the various body parts from their media types in order to
determine which ones to process and what processing they should
receive.

The application/relations media type is proposed as a means for
specifying the relationships between Internet-accessible resources.
Typically these resources will be body parts in a multipart/related
message, but they can be any resources identifiable by a URI. The
application/relations media type does not mandate a particular set of
semantics for relationships; instead, it offers a way of specifying
multiple relationship schemas. The relationship names in those
schemas may even be URIs, so that relationships are first-class
resources. One such relationship schema is defined
in the companion draft "A Basic Schema for Expressing Relationships
between Network-accessible Resources"[cite].

Introduction
============

The Warwick Framework [cite] defines an architecture for containers and
packages of "metadata". One package might be a Dublin Core [cite]
description of a network-accessible resource. Another package might be
a more detailed description of the resource using a community-specific
standard such as FGDS (Federal Geospatial Data Standard?). Additional
packages might give a revision history of the resource, a digital
signature of the resource, terms and conditions for accessing the
resource, etc. These packages can be aggregated into containers, which
become new packages in their own right. Versioning, signatures, etc.
can be applied to the metadata packages as well as to the original
resource.

This architecture has a natural implementation in MIME. Containers are
implemented as multipart/related entities [cite]. The packages are
multipart, message/external-body, or simply typed entities such as
application/usmarc.

While it is easy to encode a Warwick Framework package using MIME, the
receiver is presented with a rather formidable task. It must crack the
nested MIME wrappers, then try to determine the relations between the
various body parts. This latter task is, in general, impossible to
perform after the fact. The application/relations Internet Media Type
is proposed to solve that problem. It provides statements about the
relations between network-accessible resources. This could be used to
provide a "table of contents" for a Warwick Framework message, or for
any multipart/related message. This allows the sender to make explicit
the relationships between various body parts so that the receiver can
tell what packages to process first.

Just as there is no one true set of "metadata elements", there is no
one true set of "relationships" between URI-addressable resources.
Therefore, the application/relations format does not define a set of
relations.  Instead, it provides two mechanisms for allowing different
relationship schemata to be used. First, and most general, the
relationship identifier may be a URI. This will provide a unique name
for the relationship between packages. It also opens up the possibility
for relationships to be self-defining. Format negotiation may be used
to fetch the appropriate definition of the relationship. Second, a
(schema URI) directive exists so that relationship names may be defined
relative to a labeled schema.  This allows for more familiar-looking
relationship names, such as (terms-for-access ...) instead of
(http://www.foo.com/schema1/terms ...).

The media type also allows us to state properties of particular
packages, such as the presence of URN resolution information, the use
of encryption, etc.


Overview and Example
====================

The application/relations format provides a simple syntax
(s-expressions) for stating that a particular relationship exists
between resources. The relationships are drawn from schemas that define
the name, cardinality, argument types, etc. The schema to use when
evaluating a relationship is specified by the (schema) expression. A
simple example of an application/relations entity might be:

(schema http://www.acl.lanl.gov/URN/simple-schema.scm
    (digital-signature http://www.acl.lanl.gov/~rdaniel/resume.html
                       http://www.acl.lanl.gov/~rdaniel/res-sig.asc)
    (revision-history  http://www.acl.lanl.gov/~rdaniel/resume.html
                       http://www.acl.lanl.gov/~rdaniel/res-versions.cvs)
    (bibliographic-description http://www.acl.lanl.gov/~rdaniel/resume.html
                               http://www.acl.lanl.gov/~rdaniel/res-md.dc))

Relationships are typically between two sets of URIs, the source and
destination of the relationship. However, the definition of a relationship
is under the control of the enclosing (schema) element, so n-ary relationships
are possible.
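
As a rough illustration of how a receiver might read such an entity, the
following Python sketch tokenizes the s-expression text into nested lists
and lists each relationship together with the schema URI it is evaluated
under. The function names (tokenize, parse, relationships) are invented
for this sketch and are not part of the draft. The sample text follows
the example above.

```python
import re

# Comments, parentheses, quoted strings, or runs of other characters.
TOKEN = re.compile(r';[^\n]*|[()]|"[^"]*"|[^\s()";]+')

def tokenize(text):
    """Return the tokens of an s-expression text, skipping ';' comments."""
    return [t for t in TOKEN.findall(text) if not t.startswith(";")]

def parse(tokens):
    """Turn a flat token list into nested Python lists."""
    stack = [[]]
    for tok in tokens:
        if tok == "(":
            stack.append([])
        elif tok == ")":
            if len(stack) == 1:
                raise ValueError("unexpected ')'")
            finished = stack.pop()
            stack[-1].append(finished)
        else:
            stack[-1].append(tok.strip('"'))
    if len(stack) != 1:
        raise ValueError("missing ')'")
    return stack[0]

def relationships(text):
    """List (schema URI, relationship name, arguments) for each assertion."""
    found = []
    for expr in parse(tokenize(text)):
        if isinstance(expr, list) and expr and expr[0] == "schema":
            schema_uri = expr[1]
            for rel in expr[2:]:
                found.append((schema_uri, rel[0], rel[1:]))
    return found

EXAMPLE = """
(schema http://www.acl.lanl.gov/URN/simple-schema.scm
    (digital-signature http://www.acl.lanl.gov/~rdaniel/resume.html
                       http://www.acl.lanl.gov/~rdaniel/res-sig.asc)
    (revision-history  http://www.acl.lanl.gov/~rdaniel/resume.html
                       http://www.acl.lanl.gov/~rdaniel/res-versions.cvs)
    (bibliographic-description http://www.acl.lanl.gov/~rdaniel/resume.html
                               http://www.acl.lanl.gov/~rdaniel/res-md.dc))
"""

for schema_uri, name, args in relationships(EXAMPLE):
    print(name, args)
```

An unbalanced entity raises ValueError in this sketch rather than being
silently repaired, which seems the safer choice for a catalog that other
processing depends on.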

The argument to the schema element is a URI. That URI provides a unique
name to prevent namespace collisions between different relationships
that might happen to have the same name. The resource that is
identified by the URI may be anything, but should be a formal
specification of the relationship schema.  We expect that the resource
will have an executable media type, such as application/java-byte-code.
One particular such format, application/relation-schema, is defined in
a companion draft[cite].
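
One way to picture the namespace role of the schema URI is to key every
relationship by the pair (schema URI, relationship name). The Python
sketch below does that. The second schema URI and both descriptions are
made up for illustration, and lower-casing the name is an assumption
based on the case-insensitive symbols in this draft's grammar.

```python
def qualified(schema_uri, name):
    """Return a key that is unique across schemas for a relationship name."""
    # Assumption: names compare case-insensitively, URIs are compared as given.
    return (schema_uri, name.lower())

SIMPLE = "http://www.acl.lanl.gov/URN/simple-schema.scm"
OTHER = "http://example.org/other-schema"   # hypothetical second schema

# Hypothetical descriptions; the draft does not define these meanings.
meanings = {
    qualified(SIMPLE, "revision-history"): "CVS log for the resource",
    qualified(OTHER, "revision-history"): "list of superseded editions",
}

# Same short name, two distinct relationships.
print(len(meanings))
print(meanings[qualified(SIMPLE, "Revision-History")])
```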


Format
======

The format of an application/relations body part is a series of
s-expressions.  S-expressions are the parenthesized, prefix-notation
expressions used in programming languages like LISP. Note that we are
not defining a programming language. The application/relations media
type only defines the semantics of two expressions - one for defining
variables and another for specifying the relationship schema to use
when interpreting a relationship expression. All this media type does
is define a very simple syntax for expressing assertions about
relations between network-accessible resources.

Every application/relations s-expression MUST be a syntactically valid
expression in the programming language Scheme[cite], assuming that the
relationship names are the names of Scheme procedures which are loaded
into the system upon encountering a (schema) expression. Note that the
converse is not true - not every legal Scheme expression is a valid
expression for use in a resource of type application/relations. In
particular, procedure definitions, either by use of (lambda) or (define
(proc-name proc-args) proc-body) are forbidden in application/relations
resources. The legal expressions in an application/relations body part
are a strict subset of those expressible in Scheme. The grammar
defining those expressions, using the EBNF of RFC-xxxx, is:

   body            :=  1*s-expr
   s-expr          :=  list / symbol / constant / comment
   list            :=  "(" *s-expr ")"
   symbol          :=  "define" / "schema" / user-symbol
   user-symbol     :=  ALPHA *ALPHA-NUM-HQBG
      # User symbols must either be quoted or previously defined. Also,
      # symbols are CASE-INSENSITIVE.
   ALPHA           :=  "A" / "a" / "B" / "b" / ... / "z"
   ALPHA-NUM-HQBG  :=  ALPHA / "0" / "1" / ... / "9" / "-" / "?" / "!" / ">"
   constant        :=  string / quoted-list / quoted-symbol / #-const
   string          :=  <"> *any-valid-character <">
   quoted-list     :=  "'(" *s-expr ")"
   quoted-symbol   :=  "'" ALPHA *ALPHA-NUM-HQBG
   #-const         :=  "#t" / "#f" / .. Scheme defines lots of these. Do we
                       really care?
   comment         :=  ";" *any-non-EOL-or-EOF-char (EOL / EOF)
   EOL             :=  CR / LF / LFCR / CRLF
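
The ban on procedure definitions can be checked mechanically. The Python
sketch below reads a body into nested lists and reports any (lambda ...)
form or any (define (proc-name proc-args) proc-body) form. It is only an
illustration under simplifying assumptions: the helper names (read,
violations, check) are invented here, and quoted lists are walked like
ordinary lists, which is stricter than the grammar requires.

```python
import re

# Comments, parentheses, quoted strings, or runs of other characters.
TOKEN = re.compile(r';[^\n]*|[()]|"[^"]*"|[^\s()";]+')

def read(text):
    """Parse s-expression text into nested lists; strings keep their quotes."""
    stack = [[]]
    for tok in TOKEN.findall(text):
        if tok.startswith(";"):
            continue                      # comment, runs to end of line
        if tok == "(":
            stack.append([])
        elif tok == ")":
            if len(stack) == 1:
                raise ValueError("unexpected ')'")
            finished = stack.pop()
            stack[-1].append(finished)
        else:
            stack[-1].append(tok)
    if len(stack) != 1:
        raise ValueError("missing ')'")
    return stack[0]

def violations(expr):
    """Yield a message for each procedure definition inside expr."""
    if not isinstance(expr, list) or not expr:
        return
    head = expr[0]
    if isinstance(head, str):
        name = head.lower()               # symbols are case-insensitive
        if name == "lambda":
            yield "lambda expression"
        elif name == "define" and len(expr) > 1 and isinstance(expr[1], list):
            yield "procedure definition with define"
    for sub in expr:
        yield from violations(sub)

def check(text):
    """Return all violations found in an application/relations body."""
    return [msg for expr in read(text) for msg in violations(expr)]

print(check('(define img-1 "http://www.acl.lanl.gov/96summer/tapeb1.gif")'))
print(check('(define (fetch uri) uri)'))
```

Variable definitions such as the first call pass, because the second
element of the (define) form is a symbol and not a list.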


[to do: Carl Lagoze suggested allowing relation names to be URIs. This
 will allow relations to indicate their own schema, and could also allow
 code to be downloaded to implement a relationship - such as terms and
 conditions. Need to change things around a bit to accommodate that.

 BNF doesn't really seem the best way to define what is allowed here.

 Need to show the sorts of things that are allowed, e.g.
  (is-bibliographic target-uri description-uri)
  (collection       collec-uri (part-uri part-uri ... ))
  (collection       collec-uri
                       (part1-uri (propertyname1 value1) (prop2 val2))
                       (part2-uri (prop3 val3) (prop4 val4)))
]
   

More Examples
=============

As a moderately complex example, consider a scenario where a user 
asks to check out an HTML page for editing. The page has links to
embedded images, and there are associated documents providing
metadata about the HTML page which may also need to be edited if
the page is modified sufficiently. The document management system
locks out other edits to the page, and returns its HTML source in
a multipart/related wrapper. That wrapper also includes the images
and meta-info packages as additional body parts. The first body
part in the result might be an application/relations catalog describing
the various resources and their inter-relations.

The complete message might look like:

MIME-Version: 1.0
Content-type: multipart/related; boundary="#####"

--#####
Content-type: application/relations

(schema "http://www.acl.lanl.gov/URN/simple-schema.scm"

  ;; Define short names for these long URLs
    (define target-resource "http://www.acl.lanl.gov/~rdaniel/index.html")
    (define img-1  "http://www.acl.lanl.gov/96summer/tapeb1.gif")
    (define img-2  "http://www.acl.lanl.gov/~rdaniel/rdaniel.gif")
    (define meta1  "http://www.acl.lanl.gov/~rdaniel/index.html.dc")
    (define hist1  "http://www.acl.lanl.gov/CVS/foo")

  ;; Here are the relations:
    ; We can specify simple assertions about a resource.
    (is-target-resource  target-resource)

    ; We can specify binary relations between resources in the container
    (biblio-descrip  target-resource meta1)

    ; Relations can refer to external resources (hist1 not in this container)
    (version-history target-resource hist1)

    ; Relations can be to a set of targets. (Actually, relations should
    ;  be able to have arbitrary degree and cardinality, but that can be
    ;  saved for other formats if people really think binary is enough).
    (includes-images target-resource (img-1 img-2))

    ; We can specify additional properties of a relationship.
    (lock-file       target-resource 
                       ("http://www.acl.lanl.gov/boo" (type write-lock)))
)

--#####
Content-type: text/html
Content-ID: <foo@www.acl.lanl.gov>
Content-Location: http://www.acl.lanl.gov/~rdaniel/index.html

<html>
... many lines of fluff ...
</html>
--#####
Content-type: image/gif
Content-ID: <bar@www.acl.lanl.gov>
Content-Location: http://www.acl.lanl.gov/96summer/tapeb1.gif
Content-Transfer-Encoding: base64

... whole bunch of encoded GIF data ...
--#####
Content-type: image/gif
Content-ID: <zot@www.acl.lanl.gov>
Content-Location: http://www.acl.lanl.gov/~rdaniel/rdaniel.gif
Content-Transfer-Encoding: base64

... whole bunch of encoded GIF data ...
--#####
Content-type: application/x-DublinCore
Content-ID: <slam@www.acl.lanl.gov>
Content-Location: http://www.acl.lanl.gov/~rdaniel/index.html.dc

... some Dublin Core description of the page in here ...
(could even do this as a message/external-body if system thinks
 it won't need to be updated by the user but wants to allow for
 the possibility)
--#####--
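
To suggest how a receiver might use such a catalog, the Python sketch
below parses a cut-down version of the message with the standard email
package and reports, for each name defined in the catalog, whether the
resource travels in the container or must be fetched separately. It
assumes that the catalog is the first body part and that body parts are
matched to catalog URIs by their Content-Location headers; neither is
settled by this draft. The names DEFINE and locate are invented for the
sketch.

```python
import re
from email import message_from_string

RAW = """MIME-Version: 1.0
Content-Type: multipart/related; boundary="#####"

--#####
Content-Type: application/relations

(schema "http://www.acl.lanl.gov/URN/simple-schema.scm"
    (define target-resource "http://www.acl.lanl.gov/~rdaniel/index.html")
    (define hist1 "http://www.acl.lanl.gov/CVS/foo")
    (version-history target-resource hist1))
--#####
Content-Type: text/html
Content-Location: http://www.acl.lanl.gov/~rdaniel/index.html

<html></html>
--#####--
"""

# Matches (define name "uri") and captures the name and the URI.
DEFINE = re.compile(r'\(define\s+([^\s()"]+)\s+"([^"]+)"\s*\)')

def locate(raw):
    """Map each defined name to 'in container' or 'external'."""
    msg = message_from_string(raw)
    if not msg.is_multipart():
        raise ValueError("expected a multipart message")
    parts = msg.get_payload()
    catalog = parts[0]          # assumption: the catalog is the first body part
    if catalog.get_content_type() != "application/relations":
        raise ValueError("first part is not an application/relations catalog")
    # URIs of the resources that were shipped inside this message.
    carried = {p["Content-Location"] for p in parts[1:] if p["Content-Location"]}
    return {
        name: "in container" if uri in carried else "external"
        for name, uri in DEFINE.findall(catalog.get_payload())
    }

print(locate(RAW))
```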



Starter Relationships
=====================

To allow a modicum of common treatment of the relationships, the
schema http://www.acl.lanl.gov/URN/simple-relations.html
defines the following relationships and properties:

    Relationships:
        is-bibliographic-info-for
        is-critical-review-of
        is-target-resource
        is-revision-history-of
        is-signature-of
        is-derived-from
        is-child-of
        is-parent-of
        is-terms-and-conditions-of
        is-content-rating-of

    Properties:
        has-urn-resolution-information
        ???
[These are totally off-the-cuff. Suggestions?]

Security Considerations:
========================

Warwick Framework components may be of any media type, including such
fun things as application/java-byte-code, text/csh, text/perl, etc.
Be careful what you do with them.

Wanton access of all external packages related to an original resource
makes the job of traffic analysis easy, as well as being a nasty
consumer of network resources. Client software should only access
external packages when it has a reasonable expectation of being
able to put the package to use.
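
A client might apply that advice with a filter along the following
lines. This Python sketch is only one possible policy: the function name,
the (relationship, source, package) tuple layout, and the choice of which
relationships count as usable are all assumptions made for illustration.

```python
def packages_to_fetch(relations, usable, in_container):
    """Return external package URIs worth fetching, in first-seen order."""
    wanted = []
    for name, _source, package in relations:
        if name not in usable:
            continue      # client cannot use this package; stay off the network
        if package in in_container:
            continue      # already delivered as a body part
        if package not in wanted:
            wanted.append(package)
    return wanted

TARGET = "http://www.acl.lanl.gov/~rdaniel/index.html"
relations = [
    ("biblio-descrip", TARGET, "http://www.acl.lanl.gov/~rdaniel/index.html.dc"),
    ("version-history", TARGET, "http://www.acl.lanl.gov/CVS/foo"),
    ("lock-file", TARGET, "http://www.acl.lanl.gov/boo"),
]
usable = {"biblio-descrip", "version-history"}   # what this client understands
in_container = {"http://www.acl.lanl.gov/~rdaniel/index.html.dc"}

print(packages_to_fetch(relations, usable, in_container))
```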

References:
===========

Warwick Framework

Dublin Core

multipart/related

RFC-xxxx  latest, greatest EBNF for the IETF



Ron Daniel Jr.              voice:+1 505 665 0597
Advanced Computing Lab        fax:+1 505 665 4939
MS B287                     email:rdaniel@lanl.gov
Los Alamos National Lab      http://www.acl.lanl.gov/~rdaniel
Los Alamos, NM, USA, 87545
Received on Thursday, 13 February 1997 18:16:08 UTC