- From: Roy T. Fielding <fielding@simplon.ICS.UCI.EDU>
- Date: Wed, 01 Jun 1994 19:06:40 -0700
- To: Multiple recipients of list <www-html@www0.cern.ch>
This discussion should really be on www-html, so I'm moving it in a
rather arbitrary fashion...sorry (I'm beginning to dislike the split).

I wrote on www-talk:
----------------------------------------------------------------------
<!-- The META element can be used to embed document metainformation not
     defined by other HTML+ elements for use by servers/clients capable
     of extracting that information.  Servers should read the document
     head to generate HTTP headers corresponding to any META elements
     with the HEADER attribute, e.g. if the document contains:

        <meta header name="Expires" value="Tue, 04 Dec 1993 21:29:02 GMT">

     The server should include the header:

        Expires: Tue, 04 Dec 1993 21:29:02 GMT

     as part of the HTTP response to a GET or HEAD request for that
     document.  When the HEADER attribute is not present, the server
     should not generate an HTTP header for this metainformation; e.g.

        <meta name="IndexType" value="Service">

     would not generate an HTTP header but would still allow clients or
     other tools to make use of that metainformation.  Other likely
     names are "Keywords", "Created", "Owner" (a name) and "Reply-To"
     (an email address).
-->

<!ELEMENT META - O EMPTY>
<!ATTLIST META
        id      ID       #IMPLIED  -- to allow meta info --
        header  (header) #IMPLIED  -- generate HTTP header --
        name    CDATA    #IMPLIED  -- metainformation name e.g. "Expires" --
        value   CDATA    #IMPLIED  -- associated value -->
----------------------------------------------------------------------

Dan Connolly replied:

> In message <9406011511.aa29004@paris.ics.uci.edu>, "Roy T. Fielding" writes:
>> corresponding to any META elements with the HEADER attribute,
>> e.g. if the document contains:
>>
>>    <meta header name="Expires" value="Tue, 04 Dec 1993 21:29:02 GMT">
>>
>> The server should include the header:
>>
>>    Expires: Tue, 04 Dec 1993 21:29:02 GMT
>
> Good. Examples. I love examples.
> As a counterexample, consider:
>
>    <EXPIRES DATE="Tue, 04 Dec 1993 21:29:02 GMT">

That's fine, but it assumes that the server/tool knows the syntax of the
EXPIRES element.  The META element allows any name-value pair to be
expressed (and thus parsed) without knowing the purpose of that name or
value.  We can then add new names without changing the spec and all
implementations of the servers.

>> as part of the HTTP response to a GET or HEAD request for that document.
>> When the HEADER attribute is not present, the server should not generate
>> an HTTP header for this metainformation; e.g.
>>
>>    <meta name="IndexType" value="Service">
>
> Counterexample:
>    <?indextype service>
>
>> would not generate an HTTP header but would still allow clients or
>> other tools to make use of that metainformation.
>
> How would they make use of that information? Unless and until there's
> a public agreement about what such data represents, you're talking
> about private techniques. When such general consensus is reached,
> then we add it to the spec. No?

No.  There is a substantial difference between a public agreement (which
may include only a small subset of webspace) and general consensus, and
even when they do coincide there is generally too long of a lag time
between the need and the spec and then also between the spec and the
implementation.  In these situations, it is useful to have a general
means for applying extensions which would not interfere with people who
are not aware of those extensions.

>> Other likely names are "Keywords", "Created", "Owner" (a name)
>> and "Reply-To" (an email address).
>
> Yes... all these belong in the <HEAD>...</HEAD> of an HTML document.
> The HEAD is by design isomorphic to the HTTP headers, or the headers
> of a mail message or a news article. You don't need an extra META
> element to say this.
If this were true with the current implementation of HTML, then I would
agree and we could all get on with our work without any need for the
META element.  If the HTML 2.0 spec is written such that <HEAD></HEAD>
and <BODY></BODY> are required and explicit instructions are included
that browsers not render anything within <HEAD>...</HEAD>, and then all
offending browsers are fixed accordingly, then we can talk about using
any element name within the HEAD as a response header.  If you don't
want to require that in HTML 2.0, then we are stuck with using META
since it is the only way to provide such information across different
versions of HTML -- a necessary requirement for my application.

One question I have regarding use of HTML element names as headers is
what are the character limitations on element names?  From the DTD they
appear to be close enough to rfc822 constraints, but is 34 characters
the actual length restriction or just a uniqueness restriction (or am I
misreading it)?

>>> What is the meaning of the META element? I've heard several
>>> things:
>>>
>>> Proposal: It's for http headers:
>>>    <META name="Expires" value="Tue Aug 12, 1994 10:33:32 CST">
>>> Answer: Then why not write:
>>>    <HTTP-HEADER name="Expires" ...>
>
>>Because metainformation may or may not also be useful as header information,
>>depending on the capabilities of a given server and the existence of
>>future tools which make use of that information.  Nevertheless, it is still
>>metainformation whether or not it is used within response headers.
>
> I would still like to see a definition of this term "metainformation."
> You might say that the TITLE of a document is metainformation. You
> might say ADDRESS is metainformation. But I don't see how this distinction
> is useful.

I consider TITLE to be metainformation as well -- the only distinction
is that there exists a previously defined syntax for TITLE and none for
EXPIRES.
In fact, a syntax for EXPIRES could be defined as well, without having
any impact on the existence of META elements.

My definition: Metainformation is information about a collection of
               information (usually in the form of a document) in terms
               of that collection.   (;-)

The ADDRESS element does not represent metainformation, although its
contents may include some metainformation.  This is because it is
defined to be a rendering element (and thus normal information) and may
occur any number of times within a single document.

>>> I can see the need for:
>>>
>>>    <EXPIRES DATE="...">
>>>
>>> but not a general HTTP header escape mechanism.
>>
>>But can you anticipate the needs of everyone?
>
> No, and I'm not trying to. New elements can and will be added over time.

How???  It is not good enough to say that they will be added -- there
needs to be a specific mechanism defined whereby they can be added
without breaking existing implementations.  We can't define a new
content-type every time we need a new element.

>> My original proposal called
>>for an EXPIRES element like the above and an OWNER element like
>>
>>    <OWNER name="...">
>>
>>It was shot down because it does not satisfy the general need for document
>>metainformation which can be parsed without pre-knowledge of the purpose of
>>that metainformation.
>
> I find that original proposal quite on target, and I don't see how
> the counterargument carries much weight. What examples motivate
> this "general need for document metainformation" that are not
> satisfied by new HEAD elements?

Try parsing the OWNER element above without knowing its purpose (i.e.
without knowing that you will find what you are looking for within the
attribute called "name").  Naturally, we could solve this dilemma by
requiring all such elements to have no attributes and just content, e.g.

    <OWNER>...</OWNER>

as is the case for TITLE.  Of course, this is assuming that existing
browsers are fixed.
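
To make the parsing argument concrete, here is a minimal sketch (not
taken from any 1994 server or from MOMspider) of a tool that extracts
name/value pairs from META elements without knowing what any name
means, and turns only those carrying the HEADER attribute into HTTP
header lines, as the proposal describes.  It uses Python's standard
html.parser module as a modern stand-in; the names MetaCollector,
extract_meta and http_headers are hypothetical and exist only for this
illustration.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect (name, value, is_header) triples from META elements.

    Hypothetical helper: it relies only on the fixed attribute names
    "name", "value" and "header" from the proposed ATTLIST, never on the
    meaning of a particular metainformation name.
    """

    def __init__(self):
        super().__init__()
        self.entries = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names for us.
        if tag != "meta":
            return  # other elements (e.g. OWNER) need element-specific knowledge
        attributes = dict(attrs)
        name = attributes.get("name")
        value = attributes.get("value")
        if name is None or value is None:
            return  # incomplete META element: nothing to record
        # A valueless attribute such as `header` is reported with value None,
        # so test for the key rather than for a truthy value.
        self.entries.append((name, value, "header" in attributes))


def extract_meta(document):
    """Return every META name/value pair found in the document text."""
    collector = MetaCollector()
    collector.feed(document)
    collector.close()
    return collector.entries


def http_headers(entries):
    """Format only the entries flagged with HEADER as HTTP header lines."""
    return [f"{name}: {value}" for name, value, is_header in entries if is_header]


if __name__ == "__main__":
    sample = (
        '<head>'
        '<meta header name="Expires" value="Tue, 04 Dec 1993 21:29:02 GMT">'
        '<meta name="IndexType" value="Service">'
        '<owner name="...">'
        '</head>'
    )
    for line in http_headers(extract_meta(sample)):
        print(line)
```

In this sketch the OWNER element is skipped, because nothing tells the
tool which of its attributes carries the interesting value, while a new
META name would be picked up with no change to the code.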
Another solution is to simply require all metainfo elements to have a
simple, consistent attribute, e.g.:

    <OWNER value="...">
    <EXPIRES value="...">

but that also begs the question of what to do about existing HEAD
elements that do not follow the same conventions, e.g. LINK, BASE,
NEXTID, and (ugh) ISINDEX.

>>> Proposal II: It's for private indexing techniques. Then why not
>>> use comments or processing instructions?
>>>    <?keywords a,b,c,d>
>>>    <?description lksjdflkjsdf>
>>> or
>>>    <!-- @#@# KEYWORDS: a,b,c -->
>>>    <!-- @#@# DESCRIPTION: ... -->
>>
>>Because it is not for PRIVATE indexing techniques.
>
> This is news to me. What is this public agreement about how these
> indexing techniques work? I can imagine some sort of relational
> database abstraction behind it all or something... hmmm...
>
>> There is a multitude
>>of uses for this information, most of which I did not think of when the
>>META element was originally proposed.
>
> In how many cases is this information exchanged between parties, vs.
> the number of cases when it is only used privately by one party?
>

For example, all users of the MOMspider tool have the option of
providing additional metainformation within their HTML files such that
MOMspider can use it in building its maintenance index.  Such
information currently includes LAST-MODIFIED, TITLE, OWNER, REPLY-TO,
and EXPIRES.  The first two are already provided via the server and
HTML -- the last three can be obtained from META elements with the
appropriate names.

Currently, that group of users is extremely limited (i.e. me) and thus
can be considered private.  However, in a few weeks that will expand to
several dozen sites -- is it still private?  Within six months, I expect
it to include at least half the information providers in webspace
(assuming the tool works as expected).
If that occurs, httpd server authors will see the need to include
metainfo in response headers, thus allowing MOMspider to pick up that
info from any URL tested with a HEAD request instead of just those files
traversed at the local site.

This information will become available not by public agreement, but
simply because one tool (or possibly many) can make productive use of
it.  If the information is readily available, other clients will make
use of it and thus whatever scheme is implemented first will become the
de facto standard.  Sound familiar?

Personally, I would prefer the scheme that MOMspider starts with to be
the most general possible, which is why I proposed it six months ago
(long before I started implementing the tool).

....Roy Fielding   ICS Grad Student, University of California, Irvine  USA
                   (fielding@ics.uci.edu)
    <A HREF="http://www.ics.uci.edu/dir/grad/Software/fielding">About Roy</A>
Received on Thursday, 2 June 1994 04:06:50 UTC