- From: Ian B. Jacobs <ij@w3.org>
- Date: Mon, 11 Feb 2002 15:39:33 -0500
- To: www-tag@w3.org
TAG teleconference 4 Feb 2002

All present: Tim Berners-Lee (TBL, Chair), Tim Bray (TB), Dan Connolly (DC), Paul Cotton (PC), Roy Fielding (RF), Chris Lilley (CL), David Orchard (DO), Norm Walsh (NW), Stuart Williams (SW), Ian Jacobs (IJ)

Previous meeting 28 Jan: http://lists.w3.org/Archives/Public/www-tag/2002Jan/0235
Next meeting: 12 Feb face-to-face
Regrets: CL

See also IRC log: http://www.w3.org/2002/02/04-tagmem-irc

A summary of open action items may be found at the end of this message.

---------------------
Agenda:

1) uriMediaType-9: Why does the Web use mime types and not URIs?
2) whenToUseGet-7: How to handle idempotent queries?
3) namespaceDocument-8: What should a namespace document look like?
4) Language bindings
5) nsMediaType-3: Relationship between media types and namespaces?
6) Determining charsets
7) On using formal models for TAG work
---------------------

--------------------------------------------------
1) uriMediaType-9: Why does the Web use mime types and not URIs?
http://www.w3.org/2001/tag/ilist#uriMediaType-9
--------------------------------------------------

DC: I don't know how we can contribute to life as we know it by addressing this.

TBL: We could suggest that it would be good if mime types became first-class objects.

TB: This issue has an IETF feel to me.

DC: I don't agree there's a problem. I agree that people talk about this a lot.

RF: MIME types are resources. As long as you have a well-established namespace, they become URIs whether people like it or not. The people who control MIME type space don't think they should be URIs.

The TAG observed that different specifications (e.g., RDDL, Canonical XML) are using different conventions for making a URI of a media type.

Resolved: Accept issue uriMediaType-9.

Action RF: Summarize current approaches for making a URI of a media type.

----------------------------------------------------
2) whenToUseGet-7: How to handle idempotent queries?
http://www.w3.org/2001/tag/ilist#whenToUseGet-7
----------------------------------------------------

In addition to the original question of the issue (When to use GET?), the TAG added the question of how to handle idempotent queries (with a new POST-like method? GET plus a body?).

----------------------------------------------------
3) namespaceDocument-8: What should a namespace document look like?
http://www.w3.org/2001/tag/ilist#namespaceDocument-8
----------------------------------------------------

Resolved: Accept issue namespaceDocument-8.

----------------------------------------------------
4) Language bindings
----------------------------------------------------

On 24 Jan 2002, Jim Fuller sent a request [1] to the TAG to consider the "issue of language binding":

  "Language binding was explicitly dropped from XSLT 2.0, in the recognition that a common approach was required across the W3C."

TBL: This is about API bindings.

NW: When we published XSLT 1.1, it included language bindings (for how to do function calls from XSLT). It created a firestorm. Nobody could agree that we should do this (in the XSLT WG). I have no confidence that this should be done across all possible bindings.

CL: Most XSLT implementations allow you to do this (via extensions).

NW: Extension functions pose interoperability problems.

DC: I'm torn on this. I like how XSLT extensions work in general (but for too many 404s). On the other hand, there are a lot of places in W3C specs for APIs; they use central registries for tokens.

NW: I observe that http://exslt.org/ publishes some common extension functions. They publish definitions, which are implemented in various XSLT processors. You can use 'function-available' to find out whether they are available.

The TAG spent some time debating whether W3C should standardize a runtime library.
PC: The issue here is not whether there should be a standard set of functionalities accessible from XSLT; it is whether XSLT (or another specification) should standardize the extensibility mechanism that allows any extension function to be used. This is a rat-hole since it would have to work across languages. You'd have to map datatypes across languages, which is no easy task.

DC: The 'function-available' bit is what I'd want to look at, if anything. (SAX has such a thing, using URIs; yeah! DOM has one, that doesn't, last I looked. Boo.)

TBL: This is normally done on the platform, not where W3C has normally been.

Resolved: No action.

[1] http://lists.w3.org/Archives/Public/www-tag/2002Jan/0194

--------------------------------------------------
5) nsMediaType-3: Relationship between media types and namespaces?
http://www.w3.org/2001/tag/ilist#nsMediaType-3
--------------------------------------------------

[Note from scribe: The minutes attempt to piece together several parallel discussion threads on the telephone and IRC. Some comments are not presented in the exact chronological order in which they were made, in order to preserve the different threads.]

The TAG discussed the example in section D.2 [2] of the XSLT 1.0 Recommendation. The style sheet starts:

  <html xsl:version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        lang="en">
  ...

If this style sheet is fed to an HTML browser, the browser may consider it to be HTML (due to the <html> element) and, while the document is not valid HTML, the browser might be able to handle it. If fed to an XSLT processor, the result will be entirely different: a generated HTML document.

Several participants pointed out that this template is syntactic sugar: a simplification for:

  <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns="http://www.w3.org/TR/xhtml1/strict">
    <xsl:template match="/">
      <html>
      ...

The TAG considered the question: Is this an HTML document or an XSLT document?
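
[Editorial illustration, not part of the discussion: a minimal sketch of why the question above has no single obvious answer. It assumes two hypothetical heuristics, classify_by_root and classify_by_xslt_markers, that nobody at the meeting proposed in this form; the document body is a placeholder, since the minutes elide the rest of the style sheet.]

```python
import xml.etree.ElementTree as ET

XSLT_NS = "http://www.w3.org/1999/XSL/Transform"

# Opening of the simplified style sheet quoted above; the <body> content is
# a placeholder invented for this sketch.
SIMPLIFIED = """<html xsl:version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      lang="en">
  <body><p>placeholder</p></body>
</html>"""


def split_name(name):
    """Split ElementTree's '{namespace}local' form into (namespace, local)."""
    if name.startswith("{"):
        namespace, local = name[1:].split("}", 1)
        return namespace, local
    return "", name


def classify_by_root(doc):
    """Hypothetical heuristic 1: only the outermost element is consulted."""
    namespace, local = split_name(ET.fromstring(doc).tag)
    if namespace == XSLT_NS:
        return "xslt"
    return "html" if local == "html" else "unknown"


def classify_by_xslt_markers(doc):
    """Hypothetical heuristic 2: any XSLT-namespace element or attribute counts."""
    for element in ET.fromstring(doc).iter():
        names = [element.tag, *element.attrib]
        if any(split_name(name)[0] == XSLT_NS for name in names):
            return "xslt"
    return "unknown"


print(classify_by_root(SIMPLIFIED))          # html
print(classify_by_xslt_markers(SIMPLIFIED))  # xslt
```

The same bytes are labelled "html" by one heuristic and "xslt" by the other, which is the ambiguity the discussion turns on.
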

TBL proposal: The namespace on the root element determines subsequent behavior (i.e., the outermost piece rules).

TB: There are important exceptions to that rule.

NW: The above template essentially says "copy me, except when you encounter elements in XSLT namespace." The problem I have with calling this an xhtml document is that it wouldn't validate as an xhtml document.

DO noted that the same issue would be relevant for other namespaces (e.g., SOAP) on the root element.

DC: I'm trying to figure out whether we're designing what we want or describing what's already there. It's clear what happens when you hand a mixed document (such as the one in the example) to an XSLT processor. It's also clear if you hand the document to an HTML processor (though that's outside the HTML specification). There are lots of ways to handle this today (what this HTML browser does, or that XSLT processor does). If we're playing the "describe what exists" game, the architecture is: an XML document doesn't say what its purpose is; you have to have a protocol and a document before you know what the "meaning" is. If we're playing the "design what we want" game, I might agree that we want to be able to just look at a document to see what it means.

DO: In this example, XSLT is using xhtml at the top as a shorthand. I don't think you can use the top level element as a guaranteed deciding factor for establishing what a document means. The fact that there is XSLT in the document says that it's an XSLT document. What happens if we have two vocabularies that both say that they want to be the "first thing"? What about XSLT and XQUERY in the same document? They will argue over who is "more important." It sounds like the author needs to specify what the top-level processor should be.

TB: I agree in general that namespace dispatching is appropriate and is better done contextually.
But I think that trying to make a strong statement about the root element namespace may create more problems than it solves. I can come up with scenarios where you might want to reach into the middle of a document and do some things without regard to context. I don't want to send everything as application/xml and do everything based on namespaces.

PC: If you don't believe you should dispatch on namespaces, what should you dispatch on?

TB: Media types if you can. It's more efficient for the sender to tell the recipient what is being sent, when the sender knows what the content is.

At this point, the following discussion on IRC diverged from the discussion on the phone:

CL: This situation is a result of the architecture that we have: a single document that has things "hanging off it." An alternative might be to send a wrapper that says "here are the pieces." As the "primary thing," the wrapper would convey the "meaning"; other documents would be derived from it.

DC: How is a "wrapper" different from a document?

CL: A wrapper is like a manifest; there might not be a single top document. We can have a Web, not a tree. A wrapper doesn't get presented. It's more like a zip file and a table of contents.

DC: About the wrapper/manifest - it seems like the question of RDDL v. XML Schema; I don't see any fundamental difference. Either one can point to the other.

The two conversations then rejoined on the subject of packaging.

NW, DO: I agree with CL that this is a packaging issue.

TBL: The TAG should be able to tell people that they can tell what a document is by looking at the bits in it and the MIME type.

TBL: I'm worried by TB's statement that he will want to pull out content from the middle of a document based on its namespace. I want to be able to say "I got the following from so-and-so and I totally disagree with it."

TB scenarios:
 a) I want to build a table of the XLinks in a document.
    I want to run through and look for content with xml:lang="ja"
 b) I want to be able to reach into a document and check for elements that have a digital signature and check them.

CL: If you look at the template, it tells you what processor should get the content after the transform. You get advance warning that N is the namespace you'll end up with as the root.

DC scenario: A link-checker.

TBL: How will I ever send a package without you delving into it and pulling something out and considering it a document?

NW: That's not enforceable.

The discussion then shifted to the topic of whether meaning was based on the author's intention or the meaning that the recipient can gather from the content.

TBL: What decides when it's ok to look inside? What criteria do you use?

NW: I (the recipient) decide. For instance, you may have sent me something that I can't read (e.g., because it's in Japanese), but I see that there's some SVG, so I look at the picture.

IJ: This is the model we have for accessibility: the author proposes, the user disposes.

TBL: But you do it when you understand that something is a package.

RF: Packaging is not relevant to this discussion.

DC: Packaging seems to be relevant because people keep saying "packaging" when this comes up. I don't understand the relevance either, Roy, but I can't dispute it.

DC then imposed a reality check.

DC: I get nervous if we are designing the world we want since no software does this.

TBL: I'd like to look at a clean world, then look at the real world and see why people are doing different things. In some cases, the only difference is attitude.

DC: If we had code that had a common model, I'd be happy to specify it.

SW: In general, is the MIME type redundant information? Can it always be derived from the content (e.g., what about mixed content that might validate in different ways)?

Reply: No, one cannot always derive a unique media type by looking at the content.

RF: All content can be recognized as multiple types.
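
[Editorial illustration, not part of the discussion: a minimal sketch of the context-free processing described in TB's scenario (a) and DC's link-checker scenario above. The sample document and the function names xlink_targets and elements_in_language are invented for this sketch; only directly declared xml:lang values are matched, and inheritance of xml:lang by descendants is ignored.]

```python
import xml.etree.ElementTree as ET

XLINK_HREF = "{http://www.w3.org/1999/xlink}href"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical mixed-vocabulary document, invented for this sketch.
SAMPLE = """<report xmlns:xlink="http://www.w3.org/1999/xlink">
  <section xml:lang="ja">
    <ref xlink:href="http://example.org/a"/>
  </section>
  <note><ref xlink:href="http://example.org/b"/></note>
</report>"""


def xlink_targets(doc):
    """Collect every xlink:href value, whatever element carries it."""
    return [element.attrib[XLINK_HREF]
            for element in ET.fromstring(doc).iter()
            if XLINK_HREF in element.attrib]


def elements_in_language(doc, lang):
    """Return the tags of elements that declare xml:lang=lang on themselves."""
    return [element.tag
            for element in ET.fromstring(doc).iter()
            if element.attrib.get(XML_LANG) == lang]


print(xlink_targets(SAMPLE))
print(elements_in_language(SAMPLE, "ja"))
```

Neither function consults the root element or a media type; each reaches into the document and acts on one namespace alone, which is the kind of reuse TB defends and TBL is wary of.
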

DO: This is an interesting question - how do you know which pieces of content are targeting different processors? SOAP solves this problem.

DC: I disagree, Dave. SOAP doesn't "solve" the multiple-protocols issue. I can take a SOAP document and look at it in emacs, which is not what the SOAP headers call for.

TB: I stand by my claim that you can legitimately look inside a document and ignore container elements (see TB's scenarios above). I have spent a lot of time fighting for generic markup: one great virtue of generic markup is that it may be reused in ways the author did not predict.

DC: This is the Principle of Least Power [3] in action.

RF: For the question "Does namespace always reflect media type?", the answer is no. Can you inspect content and derive a media type? No. Media type and namespace overlap structural content and the purpose of how the author intends content to be used. You can have whatever top-level element you want. No matter how you look at a document, it can be interpreted as being in different namespaces. The point of the media type is to convey the author's intention, not dictate how the user must interpret the message.

DC: There are many ways to look at a document; I'm not sure whether we are going to specify some preferred ones.

Discussion then shifted to the topic of the "meaning" of content versus information about how to process it.

TBL: Some confusion here about what specs should say. There is lots of talk about processing models. I think our job in writing a specification is to say what a document type means, not what you do with it. Things get clearer when you talk about what a document means rather than what to do with it. For instance, commerce relies on knowing that something is an invoice, whether you put it on the wall, trash it, etc.

DC: It's not the bytes in the content that make something an invoice, it's the context (message you sent me).

TBL: Format specifications should be written so that they explain the meaning of a document if you get the bits and a mime type. Whether something is an invoice is based on a human-understandable protocol. The meaning of a document is independent of what you can do with it (see Axioms of Web Architecture: the meaning of a document [4]).

DC: I totally disagree; the meaning of an invoice has everything to do with what you're expected to do with it.

RF: It really sounds like these issues are wrapping themselves around the general issue of what a media type is on the Web. We need to write down a philosophy in a livable form.

SW: I'd question whether documents have multiple interpretations and the meaning of a document can depend on the interpretation it is subject to.

The TAG then reviewed a TB draft explaining TAG findings on issues w3cMediaType-1, customMediaType-2, and nsMediaType-3. Those findings are publicly available at:

  http://www.w3.org/2001/tag/2002/0129-mime

[2] http://www.w3.org/TR/xslt#data-example
[3] http://www.w3.org/DesignIssues/Principles#PLP
[4] http://www.w3.org/DesignIssues/Meaning

-----------------------
6) Determining charsets
-----------------------

DC: I verified that parameter names are local to each MIME type.

TBL: Should we recommend that all xml mime types have a charset parameter?

CL: I would object to assuming that everything has charset despite what a given specification says.

Agenda item for next meeting: How do we resolve the problem in processing where there is a reference to charset that may not be defined?

--------------------------------------
7) On using formal models for TAG work
--------------------------------------

TBL: Should we set as our goal to use formal models where appropriate (e.g., describe the mathematical relation between HTTP requests and responses)? DC has used Larch [5] to model HTTP.

RF: I think someone who wants to see a model happen should write the model.
With formal models, I usually run into a problem with a need to make simplifying assumptions about how the technology works. Usually, the things that make a technology difficult are the parts that you most need to understand (and are difficult to model formally). E.g., edge cases are added over time as people in a Working Group realize that edge cases are tough to describe formally. And if you don't understand the (messy) edge cases, you don't understand the system.

TBL: Using a formal model lets you see the invariants of a system.

PC: I'm a little concerned about spending time on formal definitions when our charter is for higher-level descriptions of architecture. While I agree that formal modeling is useful, I'd prefer to allocate that to a Working Group.

DC: I'd be surprised if we did something that didn't lend itself to formalization.

No decision.

[5] http://www.w3.org/XML/9711theory/

=============================
Summary of action items
=============================

Open:

TBL: Find out what kind of editing access to the Web site will be available to TAG participants.
     Status: Not done, but TBL has been in contact with systems team.
     Assigned: 7 Jan 2002.

PC: Draft a response to Duane Nickull on www-tag with recommendation to contact Web Services Architecture Working Group.
     Assigned: 28 Jan 2002.

RF: Summarize different approaches currently used for mapping URIs to media types.
     Assigned: 4 Feb 2002.

TB: Update draft findings on first three issues and make public.
     Assigned: 4 Feb 2002.

Closed:

DC: Verify that parameter names are local to each MIME type.
     Assigned: 28 Jan 2002.
     Done: http://lists.w3.org/Archives/Public/www-tag/2002Feb/0012
     Summary: "charset" is not defined across all media types. There are NO globally-meaningful parameters that apply to all media.
     Reference: RFC2045 http://www.ietf.org/rfc/rfc2045.txt

--
Ian Jacobs (ij@w3.org) http://www.w3.org/People/Jacobs
Tel: +1 718 260-9447
Received on Monday, 11 February 2002 15:42:58 UTC