[Minutes] 21 Jan 2002 TAG teleconference

TAG teleconference
21 Jan 2002

Present: Tim Berners-Lee (TBL, Chair), Tim Bray (TB), Paul Cotton
(PC), Roy Fielding (RF), Chris Lilley (CL), David Orchard (DO),
Stuart Williams (SW), Ian Jacobs (IJ)

On IRC: Dan Connolly (DC)

Regrets: Norm Walsh

Previous meeting 14 Jan:

Next meeting:    28 Jan

See also IRC log:

A summary of open action items may be found at the end of this
document.


   1) Administration
      a) First TAG ftf meeting
      b) Panel at Technical Plenary 2002
      c) Meeting in May 2002 around AC meeting/WWW2002
   2) Media types and XML processing

1) Administration

a) First TAG ftf meeting

The TAG discussed its plans for a face-to-face meeting 12
February, with some participation by video link.

b) Panel at Technical Plenary 2002.

TBL: The most important thing I'd like to see out of the meeting:
perceived holes in the architecture. The TAG will be listening
as well as presenting.

PC: The event is public. We need to ensure that topics to be
discussed can be discussed in public.

SW: Let's set expectations on how we will receive input.

PC: That will be covered in the panel introduction.

TBL: At Advisory Committee meetings, attendees find it more
interesting to discuss what has not been decided. Similarly, for
the TAG panel we can discuss what the TAG does not yet know about
Web architecture.

The TAG expects to discuss the panel agenda in more detail at the
first TAG ftf meeting. PC, also on the plenary organizing
committee, will stay on top of the plenary agenda. TAG
participants should read PC's proposal:

c) Meeting in May 2002 around AC meeting/WWW2002

The TAG discussed how it might participate at the WWW2002
developers day [1]. Based on TAG participant availability, TBL
will continue discussions of TAG involvement with the developer
day chair.

[1] http://www2002.org/program.html#devday

2) Media types and XML processing

The TAG discussed three issues regarding XML processing of media
types and namespaces. These are issues w3cMediaType-1,
customMediaType-2, and nsMediaType-3 on the TAG issues list [2].

w3cMediaType-1 [3]: Should W3C WGs define their own media types?
customMediaType-2 [4]: What commonality should there be among W3C
                         media types?
[3] http://www.w3.org/2001/tag/ilist#w3cMediaType-1
[4] http://www.w3.org/2001/tag/ilist#customMediaType-2

Should W3C Working Groups be defining media types?

There was general agreement among the TAG that yes, W3C should
define media types for its format Recommendations.

TB: Some people raised the point that there may be some cases where
there's a low expectation that content will be served as a

TBL: So "yes, where there is a language being produced." As
opposed to a policy, for example.

PC: If the WG is defining something that can be served up as a
resource, define a media type for it.

What are the general guidelines or policies (if any) for W3C
working groups in defining their own media types?

TB: There is the practical matter of who actually defines the
media type. IETF? W3C WG?

TB, PC, TBL: Even though this is IETF work, the W3C WG needs to
ensure that this happens.

TBL: I've found that having a registration document and a
specification separate is kind of crazy. The job of the spec is
to define the media type.  Things may fall between the cracks if
there is a delay between publication of the W3C Recommendation
and registration of a media type.  We should tie registration to
the Rec track process. We should get the media type before

CL: The IETF wants a stable specification. Note that W3C is
asking people to serve content as early as Candidate
Recommendation.
TBL, CL: The registration of the media type should appear in the
W3C specification no later than the start of Candidate
Recommendation.
There was some discussion on RFC 3023, "XML Media Types" [5],

     "...standardizes five new media types -- text/xml,
     application/xml, text/xml-external-parsed-entity,
     application/xml-external-parsed-entity, and
     application/xml-dtd -- for use in exchanging network entities
     that are related to the Extensible Markup Language (XML).
     This document also standardizes a convention (using the
     suffix '+xml') for naming media types outside of these five
     types when those media types represent XML MIME
     (Multipurpose Internet Mail Extensions) entities."
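
As an illustration of the naming convention quoted above, here is a
minimal sketch of how a generic processor might decide that a
Content-Type value names an XML media type. The function name
is_xml_media_type is hypothetical, and the check looks only at the
media type name, not at the content itself.

```python
# The five media types listed in the RFC 3023 abstract quoted above.
XML_MEDIA_TYPES = {
    "text/xml",
    "application/xml",
    "text/xml-external-parsed-entity",
    "application/xml-external-parsed-entity",
    "application/xml-dtd",
}


def is_xml_media_type(content_type: str) -> bool:
    """Return True if a Content-Type value names an XML media type.

    Hypothetical helper: accepts the five types listed above, plus
    any type whose name ends with the '+xml' suffix.
    """
    # Drop parameters such as "; charset=utf-8" and normalize case.
    essence = content_type.split(";", 1)[0].strip().lower()
    return essence in XML_MEDIA_TYPES or essence.endswith("+xml")
```

With this sketch, "image/svg+xml" and "Text/XML; charset=utf-8" are
both recognized, while "text/html" is not.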

CL: What happens if revisions to the specification make it

TB: Media types are less sensitive to versions. Also, I don't
think we should bend policy for the exceptional case.

TBL: Namespaces will refine the granularity.

DO: As TB said, the media type is fairly coarse-grained; people
should look to the namespace for more information. This is
guidance to a process that we think spec developers should go

TBL: Working Groups have to make clear the policy attached to the
namespace (e.g., will never change, may change in compatible
means, etc.). Today, W3C already has a namespace allocation
policy whereby Working Groups must state the policy associated
with namespaces they define in documents (e.g., what is expected
to change over time in the namespace).

Action DO: Draft a response to w3cMediaType-1 based on this
discussion. Some points included:

     - Yes, W3C should define media types for its specifications.
     - The registration form for media types should be part
       of the relevant W3C technical report, no later than
       Candidate Recommendation.

On RFC 3023 as the guidelines for registering media types

RF: Has this been tested? Do browsers work with this?

TB: The whole notion of +xml is to allow generic XML processors
to get in there. As far as I know, nobody has tried this.

RF: I'm worried about whether the "+" will be parsed by media
type parsers.

TBL: Is there SVG served with "+" in the media type?

CL: Yes.

TBL: If it broke browsers, we might have already heard about it.
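
RF's question about "+" can be checked against any given media type
parser. The sketch below uses the Python standard library's
email.message.Message as one example of such a parser; it says
nothing about how the browsers discussed in this meeting behave. The
function name split_media_type is hypothetical.

```python
from email.message import Message


def split_media_type(content_type: str):
    """Parse a Content-Type value with the standard library parser.

    Returns (main type, subtype, charset parameter or None).
    """
    msg = Message()
    msg["Content-Type"] = content_type
    return (
        msg.get_content_maintype(),
        msg.get_content_subtype(),
        msg.get_content_charset(),
    )


# The "+" is kept as an ordinary character of the subtype.
print(split_media_type("image/svg+xml; charset=utf-8"))
# prints ('image', 'svg+xml', 'utf-8')
```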

     - W3C Working Groups should follow RFC3023 for
       registering media types for W3C-defined xml formats.

Homework: The Chair asked the TAG participants to read RFC3023
(in particular, the parts on character encodings in section 7.1)
in order to discuss the proposal.

[2] http://www.w3.org/2001/tag/ilist
[5] http://www.ietf.org/rfc/rfc3023.txt

nsMediaType-3 [6]: Relationship between media types and
                   namespaces
[6] http://www.w3.org/2001/tag/ilist#nsMediaType-3

TBL: I think that the namespace of the first element is
definitive of the document. You can't jump into the middle of a
document in a space you know, if you don't know the outermost

DO: There was a relevant objection on the mailing list: there was
a shorthand example of xslt with an HTML root node. I think you
need to clarify your definition: is it the actual encoded root
element, or the "unshorthanded" version.

TBL: Is that document fundamentally an xslt script or an html
page? You can look at it as an HTML page with some smart bits in

CL: Or as a template that will produce an HTML page.

TBL: It's nice for an xml document to be self-describing. The
only piece you can pick for doing this is the outermost piece.

DC: The XSLT/HTML example shows that the meaning of a document
depends on what sort of agent you present it to. Present that
document to an HTML user agent, and it'll sort of display it
(it'll get confused by the XSLT tags). Present it to an XSLT
engine and it'll fill out the template. Maybe it shouldn't be
that way, but it is.

DO: If you said "logical root node" in the xslt/html case, it
would work.

TB: We could also bite the bullet and say that this case is the
exception that proves the rule.

TBL: We need to define a rule that will work everywhere. We can
build in exceptions that we already know of.

DO: Does this issue come up with xml query and the use of query

PC: I don't think we've done enough thinking about this yet. TBL
is talking about an outer wrapper. We've not talked anywhere
about how a principal processor calls a secondary processor. That's
probably a bug in the overall arch. Not sure we get additional
value by talking only about root

TBL: Here's the MathML Working Group's problem: browsers behave
differently depending on how the xhtml + mathml is served. For
instance, Netscape is dispatching on the toplevel namespace, IE
is not. How can we make documents self-describing without relying
on the media type?

TBL Proposal:

   - When software claims to support a given namespace and is
   given an xml document where the document element is in that
   namespace, the software should process it correctly.
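
A minimal sketch of what dispatching on the namespace of the
document element could look like, assuming Python's
xml.etree.ElementTree. The handler table and the names
document_element_namespace and choose_handler are hypothetical; the
three namespace URIs are the published SVG, XHTML and XSLT
namespaces.

```python
import xml.etree.ElementTree as ET

# Hypothetical table: namespace of the document element -> handler.
HANDLERS = {
    "http://www.w3.org/2000/svg": "svg",
    "http://www.w3.org/1999/xhtml": "xhtml",
    "http://www.w3.org/1999/XSL/Transform": "xslt",
}


def document_element_namespace(xml_bytes: bytes) -> str:
    """Return the namespace URI of the document element, or ""."""
    root = ET.fromstring(xml_bytes)
    # ElementTree spells qualified names as "{namespace}localname".
    if root.tag.startswith("{"):
        return root.tag[1:].split("}", 1)[0]
    return ""


def choose_handler(xml_bytes: bytes) -> str:
    """Pick a handler from the document element's namespace alone."""
    namespace = document_element_namespace(xml_bytes)
    return HANDLERS.get(namespace, "generic-xml")
```

Given the shorthand XSLT case raised by DO, for example
<html xmlns="http://www.w3.org/1999/xhtml"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xsl:version="1.0"/>,
this sketch chooses "xhtml", because it looks only at the encoded
root element. Note also that it parses the whole document before
dispatching.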

TB: Is this independent of how it's served (media type)?

TBL: The media type, as RF said, is a hint.

TB: An issue is making me nervous. Yes, software should dispatch
on namespaces. However, I hesitate to go in the direction of
deprecating media types.  I don't want to fire up an xml
processor and parse the source to find out that content should be
handed to an SVG processor. People should not serve everything as

TBL: Some advantages of serving content with, say, the svg+xml
media type include: (1) more efficient (2) proxies can see into
it (3) software can see into it (e.g., operating system can
change icon).

RF: There are security issues on the client side. Handler
algorithms can bypass security mechanisms. If the client has been
given some content with a specified media type, and the content says
it's something else, the client has to have the sense to stop and
ask the user to confirm.
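
The check RF describes might be sketched as follows. This is an
illustration under stated assumptions, not a policy agreed by the
TAG: the table EXPECTED_TYPE, the function name decide, the return
values, and the choice to treat the generic XML types as consistent
with any content are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical table: document element namespace -> media type
# the client would expect such content to be served with.
EXPECTED_TYPE = {
    "http://www.w3.org/2000/svg": "image/svg+xml",
    "http://www.w3.org/1999/xhtml": "application/xhtml+xml",
}

# Assumption: generic XML types never contradict the content.
GENERIC_XML_TYPES = {"text/xml", "application/xml"}


def decide(declared_type: str, body: bytes) -> str:
    """Return "proceed" or "ask-user" for a labelled XML body."""
    # Drop parameters such as "; charset=utf-8" and normalize case.
    essence = declared_type.split(";", 1)[0].strip().lower()
    root = ET.fromstring(body)
    # ElementTree spells qualified names as "{namespace}localname".
    namespace = ""
    if root.tag.startswith("{"):
        namespace = root.tag[1:].split("}", 1)[0]
    expected = EXPECTED_TYPE.get(namespace)
    if expected is None or essence in GENERIC_XML_TYPES:
        return "proceed"
    if essence == expected:
        return "proceed"
    # The label and the content disagree: stop rather than guess.
    return "ask-user"
```

Content with an SVG document element served as
application/xhtml+xml yields "ask-user" here; the same content
served as image/svg+xml or application/xml yields "proceed".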

TB: I think it's bad behavior for the client to sniff content and
ignore media type (e.g., when bytes suggest a file is HTML).

TB: To some communities, it's also a religious conviction: "If we
say (in the header) that it's UTF-16, then damnit it is."

PC: I'd like someone to write up scenarios as part of our work on
this issue.

/* The TAG discussed sniffing for character encoding
information. */

CL: You can get a situation where you have an incorrect encoding
in the file, and when you save to disk, you have to rewrite.

TBL: Fundamental question as to whether charset is intrinsic or

TB: An XML processor almost always can figure out what the
charset is, better than the server.
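
To make the two sources of charset information concrete, the sketch
below compares the charset parameter of a Content-Type value with
the encoding named in the XML declaration. It is a simplification:
it reads the declaration only from ASCII-compatible bytes and
ignores byte order marks, so it does not handle UTF-16 bodies. The
function names are hypothetical.

```python
import re
from email.message import Message

# Matches the encoding pseudo-attribute of a leading XML declaration.
XML_DECL_ENCODING = re.compile(
    rb"""^<\?xml[^>]*?encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def header_charset(content_type: str):
    """Return the lowercased charset parameter, or None if absent."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def declared_encoding(body: bytes):
    """Return the lowercased encoding from the XML declaration."""
    match = XML_DECL_ENCODING.match(body)
    return match.group(1).decode("ascii").lower() if match else None


def charsets_agree(content_type: str, body: bytes) -> bool:
    """True unless both sources name a charset and they differ."""
    outer = header_charset(content_type)
    inner = declared_encoding(body)
    return outer is None or inner is None or outer == inner
```

A body declaring encoding="UTF-8" but served with
charset=iso-8859-1 makes charsets_agree return False, which is the
kind of mismatch CL describes.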

TB: I think we have pretty good agreement in principle that
dispatching on namespaces is a good thing and that media types
should not be deprecated. The issues will be about the corner

RF: I can add some information on the fundamentals of messages.
For example on the subject of charsets in the IETF: all of the
components in the system need to be consistent in how they
interpret the charset.  If some processors look at charset in
media type and others inside the document, then it's possible to
introduce security errors by modifying the charset in the

Action TB: Post a note to www-tag summarizing the issues
surrounding namespace and media type processing.

Done: http://lists.w3.org/Archives/Public/www-tag/2002Jan/0177

Summary of action items


   DO: Take a first stab at writing a policy to summarize
   resolution of issue w3cMediaType-1.
    Assigned: 14 Jan 2002.

  TBL: Find out what kind of editing access to the Web site will
  be available to TAG participants.
    Status: TBL reports that CVS should be available. TBL
            thinks that people should get collaborator
            accounts at W3C.
    Assigned: 7 Jan 2002.

  PC/IJ: Summarize input on www-tag (including technical
  comments, liaison request). An initial categorization
  of input may be found in the IRC log of the 7 Jan 2002
    Assigned: 7 Jan 2002.


  TB: Post a note to www-tag summarizing issues about media type
  and namespace processing.
    Assigned: 21 Jan 2002.
    Done: See mail to www-tag:

  IJ: Follow up on W3C Process requirements regarding
  collaborator contributions with TB and RF.
    Assigned: 14 Jan 2002.
    Done: See contributor page:

  IJ: Write up a summary of an initial issue-tracking mechanism.
    Assigned: 14 Jan 2002.
    Done: See issue tracking policies page:

  IJ: Register three issues raised by XML Protocols WG.
    Assigned: 14 Jan 2002.
    Done: See issues list:

  DO: As part of preparation for TAG panel at W3C's Technical
  Plenary 2002, solicit input from chairs on what issues the TAG
  should address, and which documents the TAG should produce.
    Assigned: 7 Jan 2002.
    Done: See mail to Chairs (Member-only):

Ian Jacobs (ij@w3.org)   http://www.w3.org/People/Jacobs
Tel:                     +1 718 260-9447

Received on Wednesday, 23 January 2002 18:08:19 UTC