Re: code and blockcode

From: Simon Siemens <Simon.Siemens@web.de>
Date: Tue, 12 Jul 2005 14:54:49 +0200
Message-Id: <42D3BD99.9070809@web.de>
To: XHTML-Liste <www-html@w3.org>

In this mail I try to summarize the previous discussion and to highlight
the points that are still open for future discussion.

Currently we have the following situation:

   1. The contents of "code" and "blockcode" should be rendered as
      preformatted text by default (independent of the value of a
      "codelang" attribute).
   2. Code search engines would benefit greatly from a "codelang"
      attribute. (The XHTML specification could mention this.)
   3. Developers would have an easier life if they could install
      browser extensions that handle specific code languages. (The
      XHTML specification could mention this.)
   4. How the code language is specified is still an open point of
      discussion. /Maybe/ the MIME type is an option.

All in all, a "codelang" attribute would add semantics to the "code"
and "blockcode" tags and would benefit user agents in the "world of
coders" using the Internet.
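
To make point 2 a bit more concrete, here is a minimal sketch of how a
code search engine might group snippets by their "codelang" value. It is
only an illustration: the class and function names are made up, the
value "text/x-python" used in the usage note is a hypothetical example
and not an agreed or registered name, and nested code elements are
ignored.

```python
from collections import defaultdict
from html.parser import HTMLParser


class CodeLangIndexer(HTMLParser):
    """Collect the text of code/blockcode elements, grouped by codelang."""

    CODE_TAGS = {"code", "blockcode"}

    def __init__(self):
        super().__init__()
        self.index = defaultdict(list)  # codelang value -> list of snippets
        self._open = None               # tag name of the element being read
        self._lang = None               # its codelang value
        self._chunks = []               # text pieces collected so far

    def handle_starttag(self, tag, attrs):
        # Only the outermost code element is tracked (no nesting support).
        if tag in self.CODE_TAGS and self._open is None:
            self._open = tag
            # Elements without the attribute are filed under "unknown".
            self._lang = dict(attrs).get("codelang") or "unknown"
            self._chunks = []

    def handle_data(self, data):
        if self._open is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if self._open is not None and tag == self._open:
            self.index[self._lang].append("".join(self._chunks))
            self._open = None


def index_code(markup):
    """Return a dict mapping each codelang value to its snippets."""
    parser = CodeLangIndexer()
    parser.feed(markup)
    parser.close()
    return dict(parser.index)
```

Calling index_code() on a page that contains
<blockcode codelang="text/x-python"> elements would return their
contents under the key "text/x-python"; elements without the attribute
end up under "unknown", so they are still rendered and indexed, just
less precisely.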


There are two ways to define the value of the "codelang" attribute:

-- 1 --

MIME types describe the content of a container in a standardized way.
Searching the Internet led to the following results:

The IANA manages the MIME types. Anyone can register a new MIME type
at http://www.iana.org/cgi-bin/mediatypes.pl . There you can also find
the most relevant RFCs for this topic: RFC 2046 describes the naming
scheme behind the MIME types and RFC 2048 describes how new MIME types
are registered. Currently I can find MIME types like


I guess it would be no problem for a code language community to register
its own MIME type. However, it was interesting to find that no MIME
type text/javascript is defined, although it is often used (even by the
W3C). Thus the MIME type system might not be as fixed as it looks. But
it mostly works just fine!
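
As a rough illustration of variant 1, a user agent could at least check
that a "codelang" value has the "type/subtype" shape before treating it
as a MIME type. The sketch below assumes the seven top-level types from
RFC 2046 and the "token" characters from RFC 2045; the function name is
made up, and the check is purely syntactic. It says nothing about
whether a type is actually registered with the IANA.

```python
import re

# The seven top-level media types defined in RFC 2046.
TOP_LEVEL_TYPES = {
    "text", "image", "audio", "video",
    "application", "message", "multipart",
}

# RFC 2045 "token": printable ASCII except space and the tspecials.
_TOKEN = r"[!#$%&'*+\-.^_`{|}~0-9A-Za-z]+"
_MEDIA_TYPE = re.compile("(" + _TOKEN + ")/(" + _TOKEN + ")")


def split_media_type(value):
    """Return (type, subtype) in lower case, or None if malformed."""
    match = _MEDIA_TYPE.fullmatch(value.strip())
    if match is None:
        return None
    top, sub = match.group(1).lower(), match.group(2).lower()
    if top not in TOP_LEVEL_TYPES:
        return None
    return top, sub
```

With this, "text/javascript" is accepted as well-formed even though it
is not registered, which matches the observation above: the syntax is
fixed, but the set of names in actual use is not.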

-- 2 --

On the other hand, we could use a naming scheme of our own. Defining
names for the 30 most popular code languages would serve as a good base.
Eventually authors will agree on one or two variants for each code
language, because they want to be found. It's similar to meta data in
HTML. We have "keyword" and "keywords", and everybody uses one of these,
because they want to get indexed by search engines (I won't start a
discussion about the algorithms search engines use ... ;-)). And keep
in mind that "text/javascript" seems not to be registered by the
IANA ...
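
If authors really do settle on one or two variants per language, a
search engine or browser extension could fold them into one canonical
name with a simple lookup. The following is just a sketch: the alias
table and all names in it are hypothetical examples, not a proposed
list.

```python
# Hypothetical alias table: canonical name -> variants authors might use.
ALIASES = {
    "javascript": {"javascript", "js", "ecmascript", "text/javascript"},
    "python": {"python", "py", "text/x-python"},
    "c": {"c", "text/x-c"},
}

# Invert the table once so that each variant maps to its canonical name.
_LOOKUP = {
    variant: canonical
    for canonical, variants in ALIASES.items()
    for variant in variants
}


def canonical_codelang(value):
    """Return the canonical language name, or None if the value is unknown."""
    return _LOOKUP.get(value.strip().lower())
```

Note that the table can hold MIME-style values and home-grown names side
by side, so the two variants do not have to exclude each other on the
consuming side.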

All in all, I think that both variants have their own advantages and
drawbacks. I also believe that both would work in practice.

Received on Tuesday, 12 July 2005 12:54:52 UTC