Re: code and blockcode from Simon Siemens on 2005-07-11 (www-html@w3.org from July 2005)

From: Simon Siemens <Simon.Siemens@web.de>
Date: Mon, 11 Jul 2005 19:18:23 +0200
To: XHTML-Liste <www-html@w3.org>
Message-Id: <42D2A9DF.7080302@web.de>
Laurens Holst wrote:

> At work we use a similar tag, <bdoc:snippet type="xml"> which can also 
> be of type css and js. However, it would be better if this were a MIME 
> type, application/xml, application/ecmascript, text/css, etc. But I 
> don’t know whether that is appropriate for all languages...

Yes, MIME type sounds good. I thought of it a few days ago too, but
can't remember why I discarded it later on. Maybe someone else can
think of reasonable drawbacks; otherwise we should go this way. We could
collect a list of programming languages and compare it with the
corresponding MIME types.

Orion Adrian's point does not seem so important to me here (for other
use cases it sounds reasonable). It doesn't matter if we have

    text/c++
    application/c++
    programming/c++
    language/c++

The only important thing here would be "c++", since we already know it's
code and not a picture. In fact, the preceding part could be "text/" in
all cases, otherwise we couldn't render it appropriately.
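
To illustrate the point, here is a rough Python sketch (the helper name
language_of is made up for this example and is not part of any proposal).
It shows that all four spellings above carry the same useful information
once the part before the slash is ignored:

```python
def language_of(mime_like):
    """Return the part after the first '/', lowercased (hypothetical helper)."""
    prefix, sep, subtype = mime_like.partition("/")
    # If there is no slash at all, treat the whole value as the language name.
    return (subtype if sep else prefix).strip().lower()

candidates = ["text/c++", "application/c++", "programming/c++", "language/c++"]
languages = {language_of(c) for c in candidates}
print(languages)  # a set with the single element 'c++'
```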

> Otherwise the options are to just use ‘class’ or create a similarly 
> extensible attribute. Role also comes to mind, but I don’t think that 
> one is appropriate here. A few predefined suggestions for browsers to 
> hook on would be useful.

I guess it needs an extra attribute. Class is mostly for format
templates. Using it for the code language would require authors to
format all Java code snippets the same way. They (me included) wouldn't
want this. I agree with you that role seems inappropriate as well. An
attribute "codelang" with MIME types in it fits best for me right now.
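
A small sketch of what this separation could look like, assuming a
"codelang" attribute on blockcode as proposed above. The value "text/java"
is only a placeholder and not a registered MIME type, and the class names
are invented. Python's standard XML parser is used just to read the
attributes back:

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: two Java snippets that share a language
# but are styled differently through "class".
SNIPPET = """
<div>
  <blockcode class="compact" codelang="text/java">int x = 1;</blockcode>
  <blockcode class="wide" codelang="text/java">int y = 2;</blockcode>
</div>
""".strip()

root = ET.fromstring(SNIPPET)
# Collect (class, codelang) pairs to show the two attributes are independent.
blocks = [(b.get("class"), b.get("codelang")) for b in root.iter("blockcode")]
for css_class, lang in blocks:
    print(css_class, lang)
```

Both snippets keep the same language while the author stays free to format
them differently.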

> Otherwise maintaining a list of languages including all version 
> numbers and establishing a naming guide for future languages would be 
> too bothersome, if you ask me, unless there is an ISO standard or 
> something that has such a thing.

I think graphical web browsers or voice browsers will not support more
than five languages. Everything beyond this will be treated as "pre".
Only search engines will support significantly more. Therefore I would
suggest fixing about 10 major languages (my example matlab is not
"major" enough) with their specific major versions. This would be a
handy list that UAs could keep in mind. All other languages can be
determined in the wild, if you give some tips like "Use only small
letters", "Use common acronyms/abbreviations instead of many words
(XHTML instead of Extensible ....)", ... The abilities of search engines
and the preferences of authors will compensate.
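
As a minimal sketch of this fallback behaviour (the list of known
languages and the function names are assumptions made up for the example,
not a proposed list):

```python
# Assumption: a small fixed list the user agent knows; names are placeholders.
KNOWN_LANGUAGES = {"c", "c++", "java", "xhtml", "css"}

def normalize(name):
    """Apply the tip 'use only small letters' and trim surrounding whitespace."""
    return name.strip().lower()

def rendering_mode(codelang):
    """Return the language to highlight, or 'pre' for anything unknown."""
    lang = normalize(codelang)
    return lang if lang in KNOWN_LANGUAGES else "pre"

print(rendering_mode("Java"))    # java
print(rendering_mode("matlab"))  # pre
```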

However, since we are considering MIME types: who determines them? Would
this be a solution even for less popular languages?

> Hmm, I’m not entirely sure about that. That way, <ul> and <ol> and 
> <dl> could also be replaced by <list role="...">, etc.
>
> Although I agree that by itself it would not be a bad idea, I don’t 
> think it fits within the whole ‘known architecture’ idea. Besides, we 
> kind of arrived at the ‘what should be an element in XHTML, and what 
> as a role’ question again. So that is why I have doubts.

You're absolutely right that this is the tag versus attribute discussion.
And a decision then depends heavily on use case considerations. First
of all I would separate kbd and samp from var.

(kbd, samp, code) is a complete set by itself. Since it is rarely used,
I would suggest having an attribute for the code/blockcode tag that
indicates whether we have input, output or processing code. This would
compact the standard further (two tags fewer, one attribute more). We
would even automatically have a block version for kbd and samp, since we
have blockcode. It would also make it more handy for most authors (as
they don't care about code, kbd, samp, ...).
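
A rough sketch of such a merge, again in Python. The attribute name "kind"
and its values "input", "output" and "processing" are invented for
illustration; the text above only says that some attribute would carry
this information:

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from the existing tags to values of the new attribute.
KIND_BY_TAG = {"kbd": "input", "samp": "output", "code": "processing"}

def merge_into_code(root, attr="kind"):
    """Rename kbd/samp to code and record the old meaning in an attribute."""
    for elem in root.iter():
        kind = KIND_BY_TAG.get(elem.tag)
        if kind is not None:
            elem.tag = "code"
            elem.set(attr, kind)
    return root

doc = ET.fromstring("<p>Type <kbd>ls</kbd> to get <samp>a.txt</samp>.</p>")
merged = merge_into_code(doc)
print(ET.tostring(merged, encoding="unicode"))
```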

Each variable (the meaning of var), by contrast, is part of code (even
if it stands alone). So why don't we have keyword, constant, operator,
comment, ... as well? Therefore this "set" (I mean the tag set built up
around var) is incomplete and should be corrected somehow. Either we
eliminate the var tag totally, keep it for compatibility reasons, or
make a computer development module. The latter could be more feature
rich and would enable syntax highlighting, even if the user agent
doesn't know the code language. I wouldn't want to write such XHTML
code by hand, but an IDE could help there with automatic XHTML export.
Maybe Orion Adrian could tell us what var is used for and why it is
important to distinguish var from code.

Best regards,

Simon
Received on Monday, 11 July 2005 17:18:28 UTC