Re: code and blockcode

On Tue, 12 Jul 2005, Simon Siemens wrote:

> We've talked at cross-purposes, 
> as I comment in the following:

OK, thanks, I see my mistake now; I thought the discussion had moved 
away from the topic of the Subject line.

> The whole thread is about the code tag and the last sentence I have quoted 
> took the bridge to programming languages:
> "For programming languages, there will be the same kind of difficulties plus 
> the fact that the number of programming languages keeps growing"

There's a connection, to some extent, but mostly we confuse ourselves by 
regarding programming "languages" (or computer languages in general) as 
languages. The analogy obscures more than it clarifies, on average, 
and people even fail to see it as an analogy.

The quoted statement, too, is probably based on wrong reasoning by 
analogy. The difficulties will (mostly) not be of the same kind, though 
they may well be comparable in practical importance. _Some_ difficulties 
will be similar, like the fundamental vicious circle: authors do not 
indicate the language in documents, or indicate it wrongly; hence, 
browsers and other programs will pay little attention to it, and if they 
need language information, they guess the language heuristically; thus, 
authors have little motivation to indicate the language.

> The benefit of the additional 
> attribute is for search engines and specific users who manually activate a 
> feature for syntax highlighting.

The vicious circle makes this doubtful. Search engines do not pay 
attention to lang or xml:lang attributes, which would be far more relevant 
to their job. Why would they care about codelang, even if they try to 
distinguish code from prose? (I guess you were thinking primarily about 
specialized search engines, e.g. site-wide search on systems that contain 
lots of documents with computer code inside them.)

So although codelang would make sense in principle, it would hardly be 
practical, and adding new constructs that are useful only in principle 
does not improve credibility. Besides, we would need a whole system of 
identifiers for computer languages, perhaps with versioning.

> I don't want to remove "code" and "blockcode". I want to give it a better 
> meaning by an additional semantical attribute "codelang" (that's totally 
> different from xml:lang).

It would hardly be possible to keep it separate from xml:lang. People 
keep confusing these things, and we keep reinforcing such behavior by 
calling some code systems "languages". (I'm not saying we have a choice; 
we probably don't.)

_If_ such an attribute were added, the most obvious candidate for its name 
would be "type", since that name is already used in some elements where a 
data format is specified by its Internet media type, and technically the 
value here, too, would probably have to be specified that way, e.g.
<code type="text/html">&lt;font&gt;</code> tags considered harmful.
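
To make concrete what a consuming program (say, a site-wide indexer or 
a highlighter) would have to do with such markup, here is a minimal 
sketch in Python using the standard html.parser module. The "type" 
attribute on code is hypothetical, as discussed above, and the names 
CodeTypeCollector and collect_code_types are made up for illustration. 
Nested code elements are not handled.

```python
from html.parser import HTMLParser


class CodeTypeCollector(HTMLParser):
    """Collect (media type, text) pairs from <code> elements.

    Assumes a hypothetical type="..." attribute on <code>; elements
    without it are reported with None as the media type.
    """

    def __init__(self):
        super().__init__()
        self.found = []       # list of (type or None, text content)
        self._inside = False  # True while between <code> and </code>
        self._type = None
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag == "code":
            self._inside = True
            self._type = dict(attrs).get("type")
            self._buf = []

    def handle_data(self, data):
        # Character references such as &lt; arrive here already decoded.
        if self._inside:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "code" and self._inside:
            self.found.append((self._type, "".join(self._buf)))
            self._inside = False


def collect_code_types(markup):
    """Return the (media type, text) pairs found in a markup string."""
    parser = CodeTypeCollector()
    parser.feed(markup)
    parser.close()
    return parser.found


sample = '<code type="text/html">&lt;font&gt;</code> tags considered harmful.'
print(collect_code_types(sample))
```

A program that finds no type value would still have to fall back on 
guessing, which is exactly the vicious circle described earlier.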

-- 
Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/

Received on Wednesday, 13 July 2005 05:58:00 UTC