Re: code and blockcode from Laurens Holst on 2005-07-11 (www-html@w3.org from July 2005)

From: Laurens Holst <lholst@students.cs.uu.nl>
Date: Mon, 11 Jul 2005 20:34:10 +0200
To: Orion Adrian <orion.adrian@gmail.com>, Simon Siemens <Simon.Siemens@web.de>
Cc: www-html@w3.org
Message-ID: <42D2BBA2.2070006@students.cs.uu.nl>

Orion Adrian wrote:

>>>Up to now blockcode is rather the same as pre. I don't see any advantage.
>>>
>>>However, adding an attribute codelang, which has values like perl,
>>>xhtml, matlab, java, ..., would give it the intended meaning. This would
>>>enable
>>>
>>>   Browsers to provide syntax highlighting
>>>   in addition to the preformatted layout
>>>      
>>>
>>And automatic formatting, too.
>>    
>>
>While this is probably in store, it would be a huge undertaking even
>for one language. I've been reading the blogs of the IDE designers for
>popular IDE's and it seems this is something that is very difficult to
>do. Automatic formatting is akin to compilation for some languages (in
>that you need to parse many files to see what are classes).
>  
>
Not necessarily, and for many languages it isn’t so hard. It depends on 
what you consider ‘automatic formatting’. Actually, I’d say it can often 
be easier than syntax highlighting.

At work, I am doing XSLT-based automatic indenting for XML and CSS files
(blocks delimited by { and } are indented by setting a margin or
padding). JS is a little harder (because of // comments and the like),
but I am trying to see if I can get that working too (the CSS rules
alone were not sufficient :)).

Based on that, I could also relatively easily create sections inside the
code which can be collapsed by the user, etc.
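
The brace-based indenting described above can be sketched roughly as
follows. This is Python rather than the XSLT and margin/padding approach
I mentioned, and it emits leading spaces instead of styling; the function
name `indent_by_braces` and the `unit` parameter are made up for
illustration. It assumes braces never occur inside strings or comments,
which is exactly the assumption that breaks down for JS.

```python
def indent_by_braces(source: str, unit: str = "    ") -> str:
    """Re-indent brace-delimited code by nesting depth (naive sketch)."""
    depth = 0
    out = []
    for raw in source.splitlines():
        line = raw.strip()
        if not line:
            out.append("")
            continue
        # A line that starts with '}' closes a block, so it sits one level out.
        level = depth - 1 if line.startswith("}") else depth
        out.append(unit * max(level, 0) + line)
        # Carry this line's net brace balance over to the following lines.
        depth = max(depth + line.count("{") - line.count("}"), 0)
    return "\n".join(out)


css = "@media print {\nb {\nmargin: 0;\n}\n}"
print(indent_by_braces(css))
```

The same depth counter is what collapsible sections would hang off: every
point where the depth increases is a candidate fold start.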

Not that I’m saying that common browsers should do that... But for a
website, such an attribute would be a useful hook for e.g. some
JavaScript formatting. I can imagine scripts doing different kinds of
processing for different languages becoming available, created by
various authors.
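
To make the ‘hook’ idea a bit more concrete, here is a minimal sketch of
per-language dispatch. It is in Python rather than JavaScript, and
everything in it (`FORMATTERS`, `register`, `process_block`, the toy CSS
formatter) is hypothetical; it only shows how independently written
processors could plug in under a codelang value.

```python
from typing import Callable, Dict

Formatter = Callable[[str], str]

# Hypothetical registry: codelang value -> function that processes the code.
FORMATTERS: Dict[str, Formatter] = {}


def register(codelang: str) -> Callable[[Formatter], Formatter]:
    """Decorator that registers a formatter under one codelang value."""
    def decorator(func: Formatter) -> Formatter:
        FORMATTERS[codelang.lower()] = func
        return func
    return decorator


@register("css")
def format_css(code: str) -> str:
    # Toy processing only: put each declaration on its own line.
    return code.replace("; ", ";\n")


def process_block(codelang: str, code: str) -> str:
    """Apply the registered formatter, or return the code untouched."""
    formatter = FORMATTERS.get(codelang.lower())
    return formatter(code) if formatter else code
```

Code in a language nobody has registered a formatter for simply passes
through unchanged, which matches the idea that support would grow
language by language.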


Simon Siemens wrote:

> Yes, MIME type sounds good. I've thought of it a few days ago too, but
> can't remember, why I discarded it later on. Maybe someone else can
> think of reasonable drawbacks, otherwise we should go this way. We could
> collect an amount of code languages and compare it with the
> corresponding MIME types. 

The question is, though: do all (or most) programming languages have a
registered MIME type?
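
One rough way to get a feel for this is to look at what a local MIME
table says about a few source file extensions. The sketch below uses
Python’s standard `mimetypes` module; the helper names
`looks_unregistered` and `survey` are made up. Note the assumptions: the
local table is not the IANA registry, results differ per system, and the
‘x-’ prefix is only a convention for unregistered subtypes, so this is a
heuristic and not a real registry lookup.

```python
import mimetypes
from typing import Dict, Iterable, Optional


def looks_unregistered(mime_type: Optional[str]) -> bool:
    """Heuristic: no type at all, or a subtype with the 'x-' prefix."""
    if mime_type is None:
        return True
    _, _, subtype = mime_type.partition("/")
    return subtype.lower().startswith("x-")


def survey(filenames: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each file name to the type the local table guesses for it."""
    return {name: mimetypes.guess_type(name)[0] for name in filenames}


# Output depends on the system's MIME table, so none is shown here.
for name, guessed in survey(["a.css", "a.js", "a.py", "a.pl", "a.java"]).items():
    print(name, guessed, "(unregistered?)" if looks_unregistered(guessed) else "")
```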

> An attribute "codelang" with MIME types in it fits best for me right now. 

Sounds reasonably ok to me...

> I think graphical web browsers or voice browsers will not support more
> than five languages.

If any at all, although I can imagine CSS or JS being supported (given 
that they have already got a parser for those).

> All other languages can be determined in the wild, if you give some 
> tips like "Use only small letters", "Use common acronyms/abbreviations 
> instead of many words (XHTML instead of Extensible ....)", ... The 
> ability of search engines and the preferences of authors will compensate.

That is a difficult problem. The question is whether rules are made
that cover all cases, or whether it is just left to people to come up
with some standards. I do not think the latter is desirable, because it
negates much of the use of such an attribute (we might just as well
stick to the ‘no standard’ situation we have now), but the former is
very difficult. For example, how are you going to differentiate the
syntax of Z80 assembly from x86 assembly? Just ‘assembly’ (or ‘asm’?
there we go already) won’t do.

There is also a good chance that people will start using e.g. ‘c++’
for any C-style language (e.g. PHP) just because that one is supported
and the real language of the code isn’t. Or ‘javascript’ (or ‘js’?)
for C++...


~Grauw

-- 
Ushiko-san! Kimi wa doushite, Ushiko-san!!
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Laurens Holst, student, university of Utrecht, the Netherlands.
Website: www.grauw.nl. Backbase employee; www.backbase.com.
Received on Monday, 11 July 2005 18:34:15 UTC