<code> element and scripting languages.

From: Andrea Rendine <master.skywalker.88@gmail.com>
Date: Fri, 6 Mar 2015 01:32:19 +0100
Message-ID: <CAGxST9mz57HdF9TmQfRs3DXFTHgHp0vcvfMHbGsP7tMGp9XkLQ@mail.gmail.com>
To: public-html@w3.org
Greetings again.
I came up with the idea I am about to describe after reading these lines:
There is no formal way to indicate the language of computer code being
marked up. *Authors who wish to mark code elements with the language used,
e.g. so that syntax highlighting scripts can use the right rules, can use
the class attribute, e.g. by adding a class prefixed with "language-" to
the element.
(http://www.w3.org/html/wg/drafts/html/master/semantics.html#the-code-element)*
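For reference, the class-based convention quoted above can be sketched as
follows. This is only a minimal illustration, not code from the spec or from
any particular highlighter; the function name language_from_class is
hypothetical.

```python
from typing import Optional


def language_from_class(class_attr: str) -> Optional[str]:
    """Return the name after the first "language-" class token, or None.

    class_attr is the raw value of an element's class attribute, which is
    a whitespace-separated list of tokens.
    """
    prefix = "language-"
    for token in class_attr.split():
        # Skip a bare "language-" token that carries no language name.
        if token.startswith(prefix) and len(token) > len(prefix):
            return token[len(prefix):]
    return None
```

For example, language_from_class("snippet language-php") yields "php", while
a class value with no "language-" token yields None.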
Actually, IMO this is not the best way to identify code snippets. The @class
attribute is both ubiquitous and devoid of any semantic meaning.
On the other hand, I had a funny experience a few days ago while requesting
an automated translation of a page into my language. The page contained PHP
and JS code snippets, as well as snippets in a native scripting language, so
it was full of control keywords such as "if ... else", "while", "function",
"print" and so on.
As you can easily imagine, these words in the snippets had been translated,
thus making the snippets themselves useless.
So I thought: why can't there be an agreement, at least in the HTML
community, to use @lang for this purpose on code-snippet elements and,
generally speaking, in HTML documents?
It wouldn't be difficult: according to BCP47, valid *existing* language
tags can contain "private use subtags", introduced by "x-" and followed by
subtags of up to 8 alphanumeric characters each. The private-use part can
either follow a primary/regional language tag or stand alone.
This means that a snippet in the form <code lang="x-php">, for example,
would be easy to read, easy to target for syntax-highlighting extensions,
and able to tell its content apart from parent elements defining a language
for the whole document.
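To make the proposal concrete, here is a rough sketch of how a tool might
pick up such snippets. It assumes the convention proposed above (it is not an
existing standard), and the names code_language, CodeLangParser and
find_code_languages are hypothetical. Per RFC 5646, each private-use subtag
is 1 to 8 letters or digits; the sketch takes the first one after "x" as the
code language.

```python
import re
from html.parser import HTMLParser
from typing import List, Optional

# One private-use subtag: 1 to 8 letters or digits (RFC 5646 "alphanum").
SUBTAG = re.compile(r"[a-z0-9]{1,8}")


def code_language(lang: str) -> Optional[str]:
    """Return the first private-use subtag of a language tag, or None.

    Accepts both the stand-alone form ("x-php") and the form that follows
    an ordinary language tag ("en-x-php").
    """
    subtags = lang.lower().split("-")
    if "x" not in subtags:
        return None
    # Everything after the "x" singleton is private use.
    private = subtags[subtags.index("x") + 1:]
    if not private or not all(SUBTAG.fullmatch(s) for s in private):
        return None
    return private[0]


class CodeLangParser(HTMLParser):
    """Collect the code language of every <code lang="x-..."> start tag."""

    def __init__(self) -> None:
        super().__init__()
        self.found: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "code":
            return
        # A valueless attribute is reported as None, so guard against that.
        lang = dict(attrs).get("lang")
        if lang:
            name = code_language(lang)
            if name:
                self.found.append(name)


def find_code_languages(html: str) -> List[str]:
    """Return the code languages declared via @lang on <code> elements."""
    parser = CodeLangParser()
    parser.feed(html)
    parser.close()
    return parser.found
```

A translation tool following the same idea could skip any element whose
@lang carries a private-use code language instead of translating its
keywords.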
Please tell me what you think about it.
Thanks.
*AR*
Received on Friday, 6 March 2015 00:32:46 UTC