Re: <code> element and scripting languages

From: Andrea Rendine <master.skywalker.88@gmail.com>
Date: Sun, 15 Mar 2015 13:29:58 +0100
Message-ID: <CAGxST9mCbBY5jw+f6r_KePLU60c2a6-dEek4Lb7-mjyVmQR4KA@mail.gmail.com>
To: public-html-comments@w3.org
Mr Faulkner,
I know. I know all about @translate and, as I said before, even if I had
not known, I have already received some answers about it.
My issue is:
 1. This is the point where everything started. Code snippets need a way
to identify the programming language used. Without any agreement, authors
now use data- attributes, but of course this introduces a double measure of
customisation (attribute name and possible values; moreover, data-
attributes are specific to a script and not intended for public use, e.g.
by search engines or UA tools). The spec suggests using the "class"
attribute, but again without any standard (class attributes, however, while
"describing the nature of the content", have no native semantic meaning,
nor are they intended to influence the element in any way other than as a
hook for CSS selectors).
 2. Programming languages aren't *exactly* spoken languages, but they *are*
languages, with words and a syntax. Even without any other consideration,
they shouldn't be treated as the same language as the text of the page.
Think about a page in French or in Italian with a PHP snippet, full of
"if", "else", "while", "for" and so on. PHP is not French or Italian, of
course. But it isn't English either, just because of a few words. As said
in a previous message (thanks to Stuart Wakefield for the point), the lang
attribute is also used by speech synthesizers, spell checkers and other
advanced page tools. If no other lang value is possible, a code snippet
should have @lang="" (unknown language) in order to prevent weird readings
or false spelling errors.
 3. The issue of translation, which probably wouldn't be prevented by
lang="", since without an explicit declaration translators could be
programmed to use heuristics to recognize the language (Chrome does).
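
To illustrate point 1, here is a minimal sketch of what a tool has to do
today in order to guess the programming language of a <code> element. It is
only an assumption of how such a tool might work: the function name
sniff_code_languages, the "lang-" class prefix and the "data-lang" attribute
are hypothetical examples of author conventions, not anything defined by the
spec. Only the "language-" class prefix comes from the spec's suggestion.

```python
from html.parser import HTMLParser

# Hypothetical conventions: none of these names is standardised, which is
# exactly the problem. A real tool would need an ever-growing list.
CLASS_PREFIXES = ("language-", "lang-")
DATA_ATTRS = ("data-programming-language", "data-lang")


class CodeLanguageSniffer(HTMLParser):
    """Collects a best-guess programming language for each <code> start tag."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag != "code":
            return
        self.found.append(self._guess(dict(attrs)))

    @staticmethod
    def _guess(attributes):
        # First try the custom data- attributes.
        for name in DATA_ATTRS:
            if attributes.get(name):
                return attributes[name].lower()
        # Then look for a class token carrying one of the known prefixes.
        for token in (attributes.get("class") or "").split():
            for prefix in CLASS_PREFIXES:
                if token.startswith(prefix) and len(token) > len(prefix):
                    return token[len(prefix):].lower()
        # No recognisable convention was used.
        return None


def sniff_code_languages(html):
    """Return one guess (or None) per <code> element, in document order."""
    parser = CodeLanguageSniffer()
    parser.feed(html)
    parser.close()
    return parser.found
```

Even when a guess is found, the value itself is free-form ("javascript",
"js", "ecmascript"...), so the tool still cannot rely on it.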

So, in order to address ALL the previous aspects, this is what a JavaScript
snippet inside a <code> element should look like:
<code lang="" translate="no" class="language-javascript">
where the scripting language indication is totally custom, so it could
just as well be
<code lang="" translate="no" data-programming-language="javascript">
and where javascript could be replaced by js, jscript, ecmascript or
anything else; in a future scenario with a new attribute it could be:
<code lang="" translate="no" programming="javascript">
Or, with a single indication:
<code lang="x-js">
For the time being it should still be <code lang="x-js" translate="no">
because some UAs apply translation heuristics to custom language tags. This
would not be needed if the tag were standardised. It is a starting point,
though, and it addresses the first 2 points, which are the core of the
issue (translation is a minor point, raised only to highlight the fact that
we are dealing with a language recognition matter).
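
As a rough sketch of how a page tool could act on this markup, assuming a
hypothetical helper natural_language_hints that receives the attributes of a
single <code> start tag: it treats lang="" and private-use "x-" tags as "not
the page language" and honours translate="no". Inheritance from ancestor
elements is deliberately ignored, and the behaviour of real UAs may differ.

```python
def natural_language_hints(attrs):
    """attrs: dict mapping attribute names to values for one <code> tag.

    Returns which natural-language features a (hypothetical) tool would
    still apply to the element's text.
    """
    lang = attrs.get("lang")
    translate = (attrs.get("translate") or "").strip().lower()

    # lang="" declares the language unknown; an "x-" tag is private use.
    # In both cases the text should not be read or checked as page language.
    opts_out_of_page_language = lang is not None and (
        lang == "" or lang.lower().startswith("x-")
    )

    return {
        "spellcheck_and_speech": not opts_out_of_page_language,
        "translate": translate != "no",
    }
```

With <code lang="" translate="no"> both features are switched off, while
<code lang="x-js"> alone still leaves translation to the UA's heuristics,
which is why translate="no" is kept for now.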
Hope my idea is clearer now.
Received on Sunday, 15 March 2015 12:30:25 UTC