Re: <code> element and scripting languages

From: Andrea Rendine <master.skywalker.88@gmail.com>
Date: Sun, 15 Mar 2015 13:42:24 +0100
Message-ID: <CAGxST9kXfEvmMBcegve7BFmPwC7dsAkgCqPcX_aRqH6wS4J-GQ@mail.gmail.com>
To: public-html-comments@w3.org
As a side note, this suggestion was first discussed in
https://www.w3.org/Bugs/Public/show_bug.cgi?id=24942 where it seemed
sensible.
BCP47 also defines "extension subtags", i.e. language subtags introduced by
a singleton other than "x" (which is reserved for private use); each
extension has to be defined by its own specification (
https://tools.ietf.org/html/bcp47#section-2.2.6). Using one would of course
be even clearer and more standard, but I didn't suggest it because of 2
issues:
 1. extension subtags cannot be present without a primary language subtag,
i.e. lang="p-javascript" wouldn't be valid, while lang="en-p-javascript"
would. In that case, what should the primary language subtag be? The same
as the document's, or the natural base language of the programming
language (e.g. English for Perl-based scripts)? Private use subtags don't
raise this question, as they can be specified alone (though if a primary
language subtag is given, it could be e.g. the language of strings and
comments).
 2. How should an "ideal" specification deal with new programming
languages? I.e. if a spec is published and then someone invents a
programming language called "omega", it wouldn't be usable until the spec
is updated in order to include "xx-p-omega". For private use subtags a
table in a wiki could be sufficient, as is now done for <meta@name>
and <link@rel>.
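
To make point 1 concrete, here is a minimal Python sketch of the
well-formedness rule. It assumes a deliberately simplified subset of the
BCP47 grammar (no script, region, variant or grandfathered tags), and the
name is_well_formed is only illustrative. BCP47 limits every subtag to
eight characters, so the sketch uses "js" rather than "javascript".

```python
import re

# Simplified subset of the BCP47 (RFC 5646) grammar. Assumption: script,
# region, variant and grandfathered tags are ignored on purpose.
PRIMARY = r"[a-z]{2,3}"
# An extension is a singleton other than "x", followed by one or more
# subtags of 2 to 8 alphanumeric characters.
EXTENSION = r"(?:-[0-9a-wy-z](?:-[a-z0-9]{2,8})+)"
# Private use is "x" followed by one or more subtags of 1 to 8 characters.
PRIVATE_USE = r"x(?:-[a-z0-9]{1,8})+"

LANGTAG = re.compile(
    rf"^(?:{PRIMARY}{EXTENSION}*(?:-{PRIVATE_USE})?|{PRIVATE_USE})$"
)


def is_well_formed(tag: str) -> bool:
    """Return True if the tag fits the simplified grammar above."""
    return LANGTAG.match(tag.lower()) is not None


print(is_well_formed("p-js"))     # extension without a primary subtag
print(is_well_formed("en-p-js"))  # extension after a primary subtag
print(is_well_formed("x-js"))     # private use tag on its own
```

This only checks the shape of a tag; it says nothing about whether the
singleton "p" is actually registered as an extension.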

2015-03-15 13:29 GMT+01:00 Andrea Rendine <master.skywalker.88@gmail.com>:

> Mr Faulkner,
> I know. I know all about @translate and, as I said before, even if I
> hadn't known, I have since received some answers about it.
> My issue is:
>  1. this is the point where everything started. Code snippets need a
> way to identify the programming language used. Without any agreement,
> authors now use data- attributes, but of course this introduces a double
> measure of customisation (attribute name and possible values; moreover,
> data- attributes are specific to a script and not intended for public
> use, e.g. by search engines or UA tools). The spec suggests using the
> "class" attribute, but again without any standard values (and class
> attributes, while "describing the nature of the content", have no native
> semantic meaning, nor are they intended to influence the element in any
> way other than through CSS selectors).
>  2. programming languages aren't *exactly* spoken languages, but they
> *are* languages, with words and a syntax. Even without any other
> consideration, they shouldn't be treated as being in the same language
> as the text of the page. Think about a page in French or in Italian with
> a PHP snippet, full of "if", "else", "while", "for" and so on. PHP is not
> French or Italian, of course. But it isn't English either, just because
> of a few words. As said in a previous message (thanks to Stuart Wakefield
> for the point), the lang attribute is also used by speech synthesizers,
> spell checkers and other advanced page tools. If no other lang value is
> possible, a code snippet should have @lang="" (unknown language) in order
> to prevent weird readings or false spelling errors.
>  3. the issue of translation, which probably wouldn't be prevented by
> lang="", since without an explicit declaration translators could be
> programmed to use heuristics to recognize the language (Chrome does).
>
> So, in order to address ALL the previous aspects, this is what a
> Javascript snippet inside a <code> element should look like:
> <code lang="" translate="no" class="language-javascript">
> where the scripting language indication is totally custom, so it could
> just as well be
> <code lang="" translate="no" data-programming-language="javascript">
> and where javascript could be replaced by js, jscript, ecmascript or
> anything else; in a future scenario with a new attribute it could be:
> <code lang="" translate="no" programming="javascript">
> Or, with a single indication:
> <code lang="x-js">
> For the time being it should still be <code lang="x-js" translate="no">,
> because some UAs apply translation heuristics to custom language tags.
> That wouldn't be necessary if this were standardised. It is a starting
> point, though, and it addresses the first 2 points, which are the core of
> the issue (translation is a minor argument, raised only to highlight the
> fact that we are facing a language recognition matter).
> Hope my idea is clearer now.
>
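
The markup variants quoted above could be read by a tool roughly as in
the following Python sketch. It is only an illustration: the order of
precedence is an assumption, the "programming" attribute is the
hypothetical one from the quoted message, and the names guess_language,
CodeLanguageFinder and find_hints are made up for the example.

```python
from html.parser import HTMLParser


def guess_language(attrs):
    """Return a programming-language hint from a <code> tag's attributes."""
    # Hypothetical dedicated attribute (not part of any spec).
    if attrs.get("programming"):
        return attrs["programming"]
    # Custom data- attribute, as some authors use today.
    if attrs.get("data-programming-language"):
        return attrs["data-programming-language"]
    # The "language-xxx" class convention.
    for cls in (attrs.get("class") or "").split():
        if cls.startswith("language-"):
            return cls[len("language-"):]
    # Private use language tag such as lang="x-js".
    lang = attrs.get("lang") or ""
    if lang.lower().startswith("x-"):
        return lang[2:]
    return None


class CodeLanguageFinder(HTMLParser):
    """Collects one hint (or None) for each <code> start tag."""

    def __init__(self):
        super().__init__()
        self.hints = []

    def handle_starttag(self, tag, attrs):
        if tag == "code":
            self.hints.append(guess_language(dict(attrs)))


def find_hints(html):
    finder = CodeLanguageFinder()
    finder.feed(html)
    finder.close()
    return finder.hints


print(find_hints('<code lang="x-js" translate="no">var a;</code>'))
```

With a single standard indication, only one of these branches would be
needed.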
Received on Sunday, 15 March 2015 12:42:51 UTC