
Re: <code> element and scripting languages.

From: Brian Kardell <bkardell@gmail.com>
Date: Thu, 5 Mar 2015 19:53:56 -0500
Message-ID: <CADC=+jdtr=+2YXssjjUmJ66Pqu3gkw-VRd5PsAZDxqokhTC0cw@mail.gmail.com>
To: Andrea Rendine <master.skywalker.88@gmail.com>
Cc: "public-html@w3.org" <public-html@w3.org>
On Thu, Mar 5, 2015 at 7:32 PM, Andrea Rendine
<master.skywalker.88@gmail.com> wrote:
> Greetings again.
> I came up with the idea I am going to write about after reading these lines:
> There is no formal way to indicate the language of computer code being
> marked up. Authors who wish to mark code elements with the language used,
> e.g. so that syntax highlighting scripts can use the right rules, can use
> the class attribute, e.g. by adding a class prefixed with "language-" to the
> element.
> (http://www.w3.org/html/wg/drafts/html/master/semantics.html#the-code-element)
> Actually, IMO this is not the best way to recognize code snippets. The @class
> attribute is both ubiquitous and devoid of any semantic meaning.
> On the other hand, I had a funny experience some days ago while requesting
> an automated translation of a page in my language. This page contained PHP
> and JS code snippets, as well as a native scripting language. This means
> that it was full of control expressions such as "if ... else", "while",
> "function", "print" and so on.
> As you can easily imagine, these words in the snippet had been translated,
> thus making the snippets themselves useless.
> So I thought: why can't there be an agreement, at least in the HTML community,
> to use @lang for this purpose on code-snippet elements and, generally speaking,
> in HTML documents?
> It wouldn't be difficult: according to BCP47, valid *existing* language tags
> can contain "private use subtags" in the form of a string consisting of "x-"
> followed by up to 8 alphanumeric characters. This subtag can either follow a
> primary/regional language tag, or be present as stand-alone.
> This means that a snippet in the form <code lang="x-php">, for example,
> would be both easy to read, easy to target for syntax highlight extensions,
> and able to tell its content apart from parent elements defining a language
> for the whole document.
> Please tell me what you think about it.
> Thanks.
> AR

It sounds to me like the problem was translation rather than syntax
highlighting. You could set the global translate attribute to "no", but
then the comments wouldn't be translated either, which may or may not be
what you want. It's probably better than nothing.
Overloading lang so that it means both spoken and programming languages
seems tough: the language of the code might be Java while the comments
are in English. It doesn't seem insane that the community could pioneer
an x- attribute (or programming-lang="Perl") and see if we can work it
out?
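
To make the options in this thread concrete, here is a rough sketch of how a
highlighting or translation tool might decide which programming language a
<code> element holds. It is only an illustration under stated assumptions:
the programming-lang attribute is the hypothetical one floated above and
exists in no spec, the lang="x-php" form is Andrea's proposal, and the
precedence order between the three sources is an arbitrary choice made for
the example.

```python
import re
from html.parser import HTMLParser

# BCP 47 private-use tag used stand-alone: "x" followed by one or more
# subtags of 1 to 8 alphanumeric characters, e.g. "x-php".
PRIVATE_USE = re.compile(r"x-([a-z0-9]{1,8}(?:-[a-z0-9]{1,8})*)")
CLASS_PREFIX = "language-"


def code_language(attributes):
    """Return the declared programming language for one element, or None."""
    # 1. Hypothetical dedicated attribute (an assumption, not in any spec).
    explicit = attributes.get("programming-lang")
    if explicit:
        return explicit.lower()
    # 2. Proposed convention: a stand-alone private-use lang tag.
    match = PRIVATE_USE.fullmatch((attributes.get("lang") or "").lower())
    if match:
        return match.group(1)
    # 3. The class convention the HTML spec currently suggests.
    for name in (attributes.get("class") or "").split():
        if name.startswith(CLASS_PREFIX) and len(name) > len(CLASS_PREFIX):
            return name[len(CLASS_PREFIX):].lower()
    return None


class CodeLanguageFinder(HTMLParser):
    """Record the declared language of every <code> start tag, in order."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag == "code":
            self.found.append(code_language(dict(attrs)))


if __name__ == "__main__":
    sample = (
        '<code class="language-js" translate="no">if (x) {}</code>'
        '<code lang="x-php">print $x;</code>'
        '<code programming-lang="Perl" lang="en">print "hi"; # comment</code>'
        '<code>plain</code>'
    )
    finder = CodeLanguageFinder()
    finder.feed(sample)
    print(finder.found)  # ['js', 'php', 'perl', None]
```

The third sample element shows why a separate attribute is attractive: lang
stays free to describe the natural language of the comments, while the
programming language is declared independently.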


-- 
Brian Kardell :: @briankardell :: hitchjs.com
Received on Friday, 6 March 2015 00:54:23 UTC
