Re: rdfms-xmllang alternatives

From: Graham Klyne <Graham.Klyne@Baltimore.com>
Date: Thu, 26 Jul 2001 22:15:08 +0100
To: pat hayes <phayes@ai.uwf.edu>
Cc: w3c-rdfcore-wg@w3.org

The original RFC 1766 was not written with XML in mind.  Indeed, it 
pre-dates the original XML recommendation by 3 years, and was designed for 
a different environment.  Many Internet protocols have some elements that 
exist for machine-to-machine communication, and other elements that exist 
to be displayed to humans.  For example, in email, the From:, To:, etc. 
headers carry addresses that are there to get the mail delivered to the right 
mailbox.  Other fields such as Subject: carry information for human 
consumption.  The intent is that fields intended for human consumption have 
language tags so they can possibly be translated for different audiences, 
or presented through speech synthesis, or handled in some other sender- or 
recipient-language-dependent way.

The distinction is similar to that between XML element names and XML 
(PCDATA) element content.

That said, I do appreciate your point that it is arbitrary to exclude RDF, 
and the language might have been softened in RFC 3066, which does 
acknowledge XML.  I also expect that the authors didn't anticipate 
situations where the content could be either human or machine 
language.  That would be very unusual in a lower-level network protocol.

BTW, the likely tag, if registered with IANA, would be 
'i-RDF'.  Three-letter tags are reserved for those approved by one of the 
other international standard bodies.
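
To make the tag structure concrete, here is a rough sketch of how the primary subtag of an RFC 3066 tag might be classified.  It is only an illustration under my reading of the RFC's syntax rules, not a normative implementation; the function name and the category strings are invented for this example, and "iso-639-2" here only means the tag falls in the three-letter code space, not that the code is actually assigned.

```python
import re

# RFC 3066 syntax: primary subtag is 1-8 letters; later subtags are
# 1-8 letters or digits, separated by hyphens.
_TAG_RE = re.compile(r"^[A-Za-z]{1,8}(-[A-Za-z0-9]{1,8})*$")


def classify_language_tag(tag):
    """Return a rough category for the primary subtag (hypothetical helper)."""
    if not _TAG_RE.match(tag):
        return "invalid"
    subtags = tag.lower().split("-")
    primary = subtags[0]
    if primary in ("i", "x"):
        # "i" and "x" are prefixes and need at least one further subtag.
        if len(subtags) < 2:
            return "invalid"
        return "iana-registered" if primary == "i" else "private-use"
    if len(primary) == 2:
        return "iso-639-1"   # two-letter code space
    if len(primary) == 3:
        return "iso-639-2"   # three-letter code space
    return "unassigned"      # other lengths are not currently allocated


for example in ("fr", "en-GB", "i-RDF", "x-rdf", "RDF"):
    print(example, classify_language_tag(example))
```

On this reading, a bare 'RDF' would land in the three-letter space reserved for ISO 639-2, which is why a registered tag would need the 'i-' prefix.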

I'll see if I can get some more background on the thinking when I'm at the 
IETF meeting in a couple of weeks.

You also say:
>However, XML has many other uses now, and many of them are incompatible 
>with the limited suite of uses contemplated in RFC1766 (eg formal 
>specification of syntax; RDF). In particular, RDF is not intended to be 
>read by human users, but by software agents, a fact which by itself seems 
>to render RFC1766 irrelevant.

I would suggest that this is an argument to not use xml:lang as the primary 
indicator of the language of a literal.  If there were no xml:lang 
attribute, I suspect the "obvious" way to indicate language would be with 
RDF properties (e.g. per TimBL's interpretation properties [1]).  Surely, 
the "Web Way" to designate a language should use a URI?


[1] http://www.w3.org/DesignIssues/InterpretationProperties.html
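
As a sketch of the contrast drawn above between xml:lang and a URI-valued language property, the following reads xml:lang from a small RDF/XML fragment and restates it as a language URI.  Everything under http://example.org/ (the property, the language URI base, and the sample document) is hypothetical and invented for illustration; this is not a real RDF parser, and it ignores xml:lang inheritance from enclosing elements.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical URIs, assumed only for this example.
LANG_PROP = "http://example.org/terms#language"
LANG_BASE = "http://example.org/lang/"

DOC = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/terms#">
  <rdf:Description rdf:about="http://example.org/doc">
    <ex:title xml:lang="fr">C'est la vie.</ex:title>
    <ex:size>42</ex:size>
  </rdf:Description>
</rdf:RDF>"""


def literals_with_language(xml_text):
    """Yield (property, literal, language URI or None) per property element."""
    root = ET.fromstring(xml_text)
    for desc in root.findall("{%s}Description" % RDF_NS):
        for prop in desc:
            lang = prop.get(XML_LANG)  # None when no xml:lang is present
            yield (prop.tag, prop.text, LANG_BASE + lang if lang else None)


for prop, literal, lang_uri in literals_with_language(DOC):
    if lang_uri is None:
        print(prop, repr(literal), "(no language asserted)")
    else:
        print(prop, repr(literal), LANG_PROP, lang_uri)
```

Note that the numeric literal simply carries no language statement, which sidesteps the question of what language a decimal number is "in".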

At 01:14 PM 7/26/01 -0700, pat hayes wrote:
>>At 07:28 PM 7/23/01 -0700, pat hayes wrote:
>>>>Another possible disadvantage?:  not all literals are in some 
>>>>language.  It doesn't really make sense to specify a language for, say, 
>>>>a decimal number or a MIME type string.
>>>Why not say that the language for those things is RDF?
>>Because the definition of xml:lang (which is the defined syntax whose 
>>meaning we are trying to capture) is that the language code used is as 
>>defined by RFC 1766, now obsoleted by RFC 3066, which says:
>>2.4 Meaning of the language tag
>>   The language tag always defines a language as spoken (or written,
>>   signed or otherwise signaled) by human beings for communication of
>>   information to other human beings.  Computer languages such as
>>   programming languages are explicitly excluded.  There is no
>>   guaranteed relationship between languages whose tags begin with the
>>   same series of subtags; specifically, they are NOT guaranteed to be
>>   mutually intelligible, although it will sometimes be the case that
>>   they are.
>Christ Almighty. That is so unbelievably stupid that we should ignore it 
>deliberately, I suggest. Whoever wrote that must have a humanities degree 
>and is probably waging a private war against the dehumanisation of 
>postmodernism. It presumes that programming languages are incomprehensible 
>to humans; it presumes that there is a clear conceptual distinction 
>between information exchange between humans and between other information 
>processors; it presumes that 'intelligibility' is meaningful without 
>explanation, and it presumes that XML has no utility beyond human text 
>markup. It is an intellectual fossil.
>If anyone objects to RDF as a tag on these grounds, just tell them that 
>you know at least two people who use it for communication between 
>themselves (I volunteer to be one of them), and cite de Saussure, 
>Wittgenstein and Quine as authorities to justify this as sufficient to 
>qualify it as a human language. Or tell them that RDF is actually a 
>sublanguage of Math, which has been used for communication between humans 
>numbering in the millions for about 400 years.
>PS. After writing the above, I checked out RFC1766, and it is clear that 
>that document is entirely concerned with XML's use as a text markup 
>language, where 'text' means text aimed at direct human use. It uses as 
>guiding examples problems of maintaining a coherent international user 
>interface, eg
>   - In markup languages, such as HTML and XML, language information can
>      be added to each part of the document identified by the markup
>      structure (including the whole document itself).  For example, one
>      could write <span lang="FR">C'est la vie.</span> inside a Norwegian
>      document; the Norwegian-speaking user could then access a French-
>      Norwegian dictionary to find out what the marked section meant.  If
>      the user were listening to that document through a speech synthesis
>      interface, this information could be used to signal the synthesizer
>      to appropriately apply French text-to-speech pronunciation rules to
>      that span of text, instead of misapplying the Norwegian rules.
>However, XML has many other uses now, and many of them are incompatible 
>with the limited suite of uses contemplated in RFC1766 (eg formal 
>specification of syntax; RDF). In particular, RDF is not intended to be 
>read by human users, but by software agents, a fact which by itself seems 
>to render RFC1766 irrelevant.
>I would suggest therefore in all seriousness that we consider simply 
>rejecting RFC1766 as inappropriate to our intended domain of use.
>Pat Hayes
>(650)859 6569 w
>(650)494 3973 h (until September)
>phayes@ai.uwf.edu http://www.coginst.uwf.edu/~phayes

Graham Klyne                    Baltimore Technologies
Strategic Research              Content Security Group
<Graham.Klyne@Baltimore.com>    <http://www.mimesweeper.com>
Received on Thursday, 26 July 2001 18:15:57 UTC
