#rdfms-xmllang - summary

This is an attempt to summarize the discussion of this issue to date.

Summary of the Summary:

 o M&S states that a language attribute is considered to be 'part of'
   a literal.

 o Of the three use cases, one does not need/use language attributes
   but models language differently, one 'prefers' the M&S solution
   over the anon node with two properties solution and the other is
   as yet unclear.

 o Three alternative proposals for representing language have been made:
     + the anon node with two properties trick
     + add a 4th component to triples (which is very close to M&S solution)
     + TBL's interpretation properties (which seem to me similar to Martyn's
       modeling technique).

Moving Forward:

I suggest we have to answer the following questions:

  o Do we understand what M&S says about xml:lang?

  o Is what M&S describes broken - is there some fatal flaw that must be
    fixed?

  o Is what M&S describes useful to application developers?

  o Is there another reason to change it?

Further Detail:

Three people had actions to determine use cases/requirements.

Martyn in [1], after talking to his sales and service guys, came to the
surprising conclusion that "there is no expectation of language
attribution to survive in bodies of metadata".

"Most of our customers and prospectives would willingly model such
equivalence classes as a separate body of metadata where a resource
represents the semantic point and translations are associated as
properties."

"This feedback demotes the association of locale
with text down to the level of general property (in the body of the
metadata) rather than as an intrinsic attribute of the quoted literal".

Basically, Martyn's conclusion is that his users have no use for a lang
attribute as part of a literal.
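
To make that modeling style concrete, here is a minimal sketch that holds
triples as plain Python tuples. It is only one possible reading of [1]: the
resource and property names (ex:greeting, ex:translation, ex:text,
ex:locale) and the helper functions are invented for illustration. The
point is that locale is an ordinary property in the graph, not something
carried inside the literal.

```python
# Triples as (subject, predicate, object) tuples.
# Every ex: name below is hypothetical and not taken from [1].
triples = [
    ("ex:greeting", "ex:translation", "ex:greeting-en"),
    ("ex:greeting-en", "ex:text", "Hello"),
    ("ex:greeting-en", "ex:locale", "en"),
    ("ex:greeting", "ex:translation", "ex:greeting-fr"),
    ("ex:greeting-fr", "ex:text", "Bonjour"),
    ("ex:greeting-fr", "ex:locale", "fr"),
]


def objects(subject, predicate):
    """Return every object of the triples matching subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def text_for(concept, locale):
    """Find the text of the translation of `concept` in `locale`, if any."""
    for node in objects(concept, "ex:translation"):
        if locale in objects(node, "ex:locale"):
            texts = objects(node, "ex:text")
            if texts:
                return texts[0]
    return None
```

With this data, text_for("ex:greeting", "fr") finds the French text by
ordinary graph traversal, and the literals themselves are bare strings.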

Eric Miller produced requirements [2] based on the work of the DCMI
multiple languages special interest group and the DCMI registry
working group.  The DCMI registry working group loads schemas into
a service and permits searching of them.  Typical queries include:

'show me the french representation of the dc:title element'
'show me the DCES in german'

There is no indication of whether the xml:lang approach currently
described by M&S meets these requirements.
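
For illustration only, the sketch below shows what those two queries might
look like if the language tag travels with each literal. It does not settle
whether M&S as written supports this. The Literal type, the labels_in
helper and the label strings are assumptions made up for the example, and
the use of rdfs:label to carry the per-language names is my guess rather
than something stated in [2].

```python
from typing import NamedTuple, Optional


class Literal(NamedTuple):
    """A literal that carries its language tag (hypothetical type)."""
    text: str
    lang: Optional[str] = None


# Illustrative registry content; the label strings are invented.
statements = [
    ("dc:title", "rdfs:label", Literal("Title", "en")),
    ("dc:title", "rdfs:label", Literal("Titre", "fr")),
    ("dc:title", "rdfs:label", Literal("Titel", "de")),
    ("dc:creator", "rdfs:label", Literal("Creator", "en")),
    ("dc:creator", "rdfs:label", Literal("Urheber", "de")),
]


def labels_in(lang, subject=None):
    """Map each element to its label in `lang`, optionally for one element."""
    return {
        s: o.text
        for s, p, o in statements
        if p == "rdfs:label"
        and o.lang == lang
        and (subject is None or s == subject)
    }
```

Here labels_in("fr", "dc:title") corresponds to the first query and
labels_in("de") to the second.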

Jan Grant in [3] describes a use case/requirements based on an EC
project for metadata repositories.

"REQUIREMENT: the schema element registry should be language-neutral.

This is pretty much an absolute requirement in Europe. Where I have
keywords, descriptions, etc., those values should carry a language
with them through some mechanism".

"RDF was a win here,
since we can give the concepts we're modelling abstract identifiers
(in this case, URIs) and hang associated per-language preferred terms,
descriptions, keywords, and so on off them".

Jan "very much preferred" a query representation with the language
attribute part of the literal.

"Conclusion: we're in an international environment. Removing language from
RDF is a terrible step backwards. I find the notion of the notion of
literals having structure very appealing. If we ditch this, we absolutely
must have a solid working replacement. There may be a more general
notion of a structured literal, but we should not give up what we have
while we search for it."

In summary:

  Martyn's use case has no need for language attribution of literals
  as he uses a modeling style that does not require it.

  For Eric's use case it's unclear whether the current M&S approach
  is adequate.

  Modeling the language as a part of a literal meets the requirements
  in Jan's use case and he "very much preferred" this approach.

M&S states that language is seen to be 'part of' a literal [4][5].

There was some "eye rolling" reaction [6][7] to the bit about:

   All RDF applications must specify whether or not language tagging
   in literals is significant; that is, whether or not language is 
   considered when performing string matching or other processing.
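
A small sketch of what that choice means in practice, assuming a literal is
held as a (text, lang) pair. The literals_match function and its
lang_significant flag are hypothetical names, not anything defined by M&S.
Tags are compared case-insensitively here, following the usual convention
for language tags.

```python
def literals_match(a, b, lang_significant):
    """Compare two (text, lang) pairs.

    When lang_significant is False the language tags are ignored, which
    is the choice the quoted passage leaves to each application.
    """
    (text_a, lang_a), (text_b, lang_b) = a, b
    if text_a != text_b:
        return False
    if not lang_significant:
        return True
    # A missing tag is treated as an empty tag for comparison purposes.
    return (lang_a or "").lower() == (lang_b or "").lower()


english = ("chat", "en")
french = ("chat", "fr")
```

The same two literals compare equal in an application that sets
lang_significant to False and unequal in one that sets it to True, so two
conforming applications can disagree about the same data.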

Graham believes that M&S is broken [8] on the grounds that it requires
an RDF application to operate on data which is not "in the model".

There have been the following proposals for alternative ways to 
represent language attributes:

  Aaron has proposed [9] using the trick recommended in M&S of
  using an anon node with rdf:value and xml:lang properties.

  Ron has suggested adding a 4th component to triples to represent
  the language attribute.

  Graham has pointed out [10] TBL's interpretation properties [11]
  which seem very similar to Martyn's modeling style.
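
As a rough comparison of the first two proposals, the sketch below writes
the same statement both ways using Python tuples. The subject ex:doc, the
blank node label _:b1 and the anon_to_quads helper are invented for the
example. Using "xml:lang" as a predicate name follows the wording of the
proposal and is not a claim about how it would be serialized. The third
proposal is not shown since its shape is closer to Martyn's style.

```python
# Proposal 1: anon node with rdf:value and xml:lang properties.
anon_node_triples = [
    ("ex:doc", "dc:title", "_:b1"),
    ("_:b1", "rdf:value", "Bonjour"),
    ("_:b1", "xml:lang", "fr"),
]

# Proposal 2: a 4th component on the statement holds the language.
quads = [
    ("ex:doc", "dc:title", "Bonjour", "fr"),
]


def anon_to_quads(triples):
    """Fold anon-node triples into 4-component statements.

    Assumes node labels never collide with literal text, which holds for
    the sample data but is not guaranteed in general.
    """
    values = {s: o for s, p, o in triples if p == "rdf:value"}
    langs = {s: o for s, p, o in triples if p == "xml:lang"}
    result = []
    for s, p, o in triples:
        if p not in ("rdf:value", "xml:lang") and o in values:
            result.append((s, p, values[o], langs.get(o)))
    return result
```

For this sample, anon_to_quads(anon_node_triples) gives the same statement
as quads. The anon node form costs three triples per tagged literal and
stays inside the triple model, while the 4-component form keeps one
statement and changes the model itself.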

[1] http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2001Jul/0013.html
[2] http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2001Jul/0038.html
[3] http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2001Jul/0111.html
[4] http://lists.w3.org/Archives/Public/www-archive/2001Jun/att-0021/00-part#221
[5] http://lists.w3.org/Archives/Public/www-archive/2001Jun/att-0021/00-part#216
[6] http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2001Jul/0078.html
[7] http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2001Jul/0077.html
[8] http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2001Jul/0085.html
[9] http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2001Jul/0080.html
[10] http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2001Jul/0102.html
[11] http://www.w3.org/DesignIssues/InterpretationProperties.html

Received on Thursday, 12 July 2001 15:42:04 UTC