Re: xml:lang [was Re: Outstanding Issues ]

On 2002-02-11 19:05, "ext Dave Beckett" <dave.beckett@bristol.ac.uk> wrote:

>>>> Patrick Stickler said:
>> On 2002-02-11 17:28, "ext Brian McBride" <bwm@hplb.hpl.hp.com> wrote:
>> 
>> [perhaps these should each be in their own thread?]
> 
> Here it is
> 
>>> rdfms-xmllang: Why isn't xml:lang information represented within the RDF da
>   ta
>>> model? 
>>> 
>>> This was put on hold whilst we looked at datatypes.  Model and Syntax says
>>> that lang is part of the literal; that no triples are generated for an
>>> xml:lang.  We can choose to stick with that or change it.  Does anyone have
>    a
>>> compelling reason to change it?
>> 
>> I think it should not be changed, but the verbage could be clarified.
>> 
>> The xml:lang attribute exists for the benefit of XML applications
>> (e.g. an RDF parser) not RDF applications (e.g. an RDF query engine)
>> and therefore it is reasonable that it have no representation in
>> the graph (no triples generated). An XML application is free to
>> select or omit elements based on the xml:lang attribute -- but
>> since that is not part of the needed functionality of most (any?)
>> RDF parsers, the attribute simply has no effect.
> 
> I think you are proposing changing what RDF M&S said about using
> xml:lang and literals.

Yes and no. If what M&S says applies only to the XML space,
then no. If what M&S says was supposed to apply to the RDF
space (have representation in the graph) then yes.

All attributes/elements beginning with xml: have special
meaning in teh XML space, and to the extent that RDF
has an XML based serialization, we need to support that,
but we don't IMO need to adopt the semantics of those
special forms in the RDF space.

> RDF is/was linked to XML and just ignoring
> xml:lang is unacceptable to me,

I'm not saying ignore xml:lang, I'm saying we
need to leave it clearly in the XML space, as
is now the case.

> and to other applications and
> communities too (Dublin Core for one).

Applications which base their functionality on the
XML serialization rather than triples are broken.

If use of DC depends on the presence of xml:lang in
its RDF realization, then that is broken and must be
fixed.

Now, I'm not saying that language qualification is
not important and needed -- only that at present the
RDF XML serialization provides no decent mechanisms
for such qualifications, and language is no different
than source, authority, etc. all of which are different
kinds of scope.

To decide that xml:lang is now going to generate triples
automatically is going to require alot of work and
will require modification to every parser out there, so
I just see it as out of scope.

As outlined below, there is a way to do qualification
(language, source, scope, etc.) without such modifications,
and while it is not as efficient and elegant as one would
like, it seems workable -- and a better way will require
a new serialization model, which we are not going to do
just now.

>> If individuals wish to qualify resources by language value, in a
>> way that will affect queries and other graph-based operations,
>> then they should do so in a way that is meaningful to RDF
>> applications.
>> 
>> E.g.
>> 
>>    xxx ex:keyword _:1 .
>>    _:1 rdf:value "pan" .
>>    _:1 xml:lang _:2 .
>>    _:2 rdf:value "en" .
>>    _:2 rdf:dtype xsd:lang .
>> 
>> Note that _:2 is a datatyped literal but _:1 is simply
>> a qualified literal (qualified for language). Note also
>> the relationship between the property xml:lang and the
>> datatype xsd:lang.
> 
> This is requiring the use of the datatyping part of RDF (unwritten yet)
> in order to do what was previously part of the core RDF M&S.

Not really. Forget the datatyping portion if that bothers you, i.e.

    xxx ex:keyword _:1 .
    _:1 rdf:value "pan" .
    _:1 xml:lang "en" .

All that this does is use a bNode to denote a context for
(occurrence of) the statement object which has associated
with it one or more properties. This same 'idiom' works
whether the object is a literal, a bNode, or a URIref.

Though, regardless of what the datatyping solution is, it should
be possible to use it to define the range of xml:lang as
xsd:lang or similar.

> It
> isn't clear where the datatyping stuff will live, so making users
> require RDF+RDFS+RDF Datayping in order to do what they could do with
> RDF M&S alone, seems a big step.

Point taken, though the solution that I proposed does not
actually depend on either RDFS or RDF DT. I just used those
in the example to illustrate the complete picture, since in
practice, all are probably going to be used together.

>> That said, the M&S view that the language is "part of" the
>> literal is not quite right, and probably should be adjusted
>> (or removed), in that, as with datatyping, language is a
>> property of the occurrence (context) of the literal
>> and not the literal itself. And especially since literals are
>> now tidy, an application shouldn't attach context specific
>> properties such as language to globally shared literal nodes.
> 
> Or lang-literals are tidy?

I do not see how lang-literals can be tidy. Not without some
explicit representation of the language portion in the node
label. If the language is invisible in the graph, then that
means that either it does not really exist, or tidy literals
are ambiguous.

Patrick 

--
               
Patrick Stickler              Phone: +358 50 483 9453
Senior Research Scientist     Fax:   +358 7180 35409
Nokia Research Center         Email: patrick.stickler@nokia.com

Received on Tuesday, 12 February 2002 02:00:52 UTC