- From: Martin Duerst <duerst@w3.org>
- Date: Fri, 07 Nov 2003 04:06:16 -0500
- To: www-rdf-comments@w3.org
- Cc: w3c-i18n-ig@w3.org
Dear RDF WG,
Here are the last call comments from the I18N WG on
the RDF drafts. They are organized by feature rather
than by draft.
- Treatment of language information for XML Literals:
We have already commented on this extensively. We think that the
removal of language information from XML literals is a serious
problem for internationalization. Details of our comments can
be found at http://www.w3.org/2003/09/ri434.html.
One example of how language information could easily be added to
XML literals, with minimal impact on the overall design,
(just a proposal, not intended to preclude any other solution)
is to use a very short wrapper element:
The lexical form of the XML literal in
<ex:prop rdf:parseType='Literal'>foo</ex:prop>
(minimal example on purpose) becomes
<w>foo</w>
i.e. a <w> element is added as a wrapper to all XML literals.
<w> stands for 'wrapper'. A single-character element name is
chosen to keep potential overhead as low as possible. For
<ex:prop rdf:parseType='Literal' xml:lang='fr'>foo</ex:prop>
the lexical form becomes
<w xml:lang="fr">foo</w>
<w> does not need a namespace because it is only used for
abstraction/descriptive purposes within RDF, and potentially
within implementations that choose to use this form of representation.
It will usually not be used for interchange. It may occasionally be
used for interchange if instead of rdf:parseType='Literal', rdf:datatype
is used, i.e. in
<ex:prop rdf:datatype='&rdf;XMLLiteral'><w>foo</w></ex:prop>
but in that case, it is also well identified.
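As a minimal sketch of the wrapper proposal above (not a definitive
implementation): the function name wrap_xml_literal and its parameters
are hypothetical, and the content is assumed to be already serialized
XML. It only shows how the xml:lang value in scope would be carried
into the lexical form.

```python
from typing import Optional
from xml.sax.saxutils import quoteattr


def wrap_xml_literal(content: str, lang: Optional[str] = None) -> str:
    """Build the proposed lexical form by wrapping the content in <w>.

    `content` is assumed to be already-serialized XML content.
    `lang` is the xml:lang value in scope on the property element, if any.
    (Hypothetical helper, for illustration only.)
    """
    if lang:
        # quoteattr adds the surrounding quotes and escapes the value.
        return f"<w xml:lang={quoteattr(lang)}>{content}</w>"
    return f"<w>{content}</w>"


print(wrap_xml_literal("foo"))        # <w>foo</w>
print(wrap_xml_literal("foo", "fr"))  # <w xml:lang="fr">foo</w>
```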
- XML Literals as typed literals: We have not objected to the treatment
of XML Literals as typed literals in the previous last call, because
there, it was possible to understand this treatment just as a technical
convenience. However, with the change to the treatment of language
information for XML literals, treating XML literals as datatyped literals
becomes highly questionable. Unless the language information issue can
be solved, we have to disagree with this treatment.
- XML Literals containing only text should be equivalent to the
corresponding plain literals and to the corresponding string type
literals. (Solving the language information issue for XML Literals
seems to be a precondition for this, but once this is done, it does
not seem to be too difficult, in the same way that it was possible
to make strings and plain literals equivalent.)
- Examples/Primer: There is one important facility of RDF that is almost
completely ignored in the primer and in the general discussion in the
concepts document. This is the ability to use not only ASCII characters,
but Unicode, in literal values, URIrefs, and (with some restrictions)
XML element names, and therefore property names and names of other nodes.
While this may be of somewhat secondary importance to English readers
(but still more important than the treatment it is given), it is crucial
for translations of the specification. We suggest the following:
- Mention this possibility very early on, at the point where
literals and URIrefs are first treated in detail (most probably
section 2.2 (or even 2.1)).
- Add a simple example at this point, or change an already existing
example slightly.
- For the extensive examples in section 6, replace some of the
current examples with equivalent examples with more international
flavor. Most of the applications in section 6 are used in a
world-wide context, and finding some examples should not be
difficult. Overall, changing or adding two to three examples
in section 6 should be sufficient. They should not be limited to
examples like example 32, which contains the copyright sign
as a single non-ASCII character, although having an example
that shows how non-ASCII characters can be convenient in a
purely English context may also be a good idea.
- The explanations of these examples should mention the fact
that RDF and XML allow Unicode characters. This does not
have to be extensive; a few short sentences, with pointers
to the relevant parts of the normative specs, should be sufficient.
Readers of the primer should not be bothered/confused with
issues such as normalization.
- Alt container: Because of the special rule that the first element is
the default or preferred value, this is a fake alternative. This should
be changed, or a real alternative, without any preferences, should be
provided. This is in particular important if there is no preferred
version among different language versions (which is often needed for
political reasons), but we are sure there are many other cases where
it is not desirable to have a preferred alternative, or where there
just simply is no preferred alternative. (Other such examples include
voting ballots. Even for the ftp example given, a true alternative
may be desirable, to allow load balancing.)
- Measures/weights: The primer in a very small number of instances
uses 'weightInKg', and explains why, but for the rest, it always
uses just 'weight', even when there is no reason for such an
underspecified property. For world-wide data interchangeability,
such details are crucial. Unless there is a specific point to make
(e.g. when explaining rdf:value), 'weightInKg' should always be
preferred. The same applies to other properties such as
rearSeatLegRoom. Language such as (primer 4.4:)
>>>>
because frequently the value
would be recorded simply as the typed literal (as in the triple above),
relying on an understanding of the context to fill in the unstated
units information.
>>>>
should be avoided. The primer should not recommend
practices that have made Mars missions go astray, among other things.
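To illustrate why an explicit unit matters for interchange, here is a
small sketch. The helper weight_in_kg and the constant KG_PER_LB are
hypothetical names, and only kilograms and pounds are assumed. The
same bare number yields very different weights depending on the unit
that the reader has to guess.

```python
from decimal import Decimal

# One international avoirdupois pound is exactly 0.45359237 kg.
KG_PER_LB = Decimal("0.45359237")


def weight_in_kg(value: Decimal, unit: str) -> Decimal:
    """Normalize a weight to kilograms; refuse to guess a missing unit.

    (Hypothetical helper, for illustration only.)
    """
    if unit == "kg":
        return value
    if unit == "lb":
        return value * KG_PER_LB
    raise ValueError(f"unknown or missing unit: {unit!r}")


bare_value = Decimal("100")  # what a plain 'weight' property would carry
print(weight_in_kg(bare_value, "kg"))  # interpreted as kilograms
print(weight_in_kg(bare_value, "lb"))  # interpreted as pounds
```

A property such as 'weightInKg' removes the need for the second
argument altogether, because the unit is part of the property itself.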
- The motivation for using xsd datatypes should clearly say that these
are well established and should be used where appropriate. Having
the possibility to use different types is good if needed, but in
general, interoperability is increased when using a well-defined
common set. Also, to a large extent, the value spaces, and to
a somewhat smaller extent, the lexical forms, are locale-independent,
which makes it possible to exchange data independently of a particular
human-oriented representation. This should also be mentioned.
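The distinction between a locale-independent lexical form and a
human-oriented representation can be sketched as follows. The function
names are hypothetical, the German-style display format is just one
assumed example, and the parser does not validate the full xsd:decimal
grammar.

```python
from decimal import Decimal


def to_xsd_decimal(value: Decimal) -> str:
    """Lexical form for interchange: plain digits with '.' as separator."""
    return format(value, "f")


def parse_xsd_decimal(lexical: str) -> Decimal:
    """Read the interchange form back (sketch only, no grammar validation)."""
    return Decimal(lexical)


def display_german_style(value: Decimal) -> str:
    """Human-oriented form, assuming '.' for grouping and ',' for decimals."""
    english_style = format(value, ",.2f")
    # Swap the two separators in a single pass.
    return english_style.translate(str.maketrans(",.", ".,"))


price = Decimal("1234.5")
print(to_xsd_decimal(price))        # 1234.5
print(display_german_style(price))  # 1.234,50
```

Only the first form is meant for exchange; the second is produced at
the point of display, from the same value.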
Regards, Martin.
Received on Friday, 7 November 2003 07:54:27 UTC