Re: ILS from Charles McCathieNevile on 2002-08-04 (wai-xtech@w3.org from August 2002)

From: Charles McCathieNevile <charles@w3.org>
Date: Sun, 4 Aug 2002 09:05:29 -0400 (EDT)
To: Lisa Seeman <seeman@netvision.net.il>
cc: <wai-xtech@w3.org>
Message-ID: <Pine.LNX.4.30.0208040819410.24879-100000@tux.w3.org>

A couple of different thoughts.

1. XML or RDF?
I think that making this RDF is more helpful for several reasons.

The major one is that RDF data can easily be layered over the top of existing
content. This has two advantages - it means that we aren't relying on the
original author, or a subsequent editor of the content, to use the XML
schema, and it also means that several people can annotate the same
information, perhaps differently. This second possibility is important
because people interpret things differently, and because in many cases people
only do partial annotations, relying on others to supplement them. (Think of
longdesc information - there is very little of it, but if it was easier to
add longdesc to content controlled by others, and easy to find that, we might
hope for some more).
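
As a rough illustration of the layering idea (a sketch only, not real RDF
tooling): the statements below live apart from the content they describe, and
two annotators describe the same image differently. The URIs, the annotator
names and the "longdesc" / "partOf" property names are all made up for the
example.

```python
from collections import defaultdict

# Each annotation is a (subject, predicate, object, annotator) tuple stored
# apart from the content it describes. All URIs and names are hypothetical.
annotations = [
    ("http://example.org/page#img1", "longdesc",
     "A budgerigar eating a peanut.", "alice"),
    ("http://example.org/page#img1", "longdesc",
     "Small green parrot holding a nut.", "bob"),
    ("http://example.org/page#para2", "partOf",
     "http://example.org/parrots", "bob"),
]


def describe(subject, store):
    """Collect every statement about `subject`, grouped by predicate."""
    found = defaultdict(list)
    for s, p, o, who in store:
        if s == subject:
            found[p].append((who, o))
    return dict(found)


result = describe("http://example.org/page#img1", annotations)
for who, text in result["longdesc"]:
    print(f"{who}: {text}")
```

Neither annotator needed write access to the page, and a consumer can choose
whose description to trust or simply show both.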

There are tools for dealing with RDF - storing, creating, editing, querying,
and many of them can be used online in an HTML-capable browser. If RDF is
used as annotations, separate from the actual data, then using web-based
interfaces, or web-based services to integrate information for presentation
in user interfaces, becomes a sensible and easy method for transparent
upgrade from all-external additional tools to incorporating them directly
into authoring and browsing tools.

(There is also RDF information spreading around, particularly for things like
annotating images with longdesc and other supplemental information. This
suggests that we could expect to see more tools including RDF capability in
the medium term.)

2. Using existing standards and information

The draft schema specifies languages. I think this is a mistake - the
xml:lang attribute can be applied in any XML language, and I think it is
better to rely on that than to try and replicate it - the necessary code for
that has already been made available in many contexts.
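
To show how little code this takes, here is a minimal sketch using Python's
standard xml.etree.ElementTree. The sample document and its element names are
invented for the example; the only thing taken from the XML specification is
that xml:lang sits in the reserved XML namespace and is inherited by
descendant elements.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample document; element names are illustrative only.
SAMPLE = """<doc xml:lang="en">
  <p>Hello</p>
  <p xml:lang="fr">Bonjour <em>le monde</em></p>
</doc>"""


def effective_langs(elem, inherited=None):
    """Yield (tag, language) pairs, inheriting xml:lang from ancestors."""
    lang = elem.get(XML_LANG, inherited)
    yield elem.tag, lang
    for child in elem:
        yield from effective_langs(child, lang)


langs = list(effective_langs(ET.fromstring(SAMPLE)))
print(langs)
```

A schema that relies on xml:lang gets this inheritance behaviour for free,
rather than having to define and implement its own language attribute.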

There is a dictionary - WordNet - in English which is available in RDF form,
and which links words to synonyms, to more generalised or more specific words
for the same concept, etc. This is available in at least two versions:
http://www.semanticweb.org/library/ and http://xmlns.com/2001/08/wordnet/ but
we can hope that relating the two will not be monstrously difficult (at least
for "common purposes" - and the joy of RDF is that if we have an uncommon
purpose we can ask which version we are using...).

A similar exercise should be possible for other languages, if it has not
already been done.  I would be surprised if the national language institutes
of countries like France, Iceland and Spain, or the editorial committees of
large dictionary producers have not yet at least investigated this exercise.

3. Some general thoughts

I think that it is important to be able to talk about the objects we are
marking up - some general information about whether they are part of
something else is probably important. For example, if I find a paragraph
about "budgerigars eating peanuts" it might be interesting to know if it is
from a document about parrots or about food crops...

This relates to XML Accessibility Guidelines - checkpoints 3.2, 3.3, 4.5 and
4.10 in the June 2002 draft.

Something about what format they are (text, graphics, mixed, etc) might also
be useful. (This is a half-developed thought at this stage).

cheers

Chaals

On Sun, 4 Aug 2002, Lisa Seeman wrote:

>
>
>Hello all
>
>Charles has recommended that I post to this list (or move to this list) the
>discussions about Interoperable Language. Anyone who knows all the beginning
>stuff from PF or other lists can scroll down to the "RDF or XML" discussion.
>
>Background:
>In view of the anticipated development of symbolic language and translatable
>language, and of impairments in processing ambiguous, non-literal or generally
>difficult language, there is a definite accessibility need for controlled
>use of language.
>
>But, as has been pointed out, web authors will not want to use one.
>
>So we are developing a standard technique for marking up typical electronic
>textual content, and referencing textual content, so that its meaning
>becomes unambiguous, translatable and machine-readable.
>
>Most importantly, no one needs to change the way that they write. All that
>you do is use the metadata and markup to remove the ambiguity.
>
>So far so good......
>
>How it works:
>
>Basically the idea is that you refer the document to a lexicon, or set of
>cascaded lexicons (so that you can override the main lexicon for
>localization, jargon, or just your own usage). You can also override a
>lexicon inline. In other words, you specify the meaning of each word and
>phrase in the document. You can use "bad" to mean "good", or whatever you
>like. You just need to mark it up.
>
>In the document itself, the default meaning and word usage is the
>primary one defined in the highest-priority lexicon. But language can be
>marked up to include more than one meaning or a different meaning.
>
>Content can be marked up with implied content, sarcasm, and other forms of
>non-literal translations. The priority of text can be identified through
>markup, and a summary of any text can be provided through markup.
>
>
>Current progress:
>
>I have put up a draft schema at: http://www.ubaccess.com/ils.html#drafts
>
>So far it is implemented as XML. The question is, should we be
>making this all sing as RDF? (see the next section)
>
>Using the gloss meta data we should have all the pieces in place to link to
>the primary lexicon.
>
>(We can also mark up the schematic relationships to other lexicons to
>include them - see http://www.w3.org/WAI/PF/XML/#g4_0 )
>
>
>RDF or XML discussion:
>
>Pros of XML:
>
>People can start to use the system (if they can use XML) without tools.
>Inline implementation, easy and intuitive.
>This schema could be markup for a document (not a lexicon) or a lexicon -
>which means that if you have defined some language usage inline in  one
>document, you can use that document as a lexicon in other documents.
>So in other words, you can mark up a word usage once, inline, and refer to
>that language usage in other documents either by referencing it with meta
>data or referencing it inline.
>Basically once you start using it, you just reference the last document you
>did, and all your language usage from previous documents is included. You
>only need to reference unusual usage that you have never used before.
>
>Ideally we could extend the XHTML modules to include these elements or
>extend XHTML elements to have the included attributes. (But I am not sure
>how to do that. Are the XHTML schema modules finished or still at draft
>stage?)
>
>XSL is all the user agent needs to start to implement this usefully. In other
>words, people could use it today.
>
>Pros of RDF:
>Let us assume that no one will use it without tools (that is a big
>assumption) - that will invalidate most of the pro-XML argument.
>It really is RDF stuff - it is information about metadata.
>All we can hope to do in the schema is a best guess as to what we need to
>put in. We know we will want this project to evolve, which makes RDF the
>right environment.
>
>
>That's it so far
>
>your comments, suggestions......
>
>All the best,
>
>Lisa Seeman
>
>UnBounded Access
>
>Widen the World Web
>
>http://www.UBaccess.com
>
>
>
>

-- 
Charles McCathieNevile    http://www.w3.org/People/Charles  phone: +61 409 134 136
W3C Web Accessibility Initiative     http://www.w3.org/WAI  fax: +33 4 92 38 78 22
Location: 21 Mitchell street FOOTSCRAY Vic 3011, Australia
(or W3C INRIA, Route des Lucioles, BP 93, 06902 Sophia Antipolis Cedex, France)
Received on Sunday, 4 August 2002 09:05:32 UTC