[WNET] do words need XML markup?

Note to I18N IG:
1) the Semantic Web Best Practices and Deployment Group works in public; 
please reply-all, but note that such replies will be public.
2) the [WNET] tag helps SWBPD participants to distinguish threads.

We are looking at mappings of the WordNet thesaurus into OWL and/or RDFS.

The basic approach is to map the conceptual relationships between words, 
senses, etc. used in WordNet (such as hypernym and synonym) into an 
OWL ontology, and to map a specific WordNet instance, such as the 
English one, into an RDF knowledge base acting as an instance of the 
ontology.
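
As an illustration of that split between ontology and knowledge base, here 
is a minimal sketch in Python using plain tuples as triples. It is not the 
mapping under discussion: the example.org namespaces, the class and 
property names (Synset, Word, hypernym, containsWord, lexicalForm) and the 
dog/canine data are all hypothetical, chosen only to show the shape. The 
lexical form is held as a plain Unicode string with a language tag, which 
is the representation questioned below.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OWL = "http://www.w3.org/2002/07/owl#"
SCHEMA = "http://example.org/wn/schema#"  # hypothetical namespace
DATA = "http://example.org/wn/en/"        # hypothetical namespace

# Ontology level: the kinds of things and relationships WordNet uses.
ontology = [
    (SCHEMA + "Synset", RDF_TYPE, OWL + "Class"),
    (SCHEMA + "Word", RDF_TYPE, OWL + "Class"),
    (SCHEMA + "hypernym", RDF_TYPE, OWL + "ObjectProperty"),
    (SCHEMA + "containsWord", RDF_TYPE, OWL + "ObjectProperty"),
    (SCHEMA + "lexicalForm", RDF_TYPE, OWL + "DatatypeProperty"),
]

# Instance level: a fragment of one language's data (illustrative only).
# "X hypernym Y" is read here as "Y is a hypernym of X".
instances = [
    (DATA + "synset-dog", RDF_TYPE, SCHEMA + "Synset"),
    (DATA + "synset-canine", RDF_TYPE, SCHEMA + "Synset"),
    (DATA + "synset-dog", SCHEMA + "hypernym", DATA + "synset-canine"),
    (DATA + "word-dog", RDF_TYPE, SCHEMA + "Word"),
    (DATA + "synset-dog", SCHEMA + "containsWord", DATA + "word-dog"),
    # The word itself: a Unicode string plus a language tag, no markup.
    (DATA + "word-dog", SCHEMA + "lexicalForm", ("dog", "en")),
]


def objects(triples, subject, predicate):
    """Return every object of the triples matching subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def undeclared_properties(ontology, instances):
    """List properties used by the instance data but not declared above."""
    declared = {s for s, _, _ in ontology}
    used = {p for _, p, _ in instances if p != RDF_TYPE}
    return sorted(used - declared)
```

Calling objects(instances, DATA + "synset-dog", SCHEMA + "hypernym") looks 
up the hypernym of the dog synset, and undeclared_properties(ontology, 
instances) is a crude check that the instance data only uses relationships 
the ontology declares.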

In discussions at HP on this, we realised that we did not know whether 
a Unicode string is a sufficient representation of a word, or whether 
XML markup is needed in some cases.

The case presented in charmod is for using XML for *text* rather than 
single words. For example, ruby is not relevant to individual words, and 
as far as I can tell, bidi using XML markup is useful for text involving 
mixed languages and computer-ese etc.

We were also trying to think about the issues involved with different 
languages making different uses of morphology versus the lexicon versus 
the grammar. A particular example that came to mind was long productive 
compound noun formation in German, where I assume that dictionaries do 
not list all known compound nouns ... (I am not a German speaker)

Hoping for some I18N input ...

Jeremy

Received on Tuesday, 6 July 2004 11:38:44 UTC