- From: gsergiu via GitHub <sysbot+gh@w3.org>
- Date: Tue, 09 Aug 2016 14:07:57 +0000
- To: public-annotation@w3.org
gsergiu has just created a new issue for
https://github.com/w3c/web-annotation:
== Processing language for multilingual resources ==
The conclusion of ticket
https://github.com/w3c/web-annotation/issues/337#issuecomment-238557004
is that there is an M-to-N relationship between the dc:language of
multilingual resources and the text processors that might process the
annotation body and/or target.
I therefore propose the following definition of the processing
language property:
"This property represents the relationship between the language of the
resources (Body or Target) and the text processors or classes of text
processors that may process the resources for rendering, indexing or
any NLP processing."
1. Consequently I propose that the verbose representation of this
property should include <language, processor_class, processor_id>
tuples.
I recommend using a controlled vocabulary for processor classes, such as:
textual_representation, audio_representation, visual_representation
(i.e. images), text_indexing, nlp_processing
Example:
```
"processingLanguage": [
  {"language": ["en", "fr", "ro"], "processor_class": "textual_representation"},
  {"language": "en", "processor_class": "text_indexing",
   "processor_id": "<snowball_indexer_uri>"},
  {"language": "ro", "processor_class": "audio_representation",
   "processor_id": "<TTS_RO_uri>"}
]
```
2. The minified representation could remain compliant with the
current specification, meaning that all text processors (of all
types) should use the same processing language.
3. There are 2 open questions:
a. Should this property be named “processing”?
b. Should this information be embedded within the annotations
(the model) or carried in the protocol (a separate HTTP request)?
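To make points 1 and 2 more concrete, here is a rough Python sketch of how a client might treat the two representations uniformly. It is only an illustration under assumptions: the function names (`expand_minified`, `normalize`, `processors_for`) are hypothetical, the class list is simply the vocabulary suggested in point 1, and nothing here is part of the specification.

```python
# Hypothetical vocabulary, copied from the classes suggested in point 1.
PROCESSOR_CLASSES = [
    "textual_representation",
    "audio_representation",
    "visual_representation",
    "text_indexing",
    "nlp_processing",
]


def expand_minified(language):
    """Expand a minified value (one language tag or a list of tags) into
    verbose entries: every processor class uses the same language(s)."""
    languages = [language] if isinstance(language, str) else list(language)
    return [
        {"language": languages, "processor_class": cls}
        for cls in PROCESSOR_CLASSES
    ]


def normalize(value):
    """Accept either representation and return verbose entries whose
    "language" field is always a list."""
    is_tag_list = isinstance(value, list) and all(isinstance(v, str) for v in value)
    if isinstance(value, str) or is_tag_list:
        return expand_minified(value)
    entries = []
    for entry in value:
        langs = entry["language"]
        langs = [langs] if isinstance(langs, str) else list(langs)
        entries.append({**entry, "language": langs})
    return entries


def processors_for(value, language, processor_class):
    """Return the entries that apply to a language and processor class."""
    return [
        entry
        for entry in normalize(value)
        if entry["processor_class"] == processor_class
        and language in entry["language"]
    ]


# The verbose example from point 1; the processor_id values are the
# placeholders used there, not real URIs.
verbose = [
    {"language": ["en", "fr", "ro"], "processor_class": "textual_representation"},
    {"language": "en", "processor_class": "text_indexing",
     "processor_id": "<snowball_indexer_uri>"},
    {"language": "ro", "processor_class": "audio_representation",
     "processor_id": "<TTS_RO_uri>"},
]
```

With this sketch, `processors_for(verbose, "ro", "audio_representation")` yields the single TTS entry, while `processors_for("en", "en", "nlp_processing")` treats the minified value "en" as applying to every processor class.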
Please view or discuss this issue at
https://github.com/w3c/web-annotation/issues/341 using your GitHub
account
Received on Tuesday, 9 August 2016 14:08:04 UTC