Re: Workflows for localizing RDF (Fwd: Fwd: "Organization Ontology" Japanese translation available) from Dave Lewis on 2014-02-19 (public-bpmlod@w3.org from February 2014)

From: Dave Lewis <dave.lewis@cs.tcd.ie>
Date: Wed, 19 Feb 2014 10:33:20 +0000
To: Felix Sasaki <fsasaki@w3.org>, Jose Emilio Labra Gayo <jelabra@gmail.com>
CC: public-bpmlod@w3.org
Message-ID: <53048870.4020908@cs.tcd.ie>
Hi Felix, Jose,
Jose is correct that it is difficult, and probably not realistic, to aim 
all the time for universal best practice that applies in all use cases. 
Instead we need to do some work in categorising the different types of 
use cases, and then use that to frame the more detailed discussions about 
specific best practices around annotations, formats, codes etc. So we 
start by identifying best practices within explanations of the different 
types of use cases where they may and may not be beneficial (including 
what the benefits and possible drawbacks are).

For example, in 
http://oa.upm.es/5178/1/A_Note_on_Ontology_Localization.pdf there is a 
nice framing of the different areas of concern in localizing ontologies. 
Capturing that briefly, say under
http://www.w3.org/community/bpmlod/wiki/Topic_classification#Localization_of_existing_vocabularies
would allow us then to drill down further into the issues in a more 
focussed way.

Also, is now the time to start using the W3C issue tracker to help 
manage the progress of different topics on the mailing list and in the 
meetings? That can help people in the community to champion a particular 
topic and drive discussions to a resolution over time.

Regards,
Dave




On 19/02/2014 08:16, Felix Sasaki wrote:
> Hi Jose, all,
>
> would it be ok to have an unstructured list of best practice statements 
> in the wiki? Just to be able to keep track of discussions like the 
> workflow topic or the one at
> http://lists.w3.org/Archives/Public/w3c-translators/2014JanMar/0033.html
> That would not replace the existing structure.
>
> Btw., from the above w3c-translators list it looks like a machine 
> readable description of what is translatable or not could be helpful. 
> The paper you cite uses rdf:XMLLiteral for that
>
> :unileón :desc
> "<p>University of
> <span translate='no'>León</span>,
> Spain.
> </p>"^^rdf:XMLLiteral .
>
> But an approach that selects several parts in the ontology at the same 
> time may be useful. This works easily in the RDF / XML serialization, 
> see the ITS rules at
> http://lists.w3.org/Archives/Public/public-bpmlod/2014Feb/0008.html
> but not in other serializations. So having such a rule
> <its:translateRule selector="//rdfs:label[lang('en')] |
> //dct:title[lang('en')] | //rdfs:comment[lang('en')]" translate="yes"/>
> could have helped Elena in the above thread and Phil, so that they 
> don't need to discuss what to translate in the ORG ontology or not.
>
> Best,
>
> Felix
>
>
>
> Am 17.02.14 13:10, schrieb Jose Emilio Labra Gayo:
>> That was part of the discussion we had at the beginning of the 
>> community group. In my opinion, it is difficult to state "Do 
>> XYZ/Don't do XYZ..." in a general way because sometimes it will 
>> depend on the context and other factors. That's why in this paper 
>> [1], we chose the term "patterns" to be able to identify common 
>> practices and to offer some suggestions on the contexts in which one 
>> approach may be better than another.
>>
>> In the BPMLOD Wiki [2], we are creating a table identifying the main 
>> practices (or patterns) and filling it in with that kind of advice 
>> (arguments in favour/arguments against). I think this exercise may be 
>> interesting but it would be great if we could also offer a more 
>> detailed study accompanied by real examples. In the last meeting we 
>> asked for volunteers who could help us fill in the table.
>>
>> Using pattern terminology, it is possible that some of the practices 
>> that we are documenting in the table could be called "bad smells". 
>> But anyway, I think it is a good exercise to identify them and to see 
>> in which contexts they can be safely applied.
>>
>> Anyway, the statements that you mention are very good candidates to 
>> include in the table and to document in which contexts they are 
>> better applied. I think they were more or less covered in sections 
>> 4.6.2 (multilingual vocabularies) and 4.6.3 (localize existing 
>> vocabularies) in [1] but we haven't yet got to discussing those 
>> sections in the community group :)
>>
>> Best regards, Jose Labra
>>
>>
>> [1] J. E. Labra-Gayo, D. Kontokostas, and S. Auer, "Multilingual 
>> linked open data patterns", Semantic Web - Interoperability, 
>> Usability, Applicability, 2013. Available: 
>> http://www.semantic-web-journal.net/content/multilingual-linked-data-patterns
>>
>> [2] BPMLOD Wiki. Best practices table. 
>> http://www.w3.org/community/bpmlod/wiki/Best_practises_-_previous_notes
>>
>>     The topic would probably fit in every place you mention. But I
>>     have a question: will we provide best practices in the form of
>>
>>     "Do XYZ because ..."
>>     "Don't do XYZ because"
>>
>>     The question is really about how the BP will be presented. For
>>     the topic in this thread various statements could come to my mind:
>>
>>     "Clearly identify translatable content in your RDF ontology +
>>     data in a standard manner"
>>
>>     "Carefully decide whether you do translation inside the RDF file
>>     or whether you extract the content"
>>     (Here discuss the pros + cons of extraction, the
>>     contextualization issue)
>>
>>     - Felix
>>
>>     Am 11.02.14 21:22, schrieb Jose Emilio Labra Gayo:
>>>
>>>         About "But do we get issues when using this data type  (or
>>>         any non
>>>         http://www.w3.org/1999/02/22-rdf-syntax-ns#langString
>>>         datatype) when also using language tags on the literal? :":
>>>         Dave is right, the HTML data type would not allow for using
>>>         the language tag. You could only use it inside the HTML
>>>         content, which means it cannot be queried with SPARQL.
>>>
>>>         Your feedback was quite useful - my main point is: do we
>>>         want to write all this down in easy to understand best
>>>         practices? Dave had asked a similar question, I think.
>>>
>>>
>>>     In my opinion, yes. It is a very interesting topic that has
>>>     appeared in a real scenario. Looking at the Topics that we had
>>>     proposed here:
>>>
>>>     https://www.w3.org/community/bpmlod/wiki/Topic_classification
>>>
>>>     I think this discussion could fit in:
>>>
>>>     2.3 - Longer descriptions, where we could talk about the use of
>>>     HTML and even XML literals.
>>>     2.4 - Lexicalizations and linguistic information
>>>     2.5 - Localization information
>>>
>>>     4.2 - Localization of existing vocabularies
>>>
>>>     Do you think we need to add a different topic or is it ok as is?
>>>
>>>     Best regards, Jose Labra
>>>
>>>
>>>     -
>>>
>>>
>>>         Felix
>>>
>>>         Am 10.02.14 01:21, schrieb dave.lewis@cs.tcd.ie
>>>         <mailto:dave.lewis@cs.tcd.ie>:
>>>>         Hi Felix,
>>>>         A couple of comments inline:
>>>>
>>>>         On 07/02/2014 11:39, Felix Sasaki wrote:
>>>>>>         that makes sense - but do we need to have a special
>>>>>>         literal type to indicate that it should be parsed for
>>>>>>         'inline' tags? 
>>>>>
>>>>>         See above - the HTML literal
>>>>>         http://www.w3.org/TR/rdf11-concepts/#section-html
>>>>>         should do the job.
>>>>>
>>>>
>>>>         But do we get issues when using this data type (or any non
>>>>         http://www.w3.org/1999/02/22-rdf-syntax-ns#langString datatype)
>>>>         when also using language tags on the literal? :
>>>>
>>>>         http://www.w3.org/TR/rdf11-concepts/#dfn-literal-value
>>>>
>>>>
>>>>>>         Also in some cases, for example if the span had
>>>>>>         its-term-info-ref pointing to a term definition
>>>>>>         elsewhere in the linked data cloud, best practice might
>>>>>>         be to reform (i.e. extract) the literal into a NIF
>>>>>>         subgraph, with the annotated sub-strings as separate
>>>>>>         nif:string objects.
>>>>>
>>>>>         Not sure if for generating an XLIFF file (see above) you
>>>>>         would need a NIF subgraph. The main motivation for my BP
>>>>>         proposal was: allow people working with localization tools
>>>>>         (= processing XLIFF files) to translate labels in linked data.
>>>>>
>>>>>         So all the below makes sense IMO for textual content,
>>>>>         extracted from HTML / XML etc. But processing the labels
>>>>>         in linked data with NIF? Not sure if that is needed and it
>>>>>         might even hinder using XLIFF-based localization workflows.
>>>>>
>>>>
>>>>         Agreed, getting the annotation to work with XLIFF/ITS in a
>>>>         way that can be used in existing tools should be the
>>>>         primary aim here.
>>>>
>>>>         The use of NIF is more relevant if you wanted to make the
>>>>         content available to NLP tools that could understand NIF -
>>>>         which is a different use case.
>>>>
>>>>         cheers,
>>>>         Dave
>>>>
>>>>
>>>>>         Disclaimer: really nothing against NIF ;) My point is only
>>>>>         about the right approach for label translation.
>>>>>
>>>>>         Best,
>>>>>
>>>>>         Felix
>>>>
>>>
>>>
>>>
>>>
>>>     -- 
>>>     Saludos, Labra
>>
>>
>>
>>
>> -- 
>> Saludos, Labra
>
Received on Wednesday, 19 February 2014 10:30:10 UTC
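
The ITS rule quoted in Felix's message marks English rdfs:label, dct:title and rdfs:comment elements of an RDF/XML serialization as translatable. As a rough illustration of what such a selection yields, here is a minimal sketch using only the Python standard library. The sample document, the example.org URI and the function name translatable_literals are assumptions made up for this illustration and do not come from the thread. The language check is also simplified: it compares the element's own xml:lang attribute exactly, whereas the XPath lang() function also honours inherited language and subtags such as en-GB.

```python
import xml.etree.ElementTree as ET

# Namespace URIs for the prefixes used in the ITS rule.
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
DCT = "http://purl.org/dc/terms/"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Properties the rule marks as translatable, in ElementTree's {uri}local form.
TRANSLATABLE = {
    "{%s}label" % RDFS,
    "{%s}comment" % RDFS,
    "{%s}title" % DCT,
}

# Hypothetical RDF/XML fragment, invented for this sketch.
SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:dct="http://purl.org/dc/terms/">
  <rdf:Description rdf:about="http://example.org/ns#Organization">
    <rdfs:label xml:lang="en">Organization</rdfs:label>
    <rdfs:label xml:lang="es">Organización</rdfs:label>
    <rdfs:comment xml:lang="en">A collection of people.</rdfs:comment>
    <dct:title xml:lang="en">Example ontology</dct:title>
  </rdf:Description>
</rdf:RDF>"""


def translatable_literals(rdf_xml, lang="en"):
    """Return (local name, text) pairs for the selected properties.

    Simplification: only the element's own xml:lang is compared, and
    the comparison is an exact string match.
    """
    root = ET.fromstring(rdf_xml)
    found = []
    for element in root.iter():
        if element.tag in TRANSLATABLE and element.get(XML_LANG) == lang:
            local_name = element.tag.split("}", 1)[1]
            found.append((local_name, element.text))
    return found


if __name__ == "__main__":
    for name, text in translatable_literals(SAMPLE):
        print(name, "->", text)
```

The sketch relies on the properties appearing as XML elements, which is why this kind of selection carries over to RDF/XML but not directly to serializations such as Turtle.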