Re: Resolution proposal for ISSUE-2 from Najib Tounsi on 2012-03-23 (public-multilingualweb-lt@w3.org from March 2012)

From: Najib Tounsi <ntounsi@gmail.com>
Date: Fri, 23 Mar 2012 14:40:32 +0000
To: public-multilingualweb-lt@w3.org
Message-ID: <4F6C8B60.2000807@emi.ac.ma>

Hi all,

Another newcomer here.
After some reading, I favor the resolution suggested by Felix [1].

Best regards.

Najib

[1] http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Mar/0022.html


On 3/23/12 8:48 AM, Felix Sasaki wrote:
> Hi Phil,
>
> thanks a lot for your mail. Actually I don't think that you need to 
> dive deeply into RDFa and Microdata. We just need to make clear in a 
> conformance statement that:
>
> 1) An implementation of our standard needs to be able to parse its-* 
> (or whatever prefix we have) attributes in HTML, e.g. the HTML 
> "translate" attribute, its-locNote, its-term etc.
> 2) An implementation working in the XLIFF (or general XML) space needs 
> to be able to parse the XML counterparts of the its-* attributes, e.g. 
> its:translate, its:locNote, its:term etc.
> 3) An implementation MAY implement the (still to be detailed) "convert 
> HTML5 to RDFa or Microdata" algorithm, including the URI generation 
> facility Tadej mentioned.
>
> You can boil this down to a table with four columns, see attachment. 
> An implementation MUST state: "I implement data category XYZ, in 
> HTML5, or XML. If HTML5, then I provide the RDFa / Microdata conversion".
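
The conformance rules above can be sketched in Python. This is only an illustration under stated assumptions: ConformanceClaim, html_to_xml_name and the two prefix constants are hypothetical names, and the attribute spellings follow the examples in this thread (its-locNote, its:locNote), which were not final at the time.

```python
from dataclasses import dataclass

# Prefixes as used in the thread's examples; the final prefix was
# still open ("or whatever prefix we have").
HTML_PREFIX = "its-"
XML_PREFIX = "its:"


def html_to_xml_name(html_attr: str) -> str:
    """Map an HTML attribute name to its XML counterpart."""
    # The HTML "translate" attribute carries no prefix in the thread's example.
    if html_attr == "translate":
        return XML_PREFIX + "translate"
    if not html_attr.startswith(HTML_PREFIX):
        raise ValueError(f"not an ITS attribute: {html_attr!r}")
    return XML_PREFIX + html_attr[len(HTML_PREFIX):]


@dataclass(frozen=True)
class ConformanceClaim:
    """One row of the conformance table (hypothetical structure)."""
    data_category: str                       # e.g. "term"
    host_format: str                         # "HTML5" or "XML"
    rdfa_microdata_conversion: bool = False  # optional (MAY), HTML5 only

    def __post_init__(self):
        if self.host_format not in ("HTML5", "XML"):
            raise ValueError("host_format must be 'HTML5' or 'XML'")
        if self.rdfa_microdata_conversion and self.host_format != "HTML5":
            raise ValueError("conversion applies only to HTML5 implementations")

    def statement(self) -> str:
        text = f"I implement data category {self.data_category} in {self.host_format}."
        if self.rdfa_microdata_conversion:
            text += " I provide the RDFa / Microdata conversion."
        return text
```

For example, ConformanceClaim("term", "HTML5", True).statement() yields the kind of sentence an implementation would have to publish, while claiming the conversion for an XML-only implementation is rejected.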
>
> HTH,
>
> Felix
>
> On 22 March 2012 at 22:12, Phil Ritchie <philr@vistatec.ie 
> <mailto:philr@vistatec.ie>> wrote:
>
>     I'm afraid I need to do some serious reading over the weekend on
>     RDFa and Microdata before I'll feel qualified to contribute
>     properly to the discussion.
>
>     The important considerations for me relate to parsability,
>     but all of the proposals seem to provide a well-structured,
>     unambiguous, simply tokenised format.
>
>     Phil
>
>
>
>     On 22 Mar 2012, at 17:18, "Felix Sasaki" <fsasaki@w3.org
>     <mailto:fsasaki@w3.org>> wrote:
>
>>     Thank you, Tadej. Trying to summarize what you say: we need
>>
>>     1) HTML5 + ITS (or XYZ) schema
>>     2) Algorithm for transforming "HTML5+ITS" into HTML5/RDFa,
>>     /Microdata, or /RDFa Lite. Could we say we just cover RDFa Lite?
>>     3) Algorithm (what you wrote below) to generate URIs in RDFa
>>
>>     Your question about "A question for people consuming RDF/RDFa"
>>     still needs an answer, but otherwise I think we are done with
>>     this. Any thoughts by others, esp. implementors in the group?
>>
>>     Felix
>>
>>     On 22 March 2012 at 15:47, Tadej Stajner
>>     <tadej.stajner@ijs.si <mailto:tadej.stajner@ijs.si>> wrote:
>>
>>         On 3/22/2012 2:11 PM, Felix Sasaki wrote:
>>>
>>>
>>>         On 22 March 2012 at 13:52, Jirka Kosek <jirka@kosek.cz
>>>         <mailto:jirka@kosek.cz>> wrote:
>>>
>>>             On 22.3.2012 13:09, Felix Sasaki wrote:
>>>
>>>             > Solution 1) will be user friendly, and we will define
>>>             a RELAX NG schema
>>>             > HTML5+ITS (or + XYZ). The same approach has been taken
>>>             for Aria in the
>>>             > accessibility space, and Aria is now even part of the
>>>             HTML5 core language.
>>>             >
>>>             > Comments are very welcome. I hope we can agree on
>>>             this during next week's call
>>>             > and find a volunteer for maintaining the schema and
>>>             another one for the
>>>             > mappings.
>>>
>>>             I volunteer to create and maintain the schema.
>>>
>>>
>>>         Great, thanks a lot.
>>>
>>>
>>>             > Regarding the "URIs for element nodes in HTML5"
>>>             discussion: Ivan said that
>>>             > our group should consider whether this is really an issue.
>>>
>>>             I would have expected a more definite reply from the SW
>>>             activity lead :-)
>>>
>>>
>>>         Well, to be fair, he was more precise:
>>>
>>>         "RDFa does not include any definition, as far as the
>>>         extracted RDF is concerned, on pointing 'back' to the
>>>         original source structure. This should be done explicitly. I
>>>         am not sure whether this is a major issue, this is something
>>>         for the group to consider..."
>>>
>>>         But the essence is the same: is it important for us?
>>
>>         Some things to add (and to shed some light on ACTION-32):
>>
>>         I think it's important to define a way to do it, but not to
>>         make serializing it obligatory, because it has zero utility
>>         until someone actually uses it in pure RDF. The thing is, as long
>>         as the HTML document is available and the RDFa is inlined,
>>         the references to the HTML structure in RDF don't add any
>>         additional information and can be trivially reconstructed.
>>         RDFa consumption tools can likely handle that kind of content
>>         as-is.
>>
>>         The tricky case is when someone at some point wants to get pure
>>         RDF from this (dropping the HTML in the process); for that we
>>         should have some specification that they could follow to achieve
>>         these references. The use case I can think of is feeding
>>         ITS-marked-up input into a NLP pipeline running on something
>>         like NIF, which needs URIs for annotated fragments of text.
>>         Luckily the conversion itself is pretty mechanical, so I see
>>         some strategies for minting URIs that can be dereferenceable
>>         directly to the fragment:
>>         * have the RDF node point back to the HTML element's id, if
>>         there is any (<meta property="its:annotates"
>>         resource="#id_myElement_bar" />)
>>         * have the RDF node mint a URI for the fragment using one of
>>         the NIF recipes (<meta property="its:annotates"
>>         resource="#hash_1_3_12341234123412341_bar" />)
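
The two strategies above can be sketched in Python as follows. This is a minimal sketch, not the NIF specification: the function names are hypothetical, and the hash layout (context length, text length, MD5 digest, leading characters of the text) is only loosely modelled on the "#hash_1_3_..._bar" example in this thread.

```python
import hashlib
from html import escape
from typing import Optional
from urllib.parse import quote


def mint_fragment(element_id: Optional[str], text: str,
                  left_context: str = "", right_context: str = "",
                  context_length: int = 10) -> str:
    """Return a fragment identifier for an annotated span of text."""
    # Strategy 1: point back to the HTML element's id, if there is one.
    if element_id:
        return "#" + element_id
    # Strategy 2: mint a fragment from the text and its nearby context.
    # Assumed layout: hash_<context length>_<text length>_<md5>_<text prefix>.
    left = left_context[-context_length:] if context_length else ""
    right = right_context[:context_length]
    digest = hashlib.md5(f"{left}({text}){right}".encode("utf-8")).hexdigest()
    return f"#hash_{context_length}_{len(text)}_{digest}_{quote(text[:20])}"


def annotates_meta(fragment: str) -> str:
    """Serialize the back-reference like the <meta> examples above."""
    return f'<meta property="its:annotates" resource="{escape(fragment, quote=True)}" />'
```

Because the minted fragment depends only on the text and its context, a consumer could regenerate it at RDFa consumption time, so producers would not need to serialize it.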
>>
>>         A question for people consuming RDF/RDFa - is defining this
>>         sort of "URI generation recipe" at the RDFa consumption stage
>>         breaking too many assumptions? I'd like to avoid having
>>         producers generate redundant data.
>>
>>         .. and back to answering "how much RDF do we need"?
>>         My reason for considering RDFa was to encode the additional
>>         information we might have about the concepts that are behind
>>         the text. Right now the most important uses are:
>>         - the URI of the concept (the "means" relation);
>>         - the type URI of the concept (see ISSUE-3) (the "this
>>         fragment represents a concept of the type" relation);
>>         - the labels of the concept in other languages;
>>
>>         Since we can model those via the proposed data categories, we
>>         don't need explicit RDF support to represent this - it is
>>         however very important that these predicates can point to
>>         URIs in the RDF space (as is currently the case with
>>         its:termInfoRef, for instance), and that we at least have a
>>         process in place for transforming "HTML5+ITS" into HTML5/RDFa,
>>         /Microdata, or /RDFa Lite. Right now the examples you
>>         submitted look good for that purpose; adding an HTML URI
>>         generator should cover that part.
>>
>>         -- Tadej
>>
>>
>>
>>>
>>>             Anyway, we probably shouldn't spend much time on mappings
>>>             as I can't
>>>             imagine anyone using RDFa/microdata in preference to simple
>>>             attributes.
>>>
>>>
>>>         I hope that the mapping can be fairly mechanical and will
>>>         not need much time. Even if it is not created by hand, I can
>>>         imagine tools like Enrycher that can easily generate it.
>>>         Having then a mapping of Enrycher output as an input to
>>>         schema.org <http://schema.org> based SEO is a nice scenario,
>>>         IMO, but it depends on RDFa/microdata.
>>>
>>>         Felix
>>>
>>>
>>>                                            Jirka
>>>
>>>             --
>>>             ------------------------------------------------------------------
>>>              Jirka Kosek      e-mail: jirka@kosek.cz
>>>             <mailto:jirka@kosek.cz> http://xmlguru.cz
>>>             ------------------------------------------------------------------
>>>                   Professional XML consulting and training services
>>>              DocBook customization, custom XSLT/XSL-FO document
>>>             processing
>>>             ------------------------------------------------------------------
>>>              OASIS DocBook TC member, W3C Invited Expert, ISO
>>>             JTC1/SC34 member
>>>             ------------------------------------------------------------------
>>>
>>>
>>>
>>>
>>>         -- 
>>>         Felix Sasaki
>>>         DFKI / W3C Fellow
>>>
>>
>>
>>
>>
>>     -- 
>>     Felix Sasaki
>>     DFKI / W3C Fellow
>>
>
>
>
>
>
> -- 
> Felix Sasaki
> DFKI / W3C Fellow
>
--
Najib Tounsi,
W3C Office, Morocco
Received on Friday, 23 March 2012 14:41:18 UTC