Re: Resolution proposal for ISSUE-2 from Felix Sasaki on 2012-03-22 (public-multilingualweb-lt@w3.org from March 2012)

From: Felix Sasaki <fsasaki@w3.org>
Date: Thu, 22 Mar 2012 18:17:26 +0100
To: Tadej Stajner <tadej.stajner@ijs.si>
Cc: public-multilingualweb-lt@w3.org
Message-ID: <CAL58czpALT0da5=n7jPS9xb+Nji8yPD+GRuaJn=CO1hbVijPGw@mail.gmail.com>
Thank you, Tadej. Trying to summarize what you say: we need

1) HTML5 + ITS (or XYZ) schema
2) Algorithm for transforming "HTML5+ITS" into HTML5/RDFa, /Microdata, or
/RDFa Lite. Could we say we just cover RDFa Lite?
3) Algorithm (what you wrote below) to generate URIs in RDFa
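
To make item 3 concrete, here is a rough sketch of the two strategies from your mail below: point back to the element's id if there is one, otherwise mint a hash-based fragment. The function names, the field order of the hash fragment (context length, string length, MD5 digest, leading characters) and the use of MD5 are assumptions loosely modeled on the shape of your "#hash_1_3_..._bar" example, not a confirmed NIF recipe.

```python
import hashlib
from html import escape
from typing import Optional
from urllib.parse import quote


def mint_fragment_uri(text: str, element_id: Optional[str] = None,
                      context_length: int = 0, prefix_length: int = 10) -> str:
    """Return a fragment identifier for an annotated piece of text.

    Hypothetical helper: the hash layout below is an assumption,
    not a normative NIF recipe.
    """
    # Strategy 1: reuse the HTML element's id when it exists.
    if element_id:
        return "#" + quote(element_id)
    # Strategy 2: mint a hash-based fragment from the text itself.
    digest = hashlib.md5(text.encode("utf-8")).hexdigest()
    prefix = quote(text[:prefix_length], safe="")
    return f"#hash_{context_length}_{len(text)}_{digest}_{prefix}"


def annotates_meta(fragment_uri: str) -> str:
    """Serialize the back-reference as the meta element used in the thread."""
    resource = escape(fragment_uri, quote=True)
    return f'<meta property="its:annotates" resource="{resource}" />'


print(annotates_meta(mint_fragment_uri("bar", element_id="myElement")))
print(annotates_meta(mint_fragment_uri("bar", context_length=1)))
```

Since both functions depend only on the text and the element id, a consumer could run them at the RDFa consumption stage, so producers would not need to serialize the references.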

Your question about "A question for people consuming RDF/RDFa" still needs
an answer, but otherwise I think we are done with this. Any thoughts from
others, especially implementers in the group?

Felix

On 22 March 2012 at 15:47, Tadej Stajner <tadej.stajner@ijs.si> wrote:

>  On 3/22/2012 2:11 PM, Felix Sasaki wrote:
>
>
>
> On 22 March 2012 at 13:52, Jirka Kosek <jirka@kosek.cz> wrote:
>
>> On 22.3.2012 13:09, Felix Sasaki wrote:
>>
>> > Solution 1) will be user friendly, and we will define a RELAX NG schema
>> > for HTML5+ITS (or + XYZ). The same approach has been taken for ARIA in
>> > the accessibility space, and ARIA is now even part of the HTML5 core
>> > language.
>> >
>> > Comments are very welcome. I hope we can agree on this during next week's
>> > call and find a volunteer for maintaining the schema and another one for
>> > the mappings.
>>
>>  I volunteer for creating and maintaining the schema.
>>
>
>  Great, thanks a lot.
>
>>
>> > Regarding the "URIs for element nodes in HTML5" discussion: Ivan said that
>> > our group should consider whether this is really an issue.
>>
>>  I would have expected a more decisive reply from the SW activity lead :-)
>>
>
>  Well, to be fair, he was more precise:
>
>  "RDFa does not include any definition, as far as the extracted RDF is
> concerned, on pointing 'back' to the original source structure. This should
> be done explicitly. I am not sure whether this is a major issue, this is
> something for the group to consider..."
>
>  But the essence is the same: is it important for us?
>
>
>
> Some things to add (and to shed some light on ACTION-32):
>
> I think it's important to define a way to do it, but not to make
> serializing it obligatory, because it has zero utility until someone
> actually uses it in pure RDF. The thing is, as long as the HTML document is
> available and the RDFa is inlined, the references to the HTML structure in
> RDF don't add any additional information and can be trivially
> reconstructed. RDFa consumption tools can likely handle that kind of
> content as-is.
>
> The tricky case is when someone at some point wants to get pure RDF from
> this (dropping the HTML in the process). For that, we should have some
> specification that they could follow to produce these references. The use
> case I can think of is feeding ITS-marked-up input into an NLP pipeline
> running on something like NIF, which needs URIs for annotated fragments of
> text. Luckily the conversion itself is pretty mechanical, so I see two
> strategies for minting URIs that dereference directly to the fragment:
> * have the RDF node point back to the HTML element's id, if there is any
> (<meta property="its:annotates" resource="#id_myElement_bar" />)
> * have the RDF node mint a URI for the fragment using one of the NIF
> recipes (<meta property="its:annotates"
> resource="#hash_1_3_12341234123412341_bar" />)
>
> A question for people consuming RDF/RDFa - is defining this sort of "URI
> generation recipe" at the RDFa consumption stage breaking too many
> assumptions? I'd like to avoid having producers generate redundant data.
>
> ... and back to answering "how much RDF do we need?"
> My reason for considering RDFa was to encode the additional information we
> might have about the concepts that are behind the text. Right now the most
> important uses are:
> - the URI of the concept (the "means" relation);
> - the type URI of the concept (see ISSUE-3) (the "this fragment represents
> a concept of the type" relation);
> - the labels of the concept in other languages;
>
> Since we can model those via the proposed data categories, we don't need
> explicit RDF support to represent this. It is, however, very important that
> these predicates can point to URIs in the RDF space (as is currently the
> case with its:termInfoRef, for instance), and that we at least have a
> process in place for transforming "HTML5+ITS" into HTML5/RDFa, /Microdata,
> or /RDFa Lite. Right now the examples you submitted look good for that
> purpose; adding an HTML URI generator should cover that part.
>
> -- Tadej
>
>
>
>
>> Anyway we probably shouldn't spend much time on mappings as I can't
>> imagine anyone using RDFa/microdata instead of simple attributes.
>>
>
>  I hope that the mapping can be fairly mechanical and will not need much
> time. Even if it is not created by hand, I can imagine tools like Enrycher
> that can easily generate it. Then having a mapping of Enrycher output as an
> input to schema.org-based SEO is a nice scenario, IMO, but it depends on
> RDFa/microdata.
>
>  Felix
>
>
>>
>>                                Jirka
>>
>> --
>> ------------------------------------------------------------------
>>  Jirka Kosek      e-mail: jirka@kosek.cz      http://xmlguru.cz
>> ------------------------------------------------------------------
>>       Professional XML consulting and training services
>>  DocBook customization, custom XSLT/XSL-FO document processing
>> ------------------------------------------------------------------
>>  OASIS DocBook TC member, W3C Invited Expert, ISO JTC1/SC34 member
>> ------------------------------------------------------------------
>>
>>
>
>
>  --
> Felix Sasaki
> DFKI / W3C Fellow
>
>
>


-- 
Felix Sasaki
DFKI / W3C Fellow
Received on Thursday, 22 March 2012 17:17:57 UTC