
Re: agenda+ referencing ontology (Re: ISSUE-119: ITS RDF Ontology creation [MLW-LT Standard Draft])

From: Phil Ritchie <philr@vistatec.ie>
Date: Thu, 18 Apr 2013 17:05:23 +0100
To: Felix Sasaki <fsasaki@w3.org>
Cc: Dave Lewis <dave.lewis@cs.tcd.ie>, Jirka Kosek <jirka@kosek.cz>, MultilingualWeb-LT Working Group <public-multilingualweb-lt@w3.org>
Message-ID: <OFA5C7CFC2.C21260F4-ON80257B51.005832D2-80257B51.0058622C@vistatec.ie>
Thanks for the really comprehensive answer Felix.

I understand aspects of this but don't yet quite have all of it mapped 
clearly in my mind. This weekend's study!

Phil.





From:   Felix Sasaki <fsasaki@w3.org>
To:     Phil Ritchie <philr@vistatec.ie>, 
Cc:     Dave Lewis <dave.lewis@cs.tcd.ie>, Jirka Kosek <jirka@kosek.cz>, 
MultilingualWeb-LT Working Group <public-multilingualweb-lt@w3.org>
Date:   17/04/2013 09:37
Subject:        Re: agenda+ referencing ontology (Re: ISSUE-119: ITS RDF 
Ontology  creation  [MLW-LT Standard Draft])



Hi Phil,

On 17.04.13 09:31, Phil Ritchie wrote:
Felix

Does NIF have wider adoption than RDF? 

NIF is an RDF-based format. That is, the relation between NIF and RDF is 
like that between XHTML and XML, or between XLIFF and XML.

We use NIF in ITS2 to connect ITS information in markup (XML, HTML5) with 
an RDF representation. See

http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#conversion-to-nif

and a full example input HTML5 at
http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#EX-HTML-whitespace-normalization

RDF output using NIF and the ITS2 ontology at
http://www.w3.org/International/multilingualweb/lt/drafts/its20/examples/nif/EX-nif-conversion-output.xml


The purpose of the ITS2 ontology is not to relate the RDF representation 
to the markup (XML, HTML5), since NIF does that, but to identify the ITS2 
properties in an RDF manner, that is, with RDF predicates.

There is an interconnection between NIF and the ITS ontology. See this 
example generated from a part of
http://www.w3.org/International/multilingualweb/lt/drafts/its20/examples/nif/EX-nif-conversion-output.xml


<http://example.com/exampledoc.html#char=11,17> nif:anchorOf "Dublin";
    nif:referenceContext <http://example.com/exampledoc.html#char=0,29>;
    a nif:RFC5147String;
    itsrdf:taIdentRef <http://dbpedia.org/resource/Dublin>;
    itsrdf:translate "no";
    itsrdf:withinText "yes".

This statement

<http://example.com/exampledoc.html#char=11,17> nif:anchorOf "Dublin".

relates the HTML5 document to the RDF representation. To anchor this 
relation in the NIF RDF vocabulary, we have this statement

<http://example.com/exampledoc.html#char=11,17> a nif:RFC5147String.
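
As a rough illustration of how the #char=11,17 fragment ties the URI to 
the text, here is a minimal Python sketch. The sample text is an 
assumption (it is not quoted in this thread; it was chosen only because 
it is 29 characters long, matching #char=0,29), and the helper names 
char_range and anchor_of are hypothetical.

```python
import re

# Assumed sample text: not quoted in this thread, chosen only because it is
# 29 characters long and has "Dublin" at character positions 11 to 17.
CONTEXT = "Welcome to Dublin in Ireland!"


def char_range(uri):
    """Return (start, end) from a trailing '#char=start,end' fragment."""
    match = re.search(r"#char=(\d+),(\d+)$", uri)
    if match is None:
        raise ValueError("no char fragment in " + uri)
    return int(match.group(1)), int(match.group(2))


def anchor_of(uri, context):
    """Return the substring of the context that the URI fragment selects."""
    start, end = char_range(uri)
    # The fragment counts positions between characters, so it maps
    # directly onto a Python slice.
    return context[start:end]


dublin = anchor_of("http://example.com/exampledoc.html#char=11,17", CONTEXT)
```

Under these assumptions, dublin holds the same string that nif:anchorOf 
carries in the statement above.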

The actual ITS ontology statements are these three. They have the same 
subject as the NIF statements above. That creates the aforementioned 
relation between NIF and ITS2.
<http://example.com/exampledoc.html#char=11,17> itsrdf:taIdentRef 
<http://dbpedia.org/resource/Dublin>.
<http://example.com/exampledoc.html#char=11,17> itsrdf:translate "no".
<http://example.com/exampledoc.html#char=11,17> itsrdf:withinText "yes".

Now, if you want to process this in SPARQL, asking for all non-translatable 
items, you would write something like this:

SELECT ?nonTranslatableItems
WHERE { ?nonTranslatableItems <http://www.w3.org/2005/11/its/rdf#translate> "no" }

and get as a result
http://example.com/exampledoc.html#char=23,30
http://example.com/exampledoc.html#char=11,17
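
For readers without a SPARQL engine at hand, the same single triple 
pattern can be sketched in plain Python. This is only an illustration of 
the matching idea, not how a SPARQL processor works internally; the 
in-memory triple list and the function name subjects_with are 
hypothetical, and one triple is invented to show a non-matching case.

```python
ITSRDF = "http://www.w3.org/2005/11/its/rdf#"
DOC = "http://example.com/exampledoc.html"

# Triples as (subject, predicate, object) tuples. The first three follow
# the statements and results quoted in this message.
TRIPLES = [
    (DOC + "#char=11,17", ITSRDF + "translate", "no"),
    (DOC + "#char=11,17", ITSRDF + "withinText", "yes"),
    (DOC + "#char=23,30", ITSRDF + "translate", "no"),
    # Hypothetical extra triple, added only so one subject does not match.
    (DOC + "#char=0,29", ITSRDF + "translate", "yes"),
]


def subjects_with(triples, predicate, value):
    """Return subjects that have the given predicate with the given literal."""
    return [s for s, p, o in triples if p == predicate and o == value]


non_translatable = subjects_with(TRIPLES, ITSRDF + "translate", "no")
```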

Does this make sense and would it work for what you have in mind?

Best,

Felix

I understand from what I've read that it is maybe easier to read, more 
compact?

Phil 



On 17 Apr 2013, at 08:22, "Felix Sasaki" <fsasaki@w3.org> wrote:

Hi Dave, Phil, all,

I have put the ontology on the w3c server. The namespace 
http://www.w3.org/2005/11/its/rdf#
or
http://www.w3.org/2005/11/its/rdf#translate
resolve with 303 "see other" to
http://www.w3.org/2005/11/its/rdf-content/its-rdf.rdf (in RDF/XML version)
or
http://www.w3.org/2005/11/its/rdf-content/its-rdf.html
In the latter we can put some more documentation, but for the time being 
what is there is sufficient.

Can you discuss today whether people would agree with this? Note that we 
then should define the namespace for the ontology also in
http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#notation

and this would mean that we reference the ontology normatively. If people 
agree with this, could you give me an action item to add the ontology URI 
during today's call?

Note for all implementers: this would influence you only if you 
implement the NIF conversion. Currently that is Sebastian and me. Anybody 
else?

Best,

Felix

On 17.04.13 09:04, Phil Ritchie wrote:
Dave

I certainly want to work on transforming some XLIFF with ITS LQI and 
Provenance data into RDF, so I'd like to chip in with this.

I'm not sure I have all of the understanding necessary though - 
particularly around schema creation and validation.

Would it be worthwhile having a conf. call to get on the same page? I 
should be on today's call so we could chat then.

I would like to participate in that discussion, but I can't be on the call 
today. Feel free to discuss it, and hopefully we can bring up the topic 
again next week, or on a separate, dedicated call. Would you be available, 
Phil?

Best,

Felix



Phil 
Twitter: philinthecloud 
Skype: philviathecloud


On 17 Apr 2013, at 01:38, "Dave Lewis" <dave.lewis@cs.tcd.ie> wrote:

Hi Jirka, Felix, Sebastian, all,

I've updated ITS-RDF ontology as follows:

1) I agree with Felix's comment to remove custom XML schema types for 
attributes, as RDF platforms in general don't validate against these, 
and instead just specify the simple XML schema type as appropriate, e.g. 
xsd:string, xsd:anyURI, xsd:decimal, xsd:nonNegativeInteger, xsd:integer

2) for data categories with standoff markup I've introduced a class to 
allow the correct grouping of individual attributes to a specific item. 
These classes are ProvRecord and LocalizationQualityIssue

3) for annotatorsRef I have just introduced individual attributes for each 
data category where it applies, namely: termAnnotatorsRef, 
taAnnotatorsRef, mtConfidenceAnnotatorsRef

4) I've omitted anything related to Ruby

I believe this is consistent with the NIF related text in the current 
draft.

I've attached the ontology as a Turtle file, and have updated the same on:
http://www.w3.org/International/multilingualweb/lt/wiki/ITS-RDF_mapping 

If we can firm up on this then I propose documenting it in a more 
accessible format as per W3C norms. In addition we will need some best 
practice guidance on using this ontology with at least both NIF and 
PROV-O. I'm happy to work on these also, though all other inputs welcome.

Regards,
Dave



On 29/03/2013 13:37, Jirka Kosek wrote:
Hi Dave,

On the last telcon I was tasked to "refresh" and try to move
forward some issues. Could you please implement the changes below in the
proposed ITS RDF Ontology?

Thanks,

                                                                 Jirka

On 25.2.2013 9:04, MultilingualWeb-LT Working Group Issue Tracker wrote:

mlw-lt-track-ISSUE-119: ITS RDF Ontology creation [MLW-LT Standard Draft]

http://www.w3.org/International/multilingualweb/lt/track/issues/119

Raised by: Felix Sasaki
On product: MLW-LT Standard Draft

Dave started an ITS RDF Ontology. See
http://www.w3.org/International/multilingualweb/lt/wiki/ITS-RDF_mapping#Ontology_.28DRAFT.29

This is useful for the NIF conversion.

There was an offline discussion about this, including Dave, Leroy, 
Sebastian and me.

Some thoughts about the ontology currently at
http://www.w3.org/International/multilingualweb/lt/wiki/ITS-RDF_mapping#Ontology_.28DRAFT.29


- the ontology uses various RDF classes that are not defined, e.g. 
"itstype:its-taConfidence.type" is identified as a class via
"rdf:type itstype:its-taConfidence.type"
So *if* one wants to use "itstype:its-taConfidence.type" as a class, you'd 
also need
itstype:its-taConfidence.type rdf:type rdfs:Class

- class names normally start with an upper-case letter, so
"its-taConfidence.type" would be
"Its-taConfidence.type"

- As said in the offline thread (sorry for the repetition, guys), I would 
not define such classes at all. It would be sufficient to define no class 
at all: just use NIF URIs, and then have statements like this

someNIFBasedSubjectUri
                 its:locQualityIssueComment[1] "'c'es' is unknown. Could be 'c'est'";
                 its:locQualityIssueEnabled[1] "yes";
                 its:locQualityIssueSeverity[1] "50";
                 its:locQualityIssueType "misspelling".

The RDF predicates would take as a domain a NIF URI, and as the range an 
XML literal (or HTML literal, if we use RDF 1.1).
This approach has also the advantage that you can convert the test suite 
output easily to RDF "instance" data.
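
The class-free approach in the last point can be sketched as follows: 
each ITS property becomes one triple whose subject is the NIF URI and 
whose object is a plain literal. This is a minimal Python sketch under 
stated assumptions: the function name to_ntriples and the subject URI are 
hypothetical, the namespace is the one proposed elsewhere in this thread 
(http://www.w3.org/2005/11/its/rdf#), and only backslashes and double 
quotes are escaped.

```python
ITSRDF = "http://www.w3.org/2005/11/its/rdf#"


def to_ntriples(subject_uri, properties):
    """Serialize ITS property/value pairs as N-Triples lines, one per pair."""
    lines = []
    for name, value in properties.items():
        # Escape backslashes first, then double quotes, inside the literal.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        lines.append('<%s> <%s%s> "%s" .' % (subject_uri, ITSRDF, name, escaped))
    return lines


# Hypothetical NIF subject URI; the values mirror the statements above.
issue = {
    "locQualityIssueType": "misspelling",
    "locQualityIssueSeverity": "50",
    "locQualityIssueEnabled": "yes",
}
lines = to_ntriples("http://example.com/exampledoc.html#char=0,5", issue)
```

No class declaration is needed for this output, which is the point of the 
suggestion: the predicates alone carry the ITS information.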

- Felix






<itsrdf.ttl>

************************************************************
VistaTEC Ltd. Registered in Ireland 268483. 
Registered Office, VistaTEC House, 700, South Circular Road, 
Kilmainham. Dublin 8. Ireland. 

The information contained in this message, including any accompanying 
documents, is confidential and is intended only for the addressee(s). 
The unauthorized use, disclosure, copying, or alteration of this 
message is strictly forbidden. If you have received this message in
error please notify the sender immediately.
************************************************************


Received on Thursday, 18 April 2013 16:05:56 UTC
