- From: Sebastian Hellmann <hellmann@informatik.uni-leipzig.de>
- Date: Sun, 26 May 2013 20:27:34 +0200
- To: Felix Sasaki <fsasaki@w3.org>
- CC: "public-multilingualweb-lt@w3.org" <public-multilingualweb-lt@w3.org>, Phil Ritchie <philr@vistatec.ie>, Leroy Finn <finnle@tcd.ie>
- Message-ID: <51A25416.5080906@informatik.uni-leipzig.de>
Hi Felix,

a) nif:anchorOf will be optional (MAY) in the future, as it is redundant; nif:beginIndex and nif:endIndex are preferred (SHOULD). Neither is strictly required; the rationale for preferring the indices is that they are more useful in the processing chain. anchorOf can easily be calculated in most programming languages:

    String c = "abc".substring(2,3);

Actually, we should consider including beginIndex and endIndex in the example as well, although the example might get bloated.

b) nif:beginIndex "11" nif:endIndex "17" are intended to provide an explicit representation of #char=11,17. This is good for querying (e.g. SPARQL FILTER with <, > or =) and for processing. The #char= scheme is taken from RFC 5147, which regulates exactly how the positions should be counted:
http://tools.ietf.org/html/rfc5147#section-2.2.1

beginIndex and endIndex have two open issues:

    nif:beginIndex
        a owl:DatatypeProperty ;
        vs:term_status "testing" ;
        rdfs:label "begin index"@en ;
        rdfs:comment """The begin index of a character range as defined in
    http://tools.ietf.org/html/rfc5147#section-2.2.1 and
    http://tools.ietf.org/html/rfc5147#section-2.2.2, measured as the gap
    between two characters, starting to count from 0 (the position before
    the first character of a text).
    Example: Index "2" is the postion between "Mr" and "." in "Mr. Sandman".
    Note: RFC 5147 is re-used for the definition of character ranges.
    RFC 5147 is assuming a text/plain MIME type. NIF builds upon Unicode
    and is content agnostic.
    Requirement (1): This property has the same value the "Character
    position" of RFC 5147 and it must therefore be an
    xsd:nonNegativeInteger .
    Requirement (2): The index of the subject string MUST be calculated
    relative to the nif:referenceContext of the subject. If available,
    this is the rdf:Literal of the nif:isString property.""" ;
        # still being discussed:
        # rdfs:subPropertyOf oa:start ;
        rdfs:range <http://www.w3.org/2001/XMLSchema#nonNegativeInteger> ;
        rdfs:domain nif:String .

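To illustrate points a) and b), here is a minimal Java sketch of how a consumer might read the indices out of a #char=begin,end URI and recompute anchorOf from the reference context. The class name NifOffsets, the two method names and the example URI are hypothetical, and only the simple "begin,end" form of the fragment is handled. Counting Unicode code points instead of Java's UTF-16 code units is my assumption about how the RFC 5147 character positions should be interpreted; it is not something the definition above spells out.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NifOffsets {
    // Handles only the simple "#char=begin,end" form at the end of a URI.
    private static final Pattern CHAR_RANGE = Pattern.compile("#char=(\\d+),(\\d+)$");

    /** Returns {beginIndex, endIndex} parsed from a URI ending in "#char=begin,end". */
    static int[] parseCharFragment(String uri) {
        Matcher m = CHAR_RANGE.matcher(uri);
        if (!m.find()) {
            throw new IllegalArgumentException("no #char=begin,end fragment: " + uri);
        }
        return new int[] { Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)) };
    }

    /**
     * Recomputes the anchored text from the reference context.
     * Assumption: indices count Unicode code points, so they are first
     * translated into UTF-16 offsets before calling substring.
     */
    static String anchorOf(String context, int beginIndex, int endIndex) {
        int from = context.offsetByCodePoints(0, beginIndex);
        int to = context.offsetByCodePoints(from, endIndex - beginIndex);
        return context.substring(from, to);
    }

    public static void main(String[] args) {
        String context = "Mr. Sandman";
        // Hypothetical URI; index 4 is the gap before "S", index 11 the end of the text.
        int[] range = parseCharFragment("http://example.com/doc.txt#char=4,11");
        System.out.println(anchorOf(context, range[0], range[1])); // prints "Sandman"
    }
}
```

For plain ASCII text the result is the same as calling substring(begin, end) directly; the two only differ once the context contains characters outside the Basic Multilingual Plane.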
Issue b1)

    rdfs:subPropertyOf oa:start ;

The definition in Open Annotation is pretty weak at the moment. I am working together with them to clarify this [1]. You can safely ignore this, as we are focusing on RFC 5147. We will extend OA to match RFC 5147.

[1] http://lists.w3.org/Archives/Public/public-openannotation/2013May/0038.html

Issue b2) can best be answered by you, I guess:

    rdfs:range <http://www.w3.org/2001/XMLSchema#nonNegativeInteger> ;

We were also considering xsd:int or xsd:long, or having no range at all. xsd:nonNegativeInteger is unbounded, but it is derived from xsd:decimal. For memory consumption xsd:int would be best, but this would limit it to 2GB text files. I lack experience with how well implementations optimize for this, or whether the range is only used for validation.

c) nif:convertedFrom is now nif:wasConvertedFrom
-> easier to understand
-> matches prov:wasDerivedFrom
-> "Current state" wasConvertedFrom "former state"
Correct in the examples

d) The prefix can be removed:

    @prefix its: <http://www.w3.org/2005/11/its> .

e) Blank nodes are not my favorite, but they are unavoidable in this scenario. Ideally you can use a more elegant notation for writing them in Turtle:

    <http://example.com/exampledoc.html#char=114,127>
        nif:anchorOf "tranport inc." ;
        nif:convertedFrom <http://example.com/exampledoc.html#xpath(/doc/para%5B1%5D/span%5B2%5D)> ;
        nif:referenceContext <http://example.com/exampledoc.html#char=0,180> ;
        a nif:RFC5147String ;
        itsrdf:hasLocQualityIssue [
            a itsrdf:LocQualityIssue ;
            itsrdf:locQualityIssueComment "should be 'transport include'" ;
            itsrdf:locQualityIssueProfileRef <http://example.org/qaMovel/v1> ;
            itsrdf:locQualityIssueSeverity "75"
        ] .

All the best,
Sebastian

On 26.05.2013 18:40, Felix Sasaki wrote:
> Hi Sebastian, all, putting Phil and Leroy into CC since they are
> interested in the NIF testing topic,
>
> thank you for looking into the NIF section.
> This is now
> https://www.w3.org/International/multilingualweb/lt/track/issues/125
> Most of the issues look pretty clear. All: I will implement them in
> the spec on Tuesday if there are no further comments.
>
> One question, though: are the properties
> nif:beginIndex
> and
> nif:endIndex
> stable, and do they express the same as "#char" in URIs? That is,
> nif:beginIndex "11"
> nif:endIndex "17"
> is equal to #char=11,17
> ?
>
> FYI, I added input and output files for testing the NIF conversion, see
> https://github.com/finnle/ITS-2.0-Testsuite/tree/master/its2.0/nif-conversion
> At Leroy and Phil: since Phil asked for localization quality issue
> (and XML) as test files and there was no other request, the input
> files are all LQI. That also leads to blank nodes in the output, since
> LQI in itsrdf requires these. Sebastian, the order of
> "nif:convertedFrom" should be correct in the output
> https://github.com/finnle/ITS-2.0-Testsuite/tree/master/its2.0/nif-conversion/expected
> Let me know if something is wrong.
>
> Best,
>
> Felix
>
> On 26.05.13 15:12, Sebastian Hellmann wrote:
>> Dear all,
>> We have recently produced a PDF document which gives a pretty good
>> overview of NIF:
>> http://svn.aksw.org/papers/2013/ISWC_NIF/public.pdf
>>
>> Furthermore, I have read the ITS 2.0 draft very closely once more
>> and brushed up everything regarding NIF. There aren't any significant
>> changes. Let's say we are going for "extra credit" ;)
>> Please find a list of issues here:
>> https://docs.google.com/document/d/1VagqM-Ty69mPYh0wHfkNTVUndOO34ub5cO4X9Eo572Q/edit#
>>
>> I will have a look at the other sections soon.
>>
>> All the best,
>> Sebastian
>>
>>
>> --
>> Dipl. Inf. Sebastian Hellmann
>> Department of Computer Science, University of Leipzig
>> Events: NLP & DBpedia 2013 (http://nlp-dbpedia2013.blogs.aksw.org,
>> Deadline: *July 8th*)
>> Come to Germany as a PhD: http://bis.informatik.uni-leipzig.de/csf
>> Projects: http://nlp2rdf.org , http://linguistics.okfn.org ,
>> http://dbpedia.org/Wiktionary , http://dbpedia.org
>> Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
>> Research Group: http://aksw.org
>

--
Dipl. Inf. Sebastian Hellmann
Department of Computer Science, University of Leipzig
Events: NLP & DBpedia 2013 (http://nlp-dbpedia2013.blogs.aksw.org, Deadline: *July 8th*)
Come to Germany as a PhD: http://bis.informatik.uni-leipzig.de/csf
Projects: http://nlp2rdf.org , http://linguistics.okfn.org , http://dbpedia.org/Wiktionary , http://dbpedia.org
Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
Research Group: http://aksw.org
Received on Sunday, 26 May 2013 18:28:08 UTC