- From: Sebastian Hellmann <hellmann@informatik.uni-leipzig.de>
- Date: Fri, 06 Sep 2013 12:36:52 +0200
- To: Felix Sasaki <fsasaki@w3.org>
- CC: Dave Lewis <dave.lewis@cs.tcd.ie>, public-multilingualweb-lt@w3.org
- Message-ID: <5229B044.8050702@informatik.uni-leipzig.de>
Hi Felix,

Percent encoding should be fine. I updated it here as well:
https://dl.dropboxusercontent.com/u/375401/tmp/EX-nif-conversion-output.xml

Note that I replaced:
<http://example.com/exampledoc.html>
with
<http://example.com/doc.html>
everywhere. One reason was that exampledoc.html was misspelled as exampldoc.html at some points, and "example" occurs twice in the URI.

Furthermore, there is one small mistake in

# we can attach the metadata to the parent node:
<b its-ta-ident-ref="http://dbpedia.org/resource/Dublin" translate="no">Ireland</b>

should be

# we can attach the metadata to the parent node:
<b its-ta-ident-ref="http://dbpedia.org/resource/Ireland" translate="no">Ireland</b>

Also, since it is informative now, we could link to the persistent URI for the NIF service implementation spec as "further reading":
http://persistence.uni-leipzig.org/nlp2rdf/specification/api.html

All the best,
Sebastian

On 06.09.2013 12:15, Felix Sasaki wrote:
> Hi Sebastian, Dave, all,
>
> I have made all edits, see
> http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#conversion-to-nif
> http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#nif-backconversion
> http://www.w3.org/International/multilingualweb/lt/drafts/its20/examples/nif/EX-nif-conversion-output.xml
>
> The issue around "[" and "]" is not resolved yet. But besides that
> everything (including Sebastian's note) should be ok. Dave, besides
> the test suite update you described at
>
> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Sep/0014.html
>
> this should be everything, right?
>
> Best,
>
> Felix
>
>
> On 06.09.13 11:39, Sebastian Hellmann wrote:
>> Ok, here is the updated example file:
>> https://dl.dropboxusercontent.com/u/375401/tmp/EX-nif-conversion-output.xml
>>
>> There is a problem, however.
>> [ and ] are not allowed in the query component, so
>> rdf:resource="http://example.com/myitsservice?informat=html&intype=url&input=http://example.com/doc.html&xpath=/html/body[1]/h2[1]"
>> violates https://tools.ietf.org/html/rfc3986#appendix-A
>>
>> pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
>> query = *( pchar / "/" / "?" )
>> fragment = *( pchar / "/" / "?" )
>> pct-encoded = "%" HEXDIG HEXDIG
>> unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
>> reserved = gen-delims / sub-delims
>> gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@"
>> sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
>>
>> I am not an XPath expert. Could we do it like this?
>> xpath=/html/body/1/h2/1
>>
>>
>> NIF 2.0 has become quite delayed. I am one of the main coordinators,
>> which makes the standardization process lightweight, but I am also a
>> bottleneck. One of the reason for the delay is my timely contribution
>> here via email and meetings. I hope this is an acceptable trade off.
>> The consequence is, that most of NIF 2.0 is not well documented, yet,
>> although it is getting better day by day.
>>
>> All the best,
>> Sebastian
>>
>>
>> On 06.09.2013 11:05, Sebastian Hellmann wrote:
>>> Hi Dave,
>>>
>>> On 05.09.2013 13:19, Dave Lewis wrote:
>>>> Following decision on the 4th December call to opt for a query
>>>> style URL for the NIF string in RDF (which will also be supported
>>>> in NIF) when defining the mapping the following need to be changed
>>>> in the spec:
>>>>
>>>> 1) all occurrences of RDF URLs with #char or #xpath fragments to be
>>>> changed to a query style as suggested by the RDF group and expanded
>>>> on by Felix in
>>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Sep/0000.html
>>>>
>>>> i.e.
>>>> all fragment identifiers for NIF strings in annex F and G
>>>> should be changed from, e.g.:
>>>>
>>>> one other associated question is, as we are using the query type to get around the limitations of RFC 5147
>>>> char fragment in working with XML and HTML, is it still appropriate after the above change to type the NIF string in the example with
>>>> the subclass nif:RFC5147String? Sebastien? e.g.
>>>>
>>>> http://example.com/myitsservice?input=http://example.com/exampldoc.html&char=0,11
>>>> rdf:type nif:RFC5147String;
>>>
>>> I am currently working on a formal ABNF definition for this, but you
>>> can consider it to be like this in Java:
>>>
>>> String prefix =
>>> "http://example.com/myitsservice?informat=html&intype=url&input=http://example.com/exampldoc.html&";
>>> String identifier = "char=0,11";
>>> String uri = prefix + identifier;
>>>
>>> // only identifier has to have the syntax given by the rdf:type
>>> validate("nif:RFC5147String", identifier);
>>>
>>> So the syntax is only relevant for the identifier part. These would
>>> be alternative prefixes as well:
>>> String prefixOption1 =
>>> "http://example.com/myitsservice/informat/html/intype/url/input/http://example.com/exampldoc.html/";
>>> String prefixOption2 =
>>> "http://example.com/myitsservice?informat=html&intype=url&input=http://example.com/exampldoc.html#";
>>> String prefixOption3 =
>>> "http://example.com/myitsservice?informat=html&intype=url&input=http://example.com/exampldoc.html&";
>>>
>>> It really doesn't matter and all three are valid RDF (The first one
>>> is a bit awkward, of course)
>>>
>>>
>>>> 2) Once this is fixed we need to update the NIF part of the test suite and tests rerun by Felix, Leroy and Phil
>>>
>>> As written above, this is not strictly necessary, but it is nice to
>>> be consistent.
>>>
>>>> 3) Add the following suggested note wording to the end of Annex
>>>> "Note: NIF allows URL for a String resource to be referenced as URIs that are fragments of the original document in the form:
>>>> http://example.com/exampledoc.html#char=0,11
>>>> or
>>>> http://example.com/exampledoc.html#xpath(/html/body[1]/h2[1]/text()[1])
>>>>
>>>> Though this offers a potentially convenient mechanism for linking NIF resources in RDF back to the original document, the char
>>>> fragment is defined currently only for text/plain while the xpath fragment is not defined for HTML. Therefore this URL
>>>> recipe does not fulfil the ITS requirements to support both XML and HTML and the aim of this mapping to produce resources adhering
>>>> to the Linked Data principle of dereferenceability. The future definition and registration of these fragment types, while a potentially
>>>> attractive feature, is beyond the scope of this specification."
>>>
>>>
>>> Let's change this like this:
>>> http://example.com/doc.html#xpath(/html/body[1]/h2[1]/text()[1])
>>> maps to
>>> http://example.com/myitsservice?informat=html&intype=url&input=http://example.com/doc.html#char=0,11
>>>
>>> Note that RDF is ok, with all Fragment Ids:
>>> http://www.w3.org/TR/rdf-concepts/#section-fragID
>>>
>>> RFC 3986 as well: http://tools.ietf.org/html/rfc3986#page-24
>>>
>>>
>>>> The semantics of a fragment identifier are defined by the set of
>>>> representations that might result from a retrieval action on the
>>>> primary resource. The fragment's format and resolution is therefore
>>>> dependent on the media type [RFC2046] of a potentially retrieved
>>>> representation, even though such a retrieval is only performed
>>>> if the
>>>> URI is dereferenced. If no such representation exists, then the
>>>> semantics of the fragment are considered unknown and are effectively
>>>> unconstrained.
>>>
>>>
>>> The text could be like this:
>>> "Note: NIF allows URL for a String resource to be referenced as URIs
>>> that are fragments of the original document in the form:
>>> http://example.com/myitsservice?informat=html&intype=url&input=http://example.com/doc.html#char=0,11
>>> or
>>> http://example.com/doc.html#xpath(/html/body[1]/h2[1]/text()[1])
>>>
>>> This offers a convenient mechanism for linking NIF resources in RDF
>>> back to the original document. RDF treats URIs as opaque and does
>>> not impose any semantic constraints on the used fragment
>>> identifiers, thus enabling their usage in RDF in a consistent
>>> manner. However, fragment identifiers get interpreted according to
>>> the retrieved mime type, if a retrieval action occurs as is the case
>>> in Linked Data. The char fragment is defined currently only for
>>> text/plain while the xpath fragment is not defined for HTML.
>>> Therefore this URL recipe does not fulfil the ITS requirements to
>>> support both XML and HTML and the aim of this mapping to produce
>>> resources adhering to the Linked Data principle of
>>> dereferenceability. The future definition and registration of these
>>> fragment types, while a potentially attractive feature, is beyond
>>> the scope of this specification."
>>>
>>> I will try to update the example in the spec as well.
>>>
>>> All the best,
>>> Sebastian
>>>
>>>> cheers,
>>>> Dave
>>>>
>>>>
>>>> On 03/09/2013 09:14, Felix Sasaki wrote:
>>>>> 1) last call item "RDF - NIF conversion".
>>>>> See
>>>>> https://www.w3.org/International/multilingualweb/lt/track/issues/131
>>>>> and these mails
>>>>> Phil
>>>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Sep/0001.html
>>>>>
>>>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Aug/0066.html
>>>>>
>>>>>
>>>>> Dave
>>>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Aug/0067.html
>>>>>
>>>>>
>>>>> Felix
>>>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Aug/0068.html
>>>>>
>>>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Sep/0000.html
>>>>>
>>>>>
>>>>> Goal: decide about the option 1) or 2) or something else (see a
>>>>> variation of option 2) in
>>>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Sep/0000.html
>>>>>
>>>>> IMPORTANT: even if you are not implementing ITS <> NIF, please
>>>>> state your opinion since tomorrow we want to form a working
>>>>> group opinion, to be able to move forward.
>>>>
>>>
>>>
>>> --
>>> Dipl. Inf. Sebastian Hellmann
>>> Department of Computer Science, University of Leipzig
>>> Events:
>>> * NLP & DBpedia 2013 (http://nlp-dbpedia2013.blogs.aksw.org,
>>> Extended Deadline: *July 18th*)
>>> * LSWT 23/24 Sept, 2013 in Leipzig (http://aksw.org/lswt)
>>> Come to Germany as a PhD: http://bis.informatik.uni-leipzig.de/csf
>>> Projects: http://nlp2rdf.org , http://linguistics.okfn.org ,
>>> http://dbpedia.org/Wiktionary , http://dbpedia.org
>>> Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
>>> Research Group: http://aksw.org
>>
>>
>> --
>> Dipl. Inf. Sebastian Hellmann
>> Department of Computer Science, University of Leipzig
>> Events:
>> * NLP & DBpedia 2013 (http://nlp-dbpedia2013.blogs.aksw.org, Extended
>> Deadline: *July 18th*)
>> * LSWT 23/24 Sept, 2013 in Leipzig (http://aksw.org/lswt)
>> Come to Germany as a PhD: http://bis.informatik.uni-leipzig.de/csf
>> Projects: http://nlp2rdf.org , http://linguistics.okfn.org ,
>> http://dbpedia.org/Wiktionary , http://dbpedia.org
>> Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
>> Research Group: http://aksw.org
>

--
Dipl. Inf. Sebastian Hellmann
Department of Computer Science, University of Leipzig
Events:
* NLP & DBpedia 2013 (http://nlp-dbpedia2013.blogs.aksw.org, Extended Deadline: *July 18th*)
* LSWT 23/24 Sept, 2013 in Leipzig (http://aksw.org/lswt)
Come to Germany as a PhD: http://bis.informatik.uni-leipzig.de/csf
Projects: http://nlp2rdf.org , http://linguistics.okfn.org ,
http://dbpedia.org/Wiktionary , http://dbpedia.org
Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
Research Group: http://aksw.org
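
The percent-encoding route mentioned at the top of this message could look roughly like the sketch below. It is only an illustration under assumptions: the class and method names (NifQueryUri, encodeQueryValue, xpathUri, hasValidQuery) are hypothetical and not part of NIF or ITS, the service prefix is simply copied from the example URIs in this thread, and Java 10 or later is assumed for URLEncoder.encode(String, Charset). The check implements only the "query" rule from the RFC 3986 Appendix A excerpt quoted above.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

class NifQueryUri {
    // Example service prefix copied from the URIs discussed in this thread.
    static final String PREFIX =
        "http://example.com/myitsservice?informat=html&intype=url"
        + "&input=http://example.com/doc.html&";

    // query = *( pchar / "/" / "?" ), following RFC 3986 Appendix A.
    static final Pattern QUERY = Pattern.compile(
        "(?:[A-Za-z0-9\\-._~!$&'()*+,;=:@/?]|%[0-9A-Fa-f]{2})*");

    // Percent-encodes one query parameter value. URLEncoder produces
    // form encoding, where a space becomes "+", so "+" is mapped to "%20".
    // It also encodes "/", which RFC 3986 does not require but permits.
    static String encodeQueryValue(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8).replace("+", "%20");
    }

    // Builds the query-style URI for an XPath expression.
    static String xpathUri(String xpath) {
        return PREFIX + "xpath=" + encodeQueryValue(xpath);
    }

    // Returns true if the query component uses only characters that the
    // RFC 3986 "query" rule allows.
    static boolean hasValidQuery(String uri) {
        int start = uri.indexOf('?');
        if (start < 0) {
            return true; // no query component at all
        }
        int end = uri.indexOf('#', start);
        String query = end < 0
            ? uri.substring(start + 1)
            : uri.substring(start + 1, end);
        return QUERY.matcher(query).matches();
    }

    public static void main(String[] args) {
        String raw = PREFIX + "xpath=/html/body[1]/h2[1]";
        String encoded = xpathUri("/html/body[1]/h2[1]");
        System.out.println(hasValidQuery(raw));     // false: "[" and "]" are not allowed
        System.out.println(encoded);                // ...&xpath=%2Fhtml%2Fbody%5B1%5D%2Fh2%5B1%5D
        System.out.println(hasValidQuery(encoded)); // true
    }
}
```

A consumer would percent-decode the xpath value before evaluating it, so the XPath expression itself stays unchanged, unlike the xpath=/html/body/1/h2/1 rewrite floated in the quoted mail.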
Received on Friday, 6 September 2013 10:37:21 UTC