- From: Felix Sasaki <fsasaki@w3.org>
- Date: Fri, 06 Sep 2013 11:46:16 +0200
- To: Sebastian Hellmann <hellmann@informatik.uni-leipzig.de>
- CC: Dave Lewis <dave.lewis@cs.tcd.ie>, public-multilingualweb-lt@w3.org
- Message-ID: <5229A468.70401@w3.org>
Hi Sebastian,

On 06.09.13 11:39, Sebastian Hellmann wrote:
> Ok, here is the updated example file:
> https://dl.dropboxusercontent.com/u/375401/tmp/EX-nif-conversion-output.xml

Thanks a lot for this. About the problem: why not just percent-escape "[" and "]"?

> There is a problem, however. [ and ] are not allowed in the query
> component, so
> rdf:resource="http://example.com/myitsservice?informat=html&intype=url&input=http://example.com/doc.html&xpath=/html/body[1]/h2[1]"
> violates https://tools.ietf.org/html/rfc3986#appendix-A
>
> pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
> query = *( pchar / "/" / "?" )
> fragment = *( pchar / "/" / "?" )
> pct-encoded = "%" HEXDIG HEXDIG
> unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
> reserved = gen-delims / sub-delims
> gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@"
> sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," /
> ";" / "="
>
> I am not an XPath expert. Could we do it like this?
> xpath=/html/body/1/h2/1

I would rather use the percent-escape approach.

> NIF 2.0 has become quite delayed. I am one of the main coordinators,
> which makes the standardization process lightweight, but I am also a
> bottleneck. One of the reasons for the delay is my timely contribution
> here via email and meetings. I hope this is an acceptable trade-off.
> The consequence is that most of NIF 2.0 is not well documented yet,
> although it is getting better day by day.

Indeed. And I am sure that really soon you won't need to look into this anymore.

- Felix

> All the best,
> Sebastian
>
> On 06.09.2013 11:05, Sebastian Hellmann wrote:
>> Hi Dave,
>>
>> On 05.09.2013 13:19, Dave Lewis wrote:
>>> Following decision on the 4th December call to opt for a query style
>>> URL for the NIF string in RDF (which will also be supported in NIF)
>>> when defining the mapping the following need to be changed in the spec:
>>>
>>> 1) all occurrences of RDF URLs with #char or #xpath fragments to be
>>> changed to a query style as suggested by the RDF group and expanded
>>> on by Felix in
>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Sep/0000.html
>>>
>>> i.e. all fragment identifiers for NIF strings in annex F and G
>>> should be changed from, e.g.:
>>>
>>> One other associated question: as we are using the query type to get
>>> around the limitations of the RFC 5147 char fragment in working with
>>> XML and HTML, is it still appropriate after the above change to type
>>> the NIF string in the example with the subclass nif:RFC5147String?
>>> Sebastian? e.g.
>>>
>>> http://example.com/myitsservice?input=http://example.com/exampldoc.html&char=0,11
>>> rdf:type nif:RFC5147String;
>>
>> I am currently working on a formal ABNF definition for this, but you
>> can consider it to be like this in Java:
>>
>> String prefix =
>> "http://example.com/myitsservice?informat=html&intype=url&input=http://example.com/exampldoc.html&";
>> String identifier = "char=0,11";
>> String uri = prefix + identifier;
>>
>> // only identifier has to have the syntax given by the rdf:type
>> validate("nif:RFC5147String", identifier);
>>
>> So the syntax is only relevant for the identifier part.
>> These would be alternative prefixes as well:
>> String prefixOption1 =
>> "http://example.com/myitsservice/informat/html/intype/url/input/http://example.com/exampldoc.html/";
>> String prefixOption2 =
>> "http://example.com/myitsservice?informat=html&intype=url&input=http://example.com/exampldoc.html#";
>> String prefixOption3 =
>> "http://example.com/myitsservice?informat=html&intype=url&input=http://example.com/exampldoc.html&";
>>
>> It really doesn't matter, and all three are valid RDF (the first one
>> is a bit awkward, of course).
>>
>>> 2) Once this is fixed we need to update the NIF part of the test suite and tests rerun by Felix, Leroy and Phil
>>
>> As written above, this is not strictly necessary, but it is nice to
>> be consistent.
>>
>>> 3) Add the following suggested note wording to the end of Annex
>>> "Note: NIF allows URL for a String resource to be referenced as URIs that are fragments of the original document in the form:
>>> http://example.com/exampledoc.html#char=0,11
>>> or
>>> http://example.com/exampledoc.html#xpath(/html/body[1]/h2[1]/text()[1])
>>>
>>> Though this offers a potentially convenient mechanism for linking NIF resources in RDF back to the original document, the char
>>> fragment is defined currently only for text/plain while the xpath fragment is not defined for HTML. Therefore this URL
>>> recipe does not fulfil the ITS requirements to support both XML and HTML and the aim of this mapping to produce resources adhering
>>> to the Linked Data principle of dereferenceability. The future definition and registration of these fragment types, while a potentially
>>> attractive feature, is beyond the scope of this specification."
>>
>> Let's change this like this:
>> http://example.com/doc.html#xpath(/html/body[1]/h2[1]/text()[1])
>> maps to
>> http://example.com/myitsservice?informat=html&intype=url&input=http://example.com/doc.html#char=0,11
>>
>> Note that RDF is ok with all fragment IDs:
>> http://www.w3.org/TR/rdf-concepts/#section-fragID
>>
>> RFC 3986 as well: http://tools.ietf.org/html/rfc3986#page-24
>>
>>> The semantics of a fragment identifier are defined by the set of
>>> representations that might result from a retrieval action on the
>>> primary resource. The fragment's format and resolution is therefore
>>> dependent on the media type [RFC2046] of a potentially retrieved
>>> representation, even though such a retrieval is only performed if the
>>> URI is dereferenced. If no such representation exists, then the
>>> semantics of the fragment are considered unknown and are effectively
>>> unconstrained.
>>
>> The text could be like this:
>> "Note: NIF allows URL for a String resource to be referenced as URIs
>> that are fragments of the original document in the form:
>> http://example.com/myitsservice?informat=html&intype=url&input=http://example.com/doc.html#char=0,11
>> or
>> http://example.com/doc.html#xpath(/html/body[1]/h2[1]/text()[1])
>>
>> This offers a convenient mechanism for linking NIF resources in RDF
>> back to the original document. RDF treats URIs as opaque and does not
>> impose any semantic constraints on the used fragment identifiers,
>> thus enabling their usage in RDF in a consistent manner. However,
>> fragment identifiers get interpreted according to the retrieved mime
>> type, if a retrieval action occurs as is the case in Linked Data. The
>> char fragment is defined currently only for text/plain while the
>> xpath fragment is not defined for HTML.
>> Therefore this URL recipe does not fulfil the ITS requirements to
>> support both XML and HTML and the aim of this mapping to produce
>> resources adhering to the Linked Data principle of
>> dereferenceability. The future definition and registration of these
>> fragment types, while a potentially attractive feature, is beyond
>> the scope of this specification."
>>
>> I will try to update the example in the spec as well.
>>
>> All the best,
>> Sebastian
>>
>>> cheers,
>>> Dave
>>>
>>> On 03/09/2013 09:14, Felix Sasaki wrote:
>>>> 1) last call item "RDF - NIF conversion". See
>>>> https://www.w3.org/International/multilingualweb/lt/track/issues/131
>>>> and these mails
>>>> Phil
>>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Sep/0001.html
>>>>
>>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Aug/0066.html
>>>>
>>>> Dave
>>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Aug/0067.html
>>>>
>>>> Felix
>>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Aug/0068.html
>>>>
>>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Sep/0000.html
>>>>
>>>> Goal: decide about the option 1) or 2) or something else (see a
>>>> variation of option 2) in
>>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Sep/0000.html
>>>>
>>>> IMPORTANT: even if you are not implementing ITS <> NIF, please
>>>> state your opinion, since tomorrow we want to form a working group
>>>> opinion, to be able to move forward.
>>
>> --
>> Dipl. Inf. Sebastian Hellmann
>> Department of Computer Science, University of Leipzig
>> Events:
>> * NLP & DBpedia 2013 (http://nlp-dbpedia2013.blogs.aksw.org, Extended
>> Deadline: *July 18th*)
>> * LSWT 23/24 Sept, 2013 in Leipzig (http://aksw.org/lswt)
>> Come to Germany as a PhD: http://bis.informatik.uni-leipzig.de/csf
>> Projects: http://nlp2rdf.org , http://linguistics.okfn.org ,
>> http://dbpedia.org/Wiktionary , http://dbpedia.org
>> Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
>> Research Group: http://aksw.org
Received on Friday, 6 September 2013 09:46:50 UTC
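
To illustrate the percent-escape approach suggested in the reply above, here is a minimal Java sketch (Java being the language used in the quoted message). The class name XPathQueryEscape and the helpers escapeBrackets and buildUri are hypothetical names chosen for this example; they are not part of NIF or ITS. The prefix is the example service URL from the thread. The sketch escapes only "[" and "]", the two characters the thread identifies as not allowed in the RFC 3986 query component, and assumes the rest of the XPath expression is already query-safe.

```java
import java.net.URI;

final class XPathQueryEscape {

    // Percent-escape only "[" (%5B) and "]" (%5D); all other characters
    // are left untouched (assumption: they are already legal in a query).
    static String escapeBrackets(String xpath) {
        return xpath.replace("[", "%5B").replace("]", "%5D");
    }

    // Hypothetical helper: append the escaped XPath as the "xpath" query
    // parameter, following the prefix + identifier split from the thread.
    static URI buildUri(String prefix, String xpath) {
        return URI.create(prefix + "xpath=" + escapeBrackets(xpath));
    }

    public static void main(String[] args) {
        String prefix = "http://example.com/myitsservice?informat=html&intype=url"
                + "&input=http://example.com/doc.html&";
        URI uri = buildUri(prefix, "/html/body[1]/h2[1]");
        System.out.println(uri);            // URI with %5B and %5D in the query
        System.out.println(uri.getQuery()); // query with the escapes decoded again
    }
}
```

A consumer that needs the original XPath back can read the decoded query (for example via getQuery() on java.net.URI), so the escaping does not lose information.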