- From: Sebastian Hellmann <hellmann@informatik.uni-leipzig.de>
- Date: Fri, 06 Sep 2013 12:36:52 +0200
- To: Felix Sasaki <fsasaki@w3.org>
- CC: Dave Lewis <dave.lewis@cs.tcd.ie>, public-multilingualweb-lt@w3.org
- Message-ID: <5229B044.8050702@informatik.uni-leipzig.de>
Hi Felix,
Percent encoding should be fine. I have updated the example here as well:
https://dl.dropboxusercontent.com/u/375401/tmp/EX-nif-conversion-output.xml
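A minimal sketch of what percent-encoding the xpath value could look like in Java. The class and method names are made up for illustration, and the choice to also escape sub-delims such as "&" and "=" inside the value is an assumption here, not something the spec prescribes:

```java
import java.nio.charset.StandardCharsets;

public class QueryValueEncoder {
    // Kept literally: RFC 3986 "unreserved" plus "/", ":", "@" and "?",
    // which the query production also permits. Everything else, including
    // "[" and "]", is percent-encoded (assumption: sub-delims are escaped too,
    // because "&" and "=" separate key=value pairs in practice).
    private static final String SAFE =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~/:@?";

    public static String encodeQueryValue(String value) {
        StringBuilder out = new StringBuilder();
        // Encode byte-wise over UTF-8 so non-ASCII characters are handled as well.
        for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
            char c = (char) (b & 0xFF);
            if (SAFE.indexOf(c) >= 0) {
                out.append(c);
            } else {
                out.append(String.format("%%%02X", b & 0xFF));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String prefix = "http://example.com/myitsservice?informat=html&intype=url"
            + "&input=http://example.com/doc.html&xpath=";
        String xpath = "/html/body[1]/h2[1]";
        System.out.println(prefix + encodeQueryValue(xpath));
    }
}
```

With this, /html/body[1]/h2[1] becomes /html/body%5B1%5D/h2%5B1%5D, which stays within the query production of RFC 3986.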
Note that I replaced:
<http://example.com/exampledoc.html>
with
<http://example.com/doc.html>
everywhere. One of the reasons was that exampledoc.html was misspelled as
exampldoc.html in some places, and "example" occurs twice in the URI.
Furthermore, there is one small mistake in
# we can attach the metadata to the parent node:
<b its-ta-ident-ref="http://dbpedia.org/resource/Dublin"
translate="no">Ireland</b>
should be
# we can attach the metadata to the parent node:
<b its-ta-ident-ref="http://dbpedia.org/resource/Ireland"
translate="no">Ireland</b>
Also, since it is informative now, we could link to the persistent
URI for the NIF service implementation spec as "further reading":
http://persistence.uni-leipzig.org/nlp2rdf/specification/api.html
All the best,
Sebastian
Am 06.09.2013 12:15, schrieb Felix Sasaki:
> Hi Sebastian, Dave, all,
>
> I have made all edits, see
> http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#conversion-to-nif
> http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#nif-backconversion
> http://www.w3.org/International/multilingualweb/lt/drafts/its20/examples/nif/EX-nif-conversion-output.xml
>
> The issue around "[" and "]" is not resolved yet. But besides that
> everything (including Sebastian's note) should be ok. Dave, besides
> the test suite update you described at
>
> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Sep/0014.html
>
> this should be everything, right?
>
> Best,
>
> Felix
>
>
> Am 06.09.13 11:39, schrieb Sebastian Hellmann:
>> Ok, here is the updated example file:
>> https://dl.dropboxusercontent.com/u/375401/tmp/EX-nif-conversion-output.xml
>>
>> There is a problem, however. [ and ] are not allowed in the query
>> component, so
>> rdf:resource="http://example.com/myitsservice?informat=html&intype=url&input=http://example.com/doc.html&xpath=/html/body[1]/h2[1]"
>> violates https://tools.ietf.org/html/rfc3986#appendix-A
>>
>> pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
>> query = *( pchar / "/" / "?" )
>> fragment = *( pchar / "/" / "?" )
>> pct-encoded = "%" HEXDIG HEXDIG
>> unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
>> reserved = gen-delims / sub-delims
>> gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@"
>> sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," /
>> ";" / "="
>>
>> I am not an XPath expert. Could we do it like this?
>> xpath=/html/body/1/h2/1
>>
>>
>> NIF 2.0 has become quite delayed. I am one of the main coordinators,
>> which makes the standardization process lightweight, but I am also a
>> bottleneck. One of the reasons for the delay is my timely contribution
>> here via email and meetings. I hope this is an acceptable trade-off.
>> The consequence is that most of NIF 2.0 is not yet well documented,
>> although it is getting better day by day.
>>
>> All the best,
>> Sebastian
>>
>>
>>
>> Am 06.09.2013 11:05, schrieb Sebastian Hellmann:
>>> Hi Dave,
>>>
>>> Am 05.09.2013 13:19, schrieb Dave Lewis:
>>>> Following the decision on the 4th December call to opt for a
>>>> query-style URL for the NIF string in RDF (which will also be
>>>> supported in NIF) when defining the mapping, the following needs to
>>>> be changed in the spec:
>>>>
>>>> 1) all occurrences of RDF URLs with #char or #xpath fragments to be
>>>> changed to a query style as suggested by the RDF group and expanded
>>>> on by Felix in
>>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Sep/0000.html
>>>>
>>>> i.e. all fragment identifiers for NIF strings in annex F and G
>>>> should be changed from, e.g.:
>>>>
>>>> One other associated question: as we are using the query type to get around the limitations of the RFC 5147
>>>> char fragment in working with XML and HTML, is it still appropriate after the above change to type the NIF string in the example with
>>>> the subclass nif:RFC5147String? Sebastian? e.g.
>>>>
>>>> http://example.com/myitsservice?input=http://example.com/exampldoc.html&char=0,11
>>>> rdf:type nif:RFC5147String;
>>>
>>> I am currently working on a formal ABNF definition for this, but you
>>> can consider it to be like this in Java:
>>>
>>> String prefix =
>>> "http://example.com/myitsservice?informat=html&intype=url&input=http://example.com/exampldoc.html&";
>>> String identifier = "char=0,11";
>>> String uri = prefix + identifier;
>>>
>>> // only the identifier has to have the syntax given by the rdf:type
>>> validate("nif:RFC5147String", identifier);
>>>
>>> So the syntax is only relevant for the identifier part. These would
>>> be alternative prefixes as well:
>>> String prefixOption1 =
>>> "http://example.com/myitsservice/informat/html/intype/url/input/http://example.com/exampldoc.html/";
>>> String prefixOption2 =
>>> "http://example.com/myitsservice?informat=html&intype=url&input=http://example.com/exampldoc.html#";
>>> String prefixOption3 =
>>> "http://example.com/myitsservice?informat=html&intype=url&input=http://example.com/exampldoc.html&";
>>>
>>> It really doesn't matter and all three are valid RDF (The first one
>>> is a bit awkward, of course)
>>>
>>>
>>>> 2) Once this is fixed we need to update the NIF part of the test suite and tests rerun by Felix, Leroy and Phil
>>>
>>> As written above, this is not strictly necessary, but it is nice to
>>> be consistent.
>>>
>>>> 3) Add the following suggested note wording to the end of Annex
>>>> "Note: NIF allows URLs for a String resource to be referenced as URIs that are fragments of the original document in the form:
>>>> http://example.com/exampledoc.html#char=0,11
>>>> or
>>>> http://example.com/exampledoc.html#xpath(/html/body[1]/h2[1]/text()[1])
>>>>
>>>> Though this offers a potentially convenient mechanism for linking NIF resources in RDF back to the original document, the char
>>>> fragment is currently defined only for text/plain, while the xpath fragment is not defined for HTML. Therefore this URL
>>>> recipe does not fulfil the ITS requirements to support both XML and HTML, nor the aim of this mapping to produce resources adhering
>>>> to the Linked Data principle of dereferenceability. The future definition and registration of these fragment types, while a potentially
>>>> attractive feature, is beyond the scope of this specification."
>>>
>>>
>>> Let's change this like this:
>>> http://example.com/doc.html#xpath(/html/body[1]/h2[1]/text()[1])
>>> maps to
>>> http://example.com/myitsservice?informat=html&intype=url&input=http://example.com/doc.html#char=0,11
>>>
>>> Note that RDF is OK with all fragment identifiers:
>>> http://www.w3.org/TR/rdf-concepts/#section-fragID
>>>
>>> RFC 3986 as well: http://tools.ietf.org/html/rfc3986#page-24
>>>
>>>
>>>> The semantics of a fragment identifier are defined by the set of
>>>> representations that might result from a retrieval action on the
>>>> primary resource. The fragment's format and resolution is therefore
>>>> dependent on the media type [RFC2046] of a potentially retrieved
>>>> representation, even though such a retrieval is only performed if the
>>>> URI is dereferenced. If no such representation exists, then the
>>>> semantics of the fragment are considered unknown and are effectively
>>>> unconstrained.
>>>
>>>
>>>
>>> The text could be like this:
>>> "Note: NIF allows URLs for a String resource to be referenced as URIs
>>> that are fragments of the original document in the form:
>>> http://example.com/myitsservice?informat=html&intype=url&input=http://example.com/doc.html#char=0,11
>>> or
>>> http://example.com/doc.html#xpath(/html/body[1]/h2[1]/text()[1])
>>>
>>> This offers a convenient mechanism for linking NIF resources in RDF
>>> back to the original document. RDF treats URIs as opaque and does
>>> not impose any semantic constraints on the used fragment
>>> identifiers, thus enabling their usage in RDF in a consistent
>>> manner. However, fragment identifiers get interpreted according to
>>> the retrieved mime type, if a retrieval action occurs as is the case
>>> in Linked Data. The char fragment is currently defined only for
>>> text/plain, while the xpath fragment is not defined for HTML.
>>> Therefore this URL recipe does not fulfil the ITS requirements to
>>> support both XML and HTML, nor the aim of this mapping to produce
>>> resources adhering to the Linked Data principle of
>>> dereferenceability. The future definition and registration of these
>>> fragment types, while a potentially attractive feature, is beyond
>>> the scope of this specification."
>>>
>>> I will try to update the example in the spec as well.
>>>
>>> All the best,
>>> Sebastian
>>>
>>>> cheers,
>>>> Dave
>>>>
>>>>
>>>> On 03/09/2013 09:14, Felix Sasaki wrote:
>>>>> 1) last call item "RDF - NIF conversion". See
>>>>> https://www.w3.org/International/multilingualweb/lt/track/issues/131
>>>>> and these mails
>>>>> Phil
>>>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Sep/0001.html
>>>>>
>>>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Aug/0066.html
>>>>>
>>>>>
>>>>> Dave
>>>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Aug/0067.html
>>>>>
>>>>>
>>>>> Felix
>>>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Aug/0068.html
>>>>>
>>>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Sep/0000.html
>>>>>
>>>>>
>>>>> Goal: decide about the option 1) or 2) or something else (see a
>>>>> variation of option 2) in
>>>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Sep/0000.html
>>>>>
>>>>> IMPORTANT: even if you are not implementing ITS <> NIF, please
>>>>> state your opinion, since tomorrow we want to form a working
>>>>> group opinion to be able to move forward.
>>>>
>>>
>>>
>>
>>
>
--
Dipl. Inf. Sebastian Hellmann
Department of Computer Science, University of Leipzig
Events:
* NLP & DBpedia 2013 (http://nlp-dbpedia2013.blogs.aksw.org, Extended
Deadline: *July 18th*)
* LSWT 23/24 Sept, 2013 in Leipzig (http://aksw.org/lswt)
Venha para a Alemanha como PhD: http://bis.informatik.uni-leipzig.de/csf
Projects: http://nlp2rdf.org , http://linguistics.okfn.org ,
http://dbpedia.org/Wiktionary , http://dbpedia.org
Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
Research Group: http://aksw.org
Received on Friday, 6 September 2013 10:37:21 UTC