- From: Sebastian Hellmann <hellmann@informatik.uni-leipzig.de>
- Date: Tue, 16 Aug 2011 13:40:11 +0900
- To: Michael Martin <martin@informatik.uni-leipzig.de>, public-lod <public-lod@w3.org>, Alexander Dutton <alexander.dutton@oucs.ox.ac.uk>
- Message-ID: <4E49F4AB.3090800@informatik.uni-leipzig.de>
Hi Michael and Alex,

sorry to answer so late, I was on holiday in France. I looked at the three provided resources [1,2,3] and still have some comments and questions.

1. The part after the # is never sent to the server. Are there any solutions for this? It is not really Linked Data friendly. Compare:

   http://linkedgeodata.org/triplify/near/51.033333,13.733333/1000/class/Amenity

   (currently not working, but it returns all points within a 1000 m radius). With a fragment, the client would have to compute the addressed subset of triples from the resource itself.

2. [1] is quite basic: it essentially uses positions and lines. I made a qualitative comparison of different fragment id approaches for text in [4], slide 7. I was wondering whether anybody has researched such properties of URI fragments. Currently, I am benchmarking the stability of these URIs using Wikipedia changes. Has such work been done before?

3. @Alex: In my opinion, your proposed fragment ontology can only be used to provide documentation for different fragments. I would rather propose to use just one triple:

   <http://www.w3.org/DesignIssues/LinkedData.html#offset__14406-14418>
       a <http://nlp2rdf.lod2.eu/schema/string/OffsetBasedString>

   The ontology I made for strings [5] might be generalized to formats other than text. One triple is much shorter. As you can see, I also tried to encode the type of fragment ("offset") right into the fragment itself, although a notation like "type=offset" might be better.

4. @Michael: is there any standardisation of URIs for text going on? I heard there would be a Language Technology W3C group. The approach by Wilde and Dürst [1] seems to lack stability. Do you think we could do such standardisation for document fragments and text fragments within the Media Fragments Group [3]? I really thought the liveUrl project was quite good, but it seems dead [6].
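To illustrate points 1 and 3, here is a minimal Python sketch of what a client would have to do with an offset-style fragment, since the server never sees it. It rests on assumptions not settled in this thread: that "offset__START-END" means zero-based character offsets with an exclusive end, and that the document text has already been fetched. The function name resolve_offset_fragment and the pattern are hypothetical, not part of NIF or RFC 5147.

```python
import re
from urllib.parse import urldefrag

# Hypothetical pattern for the "offset__START-END" fragment style from point 3.
OFFSET_FRAGMENT = re.compile(r"^offset__(\d+)-(\d+)$")


def resolve_offset_fragment(uri, text):
    """Return (document_uri, substring) for an offset-based fragment URI.

    Assumption: offsets are zero-based character positions and the end
    offset is exclusive. The thread does not pin this down.
    """
    # The fragment is split off on the client; only document_uri would
    # ever be sent to the server.
    document_uri, fragment = urldefrag(uri)
    match = OFFSET_FRAGMENT.match(fragment)
    if match is None:
        raise ValueError(f"not an offset fragment: {fragment!r}")
    start, end = int(match.group(1)), int(match.group(2))
    if not 0 <= start <= end <= len(text):
        raise ValueError("offsets out of range for this text")
    return document_uri, text[start:end]
```

Called with "http://example.org/doc.txt#offset__6-11" and the text "Hello world, again", this returns the document URI without the fragment together with the string "world". Any edit before offset 6 would silently shift the result, which is the stability problem raised in point 2.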
In LOD2 [7] and NIF [8] we will need fragment identifiers to standardize NLP tools for the LOD2 stack. It would be great to reuse existing work instead of starting from scratch. I had to extend [1], for example, because it did not produce stable URIs and did not record the type of algorithm used to produce the URI.

All the best,
Sebastian

[1] http://tools.ietf.org/html/rfc5147
[2] http://tools.ietf.org/html/draft-hausenblas-csv-fragment
[3] http://www.w3.org/TR/media-frags/
[4] http://www.slideshare.net/kurzum/nif-nlp-interchange-format
[5] http://nlp2rdf.lod2.eu/schema/string/
[6] http://liveurls.mozdev.org/index.html
[7] http://lod2.eu
[8] http://aksw.org/Projects/NIF

On 04.08.2011 22:37, Michael Hausenblas wrote:
>
> Alex,
>
>> Has something already done this? Is it even (mostly?) sane?
>
> Sane yes, IMO. Done, sort of, see:
>
> + URI Fragment Identifiers for the text/plain [1]
> + URI Fragment Identifiers for the text/csv [2]
>
> Cheers,
> Michael
>
> [1] http://tools.ietf.org/html/rfc5147
> [2] http://tools.ietf.org/html/draft-hausenblas-csv-fragment
>
> --
> Dr. Michael Hausenblas, Research Fellow
> LiDRC - Linked Data Research Centre
> DERI - Digital Enterprise Research Institute
> NUIG - National University of Ireland, Galway
> Ireland, Europe
> Tel. +353 91 495730
> http://linkeddata.deri.ie/
> http://sw-app.org/about.html
>
> On 4 Aug 2011, at 14:22, Alexander Dutton wrote:
>
>>
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> Hi all,
>>
>> Say I have an XML document, <http://example.org/something.xml>, and I
>> want to talk about some part of it in RDF. As this is XML, being
>> able to point into it using XPath sounds ideal, leading to something
>> like:
>>
>> <#fragment> a fragment:Fragment ;
>>     fragment:within <http://example.org/something.xml> ;
>>     fragment:locator "/some/path[1]"^^fragment:xpath .
>>
>> (For now we can ignore whether we wanted a nodeset or a single node,
>> and how to handle XML namespaces.)
>>
>> More generally, we might want other ways of locating fragments
>> (probably with a datatype for each):
>>
>> * character offsets / ranges
>> * byte offsets / ranges
>> * line numbers / ranges
>> * some sub-rectangle of an image
>> * XML node IDs
>> * page ranges of a paginated document
>>
>> Some of these will be IMT-specific and may need some more thinking
>> about, but the idea is there.
>>
>>
>> Has something already done this? Is it even (mostly?) sane?
>>
>>
>> Yours,
>>
>> Alex
>>
>>
>> NB. Our actual use-case is having pointers into an NLM XML file
>> (embodying a journal article) so we can hook up our in-text reference
>> pointer¹ URIs to the original XML elements (<xref/>s) they were
>> generated from. This will allow us to work out the context of each
>> citation for use in further analysis of the relationship between the
>> citing and cited articles.
>>
>> ¹ See
>> <http://opencitations.wordpress.com/2011/07/01/nomenclature-for-citations-and-references/>
>>
>> for an explanation of the terminology.
>>
>> - --
>> Alexander Dutton
>> Developer, data.ox.ac.uk, InfoDev, Oxford University Computing Services
>> Open Citations Project, Department of Zoology, University
>> of Oxford
>> -----BEGIN PGP SIGNATURE-----
>> Version: GnuPG v1.4.11 (GNU/Linux)
>> Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/
>>
>> iEYEARECAAYFAk46nS4ACgkQS0pRIabRbjDVZQCdGblvoMgNqEietlE5EwAkPJY8
>> pikAn2KApM0HjcXj6TZegA+Dek/DJIQX
>> =UcCr
>> -----END PGP SIGNATURE-----
>>
>>
>
>

--
Dipl. Inf. Sebastian Hellmann
Department of Computer Science, University of Leipzig
Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
Research Group: http://aksw.org
Received on Tuesday, 16 August 2011 04:45:54 UTC