- From: Michael Hausenblas <michael.hausenblas@deri.org>
- Date: Tue, 16 Aug 2011 06:12:55 +0100
- To: Sebastian Hellmann <hellmann@informatik.uni-leipzig.de>
- Cc: Michael Martin <martin@informatik.uni-leipzig.de>, public-lod <public-lod@w3.org>, Alexander Dutton <alexander.dutton@oucs.ox.ac.uk>
> It is not really LinkedData friendly.

Why?

> @Michael: is there some standardisation respective URIs for text
> going on?

As you've rightly identified, an RFC already exists. What would this new
standardisation activity be chartered for? As an aside, this reminds me a
bit of http://xkcd.com/927/

> The approach by Wilde and Dürst[1] seems to lack stability.

I don't know what you mean by this. Lack of take-up, yes. Stability,
what's that?

> Do you think we could do such standardisation for document fragments
> and text fragments within the Media Fragments Group[3] ?

No. Disclaimer: I'm a MF WG member. Look at our charter [1] ...

Maybe this thread should slowly be moved over to uri@w3.org [2]?

Cheers,
Michael

[1] http://www.w3.org/2008/01/media-fragments-wg.html
[2] http://lists.w3.org/Archives/Public/uri/

--
Dr. Michael Hausenblas, Research Fellow
LiDRC - Linked Data Research Centre
DERI - Digital Enterprise Research Institute
NUIG - National University of Ireland, Galway
Ireland, Europe
Tel. +353 91 495730
http://linkeddata.deri.ie/
http://sw-app.org/about.html

On 16 Aug 2011, at 05:40, Sebastian Hellmann wrote:

> Hi Michael and Alex,
> sorry to answer so late, I was on holiday in France.
> I looked at the three provided resources [1,2,3] and there are still
> some comments and questions I have.
>
> 1. The part after the # is actually not sent to the server. Are
> there any solutions for this? It is not really LinkedData friendly.
> Compare http://linkedgeodata.org/triplify/near/51.033333,13.733333/1000/class/Amenity
> (Currently not working, but it gives all points within a 1000m radius)
>
> The client would be required to calculate the subset of triples from
> the resource, that are addressed.
>
> 2. [1] is quite basic and they are basically using position and
> lines. I made a qualitative comparison of different fragment id
> approaches for text in [4] slide 7.
> I was wondering if anybody has researched such properties of URI
> fragments.
> Currently, I am benchmarking stability of these URIs
> using Wikipedia changes.
> Has such work been done before?
>
> 3. @Alex: In my opinion, your proposed fragment ontology can only
> be used to provide documentation for different fragments.
> I would rather propose to just use one triple:
> <http://www.w3.org/DesignIssues/LinkedData.html#offset__14406-14418>
> a <http://nlp2rdf.lod2.eu/schema/string/OffsetBasedString>
> The ontology I made for Strings might be generalized for formats
> other than text based [5]
> One triple is much shorter. As you can see I also tried to encode
> the type of fragment right into the fragment "offset", although a
> notation like "type=offset" might be better.
>
> 4. @Michael: is there some standardisation respective URIs for
> text going on?
> I heard there would be a Language Technology W3C group. The approach
> by Wilde and Dürst[1] seems to lack stability.
> Do you think we could do such standardisation for document fragments
> and text fragments within the Media Fragments Group[3] ?
> I really thought the liveUrl project was quite good, but it seems
> dead[6].
>
>
> In LOD2[7] and NIF[8] we will need some fragment identifiers to
> standardize NLP tools for the LOD2 stack.
> It would be great to reuse stuff instead of starting from scratch. I
> had to extend [1] for example, because it did not produce stable
> URIs and also it did not contain the type of algorithm used to
> produce the URI.
>
> All the best,
> Sebastian
>
>
> [1] http://tools.ietf.org/html/rfc5147
> [2] http://tools.ietf.org/html/draft-hausenblas-csv-fragment
> [3] http://www.w3.org/TR/media-frags/
> [4] http://www.slideshare.net/kurzum/nif-nlp-interchange-format
> [5] http://nlp2rdf.lod2.eu/schema/string/
> [6] http://liveurls.mozdev.org/index.html
> [7] http://lod2.eu
> [8] http://aksw.org/Projects/NIF
>
> On 04.08.2011 22:37, Michael Hausenblas wrote:
>>
>>
>> Alex,
>>
>>> Has something already done this? Is it even (mostly?) sane?
>>
>> Sane yes, IMO.
>> Done, sort of, see:
>>
>> + URI Fragment Identifiers for the text/plain [1]
>> + URI Fragment Identifiers for the text/csv [2]
>>
>> Cheers,
>> Michael
>>
>> [1] http://tools.ietf.org/html/rfc5147
>> [2] http://tools.ietf.org/html/draft-hausenblas-csv-fragment
>>
>> --
>> Dr. Michael Hausenblas, Research Fellow
>> LiDRC - Linked Data Research Centre
>> DERI - Digital Enterprise Research Institute
>> NUIG - National University of Ireland, Galway
>> Ireland, Europe
>> Tel. +353 91 495730
>> http://linkeddata.deri.ie/
>> http://sw-app.org/about.html
>>
>> On 4 Aug 2011, at 14:22, Alexander Dutton wrote:
>>
>>>
>>> -----BEGIN PGP SIGNED MESSAGE-----
>>> Hash: SHA1
>>>
>>> Hi all,
>>>
>>> Say I have an XML document, <http://example.org/something.xml>,
>>> and I want to talk about some part of it in RDF. As this is XML,
>>> being able to point into it using XPath sounds ideal, leading to
>>> something like:
>>>
>>> <#fragment> a fragment:Fragment ;
>>>     fragment:within <http://example.org/something.xml> ;
>>>     fragment:locator "/some/path[1]"^^fragment:xpath .
>>>
>>> (For now we can ignore whether we wanted a nodeset or a single node,
>>> and how to handle XML namespaces.)
>>>
>>> More generally, we might want other ways of locating fragments
>>> (probably with a datatype for each):
>>>
>>> * character offsets / ranges
>>> * byte offsets / ranges
>>> * line numbers / ranges
>>> * some sub-rectangle of an image
>>> * XML node IDs
>>> * page ranges of a paginated document
>>>
>>> Some of these will be IMT-specific and may need some more thinking
>>> about, but the idea is there.
>>>
>>>
>>> Has something already done this? Is it even (mostly?) sane?
>>>
>>>
>>> Yours,
>>>
>>> Alex
>>>
>>>
>>> NB. Our actual use-case is having pointers into an NLM XML file
>>> (embodying a journal article) so we can hook up our in-text
>>> reference pointer¹ URIs to the original XML elements (<xref/>s)
>>> they were generated from.
>>> This will allow us to work out the context of each
>>> citation for use in further analysis of the relationship between the
>>> citing and cited articles.
>>>
>>> ¹ See
>>> <http://opencitations.wordpress.com/2011/07/01/nomenclature-for-citations-and-references/>
>>> for an explanation of the terminology.
>>>
>>> - --
>>> Alexander Dutton
>>> Developer, data.ox.ac.uk, InfoDev, Oxford University Computing Services
>>> Open Citations Project, Department of Zoology, University of Oxford
>>> -----BEGIN PGP SIGNATURE-----
>>> Version: GnuPG v1.4.11 (GNU/Linux)
>>> Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/
>>>
>>> iEYEARECAAYFAk46nS4ACgkQS0pRIabRbjDVZQCdGblvoMgNqEietlE5EwAkPJY8
>>> pikAn2KApM0HjcXj6TZegA+Dek/DJIQX
>>> =UcCr
>>> -----END PGP SIGNATURE-----
>>>
>>
>
> --
> Dipl. Inf. Sebastian Hellmann
> Department of Computer Science, University of Leipzig
> Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
> Research Group: http://aksw.org
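
Point 1 of Sebastian's message (the part after the # is never sent to the
server, so the client has to work out the addressed subset itself) can be
illustrated with a small Python sketch. It is only a rough illustration,
not the NIF or RFC 5147 algorithm: the function names are hypothetical,
the fragment pattern is guessed from the single example URI in the thread
(#offset__14406-14418), and reading the two numbers as a 0-based start and
an exclusive end is an assumption.

```python
import re
from urllib.parse import urldefrag

# Hypothetical pattern, guessed from the one example URI in the thread.
_OFFSET_FRAGMENT = re.compile(r"^offset_+(\d+)-(\d+)$")


def split_offset_uri(uri):
    """Return (document_url, begin, end) for an offset-style fragment URI.

    document_url is the only part an HTTP client would put on the wire;
    the fragment stays on the client side.
    """
    document_url, fragment = urldefrag(uri)
    match = _OFFSET_FRAGMENT.match(fragment)
    if match is None:
        raise ValueError(f"not an offset fragment: {fragment!r}")
    begin, end = int(match.group(1)), int(match.group(2))
    if begin > end:
        raise ValueError("begin offset is larger than end offset")
    return document_url, begin, end


def select_fragment(text, begin, end):
    """Slice the addressed characters out of text the client already fetched.

    Assumes begin is 0-based and end is exclusive (an assumption, see above).
    """
    if end > len(text):
        raise ValueError("fragment points past the end of the document")
    return text[begin:end]
```

A client would call split_offset_uri() on the full URI, fetch only the
returned document URL, and then apply select_fragment() to the retrieved
text. Any change to the document shifts the offsets, which is the
stability concern raised in the thread.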
Received on Tuesday, 16 August 2011 05:13:30 UTC