- From: Christian Chiarcos <christian.chiarcos@web.de>
- Date: Tue, 19 May 2020 01:03:44 +0200
- To: Gerben <gerben@treora.com>, "Luca De Santis" <desantis@netseven.it>
- Cc: public-openannotation@w3.org, "chiarcos@informatik.uni-frankfurt.de" <chiarcos@informatik.uni-frankfurt.de>
- Message-ID: <op.0kt40iy5br5td5@kitaba>
Dear Luca, dear all,

this is only remotely related, but in the LD4LT CG
(https://www.w3.org/community/ld4lt) we are discussing an extension of
Web Annotation for the requirements of language technology on the web.
It is largely based on a harmonization between Web Annotation, the NLP
Interchange Format and several ISO TC37 standards, with use cases in
language technology and DH.

This will include a reconsideration of the WA API specifications (using
WA and https://persistence.uni-leipzig.org/nlp2rdf/specification/api.html
as starting points), and any input or feature requests would be welcome.
We are still in the process of requirement analysis, with an
intermediate survey at
https://github.com/ld4lt/linguistic-annotation/blob/master/survey/required-features.md.
The survey has not tackled the API yet; so far it has focused on the
vocabulary.

Best,
Christian

On .05.2020, at 19:43, Luca De Santis <desantis@netseven.it> wrote:

> Dear Gerben,
> thank you very much for your answer.
>
> What you are proposing is quite intriguing, although my use case is
> much simpler.
>
> In the Triple research project my company is involved in
> (https://www.gotriple.eu/), we are integrating our Pundit annotation
> tool.
> In Triple we are building a discovery platform for “content” (e.g.
> articles) related to Social Science and Humanities. In Triple we would
> like to show whether an article has been annotated with Pundit (and
> possibly with other interoperable annotation tools).
>
> We need a very simple API on our Annotation Server that, given the
> document URL, returns the annotations on it. We already have our own
> API for this, as Hypothes.is has (see
> https://hypothes.is/api/search?uri=<..>).
> I just wondered if there is a more Web Annotation Protocol-savvy way to
> do that.
>
> From what I understand, also reading the first part of your email, the
> answer is no, which IMHO is quite a pity.
> Since the Annotation Container is quite a handy concept, we were
> thinking of implementing a new API for retrieving annotations based on
> https://www.w3.org/TR/annotation-protocol/#representations-with-annotation-descriptions ,
> adding a URL parameter to return only the annotations that belong to a
> specific target.
> The goal was to be as compliant as possible with the Web Annotation
> standard, but it seems that there isn’t a 100% compliant way of
> implementing our use case.
>
> Am I wrong? Any other idea?
>
> TIA!
>
> Sincerely,
> Luca
>
>> On 18 May 2020, at 16:28, Gerben <gerben@treora.com> wrote:
>>
>> Hello Luca and all,
>>
>> Great that you bring this up; I have been intending to send a similar
>> email to this list in the near future. Hereby!
>>
>> As for the Web Annotation specs (tl;dr: looks like the WG never got
>> around to specifying a search method)
>>
>> I have not been involved in the standardisation process, but my
>> understanding has been that the Web Annotation Protocol was made to
>> define how to interact with an “Annotation Container”, but not how to
>> find (a URL for) a container[1]. I suppose that defining a search
>> protocol could be boiled down to defining a URL template for
>> containers, plus possibly defining a vocabulary to add search-specific
>> information such as the relevancy of each result.
>>
>> Specifying search appears to have been discussed in 2015 in issue 48
>> “Support for search”[2], with a resolution made in a call[3]:
>>>
>>> RESOLUTION: The WG will consider a separate document defining a
>>> non-exclusive search interface to be published at least as a Note
>>> and potentially part of Protocol
>>
>> As far as I can see, this plan did not turn into anything; I asked
>> Benjamin Young about this, perhaps he (or other WG members) will
>> follow up about what happened to this plan.
>>
>> As for ways forward (tl;dr: should we extend OpenSearch or something?)
>>
>> Personally I have been planning to take another stab at creating
>> interoperable annotation services (I poked a bit at this nearly six
>> years ago[4]). I plan to start with making a simple browser extension
>> that lets the user subscribe to multiple annotation sources to receive
>> their annotations; much like an RSS/Atom feed aggregator. It would
>> contain a discovery mechanism (probably via <link> tags) so that the
>> user can discover annotation services by visiting their websites
>> (again, much like with RSS/Atom). It would appear like a button to
>> subscribe to a blog, but now you can take it with you and get its
>> content in context wherever you go on the web (who is following whom
>> then?). :)
>>
>> Unlike with RSS/Atom, one would need the ability to search for a
>> subset of relevant annotations, especially to get annotations
>> targeting a given page. For this the most obvious prior art is
>> OpenSearch[5]. It defines how a small XML file can be used to describe
>> how to query a search engine. The file is discoverable through e.g. a
>> <link rel="search"> tag, so that e.g. browsers can offer the user to
>> use that search provider. The description would have a URL template to
>> specify the endpoint to use, for example GitHub’s description
>> document contains this line[6]:
>>>
>>> <Url type="text/html" method="get"
>>> template="https://github.com/search?q={searchTerms}&amp;ref=opensearch"/>
>>
>> OpenSearch is designed to be extended and allows arbitrary parameters
>> using XML namespaces, so we could introduce new parameter types as
>> needed. In particular we would need the ability to pass a target URL
>> instead of (or besides) the {searchTerms} parameter, plus any desired
>> filters for author, date, etcetera.
>> The URI template allows using custom namespaces[7], so we could
>> invent something like this (taking the liberty of assuming a new
>> vocabulary at "http://www.w3.org/ns/wap#"):
>>>
>>> <Url
>>> type='application/ld+json;profile="http://www.w3.org/ns/anno.jsonld"'
>>> method="get"
>>> xmlns:wap="http://www.w3.org/ns/wap#"
>>> xmlns:oa="http://www.w3.org/ns/anno.jsonld#"
>>> template="https://example.org/annotations?uri={wap:target?}&amp;t={wap:createdAfter?}&amp;by={oa:creator?}&amp;q={searchTerms?}"
>>> />
>>
>> Some advantages of extending OpenSearch that I can think of:
>> - We’d be extending an ecosystem instead of reinventing the wheel.
>> - Many aspects such as searching for text queries have already been
>>   defined, and will be understood by existing tools, which should make
>>   text search among annotations work out of the box with existing
>>   browsers or meta-search engines.
>> - OpenSearch descriptors can specify any (and multiple) response
>>   formats, each with its own URL template; a search server could thus
>>   provide an endpoint to get the search results as an Annotation
>>   Container, and another endpoint to obtain results in an HTML page,
>>   or Atom, etc.
>> - Autodiscovery of search services is part of the spec, so e.g. a
>>   website can include a <link rel="search" …> element to announce its
>>   annotation service.
>>
>> But also some possible disadvantages:
>> - Although it might be the most popular standard for describing search
>>   endpoints, OpenSearch nowadays lacks a website or an organisation
>>   behind it, and appears to have been mostly dormant for many years.
>>   Trying to help breathe life back into it seems possibly worthwhile
>>   but a big step.
>> - Adopting an existing spec introduces more complexity than may be
>>   required. For example, descriptors are expressed in XML, thus any
>>   tool would have to be able to parse XML to use it.
>> - It seems mainly designed for public, gratis search services.
>>   One may, for example, want a way to describe authentication methods
>>   to get a personal(ised) annotation feed. For mechanisms beyond just
>>   putting a secret code into the URL template, this may require
>>   another (ideally orthogonal) OpenSearch extension.
>> - In many cases one may want to describe more capabilities (e.g.
>>   creating annotations) that may seem inappropriate to shoehorn into
>>   OpenSearch; and if one finds or creates a separate ‘annotation
>>   service descriptor’ spec for those purposes, it is tempting to just
>>   describe the search endpoints in there.
>>
>> I would be very open to other suggestions than extending OpenSearch;
>> it just seemed the most fitting solution I found so far. But perhaps
>> some approach that makes use of the linked data ecosystem, like
>> Linked Data Fragments[8], would be a more natural fit. Does anyone
>> have tips?
>>
>> Also it seems important to think about the bigger picture of
>> annotation search services. While in a typical use case one may want
>> to discover annotations from multiple sources on the web pages one
>> visits, it seems undesirable to have to query each source with the URL
>> of each page (again the question: who’s following whom?). To improve
>> both on privacy and efficiency, I imagine one could use a trusted
>> aggregation service that queries sources on behalf of the user, and
>> which moreover might not run search queries but rather crawl (or
>> subscribe to) the annotation services to get the content in bulk;
>> somewhat like usual web search engines, except the user specifies
>> which sources to crawl. While both the sources and the aggregator
>> could in theory be based on the same search protocol, such an
>> architecture may be better off with extra protocol features both for
>> the annotation sources (to support bulk annotation
>> crawling/subscribing) and for the aggregators (e.g. a method to add a
>> new source to subscribe to).
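
A minimal sketch of how a client might expand the URL template proposed in the quoted message, assuming the OpenSearch convention that an optional parameter (marked with a trailing "?") is replaced by an empty string when no value is supplied. The "wap:" parameter names come from the hypothetical vocabulary in the proposal, not from any published spec, and the function name expand_template is made up for illustration. For brevity the sketch matches parameters by their literal prefix instead of resolving the XML namespace, which a real OpenSearch client would have to do.

```python
import re
from urllib.parse import quote

# Hypothetical template from the proposal above; "wap:" is an assumed vocabulary.
TEMPLATE = (
    "https://example.org/annotations?uri={wap:target?}"
    "&t={wap:createdAfter?}&by={oa:creator?}&q={searchTerms?}"
)

# Matches {name} and {name?}; group 2 is "?" when the parameter is optional.
_PARAM = re.compile(r"\{([^{}?]+)(\?)?\}")


def expand_template(template: str, values: dict[str, str]) -> str:
    """Fill in OpenSearch-style placeholders, percent-encoding each value."""

    def substitute(match: re.Match) -> str:
        name, optional = match.group(1), match.group(2)
        if name in values:
            return quote(values[name], safe="")
        if optional:
            return ""  # optional parameters without a value become empty
        raise KeyError(f"missing required parameter: {name}")

    return _PARAM.sub(substitute, template)


url = expand_template(TEMPLATE, {"wap:target": "https://www.repubblica.it"})
print(url)
```

Leaving unused optional parameters as empty strings keeps the expansion rule trivial; a server would then need to treat an empty value the same as an absent one.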
>>
>> Whichever approach we take, I think it would be great to collaborate
>> to make some sort of interoperable annotation ecosystem.
>> Thoughts welcome!
>>
>> -- Gerben
>>
>> [1]: Except one discovery mechanism, serving only to let a resource
>> announce that “Annotations on the resource SHOULD be created within
>> the referenced Container”:
>> https://www.w3.org/TR/annotation-protocol/#discovery-of-annotation-containers
>> [2]: https://github.com/w3c/web-annotation/issues/48
>> [3]: https://www.w3.org/2015/12/16-annotation-minutes.html#item03
>> [4]: See https://web.hypothes.is/blog/supporting-open-annotation/
>> [5]:
>> https://github.com/dewitt/opensearch/blob/master/opensearch-1-1-draft-6.md
>> ; https://en.wikipedia.org/wiki/OpenSearch ;
>> https://web.archive.org/web/20180421215752/http://www.opensearch.org/Home
>> [6]: https://github.com/opensearch.xml
>> [7]:
>> https://github.com/dewitt/opensearch/blob/master/opensearch-1-1-draft-6.md#fully-qualified-parameter-names
>> ; related:
>> https://web.archive.org/web/20180408193434/http://www.opensearch.org/Specifications/OpenSearch/Extensions/Parameter/1.0
>> [8]: https://linkeddatafragments.org/
>>
>> On 13/05/2020 22:37, Luca De Santis wrote:
>>> Dear all,
>>> I’m Luca De Santis of Net7, the company behind the Pundit
>>> annotation tool ( https://thepund.it ).
>>> We are currently working on some updates of our tool. Among them, we
>>> are planning to develop an endpoint that supports, in read-only
>>> mode, the Web Annotation Protocol (WAP). Currently Pundit is
>>> compliant with the Web Annotation Data Model (well, quite
>>> compliant…).
>>>
>>> Basically the APIs that we’d like to implement are:
>>> 1. the (filtered) retrieval of “a group” of annotations
>>> 2. the retrieval of a single annotation.
>>>
>>> No problem for point 2, which is pretty clear.
>>>
>>> Point 1, which corresponds in our use case to a “search for
>>> annotations”, is not completely clear to me. In fact, while the
>>> concept of “Annotation Containers” is very handy, I haven't seen a
>>> WAP-compliant way to pass parameters to filter results. Some examples
>>> of these parameters:
>>> - the URI of the target document
>>> - some conditions (e.g. on author, date, etc.).
>>>
>>> Is there any standardization of the possible parameters to pass to
>>> filter annotations in a container? In particular we are planning to
>>> implement this method:
>>> https://www.w3.org/TR/annotation-protocol/#representations-with-annotation-descriptions
>>>
>>> Other tools/services like Hypothes.is or Europeana seem to have
>>> implemented a specific search endpoint (e.g.
>>> https://hypothes.is/api/search?uri=https://www.repubblica.it ), but
>>> if there is a clean and WAP-compliant way to implement this feature
>>> I'd stick with it.
>>>
>>> Any idea on that? Thanks in advance.
>>>
>>> Regards,
>>> Luca De Santis
>>>
>>> --
>>> Luca De Santis / Chief Technology Officer
>>> desantis@netseven.it www.netseven.it
>>> +39 050 55 25 74 +39 335 7376 153
>>> skype: lucadex
>>> via G. Carducci 60 | 56017 Ghezzano (PI) - Italy
>>> VAT no. and tax code 01577590506
>>> Pisa Chamber of Commerce (CCIAA) no. 01577590506 of 26/04/2001
>>> Share capital 10.000,00 €
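
For the use case discussed in this thread (return the annotations on a given document as an Annotation Container), here is a rough sketch of what a server-side filter might look like. It assumes the annotations are already available as JSON-LD dictionaries following the Web Annotation Data Model. The function names and the "?uri=" query parameter in the example IRI are hypothetical; as noted above, the Web Annotation Protocol does not define such a parameter. The response shape follows the container representation with embedded annotation descriptions, with paging simplified to a single page.

```python
def target_sources(annotation: dict) -> set[str]:
    """Collect the IRIs an annotation targets (string, object, or list form)."""
    targets = annotation.get("target", [])
    if not isinstance(targets, list):
        targets = [targets]
    sources = set()
    for target in targets:
        if isinstance(target, str):
            sources.add(target)
        elif isinstance(target, dict):
            # A SpecificResource names the document in "source";
            # a plain resource object uses "id".
            iri = target.get("source") or target.get("id")
            if isinstance(iri, str):
                sources.add(iri)
    return sources


def filtered_container(container_iri: str, annotations: list[dict], target_uri: str) -> dict:
    """Build a single-page AnnotationCollection with only the matching annotations."""
    matches = [a for a in annotations if target_uri in target_sources(a)]
    return {
        "@context": [
            "http://www.w3.org/ns/anno.jsonld",
            "http://www.w3.org/ns/ldp.jsonld",
        ],
        "id": container_iri,
        "type": ["BasicContainer", "AnnotationCollection"],
        "total": len(matches),
        "first": {
            "id": f"{container_iri}&page=0",
            "type": "AnnotationPage",
            "items": matches,
        },
    }


# Made-up sample data for illustration only.
sample = [
    {"id": "https://example.org/anno/1", "type": "Annotation",
     "target": "https://example.com/article"},
    {"id": "https://example.org/anno/2", "type": "Annotation",
     "target": {"source": "https://example.com/other",
                "selector": {"type": "TextQuoteSelector", "exact": "foo"}}},
]
result = filtered_container(
    "https://example.org/annotations/?uri=https%3A%2F%2Fexample.com%2Farticle",
    sample,
    "https://example.com/article",
)
print(result["total"])
```

Exact string comparison of IRIs is the simplest possible matching rule; a real service would probably want to normalise URLs (fragments, trailing slashes, http vs https) before comparing.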
Received on Monday, 18 May 2020 23:04:04 UTC