- From: Karin Verspoor <Karin.Verspoor@nicta.com.au>
- Date: Thu, 25 Jul 2013 13:41:22 +0000
- To: Dan Whaley <dwhaley@hypothes.is>
- CC: "public-openannotation@w3.org" <public-openannotation@w3.org>
I don't know the specific individuals at Google, but I do know people integrally involved with the development of the CLUEWeb corpus, as well as other people doing NLP/text mining at Google. I'd be happy to forward this thread on and keep you posted with any response.

BTW I am still working (in irregular and infrequent bursts) on trying to align OA with other text annotation models for corpora. Here is my current work-in-progress:

https://github.com/dbcls/bh13/wiki/Annotation-data-model

Any comments welcome. One key representational choice that many text/corpus annotation models assume is that a document can be divided into segments/sections/paragraphs and annotation spans might be relative to those segments.

Best regards,
Karin

On 18/07/2013, at 9:35 PM, Dan Whaley <dwhaley@hypothes.is> wrote:

http://googleresearch.blogspot.com/2013/07/11-billion-clues-in-800-million.html

If anyone knows them, it might be intriguing to know if they've looked at OA at all.

D

________________________________
The information in this e-mail may be confidential and subject to legal professional privilege and/or copyright. National ICT Australia Limited accepts no liability for any damage caused by this email or its attachments.
Received on Thursday, 25 July 2013 13:41:58 UTC