Re: Web annotated by freebase concepts

I don't know the specific individuals at Google, but I do know people closely involved in the development of the ClueWeb corpus, as well as other people doing NLP/text mining at Google. I'd be happy to forward this thread and keep you posted on any response.

BTW I am still working (in irregular and infrequent bursts) on aligning OA with other text annotation models for corpora. Here is my current work-in-progress: https://github.com/dbcls/bh13/wiki/Annotation-data-model Any comments are welcome. One key representational choice in many text/corpus annotation models is that a document can be divided into segments (sections, paragraphs), and that annotation spans may be expressed relative to those segments.
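To make the segment-relative choice concrete, here is a minimal sketch in Python of how a span given relative to a paragraph might be resolved to offsets in the whole document. It assumes paragraphs are separated by blank lines and offsets are counted in characters; the names `Segment`, `Span`, `split_paragraphs` and `to_absolute` are hypothetical and are not taken from OA or from the wiki page above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    segment_id: str
    start: int  # absolute character offset of the segment's first character
    end: int    # absolute character offset just past its last character


@dataclass(frozen=True)
class Span:
    segment_id: str
    start: int  # offset relative to the start of the segment
    end: int    # exclusive, also relative to the segment


def split_paragraphs(text):
    """Split text on blank lines (an assumption), recording absolute offsets."""
    segments = []
    pos = 0
    for i, chunk in enumerate(text.split("\n\n")):
        segments.append(Segment(f"p{i}", pos, pos + len(chunk)))
        pos += len(chunk) + 2  # skip the two newline characters
    return segments


def to_absolute(span, segments):
    """Resolve a segment-relative span to document-absolute offsets."""
    seg = next(s for s in segments if s.segment_id == span.segment_id)
    if not (0 <= span.start <= span.end <= seg.end - seg.start):
        raise ValueError("span falls outside its segment")
    return seg.start + span.start, seg.start + span.end


text = "Freebase is a knowledge base.\n\nClueWeb is a web corpus."
segments = split_paragraphs(text)
span = Span("p1", 0, 7)  # first seven characters of the second paragraph
start, end = to_absolute(span, segments)
print(text[start:end])
```

A model that only supports document-absolute offsets would store `(start, end)` directly; the segment-relative form keeps annotations stable when an unrelated paragraph elsewhere in the document changes.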

Best regards,
Karin

On 18/07/2013, at 9:35 PM, Dan Whaley <dwhaley@hypothes.is> wrote:

http://googleresearch.blogspot.com/2013/07/11-billion-clues-in-800-million.html

If anyone knows them, it might be intriguing to know if they've looked at OA at all.

D



Received on Thursday, 25 July 2013 13:41:58 UTC