- From: Peter Mika <pmika@yahoo-inc.com>
- Date: Wed, 3 Jun 2015 10:11:47 +0000 (UTC)
- To: Nicolas Torzec <torzecn@yahoo-inc.com>, Dan Brickley <danbri@google.com>, Barry Carter <carter.barry@gmail.com>
- Cc: Phil Barker <phil.barker@hw.ac.uk>, "schema.org Mailing List" <public-schemaorg@w3.org>
- Message-ID: <339210772.4101781.1433326307636.JavaMail.yahoo@mail.yahoo.com>
It's somewhat outdated by now, but at Yahoo Labs we built an index over the WDC data which allows you to do structured queries:

http://glimmer.research.yahoo.com/

The method is described in the following publication:

Roi Blanco, Peter Mika, Sebastiano Vigna: Effective and Efficient Entity Search in RDF Data. International Semantic Web Conference (1) 2011: 83-97

The code is open source at:

https://github.com/yahoo/Glimmer

Enjoy,
Peter

On Tuesday, June 2, 2015 10:46 PM, Nicolas Torzec <torzecn@yahoo-inc.com> wrote:

(message got stuck in my inbox yesterday)

I am not sure about the use case: e.g. how much do you care about freshness?

- Best bet is probably Google.
- Yahoo doesn't have anything like this publicly available today.
- I am not sure about Bing.

Sindice used to have something like that but has gone out of business as far as I understand. See [1] for a recap.

Did you look at Web Data Commons [2]? The structured data are extracted from the Common Crawl, openly licensed, and stored on S3 for convenience. One could build a stalled index on top of it if you care about random access and not much about freshness.

References:
[1]: http://www.dataversity.net/end-support-sindice-com-search-engine-history-lessons-learned-legacy-guest-post/
[2]: http://webdatacommons.org/

-Nicolas.

On Monday, June 1, 2015 5:11 PM, Dan Brickley <danbri@google.com> wrote:

On 1 June 2015 at 20:38, Barry Carter <carter.barry@gmail.com> wrote:
> Phil, I was referring to google's public search engine at google.com

Currently Custom Search would be your best bet w.r.t Google. Not sure what the other engines do.

cheers,

Dan

> On Mon, 1 Jun 2015, Phil Barker wrote:
>
>> Date: Mon, 01 Jun 2015 19:35:15 +0100
>> From: Phil Barker <phil.barker@hw.ac.uk>
>> To: public-schemaorg@w3.org
>> Subject: Re: Query schema.org data?
>> Resent-Date: Mon, 01 Jun 2015 18:35:51 +0000
>> Resent-From: public-schemaorg@w3.org
>>
>> With Google custom search engine, I think you should be able to choose the
>> option to "Restrict Pages using Schema.org Types" to Place and to add a
>> refinement something like more:p:Place-name:Texas
>>
>> On 01/06/15 19:12, Barry Carter wrote:
>>     Is it possible to query the data Google, Microsoft, Yahoo, etc
>>     collect from pages marked with schema.org tags?
>>
>>     For example, I tried googling "[more:Place:name:Texas]" (no quotes)
>>     as quasi-suggested by:
>>
>>     https://developers.google.com/custom-search/docs/structured_data
>>
>>     but got no results.
>>
>>     Of course, that page is specific to per-site custom queries, so my
>>     syntax may be wrong.
>>
>>     Is there any generic way (on any of the search engines using
>>     schema.org) to search for documents that have a schema.org Place
>>     named Texas in them?
>>
>>     [I realize schema.org itself has no data, but it still seemed to
>>     make a good post title]
>>
>> --
>> Phil Barker @philbarker
>> LRMI, Cetis, ICBL http://people.pjjk.net/phil
>> Heriot-Watt University
>>
>> Ubuntu: http://xkcd.com/456/
>> not so much an operating system as a learning opportunity.
>>
>
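
For anyone who wants to try the Custom Search route from a script, here is a minimal sketch of how the refinement Phil suggests could be assembled into a request URL. It assumes the Custom Search JSON API endpoint (https://www.googleapis.com/customsearch/v1) with its key, cx and q parameters; the API key and engine ID are placeholders you would have to supply, the helper names are made up for illustration, and the "more:p:Place-name:Texas" form is taken from Phil's message as-is and not verified against what a given engine actually indexes. The sketch only builds the URL and does not send it.

```python
from urllib.parse import urlencode

# Assumed endpoint of the Custom Search JSON API.
CSE_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_structured_query(schema_type, prop, value, keywords=""):
    """Build a query string using the more:p:<Type>-<property>:<value> form.

    The refinement syntax follows the suggestion in the thread; whether it
    matches anything depends on how the engine was configured.
    """
    refinement = f"more:p:{schema_type}-{prop}:{value}"
    return f"{keywords} {refinement}".strip()


def build_request_url(api_key, engine_id, query):
    """Return a request URL; api_key and engine_id are caller-supplied."""
    params = {"key": api_key, "cx": engine_id, "q": query}
    return f"{CSE_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    query = build_structured_query("Place", "name", "Texas")
    # YOUR_API_KEY and YOUR_ENGINE_ID are placeholders, not real credentials.
    print(build_request_url("YOUR_API_KEY", "YOUR_ENGINE_ID", query))
```

The query is kept separate from the URL construction so the same refinement string can also be pasted into a custom search engine's search box by hand.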
Received on Wednesday, 3 June 2015 10:13:14 UTC