Re: Index of Types -> Domains

I can't speak to where the domain usage counts on schema.org come from (Dan
Brickley might be able to address that), but on this point specifically:

> "... examples of domains implementing a particular Type or format in
> their pages"

Although the last structured data extraction was generated from a crawl that
is now more than a year old, the Web Data Commons
(http://www.webdatacommons.org/) does make statistics available on the "RDFa,
Microdata, Embedded JSON-LD, and Microformats" found in each crawl, as well
as the full corpus.  You can find the most recent data here:

Web Data Commons Extraction Report - October 2016 Corpus
http://www.webdatacommons.org/structureddata/2016-10/stats/stats.html

If you want to, you can also access and analyze Common Crawl data yourself
(data sets are generated monthly); learn more here:
http://commoncrawl.org/
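
If you do go the do-it-yourself route, here is a minimal sketch of how a
Type -> Domains index could be built. It is not how schema.org computes its
usage counts (I don't know how that is done). It assumes the input is in
N-Quads form with the graph label (fourth element) holding the URL of the
page each statement was extracted from, and it only handles lines whose
object and graph are plain IRIs. The function name and the sample data are
made up for illustration, and it groups by host name rather than by
registered domain.

```python
import re
from collections import defaultdict
from urllib.parse import urlparse

RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"
SCHEMA_PREFIXES = ("http://schema.org/", "https://schema.org/")

# Subject and predicate are any non-space tokens; object and graph must be IRIs.
# Quads with literal objects do not match and are skipped.
QUAD = re.compile(r'^(\S+)\s+(\S+)\s+(<[^>]*>)\s+(<[^>]*>)\s*\.\s*$')


def index_types_to_hosts(lines):
    """Map each schema.org type IRI to the set of host names that declare it.

    Hypothetical helper: assumes the fourth element of each quad is the
    URL of the page the statement came from.
    """
    index = defaultdict(set)
    for line in lines:
        match = QUAD.match(line.strip())
        if not match:
            continue
        _subject, predicate, obj, graph = match.groups()
        if predicate != RDF_TYPE:
            continue
        type_iri = obj[1:-1]  # strip the surrounding angle brackets
        if not type_iri.startswith(SCHEMA_PREFIXES):
            continue
        host = urlparse(graph[1:-1]).hostname
        if host:
            index[type_iri].add(host)
    return index


# Made-up sample quads, for illustration only.
SAMPLE = [
    '_:n1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Product> <http://shop.example.com/item/1> .',
    '_:n1 <http://schema.org/name> "Widget" <http://shop.example.com/item/1> .',
    '_:n2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Product> <https://store.example.org/p/2> .',
    '_:n3 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Recipe> <http://shop.example.com/r/3> .',
]

if __name__ == "__main__":
    for type_iri, hosts in sorted(index_types_to_hosts(SAMPLE).items()):
        print(type_iri, len(hosts), sorted(hosts))
```

Running the same function separately over the data for each syntax would
also give you the per-format breakdown, assuming you can obtain the data
split by format. For real data you would want a proper N-Quads parser and a
public-suffix list to collapse host names into registered domains.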


On Fri, Nov 3, 2017 at 11:14 AM, David Pierce <david.dean.pierce@gmail.com>
wrote:

> I've seen this come up a bit in the SEO community where webmasters and
> researchers are trying to look for examples of domains implementing a
> particular Type or format in their pages.
>
> Moreover, when I look at the documentation for any given type--Product
> <http://schema.org/Product>, for example--I see a note about its usage: *Usage:
> Over 1,000,000 domains*
>
> How is this calculated, where is it stored, and how might I make use of it?
> Has an index of Types to Domains already been built?
>
> If not, has anybody explored building one? At the outset, I imagine
> something complementing the aforementioned Type documentation. For each
> type, I want to see what sites include that in their schema.org markup,
> and what implementation they use (microdata vs. JSON-LD).
>
> If this has already been built, I'd love to make use of it for my own
> learning. If it hasn't...well, I suppose I might start working on building
> it.
>

Received on Friday, 3 November 2017 19:13:59 UTC