Re: a Schema Markup Validator: adopting SDTT for validator.schema.org

From: Hans Polak <info@polak.es>
Date: Fri, 18 Dec 2020 10:38:36 +0100
To: public-schemaorg@w3.org
Message-ID: <5decc3f7-2743-eca0-f155-c2fabf40ea73@polak.es>
Good morning,

The Schema Generator <https://schema.pythonanywhere.com/> tool creates 
(mostly) valid syntax for schema.org. One of the things I wanted to do 
was to integrate it with a validator tool. I hope there will be an easy 
way to access that feature.

When I test schemas, I often see the "Google-product-specific data 
checks". These are helpful for adding the properties that Google 
requires and for getting the data into the correct format.

Will there be an API (or something similar) that can take a schema and 
return the errors / suggestions?
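
To make the question concrete, here is a minimal sketch of the kind of 
interface meant above: a function that takes a JSON-LD snippet and returns 
a list of errors / suggestions. Everything in it is an assumption for 
illustration only. The name check_markup and the REQUIRED_PROPERTIES table 
are hypothetical and do not describe SDTT, the planned validator, or any 
real service's rules. It also simplifies JSON-LD by handling only a single 
object with a string @type.

```python
import json

# Hypothetical "shape": properties a consumer might require per type.
# These lists are illustrative assumptions, not rules from any real service.
REQUIRED_PROPERTIES = {
    "Product": ["name"],
    "Event": ["name", "startDate"],
}


def check_markup(text):
    """Return a list of error/suggestion strings for a JSON-LD snippet."""
    # Syntax level: the snippet must at least be well-formed JSON.
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"syntax: invalid JSON ({exc.msg} at line {exc.lineno})"]
    if not isinstance(data, dict):
        return ["syntax: expected a single JSON object"]

    messages = []
    if "schema.org" not in str(data.get("@context", "")):
        messages.append("syntax: @context does not reference schema.org")
    item_type = data.get("@type")
    if not isinstance(item_type, str):
        messages.append("syntax: missing or non-string @type")
        return messages

    # Shape level: compare against the hypothetical required properties.
    for prop in REQUIRED_PROPERTIES.get(item_type, []):
        if prop not in data:
            messages.append(f"shape: {item_type} is missing '{prop}'")
    return messages


if __name__ == "__main__":
    snippet = '{"@context": "https://schema.org", "@type": "Event", "name": "Concert"}'
    for message in check_markup(snippet):
        print(message)
```

The "syntax:" messages correspond to vendor-neutral checks, while the 
"shape:" messages stand in for service-specific requirements such as the 
Google checks mentioned above.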

Yours sincerely,
Hans Polak

On 15/12/20 16:46, Dan Brickley wrote:
> Schema.org folks (steering group, community group, everyone...),
>
> https://github.com/schemaorg/schemaorg/issues/2790 tracks a proposal 
> for a validator.schema.org <http://validator.schema.org> tool, to be 
> based on Google SDTT, and to be accompanied by opensource 
> collaboration on data shape validation and parser interoperability.
>
> Today my Google colleagues are sharing Google's plans for the future 
> of the Google Structured Data Testing Tool (SDTT) - see 
> https://developers.google.com/search/blog/2020/12/structured-data-testing-tool-update. 
> The intent is to rework it into a vendor-neutral tool that can 
> continue to serve as a markup syntax checker for JSON-LD, Microdata, 
> RDFa as used by the communities around Schema.org. Although it could 
> live on its own independent domain, it would make a great addition to 
> the Schema.org site, and I would like to proceed in that direction in 
> 2021, as part of Google's long term commitment to hosting the 
> Schema.org site and keeping it relevant for schema.org 
> <http://schema.org> users.
>
> The basic idea is that the service now known as "Google Structured 
> Data Testing Tool" would stop making Google-product-specific data 
> checks, but continue - as "Schema Markup Validator" - to serve as a 
> robust tool for checking JSON-LD, Microdata and RDFa schema markup. No 
> validator (or schema.org <http://schema.org> parser) is perfect, so 
> part of this work will involve documenting any shortcomings in the 
> parsers/validators, and collaboration with opensource implementers and 
> standards makers towards improving the ecosystem for everyone.
>
> In addition to syntax validation, there is also the more futuristic 
> topic of "shape validation". For those unfamiliar with this 
> distinction, syntax validation is about helping publishers get the 
> basic structure of JSON-LD, Microdata, RDFa correct, whereas shape 
> validation is about looking at the extracted structured data and 
> comparing it to the documented needs of various online services, to 
> see which features or tools it might be eligible for. SDTT currently 
> performs its own version of "shape checking" to identify markup that 
> matches the shapes needed by Google features, as listed in 
> https://developers.google.com/search/docs/guides/search-gallery. 
> However the intent is to turn this functionality /off/, so that the 
> testing tool becomes a simpler vendor-neutral offering focussed on 
> correctness of markup /syntax/.
>
> In addition to adopting a "degooglified" SDTT as a syntax-level 
> "Schema Markup Validator", I would also like in 2021 to continue some 
> collaboration around shape validation. This is the idea of using 
> relatively new web standards (SHACL, ShEx) to check structured data 
> for matching specific data patterns or "shapes". See 
> https://en.wikipedia.org/wiki/SHACL and 
> https://en.wikipedia.org/wiki/ShEx, or the free online book 
> "Validating RDF Data", https://book.validatingrdf.com/. Google 
> recently opensourced some JavaScript software 
> <https://github.com/google/schemarama/> in this area, which brings 
> together other opensource tooling to create a shape validation system 
> using both ShEx and SHACL. While it looks superficially like SDTT, the 
> focus is different: there is no syntax-level validation (which is why 
> the plan outlined above for SDTT is useful). Over time, we can explore 
> ways of integrating these different kinds of validation, but we can 
> make some very useful, simpler steps first by giving a reworked SDTT a 
> home under Schema.org.
>
> I've linked some more detailed notes on SDTT from the issue at 
> https://github.com/schemaorg/schemaorg/issues/2790 - or see 
> https://docs.google.com/document/d/1q8z_rRJepiz4Os_KcEs3NaCVEm3US5l-qYL14JmE0To/edit# 
> directly. Feel free to follow up here, in Github or the doc, ...
>
> cheers,
> Dan
Received on Friday, 18 December 2020 09:38:53 UTC
