Re: FW: a Schema Markup Validator: adopting SDTT for validator.schema.org

From: Dan Brickley <danbri@danbri.org>
Date: Tue, 15 Dec 2020 16:31:23 +0000
Message-ID: <CAFfrAFq85h19wJ45B2T12VxzCVq71sypshfunFQXeKmT1arcQg@mail.gmail.com>
To: "Gray, Alasdair J G" <A.J.G.Gray@hw.ac.uk>
Cc: "public-bioschemas@w3.org" <public-bioschemas@w3.org>
I think the Schemarama (shapes validation) piece, in particular, should be a
good fit for Bioschemas.

demo - http://schemarama-demo.site/
code - https://github.com/google/schemarama

Do you have some ShEx/SHACL shape files we could test it with?
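
For readers new to the idea, the kind of constraint such shape files express can be sketched in a few lines of Python. This is only an illustrative sketch, not how Schemarama works: the `DATASET_SHAPE` dictionary, the `check_shape` helper and the choice of required properties are all hypothetical, and a real shape would be written in ShEx or SHACL rather than as a Python dict. It also assumes `@type` is a single string, which JSON-LD does not guarantee.

```python
import json

# Hypothetical "shape" for a schema.org Dataset node: which properties are
# required and which Python type each value should have. The property choices
# are assumptions for illustration, not taken from any published profile.
DATASET_SHAPE = {
    "target_type": "Dataset",
    "properties": {
        "name": {"required": True, "datatype": str},
        "description": {"required": True, "datatype": str},
        "url": {"required": False, "datatype": str},
    },
}


def check_shape(node, shape):
    """Return a list of violation messages (empty if the node conforms)."""
    violations = []
    # A shape only applies to nodes of its target type.
    if node.get("@type") != shape["target_type"]:
        return violations
    for prop, rule in shape["properties"].items():
        if prop not in node:
            if rule["required"]:
                violations.append(f"missing required property: {prop}")
            continue
        if not isinstance(node[prop], rule["datatype"]):
            violations.append(f"{prop}: expected {rule['datatype'].__name__}")
    return violations


# Syntactically valid JSON-LD that nevertheless fails the shape:
# "description" is absent and "url" is a number rather than a string.
MARKUP = """
{"@context": "https://schema.org", "@type": "Dataset",
 "name": "Example dataset", "url": 42}
"""

node = json.loads(MARKUP)  # syntax-level step: the markup must at least parse
violations = check_shape(node, DATASET_SHAPE)  # shape-level step
for message in violations:
    print(message)
```

The example separates the two steps on purpose: `json.loads` succeeds because the markup is well formed, while `check_shape` still reports problems, which is the syntax versus shape distinction drawn in the quoted message.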


On Tue, 15 Dec 2020 at 16:00, Gray, Alasdair J G <A.J.G.Gray@hw.ac.uk>
wrote:

> This will be of interest to our community.
>
>
>
> Alasdair
>
>
>
> --
>
> Alasdair J G Gray
>
> Associate Professor in Computer Science,
> School of Mathematical and Computer Sciences
> Heriot-Watt University, Edinburgh, UK.
>
> Email: A.J.G.Gray@hw.ac.uk <A.J.G.Gray@hw.ac.uk>
> Web: http://www.macs.hw.ac.uk/~ajg33
> ORCID: http://orcid.org/0000-0002-5711-4872
> Office: Earl Mountbatten Building 1.39
> Twitter: @gray_alasdair
>
> *From: *"danbri@google.com" <danbri@google.com>
> *Date: *Tuesday, 15 December 2020 at 15:50
> *To: *"public-schemaorg@w3.org" <public-schemaorg@w3.org>, Tom Marsh <
> tmarsh@exchange.microsoft.com>, Stéphane Corlosquet <scorlosquet@gmail.com>,
> Yuliya Tihohod <tilid@yandex-team.ru>, "R.V. Guha" <guha@google.com>,
> Nicolas Torzec <torzecn@oath.com>
> *Subject: *a Schema Markup Validator: adopting SDTT for
> validator.schema.org
> *Resent from: *"public-schemaorg@w3.org" <public-schemaorg@w3.org>
> *Resent date: *Tuesday, 15 December 2020 at 15:46
>
> Schema.org folks (steering group, community group, everyone...),
>
>
>
> https://github.com/schemaorg/schemaorg/issues/2790 tracks a proposal for
> a validator.schema.org tool, to be based on Google SDTT, and to be
> accompanied by opensource collaboration on data shape validation and parser
> interoperability.
>
>
>
> Today my Google colleagues are sharing Google's plans for the future of
> the Google Structured Data Testing Tool (SDTT) - see
> https://developers.google.com/search/blog/2020/12/structured-data-testing-tool-update.
> The intent is to rework it into a vendor-neutral tool that can continue to
> serve as a markup syntax checker for JSON-LD, Microdata, RDFa as used by
> the communities around Schema.org. Although it could live on its own
> independent domain, it would make a great addition to the Schema.org site,
> and I would like to proceed in that direction in 2021, as part of Google's
> long term commitment to hosting the Schema.org site and keeping it relevant
> for schema.org users.
>
>
>
> The basic idea is that the service now known as "Google Structured Data
> Testing Tool" would stop making Google-product-specific data checks, but
> continue - as "Schema Markup Validator" - to serve as a robust tool for
> checking JSON-LD, Microdata and RDFa schema markup. No validator (or
> schema.org parser) is perfect, so part of this work will involve
> documenting any shortcomings in the parsers/validators, and collaboration
> with opensource implementers and standards makers towards improving the
> ecosystem for everyone.
>
>
>
> In addition to syntax validation, there is also the more futuristic topic
> of "shape validation". For those unfamiliar with this distinction, syntax
> validation is about helping publishers get the basic structure of JSON-LD,
> Microdata, RDFa correct, whereas shape validation is about looking at the
> extracted structured data and comparing it to the documented needs of
> various online services, to see which features or tools it might be
> eligible for. SDTT currently performs its own version of "shape checking"
> to identify markup that matches the shapes needed by Google features, as
> listed in https://developers.google.com/search/docs/guides/search-gallery.
> However the intent is to turn this functionality *off*, so that the
> testing tool becomes a simpler vendor-neutral offering focussed on
> correctness of markup *syntax*.
>
>
>
> In addition to adopting a "degooglified" SDTT as a syntax-level "Schema
> Markup Validator", I would also like in 2021 to continue some collaboration
> around shape validation. This is the idea of using relatively new web
> standards (SHACL, ShEx) to check whether structured data matches specific
> data patterns or "shapes". See https://en.wikipedia.org/wiki/SHACL and
> https://en.wikipedia.org/wiki/ShEx, or the free online book "Validating
> RDF Data", https://book.validatingrdf.com/. Google recently opensourced
> some JavaScript software <https://github.com/google/schemarama/> in this
> area, which brings together other opensource tooling to create a shape
> validation system using both ShEx and SHACL. While it looks superficially
> like SDTT, the focus is different: there is no syntax-level validation
> (which is why the plan outlined above for SDTT is useful). Over time, we
> can explore ways of integrating these different kinds of validation, but we
> can make some very useful, simpler steps first by giving a reworked SDTT a
> home under Schema.org.
>
>
>
> I've linked some more detailed notes on SDTT from the issue at
> https://github.com/schemaorg/schemaorg/issues/2790 - or see
> https://docs.google.com/document/d/1q8z_rRJepiz4Os_KcEs3NaCVEm3US5l-qYL14JmE0To/edit
> directly. Feel free to follow up here, in GitHub or the doc, ...
>
>
>
> cheers,
>
>
>
> Dan
Received on Tuesday, 15 December 2020 16:31:54 UTC