Re: Proposal to extend rangeIncludes of DataTypes predicates in schema.org

This is a nice analysis, thanks.
I support the approach of making the xsd datatypes subtypes of the 
schema.org datatypes. But we need the authority to do so; again, the 
request is for the sponsors of schema.org to declare that they accept 
XSD-datatype-typed RDF literals in addition to plain string literals. This 
may be hard to achieve.

Yes, the proposal originated from the errors found with the Google 
Structured Data Testing Tool. Actually, anyone can put whatever string they 
like between the quotes, in the object of birthDate for instance. This 
generates errors in page displays as well. No automatic tests/constraints 
are possible there, as the expected value is always a string. It can be 
2015-05-29T16:00:00+0200 or Marc-05-29T16:00:00+0200 or xyz.
The strong solution is to add xsd:date to the range of birthDate. 
Of course others can maybe test this, but I found the no-harm solution 
was to extend the rangeIncludes objects for key datatype predicates in 
schema.org (birthDate, startDate, endDate and schema:value). Other 
predicates are not concerned here, but I assume they will have the same 
issue and can be extended upon a real need.
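
To illustrate the kind of automatic test that becomes possible once a 
property is known to expect a date rather than an arbitrary string, here is 
a minimal sketch in Python. It is an illustration only, not something 
schema.org or the Google tool provides: the function name 
is_plausible_date is hypothetical, and the pattern is an assumption that 
accepts YYYY-MM-DD optionally followed by a time and a Z, +HHMM or +HH:MM 
offset. Note that the first example value uses the offset form +0200, 
which ISO 8601 allows but the xsd:dateTime lexical form (+02:00) does not.

```python
import re
from datetime import date

# Assumed pattern: YYYY-MM-DD, optionally followed by THH:MM:SS and an
# optional Z / +HHMM / +HH:MM offset. Time fields are only checked for
# shape (two digits each), not for range.
_ISO_PATTERN = re.compile(
    r"(\d{4})-(\d{2})-(\d{2})"
    r"(T\d{2}:\d{2}:\d{2}(Z|[+-]\d{2}:?\d{2})?)?"
)


def is_plausible_date(value: str) -> bool:
    """Return True if value looks like an ISO 8601 date or date-time."""
    match = _ISO_PATTERN.fullmatch(value)
    if match is None:
        return False
    year, month, day = (int(part) for part in match.groups()[:3])
    try:
        date(year, month, day)  # rejects e.g. month 13 or 30 February
    except ValueError:
        return False
    return True


# The three example values from the message above.
for candidate in ("2015-05-29T16:00:00+0200",
                  "Marc-05-29T16:00:00+0200",
                  "xyz"):
    print(candidate, is_plausible_date(candidate))
```

With a check of this kind, only the first of the three values would pass; 
the other two would be flagged instead of being accepted as strings.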

Kind Regards,

Marc Twagirumukiza | Agfa HealthCare
Senior Clinical Researcher | HE/Advanced Clinical Applications Research
T  +32 3444 8188 | M  +32 499 713 300

http://www.agfahealthcare.com
http://blog.agfahealthcare.com
Click on link to read important disclaimer: 
http://www.agfahealthcare.com/maildisclaimer 



From:   "martin.hepp@ebusiness-unibw.org" 
<martin.hepp@ebusiness-unibw.org>
To:     Marc Twagirumukiza/AXPZC/AGFA@AGFA
Cc:     W3C Web Schemas Task Force <public-vocabs@w3.org>
Date:   10/11/2014 13:28
Subject:        Re: Proposal to extend rangeIncludes of DataTypes 
predicates in schema.org



As an alternative, you could also make the schema.org datatypes subtypes 
of the xsd datatypes.

But IMO, the most important thing is to separate two things:

1. You want that major search engines tolerate XSD datatype information 
instead of plain strings for schema.org properties in RDFa. This must be 
implemented by the search engines; augmenting schema.org would not 
automatically mean that all consumers will handle that properly (though 
one could argue that a compliant consumer should do so, then).

2. It may also be that you want to encourage Web sites to add XSD datatype 
information to schema.org properties. This will not work, IMO, because
a) this is impossible in Microdata syntax (no datatype at instance level)
b) it has been one of the top data-quality issues in pre-schema.org times; 
you can simply not expect this to happen at broad scale.

So an RDF-based consumer will have to apply heuristics and data cleansing 
anyway, for all Microdata-based markup and for data that lacks datatype 
information.
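
A minimal sketch of such consumer-side cleansing, in Python, might look 
as follows. Everything here is an assumption made for illustration: the 
EXPECTED_DATATYPE table (in particular mapping startDate and endDate to 
xsd:date only), the function name add_datatype, and the simplified date 
check, which ignores the optional timezone and negative years that 
xsd:date permits. The idea is the same as the SPARQL CONSTRUCT rule 
mentioned later in the thread: take the datatype from the vocabulary and 
attach it to a plain literal, but only when the value actually fits.

```python
import re
from datetime import date
from typing import Optional, Tuple

XSD_DATE = "http://www.w3.org/2001/XMLSchema#date"

# Assumed lookup table: which XSD datatype a consumer expects per property.
EXPECTED_DATATYPE = {
    "http://schema.org/birthDate": XSD_DATE,
    "http://schema.org/startDate": XSD_DATE,
    "http://schema.org/endDate": XSD_DATE,
}

_DATE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def _is_simple_xsd_date(lexical: str) -> bool:
    """Check the plain YYYY-MM-DD form and that it is a real calendar day."""
    match = _DATE.fullmatch(lexical)
    if match is None:
        return False
    try:
        date(*(int(part) for part in match.groups()))
    except ValueError:
        return False
    return True


def add_datatype(prop: str, lexical: str) -> Tuple[str, Optional[str]]:
    """Return (lexical form, datatype IRI), or (original, None) if untyped."""
    datatype = EXPECTED_DATATYPE.get(prop)
    cleaned = lexical.strip()
    if datatype == XSD_DATE and _is_simple_xsd_date(cleaned):
        return cleaned, datatype
    # Unknown property or value that does not fit: leave it a plain string.
    return lexical, None


print(add_datatype("http://schema.org/birthDate", " 1980-02-29 "))
print(add_datatype("http://schema.org/birthDate", "xyz"))
```

Values that do not fit the expected datatype stay plain strings, so a 
consumer can report them separately rather than silently dropping them.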

For challenge #1 above, a confirmation that XSD datatype information is 
fine in RDFa and JSON-LD data would be more effective and sufficient. 
Adding it to schema.org adds only complexity for Web developers, since it 
introduces traditional Semantic Web architecture legacy. By design, 
schema.org is only loosely coupled with the Semantic Web vision - it does 
not try to break things for the Semantic Web world, but it also avoids 
signing up to details of the current state of Semantic Web architecture.
 
Have you tested whether the Google Structured Data Testing Tool complains 
about XSD datatype information? If not, we would have a non-issue.

Back in the GoodRelations-in-RDFa age, Google would have tolerated but not 
required XSD datatype information.

Martin


On 10 Nov 2014, at 11:44, Marc Twagirumukiza <marc.twagirumukiza@agfa.com> 
wrote:

> I agree, RDF-based consumers of data can fix the data by adding datatype 
information from the vocabulary to literals (e.g. with a SPARQL CONSTRUCT 
rule). 
> This approach has been used for a while but, semantically speaking, it has 
many limitations. 
> The best approach, I am convinced, is to extend the rangeIncludes objects. I 
would suggest not adding them systematically (only where needed), on 
predicates like birthDate, startDate, endDate. 
> This will not be a problem for other consumers, who will continue to use 
the traditional way. 
> The other approach may help as well: the sponsors of schema.org declare 
that they accept XSD-datatype-typed RDF literals in addition to plain 
string literals. 
> 
> Kind Regards,
> 
> Marc 
> 
> 
> 
> From:        "martin.hepp@ebusiness-unibw.org" 
<martin.hepp@ebusiness-unibw.org> 
> To:        Marc Twagirumukiza/AXPZC/AGFA@AGFA 
> Cc:        W3C Web Schemas Task Force <public-vocabs@w3.org> 
> Date:        10/11/2014 11:13 
> Subject:        Re: Proposal to extend rangeIncludes of DataTypes 
predicates in schema.org 
> 
> 
> 
> Yes, but on the other hand, the RDF approach of forcing to have datatype 
information at the level of each individual literal is IMO flawed. It can 
be explained from the history of RDF, with the need to work with data that 
has no vocabulary (then you need to know the datatype from the literal). 
So I am not convinced that it is really worthwhile to go down that road. Rather, 
RDF-based consumers of data should fix the data by adding datatype 
information from the vocabulary to literals (e.g. with a SPARQL CONSTRUCT 
rule).
> 
> The only use-case where I see a need for datatype information attached 
to literal values is when the vocabulary allows multiple datatypes that 
the client could not distinguish automatically (e.g. think of a property 
that allows an xsd:string and an xsd:boolean - then "True" may be a string or 
a boolean value). But that is a rare exception.
> 
> Martin
> 
> 
> 
> On 10 Nov 2014, at 11:02, Marc Twagirumukiza 
<marc.twagirumukiza@agfa.com> wrote:
> 
> > Hi Martin, 
> > Yes, it can also help if the sponsors of schema.org declare that they 
accept XSD-datatype-typed RDF literals in addition to plain string literal 
so the publisher can use XSD datatypes for typed literals in RDFa without 
the need to change schema.org. 
> > However, this has not been done so far, and looking at some examples 
given in the git repo (
https://github.com/rvguha/schemaorg/blob/master/data/examples.txt), it 
appears that only plain literals are accepted. That was the rationale of the 
proposal. 
> > One or the other approach can help to solve the problem. 
> > 
> > Kind Regards,
> > 
> > Marc Twagirumukiza | Agfa HealthCare
> > 
> > 
> > 
> > From:        "martin.hepp@ebusiness-unibw.org" 
<martin.hepp@ebusiness-unibw.org> 
> > To:        Marc Twagirumukiza/AXPZC/AGFA@AGFA 
> > Cc:        W3C Web Schemas Task Force <public-vocabs@w3.org> 
> > Date:        10/11/2014 10:53 
> > Subject:        Re: Proposal to extend rangeIncludes of DataTypes 
predicates in schema.org 
> > 
> > 
> > 
> > While this is an interesting direction, I think this would rather 
increase confusion - most schema.org datatypes have very close XSD 
counterparts, and an RDF-based consuming client could easily use 
heuristics to map from a schema.org datatype to an XSD datatype. 
> > 
> > Listing e.g. xsd:int and xsd:integer in parallel to schema.org:Number 
would make the human-readable display of schema.org more confusing.
> > 
> > It was a design decision of schema.org back then to use a 
self-contained meta-model, with datatypes and ontology language components 
(like rangeIncludes, sameAs, Class, Property,...) being defined locally 
inside schema.org. There are pros and cons for this approach, but in any 
case it is the established base.
> > 
> > Also, I think it would be sufficient if the sponsors of schema.org 
declare that they accept XSD-datatype-typed RDF literals in addition to 
plain string literals.
> > 
> > Then a publisher can use XSD datatypes for typed literals in RDFa 
without the need to change schema.org.
> > 
> > Martin
> > 
> > 
> > On 10 Nov 2014, at 10:08, Marc Twagirumukiza 
<marc.twagirumukiza@agfa.com> wrote:
> > 
> > > Hello there, 
> > > I would like to open a discussion about a proposal to extend the ranges 
of all datatype predicates with their corresponding XSD datatypes. 
> > > This will 'somehow' preserve the semantic use of the datatype predicates 
of schema.org. 
> > > 
> > > Most concerned properties are: 
> > > -birthDate:  http://schema.org/birthDate 
> > > -startDate:  http://schema.org/startDate 
> > > -endDate:  http://schema.org/endDate 
> > > 
> > > (I understand that the ranges of other datatype properties may need to 
be harmonised as well, but this can be done upon specific need). 
> > > 
> > > Also for 
> > > -value:  http://schema.org/value 
> > > 
> > > Eg. For birthDate: 
> > > 
> > > <div typeof="rdf:Property" resource="http://schema.org/birthDate">   
 
> > >      <span class="h" property="rdfs:label">birthDate</span> 
> > >      <span property="rdfs:comment">Date of birth.</span> 
> > >      <span>Domain: <a property="http://schema.org/domainIncludes" 
href="http://schema.org/Person">Person</a></span> 
> > >      <span>Range: <a property="http://schema.org/rangeIncludes" 
href="http://schema.org/Date">Date</a></span> 
> > >   + <span>Range: <a property="http://schema.org/rangeIncludes" 
href="http://www.w3.org/2001/XMLSchema#date">xsd:date</a></span> 
> > > </div> 
> > > 
> > > I would like to have your thoughts on this before I send the 
proposal to the Git repo. 
> > > 
> > > Kind Regards,
> > > 
> > > Marc Twagirumukiza | Agfa HealthCare
> > 
> > 
> > 
> 
> 
> 

Received on Monday, 10 November 2014 13:09:39 UTC