RE: Questions regarding the Schema.org validator

Thanks, Gregg. Let’s see if I’ve got the terminology right:

[1] asserts that WPSideBar is subClassOf WebPageElement, and further assertions made in that document let us infer, using rule “rdfs9”, that WPSideBar is also of type (an indirect subClassOf) WebPageElement, CreativeWork and Thing.

Back to the original question, how does this make WPSideBar a valid value for the member property of Organization that asserts via rangeIncludes that it must be an Organization or a Person?

--
Tony

From: Gregg Kellogg <gregg@greggkellogg.net>
Sent: Sunday, August 6, 2023 3:00 AM
To: Tony McCreath <tony@websiteadvantage.com.au>
Cc: Hédic Guibert <guiberthedic@gmail.com>; public-schemaorg@w3.org
Subject: Re: Questions regarding the Schema.org validator

On Aug 4, 2023, at 4:03 PM, Tony McCreath <tony@websiteadvantage.com.au> wrote:

Interesting stuff,

I found what I think are the RDFS entailment rules:

https://www.w3.org/TR/rdf11-mt/#rdfs-entailment


Gregg, could you explain how WPSideBar was inferred?

WPSideBar is asserted data from your example, not inferred. Looking at the definition at https://schema.org/WPSideBar, we can see the subclass hierarchy Thing > CreativeWork > WebPageElement > WPSideBar, so we can infer those types. Specifically, “rdfs9” states: xxx rdfs:subClassOf yyy . zzz rdf:type xxx . => zzz rdf:type yyy . If you look at the vocabulary definition in Turtle [1], you’ll see the rdfs:subClassOf relationships of the associated terms.

schema:WPSideBar a rdfs:Class ;
    rdfs:label "WPSideBar" ;
    rdfs:comment "A sidebar section of the page." ;
    rdfs:subClassOf schema:WebPageElement .
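
As a rough illustration of how rule rdfs9 produces the inferred types, here is a minimal Python sketch. The SUBCLASS_OF table is a hand-copied subset of the hierarchy described above (an assumption for illustration, not loaded from the vocabulary file), and infer_types is a hypothetical helper, not part of any linter. rdfs:Resource is left out because it comes from other entailment rules.

```python
# Hand-copied subset of the schema.org hierarchy (assumption for illustration).
SUBCLASS_OF = {
    "schema:WPSideBar": {"schema:WebPageElement"},
    "schema:WebPageElement": {"schema:CreativeWork"},
    "schema:CreativeWork": {"schema:Thing"},
}


def infer_types(asserted):
    """Apply rdfs9 repeatedly: if a node has type X and X is a
    subclass of Y, the node also has type Y."""
    inferred = set(asserted)
    frontier = list(asserted)
    while frontier:
        cls = frontier.pop()
        for parent in SUBCLASS_OF.get(cls, ()):
            if parent not in inferred:
                inferred.add(parent)
                frontier.append(parent)
    return inferred


if __name__ == "__main__":
    # Types of the "member" value from the example document.
    print(sorted(infer_types({"schema:WPSideBar"})))
```

Note that nothing in this closure ever yields schema:Organization or schema:Person; the rule only adds super-types and never rejects anything.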

RDFS can also infer types based on rdfs:range and rdfs:domain, but those aren’t used in schema.org. Instead, schema:rangeIncludes and schema:domainIncludes are similar, but can’t really be used for inference. The Linter has logic to treat these like the RDFS variations if there is only a single value for domainIncludes/rangeIncludes.
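
The single-value behaviour described above might be sketched as follows. This is only a guess at the shape of the logic, not the Linter's actual code: RANGE_INCLUDES is a hypothetical lookup table, ex:soleRange and ex:Widget are made-up names, and only the schema:member entry reflects values mentioned in this thread.

```python
# Hypothetical lookup table; only the schema:member entry comes from the thread.
RANGE_INCLUDES = {
    "schema:member": ["schema:Organization", "schema:Person"],
    "ex:soleRange": ["ex:Widget"],  # made-up property with a single range value
}


def inferred_value_type(prop):
    """Treat rangeIncludes like rdfs:range only when it has exactly
    one value; otherwise infer nothing about the value's type."""
    ranges = RANGE_INCLUDES.get(prop, [])
    return ranges[0] if len(ranges) == 1 else None


if __name__ == "__main__":
    print(inferred_value_type("schema:member"))  # two candidates, nothing inferred
    print(inferred_value_type("ex:soleRange"))   # one candidate, treated as the range
```

Under this reading, schema:member contributes no inferred type to its value, so a WPSideBar value is never put in conflict with Organization or Person.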

Gregg

[1] https://github.com/schemaorg/schemaorg/blob/bd3df4106937863a6ae9351fcb4782b67a016357/data/releases/22.0/schemaorg-all-http.ttl#L5279-L5282



Tony


On Aug 2, 2023, at 8:09 AM, Hédic Guibert <guiberthedic@gmail.com> wrote:

Hello Schema.org,

I have a few questions regarding the JSON-LD specifications and the Schema.org validator.
I am working on the JSON-LD format and I noticed some mismatches between what I understand of the specifications and what the validator is doing.
Chances are I just did not read the specifications correctly. However, after having looked into it with attention, I still can't say if I am mistaken or if the validator is doing a few things wrong.

My first question is about the validation of typed values using the Schema.org vocabulary. The Schema.org vocabulary expects some precise types for typed values as written in the JSON-LD documentation<https://www.w3.org/TR/json-ld11/#specifying-the-type>. Therefore, from what I understand of the specifications, the following document should be invalid for the Schema.org vocabulary:

{
  "@context": "https://schema.org",
  "@type": "Organization",
  "member": {
    "@type": "WPSideBar",
    "description": "1977"
  }
}

An alternative way to validate your input is using the Structured Data Linter (http://linter.structured-data.org/, although you’ll need to enclose the JSON-LD in a script tag for it to be properly extracted). The linter uses RDFS entailment along with some schema.org-specific rules to infer relationships, and sometimes generates validation errors. In this case, it infers additional types on the “member” value, which are not incompatible with the stated WPSideBar value. I get the following inferred types: schema:WPSideBar, schema:Thing, schema:CreativeWork, schema:WebPageElement, and rdfs:Resource.

Since the member<https://schema.org/member> property expects an Organization<https://schema.org/Organization> or a Person<https://schema.org/Person> as its value, I expected this to be invalid since the WPSideBar<https://schema.org/WPSideBar> type is not a child of any of these types. However, when validating it with the Schema.org validator, this is considered a valid type. So, maybe there is something I did not understand about typed values?
And I can't understand why, if the above example is valid, the following example is invalid:

{
  "@context": "https://schema.org",
  "@type": "Organization",
  "member": {
    "@type": "XPathType",
    "description": "1977"
  }
}

To me, both should be invalid. But if one is valid, the other probably should be as well? This, or I missed something.

That’s not really how RDF(S) entailment works. schema:member has a rangeIncludes of Organization or Person and a domainIncludes of Organization and ProgramMembership. Because of the loose semantics of both relationships, the range or domain could include other things, so we can’t really infer any relationships (or restrictions) based on the vocabulary definitions. Note that rdfs:range and rdfs:domain have different semantics, from which we could determine inconsistencies, but rangeIncludes and domainIncludes were specifically designed to allow for a more open-world interpretation where other types could be added, or inferred based on the actual properties used.

We do see that XPathType is a DataType, which should be part of a typed literal and not a node type, but I don’t believe there’s any logic (or restriction) that prevents a datatype from being used as a node type (this could be an improvement on the reasoner that the linter uses, but would be schema.org-specific).



My second question is about specifying the type, as described here<https://www.w3.org/TR/json-ld11/#specifying-the-type>. It is written that the @type entry is optional and that the type may be inferred from its properties. The specifications specifically write this about node objects. So, to my understanding, this document should be valid:

{
  "@context": "http://schema.org/",
  "name": "Jane Doe",
  "jobTitle": "Professor",
  "telephone": "(425) 123-4567",
  "url": "http://www.janedoe.com"
}

But it is not. However, the type is not required on typed values. So, maybe the type is actually required on node objects but not on typed values?

As Phil indicated, for node objects, @type is never required. For value objects, it’s necessary to distinguish the value from being a simple string or language-tagged string. In some cases, a node type can be inferred, but it’s complicated when the vocabulary suggests any of a number of possible types; the best we could do is infer the closest related super-type (same for rangeIncludes and values).
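
To make the node object versus value object distinction concrete, here is a small Python sketch based on the JSON-LD 1.1 definitions: a value object is a map with an @value entry, and its @type (if present) must be a single string. The classify function is a hypothetical helper that works on raw, unexpanded JSON and ignores details such as graph objects and keyword aliases; the xsd:gYear example is an illustration, not something taken from this thread.

```python
def classify(obj):
    """Roughly classify a JSON-LD value by its shape."""
    if not isinstance(obj, dict):
        return "scalar"
    if "@value" in obj:
        declared = obj.get("@type")
        # Value objects may carry at most one type, given as a string.
        if declared is not None and not isinstance(declared, str):
            return "invalid value object"
        return "value object"
    if "@list" in obj or "@set" in obj:
        return "list or set object"
    # No @value, @list or @set: a node object, which may have several types.
    return "node object"


if __name__ == "__main__":
    # The "member" value from the examples has no @value entry.
    print(classify({"@type": "WPSideBar", "description": "1977"}))
    # An illustrative typed value.
    print(classify({"@value": "1977", "@type": "xsd:gYear"}))
```

By this test, the "member" values in all of the example documents are node objects, so the single-type restriction on value objects does not apply to them.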



Lastly, it is written in the specifications (here<https://www.w3.org/TR/json-ld11/#specifying-the-type>) that
In addition to setting the type of nodes, @type can also be used to set the type of a value to create a typed value. This use of @type is similar to that used to define the type of a node object, but value objects are restricted to having just a single type.

However, if I use the validator on the following JSON, it is considered a valid JSON-LD document.

{
  "@context": "https://schema.org",
  "@type": "Organization",
  "member": {
    "@type": ["WPSideBar", "WPHeader"],
    "description": "1977"
  }
}

Shouldn't this document be invalid?

In this case the types of the node object value are legitimate classes, which are both subclasses of WebPageElement, so not inconsistent.

Admittedly, the semantics of schema.org properties and classes is a bit fuzzy, and while we can often find semantic inconsistencies, we can’t always do so.

Gregg



Considering my last 2 questions, I might be confused about the terminology.
However, I would be glad if you could enlighten me on these questions!

Sorry for this quite long email!

Respectfully,

Hédic Guibert

Received on Saturday, 5 August 2023 23:07:55 UTC