Extending SHACL Core (was: Update and opportunities with SHACL)

(Clarified subject line to be about extending the SHACL core only. Hydra 
mailing list kept in the loop, please ignore if considered off-topic)

On 8/12/2015 6:39, Markus Lanthaler wrote:
> Since we just talked about Symfony on a different thread (on 
> public-hydra) I had something like this in mind: 
> http://symfony.com/doc/current/book/validation.html#supported-constraints 
> There are heaps of data validation libraries out there. I think 
> learning from them would be a good idea. Some of these things are 
> already supported by SHACL, some can be worked around or approximated. 
> But just looking at a very basic example like email validation already 
> shows how complex this can become in practice.

Yes, comparing to other validation libraries is useful - thanks for the 
pointer. I skimmed through the constraint list of Symfony, and believe 
SHACL covers many of them. I have marked the missing ones with a +.

Basic Constraints
- NotBlank/Blank/NotNull: sh:minCount, sh:maxCount
- Null: Doesn't apply to RDF
- True: sh:hasValue true
- False: sh:hasValue false
- Type: sh:datatype, sh:valueClass

String Constraints
+ Email: there seems to be no simple regex to validate all emails [1], 
but there are approximations. If we are limiting ourselves to SPARQL 
regex then there is a trade-off. Doing this in JS provides more options.
- Length: sh:minLength, sh:maxLength
- Url: Approx sh:nodeKind=sh:IRI or sh:datatype=xsd:anyURI
- Regex: sh:pattern
+ Ip: Should be doable with a regex
+ Uuid: Should be doable with a regex
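
As a rough illustration of the regex route for Email, Ip and Uuid, here is 
a minimal Python sketch. The three patterns and the function names are 
assumptions made for this example; they are not taken from Symfony or from 
any SHACL draft. The email pattern is deliberately loose, which is the 
trade-off mentioned above, and only IPv4 is covered.

```python
import re

# Hypothetical patterns for illustration only.
# Loose email approximation: something@something.something, no whitespace.
EMAIL_PATTERN = r"^[^@\s]+@[^@\s]+\.[^@\s]+$"

# One IPv4 octet in the range 0-255, without leading zeros.
_OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])"
IPV4_PATTERN = r"^" + _OCTET + r"(?:\." + _OCTET + r"){3}$"

# Canonical 8-4-4-4-12 hexadecimal UUID layout (version bits not checked).
UUID_PATTERN = (
    r"^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
    r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$"
)


def matches(pattern: str, value: str) -> bool:
    """Return True if the whole value matches the pattern."""
    return re.fullmatch(pattern, value) is not None


def is_email(value: str) -> bool:
    return matches(EMAIL_PATTERN, value)


def is_ipv4(value: str) -> bool:
    return matches(IPV4_PATTERN, value)


def is_uuid(value: str) -> bool:
    return matches(UUID_PATTERN, value)
```

Patterns of this kind could in principle be used as sh:pattern values, 
subject to whatever regex dialect the SPARQL engine supports.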

Number Constraints
- Range: sh:minInclusive etc

Comparison Constraints
(Here I am not sure if they allow the values of multiple properties to 
be compared with each other - that would be outside of the core language 
right now).
- EqualTo: sh:hasValue
+ NotEqualTo:
+ IdenticalTo: sh:hasValue currently has SPARQL matching semantics, i.e. 
"1"^^xsd:integer would be equal to "1"^^xsd:float. Exact equality cannot be 
checked right now.
+ NotIdenticalTo:
- LessThan etc: sh:minInclusive etc
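
The difference between EqualTo and IdenticalTo can be sketched in a few 
lines of Python. This is a simplified model under stated assumptions: the 
Literal class and both functions are hypothetical, no RDF library is used, 
and numeric comparison goes through float, which loses precision for very 
large integers.

```python
from dataclasses import dataclass

XSD = "http://www.w3.org/2001/XMLSchema#"
NUMERIC_TYPES = {XSD + "integer", XSD + "decimal", XSD + "float", XSD + "double"}


@dataclass(frozen=True)
class Literal:
    """Hypothetical stand-in for an RDF literal: lexical form plus datatype."""
    lexical: str
    datatype: str


def value_equal(a: Literal, b: Literal) -> bool:
    """Value-based comparison, roughly like the SPARQL = operator."""
    if a.datatype in NUMERIC_TYPES and b.datatype in NUMERIC_TYPES:
        # Compare numeric values regardless of the concrete numeric datatype.
        return float(a.lexical) == float(b.lexical)
    # Simplification: fall back to term identity for everything else.
    return a == b


def identical(a: Literal, b: Literal) -> bool:
    """Term-based comparison: lexical form and datatype must both match."""
    return a == b


one_int = Literal("1", XSD + "integer")
one_float = Literal("1", XSD + "float")
print(value_equal(one_int, one_float))  # value comparison
print(identical(one_int, one_float))    # term comparison
```

A flag on sh:hasValue would essentially select between these two 
comparison functions.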

Date Constraints
- Date: sh:datatype=xsd:date
- DateTime: sh:datatype=xsd:dateTime
- Time: sh:datatype=xsd:time

Collection Constraints
- Choice: sh:allowedValues
+ Collection: (quite complex)
+ Count: (could be expressed in SPARQL)
+ UniqueEntity: reminds me of primary keys, not in SHACL core
+ Language/Locale/Country: Would all require some look-up tables (named 
graphs, services) that may change. Not difficult but busy work.

File Constraints
+ File: Doesn't apply to SHACL
+ Image: Would require a mime type look-up, could be done in JS

Financial and other Number Constraints
+ Cardscheme, Currency, Luhn, Iban, Isbn, Issn: Again, this is not 
difficult but would require stable references to master data - who would 
maintain that when the SHACL WG ends?
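
For comparison, the Luhn check on its own is a plain checksum over the 
digits. The sketch below is a generic Python version of the well-known 
algorithm, not Symfony's implementation; the function name is an 
assumption. Constraints such as Cardscheme or Currency would still need 
reference data on top of a check like this.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    compact = number.replace(" ", "")
    # Accept plain ASCII digits only; reject empty input.
    if not (compact.isascii() and compact.isdigit()):
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for index, char in enumerate(reversed(compact)):
        digit = int(char)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return total % 10 == 0
```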

Other Constraints
- Callback, Expression: sh:sparql, sh:jsExpression (if this gets done), etc.
+ All: We had discussed support for collections, but it is not implemented yet
+ UserPassword: Doesn't apply to SHACL
- Valid: sh:valueShape


Reflecting on this list, I believe it is clear that the WG needs to make 
a trade-off here. Yes, we could include many of the missing ones, 
especially those that map to simple regular expressions. But this 
could quickly become a slippery slope. For example, if we add Country, 
then why not State, and why only the US states and not the Australian 
ones? OTOH, the distinction between EqualTo and IdenticalTo may be 
relevant, and could become a flag on sh:hasValue. Having said this, any 
SHACL library can represent all these things, and I hope many students 
and volunteers can work on specific topics such as "SHACL template 
library for US Addresses". We have put the macro mechanisms in place 
(more on this in a separate email).

I am certain the WG will discuss this topic further, but I believe it 
would be generally helpful to collect other constraint libraries (such 
as Symfony) to make sure we have not missed anything obvious. Pointers welcome.

Holger

[1] http://stackoverflow.com/questions/201323/using-a-regular-expression-to-validate-an-email-address

Received on Wednesday, 12 August 2015 00:35:37 UTC