Re: Release of ShEx.ex 0.1

> On Jul 16, 2019, at 5:39 PM, Eric Prud'hommeaux <eric@w3.org> wrote:
> 
> 
> On Mon, Jul 15, 2019 at 01:44:35PM -0700, Marcel Otto wrote:
>> 
>> 
>> Hi,
>> 
>> 
>> 
>> 
>> 
>> I'm very happy to announce the first release of ShEx.ex, an Elixir implementation of the ShEx and ShapeMap specs. You can find the source at https://github.com/marcelotto/shex-ex and a short guide at https://rdf-elixir.dev/shex-ex/
>> 
> 
> 
> 
> 
> Very cool! I'll add this to shex.io.
> 
> 
> 

Awesome.
> 
>> 
>> 
>> One distinguishing feature of ShEx.ex might be its out-of-the-box support for parallel processing of large numbers of nodes. This feature is, however, still considered experimental, as it currently lacks empirically founded parameters for the workload distribution (batch sizes etc.). For this reason, I want to ask whether there are public example data and schemas for testing and comparison purposes.
>> 
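To make the "batch sizes" parameter concrete, here is a minimal sketch of splitting a list of nodes into fixed-size batches and validating the batches concurrently. It is not how ShEx.ex is implemented: `chunk`, `validateInBatches` and the `validateNode` callback are hypothetical names, and JavaScript only interleaves asynchronous work, so this shows the shape of the workload distribution rather than real parallelism.

```javascript
// Hypothetical helper: split `items` into batches of at most `batchSize`.
function chunk(items, batchSize) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// Hypothetical driver: each batch is handled by one async task,
// nodes inside a batch are validated one after another.
// `validateNode` is an assumed callback returning a result (or a Promise of one).
async function validateInBatches(nodes, batchSize, validateNode) {
  const perBatch = await Promise.all(
    chunk(nodes, batchSize).map(async (batch) => {
      const results = [];
      for (const node of batch) {
        results.push(await validateNode(node));
      }
      return results;
    })
  );
  // Flatten back into one list, in the original node order.
  return perBatch.flat();
}
```

The batch size is the knob that would need empirical tuning: small batches mean more scheduling overhead, large batches mean less concurrency.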
> 
> 
> 
> 
> On <http://build.fhir.org/downloads.html> you'll find ShEx and Turtle downloads. Let me know (try <https://gitter.im/ericprud>, though I'll be on vacation and AFK some) if they need a little TLC.
> 
> 
> 

Great, any additional test data helps. But the Turtle examples at http://build.fhir.org/examples-ttl.zip lead me to a 404.
>> 
>> 
>> ShEx.ex already passes large parts of the official test suite. However, here are the ones that still fail:
>> 
>> 
>> 
>> 
>> 
>> - the `negativeStructure` tests are not passing yet, because the schema is not yet validated for the respective problems at schema creation time (I hope to deliver this soon)
>> - the following features are in general not supported yet, so all tests with the respective traits are not passing: imports, external shapes, annotations, semantic actions
>> - `1literalPattern_with_ascii_boundaries_fail` and `1literalPattern_with_all_controls_fail` are failing because of some issues with non-ASCII characters
>> 
> 
> 
> 
> 
> In case you also run into character encoding probs, you might need to use UTF-16 surrogate pairs for characters outside the BMP. For a recent example, in [ShAcE], I had to express a character in the range U+10000 - U+EFFFF as [\uD800-\uDB7F][\uDC00-\uDCFF].
> 
> 
> var PN_CHARS_BASE_RE = '(?:[a-zA-Z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF\u0370-\u037D\u037F-\u1FFF\u200C-\u200D\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF\uF900-\uFDCF\uFDF0-\uFFFD]|[\uD800-\uDB7F][\uDC00-\uDCFF])' // last is UTF16 for \U00010000-\U000EFFFF
> 
> 
> [ShAcE] https://github.com/shexSpec/ace-shexc-support/blob/master/lib/ace/mode/shexc_highlight_rules.js#L63
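For anyone checking such ranges by hand, the standard UTF-16 arithmetic can be sketched as below. `toSurrogatePair` is a hypothetical helper, not part of ShAcE or ShEx.ex. It subtracts 0x10000 from the code point and splits the remaining 20 bits into a high and a low surrogate. Applied to the two ends of U+10000 - U+EFFFF, it gives D800 to DB7F for the high half, and a low half that starts at DC00 and can reach DFFF.

```javascript
// Hypothetical helper: UTF-16 surrogate pair for a code point outside the BMP.
function toSurrogatePair(codePoint) {
  if (codePoint < 0x10000 || codePoint > 0x10FFFF) {
    throw new RangeError("code point must be in U+10000..U+10FFFF");
  }
  const offset = codePoint - 0x10000;      // 20-bit value
  const high = 0xD800 + (offset >> 10);    // top 10 bits
  const low = 0xDC00 + (offset & 0x3FF);   // bottom 10 bits
  return [high, low];
}

const hex = (n) => n.toString(16).toUpperCase();

const first = toSurrogatePair(0x10000).map(hex); // ["D800", "DC00"]
const last = toSurrogatePair(0xEFFFF).map(hex);  // ["DB7F", "DFFF"]

// In engines that support the `u` flag, the same range can be written
// directly with code point escapes instead of surrogate pairs.
const astral = /^[\u{10000}-\u{EFFFF}]$/u;
const matchesEmoji = astral.test(String.fromCodePoint(0x1F600)); // true
```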
>> 
>> 
>> - `nPlus1` and `PTstar-greedy-fail` are failing because of an issue with greediness
>> - `FocusIRI2groupBnodeNested2groupIRIRef` and `FocusIRI2EachBnodeNested2EachIRIRef` are failing
>> - A major issue for now is the limited set of supported datatypes in RDF.ex (on top of which ShEx.ex is implemented): xsd:boolean, xsd:integer, xsd:decimal, xsd:double, xsd:time, xsd:date, xsd:dateTime.  This limits the applicability of numeric value constraints and the lexical form checks for datatype constraints and makes 29 tests using unsupported datatypes in these circumstances fail. Addressing the limited set of supported datatypes is one of the next planned features for RDF.ex, but will take some time, as I'm generally not happy with the current implementation of the XSD datatypes and want to do a rewrite of that part.
>> - I also had some struggles with the JSON-based format for ShapeMaps used in the tests with the `sht:ShapeMap` trait, as I couldn't find any information about it. The only place it seems to be mentioned is in "Example 1" of the ShapeMap spec, but that doesn't give any further details on how to encode, for example, literals. Is this format even intended to be used outside of the test suite?
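As a rough illustration of what reading such a JSON ShapeMap might look like, the sketch below assumes the encoding is an array of objects with `node` and `shape` string keys, which is an assumption based on the spec example mentioned above. `parseFixedShapeMap` is a hypothetical name and the IRIs are placeholders. Since the encoding of literals is exactly the open question, the sketch treats `node` as an opaque string and does not try to interpret it.

```javascript
// Assumed input shape: [{ "node": "...", "shape": "..." }, ...]
// The IRIs below are placeholders for illustration only.
const json = `[
  { "node": "http://data.example/node1", "shape": "http://schema.example/Shape2" }
]`;

// Hypothetical parser: returns a list of { node, shape } associations.
function parseFixedShapeMap(text) {
  const entries = JSON.parse(text);
  if (!Array.isArray(entries)) {
    throw new TypeError("expected a JSON array of node/shape associations");
  }
  return entries.map((entry, i) => {
    if (entry === null || typeof entry !== "object" ||
        typeof entry.node !== "string" || typeof entry.shape !== "string") {
      throw new TypeError(`entry ${i} needs string "node" and "shape" keys`);
    }
    // `node` is kept as an opaque string; literal encoding is not handled here.
    return { node: entry.node, shape: entry.shape };
  });
}

const associations = parseFixedShapeMap(json);
```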
>> 
> 
> 
> 
> 
> Those tests pre-date the ShapeMap spec and were just to test the idea of having ShapeMaps rather than a particular format. Let's make those more useful.
> 
> 
> Another convention that's emerged is shape manifests à la <https://github.com/shexSpec/schemas/blob/master/Wikidata/DigitalPreservation/manifest_software_wikiDP.json>. I'd like to swap the test manifests to use this convention where possible so implementors don't end up writing special manifest code just for the test harness. Don't you wish I'd mentioned that before you started?
> 
> 
> 

Interesting, and it would have been good to know, but on the other hand it wouldn’t have changed that much, since I first wanted to cover only what’s in the specs. As long as these shape manifests become part of the specs somehow, I much prefer this over the current need to support an unspecified format just for the test suite.
> 
>> 
>> 
>> I hope you find ShEx.ex nevertheless useful already and would be happy to hear your thoughts on it.
>> 
>> 
>> 
>> 
>> 
>> Best,
>> Marcel
>> 

Received on Tuesday, 16 July 2019 20:43:12 UTC