- From: Jose Emilio Labra Gayo <jelabra@gmail.com>
- Date: Tue, 16 Jul 2019 01:02:46 +0200
- To: Marcel Otto <marcelotto.de@googlemail.com>
- Cc: public-shex@w3.org
- Message-ID: <CAJadXXJ18E1Sq3c-q5d9F_FjQ-JPiWEgk5=QwWcNdXO1xxfkTA@mail.gmail.com>
Congratulations on the new ShEx implementation. It is really great to have a new one.

Regarding public data and examples for benchmarks: two years ago, we wrote a paper in which we proposed a possible benchmark based on a real project called the WebIndex. I implemented a program that generates both valid and invalid RDF data according to a data model inspired by the WebIndex model. That work was published as a draft paper [1].

I was planning to resume that work once there were more implementations and/or I had more time. Maybe you want to reuse part of that work and test your system with it. If you do, let me know if I can help.

The source code of the benchmark data generation tool is here: http://labra.weso.es/wiGen/

[1] Validating and describing linked data portals using shapes, Jose-Emilio Labra-Gayo, Eric Prud'hommeaux, Harold Solbrig, Iovka Boneva, arXiv:1701.08924 [cs.DB] https://arxiv.org/abs/1701.08924

Best regards, Jose Labra

On Mon, Jul 15, 2019 at 10:45 PM Marcel Otto <marcelotto.de@googlemail.com> wrote:

> Hi,
>
> I'm very happy to announce the first release of ShEx.ex, an Elixir
> implementation of the ShEx and ShapeMap specs. You can find the source at
> https://github.com/marcelotto/shex-ex and a short guide at
> https://rdf-elixir.dev/shex-ex/
>
> One distinguishing feature of ShEx.ex might be its support for parallel
> processing of larger amounts of nodes out-of-the-box. This feature is,
> however, still considered experimental, as it currently lacks empirically
> founded parameters for the workload distribution (batch sizes etc). For
> this reason, I want to ask if there are some public example data and
> schemas for testing and comparison purposes.
>
> ShEx.ex already passes large parts of the official test suite.
> However, here are the ones that still fail:
>
> - the `negativeStructure` tests are not passing yet, because the schema is
> not yet validated for the respective problems at schema creation time (I
> hope to deliver this soon)
> - the following features are in general not supported yet, so all tests
> with the respective traits are not passing: imports, external shapes,
> annotations, semantic actions
> - `1literalPattern_with_ascii_boundaries_fail` and
> `1literalPattern_with_all_controls_fail` are failing because of some issues
> with non-ASCII characters
> - `nPlus1` and `PTstar-greedy-fail` are failing because of an issue with
> greediness
> - `FocusIRI2groupBnodeNested2groupIRIRef` and
> `FocusIRI2EachBnodeNested2EachIRIRef` are failing
> - A major issue for now is the limited set of supported datatypes in
> RDF.ex (on top of which ShEx.ex is implemented): xsd:boolean, xsd:integer,
> xsd:decimal, xsd:double, xsd:time, xsd:date, xsd:dateTime. This limits the
> applicability of numeric value constraints and the lexical form checks for
> datatype constraints and makes 29 tests using unsupported datatypes in
> these circumstances fail. Addressing the limited set of supported datatypes
> is one of the next planned features for RDF.ex, but will take some time, as
> I'm generally not happy with the current implementation of the XSD
> datatypes and want to do a rewrite of that part.
> - I also had some struggles with the JSON-based format for ShapeMaps used
> in the tests with the `sht:ShapeMap` trait, as I couldn't find any
> information about it. The only place it seems to be mentioned is in
> "Example 1" of the ShapeMap spec. But it doesn't mention any further
> details on how to encode, for example, literals. Is this format even
> intended to be used outside of the test suite?
>
> I hope you find ShEx.ex nevertheless useful already and would be happy to
> hear your thoughts on it.
>
> Best,
> Marcel
>
>

--
-- Jose Labra
Received on Monday, 15 July 2019 23:03:22 UTC