Re: JavaScript JSON-LD Streaming Parser

From: Ruben Taelman <ruben.taelman@ugent.be>
Date: Tue, 12 Feb 2019 10:20:01 +0100
To: Gregg Kellogg <gregg@greggkellogg.net>
Cc: public-json-ld-wg@w3.org
Message-ID: <etPan.5c628fc1.4ffed8d8.13d@ugent.be>

Dear Gregg,

> Implementing streaming parsers and serializers is important, but not something the WG is ready to work on quite yet.

If this should change in the future, I’d be happy to share my experiences.

> Note that the current version of the test suite is at http://w3c.github.io/json-ld-api/tests/

Thanks for letting me know, I was indeed targeting the old test suite.
I will submit my EARL report once I’ve updated this.

> You may consider doing a PR to http://github.com/json-ld/json-ld.org [2] to add your implementation.

I just submitted a PR for this.

Kind regards,
Ruben Taelman

On 11 February 2019 at 23:54:07, Gregg Kellogg (gregg@greggkellogg.net) wrote:

Ruben, thanks for implementing this. Implementing streaming parsers and serializers is important, but not something the WG is ready to work on quite yet. We did just discuss this in our Face to Face last week [1].

Note that the current version of the test suite is at http://w3c.github.io/json-ld-api/tests/; we need to redirect the CG version of the tests here.

Also, the CG maintains a list of implementations on the json-ld.org homepage. You may consider doing a PR to http://github.com/json-ld/json-ld.org [2] to add your implementation.

Gregg Kellogg
gregg@greggkellogg.net

[1] https://www.w3.org/2018/json-ld-wg/Meetings/Minutes/2019/2019-02-08-json-ld#section5-2
[2] https://github.com/json-ld/json-ld.org/blob/fbb52bafdb696ae0950c74952b1267f0dbb4be02/index.html#L182-L189

On Feb 8, 2019, at 2:10 AM, Ruben Taelman <ruben.taelman@ugent.be> wrote:

Dear all,

I had a couple of use cases where I needed to be able to
parse JSON-LD documents to RDF in a streaming way.
To the best of my knowledge, current JavaScript implementations
don’t support streaming parsing, which is why I implemented a streaming parser [1].
Such a parser is especially useful when you need to parse large documents
that don’t fit into memory.

This parser can be configured to be fully spec-compliant.
However, by default, it is not fully compliant for performance reasons.
For example, the parser will by default throw an error
if an @context is found as a non-first entry in an object.
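
As a rough illustration of the default described above, the ordering rule
can be sketched as follows. This is not the parser's actual code:
assertContextFirst is a hypothetical helper, it works on an already-parsed
object rather than on a token stream, and it checks only the top-level
object.

```typescript
// Hypothetical helper, not part of jsonld-streaming-parser.js.
// Checks the ordering rule on an already-parsed object:
// if "@context" is present, it must be the first key.
function assertContextFirst(node: Record<string, unknown>): void {
  // Object.keys returns non-numeric string keys in insertion order,
  // which for JSON.parse output is the order they appear in the document.
  const index = Object.keys(node).indexOf("@context");
  if (index > 0) {
    throw new Error(
      `Found @context at position ${index}; expected it to be the first entry`,
    );
  }
}

// Accepted: the context precedes every property that depends on it.
assertContextFirst(
  JSON.parse('{"@context": {"name": "http://schema.org/name"}, "name": "Ruben"}'),
);

// Rejected: "name" is read before the mapping that explains it is known.
try {
  assertContextFirst(
    JSON.parse('{"name": "Ruben", "@context": {"name": "http://schema.org/name"}}'),
  );
} catch (e) {
  console.log((e as Error).message);
}
```

The reason such a rule helps a streaming parser is that a property seen
before its @context cannot be turned into RDF right away; it would have to
be buffered until the context arrives.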

A streaming parser will not be as fast as a regular parser in every case.
However, for typical JSON-LD documents its performance is
comparable to that of jsonld.js [2].
Currently, this parser is significantly slower for expanded documents,
so I am still looking into optimizing this.

At the moment JSON-LD 1.0 is supported,
but I aim to look into supporting the new 1.1 features in the near future.

More information on how the streaming algorithm works
can be found in the readme [3].

[1] https://github.com/rubensworks/jsonld-streaming-parser.js
[2] https://github.com/rubensworks/jsonld-streaming-parser.js#performance
[3] https://github.com/rubensworks/jsonld-streaming-parser.js#how-it-works

Kind regards,
Ruben Taelman
Received on Tuesday, 12 February 2019 09:20:02 UTC
