- From: Olivier Grisel <olivier.grisel@ensta.org>
- Date: Mon, 3 Oct 2011 13:51:19 +0200
- To: Ivan Herman <ivan@w3.org>
- Cc: Markus Lanthaler <markus.lanthaler@gmx.net>, public-linked-json@w3.org
2011/10/3 Ivan Herman <ivan@w3.org>:
>
> On Oct 2, 2011, at 22:33 , Markus Lanthaler wrote:
>>
>>>> We could also require serializations ensure that @context is listed
>>>> first. If it isn't listed first, the processor has to save each
>>>> key-value pair until the @context is processed. This creates a memory
>>>> and complexity burden for one-pass processors.
>>
>> Agree. I think that would make a lot of sense since you can see the
>> context as a kind of header anyway.
>
> I must admit I do not really understand that, but that probably shows
> my ignorance of the wider JSON world.
>
> However... the standard JSON parser in Python parses a JSON object into
> a dictionary. However, at least in Python, you cannot rely on the order
> of the keys within the dictionary (it is determined by some hashing
> algorithm, if I am not mistaken, but that is internal to the
> interpreter anyway). I.e., whether @context appears first or last does
> not make any difference.
>
> Worse: if you then use such a structure to generate JSON using again
> the 'dump' feature of the standard Python parser, there is no way to
> control the order of those keys. In other words, if we impose such an
> order in JSON-LD, that means that a Python programmer must bypass the
> standard JSON library module and do the dump by hand. I do not think
> that would be acceptable...

In Python 2.7 and 3.2+ it is possible to get a deterministic order by
using the collections.OrderedDict class from the standard library. In
that case json.dump will respect that order. At parsing time it is now
possible to pass the OrderedDict class as "object_pairs_hook" to avoid
losing the ordering information:

http://docs.python.org/library/json.html

So I don't think it is such a big deal to enforce the @context node in
first position (see the two sketches at the end of this message). But it
will require a bit of communication effort to document and advertise
such good practices to JSON-LD library developers.

IMHO it is very interesting to be able to do one-pass / streaming
processing of huge JSON-LD dumps without having to load the whole
payload in memory. For instance I would really like to be able to have
JSON-LD dumps of the full DBpedia that I could pre-filter in one pass
before loading them into a CouchDB database or an ElasticSearch fulltext
index. Such a JSON-LD dump would be several tens of GB uncompressed and
would probably not fit in today's computers' main memory.
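To illustrate the memory burden mentioned at the top of the thread,
here is a minimal sketch of a one-pass consumer of a JSON-LD object's
key/value pairs. The handle() callback is hypothetical, standing in for
whatever a processor does with a pair once the context is known:

    def process_pairs(pairs, handle):
        """Consume (key, value) pairs in document order.

        If '@context' is the first key, every pair can be handled as it
        arrives. If it comes later, all preceding pairs must be buffered
        in memory until the context has been seen.
        """
        context = None
        buffered = []
        for key, value in pairs:
            if key == '@context':
                context = value
                for k, v in buffered:  # flush what had to be held back
                    handle(k, v, context)
                buffered = []
            elif context is None:
                buffered.append((key, value))  # memory burden grows here
            else:
                handle(key, value, context)  # @context first: no buffering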
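And a minimal sketch of the order-preserving round trip with the
standard library json module (Python 2.7 / 3.2+; the document and the
context URL are made up for the example):

    import json
    from collections import OrderedDict

    raw = '{"@context": "http://example.org/ctx", "name": "DBpedia"}'

    # Parse without losing the key order of the input document.
    doc = json.loads(raw, object_pairs_hook=OrderedDict)

    # Build an output document with @context deliberately placed first.
    out = OrderedDict()
    out['@context'] = doc['@context']
    out['name'] = doc['name']

    # json.dumps serializes keys in insertion order (sort_keys is off
    # by default), so @context stays first in the output.
    print(json.dumps(out))
    # {"@context": "http://example.org/ctx", "name": "DBpedia"}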
--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
Received on Monday, 3 October 2011 11:52:10 UTC