- From: pukkamustard <pukkamustard@posteo.net>
- Date: Wed, 21 Sep 2022 11:50:29 +0000
- To: Miel Vander Sande <miel.vandersande@meemoo.be>
- Cc: semantic-web@w3.org
Hello Miel,

Miel Vander Sande <miel.vandersande@meemoo.be> writes:

> - it's inspired by HDT, does that mean it's self-indexed and queryable?

The structures used are exactly the same as in HDT (front-coded
dictionary and BitMapTriples). However, RDF/CBOR uses variable-length
encoding of the integers that reference dictionary terms (we inherit
this from CBOR). Thus, it is not possible to compute fixed offsets and
zip around structures on-disk.

RDF/CBOR was designed for small pieces of content that can be handled in
memory. On-disk query-ability was not a design goal and has been lost.
On the other hand, the variable-length encoding should in principle
allow more compact encodings.

> - How does it compare to https://rdf4j.org/documentation/reference/rdf4j-binary/,
> https://jena.apache.org/documentation/io/rdf-binary ?

As far as I understand, the encoding you link requires an Apache Thrift
or Google Protocol Buffers definition and tools to generate serializers.

RDF/CBOR is defined (via CBOR) directly in bytes and bits. No external
tooling is required to implement the serialization. We do use CDDL to
define the serialization, but this is just a documentation tool.

Unfortunately, I have not been able to perform quantitative performance
tests, so I can't make any statements on efficiency compared to other
binary serializations.

> - Does it integrate with any of the existing frameworks for handling
> RDF? Can you work with the OCaml implementation in Python, Java or
> RDF.js?

Unfortunately, I don't think integrating the OCaml implementation into
another language would be a good way of using the serialization.

A major objective of RDF/CBOR is that it is re-implementable from
scratch (this is a design goal shared with CBOR). Although this has yet
to be done, I believe it should be feasible to re-implement RDF/CBOR in
your favorite language with reasonable effort. CBOR libraries that
handle the low-level encodings already exist and can be used.

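As a rough illustration of the variable-length point above, here is a
minimal Python sketch of how CBOR encodes unsigned integers (major type
0 in RFC 8949). It is not taken from the RDF/CBOR specification or from
the OCaml implementation; the function name `encode_uint` and the sample
term ids are made up for the example.

```python
from itertools import accumulate


def encode_uint(n: int) -> bytes:
    """Encode n as a CBOR unsigned integer (major type 0, RFC 8949)."""
    if n < 0 or n >= 1 << 64:
        raise ValueError("CBOR unsigned integers must fit in 64 bits")
    if n < 24:
        # Values 0..23 are stored directly in the initial byte.
        return bytes([n])
    if n < 1 << 8:
        return b"\x18" + n.to_bytes(1, "big")
    if n < 1 << 16:
        return b"\x19" + n.to_bytes(2, "big")
    if n < 1 << 32:
        return b"\x1a" + n.to_bytes(4, "big")
    return b"\x1b" + n.to_bytes(8, "big")


# Hypothetical dictionary term ids, as they might be referenced by triples.
term_ids = [3, 23, 24, 300, 70000]
encoded = [encode_uint(i) for i in term_ids]

# Each id occupies a different number of bytes ...
lengths = [len(e) for e in encoded]

# ... so the byte offset of the k-th id depends on every id before it,
# and cannot be computed as k * fixed_width.
offsets = [0] + list(accumulate(lengths))[:-1]

print(lengths)
print(offsets)
```

With fixed-width integers the k-th reference would sit at a known
offset; here a reader has to decode sequentially, which matches the
trade-off described above: smaller encodings, but no random access on
disk.
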
I would be very happy to assist you in such an endeavor.

Best regards,
pukkamustard

> On Sun, 18 Sep 2022 at 22:03, pukkamustard <pukkamustard@posteo.net> wrote:
>
> Hello semantic-web,
>
> I'd like to share some recent work towards a binary serialization of RDF
> using CBOR:
>
> https://openengiadina.codeberg.page/rdf-cbor/
>
> CBOR (RFC 8949) is a binary data serialization that provides basic data
> types (string, integer, arrays, etc.) as well as extendable tags for
> annotating more complex data types. RDF/CBOR encodes RDF into CBOR
> types. CBOR types are re-used for efficient binary serialization of
> literal values and certain binary IRIs (e.g. UUIDs).
>
> RDF/CBOR is very much inspired by the HDT serialization and uses a very
> similar encoding (front-coded dictionaries and BitMapTriples). Unlike
> HDT, RDF/CBOR is optimized for small pieces of content that are created,
> transported and read by possibly constrained devices.
>
> The serialization is defined using the Concise Data Definition Language
> (CDDL; RFC 8610) which allows a very concise and precise specification.
>
> RDF/CBOR also allows groups of RDF statements to be content-addressed,
> i.e. identifiers are the cryptographic hash of the serialized
> statements. This can be used for cryptographic signature schemes and
> makes RDF viable on distributed, peer-to-peer systems.
>
> I look forward to your feedback and comments.
>
> Best regards,
> pukkamustard
Received on Wednesday, 21 September 2022 12:08:41 UTC