- From: Alexandre Bertails <alexandre@bertails.org>
- Date: Sat, 31 Jan 2015 15:33:54 -0500
- To: "henry.story@bblfish.net" <henry.story@bblfish.net>
- Cc: Anton Kulaga <antonkulaga@gmail.com>, Public Banana-RDF <public-banana-rdf@w3.org>
All right, here is a second attempt, taking into account the feedback
I got. Thanks for that by the way.

A few remarks:

* the goal here was *never* to do streaming/non-blocking IO
* those are abstractions for the existing implementations (Jena,
  Sesame, N3.js, jsonld-js, custom)
* we *only* need to accommodate sync and async IO
* this addresses the issues related to
  * @base directive
  * @prefix directives
  * handling of relative graphs without violating the RDF model
    i.e. only full IRIs in the model itself. So this is happening in
    the serialized form only.
* decoupled the interfaces when I could, because depending on the
  implementations, you only have access to String as Input (or you
  have some optimizations for String-s), so I wanted to avoid an
  unnecessary round-trip through an InputStream

Anyway, here it is:

/** The result of parsing an RDF Graph */
final case class Parsed[Rdf <: RDF](
  graph: Rdf#Graph,
  prefixes: Map[String, Rdf#Uri]
)

/** An RDF reader.
 *
 * All functions return results in the context `M`. `S` is a phantom
 * type for the RDF syntax.
 */
trait RDFReader[Rdf <: RDF, M[_], +S] {

  /** reads an RDF Graph from a [[java.io.InputStream]] and a base IRI */
  def read(inputStream: InputStream, base: String): M[Parsed[Rdf]]

  /** reads an RDF Graph from a [[java.io.Reader]] and a base IRI */
  def read(reader: Reader, base: String): M[Parsed[Rdf]]

}

/** An RDF reader for [[java.lang.String]]s.
 *
 * All functions return results in the context `M`. `S` is a phantom
 * type for the RDF syntax.
 */
trait RDFStringReader[Rdf <: RDF, M[_], +S] {

  /** reads an RDF Graph from a [[java.lang.String]] and a base IRI */
  def read(input: String, base: String): M[Parsed[Rdf]]

}

/** An RDF writer.
 *
 * All functions return results in the context `M`. `S` is a phantom
 * type for the RDF syntax.
 */
trait RDFWriter[Rdf <: RDF, M[_], +S] {

  /** writes `graph` in a [[java.io.OutputStream]]
   *
   * @param graph
   * @param outputStream
   * @param optBase if set and supported, used in a `@base`-like directive
   * @param prefixes used in `@prefix`-like directives
   */
  def write(
    graph: Rdf#Graph,
    outputStream: OutputStream,
    optBase: Option[String],
    prefixes: Map[String, Rdf#Uri]
  ): M[Unit]

  /** writes `graph` in a [[java.io.OutputStream]] where IRIs are
   * relative to the provided `base`.
   *
   * Note: the result cannot be reliably parsed without knowing the
   * base.
   *
   * @param graph
   * @param outputStream
   * @param base all IRIs are relative to it
   * @param prefixes used in `@prefix`-like directives
   */
  def writeAsRelative(
    graph: Rdf#Graph,
    outputStream: OutputStream,
    base: String,
    prefixes: Map[String, Rdf#Uri]
  ): M[Unit]

}

/** An RDF formatter returning [[java.lang.String]]s.
 *
 * All functions return results in the context `M`. `S` is a phantom
 * type for the RDF syntax.
 */
trait RDFStringWriter[Rdf <: RDF, M[_], +S] {

  /** returns `graph` as a [[java.lang.String]]
   *
   * @param graph
   * @param optBase if set and supported, used in a `@base`-like directive
   * @param prefixes used in `@prefix`-like directives
   */
  def asString(
    graph: Rdf#Graph,
    optBase: Option[String],
    prefixes: Map[String, Rdf#Uri]
  ): M[String]

  /** returns `graph` as a [[java.lang.String]] where IRIs are relative
   * to the provided `base`
   *
   * @param graph
   * @param base all IRIs are relative to it
   * @param prefixes used in `@prefix`-like directives
   */
  def asString(
    graph: Rdf#Graph,
    base: String,
    prefixes: Map[String, Rdf#Uri]
  ): M[String]

}

Alexandre

On Thu, Jan 29, 2015 at 3:23 PM, Alexandre Bertails <alexandre@bertails.org> wrote:
> On Thu, Jan 29, 2015 at 3:05 PM, henry.story@bblfish.net
> <henry.story@bblfish.net> wrote:
>>
>> On 29 Jan 2015, at 20:57, Anton Kulaga <antonkulaga@gmail.com> wrote:
>>
>> Hi, Alexandre!
>>
>> I think we lack support of streaming API for readers.
>> All readers return
>> RDFGraph but in many cases it is not what is needed. For instance:
>> 1) In case of large files I would prefer to parse them in a nonblocking way,
>> so I need a stream.
>>
>>
>> +1
>
> Sure, but this is not what RDFReader is trying to solve. Streaming
> parsers would live in an entirely different typeclass. This is not
> what I am trying to address here.
>
> `M` already lets you implement the synchronous and asynchronous
> parsers, and handle errors.
>
>>
>> 2) In case of getting data from websockets (in a string format) it is an
>> overhead to wrap each websocket message into a graph
>>
>>
>> +1
>>
>>
>> Writers are also not clear for me.
>> Why do you provide "base:String" in ordinary "write" method? base:String
>> makes sense only for relative graphs (for which you already have def
>> writeAsRelative )
>
> This addresses the use-case of the `@base` directive for syntaxes that
> support it. We could decide to ignore it entirely.
>
> Anton was mentioning that this information might be in his graphs
> already. The banana-rdf model doesn't support that for now. Nor does
> it plan anything for prefixes. It could actually, but I am not sure that
> all RDF implementations would actually support it (N3.js doesn't
> support it at the moment, neither does the current Plantain), so it's
> hard to write a contract that binds the Rdf#Graph and an RDFWriter
> when prefixes/base are involved. Suggestions welcome.
>
>> Also I think it would be better to have the base be a URI, rather than a
>> string.
>
> That would require the user to build one, hence having an RDFOps
> somewhere. In practice, the implementations do not require that. And
> an error on the `base` can already be handled in the `M` context. But
> if there is more demand for a `base: Rdf#URI` and if nobody else
> prefers keeping the String, then I will certainly not fight for it :-)
>
>>
>> Can we not also have a readAsRelative, and end up with a relative graph?
>> That’s what we have had until now, and it can be quite useful.
>
> The relative graph is a really ugly hack. It is not really handled by
> implementations (especially Sesame), and will probably never be,
> without the kind of hacks we have today. The proposal here is to use a
> well-defined `rel:` base URI instead. The rationale is in the
> documentation for the package.
>
> Alexandre
>
>>
>>
>> 2015-01-29 21:42 GMT+02:00 Alexandre Bertails <alexandre@bertails.org>:
>>>
>>> Hey guys,
>>>
>>> It is time to fix our reader and writer typeclasses. We have a few
>>> outstanding issues on GitHub, and I have spent some time thinking on
>>> how to move forwards. So please find here my proposal.
>>>
>>> Please review, comment, ask questions, I am planning to start hacking
>>> on those as soon as possible. I want this stuff to be ready way before
>>> my Scaladays talk, along with the tutorial I am planning for
>>> banana-rdf.
>>>
>>> Alexandre
>>>
>>>
>>>
>>> * rdf/common/src/main/scala/org/w3/banana/io/package.scala
>>>
>>> ```
>>> package org.w3.banana
>>>
>>> /** Types for IO operations on RDF objects.
>>>  *
>>>  * RDF syntaxes like Turtle or JSON-LD do allow for the use of
>>>  * relative IRIs but there are no relative IRIs in the RDF model
>>>  * itself. So instead of failing on a relative IRI without a known
>>>  * base, RDF parsers would usually *silently* pick a default base IRI
>>>  * on their own.
>>>  *
>>>  * In `banana-rdf`, we standardize the behaviour by **always** using
>>>  * `rel:` as the default base IRI for IO operations when no base
>>>  * was provided. RDF applications can then look at the scheme and
>>>  * know what happened.
>>>  */
>>> package object io {
>>>
>>>   /** `rel:`, the special base IRI used when no base is provided. */
>>>   final val DEFAULT_BASE: String = "rel:"
>>>
>>> }
>>> ```
>>>
>>> * rdf/common/src/main/scala/org/w3/banana/io/RDFReader.scala
>>>
>>> ```
>>> package org.w3.banana
>>> package io
>>>
>>> import java.io._
>>>
>>> /** An RDF reader.
>>>  *
>>>  * All functions return results in the context `M`. `S` is a phantom
>>>  * type for the RDF syntax.
>>>  */
>>> trait RDFReader[Rdf <: RDF, M[_], +S] {
>>>
>>>   /** reads an RDF Graph from a [[java.io.InputStream]] and a base IRI */
>>>   def read(is: InputStream, base: String): M[Rdf#Graph]
>>>
>>>   /** reads an RDF Graph from a [[java.io.Reader]] and a base IRI */
>>>   def read(reader: Reader, base: String): M[Rdf#Graph]
>>>
>>>   /** reads an RDF Graph from a [[java.io.InputStream]] using the `rel:`
>>>    * base IRI
>>>    */
>>>   final def read(is: InputStream): M[Rdf#Graph] = read(is, DEFAULT_BASE)
>>>
>>>   /** reads an RDF Graph from a [[java.io.Reader]] using the `rel:` base
>>>    * IRI
>>>    */
>>>   final def read(reader: Reader): M[Rdf#Graph] = read(reader, DEFAULT_BASE)
>>>
>>> }
>>> ```
>>>
>>> * rdf/common/src/main/scala/org/w3/banana/io/RDFWriter.scala
>>>
>>> ```
>>> package org.w3.banana
>>> package io
>>>
>>> import java.io.OutputStream
>>>
>>> /** An RDF writer.
>>>  *
>>>  * All functions return results in the context `M`. `S` is a phantom
>>>  * type for the RDF syntax.
>>>  */
>>> trait RDFWriter[Rdf <: RDF, M[_], +S] {
>>>
>>>   /** writes `graph` in a [[java.io.OutputStream]] with the provided
>>>    * `base`
>>>    */
>>>   def write(graph: Rdf#Graph, os: OutputStream, base: String): M[Unit]
>>>
>>>   /** writes `graph` in a [[java.io.OutputStream]] where IRIs are
>>>    * relative to the provided `base`
>>>    */
>>>   def writeAsRelative(graph: Rdf#Graph, os: OutputStream, base: String): M[Unit]
>>>
>>>   /** returns `graph` as a [[java.lang.String]] */
>>>   def asString(graph: Rdf#Graph, base: String): M[String]
>>>
>>>   /** returns `graph` as a [[java.lang.String]] where IRIs are relative
>>>    * to the provided `base`
>>>    */
>>>   def asRelativeString(graph: Rdf#Graph, base: String): M[String]
>>>
>>> }
>>> ```
>>>
>>
>>
>>
>> --
>> Best regards,
>> Anton Kulaga
>>
>>
>> Social Web Architect
>> http://bblfish.net/
>>
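
For illustration, here is a minimal sketch of how the proposed `RDFStringReader` could be implemented with `M` fixed to `scala.util.Try` for the synchronous case. Only the `Parsed` and `RDFStringReader` signatures come from the proposal. Everything else is an assumption made for the example: the `RDF` trait is a stripped-down stand-in for the banana-rdf one, `ToyRdf`, `ToySyntax` and `ToyReader` are hypothetical names, the one-triple-per-line syntax is made up, and `resolve` is a naive concatenation rather than real IRI resolution. It targets Scala 2, since `Rdf#Graph` projects a type member out of a type parameter.

```scala
import scala.util.Try

// Stand-in for banana-rdf's `RDF` type bundle (assumption, not the real trait).
trait RDF {
  type Graph
  type Uri
}

// Hypothetical toy implementation: a graph is a set of IRI-string triples.
trait ToyRdf extends RDF {
  type Graph = Set[(String, String, String)]
  type Uri = String
}

// Signatures as proposed in the message.
final case class Parsed[Rdf <: RDF](
  graph: Rdf#Graph,
  prefixes: Map[String, Rdf#Uri]
)

trait RDFStringReader[Rdf <: RDF, M[_], +S] {
  def read(input: String, base: String): M[Parsed[Rdf]]
}

// Phantom type naming the made-up "one triple per line" syntax.
trait ToySyntax

// Synchronous reader: `M` is `Try`, so parse errors surface as `Failure`.
object ToyReader extends RDFStringReader[ToyRdf, Try, ToySyntax] {

  // Mirrors the `rel:` default base from the package object in the proposal.
  val DEFAULT_BASE: String = "rel:"

  // Naive resolution (assumption): anything containing ':' is treated as
  // an absolute IRI, everything else is appended to the base.
  private def resolve(base: String, iri: String): String =
    if (iri.contains(":")) iri else base + iri

  def read(input: String, base: String): Try[Parsed[ToyRdf]] = Try {
    val lines = input.split("\n").map(_.trim).filter(_.nonEmpty)
    val triples = lines.map { line =>
      line.split("\\s+") match {
        case Array(s, p, o) =>
          (resolve(base, s), resolve(base, p), resolve(base, o))
        case _ =>
          throw new IllegalArgumentException(s"not a triple: $line")
      }
    }
    // The toy syntax has no prefix directives, so the prefix map stays empty.
    Parsed[ToyRdf](triples.toSet, Map.empty[String, String])
  }
}
```

With this sketch, `ToyReader.read("alice http://example.org/knows bob", ToyReader.DEFAULT_BASE)` yields a `Success` whose graph holds the single triple `("rel:alice", "http://example.org/knows", "rel:bob")`, so an application can spot the `rel:` scheme and tell that no base was supplied. An asynchronous reader would keep the same shape with `M` set to `scala.concurrent.Future`.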
Received on Saturday, 31 January 2015 20:34:21 UTC