- From: Damian Gessler <dgessler@iplantcollaborative.org>
- Date: Fri, 17 May 2013 10:47:23 -0600
- To: public-rdf-comments@w3.org
Hi Dave,
Thank you. I've been happy to see the progress with JSON-LD. For a
number of years we've had to use our own JSON formulation for production
because of the lack of a W3C rec re JSON and OWL. Our systems run
transaction-time DL reasoning on SSWAP OWL Semantic Web Services. See
http://sswap.info, http://sswap.info/api [particularly
http://sswap.info/api/JSONSyntax], and http://sswap.info/jit. The work is
funded by the National Science Foundation.
I do believe that some study of my proposal for RDF/JSON shows it to
address a suite of issues relevant to its design in what is at its core
a simple and tight model, but if that discussion is closed, then at
least it stands in the public record.
Best,
Damian.
On 5/17/13 9:13 AM, David Wood wrote:
> Hi Damian,
>
> The RDF WG held substantial discussions regarding various designs for
> RDF in JSON in the first half of 2011. The discussions are well
> documented both in our mailing list and on our wiki. We decided roughly
> a year later (May/June 2012) to proceed with JSON-LD due to the success
> of that community group's activities and implementations. One might note
> that uptake of JSON-LD by third parties has been solid (e.g.,
> Google's Gmail announcement yesterday).
>
> The only JSON item on our plate this late in the WG's charter is whether
> to write a Note (not a Recommendation) on RDF/JSON. That's it. We will
> not be accepting new design proposals at this time, although a future
> working group might consider your proposal.
>
> Regards,
> Dave
> (Chair hat *on*)
> --
> http://about.me/david_wood
>
>
> On May 17, 2013, at 7:27, Peter Ansell <ansell.peter@gmail.com> wrote:
>
>>
>>
>> On 17 May 2013 09:10, Damian Gessler <dgessler@iplantcollaborative.org> wrote:
>>
>> This discussion is long, but hopefully offers constructive
>> comment for RDF/JSON. It is submitted as an email per directions
>> at
>> https://dvcs.w3.org/hg/rdf/raw-file/default/rdf-json/index.html.
>>
>> The model proposed here addresses untyped literals, typed
>> literals, resources (URIs and bnodes), QNames (including reserved
>> prefixes, user-defined prefixes, and a default namespace),
>> preservation of XML encoding information, type declarations,
>> comments, short-circuit parsing, and both aggregate and dispersed
>> subject blocks. It does so with a "natural" reading of the
>> resultant JSON that yields similarities to both N3 and RDF/XML. It
>> is designed to be informationally lossless with respect to both
>> RDF and RDF/XML, and can be used either as a pure RDF
>> serialization independent of RDF/XML, or as a streaming
>> transliteration on the large extant repository of legacy RDF/XML
>> documents on the Web.
>>
>> We begin simply and pedagogically, but things will speed up:
>>
>> 1. We ask rhetorically what we are trying to achieve with
>> RDF/JSON. We begin with an immediate and simple JSON serialization
>> for RDF: a serialization that preserves the core and fundamental
>> data model of RDF (the S,P,O triple) while adding little else; viz:
>>
>> [
>> [ "S", "P", "O" ],
>> [ "S", "P", "O" ],
>> ...
>> ]
>>
>> Where S is the Subject, P is the Predicate (or Property), and O is
>> the Object. This simple serialization can be expanded to support
>> literal datatypes in a number of ways; e.g.:
>>
>> [
>> [ "S", "P", "L" ],
>> [ "S", "P", { "L" : "D" } ],
>> [ "S", "P", { "R" : {} } ],
>> ...
>> ]
>>
>> for RDF Objects L (Literal) (and datatype D) and R (Resource) (URI
>> or bnode). There are also other minor variants and syntaxes that
>> could differentiate between untyped literals, typed literals, and
>> resources.
>>
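A minimal sketch of the null model above, in Python. It assumes every Subject, Predicate, and Object is already a plain string; the function names are hypothetical and are not part of any RDF/JSON draft.

```python
import json

def triples_to_null_model(triples):
    """Serialize (S, P, O) tuples as a JSON array of three-element arrays."""
    return json.dumps([list(t) for t in triples])

def null_model_to_triples(text):
    """Parse the JSON array back into a list of (S, P, O) tuples."""
    return [tuple(row) for row in json.loads(text)]

triples = [
    ("http://example.org/about", "http://purl.org/dc/terms/title", "Anna's Homepage"),
    ("http://example.org/about", "http://purl.org/dc/terms/creator", "_:anna"),
]
encoded = triples_to_null_model(triples)
```

The round trip preserves triple order and content, but nothing in this form distinguishes a literal from a resource, which is the gap the variants with { "L" : "D" } and { "R" : {} } try to close.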
>> We will reject this serialization per se; but it is important to
>> offer it as a "null model" because that forces us to be explicit
>> as to why another serialization with necessarily overloaded
>> semantics is preferable.
>>
>> Clearly, by not stopping at this immediate and natural JSON
>> serialization of triples, the vision of RDF/JSON must be either
>> implicitly, or explicitly, something other than just serializing
>> RDF into JSON.
>>
>> By presenting a data model of:
>>
>> { "S" : { "P" : [ "O", ... ] } }
>>
>> RDF/JSON shows that it prioritizes a subject-oriented data
>> structure of the underlying RDF data model in achieving its JSON
>> serialization. This elegant, natural, data model has similarities
>> to the use and adoption of N3 over N-Triples.
>>
>>
>> 2. We note that the goal of RDF/JSON cannot be to translate
>> legacy JSON -> RDF. This is because the semantics of any
>> arbitrary, legacy, JSON document do not map to the semantics of
>> RDF/JSON. For example, JSON arrays do not map to RDF List
>> constructs--and indeed, nor should they, for an array is not a
>> list (though in many cases it can be interpreted as such). Also,
>> RDF/JSON introduces reserved keys ("type", "value", "lang",
>> "datatype") that have implied semantics on the resultant
>> de-serialized data models that are not recognized as such in JSON.
>> This is not to say that one could not read legacy JSON, build an
>> in-memory data model, and output RDF/JSON; it is to say that such
>> an operation (arbitrary, legacy JSON -> RDF -> RDF/JSON) is
>> outside both the goals and spec of RDF/JSON. For JSON -> RDF, see
>> JSON-LD [1].
>>
>> Thus the perspective of RDF/JSON is focused on RDF -> JSON, while
>> leveraging some of the JSON data modeling constructs. The W3C
>> recommended serialization for RDF is RDF/XML [2]. There is a large
>> legacy presence of RDF/XML documents on the Web, especially for
>> OWL. Thus a desirable characteristic of a JSON serialization would
>> be the informationally lossless transformation of RDF/XML -> JSON.
>> This becomes a key guide for the following discussion. While
>> RDF/JSON can position itself as solely an RDF serialization
>> independent of others, distinct, and separate from RDF/XML, this
>> is perhaps a missed opportunity.
>>
>> Alternatively, RDF/JSON could position itself as an RDF -> JSON
>> serialization that builds upon, and is receptive to,
>> informationally lossless transliterations of the
>> already-recommended W3C serialization for RDF: RDF/XML. The
>> motivation is that such an approach builds a suite of
>> complementary W3C technologies, including various serializations,
>> rather than merely a collection of competing formats. Of course,
>> RDF/JSON should also be able to stand separate and independent of
>> RDF/XML, such that one could go RDF -> RDF/JSON -> RDF without any
>> serialization through RDF/XML. Thus we seek both worlds.
>>
>> Currently, RDF/JSON is not informationally lossless with respect
>> to RDF/XML; we note a number of difficulties:
>>
>> 2a. QNames. RDF/JSON does not support QNames [3]. This presumably
>> could be addressed by adding semantics on how to serialize
>> prefixes. If RDF/JSON chooses not to support QNames then it can be
>> still said to be informationally lossless with respect to RDF, but
>> it cannot be said to be informationally lossless with respect to
>> RDF/XML. This would seem to be an undesirable and unnecessary
>> limitation.
>>
>> 2b. Serializing. RDF/JSON binds all of a Subject's predicates, and
>> all and each of those Predicates' Objects into a single, compound
>> JSON object. Yet RDF/XML does not require that all statements
>> about a Subject be together or in any one place in the document,
>> and RDF does not require this generically for serialization. Thus
>> RDF/JSON cannot be implemented as a streaming syntactical
>> re-serializer directly on RDF/XML: RDF/JSON must have knowledge of
>> the entire RDF data model, such as to know all of a Subject's
>> predicates and their objects, before it can serialize even the
>> first subject. This is somewhat unfortunate, since we would like a
>> serialization spec to be independent of implementation algorithms,
>> be they streaming or "DOM"-based. RDF/JSON's requirement that "S"
>> be unique (for each unique Subject) is forced upon it by JSON's
>> requirement that all keys in a JSON object be unique (but see below).
>>
>> 2c. Parsing. RDF/JSON imposes a data model outside of RDF proper,
>> which limits the utility of the serialization. But it is fair to
>> say it also enhances the utility of the serialization: there is a
>> trade-off. The elegance and "naturalness" of RDF/JSON's { "S" : {
>> "P" : [ "O" ] } } model necessarily clusters statements about
>> Subjects, while dispersing statements about Predicates and Objects
>> throughout the document. I call this the "phone book" problem,
>> where the chosen serialization of the producer limits the utility
>> available to the consumer, even though the consumer "has all the
>> data." In the "old days," phone books were distributed as
>> serialized name:number pairs, sorted by name, printed on paper.
>> The sorting produced essentially an array, such that one could use
>> an approximate binary search to find a name amongst a million
>> entries in a matter of seconds. The data producer (the phone
>> company) gave the consumer both name and number, and at some level
>> did not care whether the consumer was interested in the name,
>> number, or both. But the serialization essentially forced the
>> consumer to accept name:number ordered-pairs; the sorting and
>> serialization on name biased against number:name utility. A
>> separate serialization (called a reverse-lookup) was needed if one
>> had a number and wanted to find its associated name. These books
>> were usually hard to find. What is relevant here is not the old
>> days of phone books, but to note that RDF has no such restriction.
>> RDF does not bias Subjects over Objects, or Objects over
>> Predicates, etc. One of the benefits of the RDF/JSON modeling is
>> that once one is done processing a Subject, one is guaranteed that
>> no more syntactic statements about the Subject (as a Subject, and
>> as identified lexically by its key [i.e., not addressing the
>> semantics of owl:sameAs]) shall be made. Thus unlike RDF/XML, a
>> streaming parser can be implemented for RDF/JSON such that further
>> processing of a document stream can be abandoned prior to the
>> entire document being processed. I call this "short-circuit"
>> parsing. But this comes at the cost that the RDF/JSON model limits
>> the utility of the data when not consumed as intended, and in this
>> case the "intent" is set not by the producer, but by RDF/JSON
>> itself. One could say that RDF/JSON benefits the parser at the
>> expense of the serializer.
>>
>> 2d. RDF/JSON has no mechanism to retain comments ex situ of RDF
>> (e.g., RDF/XML XML comments [<!-- -->]). This is made difficult
>> due to JSON's lack of support for embedded comments.
>>
>>
>> The proposal below addresses the above issues while keeping very
>> much in the flavor of RDF/JSON's { "S" : { "P" : [ "O" ] } }
>> model. It is informationally lossless with respect to both RDF and
>> RDF/XML (supports QNames and comments); it supports streaming
>> serialization (e.g., as a syntactical transliterator on streaming
>> RDF/XML); and it supports streaming parsing of its own serialization.
>>
>> The proposal is quite simple and contains two "forms":
>>
>> Form 1. Guarantee that all statements about a Subject are
>> localized in the document, thus supporting short-circuit parsing.
>> Short-circuit guarantees are "communicated" to the parser by
>> virtue of an opening JSON object. A parser is guaranteed that all
>> keys of a JSON object are unique, thus when it "sees" a JSON
>> object, it "knows" that all statements about the key are localized
>> to the JSON object.
>>
>> Form 1 is very similar in structure to RDF/JSON.
>>
>> 1a. Simple, untyped literals:
>>
>> {
>> "S" : { "P" : "L" }
>> }
>>
>> Examples:
>>
>> 1a.i
>> {
>> "http://example.org/about" :
>> { "http://purl.org/dc/terms/title" : "Anna's Homepage" }
>> }
>>
>> 1a.ii
>> {
>> "http://example.org/about" : {
>> "http://purl.org/dc/terms/title" : [ "Anna's Homepage", "Annas hjemmeside" ],
>> "http://anotherUniqueProperty/p" : "L"
>> ...
>> }
>> }
>>
>> JSON array [] constructs are required for the Object only as
>> needed. This differs from RDF/JSON which requires Object array
>> constructs even in cases of there being only a single Object. JSON
>> imposes no unique value restriction for array elements.
>>
>> Example 1a.i shows that simple statements are "simply" serialized.
>> The examples below will show that more complex statements are
>> built from the application of simple rules.
>>
>> Example 1a.ii shows JSON arrays as RDF Objects to package multiple
>> property instances and values.
>>
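The "array only as needed" rule can be sketched in Python as follows. This is an illustration under stated assumptions, not a reference implementation: the function name is hypothetical, and objects are assumed to be strings or dicts, never lists.

```python
def group_form1(triples):
    """Group (S, P, O) triples into { S : { P : O } }, using an array only when needed."""
    doc = {}
    for s, p, o in triples:
        predicates = doc.setdefault(s, {})
        if p not in predicates:
            predicates[p] = o                      # single object: no array
        elif isinstance(predicates[p], list):
            predicates[p].append(o)                # third or later object
        else:
            predicates[p] = [predicates[p], o]     # second object: promote to array
    return doc

doc = group_form1([
    ("http://example.org/about", "http://purl.org/dc/terms/title", "Anna's Homepage"),
    ("http://example.org/about", "http://purl.org/dc/terms/title", "Annas hjemmeside"),
    ("http://example.org/about", "http://anotherUniqueProperty/p", "L"),
])
```

Note that this grouping needs every triple for a Subject before that Subject's block is final, which is the serializer-side cost discussed in 2b.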
>> 1b. Typed Literals. We note from RDF/XML that datatypes on
>> literals are attributes on the Predicates (not on the literals
>> themselves). In a similar manner, typed literals do not have a
>> language, per se [4]: a language qualifier is on the Predicate.
>> Thus we here make a simple extension that allows us to replace
>> the literal "L" with a JSON object {} to capture arbitrary
>> RDF/XML attribute data, with special semantics for "rdf:value"; i.e.:
>>
>> 1b. Typed literals:
>>
>> {
>> "S" : { "P" : {
>> "rdf:value" : "L",
>> "rdf:datatype" : "D",
>> ...
>> }
>> }
>> }
>>
>> Example:
>>
>> {
>> "http://example.org/about" : {
>> "http://purl.org/dc/terms/title" : {
>> "rdf:value" : "Annas hjemmeside",
>> "rdf:datatype" : "http://www.w3.org/2001/XMLSchema#string",
>> "xml:lang" : "da"
>> }
>> }
>> }
>>
>> Here, rdf:value is akin to RDF/JSON "value." It and it alone is
>> NOT an attribute on the Predicate (it is the "text content" of the
>> equivalent XML element), but all other key:value pairs are
>> interpreted as Predicate attributes. rdf:datatype is akin to
>> RDF/JSON's "datatype," but there is no need to introduce a new and
>> reserved key word: the RDF/XML attribute assumes the role immediately.
>>
>> This simple form--that RDF Objects are JSON Objects with a
>> syntactical placement of RDF/XML attributes--yields an immediate
>> and consistent extension for Objects as resources (URIs and bnodes):
>>
>> 1c. Objects as resources (URIs and bnodes):
>>
>> {
>> "S" : { "P" :
>> {
>> "rdf:resource" : "O",
>> ...
>> }
>> }
>> }
>>
>> Compound example:
>>
>> {
>> "http://example.org/about" : {
>>
>> "http://purl.org/dc/terms/title" : [
>>
>> "Anna's Homepage",
>>
>> {
>> "rdf:value" : "Annas hjemmeside",
>> "rdf:datatype" : "http://www.w3.org/2001/XMLSchema#string",
>> "xml:lang" : "da"
>> } ],
>>
>> "http://xmlns.com/foaf/0.1/homepage" : { "rdf:resource" : "http://example.org/anna" },
>>
>> "http://purl.org/dc/terms/creator" : "_:anna"
>>
>> }
>> }
>>
>> At first it may not seem that the above proposal differs much in
>> substance from RDF/JSON, but it does in a number of ways. It
>> retains the essence of the { "S" : { "P" : "O" } } model, but
>> simplifies the serialization for simple cases, and aligns more
>> complex cases with a transliteration of RDF/XML attributes. This
>> requires no actual knowledge of RDF as a re-serializer.
>>
>> The model also lends itself "naturally" to QName support [3], thus
>> becoming closer to being informationally lossless with respect to
>> RDF/XML. We support QNames by noting the "xmlns" attribute on the
>> rdf:RDF "Subject"; viz.:
>>
>> {
>>
>> "rdf:RDF" : {
>>
>> "xmlns:rdf" : "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
>> "xmlns:xsd" : "http://www.w3.org/2001/XMLSchema#",
>>
>> "xmlns:" : "http://example.org/",
>> "xmlns:dc" : "http://purl.org/dc/terms/",
>> "xmlns:foaf" : "http://xmlns.com/foaf/0.1/"
>> },
>>
>> ":about" : {
>> ...
>> }
>> }
>>
>> We bootstrap the definition of the rdf: namespace within the
>> rdf:RDF construct. We make the implicit assumption that the token
>> "rdf:RDF" can never itself be the valid Subject of a user-defined
>> payload--a topic we discuss further in section 4. below.
>>
>> We can achieve a slight clean-up in presentation by recognizing
>> "xmlns" as a keyword, but we do this only as "syntactical sugar"
>> on the underlying model of XML attributes on Subject entries; e.g.:
>>
>> {
>> "xmlns" : {
>> "" : "http://example.org/",
>> "dc" : "http://purl.org/dc/terms/",
>> "foaf" : "http://xmlns.com/foaf/0.1/"
>> },
>>
>> ":about" : {
>> ...
>> }
>> }
>>
>> RDF requires that all Subjects are resources: either URIs or
>> bnodes. Resources can be lexically written in four variants:
>>
>> Absolute URIs; e.g., http://example.org/about, urn:example:about
>> QName with prefix (namespace); e.g., dc:title
>> QName with reserved underscore (_) for bnode; e.g., _:anna
>> QName with user-defined default namespace; e.g., ":myTerm"
>>
>> Notably, RDF does not allow relative URIs for Subjects or
>> Predicates [5]. Thus "a", "5", "a/b/c", are all valid (relative)
>> URIs, but are lexically illegal as RDF Subjects. Thus we note that
>> lexically, all valid Subjects and Predicates necessarily always
>> contain a colon (:). Thus we can unambiguously allow the keyword
>> "xmlns" (or "@xmlns") to appear in the "S" place and overload it
>> with special meaning as a document directive. In a similar manner
>> we can use "?xml" to preserve record of the XML document encoding
>> that may appear on the first line of an RDF/XML document. In so
>> doing we are not stating that 'this' document has the encoding; we
>> are stating that this document, if transliterated from, or to,
>> XML, has the encoding:
>>
>> {
>>
>> "?xml" : {
>> "version" : "1.0",
>> "encoding" : "UTF-8"
>> },
>>
>> "xmlns" : {
>> "rdf" : "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
>> "xsd" : "http://www.w3.org/2001/XMLSchema#",
>> "" : "http://example.org/",
>> "dc" : "http://purl.org/dc/terms/",
>> "foaf" : "http://xmlns.com/foaf/0.1/"
>> },
>>
>> ":about" : {
>>
>> "dc:title" : [
>> "Anna's Homepage",
>> {
>> "rdf:value" : "Annas hjemmeside",
>> "rdf:datatype" : "xsd:string",
>> "xml:lang" : "da"
>> } ],
>>
>> "foaf:homepage" : { "rdf:resource" : ":anna" },
>>
>> "dc:creator" : "_:anna"
>>
>> },
>>
>> "_:anna" : {
>> "foaf:name" : "Anna",
>> "foaf:homepage" : { "rdf:resource" : "http://example.org/anna" }
>> }
>> }
>>
>> Note in the above the use of (source) doc encoding, prefixes,
>> default namespace, QNames, absolute URIs, bnodes, untyped
>> literals, and typed literals. This could have been serialized from
>> an RDF data model, or transliterated syntactically from RDF/XML.
>> Our rules are still simple and consistent: almost the same as
>> RDF/JSON, with the extension that object "metadata" is analogous
>> to RDF/XML attributes and bundled inside a JSON object using
>> existing rdf: namespace predicates.
>>
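A consumer of the example above would need to expand QNames back to absolute URIs. The Python sketch below shows one way to do that; the function name is hypothetical, and it assumes that no declared prefix collides with a URI scheme such as "http" or "urn".

```python
def expand_qname(name, prefixes):
    """Expand 'prefix:local' with the xmlns map; pass bnodes and absolute URIs through."""
    prefix, sep, local = name.partition(":")
    if not sep or prefix == "_":
        return name                      # no colon, or a bnode such as _:anna
    if prefix in prefixes:
        return prefixes[prefix] + local  # declared prefix, including the default ""
    return name                          # undeclared prefix: treat as an absolute URI

prefixes = {
    "": "http://example.org/",
    "dc": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
}
```

With this map, ":about" expands to http://example.org/about and "dc:title" to http://purl.org/dc/terms/title, while "_:anna" and "http://example.org/anna" are returned unchanged.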
>>
>> Form 2. Support the dispersal of statements throughout a
>> document, for example as applicable when stream transliterating
>> RDF/XML -> JSON. This currently cannot be done in RDF/JSON, but is
>> quite simple to do:
>>
>> [
>>
>> { "?xml" : {
>> "version" : "1.0",
>> "encoding" : "UTF-8"
>> }
>> },
>>
>> { "xmlns" : {
>> "rdf" : "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
>> "xsd" : "http://www.w3.org/2001/XMLSchema#",
>> "" : "http://example.org/",
>> "dc" : "http://purl.org/dc/terms/",
>> "foaf" : "http://xmlns.com/foaf/0.1/"
>> }
>> },
>>
>> { ":about" :
>> {
>> "dc:title" : "Anna's Homepage",
>> "dc:creator" : "_:anna"
>> }
>> },
>>
>> { "_:anna" : {
>> "foaf:name" : "Anna",
>> "foaf:homepage" : { "rdf:resource" : "http://example.org/anna" }
>> }
>> },
>>
>> { ":about" : {
>> "dc:title" : {
>> "rdf:value" : "Annas hjemmeside",
>> "rdf:datatype" : "xsd:string",
>> "xml:lang" : "da"
>> },
>> "foaf:homepage" : { "rdf:resource" : ":anna" }
>> }
>> }
>>
>> ]
>>
>> (Note the repetition of :about). All the previous rules apply. We
>> simply note that { "S" : { "P" : "O" } } used in the earlier
>> examples was just a simplification of a larger, more encompassing
>> model: [ { "S" : { "P" : "O" } }, { "S" : { "P" : "O" } }, ... ].
>> This reads "naturally:" an array of JSON objects, each making
>> statements about an RDF Subject, with no restriction that
>> successive Subjects be unique (Because each is enclosed in its own
>> {} construct). The embracing opening and closing JSON array []
>> construct (Form 2) "communicates" the chosen serialization to the
>> parser that it may NOT now assume that all statements about a
>> given Subject are known, until it processes through the
>> End-Of-File. If the serializer chooses to group all statements for
>> all subjects (Form 1), then it can easily do this too by not using
>> the opening JSON array [] construct and building JSON objects per
>> the earlier examples above. Thus the "spec" does not bias towards
>> parsers or serializers (it lets the producer decide). The spec
>> supports short-circuiting for both streaming serializers and
>> streaming parsers: just write/read the first non-whitespace
>> character as a '[' or '{' and proceed accordingly.
>>
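The Form 1 / Form 2 distinction can be sketched from the consumer's side in Python. This is an illustration only: the function name is hypothetical, and json.loads reads the whole document, so the sketch shows how the top-level construct signals the guarantee rather than how a true streaming parser would be built.

```python
import json

def subject_blocks(text):
    """Yield (subject, predicates, complete) for each Subject block in a document.

    complete is True for Form 1 (top-level object: one block per Subject) and
    False for Form 2 (top-level array: a Subject may recur later).
    """
    doc = json.loads(text)
    if isinstance(doc, dict):
        blocks, complete = [doc], True
    elif isinstance(doc, list):
        blocks, complete = doc, False
    else:
        raise ValueError("expected a JSON object (Form 1) or array (Form 2)")
    for block in blocks:
        for key, value in block.items():
            # Directives ("?xml", "xmlns", "//") contain no colon; "rdf:RDF" is reserved.
            if key == "rdf:RDF" or ":" not in key:
                continue
            if isinstance(value, dict):
                yield key, value, complete
```

A consumer interested only in :about could stop at the first block whose complete flag is True, but must read to the end of the array when it is False.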
>>
>> 3. RDF/XML has short-hand notation for rdf:type statements that
>> allows concise "declarations" at the beginning of a document.
>> These declarations can aid parsers. For example, OWL models can be
>> aided by knowing if a property is an owl:ObjectProperty or an
>> owl:DatatypeProperty when it is first *used* (i.e., when it first
>> occurs as a resource in a statement). Because the serialization of
>> RDF does not place restrictions on the ordering within a document
>> of resource definitions and type statements, a predicate's use may
>> precede its declaration and definition (if any). The RDF/XML
>> "declaration" short-hand looks like this:
>>
>> <owl:Class rdf:about="http://mySite.org/MyClass"/>
>> <mySite:MyClass rdf:about="http://mySite.org/MyThing"/>
>> <owl:DatatypeProperty rdf:about="http://mySite.org/myDatatypeProperty"/>
>> <owl:DatatypeProperty rdf:about="http://mySite.org/myOtherDatatypeProperty"/>
>> ....
>>
>> and is semantically equivalent to more verbose rdf:type statements
>> about each of the resources.
>>
>> Now note that the { "S" : { "P" : "O" } } construct leaves two
>> other constructs undefined; namely:
>>
>> { "S" : "T" } and
>> { "S" : [ "T", ... ] }
>>
>> where "T" is some text (a string).
>>
>> Thus we can define the use of these constructs to support concise
>> rdf:type declarations in a manner similar to RDF/XML:
>>
>> {
>> "owl:Class" : "mySite:MyClass",
>> "mySite:MyClass" : "mySite:MyThing",
>> "owl:DatatypeProperty" : [ "mySite:myDatatypeProperty",
>> "mySite:myOtherDatatypeProperty" ]
>> ...
>> }
>>
>> The meaning of the above is that each JSON value (or each array
>> element) has an rdf:type given by its JSON key. There is no
>> ambiguity in how to interpret the above because none of the
>> constructs are of the form "S" : { ... }. This aligns nicely with
>> RDF/XML declarations. Full example is below in 4.
>>
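The declaration short-hand can be read back into explicit rdf:type triples. The Python sketch below is one possible reading, with a hypothetical function name; it assumes QNames are left unexpanded and that any key whose value is a string or an array of strings is a declaration.

```python
def type_declarations(doc):
    """Return (resource, 'rdf:type', class) triples for { "S" : "T" } and { "S" : [ "T", ... ] }."""
    triples = []
    for key, value in doc.items():
        if ":" not in key or isinstance(value, dict):
            continue  # directive (no colon) or an ordinary { "S" : { "P" : "O" } } block
        members = value if isinstance(value, list) else [value]
        for member in members:
            if isinstance(member, str):
                triples.append((member, "rdf:type", key))
    return triples

declared = type_declarations({
    "//": "This is a comment",
    "owl:Class": "mySite:MyClass",
    "owl:DatatypeProperty": ["mySite:myDatatypeProperty", "mySite:myOtherDatatypeProperty"],
    ":about": {"dc:title": "Anna's Homepage"},
})
```

Because the test is purely on the shape of the value, the same pass can skip Subject blocks and directives without knowing anything about OWL.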
>>
>> 4. Semantic serialization and parsing. RDF/JSON is presumably a
>> sole RDF -> JSON serialization. It need know nothing about RDF/XML
>> (though clearly here I advocate changing that to a tighter linkage
>> to informationally lossless transliteration of RDF/XML). But it
>> seems that the more that RDF/JSON differentiates itself as
>> something more than "one more ad hoc way of representing RDF in
>> JSON" (of which there are many such competing proposals), the more
>> it could position itself as an important and distinct addition to
>> the W3C toolbox.
>>
>> One way to do this is to more tightly embrace RDF as the
>> underlying W3C Semantic Web technology and then use knowledge of
>> those semantics to improve the serialization; i.e., RDF/JSON would
>> be a "smart," semantically-aware JSON serialization of W3C
>> Semantic Web technologies.
>>
>> We immediately distinguish here between "semantic serialization
>> and parsing" and "inference." Various implicit forms of semantic
>> parsing are already done by many parsers and interpreters--for
>> example, a scripting language interpreter may assume from 'var x =
>> 1' that x is an integer variable, even though it has not been
>> declared as being of that type. The goal of semantic serialization
>> and parsing is to improve and effect the serialization and parsing
>> while neither adding nor removing any new knowledge. For example,
>> with semantic parsing this:
>>
>> {
>>
>> "owl:ObjectProperty" : ":myProperty",
>>
>> ":mySubject" : {
>> ":myProperty" : {
>> "rdf:resource" : "http://example.org/anna"
>> }
>> }
>> }
>>
>> is equivalent to, and could be replaced by, this:
>>
>> {
>>
>> "owl:ObjectProperty" : ":myProperty",
>>
>> ":mySubject" : { ":myProperty" : "http://example.org/anna" }
>> }
>>
>> The token "http://example.org/anna" is necessarily a resource, not
>> a literal. The line between semantic serialization and parsing and
>> inference is subtle. The former is concerned with preservation of
>> explicit statements of knowledge (or their absence) while using ex
>> situ knowledge in a manner that improves the serialization or
>> parsing; the latter is concerned with making statements explicit
>> that may otherwise be necessarily-true yet only implicit (not
>> stated). Our focus is on the former. (If a serialization is
>> missing statements, we want to preserve that absence, since the
>> action of serialization should maintain input->output data
>> integrity [for example, cases of purposely "broken" data models
>> for the purpose of testing]).
>>
>> A side-effect of the above is that in order to support streaming
>> parsers, the order of statements in the document can be important
>> (e.g., in the above example, if the declaration of myProperty
>> occurred after its assignment, then the value
>> "http://example.org/anna" would be considered a string literal,
>> not a resource). This can be an issue, because RDF -> RDF/XML
>> serializers may not give users control of the ordering of
>> statements, nor even guarantee deterministic representations on
>> successive invocations, thus RDF -> RDF/XML -> RDF/JSON -> RDF
>> could fail to be informationally lossless. There are ways to
>> address this, but at a minimum semantic serialization and parsing
>> should be carefully weighed.
>>
>> If we accept due diligence on a dependency of statement ordering
>> in the document, then we can outline at least four ways to support
>> semantic serialization and parsing:
>>
>> 1. Recognize "rdf:RDF", "xmlns", etc. when they appear in the
>> Subject position as document directives, not user-defined Subjects
>> (see above).
>>
>> 2. Predefine the xmlns namespaces rdf, rdfs, xsd, and owl (require
>> no explicit assignments).
>>
>> 3. Recognize the semantics of rdf:type, rdfs:range, rdfs:domain,
>> rdfs:subClassOf, rdfs:subPropertyOf, etc.: the RDF Object of those
>> predicates must be a resource (cannot be a literal). An exception
>> and special semantics apply when the object is an XSD datatype
>> (e.g., "rdfs:range xsd:integer").
>>
>> 4. Allow the preservation of ex situ RDF comments with the keyword
>> "comment" (or "@comment" or "//" or "#"). For example, if
>> transliterating into RDF/XML, then the comments would be
>> re-serialized as XML comments (<!-- -->). But if translating into
>> N3, then the comments would be re-serialized as # comments.
>>
>> Example:
>>
>> {
>>
>> "?xml" : {
>> "version" : "1.0",
>> "encoding" : "UTF-8"
>> },
>>
>> "xmlns" : {
>> "" : "http://example.org/",
>> "dc" : "http://purl.org/dc/terms/",
>> "foaf" : "http://xmlns.com/foaf/0.1/",
>> "mySite" : "http://mySite.org/myTerms/"
>> },
>>
>> "//" : "This is a comment",
>>
>> "rdf:Property" : [ "dc:title", "dc:creator" ],
>>
>> "owl:DatatypeProperty" : "mySite:aDatatypeProperty",
>>
>> "owl:ObjectProperty" : "mySite:hasHomepage",
>>
>> "owl:Class" : [ "mySite:myClass", "mySite:anotherClass" ],
>>
>> "mySite:aDatatypeProperty" : {
>> "rdfs:range" : "xsd:string"
>> },
>>
>> "mySite:anObjectProperty" : {
>> "rdfs:range" : "mySite:myClass"
>> },
>>
>> "mySite:anotherObjectProperty" : {
>> "rdfs:subPropertyOf" : "mySite:anObjectProperty",
>> "rdfs:domain" : "mySite:myClass"
>> },
>>
>> ":about" : {
>> "dc:title" : "Anna's Homepage",
>> "dc:creator" : "_:anna",
>> "mySite:hasHomepage" : "http://example.org/anna",
>> "rdfs:comment" : [
>> "This comment is an explicit property of the subject :about",
>> "So is this one"
>> ],
>> "//" : [
>> "This is not a property of the subject.",
>> "It is equivalent to two XML comments <!-- --> within the :about element block when re-serialized as RDF/XML"
>> ]
>> }
>>
>> }
>>
>>
>> I believe the above will allow the informationally lossless
>> transliteration of thousands (millions?) of extant RDF/XML
>> documents into RDF/JSON--though a more thorough analysis is first
>> warranted. The mere proliferation of said documents conforming to
>> RDF/JSON should aid in its adoption. And of course, de novo RDF ->
>> RDF/JSON is also satisfied.
>>
>>
>> Summary:
>>
>> There are many candidates for serializing RDF as JSON. If we want
>> anything more than the null model of an array of triples, then we
>> should identify the goals and prioritize the trade-offs. The
>> proposal here attempts the following goals:
>>
>> 1. RDF/JSON should enable RDF -> JSON serialization independent of
>> any other RDF serialization (specifically, one should be able to
>> go directly from an RDF data model into RDF/JSON without any
>> intervening serialization).
>>
>> 2. RDF/JSON should be able to be implemented as a streaming
>> re-serializer on legacy RDF/XML without the need for building a
>> complete, in-memory RDF data model. The special attention to
>> RDF/XML is because it is already the W3C recommended serialization
>> for RDF.
>>
>>
>> I don't understand why this needs to be a goal. I also did not
>> understand how your proposal enables it, as your examples do not
>> explore the full range of legal RDF/XML document syntax trees, some of
>> which are unnecessarily complex and really do not need to be
>> replicated in any other RDF serialisation. It may be useful to be able
>> to transliterate *from* RDF/XML to something else, although the
>> usecase would be very thin, but there is no reason to be able to
>> support reserialising back to RDF/XML once you have gone away from it,
>> so you don't need to preserve the XMLisms in JSON.
>>
>> 3. RDF/JSON should allow the enablement of short-circuit parsing,
>> if the provider chooses to serialize content so as to support it.
>>
>>
>> I am a little confused as to how the format you propose, which is not
>> really the simple Talis RDF/JSON anymore after the changes, could be
>> structured to *not* support short-circuit parsing anymore. The JSON
>> model does not allow repeated keys within an object, so there is no
>> simple way to use subjects as keys in any other way and I am not sure
>> what the other alternative is from your proposal.
>>
>> In general though, I am a little confused about the need to ever do
>> short-circuit parsing. What documents are so large that you cannot pay
>> the cost of parsing an entire document to the RDF abstract model?
>>
>> 4. RDF/JSON should be informationally lossless with respect to
>> both RDF and to transliterations of RDF/XML.
>>
>>
>> Any RDF serialisation must be informationally lossless with
>> respect to RDF. Some serialisations support structures that cannot be
>> translated back to RDF triples (ie, any quads format, JSON-LD with
>> relaxed use of blank nodes, and N3 with its extensions), but all of
>> them are otherwise only defined based on the RDF format, not on
>> another syntax.
>>
>> I fail to see what the benefit would be to having a consistent
>> transliteration from the huge variety of possible RDF/XML structures
>> without going through an RDF model.
>>
>> 5. RDF/JSON should reflect a "natural" JSON representation: simple
>> things should be "simply serialized" and complex things should be
>> built from simple things. If one knows JSON, but doesn't really
>> know RDF, then one should feel comfortable that JSON constructs
>> are being used in intuitive, "natural" ways without the need for
>> syntactic convolutions.
>>
>>
>> I think you would be more comfortable using JSON-LD, as it is designed
>> based on many of your goals, except for the RDF/XML transliteration
>> goal, and includes many of the features that you propose, except for
>> comments.
>>
>> 6. As a proposed W3C recommendation, RDF/JSON should leverage RDF,
>> RDFS, XSD, and OWL semantics when it can do so either without
>> compromise to the above goals, or with clear and prioritized
>> compromise (for example, identifying cases where reliance on
>> statement ordering is acceptable).
>>
>>
>> Of the RDF serialisations, only N3, with its non-RDF extensions,
>> attempts to do anything other than provide a container for simple RDF
>> triples or quads. How would your proposed format encode anything
>> above RDF triples while staying consistent with RDF?
>>
>> Don't get me wrong, you could have a niche format for your own
>> purposes. However, I think the usecases, which heavily rely on being
>> able to represent a literal RDF/XML document in JSON, are very thin
>> and would not be of interest to many people who will simply pay the
>> cost of parsing an entire document to memory. Alternatively, parsing
>> RDF/XML to N-Triples can be done while streaming from disk to disk,
>> and sorting the document can be done easily with a fixed memory cost,
>> before parsing it and serialising to Talis RDF/JSON in a streaming
>> method. This is all possible without the legacy XML-specific
>> information that will not be practically useful to anyone using the
>> JSON document, and they will not want to preserve it, in general, just
>> to support a translation back to the exact XML document that was
>> originally used to create it.
>>
>> Of your proposed changes to Talis RDF/JSON, the namespace extensions
>> would be of most interest to me, although I would definitely not
>> relate it to the XML QName specification which is far too limited to
>> be of any use in a modern format.
>>
>> Prior to that, if W3C is interested in continuing at all with
>> RDF/JSON standardisation, I will be proposing to add Graph/Quads
>> support to the specification based on the extension that Joshua
>> Shinavier made to the format for the Sesame RDF/JSON parser/writer. It
>> adds an extra "graph" key with an Array of URIs, added to the Object
>> position, and is fairly backwards compatible with the current Talis
>> RDF/JSON specification as long as parsers do not fault on the
>> unrecognised "graph" key.
>>
>> Peter
Received on Friday, 17 May 2013 16:48:35 UTC