Re: Public feedback on RDF/JSON: Proposal to align w/ W3C RDF/XML

On 17 May 2013 09:10, Damian Gessler <dgessler@iplantcollaborative.org> wrote:

> This discussion is long, but hopefully offers constructive comment for
> RDF/JSON. It is submitted as an email per directions at
> https://dvcs.w3.org/hg/rdf/raw-file/default/rdf-json/index.html.
>
> The model proposed here addresses untyped literals, typed literals,
> resources (URIs and bnodes), QNames (including reserved prefixes,
> user-defined prefixes, and a default namespace), preservation of XML
> encoding information, type declarations, comments, short-circuit parsing,
> and both aggregate and dispersed subject blocks. It does so with a
> "natural" reading of the resultant JSON that yields similarities to both N3
> and RDF/XML. It is designed to be informationally lossless with respect to
> both RDF and RDF/XML, and can be used either as a pure RDF serialization
> independent of RDF/XML, or as a streaming transliteration on the large
> extant repository of legacy RDF/XML documents on the Web.
>
> We begin simply and pedagogically, but things will speed up:
>
> 1. We ask rhetorically what we are trying to achieve with RDF/JSON. We
> begin with an immediate and simple JSON serialization for RDF: a
> serialization that preserves the core and fundamental data model of RDF
> (the S,P,O triple) while adding little else; viz:
>
> [
>   [ "S", "P", "O" ],
>   [ "S", "P", "O" ],
>   ...
> ]
>
> Where S is the Subject, P is the Predicate (or Property), and O is the
> Object. This simple serialization can be expanded to support literal
> datatypes in a number of ways; e.g.:
>
> [
>   [ "S", "P", "L" ],
>   [ "S", "P", { "L" : "D" } ],
>   [ "S", "P", { "R" : {} } ],
>   ...
> ]
>
> for RDF Objects L (Literal) (and datatype D) and R (Resource) (URI or
> bnode). There are also other minor variants and syntaxes that could
> differentiate between untyped literals, typed literals, and resources.
>
> We will reject this serialization per se; but it is important to offer it
> as a "null model" because that forces us to be explicit as to why another
> serialization with necessarily overloaded semantics is preferable.
>
> Clearly, by not stopping at this immediate and natural JSON serialization
> of triples, the vision of RDF/JSON must be either implicitly, or
> explicitly, something other than just serializing RDF into JSON.
>
> By presenting a data model of:
>
> { "S" : { "P" : [ "O", ... ] } }
>
> RDF/JSON shows that it prioritizes a subject-oriented data structure of
> the underlying RDF data model in achieving its JSON serialization. This
> elegant, natural, data model has similarities to the use and adoption of N3
> over N-Triples.
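
To make the contrast between the two shapes concrete, here is a minimal Python sketch (an illustration only, not part of the proposal) that regroups the "null model" array of triples into the subject-oriented { "S" : { "P" : [ "O" ] } } shape. The function name group_by_subject and the sample data are assumptions made for this example.

```python
from collections import defaultdict


def group_by_subject(triples):
    """Regroup flat [S, P, O] triples into a {S: {P: [O, ...]}} mapping."""
    grouped = defaultdict(lambda: defaultdict(list))
    for s, p, o in triples:
        grouped[s][p].append(o)
    # Convert the nested defaultdicts to plain dicts so the result serializes cleanly.
    return {s: dict(preds) for s, preds in grouped.items()}


triples = [
    ["http://example.org/about", "http://purl.org/dc/terms/title", "Anna's Homepage"],
    ["http://example.org/about", "http://purl.org/dc/terms/title", "Annas hjemmeside"],
    ["http://example.org/about", "http://purl.org/dc/terms/creator", "_:anna"],
]
subject_view = group_by_subject(triples)
```

Note that the regrouping needs every triple in hand before the first subject can be written out, which is the streaming concern raised in 2b.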
>
>
> 2. We note that the goal of RDF/JSON cannot be interpreted as to translate
> legacy JSON -> RDF. This is because the semantics of any arbitrary, legacy,
> JSON document do not map to the semantics of RDF/JSON. For example, JSON
> arrays do not map to RDF List constructs--and indeed, nor should they, for
> an array is not a list (though in many cases it can be interpreted as
> such). Also, RDF/JSON introduces reserved keys ("type", "value", "lang",
> "datatype") that have implied semantics on the resultant de-serialized data
> models that are not recognized as such in JSON. This is not to say that one
> could not read legacy JSON, build an in-memory data model, and output
> RDF/JSON; it is to say that such an operation (arbitrary, legacy JSON ->
> RDF -> RDF/JSON) is outside both the goals and spec of RDF/JSON. For JSON
> -> RDF, see JSON-LD [1].
>
> Thus the perspective of RDF/JSON is focused on RDF -> JSON, while
> leveraging some of the JSON data modeling constructs. The W3C-recommended
> serialization for RDF is RDF/XML [2]. There is a large legacy presence of
> RDF/XML documents on the Web, especially for OWL. Thus a desirable
> characteristic of a JSON serialization would be the informationally
> lossless transformation of RDF/XML -> JSON. This becomes a key guide for
> the following discussion. While RDF/JSON can position itself as solely an
> RDF serialization independent of others, distinct, and separate from
> RDF/XML, this is perhaps a missed opportunity.
>
> Alternatively, RDF/JSON could position itself as an RDF -> JSON
> serialization that builds upon, and is receptive to, informationally
> lossless transliterations of the already-recommended W3C serialization for
> RDF: RDF/XML. The motivation is that such an approach builds a suite of
> complementary W3C technologies, including various serializations, rather
> than merely a collection of competing formats. Of course, RDF/JSON should
> also be able to stand separate and independent of RDF/XML, such that one
> could go RDF -> RDF/JSON -> RDF without any serialization through RDF/XML.
> Thus we seek both worlds.
>
> Currently, RDF/JSON is not informationally lossless with respect to
> RDF/XML; we note a number of difficulties:
>
> 2a. QNames. RDF/JSON does not support QNames [3]. This presumably could be
> addressed by adding semantics on how to serialize prefixes. If RDF/JSON
> chooses not to support QNames then it can be still said to be
> informationally lossless with respect to RDF, but it cannot be said to be
> informationally lossless with respect to RDF/XML. This would seem to be an
> undesirable and unnecessary limitation.
>
> 2b. Serializing. RDF/JSON binds all of a Subject's predicates, and all and
> each of those Predicates' Objects into a single, compound JSON object. Yet
> RDF/XML does not require that all statements about a Subject be together or
> in any one place in the document, and RDF does not require this generically
> for serialization. Thus RDF/JSON cannot be implemented as a streaming
> syntactical re-serializer directly on RDF/XML: RDF/JSON must have knowledge
> of the entire RDF data model, such as to know all of a Subject's predicates
> and their objects, before it can serialize even the first subject. This is
> somewhat unfortunate, since we would like a serialization spec to be
> independent of implementation algorithms, be they streaming or "DOM"-based.
> RDF/JSON's requirement that "S" be unique (for each unique Subject) is
> forced upon it by JSON's requirement that all keys in a JSON object be
> unique (but see below).
>
> 2c. Parsing. RDF/JSON imposes a data model outside of RDF proper, which
> limits the utility of the serialization. But it is fair to say it also
> enhances the utility of the serialization: there is a trade-off. The
> elegance and "naturalness" of RDF/JSON's { "S" : { "P" : [ "O" ] } } model
> necessarily clusters statements about Subjects, while dispersing statements
> about Predicates and Objects throughout the document. I call this the
> "phone book" problem, where the chosen serialization of the producer limits
> the utility available to the consumer, even though the consumer "has all
> the data." In the "old days," phone books were distributed as serialized
> name:number pairs, sorted by name, printed on paper. The sorting produced
> essentially an array, such that one could use an approximate binary search
> to find a name amongst a million entries in a matter of seconds. The data
> producer (the phone company) gave the consumer both name and number, and at
> some level did not care whether the consumer was interested in the name,
> number, or both. But the serialization essentially forced the consumer to
> accept name:number ordered-pairs; the sorting and serialization on name
> biased against number:name utility. A separate serialization (called a
> reverse-lookup) was needed if one had a number and wanted to find its
> associated name. These books were usually hard to find. What is relevant
> here is not the old days of phone books, but to note that RDF has no such
> restriction. RDF does not bias Subjects over Objects, or Objects over
> Predicates, etc. One of the benefits of the RDF/JSON modeling is that once
> one is done processing a Subject, one is guaranteed that no more syntactic
> statements about the Subject (as a Subject, and as identified lexically by
> its key [i.e., not addressing the semantics of owl:sameAs]) shall be made.
> Thus unlike RDF/XML, a streaming parser can be implemented for RDF/JSON
> such that further processing of a document stream can be abandoned prior to
> the entire document being processed. I call this "short-circuit" parsing.
> But this comes at the cost that the RDF/JSON model limits the utility of
> the data when not consumed as intended, and in this case the "intent" is
> set not by the producer, but by RDF/JSON itself. One could say that
> RDF/JSON benefits the parser at the expense of the serializer.
>
> 2d. RDF/JSON has no mechanism to retain comments ex situ of RDF (e.g.,
> RDF/XML XML comments [<!-- -->]). This is made difficult due to JSON's lack
> of support for embedded comments.
>
>
> The proposal below addresses the above issues while keeping very much in
> the flavor of RDF/JSON's { "S" : { "P" : [ "O" ] } } model. It is
> informationally lossless with respect to both RDF and RDF/XML (supports
> QNames and comments); it supports streaming serialization (e.g., as a
> syntactical transliterator on streaming RDF/XML); and it supports streaming
> parsing of its own serialization.
>
> The proposal is quite simple and contains two "forms":
>
> Form 1. Guarantee that all statements about a Subject are localized in the
> document, thus supporting short-circuit parsing. Short-circuit guarantees
> are "communicated" to the parser by virtue of an opening JSON object. A
> parser is guaranteed that all keys of a JSON object are unique, thus when
> it "sees" a JSON object, it "knows" that all statements about the key are
> localized to the JSON object.
>
> Form 1 is very similar in structure to RDF/JSON.
>
> 1a. Simple, untyped literals:
>
> {
>   "S" : { "P" : "L" }
> }
>
> Examples:
>
> 1a.i
> {
>   "http://example.org/about" :
>     { "http://purl.org/dc/terms/title" : "Anna's Homepage" }
> }
>
> 1a.ii
> {
>   "http://example.org/about" : {
>     "http://purl.org/dc/terms/title" : [ "Anna's Homepage", "Annas hjemmeside" ],
>     "http://anotherUniqueProperty/p" : "L"
>     ...
>   }
> }
>
> JSON array [] constructs are required for the Object only as needed. This
> differs from RDF/JSON which requires Object array constructs even in cases
> of there being only a single Object. JSON imposes no unique value
> restriction for array elements.
>
> Example 1a.i shows that simple statements are "simply" serialized. The
> examples below will show that more complex statements are built from the
> application of simple rules.
>
> Example 1a.ii shows JSON arrays as RDF Objects to package multiple
> property instances and values.
>
> 1b. Typed Literals. We note from RDF/XML that datatypes on literals are
> attributes on the Predicates (not on the literals themselves). In a similar
> manner, typed literals do not have a language, per se [4]: a language
> qualifier is on the Predicate. Thus we here make a simple extension that
> allows us to replace the literal "L" with a JSON object {} to capture
> arbitrary RDF/XML attribute data, with special semantics for "rdf:value";
> i.e.:
>
> 1b. Typed literals:
>
> {
>   "S" : { "P" : {
>     "rdf:value" : "L",
>     "rdf:datatype" : "D",
>      ...
>     }
>   }
> }
>
> Example:
>
> {
>   "http://example.org/about" : {
>     "http://purl.org/dc/terms/title" : {
>       "rdf:value" : "Annas hjemmeside",
>       "rdf:datatype" : "http://www.w3.org/2001/XMLSchema#string",
>       "xml:lang" : "da"
>       }
>     }
> }
>
> Here, rdf:value is akin to RDF/JSON "value." It and it alone is NOT an
> attribute on the Predicate (it is the "text content" of the equivalent XML
> element), but all other key:value pairs are interpreted as Predicate
> attributes. rdf:datatype is akin to RDF/JSON's "datatype," but there is no
> need to introduce a new and reserved key word: the RDF/XML attribute
> assumes the role immediately.
>
> This simple form--that RDF Objects are JSON Objects with a syntactical
> placement of RDF/XML attributes--yields an immediate and consistent
> extension for Objects as resources (URIs and bnodes):
>
> 1c. Objects as resources (URIs and bnodes):
>
> {
>   "S" : { "P" :
>     {
>       "rdf:resource" : "O",
>       ...
>       }
>     }
> }
>
> Compound example:
>
> {
>   "http://example.org/about" : {
>
>     "http://purl.org/dc/terms/title" : [
>
>       "Anna's Homepage",
>
>       {
>         "rdf:value" : "Annas hjemmeside",
>         "rdf:datatype" : "http://www.w3.org/2001/XMLSchema#string",
>         "xml:lang" : "da"
>       } ],
>
>     "http://xmlns.com/foaf/0.1/homepage" : { "rdf:resource" : "http://example.org/anna" },
>
>     "http://purl.org/dc/terms/creator" : "_:anna"
>
>     }
> }
>
> At first it may not seem that the above proposal differs much in substance
> from RDF/JSON, but it does in a number of ways. It retains the essence of the
> { "S" : { "P" : "O" } } model, but simplifies the serialization for simple
> cases, and aligns more complex cases with a transliteration of RDF/XML
> attributes. This requires no actual knowledge of RDF as a re-serializer.
>
> The model also lends itself "naturally" to QName support [3], thus
> becoming closer to being informationally lossless with respect to RDF/XML.
> We support QNames by noting the "xmlns" attribute on the rdf:RDF "Subject";
> viz.:
>
> {
>
>   "rdf:RDF" : {
>
>       "xmlns:rdf"  : "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
>       "xmlns:xsd"  : "http://www.w3.org/2001/XMLSchema#",
>
>       "xmlns:"     : "http://example.org/",
>       "xmlns:dc"   : "http://purl.org/dc/terms/",
>       "xmlns:foaf" : "http://xmlns.com/foaf/0.1/"
>   },
>
>   ":about" : {
>     ...
>   }
> }
>
> We bootstrap the definition of the rdf: namespace within the rdf:RDF
> construct. We make the implicit assumption that the token "rdf:RDF" can
> never itself be the valid Subject of a user-defined payload--a topic we
> discuss further in section 4. below.
>
> We can achieve a slight clean-up in presentation by recognizing "xmlns" as
> a keyword, but we do this only as "syntactical sugar" on the underlying
> model of XML attributes on Subject entries; e.g.:
>
> {
>     "xmlns" : {
>       ""     : "http://example.org/",
>       "dc"   : "http://purl.org/dc/terms/",
>       "foaf" : "http://xmlns.com/foaf/0.1/"
>     },
>
>   ":about" : {
>     ...
>   }
> }
>
> RDF requires that all Subjects are resources: either URIs or bnodes.
> Resources can be lexically written in four variants:
>
> Absolute URIs; e.g., http://example.org/about, urn:example:about
> QName with prefix (namespace); e.g., dc:title
> QName with reserved underscore (_) for bnode; e.g., _:anna
> QName with user-defined default namespace; e.g., ":myTerm"
>
> Notably, RDF does not allow relative URIs for Subjects or Predicates [5].
> Thus "a", "5", "a/b/c", are all valid (relative) URIs, but are lexically
> illegal as RDF Subjects. Thus we note that lexically, all valid Subjects
> and Predicates necessarily always contain a colon (:). Thus we can
> unambiguously allow the keyword "xmlns" (or "@xmlns") to appear in the "S"
> place and overload it with special meaning as a document directive. In a
> similar manner we can use "?xml" to preserve record of the XML document
> encoding that may appear on the first line of an RDF/XML document. In so
> doing we are not stating that 'this' document has the encoding; we are
> stating that this document, if transliterated from, or to, XML, has the
> encoding:
>
> {
>
>     "?xml" : {
>       "version" : "1.0",
>       "encoding" : "UTF-8"
>     },
>
>     "xmlns" : {
>       "rdf"  : "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
>       "xsd"  : "http://www.w3.org/2001/XMLSchema#",
>       ""     : "http://example.org/",
>       "dc"   : "http://purl.org/dc/terms/",
>       "foaf" : "http://xmlns.com/foaf/0.1/"
>     },
>
>   ":about" : {
>
>     "dc:title" : [
>       "Anna's Homepage",
>       {
>         "rdf:value" : "Annas hjemmeside",
>         "rdf:datatype" : "xsd:string",
>         "xml:lang" : "da"
>       } ],
>
>       "foaf:homepage" : { "rdf:resource" : ":anna" },
>
>       "dc:creator" : "_:anna"
>
>     },
>
>   "_:anna" : {
>     "foaf:name" : "Anna",
>     "foaf:homepage" : { "rdf:resource" : "http://example.org/anna" }
>     }
> }
>
> Note in the above the use of (source) doc encoding, prefixes, default
> namespace, QNames, absolute URIs, bnodes, untyped literals, and typed
> literals. This could have been serialized from an RDF data model, or
> transliterated syntactically from RDF/XML. Our rules are still simple and
> consistent: almost the same as RDF/JSON, with the extension that object
> "metadata" is analogous to RDF/XML attributes and bundled inside a JSON
> object using existing rdf: namespace predicates.
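
The four lexical variants for resources suggest a simple expansion rule. The following Python sketch is one possible reading of it, not something the proposal specifies: the name expand_term is hypothetical, and the prefix map assumes the foaf namespace ends in a slash so that foaf:homepage expands to http://xmlns.com/foaf/0.1/homepage.

```python
def expand_term(term, prefixes):
    """Expand a QName against a prefix map; leave bnodes and absolute URIs alone."""
    if term.startswith("_:"):
        return term                      # bnode label, kept as written
    prefix, sep, local = term.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + local  # declared prefix, including "" (default)
    return term                          # assumed to be an absolute URI


prefixes = {
    "": "http://example.org/",
    "dc": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
}

expanded = [
    expand_term(t, prefixes)
    for t in (":about", "dc:title", "_:anna", "urn:example:about")
]
```

One caveat of this reading: a document that declared a prefix named "http" or "urn" would make absolute URIs ambiguous, so a real specification would have to forbid or disambiguate such prefixes.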
>
>
> Form 2. Support the dispersal of statements throughout a document, for
> example as applicable when stream transliterating RDF/XML -> JSON. This
> currently cannot be done in RDF/JSON, but is quite simple to do:
>
> [
>
>   { "?xml" : {
>       "version" : "1.0",
>       "encoding" : "UTF-8"
>     }
>   },
>
>   { "xmlns" : {
>       "rdf"  : "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
>       "xsd"  : "http://www.w3.org/2001/XMLSchema#",
>       ""     : "http://example.org/",
>       "dc"   : "http://purl.org/dc/terms/",
>       "foaf" : "http://xmlns.com/foaf/0.1/"
>     }
>   },
>
>   { ":about" :
>     {
>       "dc:title" : "Anna's Homepage",
>       "dc:creator" : "_:anna"
>     }
>   },
>
>   { "_:anna" : {
>     "foaf:name" : "Anna",
>     "foaf:homepage" : { "rdf:resource" : "http://example.org/anna" }
>     }
>   },
>
>   { ":about" : {
>     "dc:title" : {
>         "rdf:value" : "Annas hjemmeside",
>         "rdf:datatype" : "xsd:string",
>         "xml:lang" : "da"
>       },
>     "foaf:homepage" : { "rdf:resource" : ":anna" }
>     }
>   }
>
> ]
>
> (Note the repetition of :about). All the previous rules apply. We simply
> note that { "S" : { "P" : "O" } } used in the earlier examples was just a
> simplification of a larger, more encompassing model: [ { "S" : { "P" : "O"
> } }, { "S" : { "P" : "O" } }, ... ]. This reads "naturally:" an array of
> JSON objects, each making statements about an RDF Subject, with no
> restriction that successive Subjects be unique (because each is enclosed in
> its own {} construct). The embracing opening and closing JSON array []
> construct (Form 2) "communicates" the chosen serialization to the parser
> that it may NOT now assume that all statements about a given Subject are
> known, until it processes through the End-Of-File. If the serializer
> chooses to group all statements for all subjects (Form 1), then it can
> easily do this too by not using the opening JSON array [] construct and
> building JSON objects per the earlier examples above. Thus the "spec" does
> not bias towards parsers or serializers (it lets the producer decide). The
> spec supports short-circuiting for both streaming serializers and streaming
> parsers: just write/read the first non-whitespace character as a '[' or '{'
> and proceed accordingly.
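
A rough Python sketch of the '[' versus '{' dispatch, and of folding Form 2 blocks back into a single subject-keyed structure, might look like the following. It is not a streaming parser (it loads the whole text with json.loads) and only illustrates the dispatch and merge; the function names are hypothetical, and treating any key without a colon as a directive or comment is an assumption drawn from the "all valid Subjects and Predicates contain a colon" observation.

```python
import json


def as_list(value):
    """Normalize the 'array only as needed' rule: wrap a single Object in a list."""
    return value if isinstance(value, list) else [value]


def load_subject_blocks(text):
    """Return (form, blocks), where blocks is a list of {S: {...}} objects."""
    first = text.lstrip()[:1]
    if first == "{":
        return 1, [json.loads(text)]
    if first == "[":
        return 2, json.loads(text)
    raise ValueError("expected a document starting with '{' or '['")


def merge_blocks(blocks):
    """Fold dispersed subject blocks into one {S: {P: [O, ...]}} mapping."""
    merged = {}
    for block in blocks:
        for subject, body in block.items():
            if ":" not in subject or not isinstance(body, dict):
                continue  # directive ("xmlns", "?xml"), comment, or type shorthand
            preds = merged.setdefault(subject, {})
            for predicate, obj in body.items():
                if ":" not in predicate:
                    continue  # e.g. a "//" comment inside a subject block
                preds.setdefault(predicate, []).extend(as_list(obj))
    return merged


form2 = '''[
  { "xmlns" : { "" : "http://example.org/" } },
  { ":about" : { "dc:title" : "Anna's Homepage", "dc:creator" : "_:anna" } },
  { "_:anna" : { "foaf:name" : "Anna" } },
  { ":about" : { "dc:title" : { "rdf:value" : "Annas hjemmeside", "xml:lang" : "da" } } }
]'''
form, blocks = load_subject_blocks(form2)
merged = merge_blocks(blocks)
```

After the merge, both dc:title values for :about sit under one key, which is what a Form 1 document would have contained from the start.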
>
>
> 3. RDF/XML has short-hand notation for rdf:type statements that allows
> concise "declarations" at the beginning of a document. These declarations
> can aid parsers. For example, OWL models can be aided by knowing if a
> property is an owl:ObjectProperty or an owl:DatatypeProperty when it is
> first *used* (i.e., when it first occurs as a resource in a statement).
> Because the serialization of RDF does not place restrictions on the
> ordering within a document of resource definitions and type statements, a
> predicate's use may precede its declaration and definition (if any). The
> RDF/XML "declaration" short-hand looks like this:
>
> <owl:Class rdf:about="http://mySite.org/MyClass"/>
> <mySite:MyClass rdf:about="http://mySite.org/MyThing"/>
> <owl:DatatypeProperty rdf:about="http://mySite.org/myDatatypeProperty"/>
> <owl:DatatypeProperty rdf:about="http://mySite.org/myOtherDatatypeProperty"/>
> ....
>
> and is semantically equivalent to more verbose rdf:type statements about
> each of the resources.
>
> Now note that the { "S" : { "P" : "O" } } construct leaves two other
> constructs undefined; namely:
>
>   { "S" : "T" } and
>   { "S" : [ "T", ... ] }
>
>   where "T" is some text (a string).
>
> Thus we can define the use of these constructs to support concise rdf:type
> declarations in a manner similar to RDF/XML:
>
> {
>   "owl:Class" : "mySite:MyClass",
>   "mySite:MyClass" : "mySite:MyThing",
>   "owl:DatatypeProperty" : [ "mySite:myDatatypeProperty", "mySite:myOtherDatatypeProperty" ]
>   ...
> }
>
> The meaning of the above is that each JSON value (or array element) is of
> the rdf:type named by its JSON key. There is no ambiguity in how to
> interpret the above because none of the constructs are of the form "S" : {
> ... }. This aligns nicely with RDF/XML declarations. Full example is below
> in 4.
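
Read this way, the shorthand expands mechanically into ordinary rdf:type triples. A minimal Python sketch of that expansion follows; the function name is hypothetical, array members are assumed to be strings, and skipping keys without a colon is an assumption that keeps directives such as "xmlns" out of the result.

```python
RDF_TYPE = "rdf:type"


def expand_type_shorthand(doc):
    """Yield (resource, rdf:type, class) triples for string and array values."""
    for key, value in doc.items():
        if ":" not in key:
            continue                      # directive such as "xmlns" or "?xml"
        if isinstance(value, str):
            yield (value, RDF_TYPE, key)
        elif isinstance(value, list):
            for member in value:
                yield (member, RDF_TYPE, key)
        # dict values are ordinary { "S" : { "P" : "O" } } blocks, not handled here


doc = {
    "owl:Class": "mySite:MyClass",
    "mySite:MyClass": "mySite:MyThing",
    "owl:DatatypeProperty": [
        "mySite:myDatatypeProperty",
        "mySite:myOtherDatatypeProperty",
    ],
    ":about": {"dc:title": "Anna's Homepage"},
}
type_triples = list(expand_type_shorthand(doc))
```

The last entry, a normal subject block, produces no type triple because its value is a JSON object rather than a string or array.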
>
>
> 4. Semantic serialization and parsing. RDF/JSON is presumably a sole RDF
> -> JSON serialization. It need know nothing about RDF/XML (though clearly
> here I advocate changing that to a tighter linkage to informationally
> lossless transliteration of RDF/XML). But it seems that the more that
> RDF/JSON differentiates itself as something more than "one more ad hoc way
> of representing RDF in JSON" (of which there are many such competing
> proposals), the more it could position itself as an important and distinct
> addition to the W3C toolbox.
>
> One way to do this is to more tightly embrace RDF as the underlying W3C
> Semantic Web technology and then use knowledge of those semantics to
> improve the serialization; i.e., RDF/JSON would be a "smart,"
> semantically-aware JSON serialization of W3C Semantic Web technologies.
>
> We immediately distinguish here between "semantic serialization and
> parsing" and "inference." Various implicit forms of semantic parsing are
> already done by many parsers and interpreters--for example, a scripting
> language interpreter may assume from 'var x = 1' that x is an integer
> variable, even though it has not been declared as being of that type. The
> goal of semantic serialization and parsing is to improve and effect the
> serialization and parsing while neither adding nor removing any new
> knowledge. For example, with semantic parsing this:
>
> {
>
>   "owl:ObjectProperty" : ":myProperty",
>
>   ":mySubject" : {
>     ":myProperty" : {
>       "rdf:resource" : "http://example.org/anna"
>     }
>   }
> }
>
> is equivalent to, and could be replaced by, this:
>
> {
>
>   "owl:ObjectProperty" : ":myProperty",
>
>   ":mySubject" : { ":myProperty" : "http://example.org/anna" }
> }
>
> The token "http://example.org/anna" is necessarily a resource, not a
> literal. The line between semantic serialization and parsing and inference
> is subtle. The former is concerned with preservation of explicit statements
> of knowledge (or their absence) while using ex situ knowledge in a manner
> that improves the serialization or parsing; the latter is concerned with
> making statements explicit that may otherwise be necessarily-true yet only
> implicit (not stated). Our focus is on the former. (If a serialization is
> missing statements, we want to preserve that absence, since the action of
> serialization should maintain input->output data integrity [for example,
> cases of purposely "broken" data models for the purpose of testing]).
>
> A side-effect of the above is that in order to support streaming parsers,
> the order of statements in the document can be important (e.g., in the
> above example, if the declaration of myProperty occurred after its
> assignment, then the value "http://example.org/anna" would be considered
> a string literal, not a resource). This can be an issue, because RDF ->
> RDF/XML serializers may not give users control of the ordering of
> statements, nor even guarantee deterministic representations on successive
> invocations, thus RDF -> RDF/XML -> RDF/JSON -> RDF could fail to be
> informationally lossless. There are ways to address this, but at a minimum
> semantic serialization and parsing should be carefully weighed.
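
The ordering dependency can be demonstrated with a small Python sketch. It assumes, purely for illustration, that an owl:ObjectProperty declaration is what tells a parser to read a plain string Object as a resource, and that the parser sees keys in document order (which Python dicts and json.loads preserve). The function name interpret_objects is hypothetical.

```python
def interpret_objects(doc):
    """Walk a document in key order; classify plain-string Objects as resource or literal."""
    object_properties = set()
    results = []
    for key, value in doc.items():
        if key == "owl:ObjectProperty":
            # Declaration seen: remember the declared property names.
            object_properties.update(value if isinstance(value, list) else [value])
        elif isinstance(value, dict) and ":" in key:
            for pred, obj in value.items():
                if isinstance(obj, str):
                    kind = "resource" if pred in object_properties else "literal"
                    results.append((key, pred, obj, kind))
    return results


declared_first = {
    "owl:ObjectProperty": ":myProperty",
    ":mySubject": {":myProperty": "http://example.org/anna"},
}
declared_last = {
    ":mySubject": {":myProperty": "http://example.org/anna"},
    "owl:ObjectProperty": ":myProperty",
}
first_kind = interpret_objects(declared_first)[0][3]
last_kind = interpret_objects(declared_last)[0][3]
```

The same two statements yield "resource" in one ordering and "literal" in the other, which is the loss of information that a non-deterministic upstream serializer could introduce.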
>
> If we accept due diligence on a dependency of statement ordering in the
> document, then we can outline at least four ways to support semantic
> serialization and parsing:
>
> 1. Recognize "rdf:RDF", "xmlns", etc. when they appear in the Subject
> position as document directives, not user-defined Subjects (see above).
>
> 2. Predefine the xmlns namespaces rdf, rdfs, xsd, and owl (require no
> explicit assignments).
>
> 3. Recognize the semantics of rdf:type, rdfs:range, rdfs:domain,
> rdfs:subClassOf, rdfs:subPropertyOf, etc.: the RDF Object of those
> predicates must be a resource (cannot be a literal). An exception and
> special semantics apply when the object is an XSD datatype (e.g.,
> "rdfs:range xsd:integer").
>
> 4. Allow the preservation of ex situ RDF comments with the keyword
> "comment" (or "@comment" or "//" or "#"). For example, if transliterating
> into RDF/XML, then the comments would be re-serialized as XML comments (<!--
> -->). But if translating into N3, then the comments would be re-serialized
> as # comments.
>
> Example:
>
> {
>
>   "?xml" : {
>     "version" : "1.0",
>     "encoding" : "UTF-8"
>   },
>
>   "xmlns" : {
>     ""       : "http://example.org/",
>     "dc"     : "http://purl.org/dc/terms/",
>     "foaf"   : "http://xmlns.com/foaf/0.1/",
>     "mySite" : "http://mySite.org/myTerms/"
>   },
>
>   "//" : "This is a comment",
>
>   "rdf:Property" : [ "dc:title", "dc:creator" ],
>
>   "owl:DatatypeProperty" : "mySite:aDatatypeProperty",
>
>   "owl:ObjectProperty" : "mySite:hasHomepage",
>
>   "owl:Class" : [ "mySite:myClass", "mySite:anotherClass" ],
>
>   "mySite:aDatatypeProperty" : {
>       "rdfs:range" : "xsd:string"
>   },
>
>   "mySite:anObjectProperty" : {
>     "rdfs:range" : "mySite:myClass"
>   },
>
>   "mySite:anotherObjectProperty" : {
>     "rdfs:subPropertyOf" : "mySite:anObjectProperty",
>     "rdfs:domain" : "mySite:myClass"
>   },
>
>   ":about" : {
>       "dc:title" : "Anna's Homepage",
>       "dc:creator" : "_:anna",
>       "mySite:hasHomepage" : "http://example.org/anna",
>       "rdfs:comment" : [
>         "This comment is an explicit property of the subject :about",
>         "So is this one"
>         ],
>       "//" : [
>         "This is not a property of the subject.",
>         "It is equivalent to two XML comments <!-- --> within the :about element block when re-serialized as RDF/XML"
>         ]
>     }
>
> }
>
>
> I believe the above will allow the informationally lossless
> transliteration of thousands (millions?) of extant RDF/XML documents into
> RDF/JSON--though a more thorough analysis is first warranted. The mere
> proliferation of said documents conforming to RDF/JSON should aid in its
> adoption. And of course, de novo RDF -> RDF/JSON is also satisfied.
>
>
> Summary:
>
> There are many candidates for serializing RDF as JSON. If we want anything
> more than the null model of an array of triples, then we should identify the
> goals and prioritize the trade-offs. The proposal here attempts the
> following goals:
>
> 1. RDF/JSON should enable RDF -> JSON serialization independent of any other
> RDF serialization (specifically, one should be able to go directly from an
> RDF data model into RDF/JSON without any intervening serialization).
>
> 2. RDF/JSON should be able to be implemented as a streaming re-serializer
> on legacy RDF/XML without the need for building a complete, in-memory RDF
> data model. The special attention to RDF/XML is because it is already the
> W3C recommended serialization for RDF.
>

I don't understand why this needs to be a goal. I also did not understand
how your proposal enables it, as your examples do not explore the full
range of legal RDF/XML document syntax trees, some of which are
unnecessarily complex and really do not need to be replicated in any other
RDF serialisation. It may be useful to be able to transliterate *from*
RDF/XML to something else, although the use case would be very thin, but
there is no reason to be able to support reserialising back to RDF/XML once
you have gone away from it, so you don't need to preserve the XMLisms in
JSON.


> 3. RDF/JSON should allow the enablement of short-circuit parsing, if the
> provider chooses to serialize content so as to support it.
>

I am a little confused as to how the format you propose, which is not
really the simple Talis RDF/JSON anymore after the changes, could be
structured to *not* support short-circuit parsing anymore. The JSON model
does not allow repeated keys within an object, so there is no simple way to
use subjects as keys in any other way and I am not sure what the other
alternative is from your proposal.

In general though, I am a little confused about the need to ever do
short-circuit parsing. What documents are so large that you cannot pay the
cost of parsing an entire document to the RDF abstract model?

> 4. RDF/JSON should be informationally lossless with respect to both RDF and
> to transliterations of RDF/XML.
>

Any RDF serialisation must be informationally lossless with respect to
RDF. Some serialisations support structures that cannot be translated back
to RDF triples (i.e., any quads format, JSON-LD with relaxed use of blank
nodes, and N3 with its extensions), but all of them are otherwise defined
only in terms of the RDF model, not in terms of another syntax.

I fail to see what the benefit would be to having a consistent
transliteration from the huge variety of possible RDF/XML structures
without going through an RDF model.


> 5. RDF/JSON should reflect a "natural" JSON representation: simple things
> should be "simply serialized" and complex things should be built from
> simple things. If one knows JSON, but doesn't really know RDF, then one
> should feel comfortable that JSON constructs are being used in intuitive,
> "natural" ways without the need for syntactic convolutions.
>

I think you would be more comfortable using JSON-LD, as it is designed
based on many of your goals, except for the RDF/XML transliteration goal,
and includes many of the features that you propose, except for comments.


> 6. As a proposed W3C recommendation, RDF/JSON should leverage RDF, RDFS,
> XSD, and OWL semantics when it can do so either without compromise to the
> above goals, or with clear and prioritized compromise (for example,
> identifying cases where reliance on statement ordering is acceptable).
>

Of the RDF serialisations, only N3, with its non-RDF extensions, attempts
to do anything other than provide a container for simple RDF triples or
quads.  How would your proposed format encode anything above RDF triples
while staying consistent with RDF?

Don't get me wrong, you could have a niche format for your own purposes.
However, I think the use cases, which heavily rely on being able to
represent a literal RDF/XML document in JSON, are very thin and would not
be of interest to many people who will simply pay the cost of parsing an
entire document to memory. Alternatively, parsing RDF/XML to N-Triples can
be done while streaming from disk to disk, and sorting the document can be
done easily with a fixed memory cost, before parsing it and serialising to
Talis RDF/JSON in a streaming method. This is all possible without the
legacy XML-specific information that will not be practically useful to
anyone using the JSON document, and they will not want to preserve it, in
general, just to support a translation back to the exact XML document that
was originally used to create it.
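
As a rough illustration of the last step of that pipeline, here is a Python sketch that turns subject-sorted N-Triples into Talis-style RDF/JSON while holding only one subject's statements in memory at a time. It is a simplification under stated assumptions: the regular expression covers only a subset of N-Triples (no comments, escape sequences left undecoded), the input is assumed to be already sorted by subject, and the names parse_line and write_rdf_json are made up for this example.

```python
import io
import json
import re
from itertools import groupby

# Simplified N-Triples pattern: IRI or bnode subject, IRI predicate, and an object
# that is an IRI, a bnode, or a literal with optional language tag or datatype.
TRIPLE = re.compile(
    r'^(<[^>]*>|_:\S+)\s+<([^>]*)>\s+'
    r'(?:<([^>]*)>|(_:\S+)|"((?:[^"\\]|\\.)*)"(?:@([\w-]+)|\^\^<([^>]*)>)?)\s*\.\s*$'
)


def parse_line(line):
    """Parse one N-Triples line into (subject, predicate, RDF/JSON object dict)."""
    m = TRIPLE.match(line)
    if m is None:
        raise ValueError(f"unsupported N-Triples line: {line!r}")
    subj, pred, uri, bnode, lit, lang, dtype = m.groups()
    subject = subj[1:-1] if subj.startswith("<") else subj
    if uri is not None:
        obj = {"type": "uri", "value": uri}
    elif bnode is not None:
        obj = {"type": "bnode", "value": bnode}
    else:
        obj = {"type": "literal", "value": lit}  # escape sequences left undecoded
        if lang:
            obj["lang"] = lang
        if dtype:
            obj["datatype"] = dtype
    return subject, pred, obj


def write_rdf_json(sorted_lines, out):
    """Write subject-sorted N-Triples as RDF/JSON, one subject at a time."""
    triples = (parse_line(line) for line in sorted_lines if line.strip())
    out.write("{")
    for i, (subject, group) in enumerate(groupby(triples, key=lambda t: t[0])):
        preds = {}
        for _, pred, obj in group:
            preds.setdefault(pred, []).append(obj)
        out.write(("," if i else "") + json.dumps(subject) + ":" + json.dumps(preds))
    out.write("}")


lines = [
    '<http://example.org/about> <http://purl.org/dc/terms/creator> _:anna .',
    '<http://example.org/about> <http://purl.org/dc/terms/title> "Annas hjemmeside"@da .',
    '_:anna <http://xmlns.com/foaf/0.1/name> "Anna" .',
]
buf = io.StringIO()
write_rdf_json(lines, buf)
result = json.loads(buf.getvalue())
```

Because groupby only merges adjacent lines, the sort step is what guarantees each subject key is written exactly once.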

Of your proposed changes to Talis RDF/JSON, the namespace extensions would
be of most interest to me, although I would definitely not relate it to the
XML QName specification which is far too limited to be of any use in a
modern format.

Prior to that, if W3C is interested in continuing at all with RDF/JSON
standardisation, I will be proposing to add Graph/Quads support to the
specification based on the extension that Joshua Shinavier made to the
format for the Sesame RDF/JSON parser/writer. It adds an extra "graph" key
with an Array of URIs, added to the Object position, and is fairly
backwards compatible with the current Talis RDF/JSON specification as long
as parsers do not fault on the unrecognised "graph" key.

Peter

Received on Friday, 17 May 2013 11:27:32 UTC