- From: Damian Gessler <dgessler@iplantcollaborative.org>
- Date: Fri, 17 May 2013 10:47:23 -0600
- To: public-rdf-comments@w3.org
Hi Dave,

Thank you. I've been happy to see the progress with JSON-LD. For a number of years we've had to use our own JSON formulation for production because of the lack of a W3C rec re JSON and OWL. Our systems run transaction-time DL reasoning on SSWAP OWL Semantic Web Services. See http://sswap.info, http://sswap.info/api [particularly http://sswap.info/api/JSONSyntax], and http://sswap.info/jit. The work is funded by the National Science Foundation.

I do believe that some study of my proposal for RDF/JSON shows it to address a suite of issues relevant to its design in what is at its core a simple and tight model, but if that discussion is closed, then at least it stands in the public record.

Best,
Damian.

On 5/17/13 9:13 AM, David Wood wrote:
> Hi Damian,
>
> The RDF WG held substantial discussions regarding various designs for
> RDF in JSON in the first half of 2011. The discussions are well
> documented both in our mailing list and on our wiki. We decided roughly
> a year later (May/June 2012) to proceed with JSON-LD due to the success
> of that community group's activities and implementations. One might note
> that take-up of JSON-LD from third parties has been solid (e.g.,
> Google's GMail announcement yesterday).
>
> The only JSON item on our plate this late in the WG's charter is whether
> to write a Note (not a Recommendation) on RDF/JSON. That's it. We will
> not be accepting new design proposals at this time, although a future
> working group might consider your proposal.
>
> Regards,
> Dave
> (Chair hat *on*)
> --
> http://about.me/david_wood
>
>
> On May 17, 2013, at 7:27, Peter Ansell <ansell.peter@gmail.com> wrote:
>
>>
>>
>> On 17 May 2013 09:10, Damian Gessler <dgessler@iplantcollaborative.org> wrote:
>>
>> This discussion is long, but hopefully offers constructive
>> comment for RDF/JSON.
>> It is submitted as an email per directions at
>> https://dvcs.w3.org/hg/rdf/raw-file/default/rdf-json/index.html.
>>
>> The model proposed here addresses untyped literals, typed
>> literals, resources (URIs and bnodes), QNames (including reserved
>> prefixes, user-defined prefixes, and a default namespace),
>> preservation of XML encoding information, type declarations,
>> comments, short-circuit parsing, and both aggregate and dispersed
>> subject blocks. It does so with a "natural" reading of the
>> resultant JSON that yields similarities to both N3 and RDF/XML. It
>> is designed to be informationally lossless with respect to both
>> RDF and RDF/XML, and can be used either as a pure RDF
>> serialization independent of RDF/XML, or as a streaming
>> transliteration on the large extant repository of legacy RDF/XML
>> documents on the Web.
>>
>> We begin simply and pedagogically, but things will speed up:
>>
>> 1. We ask rhetorically what we are trying to achieve with
>> RDF/JSON. We begin with an immediate and simple JSON serialization
>> for RDF: a serialization that preserves the core and fundamental
>> data model of RDF (the S,P,O triple) while adding little else; viz:
>>
>> [
>>   [ "S", "P", "O" ],
>>   [ "S", "P", "O" ],
>>   ...
>> ]
>>
>> Where S is the Subject, P is the Predicate (or Property), and O is
>> the Object. This simple serialization can be expanded to support
>> literal datatypes in a number of ways; e.g.:
>>
>> [
>>   [ "S", "P", "L" ],
>>   [ "S", "P", { "L" : "D" } ],
>>   [ "S", "P", { "R" : {} } ],
>>   ...
>> ]
>>
>> for RDF Objects L (Literal) (and datatype D) and R (Resource) (URI
>> or bnode). There are also other minor variants and syntaxes that
>> could differentiate between untyped literals, typed literals, and
>> resources.
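
The "null model" above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Literal` and `Resource` classes and the `to_null_model` function are hypothetical names introduced here, not part of RDF/JSON or of the proposal, and the encoding follows the three object shapes shown above ("L", { "L" : "D" }, { "R" : {} }).

```python
import json
from typing import NamedTuple, Optional


class Literal(NamedTuple):
    """Hypothetical helper: an RDF literal, optionally typed."""
    value: str
    datatype: Optional[str] = None  # None means an untyped literal


class Resource(NamedTuple):
    """Hypothetical helper: a URI or a bnode label such as "_:b0"."""
    ident: str


def encode_object(obj):
    """Encode an RDF Object in one of the three null-model shapes."""
    if isinstance(obj, Resource):
        return {obj.ident: {}}          # { "R" : {} }
    if obj.datatype is None:
        return obj.value                # "L"
    return {obj.value: obj.datatype}    # { "L" : "D" }


def to_null_model(triples):
    """Serialize (subject, predicate, object) triples as an array of arrays."""
    return [[s, p, encode_object(o)] for s, p, o in triples]


if __name__ == "__main__":
    triples = [
        ("http://example.org/about", "http://purl.org/dc/terms/title",
         Literal("Anna's Homepage")),
        ("http://example.org/about", "http://xmlns.com/foaf/0.1/homepage",
         Resource("http://example.org/anna")),
    ]
    print(json.dumps(to_null_model(triples), indent=2))
```

Nothing beyond the triple itself is modeled here, which is the point of calling it a null model: any grouping by subject is left to the consumer.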
>>
>> We will reject this serialization per se; but it is important to
>> offer it as a "null model" because that forces us to be explicit
>> as to why another serialization with necessarily overloaded
>> semantics is preferable.
>>
>> Clearly, by not stopping at this immediate and natural JSON
>> serialization of triples, the vision of RDF/JSON must be either
>> implicitly, or explicitly, something other than just serializing
>> RDF into JSON.
>>
>> By presenting a data model of:
>>
>> { "S" : { "P" : [ "O", ... ] } }
>>
>> RDF/JSON shows that it prioritizes a subject-oriented data
>> structure of the underlying RDF data model in achieving its JSON
>> serialization. This elegant, natural, data model has similarities
>> to the use and adoption of N3 over N-Triples.
>>
>>
>> 2. We note that the goal of RDF/JSON cannot be interpreted as
>> translating legacy JSON -> RDF. This is because the semantics of any
>> arbitrary, legacy, JSON document do not map to the semantics of
>> RDF/JSON. For example, JSON arrays do not map to RDF List
>> constructs--and indeed, nor should they, for an array is not a
>> list (though in many cases it can be interpreted as such). Also,
>> RDF/JSON introduces reserved keys ("type", "value", "lang",
>> "datatype") that have implied semantics on the resultant
>> de-serialized data models that are not recognized as such in JSON.
>> This is not to say that one could not read legacy JSON, build an
>> in-memory data model, and output RDF/JSON; it is to say that such
>> an operation (arbitrary, legacy JSON -> RDF -> RDF/JSON) is
>> outside both the goals and spec of RDF/JSON. For JSON -> RDF, see
>> JSON-LD [1].
>>
>> Thus the perspective of RDF/JSON is focused on RDF -> JSON, while
>> leveraging some of the JSON data modeling constructs. The W3C
>> recommended serialization for RDF is RDF/XML [2]. There is a large
>> legacy presence of RDF/XML documents on the Web, especially for
>> OWL.
>> Thus a desirable characteristic of a JSON serialization would
>> be the informationally lossless transformation of RDF/XML -> JSON.
>> This becomes a key guide for the following discussion. While
>> RDF/JSON can position itself as solely an RDF serialization
>> independent of others, distinct, and separate from RDF/XML, this
>> is perhaps a missed opportunity.
>>
>> Alternatively, RDF/JSON could position itself as an RDF -> JSON
>> serialization that builds upon, and is receptive to,
>> informationally lossless transliterations of the
>> already-recommended W3C serialization for RDF: RDF/XML. The
>> motivation is that such an approach builds a suite of
>> complementary W3C technologies, including various serializations,
>> rather than merely a collection of competing formats. Of course,
>> RDF/JSON should also be able to stand separate and independent of
>> RDF/XML, such that one could go RDF -> RDF/JSON -> RDF without any
>> serialization through RDF/XML. Thus we seek both worlds.
>>
>> Currently, RDF/JSON is not informationally lossless with respect
>> to RDF/XML; we note a number of difficulties:
>>
>> 2a. QNames. RDF/JSON does not support QNames [3]. This presumably
>> could be addressed by adding semantics on how to serialize
>> prefixes. If RDF/JSON chooses not to support QNames then it can
>> still be said to be informationally lossless with respect to RDF, but
>> it cannot be said to be informationally lossless with respect to
>> RDF/XML. This would seem to be an undesirable and unnecessary
>> limitation.
>>
>> 2b. Serializing. RDF/JSON binds all of a Subject's predicates, and
>> all and each of those Predicates' Objects into a single, compound
>> JSON object. Yet RDF/XML does not require that all statements
>> about a Subject be together or in any one place in the document,
>> and RDF does not require this generically for serialization.
>> Thus RDF/JSON cannot be implemented as a streaming syntactical
>> re-serializer directly on RDF/XML: RDF/JSON must have knowledge of
>> the entire RDF data model, such as to know all of a Subject's
>> predicates and their objects, before it can serialize even the
>> first subject. This is somewhat unfortunate, since we would like a
>> serialization spec to be independent of implementation algorithms,
>> be they streaming or "DOM"-based. RDF/JSON's requirement that "S"
>> be unique (for each unique Subject) is forced upon it by JSON's
>> requirement that all keys in a JSON object be unique (but see below).
>>
>> 2c. Parsing. RDF/JSON imposes a data model outside of RDF proper,
>> which limits the utility of the serialization. But it is fair to
>> say it also enhances the utility of the serialization: there is a
>> trade-off. The elegance and "naturalness" of RDF/JSON's { "S" : {
>> "P" : [ "O" ] } } model necessarily clusters statements about
>> Subjects, while dispersing statements about Predicates and Objects
>> throughout the document. I call this the "phone book" problem,
>> where the chosen serialization of the producer limits the utility
>> available to the consumer, even though the consumer "has all the
>> data." In the "old days," phone books were distributed as
>> serialized name:number pairs, sorted by name, printed on paper.
>> The sorting produced essentially an array, such that one could use
>> an approximate binary search to find a name amongst a million
>> entries in a matter of seconds. The data producer (the phone
>> company) gave the consumer both name and number, and at some level
>> did not care whether the consumer was interested in the name,
>> number, or both. But the serialization essentially forced the
>> consumer to accept name:number ordered-pairs; the sorting and
>> serialization on name biased against number:name utility.
>> A separate serialization (called a reverse-lookup) was needed if one
>> had a number and wanted to find its associated name. These books
>> were usually hard to find. What is relevant here is not the old
>> days of phone books, but to note that RDF has no such restriction.
>> RDF does not bias Subjects over Objects, or Objects over
>> Predicates, etc. One of the benefits of the RDF/JSON modeling is
>> that once one is done processing a Subject, one is guaranteed that
>> no more syntactic statements about the Subject (as a Subject, and
>> as identified lexically by its key [i.e., not addressing the
>> semantics of owl:sameAs]) shall be made. Thus unlike RDF/XML, a
>> streaming parser can be implemented for RDF/JSON such that further
>> processing of a document stream can be abandoned prior to the
>> entire document being processed. I call this "short-circuit"
>> parsing. But this comes at the cost that the RDF/JSON model limits
>> the utility of the data when not consumed as intended, and in this
>> case the "intent" is set not by the producer, but by RDF/JSON
>> itself. One could say that RDF/JSON benefits the parser at the
>> expense of the serializer.
>>
>> 2d. RDF/JSON has no mechanism to retain comments ex situ of RDF
>> (e.g., RDF/XML XML comments [<!-- -->]). This is made difficult
>> due to JSON's lack of support for embedded comments.
>>
>>
>> The proposal below addresses the above issues while keeping very
>> much in the flavor of RDF/JSON's { "S" : { "P" : [ "O" ] } }
>> model. It is informationally lossless with respect to both RDF and
>> RDF/XML (supports QNames and comments); it supports streaming
>> serialization (e.g., as a syntactical transliterator on streaming
>> RDF/XML); and it supports streaming parsing of its own serialization.
>>
>> The proposal is quite simple and contains two "forms":
>>
>> Form 1. Guarantee that all statements about a Subject are
>> localized in the document, thus supporting short-circuit parsing.
>> Short-circuit guarantees are "communicated" to the parser by
>> virtue of an opening JSON object. A parser is guaranteed that all
>> keys of a JSON object are unique, thus when it "sees" a JSON
>> object, it "knows" that all statements about the key are localized
>> to the JSON object.
>>
>> Form 1 is very similar in structure to RDF/JSON.
>>
>> 1a. Simple, untyped literals:
>>
>> {
>>   "S" : { "P" : "L" }
>> }
>>
>> Examples:
>>
>> 1a.i
>> {
>>   "http://example.org/about" :
>>     { "http://purl.org/dc/terms/title" : "Anna's Homepage" }
>> }
>>
>> 1a.ii
>> {
>>   "http://example.org/about" : {
>>     "http://purl.org/dc/terms/title" : [ "Anna's Homepage", "Annas hjemmeside" ],
>>     "http://anotherUniqueProperty/p" : "L"
>>     ...
>>   }
>> }
>>
>> JSON array [] constructs are required for the Object only as
>> needed. This differs from RDF/JSON which requires Object array
>> constructs even in cases of there being only a single Object. JSON
>> imposes no unique value restriction for array elements.
>>
>> Example 1a.i shows that simple statements are "simply" serialized.
>> The examples below will show that more complex statements are
>> built from the application of simple rules.
>>
>> Example 1a.ii shows JSON arrays as RDF Objects to package multiple
>> property instances and values.
>>
>> 1b. Typed Literals. We note from RDF/XML that datatypes on
>> literals are attributes on the Predicates (not on the literals
>> themselves). In a similar manner, typed literals do not have a
>> language, per se [4]: a language qualifier is on the Predicate.
>> Thus we here make a simple extension that allows us to replace
>> the literal "L" with a JSON object {} to capture arbitrary
>> RDF/XML attribute data, with special semantics for "rdf:value"; i.e.:
>>
>> 1b. Typed literals:
>>
>> {
>>   "S" : { "P" : {
>>       "rdf:value" : "L",
>>       "rdf:datatype" : "D",
>>       ...
>>     }
>>   }
>> }
>>
>> Example:
>>
>> {
>>   "http://example.org/about" : {
>>     "http://purl.org/dc/terms/title" : {
>>       "rdf:value" : "Annas hjemmeside",
>>       "rdf:datatype" : "http://www.w3.org/2001/XMLSchema#string",
>>       "xml:lang" : "da"
>>     }
>>   }
>> }
>>
>> Here, rdf:value is akin to RDF/JSON "value." It and it alone is
>> NOT an attribute on the Predicate (it is the "text content" of the
>> equivalent XML element), but all other key:value pairs are
>> interpreted as Predicate attributes. rdf:datatype is akin to
>> RDF/JSON's "datatype," but there is no need to introduce a new and
>> reserved key word: the RDF/XML attribute assumes the role immediately.
>>
>> This simple form--that RDF Objects are JSON Objects with a
>> syntactical placement of RDF/XML attributes--yields an immediate
>> and consistent extension for Objects as resources (URIs and bnodes):
>>
>> 1c. Objects as resources (URIs and bnodes):
>>
>> {
>>   "S" : { "P" :
>>     {
>>       "rdf:resource" : "O",
>>       ...
>>     }
>>   }
>> }
>>
>> Compound example:
>>
>> {
>>   "http://example.org/about" : {
>>
>>     "http://purl.org/dc/terms/title" : [
>>
>>       "Anna's Homepage",
>>
>>       {
>>         "rdf:value" : "Annas hjemmeside",
>>         "rdf:datatype" : "http://www.w3.org/2001/XMLSchema#string",
>>         "xml:lang" : "da"
>>       } ],
>>
>>     "http://xmlns.com/foaf/0.1/homepage" : { "rdf:resource" : "http://example.org/anna" },
>>
>>     "http://purl.org/dc/terms/creator" : "_:anna"
>>
>>   }
>> }
>>
>> At first it may not seem that the above proposal differs much in
>> substance from RDF/JSON, but it does in a number of ways.
>> It retains the essence of the { "S" : { "P" : "O" } } model, but
>> simplifies the serialization for simple cases, and aligns more
>> complex cases with a transliteration of RDF/XML attributes. This
>> requires no actual knowledge of RDF as a re-serializer.
>>
>> The model also lends itself "naturally" to QName support [3], thus
>> becoming closer to being informationally lossless with respect to
>> RDF/XML. We support QNames by noting the "xmlns" attribute on the
>> rdf:RDF "Subject"; viz.:
>>
>> {
>>
>>   "rdf:RDF" : {
>>
>>     "xmlns:rdf" : "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
>>     "xmlns:xsd" : "http://www.w3.org/2001/XMLSchema#",
>>
>>     "xmlns:" : "http://example.org/",
>>     "xmlns:dc" : "http://purl.org/dc/terms/",
>>     "xmlns:foaf" : "http://xmlns.com/foaf/0.1/"
>>   },
>>
>>   ":about" : {
>>     ...
>>   }
>> }
>>
>> We bootstrap the definition of the rdf: namespace within the
>> rdf:RDF construct. We make the implicit assumption that the token
>> "rdf:RDF" can never itself be the valid Subject of a user-defined
>> payload--a topic we discuss further in section 4 below.
>>
>> We can achieve a slight clean-up in presentation by recognizing
>> "xmlns" as a keyword, but we do this only as "syntactical sugar"
>> on the underlying model of XML attributes on Subject entries; e.g.:
>>
>> {
>>   "xmlns" : {
>>     "" : "http://example.org/",
>>     "dc" : "http://purl.org/dc/terms/",
>>     "foaf" : "http://xmlns.com/foaf/0.1/"
>>   },
>>
>>   ":about" : {
>>     ...
>>   }
>> }
>>
>> RDF requires that all Subjects are resources: either URIs or
>> bnodes.
>> Resources can be lexically written in four variants:
>>
>> Absolute URIs; e.g., http://example.org/about, urn:example:about
>> QName with prefix (namespace); e.g., dc:title
>> QName with reserved underscore (_) for bnode; e.g., _:anna
>> QName with user-defined default namespace; e.g., ":myTerm"
>>
>> Notably, RDF does not allow relative URIs for Subjects or
>> Predicates [5]. Thus "a", "5", "a/b/c", are all valid (relative)
>> URIs, but are lexically illegal as RDF Subjects. Thus we note that
>> lexically, all valid Subjects and Predicates necessarily always
>> contain a colon (:). Thus we can unambiguously allow the keyword
>> "xmlns" (or "@xmlns") to appear in the "S" place and overload it
>> with special meaning as a document directive. In a similar manner
>> we can use "?xml" to preserve record of the XML document encoding
>> that may appear on the first line of an RDF/XML document. In so
>> doing we are not stating that 'this' document has the encoding; we
>> are stating that this document, if transliterated from, or to,
>> XML, has the encoding:
>>
>> {
>>
>>   "?xml" : {
>>     "version" : "1.0",
>>     "encoding" : "UTF-8"
>>   },
>>
>>   "xmlns" : {
>>     "rdf" : "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
>>     "xsd" : "http://www.w3.org/2001/XMLSchema#",
>>     "" : "http://example.org/",
>>     "dc" : "http://purl.org/dc/terms/",
>>     "foaf" : "http://xmlns.com/foaf/0.1/"
>>   },
>>
>>   ":about" : {
>>
>>     "dc:title" : [
>>       "Anna's Homepage",
>>       {
>>         "rdf:value" : "Annas hjemmeside",
>>         "rdf:datatype" : "xsd:string",
>>         "xml:lang" : "da"
>>       } ],
>>
>>     "foaf:homepage" : { "rdf:resource" : ":anna" },
>>
>>     "dc:creator" : "_:anna"
>>
>>   },
>>
>>   "_:anna" : {
>>     "foaf:name" : "Anna",
>>     "foaf:homepage" : { "rdf:resource" : "http://example.org/anna" }
>>   }
>> }
>>
>> Note in the above the use of (source) doc encoding, prefixes,
>> default namespace, QNames, absolute URIs, bnodes, untyped
>> literals, and typed literals. This could have been serialized from
>> an RDF data model, or transliterated syntactically from RDF/XML.
>> Our rules are still simple and consistent: almost the same as
>> RDF/JSON, with the extension that object "metadata" is analogous
>> to RDF/XML attributes and bundled inside a JSON object using
>> existing rdf: namespace predicates.
>>
>>
>> Form 2. Support the dispersal of statements throughout a
>> document, for example as applicable when stream transliterating
>> RDF/XML -> JSON. This currently cannot be done in RDF/JSON, but is
>> quite simple to do:
>>
>> [
>>
>>   { "?xml" : {
>>       "version" : "1.0",
>>       "encoding" : "UTF-8"
>>     }
>>   },
>>
>>   { "xmlns" : {
>>       "rdf" : "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
>>       "xsd" : "http://www.w3.org/2001/XMLSchema#",
>>       "" : "http://example.org/",
>>       "dc" : "http://purl.org/dc/terms/",
>>       "foaf" : "http://xmlns.com/foaf/0.1/"
>>     }
>>   },
>>
>>   { ":about" :
>>     {
>>       "dc:title" : "Anna's Homepage",
>>       "dc:creator" : "_:anna"
>>     }
>>   },
>>
>>   { "_:anna" : {
>>       "foaf:name" : "Anna",
>>       "foaf:homepage" : { "rdf:resource" : "http://example.org/anna" }
>>     }
>>   },
>>
>>   { ":about" : {
>>       "dc:title" : {
>>         "rdf:value" : "Annas hjemmeside",
>>         "rdf:datatype" : "xsd:string",
>>         "xml:lang" : "da"
>>       },
>>       "foaf:homepage" : { "rdf:resource" : ":anna" }
>>     }
>>   }
>>
>> ]
>>
>> (Note the repetition of :about). All the previous rules apply. We
>> simply note that { "S" : { "P" : "O" } } used in the earlier
>> examples was just a simplification of a larger, more encompassing
>> model: [ { "S" : { "P" : "O" } }, { "S" : { "P" : "O" } }, ... ].
>> This reads "naturally": an array of JSON objects, each making
>> statements about an RDF Subject, with no restriction that
>> successive Subjects be unique (because each is enclosed in its own
>> {} construct).
>> The embracing opening and closing JSON array []
>> construct (Form 2) "communicates" the chosen serialization to the
>> parser that it may NOT now assume that all statements about a
>> given Subject are known, until it processes through the
>> End-Of-File. If the serializer chooses to group all statements for
>> all subjects (Form 1), then it can easily do this too by not using
>> the opening JSON array [] construct and building JSON objects per
>> the earlier examples above. Thus the "spec" does not bias towards
>> parsers or serializers (it lets the producer decide). The spec
>> supports short-circuiting for both streaming serializers and
>> streaming parsers: just write/read the first non-whitespace
>> character as a '[' or '{' and proceed accordingly.
>>
>>
>> 3. RDF/XML has short-hand notation for rdf:type statements that
>> allows concise "declarations" at the beginning of a document.
>> These declarations can aid parsers. For example, OWL models can be
>> aided by knowing if a property is an owl:ObjectProperty or an
>> owl:DatatypeProperty when it is first *used* (i.e., when it first
>> occurs as a resource in a statement). Because the serialization of
>> RDF does not place restrictions on the ordering within a document
>> of resource definitions and type statements, a predicate's use may
>> precede its declaration and definition (if any). The RDF/XML
>> "declaration" short-hand looks like this:
>>
>> <owl:Class rdf:about="http://mySite.org/MyClass"/>
>> <mySite:MyClass rdf:about="http://mySite.org/MyThing"/>
>> <owl:DatatypeProperty rdf:about="http://mySite.org/myDatatypeProperty"/>
>> <owl:DatatypeProperty rdf:about="http://mySite.org/myOtherDatatypeProperty"/>
>> ....
>>
>> and is semantically equivalent to more verbose rdf:type statements
>> about each of the resources.
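
The equivalence between the RDF/XML short-hand and explicit rdf:type statements can be made concrete with a small Python sketch. It is only an illustration under assumptions: `type_triples` is a hypothetical helper, the `xmlns:mySite="http://mySite.org/"` binding in the sample is assumed (the text above does not give it), and only top-level typed nodes carrying rdf:about are handled, which is far from the full RDF/XML grammar.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Sample document; the mySite namespace binding is an assumption.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns:mySite="http://mySite.org/">
  <owl:Class rdf:about="http://mySite.org/MyClass"/>
  <mySite:MyClass rdf:about="http://mySite.org/MyThing"/>
  <owl:DatatypeProperty rdf:about="http://mySite.org/myDatatypeProperty"/>
</rdf:RDF>"""


def type_triples(rdfxml):
    """Expand top-level typed-node short-hand into explicit rdf:type triples."""
    root = ET.fromstring(rdfxml)
    triples = []
    for node in root:
        about = node.get("{%s}about" % RDF_NS)
        # ElementTree spells qualified tags as "{namespace}local".
        if about is None or not node.tag.startswith("{"):
            continue
        namespace, local = node.tag[1:].split("}", 1)
        class_uri = namespace + local
        if class_uri == RDF_NS + "Description":
            continue  # rdf:Description carries no implicit type
        triples.append((about, RDF_NS + "type", class_uri))
    return triples


if __name__ == "__main__":
    for triple in type_triples(SAMPLE):
        print(triple)
```

Each typed node yields exactly one (resource, rdf:type, class) triple, which is the "more verbose" form referred to above.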
>>
>> Now note that the { "S" : { "P" : "O" } } construct leaves two
>> other constructs undefined; namely:
>>
>> { "S" : "T" } and
>> { "S" : [ "T", ... ] }
>>
>> where "T" is some text (a string).
>>
>> Thus we can define the use of these constructs to support concise
>> rdf:type declarations in a manner similar to RDF/XML:
>>
>> {
>>   "owl:Class" : "mySite:myClass",
>>   "mySite:MyClass" : "mySite:myThing",
>>   "owl:DatatypeProperty" : [ "mySite:myDatatypeProperty",
>>                              "mySite:myOtherDatatypeProperty" ]
>>   ...
>> }
>>
>> The meaning of the above is that the JSON objects (or array
>> elements) are each rdf:type of the JSON subject. There is no
>> ambiguity in how to interpret the above because none of the
>> constructs are of the form "S" : { ... }. This aligns nicely with
>> RDF/XML declarations. A full example is below in 4.
>>
>>
>> 4. Semantic serialization and parsing. RDF/JSON is presumably a
>> sole RDF -> JSON serialization. It need know nothing about RDF/XML
>> (though clearly here I advocate changing that to a tighter linkage
>> to informationally lossless transliteration of RDF/XML). But it
>> seems that the more that RDF/JSON differentiates itself as
>> something more than "one more ad hoc way of representing RDF in
>> JSON" (of which there are many such competing proposals), the more
>> it could position itself as an important and distinct addition to
>> the W3C toolbox.
>>
>> One way to do this is to more tightly embrace RDF as the
>> underlying W3C Semantic Web technology and then use knowledge of
>> those semantics to improve the serialization; i.e., RDF/JSON would
>> be a "smart," semantically-aware JSON serialization of W3C
>> Semantic Web technologies.
>>
>> We immediately distinguish here between "semantic serialization
>> and parsing" and "inference."
>> Various implicit forms of semantic
>> parsing are already done by many parsers and interpreters--for
>> example, a scripting language interpreter may assume from 'var x =
>> 1' that x is an integer variable, even though it has not been
>> declared as being of that type. The goal of semantic serialization
>> and parsing is to improve and effect the serialization and parsing
>> while neither adding nor removing any new knowledge. For example,
>> with semantic parsing this:
>>
>> {
>>
>>   "owl:DatatypeProperty" : ":myProperty",
>>
>>   ":mySubject" : {
>>     ":myProperty" : {
>>       "rdf:resource" : "http://example.org/anna"
>>     }
>>   }
>> }
>>
>> is equivalent to, and could be replaced by, this:
>>
>> {
>>
>>   "owl:DatatypeProperty" : ":myProperty",
>>
>>   ":mySubject" : { ":myProperty" : "http://example.org/anna" }
>> }
>>
>> The token "http://example.org/anna" is necessarily a resource, not
>> a literal. The line between semantic serialization and parsing and
>> inference is subtle. The former is concerned with preservation of
>> explicit statements of knowledge (or their absence) while using ex
>> situ knowledge in a manner that improves the serialization or
>> parsing; the latter is concerned with making statements explicit
>> that may otherwise be necessarily-true yet only implicit (not
>> stated). Our focus is on the former. (If a serialization is
>> missing statements, we want to preserve that absence, since the
>> action of serialization should maintain input->output data
>> integrity [for example, cases of purposely "broken" data models
>> for the purpose of testing]).
>>
>> A side-effect of the above is that in order to support streaming
>> parsers, the order of statements in the document can be important
>> (e.g., in the above example, if the declaration of myProperty
>> occurred after its assignment, then the value
>> "http://example.org/anna" would be considered a string literal,
>> not a resource).
>> This can be an issue, because RDF -> RDF/XML
>> serializers may not give users control of the ordering of
>> statements, nor even guarantee deterministic representations on
>> successive invocations, thus RDF -> RDF/XML -> RDF/JSON -> RDF
>> could fail to be informationally lossless. There are ways to
>> address this, but at a minimum semantic serialization and parsing
>> should be carefully weighed.
>>
>> If we accept due diligence on a dependency of statement ordering
>> in the document, then we can outline at least four ways to support
>> semantic serialization and parsing:
>>
>> 1. Recognize "rdf:RDF", "xmlns", etc. when they appear in the
>> Subject position as document directives, not user-defined Subjects
>> (see above).
>>
>> 2. Predefine the xmlns namespaces rdf, rdfs, xsd, and owl (require
>> no explicit assignments).
>>
>> 3. Recognize the semantics of rdf:type, rdfs:range, rdfs:domain,
>> rdfs:subClassOf, rdfs:subPropertyOf, etc.: the RDF Object of those
>> predicates must be a resource (cannot be a literal). An exception
>> and special semantics apply when the object is an XSD datatype
>> (e.g., "rdfs:range xsd:integer").
>>
>> 4. Allow the preservation of ex situ RDF comments with the keyword
>> "comment" (or "@comment" or "//" or "#"). For example, if
>> transliterating in RDF/XML, then the comments would be
>> re-serialized as XML comments (<!-- -->). But if translating into
>> N3, then the comments would be re-serialized as # comments.
>>
>> Example:
>>
>> {
>>
>>   "?xml" : {
>>     "version" : "1.0",
>>     "encoding" : "UTF-8"
>>   },
>>
>>   "xmlns" : {
>>     "" : "http://example.org/",
>>     "dc" : "http://purl.org/dc/terms/",
>>     "foaf" : "http://xmlns.com/foaf/0.1/",
>>     "mySite" : "http://mySite.org/myTerms/"
>>   },
>>
>>   "//" : "This is a comment",
>>
>>   "rdf:Property" : [ "dc:title", "dc:creator" ],
>>
>>   "owl:DatatypeProperty" : "mySite:aDatatypeProperty",
>>
>>   "owl:ObjectProperty" : "mySite:hasHomepage",
>>
>>   "owl:Class" : [ "mySite:myClass", "mySite:anotherClass" ],
>>
>>   "mySite:aDatatypeProperty" : {
>>     "rdfs:range" : "xsd:string"
>>   },
>>
>>   "mySite:anObjectProperty" : {
>>     "rdfs:range" : "mySite:myClass"
>>   },
>>
>>   "mySite:anotherObjectProperty" : {
>>     "rdfs:subPropertyOf" : "mySite:anObjectProperty",
>>     "rdfs:domain" : "mySite:myClass"
>>   },
>>
>>   ":about" : {
>>     "dc:title" : "Anna's Homepage",
>>     "dc:creator" : "_:anna",
>>     "mySite:hasHomepage" : "http://example.org/anna",
>>     "rdfs:comment" : [
>>       "This comment is an explicit property of the subject :about",
>>       "So is this one"
>>     ],
>>     "//" : [
>>       "This is not a property of the subject.",
>>       "It is equivalent to two XML comments <!-- --> within the :about element block when re-serialized as RDF/XML"
>>     ]
>>   }
>>
>> }
>>
>>
>> I believe the above will allow the informationally lossless
>> transliteration of thousands (millions?) of extant RDF/XML
>> documents into RDF/JSON--though a more thorough analysis is first
>> warranted. The mere proliferation of said documents conforming to
>> RDF/JSON should aid in its adoption. And of course, de novo RDF ->
>> RDF/JSON is also satisfied.
>>
>>
>> Summary:
>>
>> There are many candidates for serializing RDF as JSON. If we want
>> anything more than the null model of an array of triples, then we
>> should identify the goals and prioritize the trade-offs. The
>> proposal here attempts the following goals:
>>
>> 1.
>> RDF/JSON should enable RDF -> JSON serialization independent of
>> any other RDF serialization (specifically, one should be able to
>> go directly from an RDF data model into RDF/JSON without any
>> intervening serialization).
>>
>> 2. RDF/JSON should be able to be implemented as a streaming
>> re-serializer on legacy RDF/XML without the need for building a
>> complete, in-memory RDF data model. The special attention to
>> RDF/XML is because it is already the W3C recommended serialization
>> for RDF.
>>
>>
>> I don't understand why this needs to be a goal. I also did not
>> understand how your proposal enables it, as your examples do not
>> explore the full range of legal RDF/XML document syntax trees, some of
>> which are unnecessarily complex and really do not need to be
>> replicated in any other RDF serialisation. It may be useful to be able
>> to transliterate *from* RDF/XML to something else, although the
>> use case would be very thin, but there is no reason to be able to
>> support reserialising back to RDF/XML once you have gone away from it,
>> so you don't need to preserve the XMLisms in JSON.
>>
>> 3. RDF/JSON should allow the enablement of short-circuit parsing,
>> if the provider chooses to serialize content so as to support it.
>>
>>
>> I am a little confused as to how the format you propose, which is not
>> really the simple Talis RDF/JSON anymore after the changes, could be
>> structured to *not* support short-circuit parsing anymore. The JSON
>> model does not allow repeated keys within an object, so there is no
>> simple way to use subjects as keys in any other way and I am not sure
>> what the other alternative is from your proposal.
>>
>> In general though, I am a little confused about the need to ever do
>> short-circuit parsing. What documents are so large that you cannot pay
>> the cost of parsing an entire document to the RDF abstract model?
>>
>> 4.
>> RDF/JSON should be informationally lossless with respect to
>> both RDF and to transliterations of RDF/XML.
>>
>>
>> Any RDF serialisation must not be informationally lossless with
>> respect to RDF. Some serialisations support structures that cannot be
>> translated back to RDF triples (i.e., any quads format, JSON-LD with
>> relaxed use of blank nodes, and N3 with its extensions), but all of
>> them are otherwise only defined based on the RDF format, not on
>> another syntax.
>>
>> I fail to see what the benefit would be to having a consistent
>> transliteration from the huge variety of possible RDF/XML structures
>> without going through an RDF model.
>>
>> 5. RDF/JSON should reflect a "natural" JSON representation: simple
>> things should be "simply serialized" and complex things should be
>> built from simple things. If one knows JSON, but doesn't really
>> know RDF, then one should feel comfortable that JSON constructs
>> are being used in intuitive, "natural" ways without the need for
>> syntactic convolutions.
>>
>>
>> I think you would be more comfortable using JSON-LD, as it is designed
>> based on many of your goals, except for the RDF/XML transliteration
>> goal, and includes many of the features that you propose, except for
>> comments.
>>
>> 6. As a proposed W3C recommendation, RDF/JSON should leverage RDF,
>> RDFS, XSD, and OWL semantics when it can do so either without
>> compromise to the above goals, or with clear and prioritized
>> compromise (for example, identifying cases where reliance on
>> statement ordering is acceptable).
>>
>>
>> Of the RDF serialisations, only N3, with its non-RDF extensions,
>> attempts to do anything other than provide a container for simple RDF
>> triples or quads. How would your proposed format encode anything
>> above RDF triples while staying consistent with RDF?
>>
>> Don't get me wrong, you could have a niche format for your own
>> purposes.
>> However, I think the use cases, which heavily rely on being
>> able to represent a literal RDF/XML document in JSON, are very thin
>> and would not be of interest to many people who will simply pay the
>> cost of parsing an entire document to memory. Alternatively, parsing
>> RDF/XML to N-Triples can be done while streaming from disk to disk,
>> and sorting the document can be done easily with a fixed memory cost,
>> before parsing it and serialising to Talis RDF/JSON in a streaming
>> method. This is all possible without the legacy XML-specific
>> information that will not be practically useful to anyone using the
>> JSON document, and they will not want to preserve it, in general, just
>> to support a translation back to the exact XML document that was
>> originally used to create it.
>>
>> Of your proposed changes to Talis RDF/JSON, the namespace extensions
>> would be of most interest to me, although I would definitely not
>> relate it to the XML QName specification which is far too limited to
>> be of any use in a modern format.
>>
>> Prior to that, if W3C is interested in continuing at all with
>> RDF/JSON standardisation, I will be proposing to add Graph/Quads
>> support to the specification based on the extension that Joshua
>> Shinavier made to the format for the Sesame RDF/JSON parser/writer. It
>> adds an extra "graph" key with an Array of URIs, added to the Object
>> position, and is fairly backwards compatible with the current Talis
>> RDF/JSON specification as long as parsers do not fault on the
>> unrecognised "graph" key.
>>
>> Peter
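
For readers following the Form 1 / Form 2 distinction discussed in the quoted proposal, the dispatch on the first non-whitespace character can be sketched in Python as below. This is a minimal sketch under assumptions: `read_subject_blocks` and `merge_blocks` are hypothetical names, the whole document is loaded with `json.loads` (so it shows the dispatch and the merging of repeated Subjects, not streaming or short-circuiting), and directive keys and the type-declaration short-hand are simply skipped.

```python
import json


def read_subject_blocks(text):
    """Return (form, blocks); blocks is a list of (subject, body) pairs."""
    stripped = text.lstrip()
    if not stripped:
        raise ValueError("empty document")
    first = stripped[0]
    doc = json.loads(text)
    if first == "{":
        # Form 1: one object, so every Subject key occurs once.
        return 1, list(doc.items())
    if first == "[":
        # Form 2: an array of objects; a Subject may recur in later objects.
        blocks = []
        for obj in doc:
            blocks.extend(obj.items())
        return 2, blocks
    raise ValueError("document must start with '{' or '['")


def merge_blocks(blocks):
    """Group objects by subject and predicate, merging repeated Subjects."""
    merged = {}
    for subject, body in blocks:
        # Per the proposal, real Subjects always contain a colon; keys such
        # as "?xml", "xmlns" and "//" are directives. String or array bodies
        # (type-declaration short-hand) are not handled in this sketch.
        if ":" not in subject or not isinstance(body, dict):
            continue
        predicates = merged.setdefault(subject, {})
        for predicate, obj in body.items():
            if ":" not in predicate:
                continue  # e.g. a "//" comment inside a subject block
            values = obj if isinstance(obj, list) else [obj]
            predicates.setdefault(predicate, []).extend(values)
    return merged


if __name__ == "__main__":
    form2 = """[
      { "xmlns" : { "dc" : "http://purl.org/dc/terms/" } },
      { ":about" : { "dc:title" : "Anna's Homepage" } },
      { ":about" : { "dc:title" : { "rdf:value" : "Annas hjemmeside",
                                    "xml:lang" : "da" } } }
    ]"""
    form, blocks = read_subject_blocks(form2)
    print(form, json.dumps(merge_blocks(blocks), indent=2))
```

A consumer that needs the short-circuit guarantee would check that the returned form is 1 before assuming a Subject's statements are complete.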
Received on Friday, 17 May 2013 16:48:35 UTC