- From: Andy Seaborne <andy.seaborne@epimorphics.com>
- Date: Mon, 28 Feb 2011 12:00:03 +0000
- To: RDF-WG <public-rdf-wg@w3.org>
== Relationship of languages
An N-Triples document is a syntactically valid Turtle document.
An N-Triples document is a valid N-Quads document.
A Turtle document is not a valid TriG document.
An N-Quads document is not a valid TriG document.
I find that a bit strange.
== Tokens
It would be useful to split the Turtle grammar more clearly into tokens
and grammar rules.
Having a set of tokens that can be reused across all Turtle-related
languages would mean no surprises for application writers
(e.g. over what's allowed in a prefixed name).
It also means implementers can use one (performance-tuned) tokenizer, but
the benefit to app writers is more important.
We could add the tokens for variables, keywords, and the symbols "{" and
"}", so that one set of tokens will cover evolutions of N3, N-Triples,
N-Quads, Turtle, TriG and SPARQL as well as be a possible starting point
for any other languages of the same style (a rules format; a CSV-like
results format, or RDF-Tuples; domain specific formats).
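As a rough illustration of the shared-token idea, here is a minimal sketch in Python. The token names and regular expressions are simplified assumptions made up for this example (they are not the terminals of any published grammar), and `tokenize` is a hypothetical helper. The point is only that one token table can serve both a triples-style line and a SPARQL-style pattern.

```python
import re

# Hypothetical, simplified token table. Real Turtle/SPARQL terminals are
# more detailed (especially prefixed names); this only shows the shape.
TOKEN_SPEC = [
    ("KEYWORD", r"@(?:prefix|base)\b"),
    ("IRIREF",  r"<[^<>\"{}|^`\\\s]*>"),
    ("BNODE",   r"_:[A-Za-z0-9][\w\-]*"),
    ("PNAME",   r"[A-Za-z][\w\-]*:[\w\-]*|:[\w\-]*"),
    ("VAR",     r"[?$]\w+"),
    ("STRING",  r'"(?:[^"\\\n]|\\.)*"'),
    ("LANGTAG", r"@[A-Za-z]+(?:-[A-Za-z0-9]+)*"),
    ("PUNCT",   r"\^\^|[{}.;,\[\]()]"),
    ("WS",      r"\s+"),
]

# One combined pattern; each alternative is a named group.
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (token_name, lexeme) pairs, skipping whitespace."""
    pos = 0
    tokens = []
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        if m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


# The same tokenizer handles an N-Triples-style line and a SPARQL-style pattern.
triple_tokens = tokenize('<s> <p> "o"@en .')
pattern_tokens = tokenize("{ ?s ex:p ?o }")
```

Each language would then be a set of grammar rules over the same token stream, so a question such as what is allowed in a prefixed name gets answered once, in the `PNAME` entry.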
This is neutral to decisions about what the future languages for named
graphs/graph literals/whatever will be. It's just establishing the groundwork.
The details of prefixed names will cause some debate :-)
== Charset
All UTF-8. People do write "UTF-8 N-triples". This is a change to
N-Triples and N-Quads that is backwards compatible.
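A small sketch of why the change is backwards compatible, assuming existing files are pure ASCII with \u escapes for other characters. The example IRIs and the `unescape` helper are made up for illustration, and the helper only handles the four-digit \uXXXX form.

```python
import re

# An ASCII-only N-Triples line, using a \u escape for a non-ASCII character.
ascii_line = b'<http://example.org/s> <http://example.org/p> "caf\\u00E9" .\n'

# ASCII is a subset of UTF-8, so the same bytes decode to the same text.
same_text = ascii_line.decode("ascii") == ascii_line.decode("utf-8")

# A "UTF-8 N-Triples" line carrying the character directly.
utf8_line = '<http://example.org/s> <http://example.org/p> "café" .\n'.encode("utf-8")


def unescape(text):
    """Replace \\uXXXX escapes with the characters they name (sketch only)."""
    return re.sub(r"\\u([0-9A-Fa-f]{4})", lambda m: chr(int(m.group(1), 16)), text)


# After unescaping, both lines carry the same content.
same_content = unescape(ascii_line.decode("utf-8")) == utf8_line.decode("utf-8")
```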
== N-Triples/N-Quads as data
The N-Triples format is designed for testing so
<s> <p> <o> .
could mean the IRIs "s" etc., not resolved against the base, so an N-Triples
file can mean something different from the same bytes read as Turtle.
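The difference can be sketched in Python. The base IRI below is an assumption chosen for the example, and the two helper functions are hypothetical names. The only library call is `urllib.parse.urljoin`, which performs relative reference resolution.

```python
from urllib.parse import urljoin

# Assumed location the file was retrieved from (illustrative only).
BASE = "http://example.org/data/file.nt"


def iri_test_format(token):
    """Test-format reading: take the text between < and > as-is."""
    return token[1:-1]


def iri_turtle(token, base):
    """Turtle reading: resolve the text between < and > against the base."""
    return urljoin(base, token[1:-1])


unresolved = iri_test_format("<s>")    # "s"
resolved = iri_turtle("<s>", BASE)     # "http://example.org/data/s"
```

The same three bytes `<s>` name different things under the two readings, which is the ambiguity a data-format definition would remove.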
A data-format N-Triples / N-Quads would be a subset of Turtle, with the
same IRI resolution rules and the same syntax for IRI tokens. And in UTF-8.
As these formats are used as dump formats, pinning down details would be
a help to data publishers and consumers.
Andy
Received on Monday, 28 February 2011 23:19:30 UTC