- From: Eric Prud'hommeaux <eric@w3.org>
- Date: Sun, 7 Apr 2013 17:55:29 -0400
- To: 'W3C RDF WG' <public-rdf-wg@w3.org>
I've had these niggling doubts for a while, and finally succumbed to
that morbid desire to explore some problems that I'd rather not know
about. We've all known for a while that we can create graphs with APIs
(now even serializable in Turtle) which can't be written in RDF/XML.
Here's a list of issues I think we need to clarify:
1 Namespaces are OK syntactically [nssyn], though our namespace names
are IRIs, which of course fall outside the Namespaces definition of
namespace names as URIs [nsURI].
[nssyn] http://www.w3.org/TR/REC-xml-names/#NT-Attribute
[nsURI] http://www.w3.org/TR/REC-xml-names/#dt-namespace
------------------------------------------------------------
2 QNames forbid a raft of [first] and [nth] characters which are
permissible in [IRIs].
first: [A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] |
[#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] |
[#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] |
[#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] |
[#x10000-#xEFFFF]
nth: first | "-" | "." | [0-9] | #xB7 | [#x0300-#x036F] |
[#x203F-#x2040]
http://www.w3.org/TR/REC-xml-names/#NT-NCName
IRIs: ipchar = [A-Z] | "_" | [a-z] | [0-9] | "-" | "." | "~" |
"%" HEX HEX | "!" | "$" | "&" | "'" | "(" | ")" |
"*" | "+" | "," | ";" | "=" | ":" | "@" |
[#xA0-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFEF] |
[#x10000-#x1FFFD] | [#x20000-#x2FFFD] |
[#x30000-#x3FFFD] | [#x40000-#x4FFFD] |
[#x50000-#x5FFFD] | [#x60000-#x6FFFD] |
[#x70000-#x7FFFD] | [#x80000-#x8FFFD] |
[#x90000-#x9FFFD] | [#xA0000-#xAFFFD] |
[#xB0000-#xBFFFD] | [#xC0000-#xCFFFD] |
[#xD0000-#xDFFFD] | [#xE1000-#xEFFFD]
http://tools.ietf.org/html/rfc3987#section-2.2
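
To make the mismatch concrete, here is a minimal Python sketch that tries to
split an IRI into a namespace part plus an NCName local part, using the
[first] and [nth] classes quoted above. The names FIRST, NTH and
split_for_qname are made up for illustration and do not come from any RDF
library; an IRI for which the function returns None has no such split.

```python
import re

# Character classes transcribed from the "first" and "nth" lists above.
FIRST = (
    r"A-Z_a-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF\u0370-\u037D"
    r"\u037F-\u1FFF\u200C-\u200D\u2070-\u218F\u2C00-\u2FEF"
    r"\u3001-\uD7FF\uF900-\uFDCF\uFDF0-\uFFFD\U00010000-\U000EFFFF"
)
NTH = FIRST + r"\-.0-9\u00B7\u0300-\u036F\u203F-\u2040"

FIRST_RE = re.compile(f"[{FIRST}]")
NTH_RE = re.compile(f"[{NTH}]")


def split_for_qname(iri):
    """Return (namespace, local) with local an NCName, or None if impossible."""
    # Walk back over the trailing run of characters allowed inside an NCName.
    i = len(iri)
    while i > 0 and NTH_RE.match(iri[i - 1]):
        i -= 1
    # Skip forward to the first character allowed to start an NCName.
    while i < len(iri) and not FIRST_RE.match(iri[i]):
        i += 1
    if i == len(iri):
        return None  # no non-empty NCName suffix exists
    return iri[:i], iri[i:]


if __name__ == "__main__":
    for iri in (
        "http://example.org/ns#name",
        "http://example.org/item/123",
        "http://example.org/a(b)",
    ):
        print(iri, "->", split_for_qname(iri))
```

The second and third IRIs are legal per the ipchar list but end in
characters that cannot start, or cannot appear in, an NCName.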
------------------------------------------------------------
3 XML content excludes [#x00-#x08] [#x0B-#x0C] [#x0E-#x1F], all of
which are permitted in "Unicode strings" and thus RDF literals
[Rlit]. This applies regardless of CDATA enclosure or entity
substitution.
[Rlit] https://dvcs.w3.org/hg/rdf/raw-file/default/rdf-concepts/index.html#dfn-lexical-form
[2] Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] |
[#x10000-#x10FFFF]
http://www.w3.org/TR/2000/WD-xml-2e-20000814#NT-Char
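
A quick way to check a literal against the Char production, sketched in
Python. The helper names xml_unsafe_chars and parses are made up for
illustration. The second helper uses the standard library's ElementTree
parser (an XML 1.0 parser) to show that neither a character reference nor a
CDATA section gets a forbidden control character through.

```python
import re
import xml.etree.ElementTree as ET

# Complement of the XML 1.0 Char production quoted above.
NOT_XML_CHAR = re.compile(
    r"[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def xml_unsafe_chars(lexical_form):
    """List the code points in a lexical form that XML 1.0 cannot carry."""
    return [f"U+{ord(c):04X}" for c in NOT_XML_CHAR.findall(lexical_form)]


def parses(doc):
    """Return True if the standard-library parser accepts the document."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print(xml_unsafe_chars("a\x01b\x0c"))        # two forbidden characters
    print(parses("<a>&#x9;</a>"))                # tab is a legal Char
    print(parses("<a>&#x1;</a>"))                # character reference
    print(parses("<a><![CDATA[\x01]]></a>"))     # raw character in CDATA
```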
------------------------------------------------------------
4 XML Schema also prohibits the above control characters from
appearing in something typed as xsd:string [string].
[string] http://www.w3.org/TR/xmlschema-2/#dt-string
------------------------------------------------------------
For 4, I propose notes in RDF Concepts and the serialization syntaxes
(e.g. Turtle). For the others, I wonder if we're forced into some
miserable escaping mechanism applied on top of XML.
--
-ericP
Received on Sunday, 7 April 2013 21:56:00 UTC