Re: Are there valid RDF/XML documents that encode invalid RDF?

I can't find any rationale for ignoring the `&#xA;` character reference. The referenced character (U+000A, line feed) is not allowed in an IRI, which would make the document not valid RDF/XML.
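
To illustrate, here is a minimal Python sketch (standard library only) showing that an XML parser hands the `rdf:about` value to the RDF layer with the line feed already expanded, and that this character falls outside what an IRI may contain. The `forbidden_iri_chars` helper is hypothetical, and its character set is an assumption borrowed from the Turtle IRIREF production; it is a rough check, not a full RFC 3987 validator.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The example document from the original question, as bytes.
DOC = b"""<?xml version="1.0" encoding="utf-8" ?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:ns0="b:">
 <rdf:Description rdf:about="a:&#xA;">
   <ns0:b rdf:resource="c:c"/>
 </rdf:Description>
</rdf:RDF>
"""


def forbidden_iri_chars(iri):
    """Return the characters in `iri` that the Turtle IRIREF production
    excludes: U+0000 to U+0020 and < > " { } | ^ ` backslash.
    Hypothetical helper; a rough approximation of IRI validity."""
    return [ch for ch in iri if ord(ch) <= 0x20 or ch in '<>"{}|^`\\']


root = ET.fromstring(DOC)
description = root.find(f"{{{RDF_NS}}}Description")

# The XML parser has already replaced the character reference with U+000A.
subject = description.get(f"{{{RDF_NS}}}about")
bad = forbidden_iri_chars(subject)

print(repr(subject))
print(bad)
```

A parser that wants to reject such input could raise an error whenever the list returned by the helper is non-empty, rather than dropping the character or passing it through.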

Richard



> On 2 Jul 2020, at 20:27, Wouter Beek <wouter@triply.cc> wrote:
> 
> Dear list,
> 
> We encounter RDF/XML documents in the wild that contain `&# HEX HEX`
> escaped characters.  Here is an MWE (notice the subject term):
> 
> ```
> <?xml version="1.0" encoding="utf-8" ?>
> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:ns0="b:">
>  <rdf:Description rdf:about="a:&#xA;">
>    <ns0:b rdf:resource="c:c"/>
>  </rdf:Description>
> </rdf:RDF>
> ```
> 
> Some RDF/XML parsers remove these escape sequences altogether (without
> replacing them with anything), e.g., Rapper, W3C RDF/XML validator.
> 
> Some RDF/XML parsers replace these escape sequences with the
> corresponding characters, thereby introducing syntax errors in RDF
> terms (in the above example: introducing an unescaped newline
> character inside an IRI).  An example of such a parser is
> <https://github.com/rdfjs/rdfxml-streaming-parser.js/issues/39>.
> 
> My question is as follows:
>  1. Is the above example snippet a valid RDF/XML document?
>  2. If so, is it intended that some valid RDF/XML documents encode
> invalid RDF, or is there a standard procedure of handling such
> documents such that they result in valid RDF somehow?
> 
> ---
> Best,
> Wouter.
> 
> Email: wouter@triply.cc
> WWW: https://triply.cc
> Tel: +31647674624
> 

Received on Friday, 3 July 2020 17:05:07 UTC