- From: Damian Steer <pldms@mac.com>
- Date: Thu, 26 Apr 2012 16:33:07 +0100
- To: www-rdf-validator@w3.org
- Cc: pedantic-web@googlegroups.com
Forwarded from the pedantic-web list.
Initially this was (erroneously) reported as an issue with ARP and UTF-8 BOMs, but there's no BOM involved and ARP has never had an issue with BOMs.
It seems that validating (all?) RDF files under www.w3.org results in errors of the form:
"An attempt to load the RDF from URI 'http://www.w3.org/ns/formats/data/RDF_XML' failed. (Undecodable data when reading URI at byte 0 using encoding 'UTF-8'. Please check encoding and encoding declaration of your document.)"
But the byte value may vary, e.g. 24574 for http://www.w3.org/ns/ma-ont.rdf.
I understand that the same file (RDF_XML) validated without issue when copied to a remote server.
The code in question is presumably:
try { // read whole file as characters
    int c;
    while ((c = isr.read()) != -1) {
        sb.append((char)c);
        bytenum++;
    }
}
catch (IOException e) {
    throw new getRDFException("Undecodable data when reading URI at byte "+bytenum+" using encoding '"+finalCharset+"'."+" Please check encoding and encoding declaration of your document.");
}
<http://dev.w3.org/cvsweb/2006/RDFValidator/WEB-INF/src/org/w3c/rdfvalidator/ARPServlet.java?rev=1.6>
So the issue may not be encoding at all: the same message is reported for any IOException, whatever its cause.
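For illustration, here is a minimal sketch of how the two failure modes could be told apart. It is not taken from ARPServlet.java: the class name ReadDiagnostics, the method readAll and the message wording are all hypothetical. It assumes the reader is built with a decoder configured to report malformed input. In Java, CharacterCodingException is a subclass of IOException, so catching it first separates genuine decoding errors from other I/O failures such as a dropped connection. If isr is an InputStreamReader, the counter in the loop above tracks characters rather than bytes, so the sketch names it charnum and treats it as approximate.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;

public class ReadDiagnostics {

    /**
     * Hypothetical helper: reads the whole stream as characters and
     * reports decoding failures separately from other I/O failures.
     */
    static String readAll(InputStream in, Charset charset) throws IOException {
        // Ask the decoder to raise an exception on bad input instead of
        // silently substituting a replacement character.
        CharsetDecoder decoder = charset.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);

        StringBuilder sb = new StringBuilder();
        long charnum = 0; // counts characters read so far, not bytes

        try (Reader reader = new InputStreamReader(in, decoder)) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
                charnum++;
            }
        } catch (CharacterCodingException e) {
            // Raised only when the bytes cannot be decoded in the given charset.
            throw new IOException("Undecodable data near character " + charnum
                    + " using encoding '" + charset.name() + "'", e);
        } catch (IOException e) {
            // Any other failure: network error, truncated response, and so on.
            throw new IOException("I/O failure near character " + charnum
                    + " (not necessarily an encoding problem): " + e, e);
        }
        return sb.toString();
    }
}
```

With something along these lines, a stream containing an invalid UTF-8 byte would produce the "Undecodable data" message, while a stream that fails for any other reason would produce the "I/O failure" message with the underlying cause attached.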
Thanks for your help,
Damian Steer
Begin forwarded message:
> From: Damian Steer <pldms@mac.com>
> Subject: Re: [pedantic-web] Encoding issues when dereferencing "formats:" URIs
> Date: 25 April 2012 16:07:07 GMT+01:00
> To: pedantic-web@googlegroups.com
> Reply-To: pedantic-web@googlegroups.com
>
> On 25/04/12 15:49, Andreas Radinger wrote:
>> Hi,
>>
>> I don't think any of these files (neither .rdf nor .ttl) have a BOM at
>> the beginning of the file.
>> http://people.w3.org/rishida/utils/bomtester/index.php?filename=http%3A%2F%2Fwww.w3.org%2Fns%2Fformats%2Fdata%2FRDF_XML.rdf
>>
>> The W3C RDF Validator has also no bug in dealing with RDF/XML files that
>> have a BOM.
>
> +1.
>
> I tried another file under ns/:
>
> <http://www.w3.org/ns/ma-ont.rdf>
>
> => "Undecodable data when reading URI at byte 24574 using encoding 'UTF-8'."
>
> And then the rdf namespace:
>
> => "... byte 0 ..."
>
> But <http://people.w3.org/simon/foaf.rdf> was fine.
>
> Hypothesis: validating rdf under the www.w3.org domain is broken.
>
> It may be unrelated to encoding. The error is triggered by any
> IOException reading characters from an input stream reader.
>
> Damian
Received on Thursday, 26 April 2012 15:33:48 UTC