RDFa Lite representation of schema.org: character encoding

Hello:

Please explicitly declare the encoding for the RDFa Lite 1.1
representation of schema.org at
http://www.schema.org/docs/schema_org_rdfa.html

It appears to be UTF-8, but the encoding is declared neither in the HTTP
header nor via a <meta charset="UTF-8"> element.

Declaring this explicitly would assist tools, such as RDFLib/pyRdfa,
that parse the RDFa Lite representation of schema.org and arrive at a
different conclusion about the actual encoding than browsers do. Right
now, for example, the em dash (U+2014) in http://schema.org/Product gets
corrupted when you run a simple script like:

import pyRdfa

# Fetch and parse the RDFa Lite document, then print the serialized result
print(pyRdfa.pyRdfa().rdf_from_source(
    'http://www.schema.org/docs/schema_org_rdfa.html',
    outputFormat="json"))

Declaring the encoding in a local copy fixes the output.
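
For illustration, here is a minimal standalone sketch of the kind of
corruption I mean. It does not use pyRdfa; the helper names and the
ISO-8859-1 fallback are my own assumptions about how a parser might
behave when no charset is declared, not a description of pyRdfa's
actual logic.

```python
import re

EM_DASH = "\u2014"

def declared_charset(content_type, html_bytes):
    """Return the charset named in a Content-Type header value or in a
    <meta charset> tag near the start of the document, else None.
    (Hypothetical helper, deliberately simplistic.)"""
    m = re.search(r"charset=([\w-]+)", content_type or "", re.I)
    if m:
        return m.group(1)
    m = re.search(rb"<meta\s+charset=[\"']?([\w-]+)", html_bytes[:1024], re.I)
    return m.group(1).decode("ascii") if m else None

def decode(html_bytes, content_type=None, fallback="iso-8859-1"):
    """Decode using the declared charset, or an assumed fallback
    (ISO-8859-1 here) when nothing is declared."""
    return html_bytes.decode(declared_charset(content_type, html_bytes) or fallback)

# The same UTF-8 bytes, without and with an explicit declaration.
undeclared = "<html><head></head><body>A \u2014 B</body></html>".encode("utf-8")
declared = undeclared.replace(b"<head>", b'<head><meta charset="UTF-8">')

print(repr(decode(undeclared)))  # em dash is lost under the fallback
print(repr(decode(declared)))    # em dash survives
```

With the declaration in place, the three UTF-8 bytes of U+2014 are read
as one character instead of three unrelated ones.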

Thanks,
Dan Scott

Received on Monday, 12 August 2013 19:00:57 UTC