Re: XML problems with percent encoding

Hello,

 > You could look at adding an _
 > (http://ko.dbpedia.org/property/_%EA%B4%91%EC%9E%90) for the problem
 > cases, or add a fragment
 > (http://ko.dbpedia.org/property/%EA%B4%91%EC%9E%90#it)? The latter has
 > its own issues, so I'd try the former.
 >
 > Damian
 >

That does not seem to work; see the example below (third-to-last line 
of the XML), where I added the property to the beer ontology. Is there a 
correct way of using such a URI now, or not? We do not have an immediate 
problem that needs fixing ourselves, but if we include such properties in 
DBpedia, we are worried that DBpedia data will break a lot of XML parsers 
and serializers.

<?xml version="1.0"?>
<rdf:RDF
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
     xmlns:owl="http://www.w3.org/2002/07/owl#"
     xmlns:dbp="http://ko.dbpedia.org/property/"
     xmlns="http://www.purl.org/net/ontology/beer#"
 >
   <owl:Ontology rdf:about="">
     <owl:versionInfo>beer_v0.4.owl, based on 
http://purl.org/net/ontology/beer_v0.3.owl</owl:versionInfo>
   </owl:Ontology>

  <owl:Class rdf:ID="Lager">
     <rdfs:label xml:lang="de">Helles</rdfs:label>
  </owl:Class>

   <Lager rdf:ID="Krieger">
     <hasAlcoholicContent>4.5</hasAlcoholicContent>
     <dbp:_%EA%B4%91%EC%9E%90>4.5</dbp:_%EA%B4%91%EC%9E%90>
   </Lager>
</rdf:RDF>
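
To make the failure easy to reproduce, here is a minimal sketch in Python 
(chosen only for illustration). It checks XML well-formedness with the 
standard-library parser and says nothing about RDF semantics. The helper 
name is_well_formed is hypothetical, and the second string applies the 
_percent_ workaround from the quoted message below.

```python
import xml.etree.ElementTree as ET

NS = "http://ko.dbpedia.org/property/"


def is_well_formed(doc: str) -> bool:
    """Return True if the standard-library XML parser accepts doc."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


# Element name taken from the example above; '%' is not a legal XML name character.
escaped = f'<dbp:_%EA%B4%91%EC%9E%90 xmlns:dbp="{NS}">4.5</dbp:_%EA%B4%91%EC%9E%90>'

# The same local name with every '%' replaced by '_percent_'.
name = "_%EA%B4%91%EC%9E%90".replace("%", "_percent_")
workaround = f'<dbp:{name} xmlns:dbp="{NS}">4.5</dbp:{name}>'

print(is_well_formed(escaped))     # False
print(is_well_formed(workaround))  # True
```

The workaround parses, but it names a different URI from the 
percent-encoded one, which is why it is not a satisfactory solution.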

Regards, Sebastian


Damian Steer wrote:
> Sebastian Hellmann wrote:
>> Dear all,
>> we (especially Matthias Weidl @ KAIST)  are currently working on
>> producing a Korean DBpedia.
>> We have again encountered a problem that we cannot really solve but
>> can only work around. The Korean property URIs consist entirely of
>> special characters. If we try to URL-encode them, serialisation in
>> RDF/XML is bound to fail.
>>
>> For a property like:
>> http://dbpedia.org/property/l%E3%A4ngengrad
>> Jena produces the following:
>> <ns0:ngengrad xmlns:ns0="http://dbpedia.org/property/l%E3%A4">
>> because % is not a valid character in an XML tag.
>> But if the property only contains special characters, it can not work
>> any more:
>> http://ko.dbpedia.org/property/%EA%B4%91%EC%9E%90
>>
>> In DBpedia we created a work around for this, replacing % with _percent_
>> but it is clearly not a satisfactory solution.
>>
>> How shall we resolve this matter?
>> Is XML conformity still necessary or is there a motion to only use
>> turtle in the future?
> 
> RDF/XML remains the only recommended rdf serialisation, but it can't
> serialise every rdf graph.
> 
> Not a happy situation.
> 
> You could look at adding an _
> (http://ko.dbpedia.org/property/_%EA%B4%91%EC%9E%90) for the problem
> cases, or add a fragment
> (http://ko.dbpedia.org/property/%EA%B4%91%EC%9E%90#it)? The latter has
> its own issues, so I'd try the former.
> 
> Damian
> 
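
The namespace/local-name split described in the quoted message can be 
sketched roughly as follows. This is not Jena's actual algorithm, only an 
assumption about what any RDF/XML writer has to do: find a suffix of the 
property URI that is a legal XML local name, without cutting through a 
percent escape. The function name split_uri is hypothetical, and the two 
regular expressions only approximate the XML NCName rules.

```python
import re

# Approximation of XML name characters (assumption: \w is close enough to NCName).
_NAME_CHAR = re.compile(r"[\w.\-]")
# A local name must start with a letter or underscore, never a digit.
_NAME_START = re.compile(r"[^\W\d]")


def split_uri(uri: str) -> tuple[str, str]:
    """Split uri into (namespace, local_name); local_name may be empty."""
    i = len(uri)
    # Walk backwards over characters that could belong to an XML name.
    while i > 0 and _NAME_CHAR.fullmatch(uri[i - 1]):
        i -= 1
    # Do not cut a percent escape in half: skip its two hex digits.
    if i > 0 and uri[i - 1] == "%":
        i = min(i + 2, len(uri))
    # Move forward until the candidate starts with a legal first character.
    while i < len(uri) and not _NAME_START.fullmatch(uri[i]):
        i += 1
    return uri[:i], uri[i:]


print(split_uri("http://dbpedia.org/property/l%E3%A4ngengrad"))
# ('http://dbpedia.org/property/l%E3%A4', 'ngengrad')
print(split_uri("http://ko.dbpedia.org/property/_%EA%B4%91%EC%9E%90"))
# ('http://ko.dbpedia.org/property/_%EA%B4%91%EC%9E%90', '')
print(split_uri("http://ko.dbpedia.org/property/%EA%B4%91%EC%9E%90#it"))
# ('http://ko.dbpedia.org/property/%EA%B4%91%EC%9E%90#', 'it')
```

Under these assumptions the leading underscore does not help, because the 
URI still ends in a percent escape and so has no usable local-name suffix. 
The fragment variant does yield a local name ("it").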






-- 
Dipl. Inf. Sebastian Hellmann
Department of Computer Science, University of Leipzig
Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
Research Group: http://aksw.org

Received on Tuesday, 17 November 2009 10:29:32 UTC