Re: XML problems with percent encoding

Sebastian Hellmann wrote:
> Dear all,
> we (especially Matthias Weidl @ KAIST) are currently working on 
> producing a Korean DBpedia.
> We have again run into a problem that we cannot really solve and 
> can only work around. The property URIs in Korean consist entirely 
> of special characters. If we try to URL-encode them, 
> serialisation in RDF/XML is bound to fail.
>
> For a property like:
> http://dbpedia.org/property/l%E3%A4ngengrad
> Jena produces the following:
> <ns0:ngengrad xmlns:ns0="http://dbpedia.org/property/l%E3%A4">
> because % is not a valid character in an XML element name.
> But if the property name contains only special characters, this 
> cannot work any more:
> http://ko.dbpedia.org/property/%EA%B4%91%EC%9E%90
>
> In DBpedia we created a workaround for this, replacing % with _percent_,
> but it is clearly not a satisfactory solution.
>
> How shall we resolve this matter?
> Is XML conformity still necessary, or is there a move towards using 
> only Turtle in the future?
>
>

Sorry I am late to this thread.
Why are you percent-encoding the special characters? Why not just leave 
them in Korean?
Semantic Web standards are based on IRIs, which allow all of these characters.
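
As a rough illustration of the IRI approach, here is a minimal Python sketch 
(not Jena, and not the DBpedia extraction code). It decodes the 
percent-escapes in the Korean property URI from the quoted message and uses 
the result as an XML element name. The helper name to_iri and the split at 
the last "/" are assumptions made for this example only.

```python
from urllib.parse import unquote
import xml.etree.ElementTree as ET


def to_iri(uri: str) -> str:
    # Decode UTF-8 percent-escapes so the Korean characters appear literally.
    # Caveat: unquote() also decodes escaped reserved characters such as %2F,
    # so this is only a sketch for simple property names like the one below.
    return unquote(uri, encoding="utf-8", errors="strict")


uri = "http://ko.dbpedia.org/property/%EA%B4%91%EC%9E%90"
iri = to_iri(uri)

# Split at the last "/" into namespace and local name (assumed convention).
namespace, local_name = iri.rsplit("/", 1)
namespace += "/"

# ElementTree takes qualified names in "{namespace}local" form.
element = ET.Element("{%s}%s" % (namespace, local_name))
xml_text = ET.tostring(element, encoding="unicode")

# Round trip: the serialised element parses back to the same qualified name.
parsed = ET.fromstring(xml_text)
print(iri)
print(xml_text)
print(parsed.tag == element.tag)
```

With the characters left in Korean, the local name contains no "%", so it 
can serve as an XML element name and no _percent_ substitution is needed.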

Jeremy



Received on Wednesday, 18 November 2009 15:14:56 UTC