Re: make-absolute-uris may match strings which are not of type anyURI ...

From: Manfred Staudinger <manfred.staudinger@gmail.com>
Date: Fri, 25 Sep 2009 16:03:05 +0200
Message-ID: <a946be3a0909250703q1995c6faj67c08b8a584747ae@mail.gmail.com>
To: Toman_Vojtech@emc.com
Cc: xproc-dev@w3.org, richard@inf.ed.ac.uk

On 21/09/2009, Henry S. Thompson <ht@inf.ed.ac.uk> wrote:
>>> So the input
>>> 	<c:file name="ab er.xml"/>
>>> becomes
>>> 	<c:file name="file:/J:/test/ab%20er.xml"/>
>>
>> Now I wonder if this is actually correct. After reading the relevant
>> parts of the XML Base and XML Schema (anyURI) specifications, my current
>> understanding is that on the XML *source* level, the values are not
>> escaped. So in your XML source, you can (must?) use "raw" values such as
>> "ab er.xml" in your @xml:base attributes (or in elements/attributes that
>> are of type xs:anyURI).
>
> Yes.
>
>> These values get escaped internally when the processor does some URI
>> manipulation with them.
>
> No.  The values should be escaped at the last possible moment before
> dereferencing.  In particular, absolutisation should _not_ do escaping.
>
>> So the p:make-absolute-uris step should therefore do the escaping itself
>> ("ab er.xml" --> "ab%20er.xml") before resolving the URI against the
>> base URI. Then it should *unescape* the result, so you don't get:
>>
>> <c:file name="file:/J:/test/ab%20er.xml"/>
>>
>> but:
>>
>> <c:file name="file:/J:/test/ab er.xml"/>
>
> That's the right result, but why escape then unescape?

The XPath 2.0 function resolve-uri('ab er.xml', 'file:/J:/test/')
returns "file:/J:/test/ab%20er.xml". So either that function or your
assertion is in error.
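
To make the three behaviours under discussion concrete, here is a minimal
Python sketch. It is not XPath and is not taken from any XProc
implementation. The helpers resolve_simple and escape_for_dereference are
hypothetical names introduced only for illustration, and resolve_simple is
deliberately not a full RFC 3986 resolver: it assumes a base URI ending in
"/" and a plain relative file name.

```python
from urllib.parse import quote, unquote


def resolve_simple(base: str, relative: str) -> str:
    """Hypothetical helper: append a plain relative name to a base URI.

    Assumption: the base ends in '/' and the relative reference has no
    dot-segments, query or fragment. This is NOT a full RFC 3986 resolver.
    """
    if not base.endswith("/"):
        raise ValueError("this sketch only handles a base URI ending in '/'")
    return base + relative


def escape_for_dereference(uri: str) -> str:
    """Hypothetical helper: percent-encode just before dereferencing.

    ':' and '/' are kept as delimiters; a space becomes %20.
    """
    return quote(uri, safe=":/")


base = "file:/J:/test/"
name = "ab er.xml"

# 1. Absolutise without escaping (escape only "at the last possible moment").
late = resolve_simple(base, name)

# 2. Escape, resolve, then unescape the result.
round_trip = unquote(resolve_simple(base, quote(name)))

# 3. Escape, resolve, and keep the escaped form (the value reported above
#    for resolve-uri).
eager = resolve_simple(base, quote(name))

print(late)
print(round_trip)
print(eager)
print(escape_for_dereference(late))
```

For this input, variants 1 and 2 yield the same unescaped string, and
escaping the result of variant 1 at dereference time yields the same string
as variant 3. The disagreement in this thread is therefore about which of
the two forms p:make-absolute-uris should write into the document, not about
which resource is ultimately addressed.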

>> I am not really sure about this, perhaps somebody else can shed more
>> light into this?
>
> I've copied Richard Tobin explicitly, as he is the expert on this matter.

Citing ("copied explicitly") the expert is more useful if you clearly
mark it (e.g. <quote>...</quote>), cite your source and maybe give
some context.

On 21/09/2009, Toman_Vojtech@emc.com <Toman_Vojtech@emc.com> wrote:
>> > <c:file name="file:/J:/test/ab er.xml"/>
>>
>> That's the right result, but why escape then unescape?
>
> I think you are right: internally no escaping + unescaping should be
> necessary.

Well ... in addition to what I said above about the resolve-uri
function, you should also have a look at the base-uri function.

Regards,
Manfred
Received on Friday, 25 September 2009 14:03:46 GMT
