
Re: make-absolute-uris may match strings which are not of type anyURI ...

From: Henry S. Thompson <ht@inf.ed.ac.uk>
Date: Mon, 21 Sep 2009 14:24:40 +0100
To: <Toman_Vojtech@emc.com>
Cc: <xproc-dev@w3.org>, Richard Tobin <richard@inf.ed.ac.uk>
Message-ID: <f5bk4zsqvl3.fsf@hildegard.inf.ed.ac.uk>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

[anon] writes:

>> So the input
>> 	<c:file name="ab er.xml"/>
>> becomes
>> 	<c:file name="file:/J:/test/ab%20er.xml"/>
>
> Now I wonder if this is actually correct. After reading the relevant
> parts of the XML Base and XML Schema (anyURI) specifications, my current
> understanding is that on the XML *source* level, the values are not
> escaped. So in your XML source, you can (must?) use "raw" values such as
> "ab er.xml" in your @xml:base attributes (or in elements/attributes that
> are of type xs:anyURI).

Yes.

> These values get escaped internally when the processor does some URI
> manipulation with them.

No.  The values should be escaped at the last possible moment before
dereferencing.  In particular, absolutisation should _not_ do escaping.

> So the p:make-absolute-uris step should therefore do the escaping itself
> ("ab er.xml" --> "ab%20er.xml") before resolving the URI against the
> base URI. Then it should *unescape* the result, so you don't get:
>
> <c:file name="file:/J:/test/ab%20er.xml"/>
>
> but:
>
> <c:file name="file:/J:/test/ab er.xml"/>

That's the right result, but why escape then unescape?
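
To make the ordering concrete, here is a minimal Python sketch of
"absolutise first, escape only when dereferencing". It is an
illustration only, not taken from the XProc specification or from any
implementation: the function names absolutise and
escape_for_dereference are hypothetical, and the resolver handles only
a plain relative file name (a real one would follow RFC 3986,
section 5).

```python
from urllib.parse import quote


def absolutise(base: str, ref: str) -> str:
    """Resolve a plain relative name against a base URI, without escaping.

    Simplified assumption: ref has no scheme, no leading "/" and no ".."
    segments, so resolution is just "base directory + ref".
    """
    directory = base[: base.rfind("/") + 1]  # keep everything up to the last "/"
    return directory + ref


def escape_for_dereference(uri: str) -> str:
    """Percent-encode disallowed characters just before the URI is fetched."""
    # Assumption: URI delimiters and any existing "%" escapes are left as-is.
    return quote(uri, safe="/:?#[]@!$&'()*+,;=%")


absolute = absolutise("file:/J:/test/", "ab er.xml")
print(absolute)                          # file:/J:/test/ab er.xml
print(escape_for_dereference(absolute))  # file:/J:/test/ab%20er.xml
```

The value stored back into the document is the unescaped one; the
escaped form exists only at the point where the resource is actually
retrieved.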

> I am not really sure about this; perhaps somebody else can shed more
> light on this?

I've copied Richard Tobin explicitly, as he is the expert on this
matter.

ht
- -- 
       Henry S. Thompson, School of Informatics, University of Edinburgh
                         Half-time member of W3C Team
      10 Crichton Street, Edinburgh EH8 9AB, SCOTLAND -- (44) 131 650-4440
                Fax: (44) 131 651-1426, e-mail: ht@inf.ed.ac.uk
                       URL: http://www.ltg.ed.ac.uk/~ht/
[mail really from me _always_ has this .sig -- mail without it is forged spam]
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)

iD8DBQFKt36YkjnJixAXWBoRApnzAJsHsdTkfhqRl1LkC+tysi1VdRqNSwCfaqqs
T7Cqpzvzq0ppjfP25fqkjPg=
=T3HP
-----END PGP SIGNATURE-----
Received on Monday, 21 September 2009 13:25:19 GMT
