- From: Oliver Ruebenacker <curoli@gmail.com>
- Date: Tue, 22 Nov 2011 09:00:48 -0500
- To: Toby Inkstere <tai@g5n.co.uk>
- Cc: semantic-web@w3.org
Hello,

On Mon, Nov 21, 2011 at 6:40 PM, Toby Inkstere <tai@g5n.co.uk> wrote:
> On Thu, 17 Nov 2011 11:19:41 -0500
> Oliver Ruebenacker <curoli@gmail.com> wrote:
>
>> I have a silly little technical question: when parsing an XML/RDF
>> document, what is the easiest way to find out whether a string
>> representing a URI is a complete absolute URI, a relative URI or an
>> abbreviation?
>
> As Tim said, there is no place in RDF/XML where you need to distinguish
> between URIs and QNames. Everywhere a URI is allowed a QName is
> disallowed, and vice versa.
>
> To distinguish between a relative URI and an absolute one, use the
> following regular expression:
>
>     ^([A-Za-z][A-Za-z0-9.+-]*):
>
> If it matches that regular expression, it's absolute; otherwise it's
> relative. Easy peasy. (That regular expression is derived from Section
> 3.1 of RFC 3986, which defines the syntax for URI schemes.)

Thanks a lot, this is indeed easy. I should have realized that,
whatever the rule for delimiting path segments is, the delimiter is
something that does not appear in a scheme name.

> For performance reasons, you may wish to limit the length of a URI
> scheme that can be matched:
>
>     ^([A-Za-z][A-Za-z0-9.+-]{0,127}):
>
> ... it seems unlikely that any URI scheme more than 128 characters will
> ever be used. This will prevent massive strings of alphanumeric
> characters from slowing down your regular expression.

Although I don't recall any scheme name longer than five characters, I
would be cautious about what fancy ideas people may come up with in the
future, such as scheme names generated by encoding other information.

Take care
Oliver

--
Oliver Ruebenacker, Computational Cell Biologist
Virtual Cell (http://vcell.org)
SBPAX: Turning Bio Knowledge into Math Models (http://www.sbpax.org)
http://www.oliver.curiousworld.org
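
For illustration, the absolute-versus-relative test quoted above might be sketched in Python roughly as follows. The two patterns are the ones from the message; the names `SCHEME_RE`, `BOUNDED_SCHEME_RE`, `is_absolute` and `scheme_of` are hypothetical helpers made up for this sketch, not part of any RDF library. It only checks for a leading scheme and does not validate the rest of the URI.

```python
import re

# Unbounded scheme pattern from the message (derived from RFC 3986, Section 3.1).
SCHEME_RE = re.compile(r"^([A-Za-z][A-Za-z0-9.+-]*):")

# Length-limited variant from the message: at most 128 scheme characters.
BOUNDED_SCHEME_RE = re.compile(r"^([A-Za-z][A-Za-z0-9.+-]{0,127}):")


def is_absolute(ref, bounded=False):
    """Return True if ref starts with a URI scheme followed by a colon."""
    pattern = BOUNDED_SCHEME_RE if bounded else SCHEME_RE
    return pattern.match(ref) is not None


def scheme_of(ref):
    """Return the scheme name of ref, or None if ref is relative."""
    match = SCHEME_RE.match(ref)
    return match.group(1) if match else None


if __name__ == "__main__":
    for ref in ["http://vcell.org", "urn:isbn:0451450523", "../models#cell", "#frag"]:
        kind = "absolute" if is_absolute(ref) else "relative"
        print(ref, "is", kind)
```

Whether to use the bounded variant is the trade-off discussed above: it caps the work done on long runs of scheme characters, but it would classify a reference with a scheme name longer than 128 characters as relative.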
Received on Tuesday, 22 November 2011 14:01:25 UTC