[Bug 2249] R-257: Health warning needed about percent-escaping URIs

http://www.w3.org/Bugs/Public/show_bug.cgi?id=2249





------- Comment #5 from cmsmcq@w3.org  2007-09-24 17:14 -------
Comment #4 suggests deleting the words 'finite-length' from the
description of anyURI's lexical forms and from similar passages
elsewhere.

That seems to me at best a risky idea.

It's true that no implementation we know how to build will be able to
distinguish the set of finite-length sequences from the set of
infinite sequences.  But I think the stipulation that the sequence of
characters be finite has another utility: it helps ensure that the
question "is this string in the lexical space of this datatype?" can
be answered in finite time.

For arbitrary strings of finite length, and arbitrary grammars of
finite size, there are algorithms which can decide whether the string
is recognized by the grammar.  I do not believe the same is true for
infinite strings.  (At least, not if we accept the view that to be an
algorithm, a computational method must always terminate after a finite
number of steps.)  So I do not think it would be wise to define the
lexical spaces of any of our simple types as including infinite-length
strings.

It's not solely a practical question; it's also a theoretical question.

(It's true that some students of automata theory have worked with
automatata with infinite numbers of states, and that two-level [van
Wijngaarden] grammars correspond to context-free grammars with
infinite numbers of productions.  But in both of these cases, I think,
the existence of an algorithm for recognizing strings depends
crucially upon the strings' being finite in length.  In recognizing a
finite-length string, even an infinite state machine or grammar can
successfully be represented by a finite machine or grammar which
produces the same result for this particular string.  I am not a
computer scientist, and I have not taken the time this morning to
re-read my textbooks on this subject, so I could be wrong.  So I will
offer this weaker argument: almost all the textbooks I have seen
describe algorithms for recognizing finite-length, not infinite
strings.  In view of the finiteness of any actual machine available to
us, it seems unnecessary to require that implementors replace the
textbook algorithms with others capable of handling infinite inputs.)

Received on Monday, 24 September 2007 17:14:16 UTC