# Canonical, Lexical, and Value Spaces

From: Matthew Fuchs <matt@westbridgetech.com>
Date: Wed, 10 Jul 2002 17:09:30 -0600
Message-Id: <5.1.0.14.1.20020710170826.02833c48@localhost>
To: W3C XML Schema Comments list <www-xml-schema-comments@w3.org>
```[This mail bounced to me, because it came from an unknown
email address.  And then it got lost.  I've just found it
and forward it to the list with apologies to Matthew and
the WG for its four-week delay. -MSM]

As I mentioned during the call, the discussion on 110b leads me to think
that we d be better off representing the relationship between the lexical
space and the value space as an adjunction.  The manager s summary of what
follows is that this would link lexical forms to the value space through
the canonical space.  The value here is that we wouldn t have needed to
claim that 2.0 and 2.00000 are the same in the value space as only 2.0 is a
canonical form, only 2.0 is in the value space and (from this perspective)
2.00000 is just another lexical form for 2.0 .  In other words, the current
view seems to be that the following diagram commutes:

Lexicals                     Values

f

2.000 --------------à 2.000

|                               |

| a?                          | b

\|/              g             \|/

2.0     <--------------à   2.0

The bottom arrow is claimed to be a bijection, and for that to work, b must
be the identity arrow (i.e., 2.0 = 2.000).  But given that g is a
bijection, we already have a unique surjection from the lexical
representations to the canonical ones the arrow a? is g (b(f( 2.000 )))
(where g is the inverse of g [which is the restriction of f to the
canonicals]).  But given that we already have a unique a?, why not dump f
and b, giving the following diagram:

Lexicals                     Values

2.000

|

| a?

\|/              g

2.0     <--------------à   2.0

In which case, it is obvious that no value space can support more
information than is provided by the canonical forms.  The issue of whether
2.000 = 2.0 is handled on the lexical side, and the issue of whether
2.0=2.000 is rendered irrelevant.  While this may only apply to the numeric
types, I think it may still be worth stating, as it better separates the
value space from issues arriving from the lexical forms.  It s also useful
in that the lexical forms could appear (as we now well know) to convey more
information than the canonical form can, and this would unambiguously state
otherwise.

I d also note that this makes the relationship between the lexical space
and the value space an adjunction:

2.000 -------u------à 2.0 =UF( 2.000 )    2.0=F( 2.000 )

|                            |                              |

|                            |                              |

>                         U(>)                          !>

|                            |                              |

\|/                          \|/                            \|/

01.654000 u----à 1.654                      1.654

While this may seem obscure at the moment, I think it becomes essential
both for building operators on these types, and provides a basis for
embedding all the numeric types into a single larger type, because the
different datatypes have different adjunctions into the larger space.

Matthew
```
Received on Wednesday, 10 July 2002 19:12:10 UTC

This archive was generated by hypermail 2.4.0 : Friday, 17 January 2020 23:08:59 UTC