- From: Charles McCathieNevile <chaals@opera.com>
- Date: Wed, 04 Mar 2009 09:42:52 +0100
- To: "Ian Hickson" <ian@hixie.ch>
- Cc: "public-geolocation@w3.org" <public-geolocation@w3.org>
On Tue, 03 Mar 2009 23:23:47 +0100, Ian Hickson <ian@hixie.ch> wrote:
> On Tue, 3 Mar 2009, Richard Barnes wrote:
>>
>> It's not really the number of fields that's important, right? If you
>> don't care about the semantics of the fields, then you can just use one
>> field where everything's smashed together.
...
>> you may as well just use a single field.
>
> That might not be a bad idea, actually. What's the use case for having
> the information in multiple fields rather than just a multiline field?
...
> Are there use cases that a one-field answer wouldn't solve?
Being able to take two encoded addresses and determine if they are the
same place (or in the same country). I don't know how important that is -
depends on whether you will have real-world data with civic address as the
only useful location, but I would be surprised if that didn't occur.
The semantics also let you determine things that are important to the use
case. A room in a big building like the Pentagon or a shop at Chadstone
shopping centre is related to another place in the same building in a way
that two rooms at Microsoft in Redmond, or two shops in Malvern Rd, aren't.
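
To make the comparison use case concrete, here is a rough sketch of what split-out fields allow. The field names (country, locality, street, building, unit) and the sample values are my own assumptions for illustration; they are not taken from any draft of the spec.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CivicAddress:
    # Hypothetical field names, chosen only for this illustration.
    country: str
    locality: str
    street: Optional[str] = None
    building: Optional[str] = None
    unit: Optional[str] = None


def _norm(value):
    # Ignore case and surrounding whitespace when comparing field values.
    return value.strip().casefold() if value else None


def same_country(a, b):
    return _norm(a.country) == _norm(b.country)


def same_building(a, b):
    # Only meaningful when both addresses actually name a building.
    if not (a.building and b.building):
        return False
    return (same_country(a, b)
            and _norm(a.locality) == _norm(b.locality)
            and _norm(a.building) == _norm(b.building))


# Illustrative values only.
shop_a = CivicAddress("AU", "Melbourne",
                      building="Chadstone Shopping Centre", unit="Shop 12")
shop_b = CivicAddress("au", "melbourne",
                      building="chadstone shopping centre", unit="Shop 340")
street_a = CivicAddress("AU", "Melbourne", street="Malvern Rd", unit="10")
street_b = CivicAddress("AU", "Melbourne", street="Malvern Rd", unit="900")

print(same_building(shop_a, shop_b))      # two shops in one centre
print(same_building(street_a, street_b))  # two shops on one road
```

With a single multiline field, neither question can be answered without first parsing the text back into something like these fields.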
The diversity of addressing conventions (even for *the same studio
apartment*) means that collapsing the semantics very quickly reduces the
ability to do intelligent matching of places for any region where you don't
have a huge amount of knowledge and the ability to force data collection to
fit carefully defined patterns. Where people are asked detailed questions,
they tend to give detailed answers, but where the question is "what is
your address?", you will get much more variability in the data from any
group of people. Normalising this latter dataset to provide a useful
"local" application suddenly incurs a substantial requirement for
processing it.
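
As a small illustration of that processing cost, consider three plausible free-text answers for one apartment. The address and the normalisation rule below are invented for the example; a real dataset would be messier.

```python
import re


def naive_normalise(text):
    # Lower-case, replace punctuation (except "/") with spaces,
    # then collapse runs of whitespace.
    text = re.sub(r"[^\w\s/]", " ", text.casefold())
    return " ".join(text.split())


# Three made-up renderings of the same apartment in a single field.
variants = [
    "Flat 3, 12 Example Street",
    "3/12 Example St.",
    "12 Example St, Apt 3",
]

distinct = {naive_normalise(v) for v in variants}
print(len(distinct))  # number of strings still treated as different places
```

Cheap string clean-up leaves all three looking like different places; matching them needs knowledge of the local conventions (unit/number ordering, "St" versus "Street", "Flat" versus "Apt"), which is exactly the knowledge a small developer may not have for every region.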
I doubt that this is of concern to Google (who collect a lot of knowledge
and can probably afford the processing as a negligible marginal cost) but
it may be of concern to a small business which wants to produce
applications using proximity of civic addresses as a metric.
There are cases where that is useful, like parts of India or Australia,
and other cases where the distance between /adjacent/ civic addresses can
be measured in terms of travel time by car or train - like different parts
of Australia.
So this comes back to the use cases and requirements. If being able to
compare addresses and make useful inferences matters, then it is important
to split out the semantics, with the level of detail determining how far
down the semantic split should go. Otherwise, you are right that it is not
really important.
cheers
Chaals (normally a pure lurker)
--
Charles McCathieNevile Opera Software, Standards Group
je parle français -- hablo español -- jeg lærer norsk
http://my.opera.com/chaals Try Opera: http://www.opera.com
Received on Wednesday, 4 March 2009 08:43:38 UTC