Re: draft-mealling-human-friendly-identifier-req-00.txt

Michael Mealling (michael@bailey.dscga.com)
Mon, 12 Oct 1998 09:19:08 -0400 (EDT)


From: Michael Mealling <michael@bailey.dscga.com>
Message-Id: <199810121319.JAA12094@bailey.dscga.com>
In-Reply-To: <000601bdf593$0f9a25c0$15d0000d@copper-208.parc.xerox.com> from Larry Masinter at "Oct 11, 98 08:47:40 pm"
To: masinter@parc.xerox.com (Larry Masinter)
Date: Mon, 12 Oct 1998 09:19:08 -0400 (EDT)
Cc: nico@centraal.com, michaelm@rwhois.net, uri@Bunyip.Com
Subject: Re: draft-mealling-human-friendly-identifier-req-00.txt

Larry Masinter said this:
> I think Michael's document should be revised to make the distinction
> between "generic search" and "match" clearer; it's common in IETF,
> but not so clear without that context.
> 
> A "search" might be: find me all pizza parlors within 30 miles
> of this address
> while a "match" could be: find me a place called something
> like "joe's pizza"
> 
> The distinction isn't clean, since the boundary between the two kinds
> of operations can be very blurry: you can keep on adding features to
> "match" until you have full "search". 

Exactly. Currently there are only two features I think we absolutely
need: location and topic segment. The reason I include these and
not others is that they are generally useful for trademark. I.e.
"Joe's Pizza" (this example is getting somewhat like the weather
map example in URNs ;-) needs two things in order to be unique
within trademarks: which location it is in and whether it really is a
food-based business. 

This is a very slippery slope since, as we saw with the original
URN ideas, everyone will want to add their favorite discriminator:
date of business establishment, price ranges, product offerings, 
DUNS number, etc.

This type of information is fine for including in the metadata that
is returned but not for the actual matches. Matches need to be fast.
BUT, and this is the long range aspect of this, once that metadata
is being maintained at the leaves, it is perfect pickings for
a directory service to pick up on and use at some later date.
I.e. an HFI architecture shouldn't solve the general search problem
but it can make strategic decisions that make that problem easier to
solve later on.
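
A minimal sketch of the split described above, assuming an in-memory
index: a match keys only on the identifier plus the two discriminators
(location and topic segment), while everything else rides along as
returned metadata and is never consulted during the match. The names
Entry, HFIIndex, and normalize, and the plain-string location and
segment values, are hypothetical and not part of the draft.

```python
from dataclasses import dataclass, field


def normalize(text: str) -> str:
    """Case-fold and collapse whitespace so trivially different spellings match."""
    return " ".join(text.lower().split())


@dataclass
class Entry:
    identifier: str
    location: str   # hypothetical free-form location string
    segment: str    # hypothetical topic/industry segment label
    # Returned with a match but never used to decide one.
    metadata: dict = field(default_factory=dict)


class HFIIndex:
    """Exact-key lookup on (identifier, location, segment) only."""

    def __init__(self) -> None:
        self._by_key: dict[tuple[str, str, str], list[Entry]] = {}

    @staticmethod
    def _key(identifier: str, location: str, segment: str) -> tuple[str, str, str]:
        return (normalize(identifier), normalize(location), normalize(segment))

    def register(self, entry: Entry) -> None:
        key = self._key(entry.identifier, entry.location, entry.segment)
        self._by_key.setdefault(key, []).append(entry)

    def match(self, identifier: str, location: str, segment: str) -> list[Entry]:
        # A single dictionary lookup; no scan over metadata fields.
        return list(self._by_key.get(self._key(identifier, location, segment), []))
```

Keeping the key to three fields is what keeps the match a single
lookup; a directory service could later index the metadata separately
without changing this path.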


> >N-to-N mapping
> > A single identifier should be capable of being used by two
> > separate entities. 
> I think the nature of an "identifier" is that it identifies. It might
> need some context in order to identify, but within that context, it has
> to be unique.  Otherwise, what you have is some kind of partial match,
> or an identifier of a set. Now, if you're going to call them "human
> friendly names" instead of "human friendly identifiers", then they're
> no longer required to be unique, since we know that several entities
> (resources?) can have the same "name".

I prefer an identifier of a set. I specifically didn't use 'name' because
of it being so overloaded by the URN discussions. I just didn't want
to open up the old debate of what a 'name' really is...

> Maybe we should be more explicit about contextual information that
> can be necessary auxiliary information to the identifier which will
> allow it to identify. For example, in the RealName space, the year
> of registration is additional contextual information that's necessary
> to actually supply uniqueness.

So far I have industry segment (there's an ISO code for this) and 
location (this one is proving more difficult). I can probably understand
year but I would think that you would need more granularity. Are you
intending on using date as a discriminator between entries that
were registered at different times but otherwise are exactly the same?
Under what circumstances can that happen? Or are you using a year as
more of just a discriminating token that has no real semantic meaning?

> >Conversely, an entity should be capable of having more than one identifier.
> 
> This might seem like a trivial request, but it's given WebDAV a great
> deal of difficulty: can a resource have more than one URL? Or is
> a 'resource' actually 'the thing that a URL points to', and that
> different URLs = different resources, although some resources are
> 'equivalent'.

This is part of the problem when using the term URI for this. When I think
of an HFI as a URI I'm thinking that the resource that is identified
is the set of entities that requested that they be part of that set for
that particular identifier. I.e. it's a URI query without the "?".
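
The "identifier of a set" reading, together with the N-to-N
requirement quoted earlier, can be sketched as a pair of maps. This is
only an illustration under assumed names (HFIRegistry, join, resolve,
identifiers_of are all hypothetical): resolving an identifier yields
the set of entities that asked to be members, and one entity may
belong to several identifiers.

```python
from collections import defaultdict


class HFIRegistry:
    """Many-to-many map between identifiers and the entities that joined them."""

    def __init__(self) -> None:
        self._members: dict[str, set[str]] = defaultdict(set)
        self._identifiers: dict[str, set[str]] = defaultdict(set)

    def join(self, identifier: str, entity: str) -> None:
        # An entity requests membership in the set named by the identifier.
        self._members[identifier].add(entity)
        self._identifiers[entity].add(identifier)

    def resolve(self, identifier: str) -> frozenset[str]:
        # The identified resource is the whole set, not one member of it.
        return frozenset(self._members.get(identifier, ()))

    def identifiers_of(self, entity: str) -> frozenset[str]:
        return frozenset(self._identifiers.get(entity, ()))
```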

> 
> I think we should move this conversation to uri@bunyip.com.

I'll start the CC....


-- 
--------------------------------------------------------------------------------
Michael Mealling	|      Vote Libertarian!       | www.rwhois.net/michael
Sr. Research Engineer   |   www.ga.lp.org/gwinnett     | ICQ#:         14198821
Network Solutions	|          www.lp.org          |  michaelm@netsol.com