Re: draft-mealling-human-friendly-identifier-req-00.txt

Michael Mealling (michael@bailey.dscga.com)
Mon, 12 Oct 1998 16:03:33 -0400 (EDT)


From: Michael Mealling <michael@bailey.dscga.com>
Message-Id: <199810122003.QAA13004@bailey.dscga.com>
In-Reply-To: <82289F0E8F05D211A5F000A0C982BAD42823A8@ns1.centraal.com> from Nicolas Popp at "Oct 12, 98 10:57:13 am"
To: nico@centraal.com (Nicolas Popp)
Date: Mon, 12 Oct 1998 16:03:33 -0400 (EDT)
Cc: michaelm@netsol.com, nico@centraal.com, masinter@parc.xerox.com,
Subject: Re: draft-mealling-human-friendly-identifier-req-00.txt

Nicolas Popp said this:
> >>How this information is gotten from the user is a human factors issue.
> >I'm willing to bet it's easily discernible from local browsing patterns.
> >Location is a bit more difficult since there isn't a standard out there
> >for it. The research I've done is showing that Country-<n number of
> >administrative regions>-City-<n number of neighborhood designations> is
> >sufficient for most situations.
> 
> It will have to be a user input, and this is not as good. Ask AltaVista how
> many users use their advanced search capabilities.

Sure, but, since I've used both, advanced doesn't really give me all that
much (unless they've recently added something). This isn't a problem with
AltaVista; it's a problem with the quality of data they have to work with.
You can only get so advanced off of crappy metadata...

> As far as extracting
> patterns, this is not easy either. That's why I prefer global uniqueness to
> uniqueness for a specific region and industry sector. But again, each
> approach has its limits, and as long as the spec does not prohibit any
> approach, we are ok.

I suspect it will be an evolutionary, community-specific thing. If users
come to find that region/industry context is not useful then registrants will
tend to register names that are unique....
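
To make the location context concrete, here is a minimal sketch of the Country-<regions>-City-<neighborhoods> layout quoted earlier. The `Location` class, the `covers` function, and the sample place names are all hypothetical illustrations; nothing here comes from the draft itself.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Location:
    """Hypothetical location context: country, n regions, city, n neighborhoods."""
    country: str
    regions: Tuple[str, ...] = ()
    city: str = ""
    neighborhoods: Tuple[str, ...] = ()


def covers(scope: Location, loc: Location) -> bool:
    """True if `loc` lies inside the (possibly coarser) `scope`."""
    if scope.country != loc.country:
        return False
    if not scope.city:
        # Region-level scope: its regions must be a prefix of loc's regions.
        return loc.regions[:len(scope.regions)] == scope.regions
    # City-level scope: regions and city must agree; neighborhoods are a prefix.
    return (
        scope.regions == loc.regions
        and scope.city == loc.city
        and loc.neighborhoods[:len(scope.neighborhoods)] == scope.neighborhoods
    )


midtown = Location("US", ("GA", "Fulton"), "Atlanta", ("Midtown",))
print(covers(Location("US", ("GA",)), midtown))
print(covers(Location("US", ("CA",)), midtown))
```

A client that only knows the state can still send a useful context, and one that knows the neighborhood can narrow further; that is the only design point the sketch tries to show.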

> >Nope. You still search on "Joe's Pizza" since that's the name. If Joe is
> >silly enough to just put "go:Joe" on the side of a bus then he's too far
> >gone for help. I.e. registrants should use their trademarks. But, substring
> >matches should be allowed if the user requests it. "Joe's Pizza" should
> >never match "Joe's Shoes". "Joe's" would match all of them in a given
> >industry segment/location. That list wouldn't be too outrageous.
> 
> My point was that if there is no "Joe's Pizza" in the database, it becomes
> very difficult to return a precise list on a simple textual search on the
> identifiers (I know that, that's what the RealName System is doing, and
> sometimes our results lists are not as good as I would like). In such a
> case, the only way to get a short and precise list is to use the other
> properties of the resource characteristic. For instance "Joe's pizza" as a
> query implies category "Food/restaurant" which could be used to filter
> entries out. 

I think I understand: you want the match semantics to also
match on natural language aspects of something even if the identifier
queried for isn't there? I.e. Joe has a pizza place but he doesn't
register "Joe's Pizza" he just registered "Joe's". Then the user queries
for "Joe's Pizza" and the matching subsystem is smart enough to make
that leap? Wow. That's pretty difficult. Or do I have it wrong?
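
For contrast, a minimal sketch of the purely syntactic matching discussed in this thread: exact match by default, substring match only on request, with industry segment and location as optional discriminators. The `Entry` fields, the `match` function, and the sample data are assumptions made for illustration, not part of any draft.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    identifier: str  # the registered string, e.g. "Joe's Pizza"
    segment: str     # hypothetical industry-segment discriminator
    location: str    # hypothetical location discriminator


def match(entries, query, segment=None, location=None, substring=False):
    """Match on the identifier only, never on the data behind it."""
    q = query.casefold()
    hits = []
    for e in entries:
        name = e.identifier.casefold()
        if not ((q in name) if substring else (q == name)):
            continue
        if segment is not None and e.segment != segment:
            continue
        if location is not None and e.location != location:
            continue
        hits.append(e)
    return hits


ENTRIES = [
    Entry("Joe's Pizza", "food", "US-GA-Atlanta"),
    Entry("Joe's Shoes", "retail", "US-GA-Atlanta"),
    Entry("Joe's", "food", "US-CA-Palo Alto"),
]

print(match(ENTRIES, "Joe's Pizza"))
print(match(ENTRIES, "Joe's", segment="food", substring=True))
print(match(ENTRIES, "Joe's Pizza", location="US-CA-Palo Alto"))
```

The last call returns nothing: a matcher like this cannot infer that the Palo Alto "Joe's" is a pizza place, which is exactly the leap in question.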

> >One point for clarity: I also think it is very important that the root
> >only contain enough data to handle matches and a referral to the locally
> >maintained server. Much like DNS, the root contains very little since
> >it is a heavily loaded service. The real data about a domain is kept
> >at that domain's nameserver. 
> 
> That's an implementation issue. There are as many advantages to a centralized
> system as there are to a distributed one. For instance, it is difficult to
> provide any interesting directory service if the data is never centralized
> in one place. These architectures are not incompatible either. The data can
> be distributed, and the resolution/search services centralized. For
> instance, in the RealName System, the metadata is maintained in distributed
> RDF files (saved on our customer's Web site), but the resolvers aggregate
> the metadata in order to build coherent results list. It really depends on
> the scope of the namespace anyway.

Yep. If all you want is a syntactic match on the actual registered Unicode
plus one or two minor discriminators then there isn't really a need for
all of the data to be in the root. It's also doubtful that a directory with
that much centralized data could scale all that well. 
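
A rough sketch of that split, assuming the root keeps only the identifier, a couple of discriminators, and a referral to wherever the real data lives. `RootRecord`, `Root`, and the example.com-style URL are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RootRecord:
    identifier: str
    segment: str    # hypothetical discriminator
    location: str   # hypothetical discriminator
    referral: str   # where the locally maintained metadata is served


class Root:
    """Holds just enough to match and refer, much like a DNS delegation."""

    def __init__(self):
        self._index = {}

    def register(self, record):
        key = record.identifier.casefold()
        self._index.setdefault(key, []).append(record)

    def resolve(self, identifier, segment=None, location=None):
        candidates = self._index.get(identifier.casefold(), [])
        return [
            r.referral
            for r in candidates
            if (segment is None or r.segment == segment)
            and (location is None or r.location == location)
        ]


root = Root()
root.register(RootRecord("Joe's Pizza", "food", "US-GA-Atlanta",
                         "http://joes-pizza.example/hfi.rdf"))
print(root.resolve("joe's pizza"))
```

Anything richer than the referral (the extended metadata, community-specific schema) would be fetched from the referred server, not from the root.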

> >Exactly. But I also want to allow for a very low barrier to entry for
> >non-businesses. In my proposal (out this week sometime) there is a 
> >.....
> >"McDonalds" via some registrar. Now, if my neighbor's 12-year-old son
> >is also known as "McDonalds" in his online game, he can also register
> >that identifier. When someone requests "McDonalds" they can get both or none
> >depending on if they have requested that unqualified entries be turned off.
> 
> Yes, but you are throwing a new concept at the user. I am not saying that
> the concept is bad. But you are asking a "naive" user to understand
> that there are qualified and unqualified names, and that he/she needs to
> manipulate a preference in order to filter results in and out. With users,
> less is always better. Did you know that one of the most frequently typed
> queries in a search engine is "www.yahoo.com"?

Sure. But it's analogous to what they do in real life. People routinely
deal with the same "name" being used for many different things. I also
expect that different communities will treat qualified vs unqualified
differently. I.e. the online gaming community would optimize for
the unqualified while the consumer/e-commerce market would optimize for the 
qualified. But, as with DNS, there are always features that can fall by the
wayside if unused (MAILA and MAILB?).
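
One possible reading of the qualified/unqualified ranking, sketched under assumptions: qualified entries sort ahead of unqualified ones, and a client preference can drop unqualified entries entirely. The `Entry` class, the `rank` function, and the `include_unqualified` flag are hypothetical, and the sketch assumes that switching unqualified entries off still returns the qualified ones.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    identifier: str
    owner: str
    qualified: bool  # registered through a registrar with contractual obligations


def rank(entries, identifier, include_unqualified=True):
    """Return matches for `identifier`, qualified entries first."""
    hits = [e for e in entries if e.identifier == identifier]
    if not include_unqualified:
        hits = [e for e in hits if e.qualified]
    # sorted() is stable, so ties keep registration order; False sorts first.
    return sorted(hits, key=lambda e: not e.qualified)


ENTRIES = [
    Entry("McDonalds", "neighbor's kid (game nickname)", qualified=False),
    Entry("McDonalds", "McDonald's Corporation", qualified=True),
]

print([e.owner for e in rank(ENTRIES, "McDonalds")])
print([e.owner for e in rank(ENTRIES, "McDonalds", include_unqualified=False)])
```

A gaming client might default `include_unqualified` to on and a shopping client to off, which is the community-by-community difference described above.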

> As a side note, one interesting thing that we are doing to reduce the noise
> in our results list is to use the "popularity" of an identifier to order the
> results. So if you type "books", you get a list with "Barnes & Noble Books"
> and "Amazon.com Books" but "Nico's Books" is way below although all these
> entries have the same relevance for the query "books". That's because one of
> the properties of the resource characteristics is "usage" (a function of how
> many times the identifier has been resolved). Of course this only shows the
> value of going beyond simple textual search.

This is an interesting additional context. I'm not sure how to gauge
popularity from an architectural standpoint, though. Hit metering has
always been an issue discussed from time to time. Popularity is also
a relative vector since a particular resource may be popular to one 
community while completely unknown by another. 
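
A small sketch of what community-relative popularity could look like, assuming resolutions are counted per (identifier, community) pair. `UsageMeter` and the community labels are invented for the example; how such counts would actually be collected is the open hit-metering question.

```python
from collections import Counter


class UsageMeter:
    """Counts resolutions per (identifier, community); purely illustrative."""

    def __init__(self):
        self._counts = Counter()

    def record(self, identifier, community):
        self._counts[(identifier, community)] += 1

    def order(self, identifiers, community):
        # Most-resolved first within one community; ties keep input order.
        return sorted(identifiers, key=lambda i: -self._counts[(i, community)])


meter = UsageMeter()
for _ in range(3):
    meter.record("Amazon.com Books", "shoppers")
meter.record("Nico's Books", "shoppers")
for _ in range(5):
    meter.record("Nico's Books", "collectors")

names = ["Nico's Books", "Amazon.com Books"]
print(meter.order(names, "shoppers"))
print(meter.order(names, "collectors"))
```

The same two identifiers come back in opposite orders for the two communities, which is the sense in which popularity is relative.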

-MM


> -----Original Message-----
> From: Michael Mealling [mailto:michael@bailey.dscga.com]
> Sent: Monday, October 12, 1998 6:59 AM
> To: nico@centraal.com
> Cc: michaelm@netsol.com; nico@centraal.com; masinter@parc.xerox.com;
> uri@bunyip.com
> Subject: Re: draft-mealling-human-friendly-identifier-req-00.txt
> 
> 
> Nicolas Popp said this:
> > >Yes and no. ;-) From my experience users want an interesting mix of both.
> > >They want navigation (unique lookup) most of the time. But when something
> > >changes they want the navigation to be able to detect that and turn it
> > >into a search. Also, navigation is when you have a known quantity. The
> > >intent is that the known quantity is slightly more complex than the
> > >friendly identifier itself. I.e. if I see "go:Joe's Pizza" on the side
> > >of a bus and I type it in then I'm not all that concerned about
> > >getting a search (most users would expect it since it's what they're
> > >accustomed to with the yellow pages). Now, once I've done that and
> > >selected the one I want, if I type in "go:Joe's Pizza" again I should get
> > >the same one back. Only if I expand the identifier (specify a search
> > >explicitly) do I get the original list back.
> > 
> > Let me just add a few "practical points" about the value of uniqueness.
> > First, you will never see "go:Joe's Pizza" on the side of a bus unless the
> > identifier is unique.
> 
> Sure. And it should be unique for any given geographic area or else
> Joe has a very lucrative trademark case to make.
> 
> > Joe would not pay for it otherwise. Of course, this is
> > the registrant view point. Nevertheless, for me as a user, it is also very
> > important that Joe be ready to let me know about his HFI.  If it had not
> > been for the bus, I would have never learned how to find this restaurant.
> > Also, if a simple action such as "go:Joe's Pizza" returns a list of 100
> > possible destinations, "Joe's Pizza" seems like a lousy resource
> > "identifier" to me. Why would I even bother  remembering it?
> 
> Correct. But I'm also assuming that there are a few very controlled 
> discriminators that go along with the match request. At a minimum
> it would be geographic location and industry segment. Industry segment
> allows the user to ask for "Apple" and get Apple Computing or Apple Music.
> How this information is gotten from the user is a human factors issue.
> I'm willing to bet it's easily discernible from local browsing patterns.
> Location is a bit more difficult since there isn't a standard out there
> for it. The research I've done is showing that Country-<n number of
> administrative regions>-City-<n number of neighborhood designations> is
> sufficient for most situations.
> 
> > >Yep. It's search but it's search only on the identifier. Not on the data
> > >represented by the identifier....
> > 
> > Hmm, but then your query will return "Joe's Shoes", "Joe's BookStore",
> > and "Joe's fabulous stamp collection". Then you quickly become as bad as
> > today's search engines, especially on generic-term queries. I guess I am
> > back to the scalability problem...
> 
> Nope. You still search on "Joe's Pizza" since that's the name. If Joe is
> silly enough to just put "go:Joe" on the side of a bus then he's too far
> gone for help. I.e. registrants should use their trademarks. But, substring
> matches should be allowed if the user requests it. "Joe's Pizza" should
> never match "Joe's Shoes". "Joe's" would match all of them in a given
> industry segment/location. That list wouldn't be too outrageous.
> 
> > >This is my intent. I'm working on an architectural draft now. It assumes
> > >that what is returned is an RDF object that has a minimal schema but that
> > >can be extended by RDF methods to include whatever information the
> > >owner of the identifier wants...
> > 
> > Interesting. Our resolvers do something like that, except that we used our
> > own XML tags.  Try
> > http://www.realnames.com/resolver.dll?action=query&realName=books&contentType=xml.
> > Of course, it should be RDF (we are fixing that).  If you subscribe
> > with RealName, you get an RDF file, though.
> 
> Cool. The RDF Schema group is just about wrapping up their first draft.
> Between that and RDF itself we should have a format that allows a 
> standard schema for the entire system but also allows for community specific
> information where appropriate. For example, I could very well see the
> chemistry community using something like this. They could get an HFI for
> each chemical. The common schema just doesn't give them the data they 
> want so they extend the RDF on their local server so that it contains
> their own schema.
> 
> One point for clarity: I also think it is very important that the root
> only contain enough data to handle matches and a referral to the locally
> maintained server. Much like DNS, the root contains very little since
> it is a heavily loaded service. The real data about a domain is kept
> at that domain's nameserver. 
> 
> > >hehe. The IETF has developed a very strong immune response to policy. ;-)
> > >Seriously, in what context? As far as trademark disputes, operational
> > >and registration issues or more the exact mechanics of how an end user
> > >updates their HFI?
> > 
> > Management of the namespace is important to ensure that the identifiers and
> > the metadata are of good quality. If you don't do that, you get a lot of
> > noise. As an example, consider the quality of the META HTML tags. Everyone
> > abuses them. Even respected vendors embed their competitors' product names in
> > their homepage's meta tags so that they can be found in all circumstances
> > (in our example, Joe from "Joe's Pizza" will try to get "Fred's Pizza").
> 
> Exactly. But I also want to allow for a very low barrier to entry for
> non-businesses. In my proposal (out this week sometime) there is a 
> distinction made in the ranking of responses from the root. There are
> "qualified" and "unqualified" entries. Qualified entries are made by
> a registrar that has met certain legal/contractual obligations for what
> it puts in the root. It can guarantee trademark compliance, a level
> of service for the data, etc. Unqualified entries are those made by the
> masses at large and have a low barrier to entry. For example, the gaming
> community is searching for some system that can handle all of the myriads
> of nicknames they use on their online games. There is overlap and the
> data maintained behind the name isn't all that good. But it is of value
> to that community.
> 
> There is one rule though that makes this useful: when ranking, qualified
> entries always outrank unqualified entries. Example: McDonald's registers
> "McDonalds" via some registrar. Now, if my neighbor's 12-year-old son
> is also known as "McDonalds" in his online game, he can also register
> that identifier. When someone requests "McDonalds" they can get both or none
> depending on if they have requested that unqualified entries be turned off.
> 
> (It also gives trademark holders an easy way to see who is infringing...)
> 
> 
> 
> 
> > ----------------------------------------------------------------------
> > 
> > Nicolas Popp said this:
> > > Michael.
> > > 
> > > Let me start by saying that I am truly excited by your submission to the
> > > IETF.  As you probably know, Centraal has developed a "human friendly
> > > namespace" for commercial Web pages (http://www.centraal.com). 
> > 
> > Yep. I've been following you guys for a while...
> > 
> > > Therefore, I am extremely pleased by your interest in capturing the general
> > > requirements for Human Friendly Namespaces.  I would actually really like
> > > to meet with you face to face to discuss these issues in more detail.
> > > In the meantime, I could not resist sending you some preliminary comments.
> > 
> > Sure. You guys are on the left coast, right? I could always fly out
> > sometime to talk about this more in depth. 
> > 
> > Let me preface my remarks and that document with a bit of background.
> > As with most IETF requirements documents, that one was written
> > with a non-vague solution in mind. It really isn't meant to be
> > an exhaustive set of requirements for all methods of user oriented
> > navigation or search (now there's a PhD thesis). Instead, it is
> > meant to outline a set of objectives for a potential working group.
> > Once those requirements are met the working group can be declared
> > done and disbanded. In the case of these systems, there are several
> > potential solution spaces that have slightly different paths.
> > 
> > As with all requirements documents though, it is malleable. If
> > the working group decides that a feature is really a requirement
> > then it can be added. Also, if the Area Directors are shown that
> > the entire Working Group thinks that a particular requirement
> > needs to be scrapped then that is also possible.
> > 
> > > So, here they are:
> > > 
> > > >Succinctly stated, the requirements that are considered out of scope
> > > >are generic search/navigation [...]
> > > 
> > > I tend to disagree with this. Navigation and search services are core
> > > services of a human friendly namespace. Hence, they should be part of the
> > > requirements. I look at them as the equivalent of the name resolution
> > > services in the URN specification. They are intimately related. Also, by
> > > making these services part of the specification, we open the doors to the
> > > standardization of the search interface, which in turn will facilitate their
> > > integration in applications and Web sites like the search engines.
> > 
> > If this was any other organization than the IETF I would have to agree
> > with you. This requirement is more of a finessing of the IETF process
> > than anything. My hope is to form a working group. In order to do
> > that the Area Directors and those interested in the subject generally
> > want to know that you are solving a portion of the problem space that
> > is actually doable. Historically the IETF has been very reluctant to
> > solve generic search/navigation problems. I ran into this at the last
> > meeting where I tried to form the METAD group to look at merging
> > the whois++ and rwhois specs.
> > 
> > What is important is what is left _unsaid_. While generic search and
> > navigation are not requirements there is an unstated requirement
> > that any system such as this should be engineered with an eye toward
> > being an integral part of a system that does support generic search
> > and navigation. I.e. we can always go beyond the requirements if it's
> > doable and doesn't impact other requirements....
> > 
> > Keith Moore (Applications Area Director) calls those "weasel words". ;-)
> > 
> > I detect some vagueness in my wording. I should be clearer and say that
> > this search and navigation feature is over the data behind the identifier.
> > Without search/navigation of the identifier itself it wouldn't be all
> > that useful. I.e. if we couch this as a generic directory service the
> > IETF will run away screaming "overly broad scope!"
> > 
> > > >N-to-N mapping: A single identifier should be capable of being used by two
> > > >separate entities.
> > > >Conversely, an entity should be capable of having more than one identifier.
> > > 
> > > Does it mean that uniqueness is tolerated (1-N)? Personally,  I regard
> > > uniqueness of the identifier as an important asset for a human friendly
> > > namespace. Although not all the namespaces will be able to enforce
> > > uniqueness, should not it be encouraged?  Uniqueness enables direct
> > > navigation (accessing a resource using its HFI in one single and simple
> > > step). Direct navigation is far more user friendly than navigation through
> > > search.
> > 
> > Yes and no. ;-) From my experience users want an interesting mix of both.
> > They want navigation (unique lookup) most of the time. But when something
> > changes they want the navigation to be able to detect that and turn it
> > into a search. Also, navigation is when you have a known quantity. The
> > intent is that the known quantity is slightly more complex than the
> > friendly identifier itself. I.e. if I see "go:Joe's Pizza" on the side
> > of a bus and I type it in then I'm not all that concerned about
> > getting a search (most users would expect it since it's what they're
> > accustomed to with the yellow pages). Now, once I've done that and
> > selected the one I want, if I type in "go:Joe's Pizza" again I should get
> > the same one back. Only if I expand the identifier (specify a search
> > explicitly) do I get the original list back.
> > 
> > Also, there is the concept of architecturally enforced uniqueness and
> > policy enforced uniqueness. I think that, architecturally, the system
> > should allow 1-N but that, in certain cases such as trademark infringement,
> > the policies of those inserting the identifiers could preclude such an
> > arrangement.
> > 
> > Also, 1-N is a case that is massaged by the fact that the client is
> > _strongly_ encouraged to send discriminating contexts such as 
> > location and assumed industry/topic segment. My hunch is that for
> > most businesses this will give a fairly high level of uniqueness. If not
> > then I bet someone has a trademark infringement case waiting to happen.
> > 
> > > >Matching semantics: At the least, substring matches are required. Other
> > > >methods of matching should be evaluated based on performance and ability
> > > >to give the user an accurate result set.
> > > 
> > > Is not that search? 
> > 
> > Yep. It's search but it's search only on the identifier. Not on the data
> > represented by the identifier....
> > 
> > > Can it scale if the only search capabilities that you
> > > offer are textual search on the names?
> > 
> > For its intended, limited scope, yes. But this doesn't mean that the 
> > system can't add features or be extended later. These are IETF
> > requirements, not exhaustive features. Think of it this way, if these
> > minimal things aren't done then it fails and you don't get a standard.
> > That doesn't mean you can't do more....
> > 
> > > Can we extend HFI namespace from the
> > > simple notion of aliased address to the notion of a space of named resource
> > > characteristics? This is more interesting than the simple mapping of a name
> > > to a physical locator (e.g. HFI to URL or HFI to email address). If we allow
> > > the namespace to be about metadata, then we can also develop a general
> > > framework for precise search and navigation (search/directory service
> > > infrastructure). For instance, if my namespace is about restaurants and
> > > supports the meta property "geographical region", a user would be able to
> > > find the unique "pizza hut in palo-alto" (formally, hfi:pizza hut +
> > > region:palo-alto; please do not draw any conclusion about my food habits).
> > 
> > This is my intent. I'm working on an architectural draft now. It assumes
> > that what is returned is an RDF object that has a minimal schema but that
> > can be extended by RDF methods to include whatever information the
> > owner of the identifier wants...
> > 
> > This is where the generic search from above comes in. Once that data
> > is in their local resolvers, it is a rather simple operation to build
> > a real directory service on top of it...
> > 
> > > >The identifier should be capable of expressing hierarchy. In some cases it
> > > >makes sense for an identifier to appear to belong to a hierarchy. But this is
> > > >merely a capability. It is not a hierarchy. It is expected that hierarchical
> > > >identifiers will be a distinct minority.
> > > 
> > > In the case of a namespace of resource characteristics, the properties of
> > > the resource can be used to create structure and hierarchies within the
> > > namespace while not sacrificing the human friendliness of the name (e.g. the
> > > "region" would organize the names according to the region hierarchy, a
> > > "yellow page category" property would organize the names according to the
> > > classification hierarchy, etc... but still no slashes or dots in the HFI).
> > 
> > Yes. By hierarchy I'm thinking more of an ownership type of
> > system like DNS. I.e. once you register "foo", do you then get
> > administrative control over "foo bar", or is it instead "foo/bar"?
> > But I think you are right though. There really is no difference when
> > we're talking about flat spaces and non-uniqueness of identifiers.
> > I.e. "foo bar" and "foo/bar" are essentially the same. The only difference
> > is that "foo/bar" may be specified at an intranet level in a local
> > server instead of the flat root. I think this could be a toss up.
> > Either one could be planned for. I do think a bit of vanity comes 
> > into it for end users. I think they like to think they have 'control'.
> > 
> > > Lastly, do you think that it would be useful to consider the following:
> > > 
> > > . Persistence of the names
> > 
> > hmmm... there might be a slight divergence of intent here. First let me
> > preface this by saying that I am intentionally not talking about names.
> > The 6+ years I've spent in the URN struggle have specified one clear
> > thing: no one really knows what the word "name" means. ;-) For most
> > of the folx in the URI groups in the IETF it has a specific connotation of
> > infinite persistence. For the kind of system I'm thinking about, persistence
> > of the HFI is not required. They can be extremely short lived. What is
> > persistent is the URN pointing to the resource that is included in the
> > RDF metadata that was returned during the HFI resolution.
> > 
> > > . Administration of the namespace
> > 
> > hehe. The IETF has developed a very strong immune response to policy. ;-)
> > Seriously, in what context? As far as trademark disputes, operational
> > and registration issues or more the exact mechanics of how an end user
> > updates their HFI?
> > 
> > > . Services definition ( F2L, F2F, etc...)
> > 
> > Good question. They could follow the URI ones: I.e. I2L, I2N, I2C.
> > (we got rid of the N2L in favor of I2L since the input types itself.)
> > Strategically though I think it would be a good thing to default to
> > returning fairly rich metadata. If you don't then people still default
> > to just returning the URL and you don't get enough data in the leaf
> > nodes to support a good directory service as an add on. 
> > 
> > 
> > This has been nice. It's hard to find people who can understand these
> > issues and talk about them intelligently. Feel free to continue the
> > email exchange since I'd like to keep a record of this conversation.
> > There are some good points here I'd like to preserve...
> > 
> > -MM
> > 
> > -- 
> > --------------------------------------------------------------------------------
> > Michael Mealling        |      Vote Libertarian!       | www.rwhois.net/michael
> > Sr. Research Engineer   |   www.ga.lp.org/gwinnett     | ICQ#: 14198821
> > Network Solutions       |          www.lp.org          | michaelm@netsol.com
> > 
> > -----Original Message-----
> > From: Michael Mealling [mailto:michael@bailey.dscga.com]
> > Sent: Friday, October 09, 1998 7:11 AM
> > To: nico@centraal.com
> > Subject: Re: draft-mealling-human-friendly-identifier-req-00.txt
> > 
> > 
> > Nicolas Popp said this:
> > > Michael.
> > > 
> > > Let me start by saying that I am truly excited by your submission to the
> > > IETF.  As you probably know, Centraal has developed a "human friendly
> > > namespace" for commercial Web pages (http://www.centraal.com). 
> > 
> > Yep. I've been following you guys for a while...
> > 
> > > Therefore, I am extremely pleased by your interest in capturing the general
> > > requirements for Human Friendly Namespaces.  I would actually really like
> > > to meet with you face to face to discuss these issues in more detail.
> > > In the meantime, I could not resist sending you some preliminary comments.
> > 
> > Sure. You guys are on the left coast, right? I could always fly out
> > sometime to talk about this more in depth. 
> > 
> > Let me preface my remarks and that document with a bit of background.
> > As with most IETF requirements documents, that one was written
> > with a non-vague solution in mind. It really isn't meant to be
> > an exhaustive set of requirements for all methods of user oriented
> > navigation or search (now there's a PhD thesis). Instead, it is
> > meant to outline a set of objectives for a potential working group.
> > Once those requirements are met the working group can be declared
> > done and disbanded. In the case of these systems, there are several
> > potential solution spaces that have slightly different paths.
> > 
> > As with all requirements documents though, it is malleable. If
> > the working group decides that a feature is really a requirement
> > then it can be added. Also, if the Area Directors are shown that
> > the entire Working Group thinks that a particular requirement
> > needs to be scrapped then that is also possible.
> > 
> > > So, here they are:
> > > 
> > > >Succinctly stated, the requirements that are considered out of scope
> > > >are generic search/navigation [...]
> > > 
> > > I tend to disagree with this. Navigation and search services are core
> > > services of a human friendly namespace. Hence, they should be part of the
> > > requirements. I look at them as the equivalent of the name resolution
> > > services in the URN specification. They are intimately related. Also, by
> > > making these services part of the specification, we open the doors to the
> > > standardization of the search interface, which in turn will facilitate their
> > > integration in applications and Web sites like the search engines.
> > 
> > If this was any other organization than the IETF I would have to agree
> > with you. This requirement is more of a finessing of the IETF process
> > than anything. My hope is to form a working group. In order to do
> > that the Area Directors and those interested in the subject generally
> > want to know that you are solving a portion of the problem space that
> > is actually doable. Historically the IETF has been very reluctant to
> > solve generic search/navigation problems. I ran into this at the last
> > meeting where I tried to form the METAD group to look at merging
> > the whois++ and rwhois specs.
> > 
> > What is important is what is left _unsaid_. While generic search and
> > navigation are not requirements there is an unstated requirement
> > that any system such as this should be engineered with an eye toward
> > being an integral part of a system that does support generic search
> > and navigation. I.e. we can always go beyond the requirements if it's
> > doable and doesn't impact other requirements....
> > 
> > Keith Moore (Applications Area Director) calls those "weasel words". ;-)
> > 
> > I detect some vagueness in my wording. I should be clearer and say that
> > this search and navigation feature is over the data behind the identifier.
> > Without search/navigation of the identifier itself it wouldn't be all
> > that useful. I.e. if we couch this as a generic directory service the
> > IETF will run away screaming "overly broad scope!"
> > 
> > > >N-to-N mapping: A single identifier should be capable of being used by two
> > > >separate entities.
> > > >Conversely, an entity should be capable of having more than one identifier.
> > > 
> > > Does it mean that uniqueness is tolerated (1-N)? Personally,  I regard
> > > uniqueness of the identifier as an important asset for a human friendly
> > > namespace. Although not all the namespaces will be able to enforce
> > > uniqueness, should not it be encouraged?  Uniqueness enables direct
> > > navigation (accessing a resource using its HFI in one single and simple
> > > step). Direct navigation is far more user friendly than navigation through
> > > search.
> > 
> > Yes and no. ;-) From my experience users want an interesting mix of both.
> > They want navigation (unique lookup) most of the time. But when something
> > changes they want the navigation to be able to detect that and turn it
> > into a search. Also, navigation is when you have a known quantity. The
> > intent is that the known quantity is slightly more complex than the
> > friendly identifier itself. I.e. if I see "go:Joe's Pizza" on the side
> > of a bus and I type it in then I'm not all that concerned about
> > getting a search (most users would expect it since it's what they're
> > accustomed to with the yellow pages). Now, once I've done that and
> > selected the one I want, if I type in "go:Joe's Pizza" again I should get
> > the same one back. Only if I expand the identifier (specify a search
> > explicitly) do I get the original list back.
> > 
> > Also, there is the concept of architecturally enforced uniqueness and
> > policy enforced uniqueness. I think that, architecturally, the system
> > should allow 1-N but that, in certain cases such as trademark infringement,
> > the policies of those inserting the identifiers could preclude such an
> > arrangement.
> > 
> > Also, 1-N is a case that is massaged by the fact that the client is
> > _strongly_ encouraged to send discriminating contexts such as 
> > location and assumed industry/topic segment. My hunch is that for
> > most businesses this will give a fairly high level of uniqueness. If not
> > then I bet someone has a trademark infringement case waiting to happen.
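
A rough sketch of how client-supplied context could narrow a 1-N result,
in Python. The Entry fields, the discriminate() function, and the rule
that a context value is ignored when it would eliminate every candidate
are all assumptions made for illustration; the draft does not specify
any of them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    name: str
    location: str   # hypothetical flat location label
    industry: str   # hypothetical industry/topic segment


def discriminate(entries, name, location=None, industry=None):
    """Narrow the entries sharing `name` using optional client context."""
    hits = [e for e in entries if e.name == name]
    for attr, wanted in (("location", location), ("industry", industry)):
        if wanted is None:
            continue  # client did not send this context
        narrowed = [e for e in hits if getattr(e, attr) == wanted]
        # Assumption: context that matches nothing is ignored rather than
        # turning a usable result list into an empty one.
        if narrowed:
            hits = narrowed
    return hits
```

With both contexts supplied, most names should come back as a single
entry; anything still ambiguous falls through to the user as a list.
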
> > 
> > > >Matching semantics: At the least, substring matches are required. Other
> > > >methods of matching should be evaluated based on performance and ability
> > > >to give the user an accurate result set.
> > > 
> > > Is not that search? 
> > 
> > Yep. It's search, but it's search only on the identifier, not on the
> > data represented by the identifier....
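
To show the distinction, a small Python sketch of substring matching that
looks only at the identifier and never at the data behind it. The
match_identifiers() name, the dict layout, and the case-insensitive
comparison are assumptions for the example, not requirements from the
draft.

```python
def match_identifiers(records, query):
    """Return identifiers containing `query` as a substring.

    `records` maps identifier -> data behind the identifier. Only the keys
    are searched; the data is deliberately never inspected.
    """
    needle = query.casefold()
    return sorted(ident for ident in records if needle in ident.casefold())
```

A record whose description mentions "pizza" but whose identifier does not
is not returned for the query "pizza".
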
> > 
> > > Can it scale if the only search capabilities that you
> > > offer are textual search on the names? 
> > 
> > For its intended, limited scope, yes. But this doesn't mean that the 
> > system can't add features or be extended later. These are IETF
> > requirements, not exhaustive features. Think of it this way, if these
> > minimal things aren't done then it fails and you don't get a standard.
> > That doesn't mean you can't do more....
> > 
> > > Can we extend the HFI namespace from the
> > > simple notion of aliased address to the notion of a space of named
> > > resource characteristics? This is more interesting than the simple
> > > mapping of a name to a physical locator (e.g. HFI to URL or HFI to
> > > email address). If we allow the namespace to be about metadata, then
> > > we can also develop a general framework for precise search and
> > > navigation (search/directory service infrastructure). For instance,
> > > if my namespace is about restaurants and supports the meta property
> > > "geographical region", a user would be able to find the unique
> > > "pizza hut in palo-alto" (formally, hfi:pizza hut + region:palo-alto;
> > > please do not draw any conclusion about my food habits).
> > 
> > This is my intent. I'm working on an architectural draft now. It assumes
> > that what is returned is an RDF object that has a minimal schema but that
> > can be extended by RDF methods to include whatever information the
> > owner of the identifier wants...
> > 
> > This is where the generic search from above comes in. Once that data
> > is in their local resolvers, it is a rather simple operation to build
> > a real directory service on top of it...
> > 
> > > >The identifier should be capable of expressing hierarchy. In some cases
> > > >it makes sense for an identifier to appear to belong to a hierarchy.
> > > >But this is merely a capability. It is not a hierarchy. It is expected
> > > >that hierarchical identifiers will be a distinct minority.
> > > 
> > > In the case of a namespace of resource characteristics, the properties
> > > of the resource can be used to create structure and hierarchies within
> > > the namespace while not sacrificing the human friendliness of the name
> > > (e.g. the "region" would organize the names according to the region
> > > hierarchy, a "yellow page category" property would organize the names
> > > according to the classification hierarchy, etc... but still no slashes
> > > or dots in the HFI).
> > 
> > Yes. By hierarchy I'm thinking more of an ownership type of
> > system like DNS. I.e. once you register "foo", do you then get
> > administrative control over "foo bar", or instead is it "foo/bar"?
> > But I think you are right though. There really is no difference when
> > we're talking about flat spaces and non-uniqueness of identifiers.
> > I.e. "foo bar" and "foo/bar" are essentially the same. The only difference
> > is that "foo/bar" may be specified at an intranet level in a local
> > server instead of the flat root. I think this could be a toss up.
> > Either one could be planned for. I do think a bit of vanity comes 
> > into it for end users. I think they like to think they have 'control'.
> > 
> > > Lastly, do you think that it would be useful to consider the following:
> > > 
> > > . Persistence of the names
> > 
> > hmmm... there might be a slight divergence of intent here. First let me
> > preface this by saying that I am intentionally not talking about names.
> > The 6+ years I've spent in the URN struggle have made one thing
> > clear: no one really knows what the word "name" means. ;-) For most
> > of the folx in the URI groups in the IETF it has a specific connotation of
> > infinite persistence. For the kind of system I'm thinking about,
> > persistence of the HFI is not required. They can be extremely short
> > lived. What is persistent is the URN pointing to the resource that is
> > included in the RDF metadata that was returned during the HFI resolution.
> > 
> > > . Administration of the namespace
> > 
> > hehe. The IETF has developed a very strong immune response to policy. ;-)
> > Seriously, in what context? As far as trademark disputes, operational
> > and registration issues, or more the exact mechanics of how an end user
> > updates their HFI?
> > 
> > > . Services definition ( F2L, F2F, etc...)
> > 
> > Good question. They could follow the URI ones: I.e. I2L, I2N, I2C.
> > (we got rid of the N2L in favor of I2L since the input types itself.)
> > Strategically though I think it would be a good thing to default to
> > returning fairly rich metadata. If you don't then people still default
> > to just returning the URL and you don't get enough data in the leaf
> > nodes to support a good directory service as an add-on.
> > 
> > 
> > This has been nice. It's hard to find people who can understand these
> > issues and talk about them intelligently. Feel free to continue the
> > email exchange since I'd like to keep a record of this conversation.
> > There are some good points here I'd like to preserve...
> > 
> > -MM
> > 
> > -- 
> > --------------------------------------------------------------------------------
> > Michael Mealling	|      Vote Libertarian!       | www.rwhois.net/michael
> > Sr. Research Engineer   |   www.ga.lp.org/gwinnett     | ICQ#:         14198821
> > Network Solutions	|          www.lp.org          |  michaelm@netsol.com
> > 


-- 
--------------------------------------------------------------------------------
Michael Mealling	|      Vote Libertarian!       | www.rwhois.net/michael
Sr. Research Engineer   |   www.ga.lp.org/gwinnett     | ICQ#:         14198821
Network Solutions	|          www.lp.org          |  michaelm@netsol.com