Re: draft-mealling-human-friendly-identifier-req-00.txt from Michael Mealling on 1998-10-13 (uri@w3.org from October 1998)

From: Michael Mealling <michael@bailey.dscga.com>
Date: Tue, 13 Oct 1998 10:40:30 -0400 (EDT)
To: nico@centraal.com (Nicolas Popp)
Cc: michaelm@netsol.com, nico@centraal.com, masinter@parc.xerox.com, uri@Bunyip.Com
Message-Id: <199810131440.KAA14410@bailey.dscga.com>
Nicolas Popp said this:
> >I think I'm understanding that you are wanting the match semantics to
> >also match on natural language aspects of something even if the identifier
> >queried for isn't there? I.e. Joe has a pizza place but he doesn't
> >register "Joe's Pizza" he just registered "Joe's". Then the user queries
> >for "Joe's Pizza" and the matching subsystem is smart enough to make
> >that leap? Wow. That's pretty difficult. Or do I have it wrong?
> 
> It is not too difficult if Joe registered his identifier in the category
> "restaurant".

Oh. Yea. That's what I mean by the industry segment identifier. There's
an ISO standard identifier for those which is very convenient...

> That being said, I think that you are right, this is going way too far. I am
> ready to settle for the syntactic match as the common denominator for the
> HFI "matching" service.
> I got carried away by the desire to see the notion of metadata and "search"
> services make it into the spec, and you have already addressed this point
> with me.

Hold on there. ;-) I think there is a place for a very controlled set of
discriminators. I think location and industry segment ("restaurant")
are fine. I could be convinced of the need for date and popularity.
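
To make the idea concrete, here is a rough Python sketch of a syntactic
match plus a couple of discriminators. Nothing in it comes from the draft:
the Entry record, the match() function and the segment/location labels are
all made up for illustration, and a real system would presumably use proper
segment codes and a structured location rather than plain strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    identifier: str  # the registered HFI, e.g. "Joe's Pizza"
    segment: str     # industry segment label (hypothetical values)
    location: str    # coarse location label (hypothetical values)


def match(entries, query, segment=None, location=None, substring=False):
    """Syntactic match on the identifier, narrowed by optional discriminators."""
    q = query.casefold()
    results = []
    for entry in entries:
        ident = entry.identifier.casefold()
        # Exact match by default; substring only when the caller asks for it.
        hit = (q in ident) if substring else (q == ident)
        if not hit:
            continue
        # A discriminator left as None does not filter anything out.
        if segment is not None and entry.segment != segment:
            continue
        if location is not None and entry.location != location:
            continue
        results.append(entry)
    return results
```

With entries for "Joe's Pizza", "Joe's Shoes" and a bare "Joe's", an exact
query for "Joe's Pizza" returns only the first. A substring query for
"Joe's" restricted to the restaurant segment returns the pizza place and
the bare "Joe's" but never the shoe store.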

> >Yep. If all you want is a syntactic match on the actual registered Unicode
> >plus one or two minor discriminators then there isn't really a need for
> >all of the data to be in the root. It's also doubtful that a directory with
> >that much centralized data could scale all that well.
> 
> In the case of a large directory service, without having a  hierarchical
> distributed  architecture like DNS, you can still distribute the data across
> "root" servers so that none of them contains the entire set of resource
> characteristics.  The handle system, I believe, does something like that
> with just the "names". Interesting problem, but once again, I am getting off
> topic. The important point is that the HFI requirements document should not
> specify whether the namespace must be distributed or not.

Yep....

> >This is an interesting additional context. I'm not sure how to gauge
> >popularity from an architectural standpoint, though. Hit metering has
> >always been an issue discussed from time to time. Popularity is also
> >a relative vector since a particular resource may be popular to one 
> >community while completely unknown by another. 
> 
> True. It is not perfect but measurable by the system through its resolution
> service. There are other algorithms, but there again, it is probably
> irrelevant for this forum. Clearly, we should not specify any ordering
> algorithm for the HFI "matching" service.

Ah. I see, you're only talking about measured popularity of queries at
the root. I thought it was a more complex measure of popularity based
on actual hits to the entity's web pages or something like that. Popularity
at that level is certainly doable (depends on what gets distributed
where, but that can also be worked out).
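
As a sketch of what root-level popularity could look like, assuming the
root simply counts each resolution it serves: the RootUsage class and its
method names below are hypothetical, not anything from the draft or from
the RealName System.

```python
from collections import Counter


class RootUsage:
    """Hypothetical root-side counter: one tick per resolved identifier."""

    def __init__(self):
        self.usage = Counter()

    def resolve(self, identifier):
        # Assumption: every resolution passes through the root and is counted here.
        self.usage[identifier] += 1
        return identifier

    def rank(self, candidates):
        # Most-resolved first; sorted() is stable, so ties keep their input order.
        return sorted(candidates, key=lambda ident: -self.usage[ident])
```

This only measures popularity as seen by whoever does the counting, so it
says nothing about a resource that is popular in one community and unknown
in another.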

-MM

> 
> 
> -----Original Message-----
> From: Michael Mealling [mailto:michael@bailey.dscga.com]
> Sent: Monday, October 12, 1998 1:04 PM
> To: nico@centraal.com
> Cc: michaelm@netsol.com; nico@centraal.com; masinter@parc.xerox.com;
> uri@bunyip.com
> Subject: Re: draft-mealling-human-friendly-identifier-req-00.txt
> 
> 
> Nicolas Popp said this:
> > >How this information is gotten from the user is a human factors issue.
> > >I'm willing to bet it's easily discernible from local browsing patterns.
> > >Location is a bit more difficult since there isn't a standard out there
> > >for it. The research I've done is showing that Country-<n number of
> > >administrative regions>-City-<n number of neighborhood designations> is
> > >sufficient for most situations.
> > 
> > It will have to be a user input, and this is not as good. Ask AltaVista
> > how many users use their advanced search capabilities.
> 
> Sure, but, since I've used both, advanced doesn't really give me all that
> much (unless they've recently added something). This isn't a problem with
> AltaVista; it's a problem with the quality of data they have to work with.
> You can only get so advanced off of crappy metadata...
> 
> > As far as extracting
> > patterns, this is not easy either. That's why I prefer global uniqueness
> > to uniqueness for a specific region and industry sector. But again, each
> > approach has its limits, and as long as the spec does not prohibit any
> > approach, we are ok.
> 
> I suspect it will be an evolutionary/community-specific thing. If users
> come to find that region/industry context is not useful then registrants
> will tend to register names that are unique....
> 
> > >Nope. You still search on "Joe's Pizza" since that's the name. If Joe is
> > >silly enough to just put "go:Joe" on the side of a bus then he's too far
> > >gone for help. I.e. registrants should use their trademarks. But
> > >substring matches should be allowed if the user requests it. "Joe's Pizza"
> > >should never match "Joe's Shoes". "Joe's" would match all of them in a
> > >given industry segment/location. That list wouldn't be too outrageous.
> > 
> > My point was that if there is no "Joe's Pizza" in the database, it becomes
> > very difficult to return a precise list from a simple textual search on the
> > identifiers (I know that; that's what the RealName System is doing, and
> > sometimes our results lists are not as good as I would like). In such a
> > case, the only way to get a short and precise list is to use the other
> > properties of the resource characteristic. For instance "Joe's pizza" as a
> > query implies category "Food/restaurant" which could be used to filter
> > entries out.
> 
> I think I'm understanding that you are wanting the match semantics to
> also match on natural language aspects of something even if the identifier
> queried for isn't there? I.e. Joe has a pizza place but he doesn't
> register "Joe's Pizza" he just registered "Joe's". Then the user queries
> for "Joe's Pizza" and the matching subsystem is smart enough to make
> that leap? Wow. That's pretty difficult. Or do I have it wrong?
> 
> > >One point for clarity: I also think it is very important that the root
> > >only contain enough data to handle matches and a referral to the locally
> > >maintained server. Much like DNS, the root contains very little since
> > >it is a heavily loaded service. The real data about a domain is kept
> > >at that domain's nameserver. 
> > 
> > That's an implementation issue. There are as many advantages to a
> > centralized system as there are to a distributed one. For instance, it is
> > difficult to provide any interesting directory service if the data is never
> > centralized in one place. These architectures are not incompatible either.
> > The data can be distributed, and the resolution/search services
> > centralized. For instance, in the RealName System, the metadata is
> > maintained in distributed RDF files (saved on our customers' Web sites),
> > but the resolvers aggregate the metadata in order to build coherent results
> > lists. It really depends on the scope of the namespace anyway.
> 
> Yep. If all you want is a syntactic match on the actual registered Unicode
> plus one or two minor discriminators then there isn't really a need for
> all of the data to be in the root. It's also doubtful that a directory with
> that much centralized data could scale all that well.
> 
> > >Exactly. But I also want to allow for a very low barrier to entry for
> > >non-businesses. In my proposal (out this week sometime) there is a
> > >.....
> > >"McDonalds" via some registrar. Now, if my neighbor's 12-year-old son
> > >is also known as "McDonalds" in his online game, he can also register
> > >that identifier. When someone requests "McDonalds" they can get both or
> > >none depending on if they have requested that unqualified entries be
> > >turned off.
> > 
> > Yes, but you are throwing a new concept at the user. I am not saying that
> > the concept is bad. But you are asking a "naive" user to understand
> > that there are qualified and unqualified names, and that he/she needs to
> > manipulate a preference in order to filter results in and out. With users,
> > less is always better. Did you know that one of the most frequently typed
> > queries in a search engine is "www.yahoo.com"?
> 
> Sure. But it's analogous to what they do in real life. People routinely
> deal with the same "name" being used for many different things. I also
> expect that different communities will treat qualified vs unqualified
> differently. I.e. the online gaming community would optimize for
> the unqualified while the consumer/e-commerce market would optimize for the 
> qualified. But, as with DNS, there are always features that can fall by the
> wayside if unused (MAILA and MAILB?).
> 
> > As a side note, one interesting thing that we are doing to reduce the
> > noise in our results list is to use the "popularity" of an identifier to
> > order the results. So if you type "books", you get a list with "Barnes &
> > Noble Books" and "Amazon.com Books" but "Nico's Books" is way below although
> > all these entries have the same relevance for the query "books". That's
> > because one of the properties of the resource characteristics is "usage" (a
> > function of how many times the identifier has been resolved). Of course this
> > only shows the value of going beyond simple textual search.
> 
> This is an interesting additional context. I'm not sure how to gauge
> popularity from an architectural standpoint, though. Hit metering has
> always been an issue discussed from time to time. Popularity is also
> a relative vector since a particular resource may be popular to one 
> community while completely unknown by another. 
> 
> -MM
> 
> 
> > -----Original Message-----
> > From: Michael Mealling [mailto:michael@bailey.dscga.com]
> > Sent: Monday, October 12, 1998 6:59 AM
> > To: nico@centraal.com
> > Cc: michaelm@netsol.com; nico@centraal.com; masinter@parc.xerox.com;
> > uri@bunyip.com
> > Subject: Re: draft-mealling-human-friendly-identifier-req-00.txt
> > 
> > 
> > Nicolas Popp said this:
> > > >Yes and no. ;-) From my experience users want an interesting mix of both.
> > > >They want navigation (unique lookup) most of the time. But when something
> > > >changes they want the navigation to be able to detect that and turn it
> > > >into a search. Also, navigation is when you have a known quantity. The
> > > >intent is that the known quantity is slightly more complex than the
> > > >friendly identifier itself. I.e. if I see "go:Joe's Pizza" on the side
> > > >of a bus and I type it in then I'm not all that concerned about
> > > >getting a search (most users would expect it since it's what they're
> > > >accustomed to with the yellow pages). Now, once I've done that and
> > > >selected the one I want, if I type in "go:Joe's Pizza" again I should get
> > > >the same one back. Only if I expand the identifier (specify a search
> > > >explicitly) do I get the original list back.
> > > 
> > > Let me just add a few "practical points" about the value of uniqueness.
> > > First, you will never see "go:Joe's Pizza" on the side of a bus unless
> > > the identifier is unique.
> > 
> > Sure. And it should be unique for any given geographic area or else
> > Joe has a very lucrative trademark case to make.
> > 
> > > Joe would not pay for it otherwise. Of course, this is
> > > the registrant's viewpoint. Nevertheless, for me as a user, it is also
> > > very important that Joe be ready to let me know about his HFI. If it had
> > > not been for the bus, I would have never learned how to find this
> > > restaurant. Also, if a simple action such as "go:Joe's Pizza" returns a
> > > list of 100 possible destinations, "Joe's Pizza" seems like a lousy
> > > resource "identifier" to me. Why would I even bother remembering it?
> > 
> > Correct. But I'm also assuming that there are a few very controlled 
> > discriminators that go along with the match request. At a minimum
> > it would be geographic location and industry segment. Industry segment
> > allows the user to ask for "Apple" and get Apple Computing or Apple Music.
> > How this information is gotten from the user is a human factors issue.
> > I'm willing to bet it's easily discernible from local browsing patterns.
> > Location is a bit more difficult since there isn't a standard out there
> > for it. The research I've done is showing that Country-<n number of
> > administrative regions>-City-<n number of neighborhood designations> is
> > sufficient for most situations.
> > 
> > > >Yep. It's search, but it's search only on the identifier. Not on the data
> > > >represented by the identifier....
> > > 
> > > Hmm, but then your query will return "Joe's Shoes", "Joe's BookStore",
> > > and "Joe's fabulous stamp collection". Then you quickly become as bad
> > > as today's search engines, especially on generic-term queries. I guess I
> > > am back to the scalability problem...
> > 
> > Nope. You still search on "Joe's Pizza" since that's the name. If Joe is
> > silly enough to just put "go:Joe" on the side of a bus then he's too far
> > gone for help. I.e. registrants should use their trademarks. But
> > substring matches should be allowed if the user requests it. "Joe's Pizza"
> > should never match "Joe's Shoes". "Joe's" would match all of them in a
> > given industry segment/location. That list wouldn't be too outrageous.
> > 
> > > >This is my intent. I'm working on an architectural draft now. It assumes
> > > >that what is returned is an RDF object that has a minimal schema but that
> > > >can be extended by RDF methods to include whatever information the
> > > >owner of the identifier wants...
> > > 
> > > Interesting. Our resolvers do something like that, except that we used
> > > our own XML tags. Try
> > > http://www.realnames.com/resolver.dll?action=query&realName=books&contentType=xml.
> > > Of course, it should be RDF (we are fixing that). If you subscribe
> > > with RealName, you get an RDF file, though.
> > 
> > Cool. The RDF Schema group is just about wrapping up their first draft.
> > Between that and RDF itself we should have a format that allows a
> > standard schema for the entire system but also allows for
> > community-specific information where appropriate. For example, I could
> > very well see the chemistry community using something like this. They
> > could get an HFI for each chemical. The common schema just doesn't give
> > them the data they want so they extend the RDF on their local server so
> > that it contains their own schema.
> > 
> > One point for clarity: I also think it is very important that the root
> > only contain enough data to handle matches and a referral to the locally
> > maintained server. Much like DNS, the root contains very little since
> > it is a heavily loaded service. The real data about a domain is kept
> > at that domain's nameserver. 
> > 
> > > >hehe. The IETF has developed a very strong immune response to policy. ;-)
> > > >Seriously, in what context? As far as trademark disputes, operational
> > > >and registration issues, or more the exact mechanics of how an end user
> > > >updates their HFI?
> > > 
> > > Management of the namespace is important to ensure that the identifiers
> > > and the metadata are of good quality. If you don't do that, you get a lot
> > > of noise. As an example, consider the quality of the META HTML tags.
> > > Everyone abuses them. Even respected vendors embed their competitors'
> > > product names in their homepage's meta tags so that they can be found in
> > > all circumstances (in our example, Joe from "Joe's Pizza" will try to get
> > > "Fred's Pizza").
> > 
> > Exactly. But I also want to allow for a very low barrier to entry for
> > non-businesses. In my proposal (out this week sometime) there is a 
> > distinction made in the ranking of responses from the root. There are
> > "qualified" and "unqualified" entries. Qualified entries are made by
> > a registrar that has met certain legal/contractual obligations for what
> > it puts in the root. It can guarantee trademark compliance, a level
> > of service for the data, etc. Unqualified entries are those made by the
> > masses at large and have a low barrier to entry. For example, the gaming
> > community is searching for some system that can handle all of the myriads
> > of nicknames they use on their online games. There is overlap and the
> > data maintained behind the name isn't all that good. But it is of value
> > to that community.
> > 
> > There is one rule though that makes this useful: when ranking, qualified
> > entries always outrank unqualified entries. Example: McDonald's registers
> > "McDonalds" via some registrar. Now, if my neighbor's 12-year-old son
> > is also known as "McDonalds" in his online game, he can also register
> > that identifier. When someone requests "McDonalds" they can get both or
> > none depending on if they have requested that unqualified entries be
> > turned off.
> > 
> > (It also gives trademark holders an easy way to see who is infringing...)
> > 
> > 
> > 
> > 
> > > ----------------------------------------------------------------------
> > > 
> > > Nicolas Popp said this:
> > > > Michael.
> > > > 
> > > > Let me start by saying that I am truly excited by your submission to
> > > > the IETF. As you probably know, Centraal has developed a "human
> > > > friendly namespace" for commercial Web pages (http://www.centraal.com).
> > > 
> > > Yep. I've been following you guys for a while...
> > > 
> > > > Therefore, I am extremely pleased by your interest in capturing the
> > > > general requirements for Human Friendly Namespaces. I would actually
> > > > really like to meet with you face to face to discuss these issues in
> > > > more detail.
> > > > In the meantime, I could not resist sending you some preliminary
> > > > comments.
> > > 
> > > Sure. You guys are on the left coast, right? I could always fly out
> > > sometime to talk about this more in depth. 
> > > 
> > > Let me preface my remarks and that document with a bit of background.
> > > As with most IETF requirements documents, that one was written
> > > with a non-vague solution in mind. It really isn't meant to be
> > > an exhaustive set of requirements for all methods of user oriented
> > > navigation or search (now there's a PhD thesis). Instead, it is
> > > meant to outline a set of objectives for a potential working group.
> > > Once those requirements are met the working group can be declared
> > > done and disbanded. In the case of these systems, there are several
> > > potential solution spaces that have slightly different paths.
> > > 
> > > As with all requirements documents though, it is malleable. If
> > > the working group decides that a feature is really a requirement
> > > then it can be added. Also, if the Area Directors are shown that
> > > the entire Working Group thinks that a particular requirement
> > > needs to be scrapped then that is also possible.
> > > 
> > > > So, here they are:
> > > >
> > > > >Succinctly stated, the requirements that are considered out of scope
> > > > >are generic search/navigation [...]
> > > > 
> > > > I tend to disagree with this. Navigation and search services are core
> > > > services of a human friendly namespace. Hence, they should be part of
> > > > the requirements. I look at them as the equivalent of the name resolution
> > > > services in the URN specification. They are intimately related. Also, by
> > > > making these services part of the specification, we open the doors to
> > > > the standardization of the search interface, which in turn will
> > > > facilitate their integration in applications and Web sites like the
> > > > search engines.
> > > 
> > > If this was any other organization than the IETF I would have to agree
> > > with you. This requirement is more of a finessing of the IETF process
> > > than anything. My hope is to form a working group. In order to do
> > > that the Area Directors and those interested in the subject generally
> > > want to know that you are solving a portion of the problem space that
> > > is actually doable. Historically the IETF has been very reluctant to
> > > solve generic search/navigation problems. I ran into this at the last
> > > meeting where I tried to form the METAD group to look at merging
> > > the whois++ and rwhois specs.
> > > 
> > > What is important is what is left _unsaid_. While generic search and
> > > navigation are not requirements there is an unstated requirement
> > > that any system such as this should be engineered with an eye toward
> > > being an integral part of a system that does support generic search
> > > and navigation. I.e. we can always go beyond the requirements if it's
> > > doable and doesn't impact other requirements....
> > > 
> > > Keith Moore (Applications Area Director) calls those "weasel words". ;-)
> > > 
> > > I detect some vagueness in my wording. I should be clearer and say that
> > > this search and navigation feature is over the data behind the identifier.
> > > Without search/navigation of the identifier itself it wouldn't be all
> > > that useful. I.e. if we couch this as a generic directory service the
> > > IETF will run away screaming "overly broad scope!"
> > > 
> > > > >N-to-N mapping: A single identifier should be capable of being used by
> > > > >two separate entities.
> > > > >Conversely, an entity should be capable of having more than one
> > > > >identifier.
> > > > 
> > > > Does it mean that non-uniqueness is tolerated (1-N)? Personally, I regard
> > > > uniqueness of the identifier as an important asset for a human friendly
> > > > namespace. Although not all the namespaces will be able to enforce
> > > > uniqueness, should it not be encouraged? Uniqueness enables direct
> > > > navigation (accessing a resource using its HFI in one single and simple
> > > > step). Direct navigation is far more user friendly than navigation
> > > > through search.
> > > 
> > > Yes and no. ;-) From my experience users want an interesting mix of both.
> > > They want navigation (unique lookup) most of the time. But when something
> > > changes they want the navigation to be able to detect that and turn it
> > > into a search. Also, navigation is when you have a known quantity. The
> > > intent is that the known quantity is slightly more complex than the
> > > friendly identifier itself. I.e. if I see "go:Joe's Pizza" on the side
> > > of a bus and I type it in then I'm not all that concerned about
> > > getting a search (most users would expect it since it's what they're
> > > accustomed to with the yellow pages). Now, once I've done that and
> > > selected the one I want, if I type in "go:Joe's Pizza" again I should get
> > > the same one back. Only if I expand the identifier (specify a search
> > > explicitly) do I get the original list back.
> > > 
> > > Also, there is the concept of architecturally enforced uniqueness and
> > > policy enforced uniqueness. I think that, architecturally, the system
> > > should allow 1-N but that, in certain cases such as trademark
> > > infringement, the policies of those inserting the identifiers could
> > > preclude such an arrangement.
> > > 
> > > Also, 1-N is a case that is massaged by the fact that the client is
> > > _strongly_ encouraged to send discriminating contexts such as 
> > > location and assumed industry/topic segment. My hunch is that for
> > > most businesses this will give a fairly high level of uniqueness. If not
> > > then I bet someone has a trademark infringement case waiting to happen.
> > > 
> > > > >Matching semantics: At the least, substring matches are required. Other
> > > > >methods of matching should be evaluated based on performance and
> > > > >ability to give the user an accurate result set.
> > > > 
> > > > Isn't that search?
> > >
> > > Yep. It's search, but it's search only on the identifier. Not on the data
> > > represented by the identifier....
> > > 
> > > > Can it scale if the only search capabilities that you
> > > > offer are textual search on the names?
> > > 
> > > For its intended, limited scope, yes. But this doesn't mean that the 
> > > system can't add features or be extended later. These are IETF
> > > requirements, not exhaustive features. Think of it this way, if these
> > > minimal things aren't done then it fails and you don't get a standard.
> > > That doesn't mean you can't do more....
> > > 
> > > > Can we extend the HFI namespace from the
> > > > simple notion of aliased address to the notion of a space of named
> > > > resource characteristics? This is more interesting than the simple
> > > > mapping of a name to a physical locator (e.g. HFI to URL or HFI to email
> > > > address). If we allow the namespace to be about metadata, then we can
> > > > also develop a general framework for precise search and navigation
> > > > (search/directory service infrastructure). For instance, if my namespace
> > > > is about restaurants and supports the meta property "geographical
> > > > region", a user would be able to find the unique "pizza hut in palo-alto"
> > > > (formally, hfi:pizza hut + region:palo-alto; please do not draw any
> > > > conclusion about my food habits).
> > > 
> > > This is my intent. I'm working on an architectural draft now. It assumes
> > > that what is returned is an RDF object that has a minimal schema but that
> > > can be extended by RDF methods to include whatever information the
> > > owner of the identifier wants...
> > > 
> > > This is where the generic search from above comes in. Once that data
> > > is in their local resolvers, it is a rather simple operation to build
> > > a real directory service on top of it...
> > > 
> > > > >The identifier should be capable of expressing hierarchy. In some cases
> > > > >it makes sense for an identifier to appear to belong to a hierarchy. But
> > > > >this is merely a capability. It is not a hierarchy. It is expected that
> > > > >hierarchical identifiers will be a distinct minority.
> > > > 
> > > > In the case of a namespace of resource characteristics, the properties
> > > > of the resource can be used to create structure and hierarchies within
> > > > the namespace while not sacrificing the human friendliness of the name
> > > > (e.g. the "region" would organize the names according to the region
> > > > hierarchy, a "yellow page category" property would organize the names
> > > > according to the classification hierarchy, etc... but still no slashes
> > > > or dots in the HFI).
> > > 
> > > Yes. By hierarchy I'm thinking more of an ownership type of
> > > system like DNS. I.e. once you register "foo", do you then get
> > > administrative control over "foo bar", or instead is it "foo/bar"?
> > > But I think you are right though. There really is no difference when
> > > we're talking about flat spaces and non-uniqueness of identifiers.
> > > I.e. "foo bar" and "foo/bar" are essentially the same. The only
> > > difference is that "foo/bar" may be specified at an intranet level in a
> > > local server instead of the flat root. I think this could be a toss up.
> > > Either one could be planned for. I do think a bit of vanity comes 
> > > into it for end users. I think they like to think they have 'control'.
> > > 
> > > > Lastly, do you think that it would be useful to consider the following:
> > > > 
> > > > . Persistence of the names
> > > 
> > > hmmm... there might be a slight divergence of intent here. First let me
> > > preface this by saying that I am intentionally not talking about names.
> > > The 6+ years I've spent in the URN struggle have specified one clear
> > > thing: no one really knows what the word "name" means. ;-) For most
> > > of the folx in the URI groups in the IETF it has a specific connotation
> > > of infinite persistence. For the kind of system I'm thinking about,
> > > persistence of the HFI is not required. They can be extremely short lived.
> > > What is persistent is the URN pointing to the resource that is included
> > > in the RDF metadata that was returned during the HFI resolution.
> > > 
> > > > . Administration of the namespace
> > > 
> > > hehe. The IETF has developed a very strong immune response to policy. ;-)
> > > Seriously, in what context? As far as trademark disputes, operational
> > > and registration issues, or more the exact mechanics of how an end user
> > > updates their HFI?
> > > 
> > > > . Services definition ( F2L, F2F, etc...)
> > > 
> > > Good question. They could follow the URI ones: I.e. I2L, I2N, I2C.
> > > (we got rid of the N2L in favor of I2L since the input types itself.)
> > > Strategically though I think it would be a good thing to default to
> > > returning fairly rich metadata. If you don't then people still default
> > > to just returning the URL and you don't get enough data in the leaf
> > > nodes to support a good directory service as an add on. 
> > > 
> > > 
> > > This has been nice. It's hard to find people who can understand these
> > > issues and talk about them intelligently. Feel free to continue the
> > > email exchange since I'd like to keep a record of this conversation.
> > > There are some good points here I'd like to preserve...
> > > 
> > > -MM
> > > 
> > > -- 
> > > --------------------------------------------------------------------------------
> > > Michael Mealling        |      Vote Libertarian!       | www.rwhois.net/michael
> > > Sr. Research Engineer   |   www.ga.lp.org/gwinnett     | ICQ#: 14198821
> > > Network Solutions       |          www.lp.org          | michaelm@netsol.com
> > > 
> > > -----Original Message-----
> > > From: Michael Mealling [mailto:michael@bailey.dscga.com]
> > > Sent: Friday, October 09, 1998 7:11 AM
> > > To: nico@centraal.com
> > > Subject: Re: draft-mealling-human-friendly-identifier-req-00.txt
> > > 
> > > 
> > > Nicolas Popp said this:
> > > > Michael.
> > > > 
> > > > Let me start by saying that I am truly excited by your submission to
> > > > the IETF. As you probably know, Centraal has developed a "human
> > > > friendly namespace" for commercial Web pages (http://www.centraal.com).
> > > 
> > > Yep. I've been following you guys for a while...
> > > 
> > > > Therefore, I am extremely pleased by your interest in capturing the
> > > > general requirements for Human Friendly Namespaces. I would actually
> > > > really like to meet with you face to face to discuss these issues in
> > > > more detail.
> > > > In the meantime, I could not resist sending you some preliminary
> > > > comments.
> > > 
> > > Sure. You guys are on the left coast, right? I could always fly out
> > > sometime to talk about this more in depth. 
> > > 
> > > Let me preface my remarks and that document with a bit of background.
> > > As with most IETF requirements documents, that one was written
> > > with a non-vague solution in mind. It really isn't meant to be
> > > an exhaustive set of requirements for all methods of user oriented
> > > navigation or search (now there's a PhD thesis). Instead, it is
> > > meant to outline a set of objectives for a potential working group.
> > > Once those requirements are met the working group can be declared
> > > done and disbanded. In the case of these systems, there are several
> > > potential solution spaces that have slightly different paths.
> > > 
> > > > As with all requirements documents, though, it is malleable. If
> > > > the working group decides that a feature is really a requirement
> > > > then it can be added. Also, if the Area Directors are shown that
> > > > the entire Working Group thinks that a particular requirement
> > > > needs to be scrapped then that is also possible.
> > > 
> > > > > So, here they are:
> > > > > 
> > > > > >Succinctly stated, the requirements that are considered out of
> > > > > >scope are generic search/navigation [...]
> > > > 
> > > > > I tend to disagree with this. Navigation and search services are core
> > > > > services of a human friendly namespace. Hence, they should be part of
> > > > > the requirements. I look at them as the equivalent of the name
> > > > > resolution services in the URN specification. They are intimately
> > > > > related. Also, by making these services part of the specification, we
> > > > > open the door to the standardization of the search interface, which in
> > > > > turn will facilitate their integration into applications and Web sites
> > > > > such as the search engines.
> > > 
> > > > If this was any other organization than the IETF I would have to agree
> > > with you. This requirement is more of a finessing of the IETF process
> > > than anything. My hope is to form a working group. In order to do
> > > that the Area Directors and those interested in the subject generally
> > > want to know that you are solving a portion of the problem space that
> > > is actually doable. Historically the IETF has been very reluctant to
> > > > solve generic search/navigation problems. I ran into this at the last
> > > > meeting where I tried to form the METAD group to look at merging 
> > > > the whois++ and rwhois specs.
> > > 
> > > What is important is what is left _unsaid_. While generic search and
> > > navigation are not requirements there is an unstated requirement
> > > that any system such as this should be engineered with an eye toward
> > > being an integral part of a system that does support generic search
> > > > and navigation. I.e. we can always go beyond the requirements if it's
> > > > doable and doesn't impact other requirements....
> > > 
> > > Keith Moore (Applications Area Director) calls those "weasel words". ;-)
> > > 
> > > > I detect some vagueness in my wording. I should be clearer and say that
> > > > this search and navigation feature is over the data behind the
> > > > identifier. Without search/navigation of the identifier itself it
> > > > wouldn't be all that useful. I.e. if we couch this as a generic
> > > > directory service the IETF will run away screaming "overly broad scope!"
> > > 
> > > > > >N-to-N mapping: A single identifier should be capable of being used
> > > > > >by two separate entities. Conversely, an entity should be capable
> > > > > >of having more than one identifier.
> > > > 
> > > > > Does it mean that non-uniqueness is tolerated (1-N)? Personally, I
> > > > > regard uniqueness of the identifier as an important asset for a human
> > > > > friendly namespace. Although not all namespaces will be able to
> > > > > enforce uniqueness, should it not be encouraged? Uniqueness enables
> > > > > direct navigation (accessing a resource using its HFI in one single
> > > > > and simple step). Direct navigation is far more user friendly than
> > > > > navigation through search.
> > > 
> > > > Yes and no. ;-) From my experience users want an interesting mix of
> > > > both. They want navigation (unique lookup) most of the time. But when
> > > > something changes they want the navigation to be able to detect that
> > > > and turn it into a search. Also, navigation is when you have a known
> > > > quantity. The intent is that the known quantity is slightly more
> > > > complex than the friendly identifier itself. I.e. if I see
> > > > "go:Joe's Pizza" on the side of a bus and I type it in then I'm not
> > > > all that concerned about getting a search (most users would expect it
> > > > since it's what they're accustomed to with the yellow pages). Now,
> > > > once I've done that and selected the one I want, if I type in
> > > > "go:Joe's Pizza" again I should get the same one back. Only if I
> > > > expand the identifier (specify a search explicitly) do I get the
> > > > original list back.
> > > 
> > > > Also, there is the concept of architecturally enforced uniqueness and
> > > > policy enforced uniqueness. I think that, architecturally, the system
> > > > should allow 1-N but that, in certain cases such as trademark
> > > > infringement, the policies of those inserting the identifiers could
> > > > preclude such an arrangement.
> > > 
> > > Also, 1-N is a case that is massaged by the fact that the client is
> > > _strongly_ encouraged to send discriminating contexts such as 
> > > location and assumed industry/topic segment. My hunch is that for
> > > most businesses this will give a fairly high level of uniqueness. If not
> > > then I bet someone has a trademark infringement case waiting to happen.
> > > 
> > > > > >Matching semantics: At the least, substring matches are required.
> > > > > >Other methods of matching should be evaluated based on performance
> > > > > >and ability to give the user an accurate result set.
> > > > 
> > > > Is not that search? 
> > > 
> > > > Yep. It's search, but it's search only on the identifier, not on the
> > > > data represented by the identifier....
> > > 
> > > > > Can it scale if the only search capabilities that you
> > > > > offer are textual search on the names? 
> > > 
> > > > For its intended, limited scope, yes. But this doesn't mean that the 
> > > > system can't add features or be extended later. These are IETF
> > > requirements, not exhaustive features. Think of it this way, if these
> > > minimal things aren't done then it fails and you don't get a standard.
> > > That doesn't mean you can't do more....
> > > 
> > > > > Can we extend the HFI namespace from the simple notion of an aliased
> > > > > address to the notion of a space of named resource characteristics?
> > > > > This is more interesting than the simple mapping of a name to a
> > > > > physical locator (e.g. HFI to URL or HFI to email address). If we
> > > > > allow the namespace to be about metadata, then we can also develop a
> > > > > general framework for precise search and navigation (search/directory
> > > > > service infrastructure). For instance, if my namespace is about
> > > > > restaurants and supports the meta property "geographical region", a
> > > > > user would be able to find the unique "pizza hut in palo-alto"
> > > > > (formally, hfi:pizza hut + region:palo-alto; please do not draw any
> > > > > conclusion about my food habits).
> > > 
> > > > This is my intent. I'm working on an architectural draft now. It assumes
> > > > that what is returned is an RDF object that has a minimal schema but
> > > > that can be extended by RDF methods to include whatever information the
> > > > owner of the identifier wants...
> > > 
> > > This is where the generic search from above comes in. Once that data
> > > is in their local resolvers, it is a rather simple operation to build
> > > a real directory service on top of it...
> > > 
> > > > > >The identifier should be capable of expressing hierarchy. In some
> > > > > >cases it makes sense for an identifier to appear to belong to a
> > > > > >hierarchy. But this is merely a capability. It is not a hierarchy.
> > > > > >It is expected that hierarchical identifiers will be a distinct
> > > > > >minority.
> > > > 
> > > > > In the case of a namespace of resource characteristics, the properties
> > > > > of the resource can be used to create structure and hierarchies within
> > > > > the namespace while not sacrificing the human friendliness of the name
> > > > > (e.g. the "region" property would organize the names according to the
> > > > > region hierarchy, a "yellow page category" property would organize the
> > > > > names according to the classification hierarchy, etc., but still no
> > > > > slashes or dots in the HFI).
> > > 
> > > > Yes. By hierarchy I'm thinking more of an ownership type of
> > > > system like DNS. I.e. once you register "foo", do you then have
> > > > administrative control over "foo bar", or is it instead "foo/bar"?
> > > > But I think you are right, though. There really is no difference when
> > > > we're talking about flat spaces and non-uniqueness of identifiers.
> > > > I.e. "foo bar" and "foo/bar" are essentially the same. The only
> > > > difference is that "foo/bar" may be specified at an intranet level in
> > > > a local server instead of the flat root. I think this could be a toss
> > > > up. Either one could be planned for. I do think a bit of vanity comes
> > > > into it for end users. I think they like to think they have 'control'.
> > > 
> > > > > Lastly, do you think that it would be useful to consider the following:
> > > > 
> > > > . Persistence of the names
> > > 
> > > > hmmm... there might be a slight divergence of intent here. First let me
> > > > preface this by saying that I am intentionally not talking about names.
> > > > The 6+ years I've spent in the URN struggle have made one thing
> > > > clear: no one really knows what the word "name" means. ;-) For most
> > > > of the folx in the URI groups in the IETF it has a specific connotation
> > > > of infinite persistence. For the kind of system I'm thinking about,
> > > > persistence of the HFI is not required. They can be extremely short
> > > > lived. What is persistent is the URN pointing to the resource that is
> > > > included in the RDF metadata that was returned during the HFI
> > > > resolution.
> > > 
> > > > . Administration of the namespace
> > > 
> > > > hehe. The IETF has developed a very strong immune response to policy. ;-)
> > > > Seriously, in what context? As far as trademark disputes, operational
> > > > and registration issues, or more the exact mechanics of how an end user
> > > > updates their HFI?
> > > 
> > > > . Services definition ( F2L, F2F, etc...)
> > > 
> > > Good question. They could follow the URI ones: I.e. I2L, I2N, I2C.
> > > (we got rid of the N2L in favor of I2L since the input types itself.)
> > > > Strategically, though, I think it would be a good thing to default to 
> > > returning fairly rich metadata. If you don't then people still default
> > > to just returning the URL and you don't get enough data in the leaf
> > > nodes to support a good directory service as an add on. 
> > > 
> > > 
> > > > This has been nice. It's hard to find people who can understand these
> > > issues and talk about them intelligently. Feel free to continue the
> > > email exchange since I'd like to keep a record of this conversation.
> > > There are some good points here I'd like to preserve...
> > > 
> > > -MM
> > > 
> > > -- 
> > >
> >
> ----------------------------------------------------------------------------
> > > ----
> > > Michael Mealling	|      Vote Libertarian!       |
> > > www.rwhois.net/michael
> > > Sr. Research Engineer   |   www.ga.lp.org/gwinnett     | ICQ#:
> > > 14198821
> > > Network Solutions	|          www.lp.org          |
> > > michaelm@netsol.com
> > > 
> > 
> > 
> > -- 
> >
> ----------------------------------------------------------------------------
> > ----
> > Michael Mealling	|      Vote Libertarian!       |
> > www.rwhois.net/michael
> > Sr. Research Engineer   |   www.ga.lp.org/gwinnett     | ICQ#:
> > 14198821
> > Network Solutions	|          www.lp.org          |
> > michaelm@netsol.com
> > 
> 
> 
> -- 
> ----------------------------------------------------------------------------
> ----
> Michael Mealling	|      Vote Libertarian!       |
> www.rwhois.net/michael
> Sr. Research Engineer   |   www.ga.lp.org/gwinnett     | ICQ#:
> 14198821
> Network Solutions	|          www.lp.org          |
> michaelm@netsol.com
> 


-- 
--------------------------------------------------------------------------------
Michael Mealling	|      Vote Libertarian!       | www.rwhois.net/michael
Sr. Research Engineer   |   www.ga.lp.org/gwinnett     | ICQ#:         14198821
Network Solutions	|          www.lp.org          |  michaelm@netsol.com
Received on Tuesday, 13 October 1998 11:25:51 UTC