Re: FOAF URI Scheme

From: Dirk-Willem van Gulik <dirkx@asemantics.com> · Date: Mon, 1 Mar 2004 15:57:16 +0100

>> Anyway - as I've been developing away - I've come up with the idea of a
>> foaf uri - that allows webservices to communicate foaf notification
>> events between one-another. Basically:
...
>> href="foaf://da75ef7d77a785a75da75a75ddd75751241/added/" />

You may be interested in some work we've been doing on experimental
foaf URNs. URNs are a specific, location-independent class of URIs
with some very specific properties (see RFC 2141, RFC 2276). We then
resolve those using DDDS (see RFC 3401 to 3405).

In short:
======

1.	Given a mailbox:	dirkx@webweaving.org
	Or someone's sha1	3b0c3795b1496c64d621e5c5094adc74528b7dcd

2.	compose a URN (see below about the namespace 'foaf'):

		urn:foaf:dirkx@webweaving.org
		urn:foaf:3b0c3795b1496c64d621e5c5094adc74528b7dcd

3.	And let's now 'agree' that this FOAF URN denotes the owner, the person
	described by the FOAF, rather than the RDF.

4.	Then use the Dynamic Delegation Discovery System (DDDS, see below)
	to resolve those into:

	I2L		A Location, i.e. the URL du jour. This URL can be
			as simple as mailto:foo@bar.com. Note that
			in some cases this is not quite trivial; e.g.
			consider a URN of the satellite picture of today's
			cloud cover, which resolves to different long
			term URLs.

	I2R		A resource, i.e. something you do a GET on; say
			for example your home page; e.g.

				http://www.apache.org/~dirkx/

			or perhaps resolves to the vCard or a FOAF-naut
			link (try a random foaf urn below - as the latter is
			the default).

	I2C		A URC, Characteristics, i.e. metadata about the object,
			say for example an HTTP-GET on the raw RDF of
			your foaf:
				http://foaf.asemantics.com/zac

	Using the protocol(s) negotiated in the DDDS protocol (i.e. plain
	http-get, soap, dns) and the list of server names, IP addresses and
	port numbers retrieved. In the above I've kept it simple and
	only asked for flat URIs to be returned. See LSID, the Life
	Sciences Identifier, for a more advanced SOAP/WSDL based
	final retrieval protocol.
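
As a rough illustration of steps 1 and 2 above, here is a minimal Python
sketch that composes both URN forms. The function names are made up for
this example. The hashed form assumes the usual FOAF mbox_sha1sum
convention (SHA-1 over the mailto: URI); the demo may derive its hashes
differently.

```python
import hashlib


def foaf_urn_from_mailbox(mailbox: str) -> str:
    """Compose the mailbox form, e.g. urn:foaf:dirkx@webweaving.org."""
    local, sep, domain = mailbox.partition("@")
    if not (local and sep and domain):
        raise ValueError(f"not a mailbox: {mailbox!r}")
    return f"urn:foaf:{local}@{domain}"


def foaf_urn_from_sha1(mailbox: str) -> str:
    """Compose the hashed form.

    Assumption: the hash is the SHA-1 of the 'mailto:' URI, as in
    foaf:mbox_sha1sum. This is not confirmed for the demo.
    """
    digest = hashlib.sha1(f"mailto:{mailbox}".encode("utf-8")).hexdigest()
    return f"urn:foaf:{digest}"


if __name__ == "__main__":
    print(foaf_urn_from_mailbox("dirkx@webweaving.org"))
    print(foaf_urn_from_sha1("dirkx@webweaving.org"))
```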

So in short, what this allows you to do is, given a foaf URN, to get
either the current URL for it, the RDF, or something more fancy over
a protocol you are able to speak. (Note that one could argue that this
is also usable as a bNode/smushing solution.)

Note that the two key things are:

->	Given a URN; be able to separately
	ascertain how to get the actual thing, the current
	location of the thing -or- inform about the thing.

	I.e. do not give me the HTML page, give me
	metadata ABOUT that page; as a graph, tell
	me about provenance; who said what.

->	Create some level of abstraction between a URL,
	associated with he who manages the object and
	the entity which manages names.

If you are not one for reading a lot, try these URLs; shown is the
output of a piece of perl (java also available) which walks the
entire RDF resolution process.

Zac:	Foaf: 		http://foaf.asemantics.com/zac
	Assigned: 	urn:foaf:3b0c3795b1496c64d621e5c5094adc74528b7dcd  
	Show:		http://demo.asemantics.com/biz/rdfa/walker.pl?urn:foaf:3b0c3795b1496c64d621e5c5094adc74528b7dcd

Alberto:
	Foaf:			http://foaf.asemantics.com/alberto
	Assigned:		urn:foaf:areggiori@webweaving.org
	Show:		http://demo.asemantics.com/biz/rdfa/walker.pl?urn:foaf:areggiori@webweaving.org

What you see is a URN being 'resolved'. Or fetch either
the perl or java from the URLs below to play:

	http://foaf-demo.asemantics.com/java-urn-dds-0.01.tgz
	http://foaf-demo.asemantics.com/perl-urn-ddds-0.01.tgz

Or play with this wizard to see what needs to be put in DNS:

	http://foaf-demo.asemantics.com/

Note that the above comes with -a-lot-of-strings-attached- as
we very intentionally have not officially registered the FOAF
URN namespace with IANA. We are committed to doing so once we
see some consensus on the structure of the namespace
desired (see below).

The Long.
========

Most people here will notice that there are several enormous ratholes
I kind of skirted in the above; so here is my attempt to correct this:

To start at the beginning:

RFC 1737
  	Functional Requirements for Uniform Resource Names

defines a global, ubiquitous class of URIs which anywhere, regardless
of context, denote the same resource.

The reason why we, as a company/developers, are interested is that
among our customers we find that any URL based 'referral' mechanism
is met with immediate questions like "what happens if that domain
name is re-assigned or sold? what happens if that organisation
ceases to exist?" and a genuine desire to deal with federations
in as neutral a way as possible. Our concerns are mostly with
governments, large libraries and space agencies; but see
the Life Sciences ID project -

	http://www.i3c.org/wgr/ta/resources/lsid/docs/

for another group struggling with much the same problem. Note that
this is largely a social/legislative problem; i.e. we are trying to model
a naming/management system in the internet world as closely as
possible on existing procedures, laws, workflows and protocols, rather
than insist everyone should webbify and adhere to a more
webby culture.

RFC 2276
      	 Architectural Principles of Uniform Resource Name Resolution

then goes on to introduce the concept of 'Resolution' - that is, given
a URN, how do I 'get' the 'thing', or where can I learn more 'about'
the thing.

It should be noted that the above is also suitable for most URLs, and
is something increasingly important for the semantic web: given a
resource, where do I find the metadata about it, some RDF describing it?

Key here is that many such resolution systems can exist; and that
depending on the resource at hand different systems will be needed.

E.g. a Library-of-Congress number assigned by the LoC will probably
be something the LoC wants to have a fair amount of control over,
whereas pinpointing a random article on a blog is something more
akin to a free for all.

This is where:

RFC 2611 / BCP 33
	URN Namespace Definition Mechanisms

namespaces come into play. I.e. within the URN world it is possible
to have numerous namespaces, which may differ in areas like:

->	central or decentralised management. Can anyone issue them
	at will, is there some central organisation, or is there an entity
	which is more of a loose federation whose members issue names
	together?

->	whether they can be 'resolved'; an ISSN number perhaps
	can be resolved, as someone in Paris keeps them unique
	and coordinates with the participating libraries, whereas
	an ISBN number cannot easily, as there is no concept
	of tracking who uses what number for which.

->	the social contract; i.e. how unique or reliable is the
	number; the British library is quite careful about not
	recycling their catalog numbers; ISBN numbers are
	very commonly recycled.

->	what is the commitment with respect to lifetime,
	with respect to ongoing management, etc.

->	What happens when the organisation issuing a
	name ceases to exist, merges, or when a collection
	has to be split ?

	(Note: for those wondering how this differs from
	a clever URL schema: each of the above can
	demonstrably be handled by URNs,
	even the last item in the list.)

Now for FOAF, in our demo, we have designed a namespace
with the following properties:

foaf URNs of the form:	
	urn:foaf:peter@pan.com

->	Resolution federated and delegated automagically
	to the entity responsible for 'pan.com'; and no
	central elements except for a registration at
	IANA, the Internet Assigned Numbers
	Authority (i.e. the folks assigning things like
	port 80 to http).

foaf URNs of the form:	
	urn:foaf:3b0c3795b1496c64d621e5c5094adc74528b7dcd

->	Resolution delegated to, and managed by, an owner-managed
	DNS zone in his own DNS on a first-come, first-served
	basis (i.e. if two people claim the same hash, the first gets
	it) - with optional email verification.

	This relies on a neutral delegation zone in DNS, currently
	on a demo server. This is generally NOT a good idea; so
	long term we probably want this type of URN to become

	urn:foaf:quq.org:3b0c3795b1496c64d621e5c5094adc74528b7dcd

	I.e. get it prefixed by the FQDN of a kind DNS owner who is willing
	to handle further delegation for people it likes (in this case 'quq.org')
	so that anyone can be a neutral (or not so neutral) delegator.
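
To make the three URN shapes discussed so far concrete (mailbox, bare
sha1, and sha1 prefixed by a delegator FQDN), here is a small Python
sketch that tells them apart. The names FoafUrn and parse_foaf_urn are
invented for this example, and the exact syntax rules are an assumption
since the namespace has not been formally defined yet.

```python
import re
from typing import NamedTuple, Optional


class FoafUrn(NamedTuple):
    kind: str                 # "mailbox" or "sha1"
    value: str                # the mailbox, or the 40-digit hex hash
    delegator: Optional[str]  # FQDN prefix of the delegating zone, if any


# Assumed syntax: optional "<fqdn>:" prefix followed by 40 hex digits.
_HASHED = re.compile(r"^urn:foaf:(?:([a-z0-9.-]+):)?([a-f0-9]{40})$", re.IGNORECASE)
# Assumed syntax: local-part@domain with no further colons.
_MAILBOX = re.compile(r"^urn:foaf:([^@:]+@[^:@]+)$", re.IGNORECASE)


def parse_foaf_urn(urn: str) -> FoafUrn:
    """Classify a foaf URN as one of the forms described in this mail."""
    m = _HASHED.match(urn)
    if m:
        return FoafUrn("sha1", m.group(2).lower(), m.group(1))
    m = _MAILBOX.match(urn)
    if m:
        return FoafUrn("mailbox", m.group(1), None)
    raise ValueError(f"not a recognised foaf URN: {urn!r}")


if __name__ == "__main__":
    for urn in (
        "urn:foaf:peter@pan.com",
        "urn:foaf:3b0c3795b1496c64d621e5c5094adc74528b7dcd",
    ):
        print(parse_foaf_urn(urn))
```

The mailbox form needs no delegator because the domain part of the
mailbox already says where to delegate to.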

So normally one now would use

	RFC 2611 / BCP 33
		URN Namespace Definition Mechanisms

to ask IANA to formally define the foaf namespace. This has intentionally
not been done, as the desired properties of the namespace are not quite clear
yet. But we're committed to help guide this through IANA once this community
has reached consensus on the namespace semantics. The above demo
relies on 'foaf.urn-arpa.asemantics.com' rather than the real
'foaf.urn.arpa'.

Once this is in place the DDDS can start; this is effectively a language/scheme
neutral algorithm which jumps through the required levels of indirection to
route you to the actual endpoint du jour. It should be noted that DDDS can
also be used for other types of URIs (try dig NAPTR http.uri.arpa. to get
some inspiration).

The definition in the foaf namespace right now consists of two
possible paths;

->	Path one for the peter@pan.com urn's; these are delegated
	straight to pan.com

->	Path two for the sha1 URNs. These need to go to a central (or,
	with the quq.org example above, multiple) friendly/neutral
	body, which then delegates to a DNS entry under the sha1
	owner's control.

This is done by installing a so called NAPTR level in a specific
tld for the foaf namespace: 'foaf.urn.arpa':

	!^urn:foaf:([^@]+)@([^:]+)(:?.*)!foaf_.\2!i
	!^urn:foaf:([a-f0-9]{10})([a-f0-9]{10})([a-f0-9]{10})([a-f0-9]{10})(:?.*)!\1.\2.\3.\4.foaf-sha1.net!i

(See a demo one installed in foaf.urn-arpa.asemantics.com. with the
commands)

	dig NAPTR  foaf.urn-arpa.asemantics.com
	dig NAPTR rules.foaf.asemantics.com.

In the case of urn:foaf:peter@pan.com this then rewrites the urn into

	pan.com

and the next step is to obtain the NAPTR record from there. What follows
is a sequence of steps in which you

->	choose if you want I2R, I2L or I2C, i.e. the resource, the location
	of the resource or information about the resource.

And while doing this

->	select what protocol you are willing to speak; i.e. http, smtp, soap (see
	LSID for a good rationale why they prefer that) or any other protocol
	you may have invented.

And finally, at the endpoint of all this you will get a URL or a list of IP and
port numbers (multiple; much like the MX record for failover and load balancing).

And at this point you can actually fetch the thing.

This looks like (just showing the terminal nodes which resolve to URIs, for
simplicity):

	"u" "http+I2L" "!^urn:foaf:([^@]+)@([^:]*)(:?)(.*)!mailto:peter.pan@pan.com!i" .
	"u" "http+I2N" "!^urn:foaf:([^@]+)@([^:]*)(:?)(.*)!urn:foaf:peter@pan.org!i" .
	"u" "http+I2R" "!^urn:foaf:([^@]+)@([^:]*)(:?)(.*)!http://www.peterpan.com!i" .
	"u" "http+I2C" "!^urn:foaf:([^@]+)@([^:]*)(:?)(.*)!http://foaf.peterpan.com/peter.rdf!i" .
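
To show how a client might use terminal records like these, here is a
small Python sketch: pick the record matching the protocol and service
you came in with, then apply its substitution expression to the URN. It
is a simplification and not the perl/java walker from the demo. The
names Naptr, apply_regexp and resolve are invented for this example;
order/preference handling is left out; Python's re module stands in for
POSIX extended regular expressions; and an escaped delimiter inside the
expression is not handled.

```python
import re
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Naptr:
    flags: str    # "u" marks a terminal rule whose output is a URI
    service: str  # protocol plus resolution service, e.g. "http+I2C"
    regexp: str   # <delim>pattern<delim>replacement<delim>flags


def apply_regexp(regexp: str, key: str) -> Optional[str]:
    """Apply a NAPTR-style substitution expression to key."""
    delim = regexp[0]
    pattern, replacement, flags = regexp[1:].split(delim)
    match = re.search(pattern, key, re.IGNORECASE if "i" in flags else 0)
    # The patterns used here cover the whole key, so the expanded
    # replacement is the complete result.
    return match.expand(replacement) if match else None


def resolve(records: Iterable[Naptr], urn: str, protocol: str, service: str) -> Optional[str]:
    """Return the URI of the first terminal record offering protocol+service."""
    wanted = f"{protocol}+{service}".lower()
    for rec in records:
        if rec.flags.lower() == "u" and rec.service.lower() == wanted:
            result = apply_regexp(rec.regexp, urn)
            if result is not None:
                return result
    return None


_PATTERN = r"!^urn:foaf:([^@]+)@([^:]*)(:?)(.*)"
RECORDS = [
    Naptr("u", "http+I2L", _PATTERN + "!mailto:peter.pan@pan.com!i"),
    Naptr("u", "http+I2R", _PATTERN + "!http://www.peterpan.com!i"),
    Naptr("u", "http+I2C", _PATTERN + "!http://foaf.peterpan.com/peter.rdf!i"),
]

if __name__ == "__main__":
    print(resolve(RECORDS, "urn:foaf:peter@pan.com", "http", "I2C"))
```

Asking for a protocol nobody offers (say soap) simply yields no answer,
which is the point of walking in with a defined need.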

So for peter@pan.com we simply punt and delegate straight from foaf.urn.arpa
to the domain of pan.com; but for the SHA1 we need an extra level of indirection.
As the sha1's are essentially random we need some sort of re-direction registry
service. Or several of those (the quq.org postulated above). A practical problem
is that it should also be easy to update those. For our demo we are using something
called Secure Dynamic DNS update for this.

In short - we break down the URN sha1 into 4 blocks of 10 (the reason for this
is to allow the registry to scale) and then register a delegation plus a public
key in the zone that the rule

	!^urn:foaf:([a-f0-9]{10})([a-f0-9]{10})([a-f0-9]{10})([a-f0-9]{10})(:?.*)!\1.\2.\3.\4.foaf-sha1.net!i

delegates to (e.g. foaf-sha1.net; in the demo foaf-sha1.asemantics.com), so that

	urn:foaf:3b0c3795b1496c64d621e5c5094adc74528b7dcd

		becomes:

		 3b0c3795b1.496c64d621.e5c5094adc.74528b7dcd.foaf-sha1.net

At that location the owner of the sha1 has much the same NAPTR block running as
shown above for peter@pan.com.
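
The effect of that rewrite on a sha1 URN can also be sketched without
any regexp machinery, by slicing the hash into four labels of ten hex
digits. This is only an illustration in Python; the function name and
the zone parameter are made up for this example, and lowercasing the
hash is an assumption (DNS names compare case-insensitively anyway).

```python
import re

_SHA1_URN = re.compile(r"^urn:foaf:([a-f0-9]{40})$", re.IGNORECASE)


def sha1_urn_to_dns_name(urn: str, zone: str = "foaf-sha1.net") -> str:
    """Map urn:foaf:<40 hex digits> to the DNS name holding its delegation."""
    match = _SHA1_URN.match(urn)
    if match is None:
        raise ValueError(f"not a sha1 foaf URN: {urn!r}")
    digest = match.group(1).lower()
    # Four labels of ten hex digits each, in the same order as in the hash.
    labels = [digest[i:i + 10] for i in range(0, 40, 10)]
    return ".".join(labels + [zone])


if __name__ == "__main__":
    print(sha1_urn_to_dns_name("urn:foaf:3b0c3795b1496c64d621e5c5094adc74528b7dcd"))
    # prints 3b0c3795b1.496c64d621.e5c5094adc.74528b7dcd.foaf-sha1.net
```

Passing zone="foaf-sha1.asemantics.com" gives the name as used in the
demo set-up.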

From that moment on any entity possessing the private key belonging to the public key
in the delegating zone can change the delegation; i.e. in effect we push the management
of that zone back to the person who 'owns' the sha1.

As DNSSEC is not quite trivial, the wizard at:

	http://foaf-demo.asemantics.com/

helps you through this process. Note that in this public demo we are -not-
doing any DNS zone signing, as it just adds complexity and is more for govt.
style customers really. So DNSSEC is only used to give you a public/private
key pair so you can remotely manage the delegation of an anonymous sha1.

Demo-caveats:

->	use of urn-arpa.asemantics.com rather than urn.arpa.

->	use of a single sha1 jumpboard; rather than federate this
	to anyone with a dns server who is willing.

->	walker.pl *tries* to show you all possible options, which is
	fundamentally wrong - you are expected to walk into the
	DDDS loop with an a-priori defined need, i.e. I2C, I2L or I2R, and a
	list of protocols. This then gives you an unambiguously defined
	single answer. Our walker sort of 'cheats' and can actually
	walk/show things which cannot be reached OR mis-delegations
	which are valid within DDDS. I.e. the walker does NOT follow
	DDDS - so when in doubt - fetch the precompiled java or
	fetch the perl and give that a spin.

Food for thought:

->	DDDS is equally valid for URIs; i.e. http.uri.arpa.
	does exist, so you _can_ actually try to resolve an
	http-URL and get back an rdf-URL with info about
	another URL.

Have fun !

Dw

_______________________________________________
rdfweb-dev mailing list
rdfweb-dev@vapours.rdfweb.org
wiki: http://rdfweb.org/topic/FoafProject
http://rdfweb.org/mailman/listinfo/rdfweb-dev
