semantic pingback improvement request for foaf from Story Henry on 2010-04-15 (public-lod@w3.org from April 2010)

From: Story Henry <henry.story@bblfish.net>
Date: Thu, 15 Apr 2010 22:36:09 +0100
To: public-lod community <public-lod@w3.org>, foaf-protocols@lists.foaf-project.org
Cc: Sören Auer <auer@informatik.uni-leipzig.de>, Philipp Frischmuth <pfrischmuth@googlemail.com>, Sebastian Tramp <tramp@informatik.uni-leipzig.de>
Message-Id: <1B017104-9081-4822-8D3A-DBD2CD25D5E3@bblfish.net>

Hi,

I often get asked how one solves the friend request problem on open social networks that use foaf in the hyperdata way.

On the closed social networks when you want to make a friend, you send them a request which they can accept or refuse. It is easy to set up, because all the information is located in the same database, owned by the same company. In a distributed social foaf network anyone can link to you, from anywhere, and your acceptance can be expressed most clearly by linking back. The problem is: you need to find out when someone is linking to you.

So the problem becomes: how does one notify people that one is linking to them? Here are the solutions in order of simplicity.

0. Search engine solution
-------------------------

Wait for a search engine to index the web, then ask the search engine which people are linking to you.

Problems:

- This will tend to be a bit slow, as a search engine optimised to search the whole web will need to be notified first, even if this is only of minor interest to them
- It makes the search engine a core part of the communication between two individuals, taking on the role of the central database in closed social networks
- It will not work when people deploy foaf+ssl profiles, where they use access control to restrict who can see their friends. Search engines will not have access to that information, and so will not be able to index it.

1. HTTP Referer Header
----------------------

The absolute simplest solution would be just to use the mis-spelled HTTP Referer Header, that was designed to do this job. In a normal HTTP request the location from which the requested URL was found can be placed in the header of the request.

http://en.wikipedia.org/wiki/HTTP_referrer

The server receiving the request and serving your foaf profile can then find the referrer in the web server logs.

Perhaps that is all that is needed! When you make a friend request, do the following:

1. add the friend to your foaf profile

<http://bblfish.net/#hjs> foaf:knows <http://kingsley.idehen.name/dataspace/person/kidehen#this> .

2. Then just do a GET on their WebId with the Referer header set to your WebId. They will then find in their Apache logs something like this:

93.84.41.131 - - [31/Dec/2008:02:36:54 -0600] "GET /dataspace/person/kidehen HTTP/1.1" 200 19924 "http://bblfish.net/" "Mozilla/5.0 (Windows; U; Windows NT 5.1; ru; rv:1.9.0.5) Gecko/2008120122 Firefox/3.0.5"

This can then be analysed using incredibly simple scripts, such as those described in [1].

3. The server could then just verify that information by

a. doing a GET on the Referer URL to find out if indeed it is linking to the user's WebId
b. doing some basic trust analysis (is this WebId known by any of my friends?), in order to rank it before presenting it to the user
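
The first two steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names build_friend_request and referers_for are made up for this example, the log is assumed to be in Apache's "combined" format as in the sample line above, and the fragment is stripped from both URLs because fragments are not sent over HTTP.

```python
import re
import urllib.request
from urllib.parse import urldefrag

# Apache "combined" format:
# host ident user [time] "request" status bytes "referer" "user-agent"
COMBINED = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def build_friend_request(their_webid, my_webid):
    """Client side (hypothetical helper): a GET on their WebId document
    with the Referer header pointing at our own profile document."""
    target = urldefrag(their_webid).url   # document URL, without '#this'
    referer = urldefrag(my_webid).url     # Referer never carries a fragment
    return urllib.request.Request(
        target, headers={"Referer": referer}, method="GET"
    )


def referers_for(log_lines, profile_path):
    """Server side (hypothetical helper): distinct Referer values of
    successful GET requests on profile_path, in order of appearance."""
    seen = []
    for line in log_lines:
        m = COMBINED.match(line.strip())
        if m is None:
            continue  # not a combined-format line
        if m["method"] != "GET" or m["path"] != profile_path:
            continue
        if m["status"] != "200":
            continue
        referer = m["referer"]
        # Apache logs "-" when no Referer header was sent.
        if referer and referer != "-" and referer not in seen:
            seen.append(referer)
    return seen
```

The request object is only built here, not sent; passing it to urllib.request.urlopen would perform the GET. Each URL returned by referers_for is a candidate that still needs the verification described in step 3.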

The nice thing about the above method is that it will work even when the initial linker's server does not have a Ping service for WebIDs. If the pages linking are in html with RDFa most browsers will send the referrer field.

There is indeed a Wikipedia entry for this: it is called Refback.
http://en.wikipedia.org/wiki/Refback

Exactly why Refback is more prone to spam than the ping back or linkback solution is still a bit of a mystery to me.

2. Referer with foaf+ssl
------------------------

In any case the spam problem can be reduced by using foaf+ssl [2]. If the WebId is an https WebId - which it really should be! - then the requestor will authenticate himself, at least on the protected portion of the foaf profile. So there are the following types of people who could be making the request on your WebId.

P1. the person making the friend request

Here their WebId and the referer field will match.
(this can be useful, as this should be the first request you will receive - a person making a friend request, should at least test the link!)

P2. A friend of the person making the friend request

Perhaps a friend of P1 goes to his page, comes across your WebId, clicks on it to find out more, and authenticates himself on your page. If P2 is a friend of yours too, then your service would have something somewhat similar to a LinkedIn introduction!

P3. Just someone on the web, a crawler...

Then you know that he is making his friendship claim public. :-)

The above seems to be just some of the interesting information one could get
from analysing the Referer field in the logs.
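
As a rough sketch of how a server might sort requests into the three cases above, assuming it already knows the WebId the client authenticated with over foaf+ssl (or None if it did not authenticate). The function name, the labels it returns and the my_friends parameter are all invented for this example; in particular, whether an authenticated visitor really is a friend of P1 is not checked here.

```python
from urllib.parse import urldefrag


def classify_requester(authenticated_webid, referer, my_friends=frozenset()):
    """Hypothetical classification of a request on my WebId.

    authenticated_webid: WebId proven via foaf+ssl, or None
    referer:             value of the Referer header
    my_friends:          WebIds I already foaf:know
    """
    if authenticated_webid is None:
        # P3: anonymous visitor or crawler; the link to me is public.
        return "P3"
    # The Referer names a document, the WebId usually has a fragment,
    # so compare the two without their fragments.
    if urldefrag(authenticated_webid).url == urldefrag(referer).url:
        # P1: the author of the linking profile is testing the link.
        return "P1"
    if authenticated_webid in my_friends:
        # P2 who is also my friend: comparable to an introduction.
        return "P2 (introduction)"
    # Assumption: any other authenticated visitor is treated as P2.
    return "P2"
```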

3. Pingback
-----------

For some reason though the Referer Header solution was not enough, and so the pingback protocol was invented.

http://www.hixie.ch/specs/pingback/pingback

I am still not quite clear what this solution brings in addition to the refback one, other than that

- it declares the method of the pingback declaratively. If there is a ping back header, then it is clear that it can be used. The referer header is so much part of the web, it won't be clear to anyone if the WebId server is using it.

- it makes it possible for the web page owner to decide who should process the
pings, rather than leaving that to the apache server owner (though that is not true of the HTTP Header mechanism proposed)

- it makes it easy to choose another server as the ping server

Looking at the specification one has a feeling that it is pretty well thought through. Mostly. One glaringly archaic piece now is the requirement on the xmlrpc response. Essentially in order to notify someone of something that is referring to them they have set up an xmlrpc system, where a simple HTML FORM would have done! XMLRPC I think is no longer the flavour du jour, and people have moved on. HTML FORMS remain used by everyone everywhere. They don't seem to go out of fashion. They are also really easy to use, and every developer needs to know how to use them.
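
To make the comparison concrete, here is a small sketch that builds both request bodies with the Python standard library. The pingback.ping method name and its (source, target) arguments come from the pingback specification; the form field names "source" and "target" in the second body are only an assumption chosen to mirror them, and the two URLs are reused from the example above.

```python
import xmlrpc.client
from urllib.parse import urlencode

source = "http://bblfish.net/"  # page containing the link
target = "http://kingsley.idehen.name/dataspace/person/kidehen"  # page linked to

# What a pingback client has to send: an XML-RPC methodCall document.
xmlrpc_body = xmlrpc.client.dumps((source, target), methodname="pingback.ping")

# The same two values as an ordinary HTML form submission
# (application/x-www-form-urlencoded); field names are hypothetical.
form_body = urlencode({"source": source, "target": target})

print(xmlrpc_body)
print(form_body)
```

The XML-RPC body is a multi-line XML document, while the form body is a single line of key=value pairs that any web framework can parse.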

4. Semantic Ping Back
---------------------

The linked data movement developed an enhancement of the Ping Back service described in 3 above. Essentially it adds an ontology to the link system described in the ping back service above, and the details are described here

http://aksw.org/Projects/SemanticPingBack

Most important perhaps is the pingback service relation

http://purl.org/net/pingback/service

defined as

<http://purl.org/net/pingback/service> a owl:ObjectProperty ;
:comment "This property is used to link the target resource with a pingpack RPC service URI. It is the RDF " ;
:isDefinedBy <http://purl.org/net/pingback/> ;
:label "pingback service" .

5. Improved Semantic Ping Back
------------------------------

My guess is that, these being the early days of the semantic web, 4 is still new enough that it can be changed. I.e., none of the xmlrpc agents are going to be looking for that relation, and so we have a chance either to add a new relation, or to create a very similar relation to fix the bugs of pingback. Here is what I propose

@prefix ping: <http://purl.org/net/pingback/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

service:ping a rdf:Property;
rdfs:domain foaf:Agent; #probably a restriction to be removed, or be refined...
rdfs:range xxx:POSTResource;
rdfs:comment """
This relation specifies a method for services that wish to let
the document owner know that they are linking to this resource.

The relation relates a WebId to a collection (named ?coll from here
on). A new resource of type PingEvent can be created in that collection
by POSTing a URL that mentions the given WebId.

The content that should be sent to the collection is what would be the result
of POSTing the following form

<form method="POST" action="?coll">
referer: <input type="text" name="referer"/><br/>
comment: <input type="text" name="comment"/>
</form>

A GET on the POSTResource can even return the above html form,
making it human readable.

( A nice improvement would be for the form to contain rdfa markup, that
would make it clear what the semantics of the form was, by using relations
described in this ontology )

The resource created should be a named ping request, which itself
can be described using this ontology.
""" .

This it seems to me would be so transparently simple as to be self explanatory
to any web developer, increasing uptake and reducing the need for explanation -
especially if the resource returns a web form as described above.
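
A client for this proposal could be as small as the following sketch. It assumes the collection URL has already been discovered from the target's profile; the function name build_ping and the example.org URL in the usage note are made up for illustration, while the field names "referer" and "comment" are the ones from the form above.

```python
import urllib.request
from urllib.parse import urlencode


def build_ping(collection_url, referer, comment=""):
    """Hypothetical helper: the POST a browser would send when
    submitting the form above to the ?coll collection."""
    body = urlencode({"referer": referer, "comment": comment}).encode("ascii")
    return urllib.request.Request(
        collection_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
```

For example, build_ping("https://example.org/pings/", "http://bblfish.net/", "I added you to my foaf file") returns a request that can be handed to urllib.request.urlopen to actually send the ping.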

6. Improving Semantic PingBack with foaf+ssl
--------------------------------------------

Just as with 2, semantic ping back can be improved with foaf+ssl, helping
the ping back service identify the user making the ping request. This can be very
useful in linked data worlds between large databases that may be pinging each other
very often. This would allow trusted agents' pings to be accepted more automatically
than new ones.

Henry Story

[1] http://www.the-art-of-web.com/system/logs/
[2] http://esw.w3.org/Foaf%2Bssl/FAQ

Social Web Architect
http://bblfish.net/

Received on Thursday, 15 April 2010 21:36:53 UTC