Re: foaf:page vs. foaf:topic

Hi Martin,

On Sat, Feb 13, 2010 at 1:02 PM, Martin Hepp (UniBW)
<martin.hepp@ebusiness-unibw.org> wrote:
> Dear all:
>
> In the context of the GoodRelations ontology, there is a regular need to
> link
>
> 1. a data entity (e.g. representing a company, a product, or an offer)
>
> with
>
> 2. the URI of a XHTML/HTML Web Resource that contains human-readable
> information about that entity (often combining the info for multiple such
> entities, i.e. it is NOT a direct representation of the data entity).

Presumably you wouldn't exclude the possibility that the Web document
eventually contains some level of machine-readable description?
Microformats, RDF, or even content negotiation.

> Example: We define Microsoft as a business entity in our own namespace and
> want to preserve a link to the established, browsable resource.
>
> foo:microsoft a gr:BusinessEntity;
>                       gr:legalName "Microsoft Corp.".
>
> Up to now, we generally use and recommend rdfs:seeAlso for the link from the
> data entity to the Web page URI, e.g.
>
> foo:microsoft a gr:BusinessEntity;
>                       gr:legalName "Microsoft Corp.";
>                        rdfs:seeAlso <http://www.microsoft.com/>.

rdfs:seeAlso was designed for this. It has a slight cultural bias towards
citing documents that have a machine-readable form, but nothing in the
RDFCore specs mandates this. I made some additional notes in
http://esw.w3.org/topic/UsingSeeAlso explaining why we took this
design decision, btw.

> Note that we cannot simply do content negotiation (i.e. redirect http
> requests for html to http://www.microsoft.com), because of practical and
> theoretical reasons. Also, content negotiation is IMO no substitute for a
> traversable link from the data node to the HTML node in the graph of data).

Understood and agreed. 'Normal' Web pages are deeply embedded in
social practice, not to mention business cards, databases and even QR
Codes.

> The initial motivation for rdfs:seeAlso was that it does not require
> importing a second ontology like FOAF, and I would also hold that using
> rdfs:seeAlso is, in principle, correct.

I appreciate the tradeoffs here. All vocabularies describe the same
world so naturally our efforts overlap. It makes sense to document the
cases where we have similar idioms in different namespaces.

> However, due to the growing amount of links on the Web of Linked Data,
> rdfs:seeAlso is now being used so frequently that it has become too
> unspecific for our purpose.

Ah, a success disaster? :)

When I was pitching 'hypertext RDF' to an XML audience in 2003 I
suggested an idiom that has not really been used yet. Rather than just
mention a seeAlso pointing to a 'document', make additional statements
about the thing we point to. Perhaps a Person is further described by
a Resume document, or a Bibliography document, or a PhotoGallery
document. Similarly with companies, there are different kinds of
document - for humans and for machines - which we can link to.

In http://www.oreillynet.com/xml/blog/2003/12/dan_brickleys_rdfsseealso_rdf.html
Bob DuCharme picked up on this, "Note in particular the ninth of Dan’s
11 slides, which demonstrates how to assign a type to the link
destination. Like any link typing or link destination typing, this
adds value to the link by letting human or automated agents decide
whether traversing the link will give them information they want
without requiring them to follow the link."
... but nobody else has seemed too interested yet :)

via http://www.w3.org/2001/sw/Europe/talks/xml2003/Overview-3.html

<Person>
 <name>Dan Brickley</name>
 <rdfs:seeAlso>
   <x:Bibliography rdf:about="../stuffIwrote.rdf"/>
 </rdfs:seeAlso>
 <rdfs:seeAlso>
   <x:Resume rdf:about="../cv.rdf"/>
 </rdfs:seeAlso>
</Person>


> If there are 20+ rdfs:seeAlso links from an entity,  e.g. to images and
> other resources, it's hard for a user agent to spot the single one link that
> points to the Web page, e.g. for actually buying a product.

I'd argue that this should motivate us to create a few classes
indicating typical forms of RDF document. These aren't 1:1 with RDF
vocabularies, since each namespace can be used in many different ways.
The Dublin Core community lately call these markup idioms "application
profiles".
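
To make the typed-seeAlso idea concrete, here is a minimal sketch in
Python of how a user agent might pick out one kind of linked document
without fetching any of them. It uses plain tuples rather than an RDF
library, and the ex: subject, the x: class names and the file names are
illustrative assumptions echoing the slide example above, not an
agreed vocabulary.

```python
# Triples as plain (subject, predicate, object) tuples; no RDF library needed.
RDF_TYPE = "rdf:type"
SEE_ALSO = "rdfs:seeAlso"

# Hypothetical data: the x: classes and file names are for illustration only.
triples = [
    ("ex:danbri", SEE_ALSO, "../stuffIwrote.rdf"),
    ("ex:danbri", SEE_ALSO, "../cv.rdf"),
    ("ex:danbri", SEE_ALSO, "../photo.jpg"),  # untyped link, as in most data today
    ("../stuffIwrote.rdf", RDF_TYPE, "x:Bibliography"),
    ("../cv.rdf", RDF_TYPE, "x:Resume"),
]


def see_also_of_type(triples, subject, wanted_type):
    """Return the seeAlso targets of `subject` that are typed as `wanted_type`."""
    # Collect every resource that is declared to be of the wanted type.
    typed = {s for (s, p, o) in triples if p == RDF_TYPE and o == wanted_type}
    # Keep only the seeAlso links from `subject` that point at such a resource.
    return [o for (s, p, o) in triples
            if s == subject and p == SEE_ALSO and o in typed]


resumes = see_also_of_type(triples, "ex:danbri", "x:Resume")
```

With 20+ seeAlso links the same filter still returns only the documents
of the requested type; untyped links are simply ignored.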

> Now, the two main candidate predicates for replacing rdfs:seeAlso are IMHO
>
> 1. foaf:topic
> and
> 2. foaf:page.
>
> I have seen many usages of foaf:topic in such scenarios, but from reading
> the FOAF spec, I think that foaf:page is much more appropriate.

As someone else pointed out, these are inverses. A (foaf:)Document can
have many things that are its topics; for each of those things,
(foaf:)page points back the other way; to a Web page about that thing.
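
A minimal sketch of what "inverses" buys a consumer, assuming triples
are held as plain Python tuples (no RDF library) and using the
foo:microsoft identifier from your example: stating the link in one
direction is enough for software to derive the other.

```python
TOPIC = "foaf:topic"
PAGE = "foaf:page"

# foaf:topic and foaf:page are inverses of each other.
INVERSE = {TOPIC: PAGE, PAGE: TOPIC}


def add_inverses(triples):
    """Return the triples plus the inverse statement for each topic/page link."""
    result = set(triples)
    for s, p, o in triples:
        if p in INVERSE:
            # Swap subject and object and use the inverse property.
            result.add((o, INVERSE[p], s))
    return result


data = {("http://www.microsoft.com/", TOPIC, "foo:microsoft")}
closed = add_inverses(data)
```

After this step the graph also contains the foaf:page statement from
foo:microsoft to the Web page, so either direction can be published.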

FOAF highlights an important case, where a document has some
particular thing as its (foaf:)primaryTopic. We used this as a trick
for helping ground RDF descriptions in the deployed world of 'normal'
Web sites, while maintaining a distinction between things and the
documents that describe them. The inverse property
(foaf:)isPrimaryTopicOf is there as a convenience for situations
especially in RDF/XML where the markup is primarily about the thing,
and the indicative document is mentioned somewhat in passing as an XML
subelement. In RDFa the need for this is somewhat less since we have
the rev= notation (in XHTML at least).

> Proposed Pattern:
>
> foo:microsoft a gr:BusinessEntity;
>                       gr:legalName "Microsoft Corp.";
>                       foaf:page <http://www.microsoft.com/>.

I would write foaf:homepage there. FOAF homepage is an OWL inverse
functional property, so if you find two descriptions mentioning blah
blah blah having a foaf:homepage of <http://www.microsoft.com/> you
can infer they are both talking about Microsoft.
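
Here is a rough sketch of that inference in Python, using plain tuples
rather than an RDF library. The bar: identifiers, the rdfs:label values
and the example.org URL are made up for illustration; only the
foaf:homepage grouping rule comes from the inverse functional property
semantics.

```python
from collections import defaultdict

HOMEPAGE = "foaf:homepage"

# Hypothetical descriptions from two independent sources.
triples = [
    ("foo:microsoft", HOMEPAGE, "http://www.microsoft.com/"),
    ("foo:microsoft", "gr:legalName", "Microsoft Corp."),
    ("bar:msft", HOMEPAGE, "http://www.microsoft.com/"),
    ("bar:msft", "rdfs:label", "Microsoft"),
    ("bar:other", HOMEPAGE, "http://example.org/"),
]


def same_by_homepage(triples):
    """Group subjects that share a foaf:homepage value.

    Because foaf:homepage is inverse functional, each group with more
    than one member can be treated as describing a single entity.
    """
    by_page = defaultdict(set)
    for s, p, o in triples:
        if p == HOMEPAGE:
            by_page[o].add(s)
    return [sorted(subjects) for subjects in by_page.values()
            if len(subjects) > 1]


groups = same_by_homepage(triples)
```

A data merger could then combine the gr:legalName and rdfs:label
statements onto one node, which a plain foaf:page link would not
license.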

> foaf:topic could be used for linking back from the Web page URI to the data
> entity URI, e.g.
>
> <http://www.microsoft.com/> foaf:topic foo:microsoft.

Yep, although foaf:primaryTopic would work here slightly better in
this specific example, assuming it is reasonably agreed by all
concerned that the page is mainly about a single entity Microsoft.

Your original question mentioned 'often combining the info for
multiple such entities', in which case 'topic' is perfectly fine. But
if it is possible to special-case those documents which we know have a
distinct primary topic, I'd recommend doing so since it helps hugely
with data merging and identity reasoning.


So, when to use foaf:isPrimaryTopicOf versus foaf:homepage?

foaf:homepage is
    <rdfs:subPropertyOf rdf:resource="http://xmlns.com/foaf/0.1/page"/>
    <rdfs:subPropertyOf
        rdf:resource="http://xmlns.com/foaf/0.1/isPrimaryTopicOf"/>

...and the dividing line between saying that a document is something's
"home page" versus merely a page that has it as a primary topic, is a
hard one to articulate precisely. It has something to do with control
and authority, and so works a bit differently depending on the kind of
entity. For example, I could write a page about Madonna which had her
as a primary topic, but we wouldn't say it was her homepage. Just a
page about her. Similarly with companies. But as you move into other
entity types, eg. pets, children and products, the intuitions blur a
bit.
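
The practical effect of those two subPropertyOf statements can be
sketched as follows (plain Python tuples, no RDF library; the
ex:madonna identifier and the fan-page URL are invented for the
example). A homepage claim entails the weaker page and isPrimaryTopicOf
claims, but not the other way round.

```python
# The two subproperty declarations quoted above for foaf:homepage.
SUPER_PROPERTIES = {
    "foaf:homepage": ["foaf:page", "foaf:isPrimaryTopicOf"],
}


def entail(triples):
    """Add the statements implied by rdfs:subPropertyOf.

    One pass is enough here because the table has no chains; a deeper
    hierarchy would need to repeat until nothing new is added.
    """
    inferred = set(triples)
    for s, p, o in triples:
        for sup in SUPER_PROPERTIES.get(p, ()):
            inferred.add((s, sup, o))
    return inferred


data = {
    ("foo:microsoft", "foaf:homepage", "http://www.microsoft.com/"),
    # Hypothetical fan page: about Madonna, but not her homepage.
    ("ex:madonna", "foaf:isPrimaryTopicOf", "http://example.org/fanpage"),
}
inferred = entail(data)
```

So software that only understands foaf:page still sees the Microsoft
link, while the fan page never gets promoted to a homepage.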

Pragmatically, 'homepage' has an open rdfs:domain of owl:Thing so that
we can freely talk about lots of kinds of thing having homepages. It
is also a much nicer property name than 'isPrimaryTopicOf'. I would
recommend it for at least company Web sites.  Your examples were a
company, a product, and an offer. I'd go with 'homepage' for the first
two. For offer, I haven't looked at examples but I would guess either
primaryTopic or isPrimaryTopicOf would work; and your choice there
could be driven by syntax-level concerns from RDF/XML or RDFa, rather
than pure modelling. Are you mainly targeting RDFa here? Does it need
to work in HTML5 as well as XHTML, etc.?

> What's your opinion on that? Will that work with your software applications?
> Or should we use foaf:topic instead? If so, in which direction?
>
> Alternative 1:
>
> foo:microsoft a gr:BusinessEntity;
>                       gr:legalName "Microsoft Corp.".
>
> <http://www.microsoft.com/> foaf:topic foo:microsoft.
>
> Alternative 2:
>
> foo:microsoft a gr:BusinessEntity;
>                       gr:legalName "Microsoft Corp.";
>                       foaf:topic <http://www.microsoft.com/>.
>
> I personally think that the second alternative is wrong, because the data
> entity does not describe the Web page, but vice versa.

Correct.

> Since this decision will be important for compatibility with SemWeb /
> Linkedata applications, I would be very thankful for your comments.

Thanks for bringing this up. From a linking point of view, it is good
to build connections between datasets structured around real-world
entities (companies, products) and datasets that have URLs for 'old
fashioned' homepages. I think we gain even more when those
relationships make use of 'Inverse Functional Property' and
'Functional Property' constructs from OWL since it helps with
automated data merging (as used in Sindice, Garlik etc.). Whether you
prefer to use FOAF or not, only you can decide :) If you do use it,
let me know and I'll drop an example into the FOAF spec pointing to
the relevant bit of GR documentation. If you'd rather keep things
homogeneous and define a relation in the gr: namespace, perhaps at
least express its connection to the foaf:primaryTopic concept using
OWL (and I can reciprocate that in the FOAF RDFS/OWL schema). Finally
if you do choose to explore the idea of using seeAlso with more
precise typing of the thing linked to, I'd love to hear how that works
out.

cheers,

Dan

Received on Saturday, 13 February 2010 14:57:06 UTC