Re: Tag ontology RFC from Richard Newman on 2005-04-05 (semantic-web@w3.org from April 2005)

From: Richard Newman <r.newman@reading.ac.uk>
Date: Tue, 5 Apr 2005 18:03:52 +0100
To: Stefano Mazzocchi <stefanom@mit.edu>
Cc: semantic-web@w3.org
Message-Id: <55401fd1f36925be377ac7dbed43a335@reading.ac.uk>

Stefano,
   Thank you for your comments. Replies below.

> First of all, tagging is the idea of allowing people to choose their 
> own 'things' instead of relying on somebody else's concepts... and 
> here you are, defining things like
>
>  :equivalentTag
>  :relatedTag
>
> that are exactly those idealized concepts that fit your mindset but 
> might not fit mine. I suggest to remove those altogether and let 
> ad-hoc-ontologies handle the relationships between tags. Why? well, 
> there is no difference between tag:relatedTag and a general purpose 
> RDF property anyway.

Firstly, I have been drifting towards reifying tag relations, too, as I 
think I added to that document. I am of the belief that both taggings 
and interrelationships between tags are user-centric, and so must be 
reified to capture this additional information. Danny has eloquently 
put forth a companion view, that similar goals can be accomplished by 
distinguishing between users' tags through namespaces.

Having some way to express relationships between tags is useful, I 
think. E.g. I use data from two sources, one American and one English; 
the former tags with "humor" and the latter with "humour". Someone else 
might not equate the two, but I might wish to integrate them. Looser 
relationships have their uses too, right down to relatedTag (which, as 
you note, is pretty much synonymous with the existence of any 
property). If these are going into ad-hoc ontologies anyway, I may as 
well make them less ad-hoc and put them on the same page, no?
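
A minimal sketch of the humor/humour case, assuming equivalences are 
recorded per user (the reified form) rather than globally. The data, 
user names and function names are made up for illustration and are not 
part of the ontology:

```python
# Hypothetical per-user equivalence statements: (user, tag_a, tag_b).
# Keeping the user alongside the pair is the point of reifying the relation.
equivalences = [
    ("richard", "humor", "humour"),
]

# Hypothetical tagged resources from two sources: (resource URI, tag label).
items = [
    ("http://example.org/us/joke1", "humor"),
    ("http://example.org/uk/joke2", "humour"),
    ("http://example.org/uk/essay", "politics"),
]

def canonical_tags(user):
    """Map each tag to a canonical label, using only this user's equivalences."""
    mapping = {}
    for who, a, b in equivalences:
        if who == user:
            canon = min(a, b)  # arbitrary but stable choice of representative
            mapping[a] = canon
            mapping[b] = canon
    return mapping

def items_for(user, tag):
    """Return resources carrying `tag` or anything this user equates with it."""
    mapping = canonical_tags(user)
    wanted = mapping.get(tag, tag)
    return sorted(uri for uri, t in items if mapping.get(t, t) == wanted)
```

Here items_for("richard", "humour") picks up both sources, while the 
same query for a user who has asserted no equivalence only sees the 
English one. Chains of equivalences are not followed in this sketch.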

> The only relationships that should be put in a tag ontology are those 
> that are objective to the tag themselves, for example "collidesWith" 
> if they share at least one label. The rest should be left to the users 
> to decide (whether they are equivalent, related, or in what kind of 
> relation they are).

I think you're solely commenting on the single-triple tag 
interrelations. Yes, I quite agree... I was of a mind to remove 
equivalentTag for that reason. However, :relatedTag can be considered 
as objective as any statement (e.g. shared tagged object = related?).
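
As a rough illustration of the "shared tagged object = related" 
reading, the sketch below derives related pairs purely from the data. 
The sample taggings and the function name are assumptions for the 
example, not anything defined in the RFC:

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (resource, tag) pairs.
taggings = [
    ("http://example.org/a", "rdf"),
    ("http://example.org/a", "semweb"),
    ("http://example.org/b", "rdf"),
    ("http://example.org/b", "tagging"),
    ("http://example.org/c", "cooking"),
]

def related_tags(pairs):
    """Return unordered tag pairs that share at least one tagged resource."""
    by_resource = defaultdict(set)
    for resource, tag in pairs:
        by_resource[resource].add(tag)
    related = set()
    for tags in by_resource.values():
        # Sorting gives each unordered pair a single canonical form.
        for a, b in combinations(sorted(tags), 2):
            related.add((a, b))
    return related
```

Nothing here depends on anyone's opinion of the tags, which is the 
sense in which such a relation could be called objective.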

> Minor, but I think :tagName should be :name, let the namespace provide 
> the context.

Mmm.

> :taggedResource is useless: it can be easily inferred.

You mean as the inverse of tags:tag? I kept both in at this stage to 
leave the question open for discussion. I can see two possible uses:

1. you're focused on a resource, and wish to tag it. tags:tag pointing 
to a reification makes most sense.
2. you're focused on a tag, or a user, and wish to model their tagging. 
The tagged resource isn't primary, so you point to it rather than from 
it.

The same applies for retrieval. Any thoughts from other interested 
parties? Which way should the Tagging <-> resource relationship point?

> Honestly, I don't think the complexity is worth the value of modelling 
> the 'act of tagging',

I would disagree with that... even if one takes the simplest kind of 
collaborative tagging, del.icio.us, exporting that database to RDF 
requires dealing with that problem. del.icio.us's database has two 
dates, an author, a resource, and a set of tags in addition to all the 
bookkeeping. Try fitting that lot in a triple :)
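
To make the point concrete, here is a sketch of one such record 
flattened into triples around a single Tagging node. Only tags:Tagging 
and tags:taggedResource come from the discussion above; the ex: 
properties, the field names and the sample values (including what the 
two dates mean) are hypothetical:

```python
def tagging_to_triples(node, record):
    """Flatten one bookmark record into triples hanging off a Tagging node."""
    triples = [
        (node, "rdf:type", "tags:Tagging"),
        (node, "tags:taggedResource", record["resource"]),
        (node, "ex:taggedBy", record["author"]),    # hypothetical property
        (node, "ex:created", record["created"]),    # hypothetical property
        (node, "ex:modified", record["modified"]),  # hypothetical property
    ]
    # One triple per tag, in a stable order.
    for tag in sorted(record["tags"]):
        triples.append((node, "ex:associatedTag", tag))  # hypothetical property
    return triples

# Hypothetical record with two dates, an author, a resource and a set of tags.
example = {
    "resource": "http://example.org/page",
    "author": "http://example.org/users/richard",
    "created": "2005-04-01",
    "modified": "2005-04-05",
    "tags": {"rdf", "tagging"},
}
triples = tagging_to_triples("_:t1", example)
```

Even this small record needs seven statements sharing one subject, 
which is why a single resource-to-tag triple can't carry it.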

> but in any case I definitely disagree with rss:Item rdfs:subClassOf 
> tags:Tagging .
>
> If you go down this path, pretty much any action related to add RDF to 
> something has to be a subClass of tagging.... and pretty soon you end 
> up modelling provenance, trust and all that yourself.

We (Seth Russell, Danny, and myself) had a fair amount of discussion of 
this, which led to my conclusion:
- some rss:items might be considered taggings (those that annotate a 
resource with some categories)
- but it's far from a perfect match.

I summarised this by saying "Personally, I would tend to go for the 
'tag the rss:item' approach", i.e. it's not a close enough match to 
formalise.

> There is really nothing different between tagging and adding RDF. The 
> only difference is that the inference needed to extract :collidesWith 
> is different enough that requires me to type it.
>
> Anything else is just the exact same modelling that applies to any RDF 
> creation action, so we should just build on the shoulders of those who 
> are working on provenance and trust, instead of reinventing the wheel 
> every single time.

That's quite an interesting point. In an ideal world, this work indeed 
wouldn't be necessary at all; RDF would have shipped in '99 with 
quads/named graphs/signing/WoT etc. as necessary, and we'd be able to 
simply tag a resource with some RDF and figure out when and who did the 
tagging. But we can't. RDF as it stands can't do generalised annotation 
of statements within the model.

As such, this little ontology came about with the goal of modelling the 
output of something like del.icio.us --- a system which allows _users_ 
to apply text-string tags to URIs. The complexity of the data model 
(use of reification) is rather irrelevant. To create a tag: mint a URI 
and point it to the given label (optionally language tagged). Tagging a 
resource: encode who and when, and dump them out as a few triples. The 
user interface hides the rest.
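
The "mint a URI and point it to the given label" step might look 
something like the sketch below. The base namespace, the choice to 
derive the URI from the label, and the tuple used for the literal are 
all assumptions for illustration; tags:tagName is the property name 
discussed earlier in this thread:

```python
from urllib.parse import quote

BASE = "http://example.org/tags/"  # hypothetical namespace for minted tags

def mint_tag(label, lang=None):
    """Mint a tag URI and return it with the triple attaching its label.

    The literal is modelled as a (text, language) pair; lang may be None.
    """
    uri = BASE + quote(label, safe="")  # percent-encode spaces, slashes, etc.
    return uri, (uri, "tags:tagName", (label, lang))
```

For example, mint_tag("semantic web", "en") gives a URI ending in 
"semantic%20web" plus one triple carrying the language-tagged label. 
A system that keeps tags per user would mint under a per-user 
namespace instead.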

The reification is pretty handy in practice --- I've already put 
together a system that knows about related nodes through shared 
authors/tags/resources, and can do all the other stuff that one would 
expect a tagging system to do. With provenance work as it stands at the 
moment, I wouldn't even be able to separate my tags from yours, or sort 
my taggings by date, which is unfortunate. If I could, it would be 
through using a non-standard technology like Named Graphs, Redland 
context nodes, etc.
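
For what it's worth, once each tagging is its own node with an author 
and a date, those two operations are trivial. A sketch over plain 
dictionaries, with made-up field names and data rather than the actual 
system:

```python
from datetime import date

# Hypothetical reified taggings: who, when, which resource, which tags.
taggings = [
    {"by": "richard", "on": date(2005, 4, 5),
     "resource": "http://example.org/b", "tags": {"rdf", "tagging"}},
    {"by": "stefano", "on": date(2005, 4, 4),
     "resource": "http://example.org/a", "tags": {"simile"}},
    {"by": "richard", "on": date(2005, 4, 3),
     "resource": "http://example.org/a", "tags": {"rdf"}},
]

def taggings_by(user):
    """Return one user's taggings, oldest first."""
    mine = [t for t in taggings if t["by"] == user]
    return sorted(mine, key=lambda t: t["on"])

def tags_of(user):
    """Return the set of tags a user has applied across all their taggings."""
    result = set()
    for t in taggings_by(user):
        result |= t["tags"]
    return result
```

With bare resource-to-tag triples and no author or date attached to 
the act of tagging, neither function could be written.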

Regards,
-R
Received on Tuesday, 5 April 2005 17:04:01 UTC