Terminology again: was RDF's curious literals from John Black on 2007-08-01 (semantic-web@w3.org from August 2007)

From: John Black <JohnBlack@kashori.com>
Date: Wed, 1 Aug 2007 06:48:49 -0400
To: "Richard Cyganiak" <richard@cyganiak.de>, "Garret Wilson" <garret@globalmentor.com>
Cc: "Story Henry" <henry.story@bblfish.net>, "Tim Berners-Lee" <timbl@w3.org>, "Semantic Web" <semantic-web@w3.org>
Message-ID: <0aaa01c7d429$8ca74ab0$6601a8c0@KASHORI001>

Richard Cyganiak wrote:
>
> On 1 Aug 2007, at 01:49, Garret Wilson wrote:
>> So let me state this another way: to say that non-literal resources  use 
>> URIs as identifiers and literal resources use strings as  identifiers is 
>> a false dichotomy. RDF uses strings for all its  identifiers. It's just 
>> that for non-literals, these strings conform  to a format called URI
>
> That's simply not true.
>
>     <http://dbpedia.org/wiki/George_W._Bush>
>
> and
>
>     "http://dbpedia.org/wiki/George_W._Bush"
>
> do not identify the same resource. The first identifies a person, the 
> 43rd president of the U.S.. The second identifies a string of Unicode 
> characters that happens to conform to the URI syntax.
>
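To make the distinction in the quoted passage concrete, here is a minimal sketch in Python. It assumes two hypothetical classes, URIRef and Literal, defined here purely for illustration (they are not taken from any particular RDF library's API). It only shows that a person can use the same sequence of characters to build two terms that are not the same term.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class URIRef:
    """Hypothetical stand-in for a term written <...>."""
    value: str


@dataclass(frozen=True)
class Literal:
    """Hypothetical stand-in for a plain literal written "..."."""
    value: str


# The same characters, used to build two different kinds of term.
uri_node = URIRef("http://dbpedia.org/wiki/George_W._Bush")
literal_node = Literal("http://dbpedia.org/wiki/George_W._Bush")

# The character strings are equal...
same_characters = uri_node.value == literal_node.value

# ...but the terms are not, because they are of different kinds.
same_term = uri_node == literal_node

print(same_characters, same_term)
```

What each term is used to identify is not something this code decides; that comes from the conventions followed by the people who write and read the data.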

I think this quote reveals a problem of terminology that has an easy remedy. 
It is the confusing use of *the symbol used by an agent* for *the agent that 
uses the symbol* in the subject position of sentences using the verb 
"identify". The worst examples are statements of the form, "This URI 
identifies xyz".

URIs don't *do* anything. They aren't agents. Identify is a verb. Verbs 
require animate subjects. The solution is to be clear about *who* is doing 
the identifying. None of these are animate subjects: "non-literal 
resources", "literal resources", "RDF", 
"<http://dbpedia.org/wiki/George_W._Bush>", and the literal 
"http://dbpedia.org/wiki/George_W._Bush". None of these belongs 
as the subject, the *who is doing*, of sentences with "identify" as the 
verb. In order to shorten discussions about these topics, I think it would 
be helpful if we remember to be clear about who is doing the identifying, 
referring, etc.

As *bad* examples to illustrate my point, consider these sentences:

"After the URI <http://dbpedia.org/wiki/George_W._Bush> identified the 
person George W. Bush, it went down to the local pub for a beer."

"The literal string "http://dbpedia.org/wiki/George_W._Bush" identifies a 
string of Unicode characters, and also runs 3 miles before breakfast."

"RDF uses strings for all its identifiers, but it doesn't use floss after 
eating and thus has gingivitis."

Here are some *good* examples:

"People following the best practices guidelines of the semantic web 
initiative use <http://dbpedia.org/wiki/George_W._Bush> to identify a 
person, the 43rd president of the U.S."  (NOT: 
<http://dbpedia.org/wiki/George_W._Bush> identifies a person,.....)

"People following the best practices guidelines of the semantic web 
initiative use "http://dbpedia.org/wiki/George_W._Bush" to identify a 
string of Unicode characters."  (NOT: 
"http://dbpedia.org/wiki/George_W._Bush" identifies a string of 
Unicode....).

"People following the best practices guidelines of the semantic web 
initiative use URIs as identifiers of non-literal resources." (NOT: 
"non-literal resources" use URIs to identify...)

"People following the semantic web standards of the W3C use strings to 
identify literals."  (NOT: "literals" use strings to identify....)

"People who use RDF use character strings for different purposes as 
described by the standards that define RDF. But from a different 
perspective, outside of the definitions contained in the RDF standards, 
those people who use RDF are also using strings for everything, as it is a 
text-based technology." (NOT: "RDF" uses strings to....)

John

Received on Wednesday, 1 August 2007 10:49:35 UTC