Re: bbc-programmes.dyndns.org from Peter Ansell on 2008-06-22 (public-lod@w3.org from June 2008)

From: Peter Ansell <ansell.peter@gmail.com>
Date: Sun, 22 Jun 2008 16:40:35 +1000
To: "Alan Ruttenberg" <alanruttenberg@gmail.com>
Cc: "Richard Cyganiak" <richard@cyganiak.de>, "Nicholas Humfrey" <Nicholas.Humfrey@bbc.co.uk>, public-lod@w3.org
Message-ID: <a1be7e0e0806212340y30a04040y9f1795dc4b273526@mail.gmail.com>
2008/6/22 Alan Ruttenberg <alanruttenberg@gmail.com>:
>
> On Jun 21, 2008, at 11:52 PM, Peter Ansell wrote:
>>>>
>>>> Not if you type it with xsd:anyURI...
>>>
>>> Then you are saying the page is an xsd:anyURI, not a web page.
>>
>> You aren't saying that all RDF Resource (non-literals) are web pages
>> though. So why is saying that it is an RDF Resource supposed to
>> indicate that it is a web page?
>
> It doesn't. But it doesn't say it can't be, which is what you are doing if
> you talk about the literal - literals and individuals are disjoint.
>
> By using foaf:page you add a little more - since the range is foaf:Document,
> you know that the resource is a document now. It would be better to have a
> more refined ontology of things that one lands up interacting with in a
> browser window. I'm involved in a couple of efforts in that direction, as I
> am sure others are. In foaf, there is foaf:homepage, but it isn't quite
> right. Two things to note about it:
>
> (1) Even though its range is specified as foaf:Document, its English
> documentation, which expresses the intention better, says that the target is
> a "public Web document". It's just that foaf doesn't have the vocabulary to
> use as a more precise range for this property.
>
> (2) The reason it isn't suitable in this case is that the documentation
> says: "The homepage is usually controlled, edited or published by the thing
> whose homepage it is". Since in this case the subject of the page isn't the
> sort of thing that controls, edits, or publishes, it doesn't really match,
> although the hedging "usually" might allow for it. Best to ask the author
> (Hi Dan, we know you're out there ;-)  of the ontology and have them clear
> up the documentation.

I vaguely remember reading that the FOAF authors (Dan and co.) changed
FOAF from using a literal for web pages to using a resource. I
disagreed with that when I read it, but I realised it might be a
common presumption that web pages form a subset of all semantic web
things, so I decided not to make much of a fuss until I understood it
a bit better. Reading further into the area hasn't changed my view.

>>>> Is there no separation allowed between the web and the semantic web
>>>> really?
>>>
>>> Need there be?
>>
>> Clearly, there is a big wide world out there with a web that exists
>> perfectly fine without the semantic constraints ;) IRL!
>
> Sorry, don't know what "IRL" means. But if I get your drift, what you are
> saying is that the web does not require the Semantic Web to talk about it.
> However that doesn't imply that it doesn't or shouldn't.

I find the hype and universality that RDF presumes to be rather
pretentious, in a practical sense. Colleagues mocking it because it
definitely doesn't live up to the hype doesn't help, but with a
disjoint approach I can convince them of its usefulness without
looking like a fool in their eyes.

>>>> I thought the semantic web was based on logic not web structures?
>>>
>>> Where did you get that idea?
>>
>> By definition not all URI's are web structures, therefore the basis is
>> in a non-web scenario, of which web structures occupy a distinct
>> logical subset. RDF and OWL assume that there are abstract classes,
>> which are not web structures by any means.
>
> I don't know what an abstract class is.
>
> However, if you are saying that RDF and OWL talk about more than web
> structures, you are absolutely right. That means the domain of the Semantic
> Web is a set that *subsumes*  web pages, not a set *disjoint* from them.

I think disjoint is more accurate. That is how I have approached the
Semantic Web so far, and I have had no issues using non-semantic URIs
as xsd:anyURI typed literals to make the disjointness clear.

>>>> The semantic web doesn't gain anything from the result of that page,
>>>> which
>>>> clearly has an
>>>> alternative semantic representation available that you are already
>>>> looking at when you see the foaf:page (or whatever predicate allows
>>>> literals) statement.
>>>
>>> It isn't about the result of what you fetch so much as it is speaking
>>> clearly, as I said earlier. The range of foaf:page is a document.
>>> Neither a
>>> string nor an xsd:anyURI is a document. End of story.
>>
>> It is clear to me what the string means. And saying it is a
>> foaf:Document doesn't help with that at all. foaf:page having a range
>> of rdf:Resource doesn't have any more practical benefit than if it
>> didn't say what its range was.
>
> To you perhaps. To others it does. For one thing it can be used to do some
> basic checking for nonsense statements. (such as the one you were about to
> make ;-)

Nonsense as in getting HTML after resolving an identifier, when every
other identifier resolved so far returned an RDF representation. With
a typed literal, consumers could at least expect, based on the
vocabulary, that a non-semantically extended HTML page needs to be
dealt with differently. The set of RDF statements that would otherwise
be generated from the HTML page would be empty, i.e. completely
useless, whereas if they knew that the HTML elements on their own had
value to them they wouldn't bother parsing for RDF and could go
straight to the HTML page.

>>>> If you accept that the ontology you are using puts xsd:anyURI typed
>>>> literals into a given field it is perfectly meaningful to use the
>>>> string as you do any other URI string,
>>>
>>> If you use another ontology than foaf, with a different relation whose
>>> domain is an xsd:anyURI, and that relation is documented in such a way as
>>> to
>>> make sense, then sure. I don't happen to see what is gained by doing
>>> that.
>>
>> The ability to have a string as you say which won't be presumed to be
>> a semantic resource identifier on its own which people can look at and
>> resolve themselves.
>
> And?
> -
> What is a "semantic resource identifier"?

Semantic resource identifier (n.) : Anything that can be used in the
subject position of an RDF Statement.
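
To make that definition concrete, here is a rough sketch in Python.
It is only a toy model of RDF terms, not any RDF library's API; the
class names IRI, BlankNode and TypedLiteral and the function name are
made up for illustration. It relies on the RDF rule that URI
references and blank nodes may be subjects, while literals may not.

```python
from dataclasses import dataclass

XSD_ANYURI = "http://www.w3.org/2001/XMLSchema#anyURI"


@dataclass(frozen=True)
class IRI:
    """A URI reference used as a node in an RDF graph (hypothetical class)."""
    value: str


@dataclass(frozen=True)
class BlankNode:
    """A blank node such as _:foo (hypothetical class)."""
    label: str


@dataclass(frozen=True)
class TypedLiteral:
    """A typed literal such as "..."^^xsd:anyURI (hypothetical class)."""
    lexical: str
    datatype: str


def is_semantic_resource_identifier(term) -> bool:
    """True if the term may appear in the subject position of a statement."""
    return isinstance(term, (IRI, BlankNode))
```

Under this definition an xsd:anyURI typed literal is not a semantic
resource identifier, even though its lexical form looks like one.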

> I'm still failing to see harm in <http://....>. One can examine an RDF
> representation, read that, and resolve that manually as well.

A computer must presume that RDF is the preferred format for RDF
identifiers. Wouldn't it be easier just to acknowledge that HTML
exists and identify it consistently using typed literals, so people
recognise the difference?

>>>> just in a context which won't be interfered with, or interfere itself
>>>> with, the logic based semantic
>>>> web rules.
>>>
>>> I don't know what you mean by "interfered with" or what connection you
>>> are
>>> making between this particular choice and logic based semantic
>>> web rules. It seems to me that the main benefit of using foaf:page here
>>> is
>>> that a lot of people know what it is supposed to mean.
>>
>> Do they really gain the benefit specifically from its use as an
>> rdf:Resource though?
>
> The instances of rdf:Resource are defined to be *everything*. I'm not sure
> what you mean by "benefit specifically from its use as an rdf:Resource", but
> I don't need to because by definition everything is a rdf:Resource.
>
> It's like saying: Do I gain specifically from being composed of matter? That
> I am is a matter of fact. The question might be of metaphysical interest,
> but not practical interest.

It is of practical interest if people ever acknowledge that
non-semantic resources do, and always will, exist outside of the RDF
universe, and should be as easily accessible as any other resource so
that people can mix semantic with non-semantic representations.

>> Or do they really do a non-semantic retrieval of
>> the resource? Should they only expect to be able to retrieve machine
>> readable representations if they resolve this resource?
>
> Who are they?

Consumers (n.) : Those people who will eventually penetrate the
academic jungle that is the semantic web community and utilise
resources without days of special instruction from the "experts".

> What is a machine readable representation?

One of the many (confusingly) diverse RDF representations.

>> How do you actually say that a specific rdf resource doesn't actually
>> direct to
>> an rdf representation as an identifier itself.
>
> I'm having trouble parsing this sentence. Could you rephrase it?

See below, on knowing not to ask for RDF when the value is not an
identifier but an xsd:anyURI typed literal.

>>>>> The web page is
>>>>>
>>>>> <http://www.bbc.co.uk/programmes/b00b07kw.html> (the thing that the URI
>>>>> denotes)
>>>>
>>>> It isn't an RDF Resource any more than my street and suburb address
>>>> though, it is a simple human based locator which doesn't really have a
>>>> need or want to be an RDF Resource IMO.
>>>
>>> In both the case of the house, and the case of the web page, there is the
>>> resource - the house and the web page - and there is the address of the
>>> house and of the web page (also resources, but different ones). In
>>> discussion, one says different things about the address and the thing.
>>> For
>>> instance,
>>>
>>> "http://www.bbc.co.uk/programmes/b00b07kw.html" has 45 characters.
>>> or <http://www.bbc.co.uk/programmes/b00b07kw.html> uses the stylesheet
>>> <http://www.bbc.co.uk/programmes/r/23870/stylesheets/decor.css>
>>> or "http://www.bbc.co.uk/programmes/b00b07kw.html" is a name for
>>> <http://www.bbc.co.uk/programmes/b00b07kw.html>
>>
>> I don't see why your convention of not dealing with URI's as strings
>> themselves really helps.
>
> You keep thinking that I am arguing that some convention is useful. The only
> thing I am arguing is useful is speaking clearly. There is a difference between
> the string and the thing it names (when the string actually names
> something). If you use the string for both cases one can't tell, in general,
> which it is that is your subject of discourse. Nor can you infer that it
> even is to be used as a name. Ambiguous statements work (to a certain
> extent) with people. They work to a lesser extent with machines,  at least
> for the moment.

A typed literal is not a string IMO, even if it is represented as a
sequence of Unicode characters. I find no issues with getting a
machine to understand
<semanticURI> foaf:page "http://..blah.html"^^xsd:anyURI
and being able to, for instance, compile these HTML representations
and perform textual searches on the results. If it was only ever a
semantic URI, the machine would natively look for RDF instead of free
text, and hence would be confused when the result was delivered as an
HTML page with no semantic extensions. It is far easier to just say it
is a URI; by convention a consumer would know this string was a
resolvable URL that they could parse in a different way. They would
also know not to expect, and therefore not to ask for, RDF
representations at this stage. That is better than you could do in the
other case, where you really should assume that RDF Resource
identifiers represent RDF pages, and hence you should ask for RDF if
you ever want to resolve them.

It simplifies a lot to have an explicit difference at the ontology and
instance level.
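
As a rough sketch of the convention I am describing (Python, my own
toy model rather than any RDF library's API; the Resource and
TypedLiteral classes and the accept_header_for function are names I
made up for illustration), a consumer could pick what to ask for from
the kind of object it finds in a statement:

```python
from dataclasses import dataclass
from typing import Optional, Union

XSD_ANYURI = "http://www.w3.org/2001/XMLSchema#anyURI"


@dataclass(frozen=True)
class Resource:
    """An RDF resource identifier (hypothetical class)."""
    uri: str


@dataclass(frozen=True)
class TypedLiteral:
    """A typed literal such as "..."^^xsd:anyURI (hypothetical class)."""
    lexical: str
    datatype: str


def accept_header_for(obj: Union[Resource, TypedLiteral]) -> Optional[str]:
    """Pick the Accept header a consumer following this convention would send.

    Assumption: resource identifiers are asked for RDF, xsd:anyURI literals
    are asked for HTML, and any other literal is not resolved at all.
    """
    if isinstance(obj, Resource):
        return "application/rdf+xml"
    if obj.datatype == XSD_ANYURI:
        return "text/html"
    return None


# The object of a foaf:page statement, written as a typed literal.
page = TypedLiteral("http://www.bbc.co.uk/programmes/b00b07kw.html", XSD_ANYURI)
print(accept_header_for(page))
```

The point of the design is that the decision is made from the data
alone, before any request is sent, so nobody asks for RDF from a page
that only has HTML.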

>>> "32 vassar avenue, cambridge, ma, usa" has 36 characters or
>>> <the MIT Stata Center> foaf:depiction
>>> <http://en.wikipedia.org/wiki/Image:Wfm_stata_center.jpg>
>>> or "32 vassar avenue, cambridge, ma, usa"  entered into google maps, will
>>> locate <the MIT Stata Center>
>>
>> And I am trying to say your last statement exactly. When entered into
>> a web browser the .html version will produce something they can look
>> at... Why is it different for addresses?
>
> It's not. There are great many things one can say. foaf:page doesn't say
> this. Invent a relation that means what you want it to, document it well,
> and use it.  David Booth calls this relation hasURI
> (http://esw.w3.org/topic/AwwswDboothsRules)

That document blurs a lot of the rules that RDF/XML has in comparison
to N3, and hence it is not really useful for a fully compatible RDF
system, but the ideas in relation to hasURI are useful. What is
stopping the author of this scheme from using it explicitly? It
doesn't have to be assumed, particularly if you have no other use for
the resource identifier yourself and don't see anyone else needing to
extend your description of the HTML page as anything else.

>>>> It is a coincidence IMO that it is defined in the same way that RDF
>>>> Resources are, and it isn't
>>>> useful to mix everything up by presuming that URL's of web pages are
>>>> useful as RDF Resources any more than arbitrary string literals.
>>>
>>> First, in the RDF world, everything is an rdf:resource, including
>>> rdf:Literals. So they are "mixed up" already. While there were perhaps
>>> mistakes made in RDF, that web pages are considered resources is most
>>> certainly not one of them. Finally, I'll point out once again that the
>>> issue
>>> here isn't what is or is not a "good" resource. The issue is speaking
>>> clearly. If you want to talk about the literal, by all means do so, and
>>> if
>>> you want to talk about the web page, likewise. But don't confuse one
>>> with
>>> the other.
>>
>> I have never quite understood the reason for putting Literals inside
>> of "Resources" when you can't say anything about Literals as a subject
>> except in reverse as the object of a statement and by common-sense you
>> should be able to state properties of Resources directly rather than
>> indirectly as RDF provides for the Literal subset.
>
> Me either. Perhaps because they just didn't think that people would want to
> say that many things about literals. Don't know. I've heard it mumbled that
> if RDF goes through another edit, this might get fixed. Mostly it's not a
> problem, unless you want to say something where both the subject and object
> are literals, since in the other case you can invert the relation. In the
> literal p literal case I've seen people use the idiom:
>
> _:foo hasLength "45"^^xsd:integer
> _:foo owl:sameAs "http://www.bbc.co.uk/programmes/b00b07kw.html"
>
> (not that i'd particularly recommend it)
>
> or
>
> _:foo hasLength "45"^^xsd:integer
> _:foo rdf:value "http://www.bbc.co.uk/programmes/b00b07kw.html"
>
> (less of evils)

Please don't put that in an introduction on how to reference current
HTML pages within the semantic web. I avoid blank nodes like the
plague; even if the plague is rare in my world, blank nodes are even
rarer.

>> I personally think it's a bad idea to smudge the differences by saying
>> all web pages are semantic resources
>
> All web pages are rdf:Resources. What is this "semantic resource" you speak
> of?

I disagree, majorly.

>> , as they aren't... Many have no
>> inherent RDF semantics whatsoever and hence can't be reasonably used
>> as the subject of statements.
>
> Umm...
> <http://www.bbc.co.uk/programmes/b00b07kw.html> is about an episode of the
> television programme Dr. Who.
>
> "http://www.bbc.co.uk/programmes/b00b07kw.html" is a string of length 45.

Not if it was typed as an XML Schema anyURI... In that scenario you
can't say it is a string, it is simply being represented as a string
for current purposes.

>> It would be much better if by default they were thought of as Literals and
>> kept as objects of statements in
>> semantic terms.
>
> Well, I can see that you are making this assertion, but I can't understand
> the reasoning behind it.

It goes against your (and others') assumption that it is valuable to
consider all current web URLs as useful references for use as
identifiers for rdf:Resources. I don't see the usefulness for the vast
majority of current web pages, and think it would be more reasonable
to make the difference distinct and require publishers to make up
identifiers which locate information resources with semantic
information. I particularly don't like the hit-and-miss approach of
content negotiation without redirection in this respect, as there is
no way to say that if you attempt to negotiate for a particular
content type you will actually get it, as opposed to the .html and
.rdf versions that are well done in this bbc project.
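
To illustrate the difference, a minimal sketch in Python. The function
names are hypothetical, and the suffix scheme is only my reading of
the .html/.rdf pattern used in this project, not a documented API. No
network access is involved; it just compares what was asked for with
what a Content-Type header says came back.

```python
def media_type(content_type_header: str) -> str:
    """Strip parameters such as "; charset=utf-8" from a Content-Type value."""
    return content_type_header.split(";", 1)[0].strip().lower()


def negotiation_honoured(requested: str, content_type_header: str) -> bool:
    """True only if the server returned the media type that was asked for."""
    return media_type(content_type_header) == requested.lower()


def explicit_variants(base_uri: str) -> dict:
    """Map each media type to a suffixed URI (assumed .html/.rdf scheme)."""
    return {
        "text/html": base_uri + ".html",
        "application/rdf+xml": base_uri + ".rdf",
    }


# Asked for RDF, got HTML back: the "miss" case, only visible after the fetch.
print(negotiation_honoured("application/rdf+xml", "text/html; charset=utf-8"))

# With explicit suffixes the choice is visible in the URI before any request.
print(explicit_variants("http://www.bbc.co.uk/programmes/b00b07kw")["text/html"])
```

With plain content negotiation the first check can only be done after
the response arrives, which is the hit-and-miss part I object to.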


Hopefully I restricted myself to repeating myself 5 or 10 times. Feel
free to respond to 0 or more of my statements.

:)

Cheers,

Peter
Received on Sunday, 22 June 2008 06:41:12 UTC