RE: Outstanding Issues - rdfms-xmllang from Pat Hayes on 2002-02-25 (w3c-rdfcore-wg@w3.org from February 2002)

From: Pat Hayes <phayes@ai.uwf.edu>
Date: Sun, 24 Feb 2002 22:19:07 -0600
To: Misha.Wolf@reuters.com
Cc: w3c-rdfcore-wg@w3.org, w3c-i18n-ig@w3.org
Message-Id: <p05101450b89f679760c2@[65.212.118.219]>
>On 24/02/2002 22:22:28 Pat Hayes wrote:
>>  As I won't be at the F2F, my 2c worth of comment on this issue:
>>
>>  >
>>  >-  The above seems to suggest that degrees of fuzziness are required, at
>>  >    user option, as with regular search engines.
>>
>>  Fuzziness of matching is not acceptable for RDF: it would break every
>>  inference engine ever written.
>
>In that case these inference engines either do not handle text

Right, they typically do not. At least, they treat text as character 
strings, which I guess isn't what you mean by 'handle' (?)

>or else
>they only handle text in a known language.
>
>>  Language tagging is largely incidental
>>  to proposed RDF usage in any case, as RDF is not intended to be read
>>  by human beings.
>
>What's that got to do with it?  It is intended to be "read" by engines.
>Reading implies "understanding" in the sense of being able to make
>"decisions" (such as A and B are the same).  Understanding of text

As far as I know, no program has ever been written that can 
understand NL text. I meant to refer to engines that perform RDF(S) 
inferences.

>requires knowledge of the language.  I walk on the pavement every day in
>London.  If I did so in the US, I'd be dead within a few minutes.

Quite. But you seem to be talking about the general AI/NL problem, 
which is way outside the scope of RDF or indeed of any proposed web 
formalism.

>
>>  Also bear in mind that most proposed RDF usage is
>>  not concerned with text.
>
>Where is this documented?

Well, such things are rarely documented. That was my own opinion.

>  It certainly was very much concerned with
>text when the original RDF M&S WG (of which I was a member) did its
>work.  In most use cases, the "literals" were text (as opposed to dates,
>numbers etc).

Oh, sure, the literals are text; but all that RDF knows about a 
literal is whether or not it is string-identical to another literal. 
(Plus things that might be extractable from a datatyping scheme). It 
has no access to the meanings of textual literals, or any ability to 
make cross-language translations between texts.
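
To make the point concrete, here is a minimal sketch in Python of what "string-identical" amounts to. The function name same_literal is hypothetical and is not taken from any RDF specification or toolkit; the sketch assumes a literal is just a sequence of Unicode code points and deliberately ignores language tags, since how those should be treated is the open issue in this thread.

```python
def same_literal(a: str, b: str) -> bool:
    """Hypothetical helper: two literals match only if they are the
    same sequence of Unicode code points. No meaning is consulted."""
    return a == b


# Same characters: a match, whichever dialect the author had in mind.
uk_text = "pavement"   # intended in the British sense
us_text = "pavement"   # intended in the American sense
print(same_literal(uk_text, us_text))        # True

# Synonyms are different strings, so they do not match.
print(same_literal("pavement", "sidewalk"))  # False

# Differences in case are differences in the string.
print(same_literal("Pavement", "pavement"))  # False

# Two encodings of the same visible text are still different strings:
# precomposed e-acute versus "e" followed by a combining acute accent.
precomposed = "caf\u00e9"
decomposed = "cafe\u0301"
print(same_literal(precomposed, decomposed))  # False
```

The comparison never looks at what the text says, which is why the two readings of "pavement" are indistinguishable to it.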

>>  >-  All of the above is closely related to other "control" constructs
>>  >    needed for correctly writing text in different languages, eg BiDi
>>  >    controls for BiDirectional languages.  Though Math(s) is a language
>>  >    in quite a different sense, the same problem arises.  Let's say the
>>  >    title of a paper contains something that can't be expressed in plain
>>  >    text, eg an integral from value A to value B.  How do I do this in
>>  >    RDF
>>
>>  I would say that RDF deals with Unicode strings. How to encode an
>>  integral in Unicode is someone else's problem.
>
>Sorry, I didn't make myself clear.  How to encode an integral is a
>solved problem: MathML solved it.  How to make use of this in RDF is an
>RDF problem.

I'm not sure what you mean by 'make use of'. Surely you do not expect 
RDF to be able to reason about the *meanings* of mathematical 
formulae involving integrals?

>>  >and how will others match on it?
>>
>>  Again, someone else's problem. Intelligent text retrieval is a large
>>  research area, but it is also largely independent of ontology
>>  language design. RDF does not have the resources to do both jobs at
>>  once.
>
>I'm not sure what you mean by text retrieval.

I meant the problem of locating the most relevant pieces of NL texts 
from a large corpus in order to answer a query or to provide 
information relevant to some topic, where problems arise of making 
selections based not (just) on surface form but on their meanings, eg 
noticing synonyms across languages. Matching in RDF means 
string-matching of Unicode character strings, plus maybe certain 
transformations that are supported by datatype schemes, eg 
leading-zero-suppression in numerals. I don't think that RDF can be 
expected to do more than that.

>Are you suggesting that
>all/most RDF literals will be numbers, dates etc?

I would think that in many applications that would be the case, yes. 
However, literals can be any piece of text in RDF, I believe. I did 
not mean to imply any kind of exclusion.

Pat

-- 
---------------------------------------------------------------------
IHMC					(850)434 8903   home
40 South Alcaniz St.			(850)202 4416   office
Pensacola,  FL 32501			(850)202 4440   fax
phayes@ai.uwf.edu 
http://www.coginst.uwf.edu/~phayes
Received on Sunday, 24 February 2002 23:19:15 UTC