- From: Sampo Syreeni <decoy@iki.fi>
- Date: Tue, 3 Nov 2009 00:11:58 +0200 (EET)
- To: Pat Hayes <phayes@ihmc.us>
- cc: Damian Steer <pldms@mac.com>, semantic-web@w3.org
On 2009-11-02, Pat Hayes wrote:

>> I would argue against this. Reification, in one form or another, is a
>> highly valuable part of the standard, because it lets us pose
>> hypotheticals and metadata relating to them.
>
> Only if you also ignore part of the spec. Which is a mess. I believe
> we can do this better.

Yes. I'm the pragmatic kind of guy. I believe we could go with a sort of interim semantics which retained the most useful properties of quotation or the like, and still make those semantics circumspect enough to a) not introduce any true logical problems, and b) enable people to create more or less valid, useful applications simply by using their intuition. That way we'd get both the short term and the long term benefits of semantic technology, which is what I'd call a win-win situation.

>> Even though Pat is likely to vehemently disagree with me on this one,
>> I'd take hazy reification/quotation/whatever semantics over the lack
>> of the basic mechanism, any day of the week. I mean, otherwise we're
>> bound to have even *hazier* concoctions in its place.
>
> Why?

Because there is a place and a need for this sort of thing. Natural language easily guides us to the simplest example: the free-form quotation. There's a need for it, even though it's far from logically pure, or possessing of formal semantics. It's used all the time. And then the idiom continuously leaks into just about every system of knowledge representation/semantic networks as well. Hell, even the fact that we happened to allow a fully unconstrained, willy-nilly textual literal, originally without even a language tag, into RDF testifies to that. I argue that it's better to have a mechanism in place that explicitly encodes this sort of stuff, that the mechanism should be understandable and usable to the common coder rather than completely well axiomatized, and that we'd then be left with less of a mess and more of a usable application than if we try to go the formal, logical, AI route.

> Are they, though? My sense is that the lisp-style lists used by OWL
> are used much more than bag, seq, alt. (Actually, the lists are the
> collections: these three are the containers.)

In that department we agree. The syntax is a mess, again. Representation shouldn't matter, no matter what the input (DDL) or manipulation (DML) syntax; it should be idiot-proof and master-friendly. What I meant was that the abstract datatypes encoded by the types mentioned are highly sensible. Perhaps it's once again that RDF's triple model guides us to build the wrong kind of storage and/or GUI models for our data?

> The key lack right now is any standard way to refer to a 'part' of an
> RDF graph from the outside.

That, too. That sort of stuff would mandate naming every triple/binary predicate, and inventing a system of referring to huge sets of those in a URI. Not doable. Then we could use named graphs, but the grouping and naming then take place on the author's side, which really gives hir too much control in a distributed environment. Thus most people just download/syndicate other people's stuff and apply whatever logic they happen to like to the facts. That works, but it doesn't really allow a) efficient references to subsets of data from first, second or third parties, nor b) especially technological reuse of such data, e.g. to accelerate aggregation. A few Turtle sketches of the points above follow.
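To make the reification point concrete: here is roughly the idiom I have in mind, in Turtle. The ex: names are invented for illustration only; rdf:Statement and friends are the standard vocabulary.

    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix dc:  <http://purl.org/dc/elements/1.1/> .
    @prefix ex:  <http://example.org/> .

    # Describe the claim "the moon is made of green cheese" and hang
    # metadata off it, without the triple itself ever being asserted.
    ex:claim1 a rdf:Statement ;
        rdf:subject   ex:moon ;
        rdf:predicate ex:madeOf ;
        rdf:object    ex:greenCheese ;
        dc:creator    ex:somebody ;
        dc:date       "2009-11-02" .

That is the hypothetical-plus-metadata use in a nutshell, whatever one thinks of the official semantics of rdf:Statement.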
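Then the container/collection distinction Pat corrects me on, side by side; again, everything under ex: is made up:

    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix ex:  <http://example.org/> .

    # A container (rdf:Bag/rdf:Seq/rdf:Alt): open-ended, so anybody
    # may assert further rdf:_n members later on.
    ex:committee ex:members [
        a rdf:Bag ;
        rdf:_1 ex:alice ;
        rdf:_2 ex:bob
    ] .

    # A collection: the closed, lisp-style rdf:first/rdf:rest list
    # that OWL uses; Turtle's ( ... ) is just sugar for it.
    ex:committee ex:orderedMembers ( ex:alice ex:bob ) .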
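And finally the named-graph mechanism, in TriG-style notation (non-standard as of now, graph names again invented), which shows where the author's control comes in: whoever writes the document decides the grouping, and a third party cannot refer to a subset cutting across it.

    @prefix ex: <http://example.org/> .

    ex:pats-graph {
        ex:moon ex:madeOf ex:rock .
    }

    ex:my-graph {
        ex:moon ex:madeOf ex:greenCheese .
    }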
>> No. From my relational background, I tend to treat bnodes like I'd deal
>> with perfect, opaque surrogate keys. [...]

In another post of yours, you already handled the n-ary stuff quite well. No need to add anything there.

>> If even that... Personally I'm of the opinion that literals should be
>> removed from the model altogether.
>
> Oh no, they are the bread and butter of all the linked data. I'm all
> for putting datatyped literals into logic itself, in fact.

I'm not against them as such, no. But from my perspective, modularization calls for treating them in a different standard. Like, in HTML? As an editor of an ezine, my favorite literal of course is my very own, libertarian-minded, entire, HTML-formatted rant about freedom. Do you *really* want to type that sort of thing properly and comprehensively within RDF? Or would it perhaps be better to leave it as referred material, with only the axiomatizable metadata retained within RDF proper? (A sketch of what I mean follows below.)

> That's not the real issue. The problem is, people need something weaker
> than sameAs to express a link, in many cases.

Ah. I went with something similar above, so I can relate. But what would this "something weaker" be here? We can imagine tons of weaker, intuitive things, but since in the end we have to have formal semantics which a computer can handle, which axiomatics precisely allow us to go weaker, and help the people at the same time?

> It's not all people misusing sameAs because they don't understand it:
> they misuse it because there is no alternative, and they have to use
> something. It's up to us to provide some better alternatives.

Quite. But as I said, we can't stray from machine processability either. It might be that FOPL and its ilk take too much of our attention, relatively speaking. Still, I at least find it pretty difficult to come up with any truly "semantic" alternatives to the usual logical connectives we're used to utilizing. True, we could have these hazy "related-to" assertions. Maybe they could help some NLP machine gather extra facts for end user consumption. Or we could have more specific, more formalistic adaptations towards today's technology: "forall <x>,<y>: [type of person]<x>, [type of information]<y>: interest-is(<x>, <y>, ["semantic web technology"])". I think it's pretty clear that this sort of thing cannot serve as the usable alternative, at least until we have some damn sophisticated AI in place to serve the general population. Till then, we will have to do with rather hazy semantics, because that's what people deal in, and we cannot then understand half of even that programmatically.

My opinion is that the situation warrants a two-pronged approach. First, allow lots of hazy connectives into the model, while being sure to make no guarantees about them. Guide them towards coherence and convergence, and should that then actually happen, owl:sameAs it suddenly is, in your and your well-placed friends' triplebases. Then, number two, formalize subsets of them fully; build complete axiomatic semantics for them, and market the end result as having some specific programmatic advantages resulting from the tighter formalism. Like the one that no, we're no longer talking about an interesting restaurant which you might perhaps have something to do with, and which might perhaps want to send you some advertisements; no, we're now talking about the rocking joint which already knows you like the precise kind of music they have to offer, whose cook is just a little bit related to you, Chinese, and adept at precisely the kind of Szechuan cuisine you love the most. 'Cause they happen to have your triples, and can interpret them unambiguously as well...
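To put the literal point concretely: rather than a monster HTML-typed literal inline, something like the following, where the address and the ex: names are naturally made up:

    @prefix dc: <http://purl.org/dc/elements/1.1/> .
    @prefix ex: <http://example.org/> .

    # The rant itself stays an HTML document out on the web; only the
    # axiomatizable metadata lives in RDF proper.
    <http://example.org/rants/freedom.html>
        dc:creator ex:sampo ;
        dc:format  "text/html" ;
        dc:subject ex:freedom .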
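And the two prongs side by side, in Turtle. Here ex:relatedTo is a deliberately hazy invention of mine, with no guarantees attached; owl:sameAs is the fully axiomatized endpoint:

    @prefix owl: <http://www.w3.org/2002/07/owl#> .
    @prefix ex:  <http://example.org/> .

    # Prong one: a hazy connective. It licenses no inferences, but
    # people can use it on intuition alone.
    ex:niceJoint ex:relatedTo ex:rockingJoint .

    # Prong two: once usage has converged, the tight formalism, which
    # licenses substituting either name for the other everywhere.
    ex:niceJoint owl:sameAs ex:rockingJoint .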
> It's the last one that I think we are obliged to attempt.

Perhaps, and thanks. But don't neglect the practical side either. In total, laziness counts for quite a lot, and that's why people bitch about RDF/XML so much as well. It's a real hindrance.

> That is one good strategy in the present state of the art, yes. See
> how SKOS approaches similar issues.

My personal favorites are the approaches which start with RDF encodings of WordNet, using intuitive semantics related to the latter. (The SKOS take is sketched below.)

> Well, linking data using not-quite-sameAs-maybe is something that very
> many people care a lot about right now. I hear more about this issue
> than any other.

Personally, I think this is an IS-A issue. I couldn't really fully decipher what Brachman was saying about it either, but intuitively speaking, this is precisely the same thing. I mean, IS-A means one thing is kind of like another. If you declare it both ways, it kind of means the two things are the same. But not quite, because there are these hideous, little, semantic, interpretational, goddamn hermeneutic details lurking around the equation. In the end, I've also never heard of a proper generalization of the IS-A relation, such that those little nasty details could be abstracted away for the moment. The closest I've come is the statement that a) subtypes inherit the methods and internal data fields of the supertype, plus b) the inheritance might work a bit differently with passed/returned, i.e. contravariant/covariant, types of parameters. That's it. It's formal semantics, granted, and it ain't quite trivial either, but then it ain't gonna help you a lot in developing a useful application either.
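On the SKOS point: what it offers is precisely a graded ladder of not-quite-sameAs links between concepts, along these lines (the ex: concepts are invented):

    @prefix skos: <http://www.w3.org/2004/02/skos/core#> .
    @prefix ex:   <http://example.org/> .

    ex:jazz skos:relatedMatch ex:blues ;      # merely associated
            skos:broadMatch   ex:music ;      # the IS-A-flavoured link
            skos:closeMatch   ex:jazzMusic ;  # near-equivalent, use with care
            skos:exactMatch   ex:jass .       # interchangeable across schemes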
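And the IS-A-both-ways point in RDFS terms, with invented class names. Declaring the subclass relation in both directions forces the two class extensions to coincide, which is about as close to "the same" as the formal semantics will take you:

    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix ex:   <http://example.org/> .

    ex:Human  rdfs:subClassOf ex:Person .
    ex:Person rdfs:subClassOf ex:Human .
    # Every instance of the one is now an instance of the other; in OWL
    # this pair entails ex:Human owl:equivalentClass ex:Person. All the
    # hermeneutic detail of what the names *mean* still lurks outside.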
> Most of the nasties in RDF are just ugly, or nuisances, but this is a
> real urgent problem that will get worse very rapidly.

I can relate. I mean, as I said, I'm mostly the relational kinda guy myself. I think that theory is reasonably well developed. And yet this sort of stuff isn't really addressed in the literature in full generality, and it now seems to reflect on the RDF/SW folks as well. Personally, I'd name this a "generalized entity integrity issue, under an open world assumption".

--
Sampo Syreeni, aka decoy - decoy@iki.fi, http://decoy.iki.fi/front
+358-50-5756111, 025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2

Received on Monday, 2 November 2009 22:12:41 UTC