Re: word clouds in schema.org from Paola Di Maio on 2013-12-01 (public-vocabs@w3.org from December 2013)

From: Paola Di Maio <paola.dimaio@gmail.com>
Date: Mon, 2 Dec 2013 01:30:34 +0530
To: Dan Brickley <danbri@google.com>
Cc: Charles McCathie Nevile <chaals@yandex-team.ru>, W3C Web Schemas Task Force <public-vocabs@w3.org>, Paola Di Maio <paoladimaio10@googlemail.com>
Message-ID: <CAMXe=SojGoKK7eJn7bk72O5Jo08HY18nzLoUxTtFGWYAwhFWCA@mail.gmail.com>

Thanks, Charles and Dan


> > We have partial range/domain stuff (rangeIncludes…), which can do a bit of
> > that but not all of it.
>
yeah, ok

> >
> > I think this is really difficult actually. It's the motivation for
> > versioning, but as the recent thread on changing namespaces for RDF basics
> > shows, versioning is actually really hard to do in practice :(
>
> In this case it sounds like analyzing a corpus of actual schema.org
> data is closer to Paola's needs. There's only so much of this notion
> of similarity that you can ever expect to find in the machine readable
> schema.
>
well, actually what I was thinking is that a word on the web -
irrespective of whether it is in RDF or not - is a node in a semantic network

it would be desirable imho that artifacts - such as schemas, and their
possible implementations - could capture (even vaguely) the relations of a
node with the rest of the network, because that provides the context, which
largely constitutes its semantics, and thanks to which dynamic processes such
as reasoning and intelligence can take place

No?

really - it was really just a broad thought, I am not suggesting any action
to be taken
just checking if there is such a provision here, or even whether it would
make any sense at all


> Although the coverage isn't all you might hope for, it might be worth
> taking a look at the Web Data Commons extractions from the Common
> Crawl dataset http://webdatacommons.org/ & http://commoncrawl.org/
>
> There should be more than enough in there to do some basic analysis of
> which properties and types co-occur in the same pages, or when
> describing the same entities.
>

I'll do that, thanks. I admit I have not studied the schemas in detail; I'll
report back if I find anything interesting
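
For reference, the co-occurrence analysis suggested above could be sketched
roughly as follows. This is only a minimal illustration, not a recipe for the
Web Data Commons files: the input shape (pairs of page URL and term), the
function name cooccurrence_counts and the sample data are all assumptions made
up for the example, and a real extraction would first need to be parsed into
such pairs.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence_counts(observations):
    """Count how often two schema.org terms appear on the same page.

    observations: iterable of (page_url, term) pairs. This input shape is
    an assumption for illustration, not the format of any real dataset.
    Returns a Counter keyed by sorted (term_a, term_b) tuples.
    """
    # Collect the distinct terms seen on each page.
    terms_by_page = defaultdict(set)
    for page_url, term in observations:
        terms_by_page[page_url].add(term)

    # Count every unordered pair of terms once per page.
    counts = Counter()
    for terms in terms_by_page.values():
        for pair in combinations(sorted(terms), 2):
            counts[pair] += 1
    return counts


# Made-up sample data, for illustration only.
sample = [
    ("http://example.org/a", "schema:Person"),
    ("http://example.org/a", "schema:name"),
    ("http://example.org/b", "schema:Person"),
    ("http://example.org/b", "schema:name"),
    ("http://example.org/b", "schema:Event"),
]

pair_counts = cooccurrence_counts(sample)
for pair, n in pair_counts.most_common():
    print(pair, n)
```

Counting per page is the simplest choice; counting per described entity would
work the same way with an entity identifier in place of the page URL.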

>
> The use case here though doesn't sound so much about RDF vocabulary
> terms, more about natural language similarity measures in general
> (e.g. the likes of http://arxiv.org/abs/1309.4168 ), which is a bit
> beyond the scope of this list.
>

actually it just came to me while I was on the other list, talking about
the EM framework we came up with, then thought about schema.org not yet
having a schema for EM, then of course the question followed about the
related terms, hence I asked

I actually don't think of RDF the same way you do, maybe :-)
I am interested in schemas as they offer a mechanism to support
some structure, which in turn supports some functionality (retrieval,
classification etc), enabled by RDF

as such it can be about RDF, as well as about NL. But I should rephrase: I
am not so much interested in a quantitative measure of relatedness as in a
more rounded notional association and discovery - the kind that intelligent
beings and things can make when connecting dots. I can work things out
myself, even if only to some extent - but a machine needs to be told
pretty much everything


pdm


>
> Dan
>

Received on Sunday, 1 December 2013 20:01:01 UTC