Re: Vocabulary description data from Jiri Prochazka on 2008-09-17 (semantic-web@w3.org from September 2008)

From: Jiri Prochazka <ojirio@gmail.com>
Date: Thu, 18 Sep 2008 00:20:43 +0200
To: mpbelanger@jarg.com, semantic-web@w3.org
Message-ID: <48D182BB.70404@gmail.com>

Thanks for the response (everyone), but I think I failed to express what
I really meant.
Let me try again.

Let's say we have this triple: foaf:Person owl:sameAs some_vocab:Person
which I think is true, but it is stated in neither vocabulary.
So I publish it on my web server.
The problem is how to share this information so that other people see it and
can use it as a kind of plug-in to the vocabularies (technically it will
be a vocabulary too, only with the purpose of supplementing the "real"/"full"
vocabularies).
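
To make the plug-in idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: triples are plain tuples of prefixed names, "some_vocab" and the "ex:" data are hypothetical, and the helper names (build_rewrite_map, to_preferred) are made up for this example. It keeps owl:sameAs as in the triple above, although for classes OWL also provides owl:equivalentClass.

```python
# Triples are plain (subject, predicate, object) tuples of prefixed names.
# "some_vocab" and the "ex:" data are hypothetical, for illustration only.
SAME_AS = "owl:sameAs"

# The supplement vocabulary: a statement neither original vocabulary makes.
supplement = {
    ("foaf:Person", SAME_AS, "some_vocab:Person"),
}

# Data published by someone who uses some_vocab.
data = {
    ("ex:alice", "rdf:type", "some_vocab:Person"),
    ("ex:alice", "foaf:name", "Alice"),
}


def build_rewrite_map(mappings, preferred_prefix):
    """Map each term to its stated equivalent in the preferred vocabulary."""
    rewrite = {}
    for s, p, o in mappings:
        if p != SAME_AS:
            continue
        # owl:sameAs is symmetric, so check both directions.
        if s.startswith(preferred_prefix):
            rewrite[o] = s
        elif o.startswith(preferred_prefix):
            rewrite[s] = o
    return rewrite


def to_preferred(triples, mappings, preferred_prefix):
    """Rewrite every term that has a known equivalent; leave the rest as-is."""
    rewrite = build_rewrite_map(mappings, preferred_prefix)
    return {tuple(rewrite.get(term, term) for term in t) for t in triples}


# Someone who prefers FOAF loads the data together with the supplement.
result = to_preferred(data, supplement, "foaf:")
```

Anyone who trusts the supplement can load it alongside the two vocabularies; anyone who does not simply leaves it out, and the original vocabularies stay untouched.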

My idea is that you should be able to, in some smart and effective way,
browse all RDF vocabularies on the web and compose the set of
vocabularies you want to use, and if you do not find a vocabulary
describing what you need, create your own vocabulary (or vocabulary
supplement; again, a supplement is a vocabulary too).

I'm not interested in natural languages in any way.

Thanks,
Jiri


Michael Belanger wrote:
> The Web will always be a growing and inefficient interactive database and no
> more.  Globally, there remain many additional interactive databases both
> personal and within every organization.  All these contain the complete
> spectrum of broad band content from within all forms of information objects
> - not just words.  As information specialists decide how to "digitally
> represent" (graph) much of the remaining content - such as touch, smell,
> balance, texture, mood and many more technical and human experiences - these
> growing and inefficient interactive databases will become more highly
> specialized.  In other words, these information specialists will be
> constantly inventing new expressions and sub-graphs that carry (digitally
> represent) deep meaning within their unique field of specialty.  True
> understanding of "digital representations" will only be found within each
> of these specialized communities; they each possess their "long tail's"
> meaning.  In this real world environment, your vision of overlapping
> vocabularies serving RDF-OWL (Semantic Web) as something consisting of
> interconnected triplebases for you to query using natural language has no
> technical solution for idea or object content understanding.
> 
> One effort to combine the Web's content and information from all other human
> sources is Wikipedia; a bag of words, numbers and images with definitions,
> but not relative meaning.  Attempts to use complex natural language to
> impart the power of context into a Wikipedia query have only "phrase"
> potential because of limited joins of SQL.  If you fail to provide the full
> context that will find that long-tail "needle" you have in mind, that idea's
> solution will remain invisible.  To achieve your Common Sense solution to
> overlapping vocabularies, the solution must embrace articulate contextually
> relevant conversational queries.  Cell phone requests will be the end game.
> 
> Much cruder than Wikipedia, Google views the world's information as a simple
> distributed graph.  Google looks into every permitted information source on
> the planet and pulls out a bag of words, noting where each bag lives.  The
> Google index is of these extracted words; still bigger, but much dumber,
> than Wikipedia.  Powerset made NLP search into Wikipedia as their first
> mission.  Their Venture Capitalists saw the SQL limited context issues and
> shopped Powerset ASAP to Microsoft. 
> 
> Overcoming your overlapping vocabularies is possible by embracing each
> long-tail community's unique sub-graphs that carry (digitally represent)
> deep meaning within their unique field of specialty.  These domain
> sub-graphs are derived from that community's knowledge base, which is an
> object content parsing and vocabulary intensive (not a modeling) Ontology.
> 
> Matching subgraph query fragments at the master INDEX of source sub-graph
> fragments (if built) solves your overlapping vocabularies problem.
> -Michael, Jarg Corp.
> 
> -----Original Message-----
> From: semantic-web-request@w3.org [mailto:semantic-web-request@w3.org] On
> Behalf Of Jiri Prochazka
> Sent: Sunday, September 14, 2008 6:53 PM
> To: semantic-web@w3.org
> Subject: Vocabulary description data
> 
> Hello,
> I gathered some questions when reading various documents on the web, so
> here go some of the most pressing...
> 
> Due to the decentralized nature of the web (and the world), many RDF
> vocabularies will be created, the majority of them overlapping...
> To make the information as complete as possible, a solution had to be
> developed to help link data on the WWW, to build more signposts in the
> City of WWW...
> The solution is vocabularies that define relations between various RDF
> vocabularies - RDFS and OWL.
> Here I see one problem: the number of vocabularies describing
> vocabularies cannot grow, or we can recurse to infinity (a vocabulary
> describing a vocabulary, which is describing a vocabulary, which is...) -
> this needs as few, and as well established, standards as possible.
> The second (the main) problem is sharing the information.
> All data in the semantic web can be anything - just triples which were
> published on some web server... Anyone can say anything....
> Anyone can say that he thinks that a term in one vocabulary is
> equivalent to a term in another vocabulary (not only the creator of
> the vocabulary, else the number of vocabularies will increase and they
> won't be really well linked), for example...
> I see a need for some mechanism to crawl the vocabulary description data
> (and to decide its credibility).
> Then it will be possible to infer RDF data from the WWW into preferred
> vocabularies and use it.
> Was there any research on this topic?
> What is the status of the development in the trust area?
> 
> Regards,
> Jiri
> 
>
Received on Wednesday, 17 September 2008 22:21:52 UTC