Re: web to semantic web : an automated approach

Stephane,

I have a content management background, and I am a Drupal user.
I have been looking forward to the advances you mention below; however,
my problem is modelling the triples.
I am sure that a tool can automatically extract/infer triples from
content, but I am not sure these would be meaningful or representative.

Assuming the functionality is available to expose the data as RDF (of
course I would have to upgrade to Drupal 7, and all the custom modules
and functionality written for Drupal 6 would have to be rewritten,
but that's admittedly another problem), what kind of knowledge
schema/ontology would it adhere to? Would the system automatically
infer what is the subject and what is the predicate, and would I (website
mom) be able to override what the system suggests? I haven't quite
worked out how the system would work.
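
For concreteness, here is the kind of mapping I have in mind - only a rough
sketch in Python with rdflib, where the field names, vocabulary choices and
the override hook are all hypothetical:

from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import DC, FOAF

SIOC = Namespace("http://rdfs.org/sioc/ns#")

def node_to_triples(node, overrides=None):
    """Map a dict-like CMS node to triples: the node URL becomes the
    subject, each field gets a guessed predicate, and `overrides` lets
    the site owner replace the system's guess for a given field."""
    g = Graph()
    g.bind("sioc", SIOC)
    g.bind("dc", DC)
    subject = URIRef(node["url"])          # the node itself is the subject
    guessed = {                            # predicates the system would guess
        "title": DC.title,
        "body": SIOC.content,
        "author": DC.creator,
    }
    guessed.update(overrides or {})        # the site owner's choice wins
    for field, predicate in guessed.items():
        if field in node:
            g.add((subject, predicate, Literal(node[field])))
    return g

# e.g. override the guessed predicate for "author" with foaf:maker
node = {"url": "http://example.org/node/42",
        "title": "Hello world", "author": "Paola"}
print(node_to_triples(node, {"author": FOAF.maker}).serialize(format="turtle"))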

clues welcome

Paola Di Maio



On Mon, Oct 20, 2008 at 4:28 PM, Stephane Corlosquet
<scorlosquet@gmail.com> wrote:
> Hi all,
>
> Popular content management systems have a great role to play in
> democratizing the semantic web. Some CMSs, like Drupal, have already understood
> this and are rapidly moving towards exposing their content as RDF data.
> Because so many people are using them, there is great potential in implementing
> some built-in semantic web features, and that's what's happening with Drupal
> 7, which will ship with RDFa support in core. Drupal will then be part of
> category A2, with more than 30,000 RDFa-enabled sites!
>
> --
> Stéphane,
> scor @ drupal.org
> http://drupal.org/user/52142
>
> On Mon, Oct 20, 2008 at 9:55 AM, Andreas Langegger <al@jku.at> wrote:
>>
>> Hi,
>> It's all happening, but it's not as easy as one might think at first.
>> Basically, there are multiple sources of structured/interlinked information
>> (A) and multiple ways to expose (B) linked data on the Web; a consumer-side
>> sketch of the (B) options follows the lists below.
>> (A)
>> 1. generated (wrapped) from information systems (RDBMS, etc) => needs
>> mapping
>> 2. user-generated (natively RDF-based systems, Semantic Wikis, etc.) =>
>> already in the right form
>> 3. extracted (AI, heuristics, cypher, etc. - different levels of
>> granularity; difficult, sometimes wrong)
>> (B)
>> 1. RDF documents
>> 2. SPARQL endpoints
>> 3. embedded into HTML (RDFa)
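>>
>> To make (B) concrete, here is a minimal consumer-side sketch in Python,
>> assuming rdflib and SPARQLWrapper are installed; the DBpedia URLs are only
>> examples, not part of any of the systems above:
>>
>> from rdflib import Graph
>> from SPARQLWrapper import SPARQLWrapper, JSON
>>
>> # B1: fetch and parse a plain RDF document
>> g = Graph()
>> g.parse("http://dbpedia.org/data/Linz.rdf")   # example document URL
>> print(len(g), "triples loaded")
>>
>> # B2: query a public SPARQL endpoint
>> sparql = SPARQLWrapper("http://dbpedia.org/sparql")
>> sparql.setQuery("""
>>     PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
>>     SELECT ?label WHERE {
>>       <http://dbpedia.org/resource/Linz> rdfs:label ?label .
>>     } LIMIT 5
>> """)
>> sparql.setReturnFormat(JSON)
>> for row in sparql.query().convert()["results"]["bindings"]:
>>     print(row["label"]["value"])
>>
>> # B3 (RDFa embedded in HTML) needs an RDFa extractor/distiller first;
>> # once the triples are extracted, they are handled exactly like B1.
>>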
>> The Linked Data Community project plays an important role regarding A1 and
>> A2. A3 is cumbersome and may produce wrong links and information - a
>> nightmare without implicit support for provenance. In corporate environments
>> A3 is already very popular, but at the broader Web scale I'm a bit sceptical
>> that this will work well. What do you think?
>> Regards,
>> AndyL
>>
>>
>> On Oct 20, 2008, at 10:35 AM, Kannan Rajkumar wrote:
>>
>> Hi Mr. Ravinder
>>
>> It is a nice idea: why can't we transform web content into semantic web
>> content?
>>
>> This is a necessity, and it would avoid regenerating web content as semantic
>> web data.
>>
>> I am also focusing in this direction.
>>
>> With regards,
>>
>> Dr. Rajkumar Kannan
>> Associate Professor
>> Dept. of Computer Science
>> Bishop Heber College, Tiruchirappalli, TN, India
>> URL: http://member.acm.org/~rajkumark/
>>
>>
>> ===================================================
>>
>> On 10/20/08, रविंदर ठाकुर (ravinder thakur) <ravinderthakur@gmail.com>
>> wrote:
>>>
>>> any thoughts on this...
>>>
>>>
>>> On Mon, Oct 20, 2008 at 12:38 AM, ravinder thakur
>>> <ravinderthakur@gmail.com> wrote:
>>>>
>>>> Hello friends,
>>>>
>>>> I have been following the semantic web for some time now and have seen quite
>>>> a lot of projects being run (DBpedia, FOAF, etc.) trying to generate some
>>>> semantic content. While these approaches might have been successful in their
>>>> goals, one major problem plaguing the semantic web as a whole is the lack of
>>>> semantic content. Unfortunately, there is nothing in sight that we can rely
>>>> on to generate semantic content for the truckloads of information being put
>>>> on the web every day. I think one of the _wrong_ assumptions in the semantic
>>>> web community is that content creators will create semantic data, which I
>>>> think is too much to ask of even the more technically sound part of the web
>>>> community, let alone the whole web community. It hasn't happened over the
>>>> last many years, and I don't see it happening in the near future.
>>>>
>>>> I think what we need to move the semantic web forward is a mechanism to
>>>> _automatically_ convert the information on the web into semantic
>>>> information. There are many software packages/services that can be used for
>>>> this purpose. I am currently developing a prototype for this. The
>>>> prototype uses services from OpenCalais (http://www.opencalais.com/) to
>>>> convert ordinary text to semantic form. This service is very limited in what
>>>> entities it supports at the moment, but it's a very good start. I am pretty
>>>> sure there are many other good options available that might be unknown to me.
>>>> The currently very primitive prototype can be seen at
>>>> http://arcse.appspot.com. It currently implements very few of the ideas I
>>>> have for this. It is hosted on Google's AppEngine, so it sometimes gives
>>>> timeout messages internally; please bear with this :).
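>>>>
>>>> For anyone curious what the general pattern looks like, here is a rough
>>>> sketch in Python with requests and rdflib. The endpoint URL, parameters and
>>>> response format below are placeholders, not the actual OpenCalais API -
>>>> please check their documentation for the real calls:
>>>>
>>>> import requests
>>>> from rdflib import Graph, Literal, Namespace, URIRef
>>>>
>>>> EX = Namespace("http://example.org/ns#")   # placeholder vocabulary
>>>>
>>>> def extract_entities(text):
>>>>     """Placeholder call to an entity-extraction web service."""
>>>>     resp = requests.post("https://extractor.example.org/enrich",  # placeholder
>>>>                          data={"content": text, "apikey": "YOUR_KEY"})
>>>>     resp.raise_for_status()
>>>>     # assume the service answers with a list like [{"name": ..., "type": ...}]
>>>>     return resp.json()
>>>>
>>>> def text_to_rdf(page_url, text):
>>>>     g = Graph()
>>>>     g.bind("ex", EX)
>>>>     page = URIRef(page_url)
>>>>     for entity in extract_entities(text):
>>>>         node = EX[entity["name"].replace(" ", "_")]
>>>>         g.add((page, EX.mentions, node))                  # page mentions entity
>>>>         g.add((node, EX.entityType, Literal(entity["type"])))
>>>>     return g
>>>>
>>>> g = text_to_rdf("http://example.org/article/1",
>>>>                 "Drupal 7 will ship with RDFa support in core.")
>>>> print(g.serialize(format="turtle"))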
>>>>
>>>> This automatic conversion, however, is not a simple task and needs a lot of
>>>> work in domains ranging from NLP to artificial intelligence to the semantic
>>>> web to logic, etc. So that's why this mail: I will be more than happy if we
>>>> can join together to form a like-minded team that can work on solving this
>>>> most important problem plaguing the semantic web currently.
>>>>
>>>> Waiting for your suggestions/criticisms
>>>> Ravinder Thakur
>>>>
>>>
>>
>>
>>
>>
>> Web of Data Practitioners Days / Oct 22-23 / Vienna
>> http://www.webofdata.info
>> ----------------------------------------------------------------------
>> Dipl.-Ing.(FH) Andreas Langegger
>> Institute for Applied Knowledge Processing
>> Johannes Kepler University Linz
>> A-4040 Linz, Altenberger Straße 69
>> http://www.langegger.at
>>
>>
>>
>
>



-- 
Paola Di Maio
School of IT
www.mfu.ac.th
*********************************************
