
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation

From: Martin Hepp (UniBW) <martin.hepp@ebusiness-unibw.org>
Date: Mon, 29 Jun 2009 15:01:04 +0200
Message-ID: <4A48BB10.9040802@ebusiness-unibw.org>
To: Yihong Ding <ding@cs.byu.edu>
CC: Kingsley Idehen <kidehen@openlinksw.com>, semantic-web@w3.org, public-lod@w3.org, semantic-web at W3C <semantic-web@w3c.org>
Hi Yihong:
I am a big fan of Codd's "one fact in one place" credo. However, in this
particular case, that principle is violated anyway, since the literal
values are often duplicated for presentation and meta-data purposes
(think of "2009-06-29" vs. "June 29, 2009"). Second, for dynamic
Web apps, it does not really matter whether the same fact is exposed
once or twice, since the single authoritative copy lives in the database.
Third, this is the only way in which a tool like the GoodRelations
annotator [1] can create RDFa snippets for simple copy-and-paste into
existing pages.
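
To illustrate the second point, here is a minimal sketch in Python of how a
dynamic Web app might render both the visible text and a separate RDFa block
from one database record. The record layout, the function name render_offer,
and the example.org URI are hypothetical. The use of gr:validThrough with a
plain date is a simplification and has not been checked against the
GoodRelations specification.

```python
from datetime import date
from html import escape

MONTHS = ("January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December")


def render_offer(record):
    """Return (visible_html, rdfa_block), both derived from the same record."""
    valid = record["valid_through"]

    # Presentation layer: the date is formatted for human readers.
    human_date = f"{MONTHS[valid.month - 1]} {valid.day}, {valid.year}"
    visible = (
        f"<h1>{escape(record['title'])}</h1>\n"
        f"<p>Offer valid through {human_date}.</p>"
    )

    # Data layer: the same fact goes into a "content" attribute in ISO form,
    # so the span itself stays empty and is not tied to the visible markup.
    # Property and class names are illustrative assumptions.
    rdfa = (
        f'<div xmlns:gr="http://purl.org/goodrelations/v1#" '
        f'about="{escape(record["uri"])}" typeof="gr:Offering">\n'
        f'  <span property="gr:validThrough" '
        f'content="{valid.isoformat()}"></span>\n'
        f"</div>"
    )
    return visible, rdfa


if __name__ == "__main__":
    offer = {
        "uri": "http://example.org/offers/car#offer",  # hypothetical URI
        "title": "This is the car I want to sell",
        "valid_through": date(2009, 7, 31),
    }
    page_text, data_block = render_offer(offer)
    print(page_text)
    print(data_block)
```

The fact itself is stored once (in the record); the page merely exposes it in
two forms, and the RDFa block can be pasted into an existing page without
touching the presentation markup.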

Also note that in the particular case of RDFa, the principle of "one
fact in one place" clashes with the "separation of concerns" principle,
in particular, that of keeping data and presentation separate.

The textbook-style "beauty of simplicity" of RDFa holds for adding a
dc:creator property to a string value that is the same for presentation
and at the data level. Beyond that, RDFa can create code that is very
hard to maintain. In fact, I know that a large software company
dismissed the use of RDFa in their products because of the unmanageable
mix of conceptual and presentation layer.

As far as security is concerned: I think there is no real difference in my
proposal, as the "content" attribute of RDFa already allows serving different
data to humans and to machines, and this is a needed feature anyway.
Digital signatures at the document or element level and / or data
provenance approaches will likely cater for that.

Best

Martin

Yihong Ding wrote:
> Hi Kingsley and Martin,
>
> A potential problem of the model Martin suggested is that the same data has
> to be presented at least TWICE in one document. Although the RDFa portion of
> the data is supposed to be automatically generated, it, however, does not
> prohibit anybody from manually revising it. Therefore, it leaves a huge hole
> for hackers (or anybody who wants to do some deceptive job). In our
> imperfect world, this problem is severe.
>
> Adding an extra layer of data mapping always causes additional work on data
> maintenance. This time, the extra work could be a nightmare though the
> architecture is neat.
>
> yihong
>
>
> On Mon, Jun 29, 2009 at 8:03 AM, Kingsley Idehen <kidehen@openlinksw.com> wrote:
>
>   
>> Martin Hepp (UniBW) wrote:
>>
>>     
>>> Hi Tom:
>>>
>>>       
>>>> Amen. Thank you for writing this. I completely agree. RDFa has some
>>>> great use cases but (like any technology) has its limitations. Let's
>>>> not oversell it.
>>>>         
>>> We seem to agree on the observation, but not on the conclusion. What I
>>> want and suggest is using RDFa also for exchanging a bit more complex RDF
>>> models / data by simply using a lot of div / span or whatever elements that
>>> represent the RDF part in the SAME document BUT NOT too closely linked with
>>> the presentation level.
>>>
>>> <body>
>>> <h1>This is the car I want to sell</h1>
>>> Actually, a pretty cool car, for only $1.000. Offer valid through July 31,
>>> 2009
>>>
>>> <span>
>>> ... my whole RDF in RDFa
>>>  </span>
>>> </body>
>>>
>>> The advantage of that would be that
>>>
>>> - you just have to maintain ONE file,
>>> - data and metadata are close by, so the likelihood of being up to date
>>> increases, and
>>> - at the same time, the code does not get too messy.
>>> - Also - no problems setting up the server (*).
>>> - Easy to create on-line tools that generate RDFa snippets for simple
>>> pasting.
>>> - Yahoo and Google will most likely honor RDFa meta-data only.
>>>
>>> Also note that often the literal values will be in content attributes
>>> anyway, because the string for the presentation is not suitable as meta-data
>>> content anyway (e.g.  dates, country codes,...)
>>>
>>> I think the approach sketched above would be a cheap and useful way of
>>> publishing RDF meta-data. It could work with CMS / blogging software etc.
>>>  Imagine if we were able to allow eBay sellers to put GoodRelations
>>> meta-data directly into the open XHTML part of their product description.
>>>
>>> The main problem with my proposal is that there is the risk that Google
>>> considers this "cloaking" and may remove respective resources from their
>>> index (Mark raised that issue). If that risk was confirmed, we would really
>>> have a problem. Imagine me selling Semantic Web markup as a step beyond SEO
>>> ... and the first consequence of following my advice is being removed from
>>> the Google index.
>>>
>>> A second problem is that if the document contains nodes that have no
>>> counterpart on the presentation level (e.g. intermediate nodes for holding
>>> n-ary relations), then they will also not be dereferenceable. The same holds
>>> for URIs or nodes that are outside the scope of the actual RDFa / XHTML
>>> document - I see no simple way of serving either XHTML or RDF content for
>>> those.
>>>
>>>       
>> Martin,
>>
>> If Google doesn't see invisible DIVs as cloaking, the issue vaporizes.
>>
>> Also, if people take the SEO + SDQ (Linked Data Expressed in RDFa) approach
>> they will at least remain in the Google index via usual SEO oriented keyword
>> gimmickry, albeit generally suboptimal.
>>
>> If we make a recipe doc showcasing these issues, we will more than likely
>> get Google to recalibrate back to the Web; especially if we can demonstrate
>> that other search engine players that support RDFa are not
>> afflicted with the same cloaking myopia.
>>
>> Kingsley
>>
>>     
>>> Best
>>>
>>> Martin
>>>
>>>
>>>
>>> Tom Heath wrote:
>>>
>>>       
>>>> Martin,
>>>>
>>>> 2009/6/27 Martin Hepp (UniBW) <martin.hepp@ebusiness-unibw.org>:
>>>>
>>>>
>>>>         
>>>>> So if this "hidden div / span" approach is not feasible, we have a
>>>>> problem.
>>>>>
>>>>> The reason is that, as beautiful as the idea is of using RDFa to make
>>>>> a) the human-readable presentation and b) the machine-readable
>>>>> meta-data link to the same literals, it becomes problematic in
>>>>> practice once the structures of a) and b) are very different.
>>>>>
>>>>> For very simple property-value pairs, embedding RDFa markup is no
>>>>> problem. But if you have a bit more complexity at the conceptual level
>>>>> and in particular if there are significant differences to the
>>>>> structure of the presentation (e.g. in terms of granularity, ordering
>>>>> of elements, etc.), it gets very, very messy and hard to maintain.
>>>>>
>>>>>
>>>>>           
>>>> Amen. Thank you for writing this. I completely agree. RDFa has some
>>>> great use cases but (like any technology) has its limitations. Let's
>>>> not oversell it.
>>>>
>>>> Tom.
>>>>
>>>>
>>>>
>>>>         
>>> --
>>> --------------------------------------------------------------
>>> martin hepp
>>> e-business & web science research group
>>> universitaet der bundeswehr muenchen
>>>
>>> e-mail:  mhepp@computer.org
>>> phone:   +49-(0)89-6004-4217
>>> fax:     +49-(0)89-6004-4620
>>> www:     http://www.unibw.de/ebusiness/ (group)
>>>         http://www.heppnetz.de/ (personal)
>>> skype:   mfhepp twitter: mfhepp
>>>
>>> Check out the GoodRelations vocabulary for E-Commerce on the Web of Data!
>>> ========================================================================
>>>
>>> Webcast:
>>> http://www.heppnetz.de/projects/goodrelations/webcast/
>>>
>>> Talk at the Semantic Technology Conference 2009: "Semantic Web-based
>>> E-Commerce: The GoodRelations Ontology"
>>> http://tinyurl.com/semtech-hepp
>>>
>>> Tool for registering your business:
>>> http://www.ebusiness-unibw.org/tools/goodrelations-annotator/
>>>
>>> Overview article on Semantic Universe:
>>> http://tinyurl.com/goodrelations-universe
>>>
>>> Project page and resources for developers:
>>> http://purl.org/goodrelations/
>>>
>>> Tutorial materials:
>>> Tutorial at ESWC 2009: The Web of Data for E-Commerce in One Day: A
>>> Hands-on Introduction to the GoodRelations Ontology, RDFa, and Yahoo!
>>> SearchMonkey
>>>
>>> http://www.ebusiness-unibw.org/wiki/GoodRelations_Tutorial_ESWC2009
>>>
>>>
>>>
>>>
>>>
>>>       
>> --
>>
>>
>> Regards,
>>
>> Kingsley Idehen       Weblog: http://www.openlinksw.com/blog/~kidehen
>> President & CEO OpenLink Software     Web: http://www.openlinksw.com
>>
>>
>>
>>
>>
>>
>>     
>
>
>   

Received on Monday, 29 June 2009 13:01:50 UTC

This archive was generated by hypermail 2.3.1 : Sunday, 31 March 2013 14:24:21 UTC