Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation from Yihong Ding on 2009-06-29 (public-lod@w3.org from June 2009)

From: Yihong Ding <ding@cs.byu.edu>
Date: Mon, 29 Jun 2009 09:46:21 -0400
To: martin.hepp@ebusiness-unibw.org
Cc: Kingsley Idehen <kidehen@openlinksw.com>, semantic-web@w3.org, public-lod@w3.org, semantic-web at W3C <semantic-web@w3c.org>
Message-ID: <8cbe5b450906290646lf2a5f9ci8593beb07c5e611f@mail.gmail.com>

Hi Martin,

I agree with most of your points, especially the data representation
architecture you suggest. The one point I would like to emphasize is that
we should look for a way to avoid storing a fact multiple times. Even
though you think the duplication may be inevitable, I personally still
believe it can be avoided.

A good analogy is CSS, which is a great example for data display. With
CSS, we do not have to store the same data multiple times, yet we still
reach the goal of flexible data display. CSS is certainly not directly
applicable in the semantic realm, but I believe it is the right way of
thinking for our problem.
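
To make the "store once, display flexibly" idea concrete, here is a minimal
Python sketch. It is only an illustration, not part of Martin's proposal or
of any existing tool. It keeps a single date value and derives both the
human-readable text and the RDFa "content" literal from it. The function
name rdfa_date_span and the property ex:validThrough are made up for this
example.

```python
from datetime import date
from html import escape

# English month names are hard-coded so the output does not depend on locale.
MONTHS = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)


def rdfa_date_span(prop: str, value: date) -> str:
    """Render one stored date as display text plus an RDFa `content` literal.

    `prop` is a hypothetical property name chosen by the caller.
    """
    machine = value.isoformat()  # machine-readable form, e.g. 2009-07-31
    human = f"{MONTHS[value.month - 1]} {value.day}, {value.year}"
    return (
        f'<span property="{escape(prop, quote=True)}" content="{machine}">'
        f"{escape(human)}</span>"
    )


print(rdfa_date_span("ex:validThrough", date(2009, 7, 31)))
# prints: <span property="ex:validThrough" content="2009-07-31">July 31, 2009</span>
```

The fact lives in one place (the date value). Both strings in the markup are
derived from it, so they can only drift apart if someone edits the generated
output by hand.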

Actually, the philosophy of Microformats is closer to CSS, though
Microformats are a much more limited mechanism. I envision an innovation in
semantic data display that combines the strengths of Microformats and
RDFa/RDF. But it is surely not easy.

BTW: I like your work on GoodRelations. I am now working in the radiological
medicine domain and trying to develop something like it. (And indeed, data
display is a critical issue for me to solve in the project.) Hopefully we
will have a chance to cooperate again in the future.

cheers,

yihong


On Mon, Jun 29, 2009 at 9:01 AM, Martin Hepp (UniBW) <martin.hepp@ebusiness-unibw.org> wrote:

> Hi Yihong:
> I am a big fan of Codd's "one fact in one place" credo. However, in this
> particular case, that principle is violated anyway, since the literal
> values are often duplicated for presentation and meta-data purposes
> (think of "2009-06-29" vs. "June 29, 2009"). Second, for dynamic
> Web apps, it does not really matter whether the same fact is exposed
> once or twice, since the central location is one place in the database
> anyway. Third, this is the only way a tool like the GoodRelations
> annotator [1] can create RDFa snippets for simple copy-and-paste into
> existing pages.
>
> Also note that in the particular case of RDFa, the principle of "one
> fact in one place" clashes with the "separation of concerns" principle,
> in particular, that of keeping data and presentation separate.
>
> The textbook-style "beauty of simplicity" of RDFa holds for adding a
> dc:creator property to a string value that is the same for presentation
> and at the data level. Beyond that, RDFa can create code that is very
> hard to maintain. In fact, I know that a large software company
> dismissed the use of RDFa in their products because of the unmanageable
> mix of the conceptual and presentation layers.
>
> As far as security is concerned: I think there is no real difference in
> my proposal, as the "content" attribute of RDFa already allows serving
> different data to humans and to machines, and this is a needed feature
> anyway. Digital signatures at the document or element level and / or
> data provenance approaches will likely cater for that.
>
> Best
>
> Martin
>
> Yihong Ding wrote:
>
>> Hi Kingsley and Martin,
>>
>> A potential problem of the model Martin suggested is that the same data
>> has to be presented at least TWICE in one document. Although the RDFa
>> portion of the data is supposed to be automatically generated, nothing
>> prevents anybody from manually revising it. Therefore, it leaves a huge
>> hole for hackers (or anybody who wants to do some deceptive job). In our
>> imperfect world, this problem is severe.
>>
>> Adding an extra layer of data mapping always causes additional work on
>> data maintenance. This time, the extra work could be a nightmare even
>> though the architecture is neat.
>>
>> yihong
>>
>>
>> On Mon, Jun 29, 2009 at 8:03 AM, Kingsley Idehen <kidehen@openlinksw.com> wrote:
>>
>>
>>
>>> Martin Hepp (UniBW) wrote:
>>>
>>>
>>>
>>>> Hi Tom:
>>>>
>>>>
>>>>
>>>>> Amen. Thank you for writing this. I completely agree. RDFa has some
>>>>> great use cases but (like any technology) has its limitations. Let's
>>>>> not oversell it.
>>>>>
>>>>>
>>>> We seem to agree on the observation, but not on the conclusion. What I
>>>> want and suggest is using RDFa also for exchanging somewhat more complex
>>>> RDF models / data by simply using a lot of div / span or other elements
>>>> that represent the RDF part in the SAME document BUT NOT too closely
>>>> linked with the presentation level.
>>>>
>>>> <body>
>>>> <h1>This is the car I want to sell</h1>
>>>> Actually, a pretty cool car, for only $1.000. Offer valid through
>>>> July 31, 2009
>>>>
>>>> <span>
>>>> ... my whole RDF in RDFa
>>>> </span>
>>>> </body>
>>>>
>>>> The advantage of that would be that
>>>>
>>>> - you just have to maintain ONE file,
>>>> - data and metadata are close by, so the likelihood of being up to date
>>>> increases, and
>>>> - at the same time, the code does not get too messy.
>>>> - Also - no problems setting up the server (*).
>>>> - Easy to create on-line tools that generate RDFa snippets for simple
>>>> pasting.
>>>> - Yahoo and Google will most likely honor RDFa meta-data only.
>>>>
>>>> Also note that often the literal values will be in content attributes
>>>> anyway, because the string for the presentation is not suitable as
>>>> meta-data content (e.g. dates, country codes, ...).
>>>>
>>>> I think the approach sketched above would be a cheap and useful way of
>>>> publishing RDF meta-data. It could work with CMS / blogging software
>>>> etc. Imagine if we were able to allow eBay sellers to put GoodRelations
>>>> meta-data directly into the open XHTML part of their product
>>>> description.
>>>>
>>>> The main problem with my proposal is the risk that Google considers
>>>> this "cloaking" and may remove the respective resources from their
>>>> index (Mark raised that issue). If that risk were confirmed, we would
>>>> really have a problem. Imagine me selling Semantic Web markup as a step
>>>> beyond SEO ... and the first consequence of following my advice is
>>>> being removed from the Google index.
>>>>
>>>> A second problem is that if the document contains nodes that have no
>>>> counterpart on the presentation level (e.g. intermediate nodes for
>>>> holding n-ary relations), then they will also not be dereferenceable.
>>>> The same holds for URIs or nodes that are outside the scope of the
>>>> actual RDFa / XHTML document - I see no simple way of serving either
>>>> XHTML or RDF content for those.
>>>>
>>>>
>>>>
>>> Martin,
>>>
>>> If Google doesn't see invisible DIVs as cloaking, the issue vaporizes.
>>>
>>> Also, if people take the SEO + SDQ (Linked Data Expressed in RDFa)
>>> approach, they will at least remain in the Google index via the usual
>>> SEO-oriented keyword gimmickry, albeit generally suboptimal.
>>>
>>> If we make a recipe doc showcasing these issues, we will more than likely
>>> get Google to recalibrate back to the Web, especially if we can
>>> demonstrate that other search engine players that support RDFa are not
>>> afflicted with the same cloaking myopia.
>>>
>>> Kingsley
>>>
>>>
>>>
>>>> Best
>>>>
>>>> Martin
>>>>
>>>>
>>>>
>>>> Tom Heath wrote:
>>>>
>>>>
>>>>
>>>>> Martin,
>>>>>
>>>>> 2009/6/27 Martin Hepp (UniBW) <martin.hepp@ebusiness-unibw.org>:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> So if this "hidden div / span" approach is not feasible, we have a
>>>>>> problem.
>>>>>>
>>>>>> The reason is that, as beautiful as the idea is of using RDFa to make
>>>>>> a) the human-readable presentation and b) the machine-readable
>>>>>> meta-data link to the same literals, it is just as problematic in
>>>>>> reality once the structures of a) and b) are very different.
>>>>>>
>>>>>> For very simple property-value pairs, embedding RDFa markup is no
>>>>>> problem. But if you have a bit more complexity at the conceptual level
>>>>>> and in particular if there are significant differences to the
>>>>>> structure of the presentation (e.g. in terms of granularity, ordering
>>>>>> of elements, etc.), it gets very, very messy and hard to maintain.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>> Amen. Thank you for writing this. I completely agree. RDFa has some
>>>>> great use cases but (like any technology) has its limitations. Let's
>>>>> not oversell it.
>>>>>
>>>>> Tom.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>> --
>>>> --------------------------------------------------------------
>>>> martin hepp
>>>> e-business & web science research group
>>>> universitaet der bundeswehr muenchen
>>>>
>>>> e-mail:  mhepp@computer.org
>>>> phone:   +49-(0)89-6004-4217
>>>> fax:     +49-(0)89-6004-4620
>>>> www:     http://www.unibw.de/ebusiness/ (group)
>>>>        http://www.heppnetz.de/ (personal)
>>>> skype:   mfhepp twitter: mfhepp
>>>>
>>>> Check out the GoodRelations vocabulary for E-Commerce on the Web of
>>>> Data!
>>>> ========================================================================
>>>>
>>>> Webcast:
>>>> http://www.heppnetz.de/projects/goodrelations/webcast/
>>>>
>>>> Talk at the Semantic Technology Conference 2009: "Semantic Web-based
>>>> E-Commerce: The GoodRelations Ontology"
>>>> http://tinyurl.com/semtech-hepp
>>>>
>>>> Tool for registering your business:
>>>> http://www.ebusiness-unibw.org/tools/goodrelations-annotator/
>>>>
>>>> Overview article on Semantic Universe:
>>>> http://tinyurl.com/goodrelations-universe
>>>>
>>>> Project page and resources for developers:
>>>> http://purl.org/goodrelations/
>>>>
>>>> Tutorial materials:
>>>> Tutorial at ESWC 2009: The Web of Data for E-Commerce in One Day: A
>>>> Hands-on Introduction to the GoodRelations Ontology, RDFa, and Yahoo!
>>>> SearchMonkey
>>>>
>>>> http://www.ebusiness-unibw.org/wiki/GoodRelations_Tutorial_ESWC2009
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>> --
>>>
>>>
>>> Regards,
>>>
>>> Kingsley Idehen       Weblog: http://www.openlinksw.com/blog/~kidehen
>>> President & CEO OpenLink Software     Web: http://www.openlinksw.com
>>>
>
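
On the maintenance worry from my earlier message quoted above (the RDFa
copy can be revised by hand and drift from the real data), one possible
safeguard is to re-extract the literals from the published page and compare
them with the source record. The following is a rough Python sketch under
stated assumptions: the class and function names, the ex: property names,
and the flat property-to-value record are all hypothetical, and it only
looks at literals carried in "content" attributes. It is not a full RDFa
processor. html.parser is from the Python standard library.

```python
from html.parser import HTMLParser


class RdfaLiteralCollector(HTMLParser):
    """Collect property -> content pairs from elements carrying both attributes."""

    def __init__(self):
        super().__init__()
        self.literals = {}

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if "property" in attr and "content" in attr:
            self.literals[attr["property"]] = attr["content"]


def stale_properties(page_html, record):
    """Return the properties whose published literal is missing or differs
    from the value in the source record (hypothetical flat dict)."""
    collector = RdfaLiteralCollector()
    collector.feed(page_html)
    return sorted(
        prop for prop, value in record.items()
        if collector.literals.get(prop) != value
    )


# Hypothetical source record and a page whose date was edited by hand.
record = {"ex:price": "1000", "ex:validThrough": "2009-07-31"}
page = (
    '<span property="ex:price" content="1000"></span>'
    '<span property="ex:validThrough" content="2009-08-31"></span>'
)
print(stale_properties(page, record))
```

A check like this does not remove the duplication, but it could flag a
revised copy before a consumer of the data relies on it.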


-- 
===================================
Yihong Ding

http://yihongs-research.blogspot.com/
Received on Monday, 29 June 2009 13:47:05 UTC