Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation from John Graybeal on 2009-06-26 (public-lod@w3.org from June 2009)

From: John Graybeal <graybeal@mbari.org>
Date: Thu, 25 Jun 2009 21:56:54 -0700
To: martin.hepp@ebusiness-unibw.org
Cc: Danny Ayers <danny.ayers@gmail.com>, bill.roberts@planet.nl, public-lod@w3.org, semantic-web at W3C <semantic-web@w3c.org>
Message-Id: <5B52C26E-6A9F-43C4-9789-3310B404E39B@mbari.org>
Just because it's on your server doesn't mean the visitor to the  
restaurant's web page has to know that.  (Does it?)  Hmm, maybe that  
takes us back to the .htaccess argument....

I agree the shop owner has to feel ownership.  So whatever solution  
you choose, the shop owner has to have access to the tool which  
enables its easy use, in their language and context.   I mention this  
because I don't know if the snippet solution will pass that test. It  
will be cool if it does.  (Please let us know how it turns out; you  
are doing the cutting-edge research here! I find what you are doing very  
exciting.)

John

On Jun 25, 2009, at 12:26 PM, Martin Hepp (UniBW) wrote:

> Hi John:
> We also thought of hosting meta-data for the users, but I don't like  
> that because I want the shop operators to feel ownership for the data:
> If the opening hours expressed in RDF are wrong but on the personal  
> Web page of that restaurant, anybody facing closed doors will blame  
> the restaurant.
> If the outdated opening hours in RDF are on my SW server, the  
> unlucky customer will blame the Semantic Web for having crappy data.
>
> So maybe the snippet solution in RDFa is the best.
>
> Best
> Martin
>
>
> John Graybeal wrote:
>> This is a principal reason MMI decided to offer a vocabulary server  
>> for its community. The idea that 1000 different providers would all  
>> develop a level of web competency (for which there is evidence at  
>> only a minority of providers) for serving their RDF and OWL content  
>> -- let alone the capability to do versioning, adopt best practices,  
>> learn SKOS, and whatever other nuances are called for -- seemed  
>> like a non-starter.
>>
>> This is not exactly the same problem you're facing, but something  
>> to consider (if the model allows it) is creating a way to serve the  
>> annotations from another place than the host institution.  The  
>> institution can refer to those served files from their own sites,  
>> and even update them remotely, but not have to incur all the  
>> management overhead as standards improve, files change, authorship  
>> changes, etc.
>>
>> (Which is not to disagree with your plan either. That sounds fine.)
>>
>> One other delivery model could be for them to give you an existing  
>> HTML page, and you give them back the modified HTML (saves them  
>> cutting and pasting steps?).
>>
>> I'm a little ignorant on your tools and processes, so apologies if  
>> these are non-starters.
>>
>> John
>>
>>
>> On Jun 25, 2009, at 9:44 AM, Martin Hepp (UniBW) wrote:
>>
>>> Hi all:
>>>
>>> After about two months of helping people generate RDF/XML metadata  
>>> for their businesses using the GoodRelations annotator [1],
>>> I have quite some evidence that the current best practices of  
>>> using .htaccess are a MAJOR bottleneck for the adoption of  
>>> Semantic Web technology.
>>>
>>> Just some data:
>>> - We have several hundred entries in the annotator log - most  
>>> people spend 10 or more minutes to create a reasonable description  
>>> of themselves.
>>> - Even though they all operate some sort of Web sites, less than  
>>> 30 % of them manage to upload/publish a single *.rdf file in their  
>>> root directory.
>>> - Of those 30%, only a fraction manage to set up content  
>>> negotiation properly, even though we provide a step-by-step recipe.
>>>
>>> The effects are
>>> - URIs that are not dereferenceable,
>>> - incorrect media types, and
>>> - other problems.
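
The two effects named above (URIs that do not dereference, and wrong media types) can be checked mechanically. What follows is only a minimal sketch, assuming the file is meant to be served as RDF/XML with the media type application/rdf+xml; the function names diagnose and check_uri are made up for this illustration and are not part of the annotator.

```python
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen

# Assumption: the published file is RDF/XML, so this is the expected media type.
RDF_XML = "application/rdf+xml"


def diagnose(status, content_type):
    """Return a list of problems for one HTTP response (empty list = looks fine)."""
    problems = []
    if status != 200:
        problems.append(f"not dereferenceable (HTTP {status})")
        return problems
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = (content_type or "").split(";")[0].strip().lower()
    if media_type != RDF_XML:
        problems.append(f"incorrect media type: {media_type or 'none'}")
    return problems


def check_uri(uri, timeout=10):
    """Request `uri` asking for RDF/XML and diagnose the final response.

    urlopen follows redirects, so the diagnosis applies to the last hop.
    """
    request = Request(uri, headers={"Accept": RDF_XML})
    try:
        with urlopen(request, timeout=timeout) as response:
            return diagnose(response.status, response.headers.get("Content-Type"))
    except HTTPError as error:
        # 4xx/5xx answers are raised as HTTPError; report the status code.
        return diagnose(error.code, None)
    except URLError as error:
        # DNS failures, refused connections and similar.
        return [f"not dereferenceable ({error.reason})"]
```

Calling check_uri on the address of an uploaded *.rdf file returns an empty list when the file is reachable and typed as RDF/XML, and otherwise a short description of what went wrong.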
>>>
>>> When investigating the causes and trying to help people, we  
>>> encountered a variety of configurations and causes that we did not  
>>> expect. It turned out that just helping people manage this tiny  
>>> step of publishing Semantic Web data would turn into a full-time  
>>> job for 1 - 2 administrators.
>>>
>>> Typical causes of problems are
>>> - Lack of privileges for .htaccess (many cheap hosting packages  
>>> give limited or no access to .htaccess)
>>> - Users without Unix background had trouble naming a file so that it  
>>> begins with a dot
>>> - Microsoft IIS requires completely different recipes
>>> - Many users have access just at a CMS level
>>>
>>> Bottom line:
>>> - For researchers in the field, it is a doable task to set up an  
>>> Apache server so that it serves RDF content according to current  
>>> best practices.
>>> - For most people out there, this is often a prohibitively  
>>> difficult task, both because of a lack of skills and because the  
>>> variety of technical environments turns what is easy at the  
>>> textbook level into an engineering challenge.
>>>
>>> As a consequence, we will modify our tool so that it generates  
>>> "dummy" RDFa code with span/div that *just* represents the meta- 
>>> data without interfering with the presentation layer.
>>> That can then be inserted as code snippets via copy-and-paste to  
>>> any XHTML document.
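
A rough idea of what such a copy-and-paste snippet generator might look like is sketched below. This is not the annotator's actual output: the function name rdfa_snippet, the subject "#business" and the sample business name are hypothetical, and the sketch assumes RDFa 1.0 attributes together with the GoodRelations terms gr:BusinessEntity and gr:legalName. The values sit in `content` attributes of empty spans, so the block carries the meta-data without adding visible text to the page.

```python
from html import escape

# Assumption: the GoodRelations namespace behind http://purl.org/goodrelations/
GR_NS = "http://purl.org/goodrelations/v1#"


def rdfa_snippet(subject, rdf_type, properties):
    """Build an RDFa block whose spans are empty, so they render no text.

    `properties` maps prefixed property names to literal values.
    """
    lines = [
        f'<div xmlns:gr="{GR_NS}" about="{escape(subject)}" typeof="{escape(rdf_type)}">'
    ]
    for name, value in properties.items():
        # escape() also escapes quotes, which keeps attribute values well-formed.
        lines.append(
            f'  <span property="{escape(name)}" content="{escape(value)}"></span>'
        )
    lines.append("</div>")
    return "\n".join(lines)


# Hypothetical example data, for illustration only.
print(rdfa_snippet("#business", "gr:BusinessEntity", {"gr:legalName": "Example Pizza & Pasta"}))
```

The printed block can be pasted anywhere inside the body of an XHTML page; no server configuration is involved, which is the point of the approach.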
>>>
>>> Any opinions?
>>>
>>> Best
>>> Martin
>>>
>>> [1]  http://www.ebusiness-unibw.org/tools/goodrelations-annotator/
>>>
>>> Danny Ayers wrote:
>>>> Thank you for the excellent questions, Bill.
>>>>
>>>> Right now IMHO the best bet is probably just to pick whichever  
>>>> format
>>>> you are most comfortable with (yup "it depends") and use that as  
>>>> the
>>>> single source, transforming perhaps with scripts to generate the
>>>> alternate representations for conneg.
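
Once the alternate representations have been generated from the single source, the conneg step itself reduces to picking one of them from the client's Accept header. The following is a simplified sketch of that choice, not a full implementation of HTTP content negotiation: it ignores partial wildcards such as text/* and the specificity rules, and it falls back to a default instead of answering 406. The function names and the file names in the usage note are assumptions.

```python
def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    entries = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # a missing q parameter means full preference
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    q = float(field[2:])
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        entries.append((media_type, q))
    # sorted() is stable, so equal q-values keep the client's order.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def choose_representation(accept_header, available, default):
    """Return the media type to serve, given the types we have files for."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue  # q=0 means the client explicitly refuses this type
        if media_type in available:
            return media_type
        if media_type == "*/*":
            return default
    return default
```

With available = {"text/html": "index.html", "application/rdf+xml": "index.rdf"} and "text/html" as the default, a client sending "Accept: application/rdf+xml" would get index.rdf, while a browser would get index.html.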
>>>>
>>>> As far as I'm aware we don't yet have an easy templating engine for
>>>> RDFa, so I suspect having that as the source is probably a good  
>>>> choice
>>>> for typical Web applications.
>>>>
>>>> As mentioned already GRDDL is available for transforming on the  
>>>> fly,
>>>> though I'm not sure of the level of client engine support at  
>>>> present.
>>>> Ditto providing a SPARQL endpoint is another way of maximising the
>>>> surface area of the data.
>>>>
>>>> But the key step has clearly been taken, that decision to publish  
>>>> data
>>>> directly without needing the human element to interpret it.
>>>>
>>>> I claim *win* for the Semantic Web, even if it'll still be a few  
>>>> years
>>>> before we see applications exploiting it in a way that provides  
>>>> real
>>>> benefit for the end user.
>>>>
>>>> my 2 cents.
>>>>
>>>> Cheers,
>>>> Danny.
>>>>
>>>>
>>>>
>>>
>>> -- 
>>> --------------------------------------------------------------
>>> martin hepp
>>> e-business & web science research group
>>> universitaet der bundeswehr muenchen
>>>
>>> e-mail:  mhepp@computer.org
>>> phone:   +49-(0)89-6004-4217
>>> fax:     +49-(0)89-6004-4620
>>> www:     http://www.unibw.de/ebusiness/ (group)
>>>       http://www.heppnetz.de/ (personal)
>>> skype:   mfhepp twitter: mfhepp
>>>
>>> Check out the GoodRelations vocabulary for E-Commerce on the Web  
>>> of Data!
>>> ====================================================================
>>>
>>> Webcast:
>>> http://www.heppnetz.de/projects/goodrelations/webcast/
>>>
>>> Talk at the Semantic Technology Conference 2009: "Semantic Web- 
>>> based E-Commerce: The GoodRelations Ontology"
>>> http://tinyurl.com/semtech-hepp
>>>
>>> Tool for registering your business:
>>> http://www.ebusiness-unibw.org/tools/goodrelations-annotator/
>>>
>>> Overview article on Semantic Universe:
>>> http://tinyurl.com/goodrelations-universe
>>>
>>> Project page and resources for developers:
>>> http://purl.org/goodrelations/
>>>
>>> Tutorial materials:
>>> Tutorial at ESWC 2009: The Web of Data for E-Commerce in One Day:  
>>> A Hands-on Introduction to the GoodRelations Ontology, RDFa, and  
>>> Yahoo! SearchMonkey
>>>
>>> http://www.ebusiness-unibw.org/wiki/GoodRelations_Tutorial_ESWC2009
>>>
>>>
>>>
>>>
>>> <martin_hepp.vcf>
>>
>>
>> John
>>
>> --------------
>> John Graybeal   <mailto:graybeal@mbari.org>  -- 831-775-1956
>> Monterey Bay Aquarium Research Institute
>> Marine Metadata Interoperability Project: http://marinemetadata.org
>>
>>
>>
>


John

--------------
John Graybeal   <mailto:graybeal@mbari.org>  -- 831-775-1956
Monterey Bay Aquarium Research Institute
Marine Metadata Interoperability Project: http://marinemetadata.org
Received on Friday, 26 June 2009 04:57:51 UTC