- From: Martin Hepp <martin.hepp@ebusiness-unibw.org>
- Date: Mon, 29 Jul 2013 19:26:06 +0200
- To: "Dawson, Laura" <Laura.Dawson@bowker.com>
- Cc: Wes Turner <wes.turner@gmail.com>, Dave Pawson <dave.pawson@gmail.com>, "public-vocabs@w3.org" <public-vocabs@w3.org>, Dan Brickley <danbri@google.com>
Thanks, you are very welcome - yes, I understand, books are different - but the basic pattern is the same: You can never win by making access to (selected parts of) your content less machine-friendly. That is like not putting price-tags on products to escape from price-comparison. It may work for a short while in selected segments, but it won't allow survival of an otherwise inferior business model or deficient operations.

(Note: I am not saying that producing books is per se a bad business model ;-))

Martin

PS: Side-story: I once read in the disclaimers of a small German winery's Web site that "external linking to this site without written consent was forbidden" ;-)

On Jul 29, 2013, at 7:14 PM, Dawson, Laura wrote:

> This is excellent - of course, book publishers just don't think this way.
> Thank you for this!!!
>
> On 7/29/13 1:12 PM, "Martin Hepp" <martin.hepp@ebusiness-unibw.org> wrote:
>
>> Hi Dawson:
>>
>> I also have a common reply to the concern raised by site-owners that rich
>> data markup makes it easier for your competitors to abuse your content.
>>
>> "But schema.org will make it easy for my competitors to harvest my
>> prices, product descriptions, or dealer network information!"
>>
>> First, most site-owners do not realize how easy it is as of today for
>> anybody to extract content from others' Web sites via crowdsourcing:
>> Assume your competitor wants an Excel table with all your dealers, their
>> addresses, and opening hours. With services like Amazon Mechanical Turk
>> or CrowdFlower, it will be a job of 15 minutes and 50 USD or less to hire
>> human labor to extract that information for you, including reformatting,
>> spell-check, etc. Even a small competitor of yours can take that effort
>> if seriously interested. And it is actually more expensive to operate a
>> decent Web crawler for structured data than that. However, your
>> prospective clients will likely spend neither the time nor the money to
>> access your data that way.
>>
>> It is true that structured data simplifies the access to and use of the
>> information on your Web site, but it does so for anybody. If you decide
>> against structured data on your site, you put a much greater barrier on
>> your potential target audience than on your competitors. The latter can
>> extract and analyze all your public Web data via crowdsourcing services
>> anyway.
>>
>> Second, you also have legal means to protect your content against reuse.
>> If you have unique product description texts of a sufficient creative
>> value, you can sue anybody who extracts and republishes that content."
>>
>> Martin
>> On Jul 29, 2013, at 6:39 PM, Dawson, Laura wrote:
>>
>>> What I've been looking for is an interface that allows a "web monkey"
>>> or home user to do this... in book files. To mark up ebooks semantically,
>>> and have search engines ingest the files in their indexes, would be a
>>> huge leap forward. It would help search, it would help books, it would
>>> help society as a whole.
>>>
>>> But we are missing three things in that: the WordPress-like interface
>>> that would allow this; the ability for an epub or mobi file to handle
>>> this markup without breaking; and the willingness of the book market to
>>> experiment. (To wit: Authors Guild lawsuit against Google Books
>>> regarding indexing and abstracting. Walled garden ebook environments.
>>> Etc.)
>>>
>>> From: Wes Turner <wes.turner@gmail.com>
>>> Date: Monday, July 29, 2013 12:33 PM
>>> To: Martin Hepp <martin.hepp@ebusiness-unibw.org>
>>> Cc: Dave Pawson <dave.pawson@gmail.com>, "public-vocabs@w3.org"
>>> <public-vocabs@w3.org>, Dan Brickley <danbri@google.com>
>>> Subject: Re: Ease of adoption
>>> Resent-From: <public-vocabs@w3.org>
>>> Resent-Date: Monday, July 29, 2013 12:34 PM
>>>
>>> +1.
>>> http://en.m.wikipedia.org/wiki/Schema.org
>>> On Jul 29, 2013 10:46 AM, "Martin Hepp" <martin.hepp@ebusiness-unibw.org> wrote:
>>>> Here is my suggestion for a new intro:
>>>>
>>>> "Many individuals and organizations use the Web to articulate their
>>>> messages: companies offer products, newspapers present news, bloggers
>>>> share opinions, etc.
>>>> Historically, the most relevant audience for a Web site was humans -
>>>> they found your Web site via a search engine and then consumed the
>>>> information from your site directly in their Web browsers.
>>>>
>>>> Now, there are more and more digital devices between a Web site and
>>>> its target audience, and they cover a bigger share of the process of
>>>> using information from the Web. For instance, nowadays, the most
>>>> relevant results in a search engine are often not "main" pages, but
>>>> deep, detailed links into a Web site.
>>>>
>>>> As a consequence, the decision for or against a product, restaurant,
>>>> newspaper, etc., -- in other words: your offer --, is made already in
>>>> the search results returned by the Web search engine. The better the
>>>> search engine understands the information inside your pages, the better
>>>> it can select, summarize, and present it to the target audiences.
>>>>
>>>> Schema.org is a standard for marking up the information in your Web
>>>> content in a way that search engines and other computer-based services
>>>> can understand. In database terminology, the structures used to
>>>> represent information as data are called a "schema". Schema.org defines
>>>> a common schema for the interface between your Web content and search
>>>> engines. It allows search engines and other services to better extract
>>>> and understand your site.
>>>>
>>>> Why bother? Site owners put a lot of effort into optimizing the user
>>>> experience of their site for human visitors, with stylesheets, icons,
>>>> font choices, etc.
>>>> Schema.org is the next step: Optimizing the user
>>>> experience for your site when it is presented to your target audience
>>>> by a search engine, a mobile application, a browser extension, or any
>>>> new digital intermediary that may be in between."
>>>>
>>>> Best
>>>>
>>>> Martin Hepp
>>>>
>>>> PS: I offer this text under Creative Commons CC BY 3.0 ;-)
>>>>
>>>> On Jul 29, 2013, at 5:17 PM, Dave Pawson wrote:
>>>>
>>>>> On 29 July 2013 15:23, Wes Turner <wes.turner@gmail.com> wrote:
>>>>>>
>>>>>> On Jul 29, 2013 3:53 AM, "Dave Pawson" <dave.pawson@gmail.com> wrote:
>>>>>>>
>>>>>>> Reading http://schema.org/docs/gs.html (IMHO) I don't see the
>>>>>>> salesman's version, a trainer's view of the ideas behind schema.org.
>>>>>>>
>>>>>>> Has anyone started to think of how a web monkey or home user might be
>>>>>>> persuaded to adopt microdata for their own usage? E.g. taking the user
>>>>>>> perspective? Dan and others may well find their way round schema.org,
>>>>>>> but it isn't so easy to get started when a new user comes across it?
>>>>>>
>>>>>> When you say "taking the user perspective", what exactly do you mean by
>>>>>> that? How are you suggesting the pitch should be modified in order to reach
>>>>>> the target audience?
>>>>>
>>>>> IMHO that says it, succinctly and for a knowledgeable audience.
>>>>> If you look at intro type books (dummies ... etc), there is much more
>>>>> of a sell there. Persuasion as to why this tech is useful for them,
>>>>> meets an objective the reader may have?
>>>>>
>>>>> E.g. "A collection of schemas"... WTF is a schema...?
>>>>>
>>>>> " html tags, that webmasters can use to markup their pages in ways
>>>>> recognized by major search providers."
>>>>> Oh - that's not me then, I'm not a webmaster...
>>>>>
>>>>> I.e. just the slant?
>>>>>
>>>>> Does that make sense?
>>>>>
>>>>> regards DaveP
>>>>>
>>>>>>
>>>>>> schema.org has a fairly great description:
>>>>>>
>>>>>> """
>>>>>> What is Schema.org?
>>>>>> This site provides a collection of schemas, i.e., html tags, that webmasters
>>>>>> can use to markup their pages in ways recognized by major search providers.
>>>>>> Search engines including Bing, Google, Yahoo! and Yandex rely on this markup
>>>>>> to improve the display of search results, making it easier for people to
>>>>>> find the right web pages.
>>>>>> Many sites are generated from structured data, which is often stored in
>>>>>> databases. When this data is formatted into HTML, it becomes very difficult
>>>>>> to recover the original structured data. Many applications, especially
>>>>>> search engines, can benefit greatly from direct access to this structured
>>>>>> data. On-page markup enables search engines to understand the information on
>>>>>> web pages and provide richer search results in order to make it easier for
>>>>>> users to find relevant information on the web. Markup can also enable new
>>>>>> tools and applications that make use of the structure.
>>>>>> A shared markup vocabulary makes it easier for webmasters to decide on a
>>>>>> markup schema and get the maximum benefit for their efforts. So, in the
>>>>>> spirit of sitemaps.org, search engines have come together to provide a
>>>>>> shared collection of schemas that webmasters can use.
>>>>>> """
>>>>>>
>>>>>> schema.org/docs/gs.html has the following heading structure:
>>>>>>
>>>>>> Getting started with schema.org
>>>>>> * How to mark up your content using Microdata
>>>>>> * Why use Microdata? [what about RDFa, these days]
>>>>>> * Using the schema.org vocabulary
>>>>>> * Advanced-topic: machine-understandable versions of information
>>>>>>
>>>>>>> The other side of this is the breadth of options?
>>>>>>> How might the increasingly large
>>>>>>> number of terms be 'filtered' for use by the man in the street to
>>>>>>> optimise his/her chances of a search engine result?
>>>>>>>
>>>>>>> I think this aspect could and should be given consideration as the size of
>>>>>>> the main term set increases.
>>>>>>>
>>>>>>> Just a thought. Is there work being done in this area?
>>>>>>
>>>>>> There is a fair amount of research regarding meta tag stuffing in regards to
>>>>>> SEO.
>>>>>>
>>>>>>>
>>>>>>> regards
>>>>>>>
>>>>>>> --
>>>>>>> Dave Pawson
>>>>>>> XSLT XSL-FO FAQ.
>>>>>>> Docbook FAQ.
>>>>>>> http://www.dpawson.co.uk
>>>>>>>
>>>>>>
>>>>>> IMHO, from an en-US perspective, the copy text for the schema.org Ontology:
>>>>>>
>>>>>> * is fairly verbose
>>>>>> * could have a few more bullet points
>>>>>> * could be updated to reference the supported formats
>>>>>> (RDF/XML, Turtle, JSON-LD, N3, NTriples, HTML5 Microdata, and *RDFa*)
>>>>>> * could more directly allude to schema.rdfs.org and
>>>>>> http://schema.rdfs.org/tools.html
>>>>>> * could link to topical Wikipedia pages
>>>>>>
>>>>>> Wikipedia pages
>>>>>>
>>>>>> * /Linked_data
>>>>>> * /Semantic_web
>>>>>> * /Microdata_(HTML)
>>>>>>
>>>>>> I collected a number of Wikipedia links that may be useful for, as you put
>>>>>> it, the "web monkey and home user" here:
>>>>>>
>>>>>> http://www.reddit.com/r/semanticweb/comments/1dvakc/schemaorgdataset_standard_schema_for_linked_data/
>>>>>>
>>>>>> Please feel free to share and incorporate this research.
>>>>>
>>>>> --
>>>>> Dave Pawson
>>>>> XSLT XSL-FO FAQ.
>>>>> Docbook FAQ.
>>>>> http://www.dpawson.co.uk
>>>>>
>>>>
>>>> --------------------------------------------------------
>>>> martin hepp
>>>> e-business & web science research group
>>>> universitaet der bundeswehr muenchen
>>>>
>>>> e-mail: hepp@ebusiness-unibw.org
>>>> phone: +49-(0)89-6004-4217
>>>> fax: +49-(0)89-6004-4620
>>>> www: http://www.unibw.de/ebusiness/ (group)
>>>> http://www.heppnetz.de/ (personal)
>>>> skype: mfhepp
>>>> twitter: mfhepp
>>>>
>>>> Check out GoodRelations for E-Commerce on the Web of Linked Data!
>>>> =================================================================
>>>> * Project Main Page: http://purl.org/goodrelations/
Received on Monday, 29 July 2013 17:26:32 UTC