- From: Dawson, Laura <Laura.Dawson@bowker.com>
- Date: Mon, 29 Jul 2013 17:28:16 +0000
- To: Martin Hepp <martin.hepp@ebusiness-unibw.org>
- CC: Wes Turner <wes.turner@gmail.com>, Dave Pawson <dave.pawson@gmail.com>, "public-vocabs@w3.org" <public-vocabs@w3.org>, Dan Brickley <danbri@google.com>
Book publishers NEED to think this way. But they, like your German winery,
are horribly invested in control.

On 7/29/13 1:26 PM, "Martin Hepp" <martin.hepp@ebusiness-unibw.org> wrote:

>Thanks, you are very welcome - yes, I understand, books are different -
>but the basic pattern is the same: You can never win by making access to
>(selected parts of) your content less machine-friendly. That is like not
>putting price-tags on products to escape from price-comparison. It may
>work for a short while in selected segments, but it won't allow survival
>of an otherwise inferior business model or deficient operations. (Note: I
>am not saying that producing books is per se a bad business model ;-))
>
>Martin
>
>PS: Side-story: I once read in the disclaimers of a small German winery's
>Web site that "external linking to this site without written consent was
>forbidden" ;-)
>
>
>On Jul 29, 2013, at 7:14 PM, Dawson, Laura wrote:
>
>> This is excellent - of course, book publishers just don't think this way.
>> Thank you for this!!!
>>
>> On 7/29/13 1:12 PM, "Martin Hepp" <martin.hepp@ebusiness-unibw.org> wrote:
>>
>>> Hi Dawson:
>>>
>>> I also have a common reply to the concern raised by site-owners that rich
>>> data markup makes it easier for your competitors to abuse your content.
>>>
>>> "But schema.org will make it easy for my competitors to harvest my
>>> prices, product descriptions, or dealer network information!"
>>>
>>> First, most site-owners do not realize how easy it is as of today for
>>> anybody to extract content from others' Web sites via crowdsourcing:
>>> Assume your competitor wants an Excel table with all your dealers, their
>>> addresses, and opening hours. With services like Amazon Mechanical Turk
>>> or CrowdFlower, it will be a job of 15 minutes and 50 USD or less to hire
>>> human labor to extract that information for you, including reformatting,
>>> spell-check, etc.
>>> Even a small competitor of yours can take that effort
>>> if seriously interested. And it is actually more expensive to operate a
>>> decent Web crawler for structured data than that. However, your
>>> prospective clients will likely neither spend the time nor money to
>>> access your data that way.
>>>
>>> It is true that structured data simplifies the access to and use of the
>>> information on your Web site, but it does so for anybody. If you decide
>>> against structured data on your site, you put a much greater barrier on
>>> your potential target audience than on your competitors. The latter can
>>> extract and analyze all your public Web data via crowdsourcing services
>>> anyway.
>>>
>>> Second, you also have legal means to protect your content against reuse.
>>> If you have unique product description texts of a sufficient creative
>>> value, you can sue anybody who extracts and republishes that content."
>>>
>>> Martin
>>> On Jul 29, 2013, at 6:39 PM, Dawson, Laura wrote:
>>>
>>>> What I've been looking for is an interface that allows a "web monkey"
>>>> or home user to do this in book files. To mark up ebooks semantically,
>>>> and have search engines ingest the files in their indexes, would be a
>>>> huge leap forward. It would help search, it would help books, it would
>>>> help society as a whole.
>>>>
>>>> But we are missing three things in that: the Wordpress-y like interface
>>>> that would allow this; the ability for an epub or mobi file to handle
>>>> this markup without breaking; and the willingness of the book market to
>>>> experiment. (To wit: Authors Guild lawsuit against Google Books
>>>> regarding indexing and abstracting. Walled garden ebook environments.
>>>> Etc.)
>>>>
>>>> From: Wes Turner <wes.turner@gmail.com>
>>>> Date: Monday, July 29, 2013 12:33 PM
>>>> To: Martin Hepp <martin.hepp@ebusiness-unibw.org>
>>>> Cc: Dave Pawson <dave.pawson@gmail.com>, "public-vocabs@w3.org"
>>>> <public-vocabs@w3.org>, Dan Brickley <danbri@google.com>
>>>> Subject: Re: Ease of adoption
>>>> Resent-From: <public-vocabs@w3.org>
>>>> Resent-Date: Monday, July 29, 2013 12:34 PM
>>>>
>>>> +1. http://en.m.wikipedia.org/wiki/Schema.org
>>>> On Jul 29, 2013 10:46 AM, "Martin Hepp"
>>>> <martin.hepp@ebusiness-unibw.org> wrote:
>>>>> Here is my suggestion for a new intro:
>>>>>
>>>>> "Many individuals and organizations use the Web to articulate their
>>>>> messages: companies offer products, newspapers present news, bloggers
>>>>> share opinions, etc.
>>>>> Historically, the most relevant audience for a Web site were humans -
>>>>> they found your Web site via a search engine and then consumed the
>>>>> information from your site directly in their Web browsers.
>>>>>
>>>>> Now, there are more and more digital devices between a Web site and
>>>>> its target audience, and they cover a bigger share of the process of
>>>>> using information from the Web. For instance, nowadays, the most
>>>>> relevant results in a search engine are often not "main" pages, but
>>>>> deep, detailed links into a Web site.
>>>>>
>>>>> As a consequence, the decision for or against a product, restaurant,
>>>>> newspaper, etc., -- in other words: your offer --, is made already in
>>>>> the search results returned by the Web search engine. The better the
>>>>> search engine understands the information inside your pages, the better
>>>>> it can select, summarize, and present it to the target audiences.
>>>>>
>>>>> Schema.org is a standard for marking-up the information in your Web
>>>>> content in a way that search engines and other computer-based services
>>>>> can understand.
>>>>> In database terminology, the structures used to
>>>>> represent information as data are called a "schema". Schema.org defines
>>>>> a common schema for the interface between your Web content and search
>>>>> engines. It allows search engines and other services to better extract
>>>>> and understand your site.
>>>>>
>>>>> Why bother? Site owners spend a lot of effort for optimizing the user
>>>>> experience of their site for human visitors, with stylesheets, icons,
>>>>> font choices, etc. Schema.org is the next step: Optimizing the user
>>>>> experience for your site when it is presented to your target audience
>>>>> by a search engine, a mobile application, a browser extension, or any
>>>>> new digital intermediary that may be in between."
>>>>>
>>>>> Best
>>>>>
>>>>> Martin Hepp
>>>>>
>>>>> PS: I offer this text under Creative Commons CC BY 3.0 ;-)
>>>>>
>>>>> On Jul 29, 2013, at 5:17 PM, Dave Pawson wrote:
>>>>>
>>>>>> On 29 July 2013 15:23, Wes Turner <wes.turner@gmail.com> wrote:
>>>>>>>
>>>>>>> On Jul 29, 2013 3:53 AM, "Dave Pawson" <dave.pawson@gmail.com> wrote:
>>>>>>>>
>>>>>>>> Reading http://schema.org/docs/gs.html (IMHO) I don't see the
>>>>>>>> salesmans version, a trainers view of the ideas behind schema.org.
>>>>>>>>
>>>>>>>> Has anyone started to think of how a web monkey or home user might
>>>>>>>> be persuaded to adopt microdata for their own usage? E.g. taking the
>>>>>>>> user perspective?
>>>>>>>> Dan and others may well find their way round schema.org, but it
>>>>>>>> isn't so easy to get started when a new user comes across it?
>>>>>>>
>>>>>>> When you say "taking the user perspective", what exactly do you
>>>>>>> mean by that? How are you suggesting the pitch should be modified in
>>>>>>> order to reach the target audience?
>>>>>>
>>>>>> IMHO that says it, succinctly and for a knowledgeable audience.
>>>>>> If you look at intro type books (dummys ...
>>>>>> etc), there is much more
>>>>>> of a sell there. Persuasion as to why this tech is useful for them,
>>>>>> meets an objective the reader may have?
>>>>>>
>>>>>> E.g. "A collection of schemas"... WTF is a schema...?
>>>>>>
>>>>>> " html tags, that webmasters can use to markup their pages in ways
>>>>>> recognized by major search providers."
>>>>>> Oh - that's not me then, I'm not a webmaster...
>>>>>>
>>>>>> I.e just the slant?
>>>>>>
>>>>>> Does that make sense?
>>>>>>
>>>>>> regards DaveP
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> schema.org has a fairly great description:
>>>>>>>
>>>>>>> """
>>>>>>> What is Schema.org?
>>>>>>> This site provides a collection of schemas, i.e., html tags, that
>>>>>>> webmasters can use to markup their pages in ways recognized by major
>>>>>>> search providers. Search engines including Bing, Google, Yahoo! and
>>>>>>> Yandex rely on this markup to improve the display of search results,
>>>>>>> making it easier for people to find the right web pages.
>>>>>>> Many sites are generated from structured data, which is often
>>>>>>> stored in databases. When this data is formatted into HTML, it becomes
>>>>>>> very difficult to recover the original structured data. Many
>>>>>>> applications, especially search engines, can benefit greatly from
>>>>>>> direct access to this structured data. On-page markup enables search
>>>>>>> engines to understand the information on web pages and provide richer
>>>>>>> search results in order to make it easier for users to find relevant
>>>>>>> information on the web. Markup can also enable new tools and
>>>>>>> applications that make use of the structure.
>>>>>>> A shared markup vocabulary makes it easier for webmasters to decide
>>>>>>> on a markup schema and get the maximum benefit for their efforts.
>>>>>>> So, in the spirit of sitemaps.org, search engines have come together
>>>>>>> to provide a shared collection of schemas that webmasters can use.
>>>>>>> """
>>>>>>>
>>>>>>> schema.org/docs/gs.html has the following heading structure:
>>>>>>>
>>>>>>> Getting started with schema.org
>>>>>>> * How to mark up your content using Microdata
>>>>>>> * Why use Microdata? [what about RDFa, these days]
>>>>>>> * Using the schema.org vocabulary
>>>>>>> * Advanced-topic: machine-understandable versions of information
>>>>>>>
>>>>>>>> The other side of this is the breadth of options? How might the
>>>>>>>> increasingly large number of terms be 'filtered' for use by the man
>>>>>>>> in the street to optimise his/her chances of a search engine result?
>>>>>>>>
>>>>>>>> I think this aspect could and should be given consideration as the
>>>>>>>> size of the main term set increases.
>>>>>>>>
>>>>>>>> Just a thought. Is there work being done in this area?
>>>>>>>
>>>>>>> There is a fair amount of research regarding meta tag stuffing in
>>>>>>> regards to SEO.
>>>>>>>
>>>>>>>>
>>>>>>>> regards
>>>>>>>>
>>>>>>>> --
>>>>>>>> Dave Pawson
>>>>>>>> XSLT XSL-FO FAQ.
>>>>>>>> Docbook FAQ.
>>>>>>>> http://www.dpawson.co.uk
>>>>>>>>
>>>>>>>
>>>>>>> IMHO, from an en-US perspective, the copy text for the schema.org
>>>>>>> Ontology:
>>>>>>>
>>>>>>> * is fairly verbose
>>>>>>> * could have a few more bullet points
>>>>>>> * could be updated to reference the supported formats
>>>>>>>   (RDF/XML, Turtle, JSON-LD, N3, NTriples, HTML5 Microdata, and *RDFa*)
>>>>>>> * could more directly allude to schema.rdfs.org and
>>>>>>>   http://schema.rdfs.org/tools.html
>>>>>>> * could link to topical Wikipedia pages
>>>>>>>
>>>>>>> Wikipedia pages
>>>>>>>
>>>>>>> * /Linked_data
>>>>>>> * /Semantic_web
>>>>>>> * /Microdata_(HTML)
>>>>>>>
>>>>>>> I collected a number of Wikipedia links that may be useful for, as
>>>>>>> you put it, the "web monkey and home user" here:
>>>>>>>
>>>>>>> http://www.reddit.com/r/semanticweb/comments/1dvakc/schemaorgdataset_standard_schema_for_linked_data/
>>>>>>>
>>>>>>> Please feel free to share and incorporate this research.
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Dave Pawson
>>>>>> XSLT XSL-FO FAQ.
>>>>>> Docbook FAQ.
>>>>>> http://www.dpawson.co.uk
>>>>>>
>>>>>
>>>>> --------------------------------------------------------
>>>>> martin hepp
>>>>> e-business & web science research group
>>>>> universitaet der bundeswehr muenchen
>>>>>
>>>>> e-mail: hepp@ebusiness-unibw.org
>>>>> phone: +49-(0)89-6004-4217
>>>>> fax: +49-(0)89-6004-4620
>>>>> www: http://www.unibw.de/ebusiness/ (group)
>>>>> http://www.heppnetz.de/ (personal)
>>>>> skype: mfhepp
>>>>> twitter: mfhepp
>>>>>
>>>>> Check out GoodRelations for E-Commerce on the Web of Linked Data!
>>>>> =================================================================
>>>>> * Project Main Page: http://purl.org/goodrelations/
Received on Monday, 29 July 2013 17:28:52 UTC