
Re: Ease of adoption

From: Dawson, Laura <Laura.Dawson@bowker.com>
Date: Mon, 29 Jul 2013 17:28:16 +0000
To: Martin Hepp <martin.hepp@ebusiness-unibw.org>
CC: Wes Turner <wes.turner@gmail.com>, Dave Pawson <dave.pawson@gmail.com>, "public-vocabs@w3.org" <public-vocabs@w3.org>, Dan Brickley <danbri@google.com>
Message-ID: <CE1C1E44.49763%laura.dawson@bowker.com>
Book publishers NEED to think this way. But they, like your German winery,
are horribly invested in control.

On 7/29/13 1:26 PM, "Martin Hepp" <martin.hepp@ebusiness-unibw.org> wrote:

>Thanks, you are very welcome - yes, I understand, books are different -
>but the basic pattern is the same: You can never win by making access to
>(selected parts of) your content less machine-friendly. That is like not
>putting price-tags on products to escape from price-comparison. It may
>work for a short while in selected segments, but it won't allow survival
>of an otherwise inferior business model or deficient operations. (Note: I
>am not saying that producing books is per se a bad business model ;-))
>
>Martin
>
>PS: Side-story: I once read in the disclaimers of a small German winery's
>Web site that "external linking to this site without written consent was
>forbidden" ;-)
>
>
>On Jul 29, 2013, at 7:14 PM, Dawson, Laura wrote:
>
>> This is excellent - of course, book publishers just don't think this way.
>> Thank you for this!!!
>> 
>> On 7/29/13 1:12 PM, "Martin Hepp" <martin.hepp@ebusiness-unibw.org> wrote:
>> 
>>> Hi Dawson:
>>> 
>>> I also have a common reply to the concern raised by site-owners that
>>> rich data markup makes it easier for your competitors to abuse your
>>> content.
>>> 
>>> "But schema.org will make it easy for my competitors to harvest my
>>> prices, product descriptions, or dealer network information!"
>>> 
>>> First, most site-owners do not realize how easy it already is today for
>>> anybody to extract content from others' Web sites via crowdsourcing.
>>> Assume your competitor wants an Excel table with all your dealers,
>>> their addresses, and opening hours. With services like Amazon Mechanical
>>> Turk or CrowdFlower, it will be a job of 15 minutes and 50 USD or less
>>> to hire human labor to extract that information for you, including
>>> reformatting, spell-check, etc. Even a small competitor of yours can
>>> make that effort if seriously interested. And it is actually more
>>> expensive to operate a decent Web crawler for structured data than
>>> that. However, your prospective clients will likely spend neither the
>>> time nor the money to access your data that way.
>>> 
>>> It is true that structured data simplifies the access to and use of the
>>> information on your Web site, but it does so for anybody. If you decide
>>> against structured data on your site, you put a much greater barrier on
>>> your potential target audience than on your competitors. The latter can
>>> extract and analyze all your public Web data via crowdsourcing services
>>> anyway.
>>> 
>>> Second, you also have legal means to protect your content against
>>> reuse. If you have unique product description texts of sufficient
>>> creative value, you can sue anybody who extracts and republishes that
>>> content.
>>> 
>>> Martin
>>> On Jul 29, 2013, at 6:39 PM, Dawson, Laura wrote:
>>> 
>>>> What I've been looking for is an interface that allows a "web monkey"
>>>> or home user to do this in book files. To mark up ebooks semantically,
>>>> and have search engines ingest the files in their indexes, would be a
>>>> huge leap forward. It would help search, it would help books, it would
>>>> help society as a whole.
>>>> 
>>>> But we are missing three things in that: the WordPress-like interface
>>>> that would allow this; the ability for an epub or mobi file to handle
>>>> this markup without breaking; and the willingness of the book market
>>>> to experiment. (To wit: Authors Guild lawsuit against Google Books
>>>> regarding indexing and abstracting. Walled garden ebook environments.
>>>> Etc.)
>>>> 
>>>> From: Wes Turner <wes.turner@gmail.com>
>>>> Date: Monday, July 29, 2013 12:33 PM
>>>> To: Martin Hepp <martin.hepp@ebusiness-unibw.org>
>>>> Cc: Dave Pawson <dave.pawson@gmail.com>, "public-vocabs@w3.org"
>>>> <public-vocabs@w3.org>, Dan Brickley <danbri@google.com>
>>>> Subject: Re: Ease of adoption
>>>> Resent-From: <public-vocabs@w3.org>
>>>> Resent-Date: Monday, July 29, 2013 12:34 PM
>>>> 
>>>> +1. http://en.m.wikipedia.org/wiki/Schema.org
>>>> On Jul 29, 2013 10:46 AM, "Martin Hepp"
>>>> <martin.hepp@ebusiness-unibw.org> wrote:
>>>>> Here is my suggestion for a new intro:
>>>>> 
>>>>> "Many individuals and organizations use the Web to articulate their
>>>>> messages: companies offer products, newspapers present news, bloggers
>>>>> share opinions, etc.
>>>>> Historically, the most relevant audience for a Web site was human
>>>>> visitors: they found your Web site via a search engine and then
>>>>> consumed the information from your site directly in their Web browsers.
>>>>> 
>>>>> Now, there are more and more digital devices between a Web site and
>>>>> its target audience, and they cover a bigger share of the process of
>>>>> using information from the Web. For instance, nowadays, the most
>>>>> relevant results in a search engine are often not "main" pages, but
>>>>> deep, detailed links into a Web site.
>>>>> 
>>>>> As a consequence, the decision for or against a product, restaurant,
>>>>> newspaper, etc. (in other words, your offer) is already made in
>>>>> the search results returned by the Web search engine. The better the
>>>>> search engine understands the information inside your pages, the
>>>>> better it can select, summarize, and present it to the target
>>>>> audiences.
>>>>> 
>>>>> Schema.org is a standard for marking up the information in your Web
>>>>> content in a way that search engines and other computer-based
>>>>> services can understand. In database terminology, the structures used
>>>>> to represent information as data are called a "schema". Schema.org
>>>>> defines a common schema for the interface between your Web content
>>>>> and search engines. It allows search engines and other services to
>>>>> better extract and understand the information on your site.
>>>>> 
>>>>> Why bother? Site owners put a lot of effort into optimizing the user
>>>>> experience of their site for human visitors, with stylesheets, icons,
>>>>> font choices, etc. Schema.org is the next step: optimizing the user
>>>>> experience of your site when it is presented to your target audience
>>>>> by a search engine, a mobile application, a browser extension, or any
>>>>> new digital intermediary that may be in between."
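
A minimal sketch of what such markup can look like, expressed as JSON-LD (one of the formats named elsewhere in this thread) and generated with Python. The product name, description, and price are hypothetical values invented for illustration, and the helper function names are assumptions, not part of any library. Only the terms Product, Offer, name, description, offers, price, and priceCurrency are taken from the schema.org vocabulary.

```python
import json


def product_jsonld(name, description, price, currency):
    """Return a schema.org Product description as a JSON-LD string."""
    data = {
        "@context": "http://schema.org",
        "@type": "Product",
        "name": name,
        "description": description,
        # The price is attached to an Offer, which is linked from the Product.
        "offers": {
            "@type": "Offer",
            "price": price,
            "priceCurrency": currency,
        },
    }
    return json.dumps(data, indent=2)


def as_script_tag(jsonld):
    """Wrap a JSON-LD string in a script element for embedding in a page."""
    return '<script type="application/ld+json">\n' + jsonld + "\n</script>"


# Hypothetical example data, not taken from any real site.
snippet = as_script_tag(
    product_jsonld("Riesling 2012", "Dry white wine", "9.50", "EUR")
)
print(snippet)
```

The printed snippet could be pasted into a product page; a human visitor sees no difference, while a program reading the page gets the name and the "price tag" directly.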
>>>>> 
>>>>> Best
>>>>> 
>>>>> Martin Hepp
>>>>> 
>>>>> PS: I offer this text under Creative Commons CC BY 3.0 ;-)
>>>>> 
>>>>> On Jul 29, 2013, at 5:17 PM, Dave Pawson wrote:
>>>>> 
>>>>>> On 29 July 2013 15:23, Wes Turner <wes.turner@gmail.com> wrote:
>>>>>>> 
>>>>>>> On Jul 29, 2013 3:53 AM, "Dave Pawson" <dave.pawson@gmail.com> wrote:
>>>>>>>> 
>>>>>>>> Reading http://schema.org/docs/gs.html (IMHO) I don't see the
>>>>>>>> salesman's version, a trainer's view of the ideas behind schema.org.
>>>>>>>> 
>>>>>>>> Has anyone started to think of how a web monkey or home user might
>>>>>>>> be persuaded to adopt microdata for their own usage? E.g. taking
>>>>>>>> the user perspective? Dan and others may well find their way round
>>>>>>>> schema.org, but it isn't so easy to get started when a new user
>>>>>>>> comes across it.
>>>>>>> 
>>>>>>> When you say "taking the user perspective", what exactly do you
>>>>>>> mean by that? How are you suggesting the pitch should be modified in
>>>>>>> order to reach the target audience?
>>>>>> 
>>>>>> IMHO that says it, succinctly and for a knowledgeable audience.
>>>>>> If you look at intro type books (Dummies ... etc), there is much more
>>>>>> of a sell there. Persuasion as to why this tech is useful for them,
>>>>>> or meets an objective the reader may have.
>>>>>> 
>>>>>> E.g. "A collection of schemas"... WTF is a schema...?
>>>>>> 
>>>>>> " html tags, that webmasters can use to markup their pages in ways
>>>>>> recognized by major search providers."
>>>>>> Oh - that's not me then, I'm not a webmaster...
>>>>>> 
>>>>>> I.e. just the slant?
>>>>>> 
>>>>>> Does that make sense?
>>>>>> 
>>>>>> regards DaveP
>>>>>> 
>>>>>> 
>>>>>>> 
>>>>>>> schema.org has a fairly great description:
>>>>>>> 
>>>>>>> """
>>>>>>> What is Schema.org?
>>>>>>> This site provides a collection of schemas, i.e., html tags, that
>>>>>>> webmasters can use to markup their pages in ways recognized by major
>>>>>>> search providers.
>>>>>>> Search engines including Bing, Google, Yahoo! and Yandex rely on
>>>>>>> this markup to improve the display of search results, making it
>>>>>>> easier for people to find the right web pages.
>>>>>>> Many sites are generated from structured data, which is often
>>>>>>> stored in databases. When this data is formatted into HTML, it
>>>>>>> becomes very difficult to recover the original structured data. Many
>>>>>>> applications, especially search engines, can benefit greatly from
>>>>>>> direct access to this structured data. On-page markup enables search
>>>>>>> engines to understand the information on web pages and provide
>>>>>>> richer search results in order to make it easier for users to find
>>>>>>> relevant information on the web. Markup can also enable new tools
>>>>>>> and applications that make use of the structure.
>>>>>>> A shared markup vocabulary makes it easier for webmasters to decide
>>>>>>> on a markup schema and get the maximum benefit for their efforts. So,
>>>>>>> in the spirit of sitemaps.org, search engines have come together to
>>>>>>> provide a shared collection of schemas that webmasters can use.
>>>>>>> """
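
To illustrate the point about recovering structured data, here is a simplified sketch of how a program might read Microdata back out of a page, using only Python's standard html.parser module. It handles just flat itemprop attributes (no nested itemscope, no repeated properties) and does not claim to show how any particular search engine works. The class name ItempropCollector, the function extract_itemprops, and the sample page are assumptions made up for this example.

```python
from html.parser import HTMLParser


class ItempropCollector(HTMLParser):
    """Collect itemprop values from a page (flat, simplified Microdata)."""

    def __init__(self):
        super().__init__()
        self.props = {}
        self._current = None  # itemprop name waiting for its text content

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemprop" in attrs:
            if "content" in attrs:
                # e.g. <meta itemprop="..." content="...">
                self.props[attrs["itemprop"]] = attrs["content"]
            else:
                self._current = attrs["itemprop"]

    def handle_data(self, data):
        # Take the first non-blank text inside the itemprop element.
        if self._current and data.strip():
            self.props[self._current] = data.strip()
            self._current = None

    def handle_endtag(self, tag):
        self._current = None


def extract_itemprops(html):
    """Return a dict mapping itemprop names to their values."""
    parser = ItempropCollector()
    parser.feed(html)
    parser.close()
    return parser.props


# Sample page fragment written for this example.
SAMPLE = """
<div itemscope itemtype="http://schema.org/Book">
  <span itemprop="name">Moby-Dick</span> by
  <span itemprop="author">Herman Melville</span>
  <meta itemprop="datePublished" content="1851">
</div>
"""

print(extract_itemprops(SAMPLE))
```

Without the itemprop attributes, the same program would have to guess which piece of text is the title and which is the author; with them, the lookup is a few lines of code.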
>>>>>>> 
>>>>>>> schema.org/docs/gs.html has the following heading structure:
>>>>>>> 
>>>>>>> Getting started with schema.org
>>>>>>> * How to mark up your content using Microdata
>>>>>>>  * Why use Microdata? [what about RDFa, these days]
>>>>>>> * Using the schema.org vocabulary
>>>>>>> * Advanced-topic: machine-understandable versions of information
>>>>>>> 
>>>>>>>> The other side of this is the breadth of options? How might the
>>>>>>>> increasingly large number of terms be 'filtered' for use by the man
>>>>>>>> in the street to optimise his/her chances of a search engine result?
>>>>>>>> 
>>>>>>>> I think this aspect could and should be given consideration as the
>>>>>>>> size of the main term set increases.
>>>>>>>> 
>>>>>>>> Just a thought. Is there work being done in this area?
>>>>>>> 
>>>>>>> There is a fair amount of research regarding meta tag stuffing in
>>>>>>> regard to SEO.
>>>>>>> 
>>>>>>>> 
>>>>>>>> regards
>>>>>>>> 
>>>>>>>> --
>>>>>>>> Dave Pawson
>>>>>>>> XSLT XSL-FO FAQ.
>>>>>>>> Docbook FAQ.
>>>>>>>> http://www.dpawson.co.uk
>>>>>>>> 
>>>>>>> 
>>>>>>> IMHO, from an en-US perspective, the copy text for the schema.org
>>>>>>> Ontology:
>>>>>>> 
>>>>>>> * is fairly verbose
>>>>>>> * could have a few more bullet points
>>>>>>> * could be updated to reference the supported formats
>>>>>>> (RDF/XML, Turtle, JSON-LD, N3, NTriples, HTML5 Microdata, and *RDFa*)
>>>>>>> * could more directly allude to schema.rdfs.org and
>>>>>>> http://schema.rdfs.org/tools.html
>>>>>>> * could link to topical Wikipedia pages
>>>>>>> 
>>>>>>> Wikipedia pages
>>>>>>> 
>>>>>>> * /Linked_data
>>>>>>> * /Semantic_web
>>>>>>> * /Microdata_(HTML)
>>>>>>> 
>>>>>>> I collected a number of Wikipedia links that may be useful for, as
>>>>>>> you put it, the "web monkey and home user" here:
>>>>>>> 
>>>>>>> http://www.reddit.com/r/semanticweb/comments/1dvakc/schemaorgdataset_standard_schema_for_linked_data/
>>>>>>> 
>>>>>>> Please feel free to share and incorporate this research.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>>> --------------------------------------------------------
>>>>> martin hepp
>>>>> e-business & web science research group
>>>>> universitaet der bundeswehr muenchen
>>>>> 
>>>>> e-mail:  hepp@ebusiness-unibw.org
>>>>> phone:   +49-(0)89-6004-4217
>>>>> fax:     +49-(0)89-6004-4620
>>>>> www:     http://www.unibw.de/ebusiness/ (group)
>>>>>         http://www.heppnetz.de/ (personal)
>>>>> skype:   mfhepp
>>>>> twitter: mfhepp
>>>>> 
>>>>> Check out GoodRelations for E-Commerce on the Web of Linked Data!
>>>>> =================================================================
>>>>> * Project Main Page: http://purl.org/goodrelations/
>>>>> 
>>>>> 
>>>>> 
>>> 
>> 
>> 
>
Received on Monday, 29 July 2013 17:28:52 UTC
