Re: Ease of adoption

This is excellent - of course, book publishers just don't think this way.
Thank you for this!!!

On 7/29/13 1:12 PM, "Martin Hepp" <martin.hepp@ebusiness-unibw.org> wrote:

>Hi Dawson:
>
>I also have a common reply to the concern raised by site-owners that rich
>data markup makes it easier for your competitors to abuse your content.
>
>"But schema.org will make it easy for my competitors to harvest my
>prices, product descriptions, or dealer network information!"
>
>First, most site-owners do not realize how easy it already is today for
>anybody to extract content from others' Web sites via crowdsourcing:
>Suppose your competitor wants an Excel table with all your dealers, their
>addresses, and opening hours. With services like Amazon Mechanical Turk
>or CrowdFlower, it will be a job of 15 minutes and 50 USD or less to hire
>human labor to extract that information for you, including reformatting,
>spell-check, etc. Even a small competitor of yours can make that effort
>if seriously interested. And it is actually more expensive to operate a
>decent Web crawler for structured data than that. However, your
>prospective clients will likely spend neither the time nor the money to
>access your data that way.
>
>It is true that structured data simplifies the access to and use of the
>information on your Web site, but it does so for anybody. If you decide
>against structured data on your site, you put a much greater barrier on
>your potential target audience than on your competitors. The latter can
>extract and analyze all your public Web data via crowdsourcing services
>anyway.
>
>Second, you also have legal means to protect your content against reuse.
>If you have unique product description texts of sufficient creative
>value, you can sue anybody who extracts and republishes that content."
>
>Martin
>On Jul 29, 2013, at 6:39 PM, Dawson, Laura wrote:
>
>> What I've been looking for is an interface that allows a "web monkey"
>>or home user to do this... in book files. To mark up ebooks semantically,
>>and have search engines ingest the files in their indexes, would be a
>>huge leap forward. It would help search, it would help books, it would
>>help society as a whole.
>> 
>> But we are missing three things in that: the WordPress-like interface
>>that would allow this; the ability for an epub or mobi file to handle
>>this markup without breaking; and the willingness of the book market to
>>experiment. (To wit: Authors Guild lawsuit against Google Books
>>regarding indexing and abstracting. Walled garden ebook environments.
>>Etc.)
>> 
>> From: Wes Turner <wes.turner@gmail.com>
>> Date: Monday, July 29, 2013 12:33 PM
>> To: Martin Hepp <martin.hepp@ebusiness-unibw.org>
>> Cc: Dave Pawson <dave.pawson@gmail.com>, "public-vocabs@w3.org"
>><public-vocabs@w3.org>, Dan Brickley <danbri@google.com>
>> Subject: Re: Ease of adoption
>> Resent-From: <public-vocabs@w3.org>
>> Resent-Date: Monday, July 29, 2013 12:34 PM
>> 
>> +1. http://en.m.wikipedia.org/wiki/Schema.org
>> On Jul 29, 2013 10:46 AM, "Martin Hepp"
>><martin.hepp@ebusiness-unibw.org> wrote:
>>> Here is my suggestion for a new intro:
>>> 
>>> "Many individuals and organizations use the Web to articulate their
>>>messages: companies offer products, newspapers present news, bloggers
>>>share opinions, etc.
>>> Historically, the most relevant audience for a Web site was humans -
>>>they found your Web site via a search engine and then consumed the
>>>information from your site directly in their Web browsers.
>>> 
>>> Now, there are more and more digital devices between a Web site and
>>>its target audience, and they cover a bigger share of the process of
>>>using information from the Web. For instance, nowadays, the most
>>>relevant results in a search engine are often not "main" pages, but
>>>deep, detailed links into a Web site.
>>> 
>>> As a consequence, the decision for or against a product, restaurant,
>>>newspaper, etc., -- in other words: your offer --, is made already in
>>>the search results returned by the Web search engine. The better the
>>>search engine understands the information inside your pages, the better
>>>it can select, summarize, and present it to the target audiences.
>>> 
>>> Schema.org is a standard for marking up the information in your Web
>>>content in a way that search engines and other computer-based services
>>>can understand. In database terminology, the structures used to
>>>represent information as data are called a "schema". Schema.org defines
>>>a common schema for the interface between your Web content and search
>>>engines. It allows search engines and other services to better extract
>>>and understand your site.
>>> 
>>> Why bother? Site owners put a lot of effort into optimizing the user
>>>experience of their site for human visitors, with stylesheets, icons,
>>>font choices, etc. Schema.org is the next step: Optimizing the user
>>>experience for your site when it is presented to your target audience
>>>by a search engine, a mobile application, a browser extension, or any
>>>new digital intermediary that may be in between."
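To make the "common schema" idea in the intro above concrete, here is a minimal sketch of how a program might read schema.org Microdata back out of a page. Everything in it is an illustration, not how any search engine actually works: the dealer snippet, the `ItempropCollector` class, and the `extract_microdata` helper are hypothetical names, and the parser only handles a flat item without nested items.

```python
from html.parser import HTMLParser

# Hypothetical page fragment marked up with schema.org Microdata.
SNIPPET = """
<div itemscope itemtype="http://schema.org/LocalBusiness">
  <span itemprop="name">Example Dealer</span>
  <span itemprop="telephone">+1-555-0100</span>
  <meta itemprop="openingHours" content="Mo-Fr 09:00-17:00">
</div>
"""


class ItempropCollector(HTMLParser):
    """Collect itemprop name/value pairs from a flat Microdata item."""

    def __init__(self):
        super().__init__()
        self.item_type = None
        self.properties = {}
        self._current = None  # itemprop whose text content is expected next

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemtype" in attrs:
            self.item_type = attrs["itemtype"]
        if "itemprop" in attrs:
            if "content" in attrs:
                # Value is carried in the content attribute (e.g. <meta>).
                self.properties[attrs["itemprop"]] = attrs["content"]
            else:
                # Value is the element's text, delivered via handle_data.
                self._current = attrs["itemprop"]

    def handle_data(self, data):
        if self._current and data.strip():
            self.properties[self._current] = data.strip()
            self._current = None

    def handle_endtag(self, tag):
        self._current = None


def extract_microdata(html):
    """Return (item type, properties dict) for a flat Microdata snippet."""
    collector = ItempropCollector()
    collector.feed(html)
    collector.close()
    return collector.item_type, collector.properties


item_type, props = extract_microdata(SNIPPET)
print(item_type)
for name, value in props.items():
    print(f"{name}: {value}")
```

The point is only that a few `itemprop` attributes let a generic program recover name, phone number, and opening hours without guessing at the page layout.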
>>> 
>>> Best
>>> 
>>> Martin Hepp
>>> 
>>> PS: I offer this text under Creative Commons CC BY 3.0 ;-)
>>> 
>>> On Jul 29, 2013, at 5:17 PM, Dave Pawson wrote:
>>> 
>>> > On 29 July 2013 15:23, Wes Turner <wes.turner@gmail.com> wrote:
>>> >>
>>> >> On Jul 29, 2013 3:53 AM, "Dave Pawson" <dave.pawson@gmail.com> wrote:
>>> >>>
>>> >>> Reading http://schema.org/docs/gs.html (IMHO) I don't see the
>>> >>> salesman's version, a trainer's view of the ideas behind schema.org.
>>> >>>
>>> >>> Has anyone started to think of how a web monkey or home user might be
>>> >>> persuaded to adopt microdata for their own usage?  E.g. taking the user
>>> >>> perspective? Dan and others may well find their way round schema.org,
>>> >>> but it isn't so easy to get started when a new user comes across it?
>>> >>
>>> >> When you say "taking the user perspective", what exactly do you mean
>>> >> by that? How are you suggesting the pitch should be modified in order
>>> >> to reach the target audience?
>>> >
>>> > IMHO that says it, succinctly and for a knowledgeable audience.
>>> >  If you look at intro type books (Dummies ... etc), there is much more
>>> > of a sell there. Persuasion as to why this tech is useful for them,
>>> > meets an objective the reader may have?
>>> >
>>> > E.g. "A collection of schemas"... WTF is a schema...?
>>> >
>>> > " html tags, that webmasters can use to markup their pages in ways
>>> > recognized by major search providers."
>>> >  Oh - that's not me then, I'm not a webmaster...
>>> >
>>> > I.e. just the slant?
>>> >
>>> > Does that make sense?
>>> >
>>> > regards DaveP
>>> >
>>> >
>>> >>
>>> >> schema.org has a fairly great description:
>>> >>
>>> >> """
>>> >> What is Schema.org?
>>> >> This site provides a collection of schemas, i.e., html tags, that
>>> >> webmasters can use to markup their pages in ways recognized by major
>>> >> search providers. Search engines including Bing, Google, Yahoo! and
>>> >> Yandex rely on this markup to improve the display of search results,
>>> >> making it easier for people to find the right web pages.
>>> >> Many sites are generated from structured data, which is often stored
>>> >> in databases. When this data is formatted into HTML, it becomes very
>>> >> difficult to recover the original structured data. Many applications,
>>> >> especially search engines, can benefit greatly from direct access to
>>> >> this structured data. On-page markup enables search engines to
>>> >> understand the information on web pages and provide richer search
>>> >> results in order to make it easier for users to find relevant
>>> >> information on the web. Markup can also enable new tools and
>>> >> applications that make use of the structure.
>>> >> A shared markup vocabulary makes it easier for webmasters to decide on
>>> >> a markup schema and get the maximum benefit for their efforts. So, in
>>> >> the spirit of sitemaps.org, search engines have come together to
>>> >> provide a shared collection of schemas that webmasters can use.
>>> >> """
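The quoted description says many sites are generated from structured data held in databases. As a rough sketch of keeping that structure on the way out, the snippet below maps one record to a schema.org `Product` expressed as JSON-LD (one of the formats named later in this thread). The `row` dict, its field names, and the `to_json_ld` helper are all hypothetical; a real site would map its own columns.

```python
import json


def to_json_ld(record):
    """Map a hypothetical database row to a schema.org Product (JSON-LD)."""
    return {
        "@context": "http://schema.org",
        "@type": "Product",
        "name": record["title"],
        "description": record["summary"],
        "offers": {
            "@type": "Offer",
            "price": record["price"],
            "priceCurrency": record["currency"],
        },
    }


# Hypothetical row, as it might come back from a product table.
row = {
    "title": "Example Widget",
    "summary": "A sample product used for illustration only.",
    "price": "19.99",
    "currency": "USD",
}

markup = json.dumps(to_json_ld(row), indent=2)

# Wrap the JSON-LD in a script element so it can be embedded in a page.
print('<script type="application/ld+json">')
print(markup)
print("</script>")
```

Because the markup is produced from the same record that renders the page, the structured version and the human-readable version cannot drift apart.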
>>> >>
>>> >> schema.org/docs/gs.html has the following heading structure:
>>> >>
>>> >> Getting started with schema.org
>>> >> * How to mark up your content using Microdata
>>> >>   * Why use Microdata? [what about RDFa, these days]
>>> >> * Using the schema.org vocabulary
>>> >> * Advanced-topic: machine-understandable versions of information
>>> >>
>>> >>> The other side of this is the breadth of options? How might the
>>> >>> increasingly large number of terms be 'filtered' for use by the man in
>>> >>> the street to optimise his/her chances of a search engine result?
>>> >>>
>>> >>> I think this aspect could and should be given consideration as the size
>>> >>> of the main term set increases.
>>> >>>
>>> >>> Just a thought. Is there work being done in this area?
>>> >>
>>> >> There is a fair amount of research regarding meta tag stuffing with
>>> >> regard to SEO.
>>> >>
>>> >>>
>>> >>> regards
>>> >>>
>>> >>> --
>>> >>> Dave Pawson
>>> >>> XSLT XSL-FO FAQ.
>>> >>> Docbook FAQ.
>>> >>> http://www.dpawson.co.uk
>>> >>>
>>> >>
>>> >> IMHO, from an en-US perspective, the copy text for the schema.org
>>> >> Ontology:
>>> >>
>>> >> * is fairly verbose
>>> >> * could have a few more bullet points
>>> >> * could be updated to reference the supported formats
>>> >>  (RDF/XML, Turtle, JSON-LD, N3, NTriples, HTML5 Microdata, and *RDFa*)
>>> >> * could more directly allude to schema.rdfs.org and
>>> >> http://schema.rdfs.org/tools.html
>>> >> * could link to topical Wikipedia pages
>>> >>
>>> >> Wikipedia pages
>>> >>
>>> >> * /Linked_data
>>> >> * /Semantic_web
>>> >> * /Microdata_(HTML)
>>> >>
>>> >> I collected a number of Wikipedia links that may be useful for, as you
>>> >> put it, the "web monkey and home user" here:
>>> >> http://www.reddit.com/r/semanticweb/comments/1dvakc/schemaorgdataset_standard_schema_for_linked_data/
>>> >>
>>> >> Please feel free to share and incorporate this research.
>>> >
>>> >
>>> 
>>> --------------------------------------------------------
>>> martin hepp
>>> e-business & web science research group
>>> universitaet der bundeswehr muenchen
>>> 
>>> e-mail:  hepp@ebusiness-unibw.org
>>> phone:   +49-(0)89-6004-4217
>>> fax:     +49-(0)89-6004-4620
>>> www:     http://www.unibw.de/ebusiness/ (group)
>>>          http://www.heppnetz.de/ (personal)
>>> skype:   mfhepp
>>> twitter: mfhepp
>>> 
>>> Check out GoodRelations for E-Commerce on the Web of Linked Data!
>>> =================================================================
>>> * Project Main Page: http://purl.org/goodrelations/
>>> 
>>> 
>>> 

Received on Monday, 29 July 2013 17:15:09 UTC