- From: Dawson, Laura <Laura.Dawson@bowker.com>
- Date: Mon, 29 Jul 2013 17:18:17 +0000
- To: Martin Hepp <martin.hepp@ebusiness-unibw.org>
- CC: Wes Turner <wes.turner@gmail.com>, Dave Pawson <dave.pawson@gmail.com>, "public-vocabs@w3.org" <public-vocabs@w3.org>, Dan Brickley <danbri@google.com>
It occurs to me that there are also parallel arguments regarding DRM.
"Making my content more accessible makes my content more accessible! OH NO
WHATEVER SHALL WE DO?"

On 7/29/13 1:12 PM, "Martin Hepp" <martin.hepp@ebusiness-unibw.org> wrote:

> Hi Dawson:
>
> I also have a common reply to the concern raised by site-owners that rich
> data markup makes it easier for your competitors to abuse your content.
>
> "But schema.org will make it easy for my competitors to harvest my
> prices, product descriptions, or dealer network information!"
>
> First, most site-owners do not realize how easy it is as of today for
> anybody to extract content from others' Web sites via crowdsourcing:
> Assume your competitor wants an Excel table with all your dealers, their
> addresses, and opening hours. With services like Amazon Mechanical Turk
> or CrowdFlower, it will be a job of 15 minutes and 50 USD or less to hire
> human labor to extract that information for you, including reformatting,
> spell-check, etc. Even a small competitor of yours can take that effort
> if seriously interested. And it is actually more expensive to operate a
> decent Web crawler for structured data than that. However, your
> prospective clients will likely neither spend the time nor money to
> access your data that way.
>
> It is true that structured data simplifies the access to and use of the
> information on your Web site, but it does so for anybody. If you decide
> against structured data on your site, you put a much greater barrier on
> your potential target audience than on your competitors. The latter can
> extract and analyze all your public Web data via crowdsourcing services
> anyway.
>
> Second, you also have legal means to protect your content against reuse.
> If you have unique product description texts of a sufficient creative
> value, you can sue anybody who extracts and republishes that content."
>
> Martin
> On Jul 29, 2013, at 6:39 PM, Dawson, Laura wrote:
>
>> What I've been looking for is an interface that allows a "web monkey"
>> or home user to do this... in book files. To mark up ebooks semantically,
>> and have search engines ingest the files in their indexes, would be a
>> huge leap forward. It would help search, it would help books, it would
>> help society as a whole.
>>
>> But we are missing three things in that: the Wordpress-y like interface
>> that would allow this; the ability for an epub or mobi file to handle
>> this markup without breaking; and the willingness of the book market to
>> experiment. (To wit: Authors Guild lawsuit against Google Books
>> regarding indexing and abstracting. Walled garden ebook environments.
>> Etc.)
>>
>> From: Wes Turner <wes.turner@gmail.com>
>> Date: Monday, July 29, 2013 12:33 PM
>> To: Martin Hepp <martin.hepp@ebusiness-unibw.org>
>> Cc: Dave Pawson <dave.pawson@gmail.com>, "public-vocabs@w3.org"
>> <public-vocabs@w3.org>, Dan Brickley <danbri@google.com>
>> Subject: Re: Ease of adoption
>> Resent-From: <public-vocabs@w3.org>
>> Resent-Date: Monday, July 29, 2013 12:34 PM
>>
>> +1. http://en.m.wikipedia.org/wiki/Schema.org
>> On Jul 29, 2013 10:46 AM, "Martin Hepp"
>> <martin.hepp@ebusiness-unibw.org> wrote:
>>> Here is my suggestion for a new intro:
>>>
>>> "Many individuals and organizations use the Web to articulate their
>>> messages: companies offer products, newspapers present news, bloggers
>>> share opinions, etc.
>>> Historically, the most relevant audience for a Web site were humans -
>>> they found your Web site via a search engine and then consumed the
>>> information from your site directly in their Web browsers.
>>>
>>> Now, there are more and more digital devices between a Web site and
>>> its target audience, and they cover a bigger share of the process of
>>> using information from the Web.
>>> For instance, nowadays, the most relevant results in a search engine
>>> are often not "main" pages, but deep, detailed links into a Web site.
>>>
>>> As a consequence, the decision for or against a product, restaurant,
>>> newspaper, etc., -- in other words: your offer --, is made already in
>>> the search results returned by the Web search engine. The better the
>>> search engine understands the information inside your pages, the better
>>> it can select, summarize, and present it to the target audiences.
>>>
>>> Schema.org is a standard for marking-up the information in your Web
>>> content in a way that search engines and other computer-based services
>>> can understand. In database terminology, the structures used to
>>> represent information as data are called a "schema". Schema.org defines
>>> a common schema for the interface between your Web content and search
>>> engines. It allows search engines and other services to better extract
>>> and understand your site.
>>>
>>> Why bother? Site owners spend a lot of effort for optimizing the user
>>> experience of their site for human visitors, with stylesheets, icons,
>>> font choices, etc. Schema.org is the next step: Optimizing the user
>>> experience for your site when it is presented to your target audience
>>> by a search engine, a mobile application, a browser extension, or any
>>> new digital intermediary that may be in between."
>>>
>>> Best
>>>
>>> Martin Hepp
>>>
>>> PS: I offer this text under Creative Commons CC BY 3.0 ;-)
>>>
>>> On Jul 29, 2013, at 5:17 PM, Dave Pawson wrote:
>>>
>>> > On 29 July 2013 15:23, Wes Turner <wes.turner@gmail.com> wrote:
>>> >>
>>> >> On Jul 29, 2013 3:53 AM, "Dave Pawson" <dave.pawson@gmail.com> wrote:
>>> >>>
>>> >>> Reading http://schema.org/docs/gs.html (IMHO) I don't see the
>>> >>> salesman's version, a trainer's view of the ideas behind schema.org.
>>> >>>
>>> >>> Has anyone started to think of how a web monkey or home user might
>>> >>> be persuaded to adopt microdata for their own usage? E.g. taking
>>> >>> the user perspective?
>>> >>> Dan and others may well find their way round schema.org, but it
>>> >>> isn't so easy to get started when a new user comes across it?
>>> >>
>>> >> When you say "taking the user perspective", what exactly do you
>>> >> mean by that? How are you suggesting the pitch should be modified
>>> >> in order to reach the target audience?
>>> >
>>> > IMHO that says it, succinctly and for a knowledgeable audience.
>>> > If you look at intro type books (dummys ... etc), there is much more
>>> > of a sell there. Persuasion as to why this tech is useful for them,
>>> > meets an objective the reader may have?
>>> >
>>> > E.g. "A collection of schemas"... WTF is a schema...?
>>> >
>>> > " html tags, that webmasters can use to markup their pages in ways
>>> > recognized by major search providers."
>>> > Oh - that's not me then, I'm not a webmaster...
>>> >
>>> > I.e just the slant?
>>> >
>>> > Does that make sense?
>>> >
>>> > regards DaveP
>>> >
>>> >
>>> >>
>>> >> schema.org has a fairly great description:
>>> >>
>>> >> """
>>> >> What is Schema.org?
>>> >> This site provides a collection of schemas, i.e., html tags, that
>>> >> webmasters can use to markup their pages in ways recognized by major
>>> >> search providers. Search engines including Bing, Google, Yahoo! and
>>> >> Yandex rely on this markup to improve the display of search results,
>>> >> making it easier for people to find the right web pages.
>>> >> Many sites are generated from structured data, which is often stored
>>> >> in databases. When this data is formatted into HTML, it becomes very
>>> >> difficult to recover the original structured data. Many applications,
>>> >> especially search engines, can benefit greatly from direct access to
>>> >> this structured data.
>>> >> On-page markup enables search engines to understand the information
>>> >> on web pages and provide richer search results in order to make it
>>> >> easier for users to find relevant information on the web. Markup can
>>> >> also enable new tools and applications that make use of the structure.
>>> >> A shared markup vocabulary makes it easier for webmasters to decide
>>> >> on a markup schema and get the maximum benefit for their efforts. So,
>>> >> in the spirit of sitemaps.org, search engines have come together to
>>> >> provide a shared collection of schemas that webmasters can use.
>>> >> """
>>> >>
>>> >> schema.org/docs/gs.html has the following heading structure:
>>> >>
>>> >> Getting started with schema.org
>>> >> * How to mark up your content using Microdata
>>> >> * Why use Microdata? [what about RDFa, these days]
>>> >> * Using the schema.org vocabulary
>>> >> * Advanced-topic: machine-understandable versions of information
>>> >>
>>> >>> The other side of this is the breadth of options? How might the
>>> >>> increasingly large number of terms be 'filtered' for use by the man
>>> >>> in the street to optimise his/her chances of a search engine result?
>>> >>>
>>> >>> I think this aspect could and should be given consideration as the
>>> >>> size of the main term set increases.
>>> >>>
>>> >>> Just a thought. Is there work being done in this area?
>>> >>
>>> >> There is a fair amount of research regarding meta tag stuffing in
>>> >> regards to SEO.
>>> >>
>>> >>>
>>> >>> regards
>>> >>>
>>> >>> --
>>> >>> Dave Pawson
>>> >>> XSLT XSL-FO FAQ.
>>> >>> Docbook FAQ.
>>> >>> http://www.dpawson.co.uk
>>> >>>
>>> >>
>>> >> IMHO, from an en-US perspective, the copy text for the schema.org
>>> >> Ontology:
>>> >>
>>> >> * is fairly verbose
>>> >> * could have a few more bullet points
>>> >> * could be updated to reference the supported formats
>>> >>   (RDF/XML, Turtle, JSON-LD, N3, NTriples, HTML5 Microdata, and
>>> >>   *RDFa*)
>>> >> * could more directly allude to schema.rdfs.org and
>>> >>   http://schema.rdfs.org/tools.html
>>> >> * could link to topical Wikipedia pages
>>> >>
>>> >> Wikipedia pages
>>> >>
>>> >> * /Linked_data
>>> >> * /Semantic_web
>>> >> * /Microdata_(HTML)
>>> >>
>>> >> I collected a number of Wikipedia links that may be useful for, as
>>> >> you put it, the "web monkey and home user" here:
>>> >> http://www.reddit.com/r/semanticweb/comments/1dvakc/schemaorgdataset_standard_schema_for_linked_data/
>>> >>
>>> >> Please feel free to share and incorporate this research.
>>> >
>>> >
>>> >
>>> > --
>>> > Dave Pawson
>>> > XSLT XSL-FO FAQ.
>>> > Docbook FAQ.
>>> > http://www.dpawson.co.uk
>>> >
>>>
>>> --------------------------------------------------------
>>> martin hepp
>>> e-business & web science research group
>>> universitaet der bundeswehr muenchen
>>>
>>> e-mail: hepp@ebusiness-unibw.org
>>> phone: +49-(0)89-6004-4217
>>> fax: +49-(0)89-6004-4620
>>> www: http://www.unibw.de/ebusiness/ (group)
>>>      http://www.heppnetz.de/ (personal)
>>> skype: mfhepp
>>> twitter: mfhepp
>>>
>>> Check out GoodRelations for E-Commerce on the Web of Linked Data!
>>> =================================================================
>>> * Project Main Page: http://purl.org/goodrelations/
>>>
>
> --------------------------------------------------------
> martin hepp
> e-business & web science research group
> universitaet der bundeswehr muenchen
>
> e-mail: hepp@ebusiness-unibw.org
> phone: +49-(0)89-6004-4217
> fax: +49-(0)89-6004-4620
> www: http://www.unibw.de/ebusiness/ (group)
>      http://www.heppnetz.de/ (personal)
> skype: mfhepp
> twitter: mfhepp
>
> Check out GoodRelations for E-Commerce on the Web of Linked Data!
> =================================================================
> * Project Main Page: http://purl.org/goodrelations/
>
Received on Monday, 29 July 2013 17:18:59 UTC