- From: <martin.hepp@ebusiness-unibw.org>
- Date: Tue, 20 May 2014 17:37:34 +0200
- To: Stéphane Corlosquet <scorlosquet@gmail.com>
- Cc: Markus Lanthaler <markus.lanthaler@gmx.net>, Dan Scott <dan@coffeecode.net>, Dan Brickley <danbri@google.com>, W3C Web Schemas Task Force <public-vocabs@w3.org>
Note that for an informed decision, we would have to look into the data
structures driving typical dynamic Web sites. If they store single
keywords, we can recommend single keywords per property. If they
typically hold lists of keywords, we should recommend a string with a
delimiter. If both are popular, allow both. It can be unnecessarily
burdensome for a Web developer to tokenize a string given from a back-end
database in the template code with regular expressions or similar.

So in a nutshell, a good Web vocabulary should support a dynamic degree
of granularity - allowing site-owners to preserve as much structure as is
available from the existing data sources, but not enforcing the lifting
and cleansing of the data, since this will limit the amount of data
published.

Martin

On 20 May 2014, at 16:45, Stéphane Corlosquet <scorlosquet@gmail.com> wrote:

>
> On Tue, May 20, 2014 at 10:39 AM, Markus Lanthaler <markus.lanthaler@gmx.net> wrote:
> On Tuesday, May 20, 2014 4:00 PM, Dan Scott wrote:
> > On Tue, May 20, 2014 at 02:17:12PM +0100, Dan Brickley wrote:
> > >On 17 May 2014 06:31, Stéphane Corlosquet <scorlosquet@gmail.com> wrote:
> > >> From previous conversations on this list, it looks like
> > >> http://schema.org/keywords is meant to hold a list of comma-separated
> > >> keywords, like the RDFa on this page:
> > >> http://arc.lib.montana.edu/msu-photos/item/286:
> > >>
> > >> <span property="keywords">john burke, msc, football, team</span>
> > >>
> > >> If this is correct, the description for this property, which currently reads
> > >> "The keywords/tags used to describe this content", could be a bit more
> > >> detailed. I suggest:
> > >>
> > >> A comma-separated list of keywords/tags used to describe this content.
> > >
> > >This sounds reasonable to me. The only objections I can think of
> > >involve trying to stretch this property too far, e.g. phrases that
> > >contain commas within them. Let's keep it simple...
> > >
> > >Does anyone here think that this change would not be an improvement?
>
> I was just wondering why there doesn't exist a singular version of
> "keywords", i.e., "keyword". Was that somehow forgotten when all plurals
> were deprecated or was this a deliberate decision?
>
> I had the same reaction as you at first when I discovered this, but
> 'keywords' was kept plural for that very reason, because it's one string
> containing a list of comma-separated keywords. I was surprised initially
> but apparently there are systems/folks who prefer to use that as opposed
> to breaking down the list into individual properties.
>
> Steph.
>
> > I think this matters because...
>
> > there are currently hundreds and, as sites upgrade, will be thousands
> > of library Web sites that express "keywords" like:
> >
> > * keywords: Linux.
> > * keywords: Internet programming.
> > * keywords: Web sites > Design.
> > * keywords: Electronic mail systems > Security measures.
> >
> > This is because we augment the existing display of subject headings like
> > so:
> >
> > <div property="keywords">
> > <a href="search?email">Electronic mail systems</a> >
> > <a href="search?email+security">Security measures.</a>
> > </div>
>
> could also be expressed as
>
> <span property="keyword"><a href="search?email">Electronic mail systems</a></span> >
> <span property="keyword"><a href="search?email+security">Security measures</a></span>.
>
> which would have the advantage that the keywords are already tokenized by
> the publisher instead of forcing the consumer to do so... which would, btw.,
> also address Stéphane's concern below
>
> > >This sounds reasonable to me. The only objections I can think of
> > >involve trying to stretch this property too far, e.g. phrases that
> > >contain commas within them. Let's keep it simple...
>
>
> --
> Markus Lanthaler
> @markuslanthaler
>
>
> --
> Steph.
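
To make the granularity point concrete, here is a minimal Python sketch of how a data consumer might accept both forms discussed in this thread: a delimited "keywords" string that has to be tokenized, and one value per property that is already tokenized by the publisher. Everything in it is an assumption for illustration: the function name collect_keywords, the dict-of-lists input shape, and the singular "keyword" property, which was only proposed in this thread.

```python
from typing import Dict, List


def collect_keywords(properties: Dict[str, List[str]], delimiter: str = ",") -> List[str]:
    """Flatten keyword values from both granularities into one list.

    `properties` maps a property name to the list of string values
    extracted from a page (hypothetical shape, for illustration only).
    """
    result: List[str] = []

    # Plural form: each value is a delimited string, so the consumer
    # has to tokenize it and cannot tell apart commas inside a phrase.
    for value in properties.get("keywords", []):
        for token in value.split(delimiter):
            token = token.strip()
            if token:
                result.append(token)

    # Singular form (hypothetical "keyword" property): each value is
    # already one keyword, so embedded commas are preserved as-is.
    for value in properties.get("keyword", []):
        value = value.strip()
        if value:
            result.append(value)

    return result
```

With this shape, a site that only has a delimited string publishes it under the plural property, while a site whose back end stores one keyword per row publishes each value separately and keeps phrases containing commas intact.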
Received on Tuesday, 20 May 2014 15:38:10 UTC