RE: OK enough, lets fix blogspam

Yes, I agree.  In fact it needs to be more granular for other reasons
too that have nothing to do with spam or blogs.
 
Unfortunately <META> operates at the document level.  I suspect ROBOTS was
implemented as a META tag for the sole reason that it was the path of
least resistance (easier than changing the core HTML spec), but it's the
wrong level of granularity.
 
What's needed is something more like this:
<robots index=false>
<p>This paragraph will not be scanned by search engine spiders.
</robots>
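
As a rough sketch of what honoring such an element could look like on the
crawler side: the Python below collects a page's indexable text and skips
anything inside <robots index=false>.  The <robots> element is only the
proposal from this message, not something any real crawler is known to
support, and the names IndexableText and indexable_text are made up for
illustration.

```python
from html.parser import HTMLParser


class IndexableText(HTMLParser):
    """Collect page text, skipping anything inside <robots index=false>.

    <robots> is the hypothetical element proposed in this thread.
    """

    def __init__(self):
        super().__init__()
        self.skip_stack = []  # one bool per currently open <robots> element
        self.chunks = []      # text fragments that remain indexable

    def handle_starttag(self, tag, attrs):
        if tag == "robots":
            value = (dict(attrs).get("index") or "").lower()
            self.skip_stack.append(value == "false")

    def handle_endtag(self, tag):
        if tag == "robots" and self.skip_stack:
            self.skip_stack.pop()

    def handle_data(self, data):
        # Keep text only when no enclosing <robots> element opted out.
        if not any(self.skip_stack) and data.strip():
            self.chunks.append(data.strip())


def indexable_text(html):
    """Return the text a crawler honoring <robots index=false> would index."""
    parser = IndexableText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

Fed a page with a post, a <robots index=false> block of comments, and a
footer, indexable_text would return only the post and footer text.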
 
I agree with MEZ that this falls outside WSC scope, but it does seem
like something W3C ought to look at.

Michael McCormick, CISSP 
Lead Architect, Information Security 

This message may contain confidential and/or privileged information.  If
you are not the addressee or authorized to receive this for the
addressee, you must not use, copy, disclose, or take any action based on
this message or any information herein.  If you have received this
message in error, please advise the sender immediately by reply e-mail
and delete this message.  Thank you for your cooperation.

  _____  

From: Hallam-Baker, Phillip [mailto:pbaker@verisign.com] 
Sent: Wednesday, January 03, 2007 4:05 PM
To: McCormick, Mike; public-wsc-wg@w3.org
Subject: RE: OK enough, lets fix blogspam


I think it needs to be more granular. 
 
I want Google to index my post but not the comments.
 
 


  _____  

	From: public-wsc-wg-request@w3.org
[mailto:public-wsc-wg-request@w3.org] On Behalf Of
michael.mccormick@wellsfargo.com
	Sent: Wednesday, January 03, 2007 5:03 PM
	To: Hallam-Baker, Phillip; public-wsc-wg@w3.org
	Subject: RE: OK enough, lets fix blogspam
	
	
	Could this be done with a <META NAME="ROBOTS"
CONTENT="NOINDEX,NOFOLLOW"> tag?  Maybe it just needs to be more
granular, applying to specific portions of an HTML body instead of a whole
page.

	Michael McCormick, CISSP 
	Lead Architect, Information Security 


	 

  _____  

	From: public-wsc-wg-request@w3.org
[mailto:public-wsc-wg-request@w3.org] On Behalf Of Hallam-Baker, Phillip
	Sent: Wednesday, January 03, 2007 3:02 PM
	To: W3 Work Group
	Subject: OK enough, lets fix blogspam
	
	
	My blogs overrunneth with spam. Yea, my cup is full to
overflowing.
	 
	 
	None of the spam is targeted at either me or my readers. It is
all targeted at Google's web crawler and their PageRank algorithm.
	 
	At the F2F meeting someone opposite me raised a similar solution
to the following, but in the context of scripting. It's a simple fix that I
think would work.
	 
	 
	The idea is to have an HTML attribute or element that allows a
server to declare that a section of a Web page came from an external
source. It would encapsulate all blog comments and the like
so that browsers can look at the content and conclude 'don't run any
code from this region' and Web crawlers can ignore the content for the
purposes of PageRank and the like.
	 
	In order to get maximal security the best approach would be to
use some form of nonce sentinel value at the start and finish of the
block as was proposed at one of the TIPPI workshops.
	 
	In order to engage the type of accountability controls that I
want to establish it should also be possible to specify the
authenticated poster identity if known.
	 
	 
	So for example we might have:
	 
	<p>My Web 3.14159265 meme seems to be catching on. 
	<Inc:Start rel="foreign" authID="mailto:alice@example.com"
authmech="saml1.2" sentinel="aegq3tgr2q3uyt1387==" />
	Nice post but have you considered this? <a
href="http://www.spamisus.com/spork">Spork dietary supplement really
works!</a>
	<Inc:End sentinel="aegq3tgr2q3uyt1387==" />
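
To illustrate why the sentinel matters, here is a minimal Python sketch of
how a crawler might drop a foreign region delimited by the Inc:Start and
Inc:End markers in the example above.  The function name strip_foreign and
the regular expressions are assumptions for illustration only, and a real
consumer would use a proper parser rather than regexes.  Only an Inc:End
carrying the same sentinel as the opening marker closes the region, so an
end marker forged inside a comment is ignored.

```python
import re

# Matches the opening marker and captures its sentinel value.
START = re.compile(r'<Inc:Start\b[^>]*\bsentinel="([^"]+)"[^>]*/?>')


def strip_foreign(html):
    """Return html with every sentinel-delimited foreign region removed."""
    out = []
    pos = 0
    while True:
        m = START.search(html, pos)
        if m is None:
            out.append(html[pos:])  # no more foreign regions
            break
        out.append(html[pos:m.start()])
        sentinel = m.group(1)
        # Only an end marker with the matching sentinel closes the region.
        end = re.compile(
            r'<Inc:End\s+sentinel="' + re.escape(sentinel) + r'"\s*/?>'
        )
        e = end.search(html, m.end())
        if e is None:
            # Unterminated region: fail closed and drop the rest of the page.
            break
        pos = e.end()
    return "".join(out)
```

A comment that contains its own <Inc:End sentinel="guess" /> stays inside
the stripped region, because the poster cannot know the real sentinel.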
	 
	 
	It needs some work to fit it into XHTML properly. Close tags
don't take attributes in XML, which is a challenge.
	 
	To be effective the sentinel values have to be synthesized on
the fly with the rest of the content but that should not be a huge
issue.
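
A minimal sketch of the server side, assuming Python and its standard
secrets module: a fresh sentinel is generated for each comment at render
time, after the comment text is already fixed, so the poster cannot
predict it.  The function name wrap_foreign and the exact attribute layout
are hypothetical, and sanitizing the comment body itself is out of scope
here.

```python
import secrets
from html import escape


def wrap_foreign(comment_html, auth_id=None):
    """Wrap untrusted comment markup in Inc:Start / Inc:End markers.

    Returns (wrapped_markup, sentinel). The sentinel is a fresh random
    value chosen per call.
    """
    sentinel = secrets.token_urlsafe(16)
    # Belt and braces: regenerate on the vanishingly unlikely collision.
    while sentinel in comment_html:
        sentinel = secrets.token_urlsafe(16)

    attrs = ['rel="foreign"']
    if auth_id is not None:
        attrs.append('authID="%s"' % escape(auth_id, quote=True))
    attrs.append('sentinel="%s"' % sentinel)

    start = "<Inc:Start %s />" % " ".join(attrs)
    end = '<Inc:End sentinel="%s" />' % sentinel
    return start + comment_html + end, sentinel
```

Calling wrap_foreign once per comment on every page render gives each
region its own unguessable pair of markers.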
	 
	 
	Where is the best place to work on this? Do we have any Google
people here?

Received on Thursday, 4 January 2007 17:34:22 UTC