- From: LJ.Garcia <lj.garcia.co@gmail.com>
- Date: Tue, 3 Nov 2020 12:34:22 +0100
- To: "Gray, Alasdair J G" <A.J.G.Gray@hw.ac.uk>
- Cc: Dan Brickley <danbri@google.com>, "public-bioschemas@w3.org" <public-bioschemas@w3.org>
- Message-ID: <CAPZUG=DrSArS6762S_7tnfzzFnzhui41dvZ79aZZPRrgikxw3g@mail.gmail.com>
Hi Alasdair,

I would say good practices about sitemaps and robots.txt would fall into the subject for our next community call.

Regards,

On Tue, Nov 3, 2020 at 10:05 AM Gray, Alasdair J G <A.J.G.Gray@hw.ac.uk> wrote:

> Hi All
>
> Dan thanks for the prompt on this and I would also encourage the use of
> sitemaps to allow us to know what pages are available on your site.
>
> I have added a field to the list of live deploys that lists the sitemap as
> well, although this is currently not shown on the website it is useful for
> us to have a list of these. You can find details in the following PR
>
> https://github.com/BioSchemas/bioschemas.github.io/pull/340
>
> Best regards
>
> Alasdair
>
> --
> Alasdair J G Gray
> Associate Professor in Computer Science,
> School of Mathematical and Computer Sciences
> Heriot-Watt University, Edinburgh, UK.
>
> Email: A.J.G.Gray@hw.ac.uk
> Web: http://www.macs.hw.ac.uk/~ajg33
> ORCID: http://orcid.org/0000-0002-5711-4872
> Office: Earl Mountbatten Building 1.39
> Twitter: @gray_alasdair
>
> From: "danbri@google.com" <danbri@google.com>
> Date: Monday, 2 November 2020 at 19:12
> To: "public-bioschemas@w3.org" <public-bioschemas@w3.org>
> Subject: Robots.txt and Sitemap files
> Resent from: "public-bioschemas@w3.org" <public-bioschemas@w3.org>
> Resent date: Monday, 2 November 2020 at 19:11
>
> Just a quick note to encourage discussion of robots.txt
> <https://en.wikipedia.org/wiki/Robots_exclusion_standard> and sitemap
> <https://en.wikipedia.org/wiki/Sitemaps> files as something that
> bioschemas implementers should think about. There are a few cases of
> bioschemas-publishing sites excluding most crawlers via a very restrictive
> robots.txt file. Similarly, sitemap files can make large and complex sites
> easier for crawlers (whether simple code or large/commercial) to collect
> data from efficiently, including URL discovery. Since the hope has always
> been that bioschemas will encourage innovative uses of marked up data, it
> seems worth making sure that sites aren't accidentally excluding
> bioschema-crawlers...
>
> cheers,
>
> Dan
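To make the point in the quoted message concrete, here is a minimal sketch of how a small crawler might check a site's robots.txt and read its sitemap. Everything in it is an assumption for illustration: the robots.txt and sitemap contents, the example.org URLs, and the "bioschemas-crawler" user agent name are hypothetical and not taken from any site discussed in this thread. It uses only the Python standard library (`RobotFileParser.site_maps()` needs Python 3.8 or later).

```python
from urllib.robotparser import RobotFileParser
import xml.etree.ElementTree as ET

# Hypothetical robots.txt: admits one named crawler, blocks everyone else,
# and advertises a sitemap location.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /

Sitemap: https://example.org/sitemap.xml
"""

# Hypothetical sitemap listing two pages that carry markup.
SITEMAP_XML = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.org/dataset/1</loc></url>
  <url><loc>https://example.org/dataset/2</loc></url>
</urlset>
"""


def load_robots(robots_txt):
    """Parse robots.txt text without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(robots_txt, user_agent, url):
    """Return True if user_agent may fetch url under the given rules."""
    return load_robots(robots_txt).can_fetch(user_agent, url)


def advertised_sitemaps(robots_txt):
    """Return sitemap URLs declared in robots.txt (empty list if none)."""
    return load_robots(robots_txt).site_maps() or []


def sitemap_urls(sitemap_xml):
    """Extract every <loc> entry from a sitemap <urlset> document."""
    ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
    root = ET.fromstring(sitemap_xml.encode("utf-8"))
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", ns)]


if __name__ == "__main__":
    page = "https://example.org/dataset/1"
    for agent in ("Googlebot", "bioschemas-crawler"):
        print(agent, "allowed:", is_allowed(ROBOTS_TXT, agent, page))
    print("sitemaps:", advertised_sitemaps(ROBOTS_TXT))
    print("pages:", sitemap_urls(SITEMAP_XML))
```

With rules like these, a catch-all `Disallow: /` shuts out any crawler that is not explicitly named, which is the accidental exclusion described above, while the sitemap gives a crawler the page URLs directly instead of making it discover them by following links.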
Received on Tuesday, 3 November 2020 11:34:47 UTC