Re: International Search Engine Submission from Erik van der Poel on 2000-02-08 (www-international@w3.org from January to March 2000)

From: Erik van der Poel <erik@netscape.com>
Date: Tue, 08 Feb 2000 13:28:16 -0800
To: Suzanne Topping <stopping@rochester.rr.com>
CC: www <www-international@w3.org>, nelocsig <nelocsig@egroups.com>
Message-ID: <38A08A70.BD378F0A@netscape.com>

I have a question about search engines. In addition to accepting
submitted Web sites, search engine companies usually(?) run robots
(crawlers) that automatically find Web sites and index them for
searching.

I'm wondering whether those crawlers deal with character encodings such
as ISO-2022-JP, where bytes such as '<', '>' and '&' can appear inside
multi-byte characters but don't have the same meaning as the HTML
characters.

In other words, do the crawlers deal with ISO-2022-JP? Or do they fail
to parse such documents, thereby failing to follow any of the URLs in
them?
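
To make the concern concrete, here is a minimal Python sketch, not a
description of how any actual crawler works. It compares a naive byte
scan for '<', '>' and '&' with a scan that tracks ISO-2022-JP escape
sequences. The function name markup_offsets and the sample document are
hypothetical, and the sketch assumes only the four escape sequences of
plain ISO-2022-JP (ESC ( B, ESC ( J, ESC $ @, ESC $ B).

```python
def markup_offsets(data: bytes, targets: bytes = b"<>&") -> list:
    """Offsets of target bytes seen while the stream is in a single-byte set.

    Hypothetical helper: it handles only the escape sequences of plain
    ISO-2022-JP and assumes well-formed input.
    """
    offsets = []
    double_byte = False  # ISO-2022-JP text starts in ASCII
    i = 0
    while i < len(data):
        seq = data[i:i + 3]
        if seq in (b"\x1b(B", b"\x1b(J"):    # switch to ASCII / JIS X 0201 Roman
            double_byte = False
            i += 3
        elif seq in (b"\x1b$@", b"\x1b$B"):  # switch to JIS X 0208 (two bytes per char)
            double_byte = True
            i += 3
        elif double_byte:
            i += 2                           # skip one two-byte character
        else:
            if data[i] in targets:
                offsets.append(i)
            i += 1
    return offsets


# Hypothetical sample: a link whose text is three hiragana characters.
sample = '<a href="x.html">うぜぞ</a>'.encode("iso2022_jp")

# Naive scan: treats every matching byte as markup, whatever the mode.
naive = [i for i, b in enumerate(sample) if b in b"<>&"]

# Mode-aware scan: ignores bytes that are halves of two-byte characters.
aware = markup_offsets(sample)

print(sample)
print("naive:", naive)
print("aware:", aware)
```

The naive scan reports more "markup" bytes than the mode-aware one,
because the second bytes of the three kana happen to have the values of
'&', '<' and '>'. A parser that ignores the escape sequences could
therefore see tags or entities that are not there.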

Erik

Received on Tuesday, 8 February 2000 16:32:15 UTC