- From: Henrik Frystyk Nielsen <frystyk@w3.org>
- Date: Tue, 16 Apr 1996 12:16:20 -0400
- To: Gecse Roland <groland@balu.sch.bme.hu>
- Cc: www-lib <www-lib@w3.org>
Gecse Roland writes:

> In my robot I use the HTParse with PARSE_HOST to extract the
> starting hostname from a command line query. The robot takes everything
> under this URL. BUT, if a host has more than one name, and in the HTML
> are references to both of them, how can I know if this is the same host?

Good question. In the 4.0D version, the HTDNS module kept its own cache of DNS entries in order to do the timing and to save DNS queries. After intense discussion on the HTTP working group mailing list, it is enforced that _if_ you cache DNS entries then you _must_ honor the TTL for the DNS records. Unfortunately, gethostbyname doesn't provide this information, so the current version of HTDNS does not conform to this. However, as you can manually set the timeout for entries and it automatically flushes the cache if an error occurs, it does better than most other Web applications that I have seen.

In the next version, the DNS stuff will be separated out, and eventually we will have to write our own DNS resolver that gives better information about:

- canonical names
- connect times

> 1) How can I get the IP address of a host from the hostname using the
> reflib?

Currently there is no way (because of limitations in gethostbyname) to know that www12.w3.org is the same as www.w3.org.

> 2) How can I get all the different names of a host?

Again, there is no way to do this. The best you can get is the IP addresses of a multihomed host - not the alias names.

--
Henrik Frystyk Nielsen, <frystyk@w3.org>
World-Wide Web Consortium, MIT/LCS NE43-356
545 Technology Square, Cambridge MA 02139, USA
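To illustrate the point about IP addresses being the best information available, here is a rough sketch of how a robot might guess whether two names refer to the same host by comparing the address lists the resolver returns. It is written in Python rather than against the library itself, and it uses socket.gethostbyname_ex, which wraps the same kind of lookup as gethostbyname. The function names resolve_ipv4 and probably_same_host are made up for this example and are not part of the library. The overlap test is only a heuristic: round-robin DNS can hand out disjoint address lists for the same service, and several unrelated sites can share one address.

```python
import socket


def resolve_ipv4(hostname, resolver=socket.gethostbyname_ex):
    """Return the set of IPv4 addresses the resolver reports for hostname.

    The resolver argument is injectable (an assumption of this sketch) so
    the logic can be exercised without touching the network.
    """
    _primary_name, _aliases, addresses = resolver(hostname)
    return set(addresses)


def probably_same_host(name_a, name_b, resolver=socket.gethostbyname_ex):
    """Heuristic: treat two names as the same host if their address sets overlap.

    Returns False when either name fails to resolve.
    """
    try:
        addrs_a = resolve_ipv4(name_a, resolver)
        addrs_b = resolve_ipv4(name_b, resolver)
    except OSError:  # socket.gaierror and socket.herror are subclasses
        return False
    return bool(addrs_a & addrs_b)
```

A robot could call probably_same_host("www.w3.org", "www12.w3.org") before deciding whether a link leaves the starting host; what it returns depends entirely on what the local resolver reports at that moment.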
Received on Tuesday, 16 April 1996 12:16:43 UTC