Hosts and domains

Dear all,

I'm working on the new ICRA label generator and, since it asks for user 
input, it's 99% error checking of course.

I'm writing the code to extract the domain info which the draft rule spec 
calls for. This is made easier by Perl's URI module that has a nice little 
method that returns the host for a given URL. OK. So:

http://www.example.org returns www.example.org. So far so good. I can strip 
off the www and use that.
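To make that step concrete, here is a rough sketch. It uses Python's standard urllib.parse rather than the Perl URI module I'm actually working with, and the helper names host_of and strip_www are just made up for illustration.

```python
from urllib.parse import urlsplit


def host_of(url):
    """Return the host part of a URL (lower-cased), or None if there is none."""
    return urlsplit(url).hostname


def strip_www(host):
    """Drop a single leading 'www.' label if present; otherwise return host unchanged."""
    return host[4:] if host.startswith("www.") else host


if __name__ == "__main__":
    host = host_of("http://www.example.org")
    print(host)             # www.example.org
    print(strip_www(host))  # example.org
```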

Then we have http://subdomain.example.org. The domain is still example.org 
but the host is now subdomain.example.org. OK, rather than writing a rule to 
match ".*" I can now write it to match ".*subdomain.example.org.*". This is 
going to be important for people who have homepages on big ISPs whose 
websites have addresses like mydog.btyahoo.com.

But extracting a domain from a host is not always easy. It's OK for TLDs 
like .org and .com, where you just take the bits either side of the last ".", 
but that breaks down with something like example.co.uk, where it gives co.uk.
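A small sketch of the problem, again in Python rather than Perl. The function name naive_domain is hypothetical, and it implements only the "bits either side of the last dot" approach described above, nothing smarter.

```python
def naive_domain(host):
    """Guess the domain by keeping only the last two dot-separated labels."""
    labels = host.split(".")
    return ".".join(labels[-2:])


if __name__ == "__main__":
    print(naive_domain("www.example.org"))    # example.org
    print(naive_domain("www.example.co.uk"))  # co.uk, which is not what we want
```

Getting this right in general seems to need a list of which suffixes behave like .co.uk, which is exactly the bookkeeping I'd like to avoid.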

So... as discussed earlier, the restriction is important. We need that to 
limit the scope of a label that matches ".*" but on reflection, can it be 
host rather than domain? This takes care of those personal websites and a 
heap of other stuff.

If we have:

<rule:hostRestriction>example.org</rule:hostRestriction>

Then "matches .*" should still match all subdomains of example.org. But we 
can also have:

<rule:hostRestriction>subdomain.example.org</rule:hostRestriction>

Which would only match against that host or its subdomains. In essence, the 
value of the hostRestriction must match the right hand side of the host 
within the URI under test.
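As a sketch of the check I have in mind (Python again, with a made-up function name), the test could look something like the following. Note one assumption of mine that the draft does not spell out: I only let the restriction match on whole labels, so that a host like badexample.org does not slip through a restriction of example.org.

```python
def host_matches_restriction(host, restriction):
    """Return True if host equals restriction or is a subdomain of it.

    Assumption: matching is case-insensitive and happens on label
    boundaries, i.e. the restriction must be the whole right-hand side
    of the host starting at a dot.
    """
    host = host.lower().rstrip(".")
    restriction = restriction.lower().rstrip(".")
    return host == restriction or host.endswith("." + restriction)


if __name__ == "__main__":
    print(host_matches_restriction("www.example.org", "example.org"))            # True
    print(host_matches_restriction("example.org", "subdomain.example.org"))      # False
    print(host_matches_restriction("badexample.org", "example.org"))             # False
```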

OK?

Phil. 

Received on Wednesday, 15 December 2004 14:29:22 UTC