
parseURI and others...

From: Viktor <viktor@irisz.hu>
Date: Fri, 6 Sep 1996 01:54:28 +0200
Message-ID: <01BB9B96.57D5CE10@moon.irisz.hu>
To: "'www-jigsaw@w3.org'" <www-jigsaw@w3.org>
Instead of fixing the bug, I wrote a new parseURI,
which uses a regexp package by... hmmmmm...
Jonathan Payne, from Starwave Co.
I lost the original URL, and he didn't
write it in his docs. I will look it up in a search engine...

So it's: http://www.starwave.com/people/jpayne/java/

Maybe it doesn't fit your needs; it is surely slower
than parsing it 'manually'.

Actually, the header extraction in HTMLResource uses
this too (match /<title>\s*(.*)\s*</title>/), but it's terribly
slow for files *not* having a title, even when
I limited it to the first 1K of the file...
So now I'm looking for another solution, gathering
information and experimenting with some mystic
tools, like jax/jell/cup (flex/lalr) etc... (they are mystic for me
at least; I'm trying to discover which is for what...).
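
For illustration, here is a minimal sketch of the title match described
above, written against the standard java.util.regex package rather than
the Starwave package (whose behaviour is not shown here). The class name
TitleSketch, the LIMIT constant and the extractTitle helper are all
hypothetical. It uses a reluctant (.*?) so the match stops at the first
closing tag, and it only looks at the first 1K of the input. Whether this
removes the slowness seen with the original package is not something the
sketch can confirm.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TitleSketch {
    // Hypothetical limit, mirroring the "first 1K" mentioned above.
    static final int LIMIT = 1024;

    // Reluctant (.*?) stops at the first </title>;
    // DOTALL lets the title span line breaks.
    static final Pattern TITLE = Pattern.compile(
        "<title>\\s*(.*?)\\s*</title>",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Returns the title text, or null if none is found in the first LIMIT chars. */
    static String extractTitle(String html) {
        String head = html.length() > LIMIT ? html.substring(0, LIMIT) : html;
        Matcher m = TITLE.matcher(head);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // prints "Jigsaw"
        System.out.println(extractTitle("<html><head><title> Jigsaw </title></head></html>"));
        // prints "null"
        System.out.println(extractTitle("<html><body>no title here</body></html>"));
    }
}
```

A title that starts inside the first 1K but closes after it is reported as
missing by this sketch; that is a consequence of the cut-off, not of the
pattern.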

Since I use Perl, I really hate to write this kind of
thing with a lot of while/if/break/else/continue...

So here is the source...
(I hope my mailer won't reformat it too terribly ;)
(I would prefer to write <pre> here!)

---- in LookupState.java

import starwave.util.regexp.*;

---- .... ----

	static Regexp compReg, fragReg, queryReg;

	static {
		try {
			queryReg = Regexp.compile("\\?(.*)$");
			fragReg = Regexp.compile("#(.*)$");
			compReg = Regexp.compile("/?([^/]*)/?");
//			compReg = Regexp.compile("/?(.*)/?");
		} catch (Exception ex) {
			System.exit(1);		// TODO: find a better method here
		}
	}

	protected void parseURI ()
		throws HTTPException {

		Vector comps  = new Vector(8);
		StringBuffer buffer = new StringBuffer(uri);
		Result result;
		int pos=0;

		// Pull out the fragment ("#...") and the query ("?...") if present.
		if ((result = fragReg.searchForward(buffer.toString(), pos)) != null) {
			request.defineField("frag", result.getMatch(1));
		}
		if ((result = queryReg.searchForward(buffer.toString(), pos)) != null) {
			request.defineField("query", result.getMatch(1));
		}
		is_directory = (buffer.charAt(buffer.length()-1) == '/');
		// Collect the non-empty path components, one match at a time.
		while ((result = compReg.searchForward(buffer.toString(), pos)) != null) {
			if (result.getMatch(1).length() != 0)
				comps.addElement (unescape(result.getMatch(1)));
			pos = result.getMatchEnd();
		}
		components = new String[comps.size()] ;
		comps.copyInto (components) ;
		index = 0 ;
	}
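
For readers without the Starwave package, the same three-pattern idea can
be sketched with the standard java.util.regex package. This is not the
Jigsaw code: the class UriSketch and its fields are hypothetical, the
request.defineField and unescape calls are left out, and two things are
done differently on purpose. The fragment and query are cut off the
string before the path is split, and the component pattern uses [^/]+
instead of [^/]* so that it never matches the empty string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UriSketch {
    static final Pattern FRAG  = Pattern.compile("#(.*)$");
    static final Pattern QUERY = Pattern.compile("\\?(.*)$");
    static final Pattern COMP  = Pattern.compile("/?([^/]+)/?");

    String fragment;      // text after '#', or null
    String query;         // text after '?', or null
    boolean isDirectory;  // true when the path part ends in '/'
    List<String> components = new ArrayList<>();

    UriSketch(String uri) {
        String rest = uri;
        Matcher m = FRAG.matcher(rest);
        if (m.find()) {
            fragment = m.group(1);
            rest = rest.substring(0, m.start());  // drop "#..." before going on
        }
        m = QUERY.matcher(rest);
        if (m.find()) {
            query = m.group(1);
            rest = rest.substring(0, m.start());  // drop "?..." before splitting
        }
        isDirectory = rest.endsWith("/");
        m = COMP.matcher(rest);
        while (m.find()) {
            components.add(m.group(1));           // still escaped; no unescape here
        }
    }

    public static void main(String[] args) {
        UriSketch u = new UriSketch("/a/b/?x=1#top");
        System.out.println(u.components);   // prints [a, b]
        System.out.println(u.query);        // prints x=1
        System.out.println(u.fragment);     // prints top
        System.out.println(u.isDirectory);  // prints true
    }
}
```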
Received on Thursday, 5 September 1996 20:06:23 UTC
