Re: databases: storing pages in / web front ends to

On Thu, 18 Jan 1996, Dan Weinreb wrote:

|    Date: Thu, 18 Jan 1996 14:30:10 +0100 (MET)
|    From: Robin Stephenson <>
|      My short-term aim (read: next month or so) is to identify some way
|    of putting Web pages into a database as objects, and having an object
|    server sitting in between our httpd and the database files.  Ideally
|    I'd have some way of ensuring (or at least checking) link integrity,
|    etc.  The appeal of this solution to me is that it is scalable, and
|    in the long term I may have to manage thousands of Web pages, rather
|    than the few dozen I do at the moment.
| Gee, do you actually want to put regular HTML pages into a database as
| objects, even if they just contain static text?  I've been asking
| people about this, and haven't heard much interest in it, so I'm
| curious what you feel would be the benefits of such an approach.

Yes, I do.  My reason for wanting to do this is that if I am providing
a single `page' in several formats, e.g. Lynx-compatible, Mosaic,
Mozilla, and several different languages, I don't want to have to deal
with umpteen files.  If everything is encapsulated in one place, and I
can just say `display yourself' to this object, life suddenly becomes
easier.  My idea is to treat it much like a C++ class, and derive
classes with more complex behaviour from a basic page object, which I
envisage resembling a normal HTML page in appearance, but with the
ability to add another language version to itself when requested -
some sort of admin tool would check out the text for a language (à la
SCCS/RCS) and check it back in.  The page object would also be able to
do integrity checking, by maintaining pointers to other page objects,
and creating links to whatever they happen to be called on the fly.
Moving things around within the web hierarchy then becomes easier as
well, almost as a side-effect.
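
A minimal sketch of the page-object idea above, written in Python
rather than C++ purely for brevity.  The class name, the method names,
and the `{name}' placeholder syntax are all assumptions made for
illustration; nothing here comes from an existing product.  It shows
one object holding every language/format variant, answering `display
yourself', and resolving links through pointers to other page objects
so that moving a page needs no edits elsewhere.

```python
class Page:
    """One logical page holding every language/format variant of itself."""

    def __init__(self, path, default_language="en"):
        self.path = path                    # where the page currently lives
        self.default_language = default_language
        self.variants = {}                  # (language, fmt) -> body text
        self.links = {}                     # link name -> target Page object

    def add_variant(self, language, fmt, body):
        """Check in the text for one language/format combination."""
        self.variants[(language, fmt)] = body

    def link_to(self, name, target):
        """Point at another Page object rather than at a fixed URL."""
        self.links[name] = target

    def display(self, language=None, fmt="plain"):
        """Return the best available variant with links filled in."""
        language = language or self.default_language
        # Try the exact match first, then fall back to the plain format,
        # then to the default language.
        candidates = [
            (language, fmt),
            (language, "plain"),
            (self.default_language, fmt),
            (self.default_language, "plain"),
        ]
        for key in candidates:
            if key in self.variants:
                body = self.variants[key]
                break
        else:
            raise LookupError("no variant stored for %r" % self.path)
        # Links are written as {name} in the body and resolved on the fly
        # from wherever the target page lives right now.
        for name, target in self.links.items():
            body = body.replace("{" + name + "}", target.path)
        return body
```

With this shape, changing `about.path' on a target page changes the
URL that every linking page emits the next time it is displayed.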

| For checking link integrity, I've mostly been hearing about various
| Web-site-developer tools that are supposed to do this for you.  I get
| the impression that they're kind of "batch" oriented, i.e. they go
| over your whole site and check out all the within-site links.  I think
| WebStar (is that the right name?  The O'Reilly web server that's sold
| in a box) comes with a tool like this, and Vermeer has one, and Adobe
| SiteMill is another.

It is a reasonably simple job to write a Perl script that connects to
port 80, requests each page, follows all links, and reports on bad
ones.  It's been reinvented several times, and is undoubtedly a useful
tool, but I'd rather have something that makes it difficult to create
bad links in the first place, and ideally something that warns you as
soon as links become bad - if a page object is told to display itself,
perhaps it could do a quick check on its links, and just unlink any
that appeared to be dead, or link them to a `dead link' page?  This
sort of thing is where I see the real benefit in having `live' web
pages.  All of this /would/ be possible by writing some sort of program
to check each page as it went out from a normal HTML file, but it
seems to me to be a less natural way of thinking about it.
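
As a rough illustration of the check-at-display-time idea above, here
is a small Python sketch.  It assumes links are stored in the page
text as `{link:ID}' placeholders and that a `store' dictionary maps
page IDs to their current paths; the placeholder syntax, the store,
and the `/dead-link.html' path are all hypothetical.  Unresolvable
links are pointed at the dead-link page and reported back.

```python
import re

# Hypothetical page that dead links are redirected to.
DEAD_LINK_PATH = "/dead-link.html"

# Matches placeholders such as {link:home}.
LINK_PATTERN = re.compile(r"\{link:([A-Za-z0-9_-]+)\}")


def render(body, store):
    """Replace {link:ID} placeholders with current paths from the store.

    Returns the rendered text and the list of IDs that were not found,
    so that the caller can warn someone as soon as a link goes bad.
    """
    dead = []

    def resolve(match):
        page_id = match.group(1)
        path = store.get(page_id)
        if path is None:
            dead.append(page_id)
            return DEAD_LINK_PATH
        return path

    return LINK_PATTERN.sub(resolve, body), dead
```

Because the check happens on every display, a deleted page shows up in
the returned list the first time anyone requests a page that pointed
at it, rather than at the next batch run.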

| What scaling problems are you anticipating?  I get the impression that
| there are a lot of web sites with thousands of pages that are just
| stored as ordinary files.  I guess checking links in "batch" mode
| could cause one scalability problem (it would just take too long to
| run it).  Were you thinking of anything else?

I have two worries.  One is maintenance - if someone changes their
email address I do not want to have to update all the pages either by
hand or with the equivalent of a sed script.  Likewise, if someone
decides they don't like the style of the icons, I do not want to have
to alter all of the links by hand.  My other worry is the sort of
creeping degeneration you see in some web sites - dangling links,
broken images (special case of dangling links, I suppose) and so on.
I feel that for large sites some mechanism to check this /whilst
creating pages/ would be useful, in addition to the standard
`site-checking' software.  I would like, ultimately, to be able to let
someone with very little knowledge of UNIX, HTML, and so on, design
and implement their own pages.
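
For the maintenance worry above, one possible approach is to keep
site-wide values in a single place and substitute them when a page is
displayed.  The sketch below uses Python's standard `string.Template';
the value names (`contact_email', `icon_dir') and the example address
are invented for illustration.

```python
from string import Template

# Site-wide values kept in exactly one place (names are hypothetical).
site_values = {
    "contact_email": "webmaster@example.org",
    "icon_dir": "/icons/plain",
}


def render_page(source, values):
    """Fill $name placeholders in a page from the shared values.

    Template.substitute raises KeyError for an unknown placeholder, so
    a mistyped name is caught when the page is created or displayed.
    """
    return Template(source).substitute(values)
```

Changing the email address or the icon style then means editing one
entry in `site_values' instead of every page.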

| I'm interested in this because we (Object Design) make an
| object-oriented database system (called ObjectStore) and I'm trying to
| learn more about the suitability of our product for the kind of
| purposes you're talking about.

I'd be interested to know your thoughts on the suitability of your

|      Another thing that I'd like to be able to do is implement a Web
|    front-end to a transaction-processing database - e.g. write a simple
|    script to get information from a form, tell the database it's a
|    payment, or a debit, and have the database do most of the work (then I
|    can get on with writing nice Web pages, and leave Perl more-or-less
|    alone...)  It seems to make sense to me to implement both database
|    systems on a common platform, or at least with a common interface.
| If you're just talking about using a conventional RDBMS, there are

  Well, ideally I'd have one program that could do everything.  Then I
could use the db to store both the pages forming the interface and the
data forming the application.  So I guess I'm looking for an OORDBMS.
  The thing about programming is the sense that all computers and
applications are equivalent - provided they're sufficiently complex,
of course.  By this I mean that I'm sure I could, given time,
implement this system with a notepad and pencil, and sit on my
computer here typing down the line every time someone connected to
port 80 - `Content-type: text/html', etc.  What I want is a neat,
elegant solution that gives me the expandability and flexibility that
I'm going to need.

| quite a lot of products out there to help make this easier.  There's
| net.Forms from net.Genesis, and Cold Fusion from Allaire, and IBM's
| DB2-WWW Connection, and others.

This is precisely why I'm asking the question - out of the plethora of
products, I need to choose just one.  I'm following up all of the
products that are mentioned to me, and am pleased to receive advice
and recommendations.
  Thank you very much for your detailed answer, by the way - as a
newcomer to this mailing list, I'm unaware of what's been said
previously.  I presume that this is a topic that has been discussed at
some length?

Robin Stephenson.      (send email with subject `send pgp key' for pgp key)
Add Butter And Vanilla

Received on Thursday, 18 January 1996 13:02:09 UTC