Re: [OpenJade-devel] Re: Any C++ programmers around? (was: Unix --> NT (source code stuff))

From: Adam Di Carlo (adam@onshore.com)
Date: Sat, Mar 17 2001


    Date: Sat, 17 Mar 2001 05:28:40 -0500 (EST)
    To: openjade-devel@lists.sourceforge.net
    Cc: Nick Kew <nick@webthing.com>, "Michael D. Crawford" <crawford@goingware.com>, W3C Validator <www-validator@w3.org>
    From: Adam Di Carlo <adam@onshore.com>
    Message-ID: <oa66h8ofjn.fsf@arroz.fake>
    Subject: Re: [OpenJade-devel] Re: Any C++ programmers around? (was: Unix --> NT (source code stuff))
    
    Terje Bless <link@tss.no> writes:
    
    > [ Any objections? Adam? Brandon? Anyone?                            ]
    
    Nope, not at all!
    
    > >But again, I've had to do that to run it from the Web, so a commandline
    > >variant should be equally straightforward (though a hack).
    > 
    > Well, the point is that (Open)SP expects some things to be specified in
    > environment variables instead of as switches on the command line. This is a
    > nice feature addition for the command line -- as you won't have to specify
    > the switches every time -- and kinda works in CGI land -- because the
    > environment goes away with each invocation -- but it's a sordid mess when
    > you move to mod_perl or other persistent interpreters where the lifetime of
    > the environment (parent process) spans several invocations of SP.
    > 
    > We're currently futzing around with SGML_CATALOG_FILES, SGML_SEARCH_PATH,
    
    Well, if you build with the right settings, you don't need to do this.
    
    > SP_CHARSET_FIXED, and SP_ENCODING. In particular, SP_CHARSET_FIXED and
    > SP_ENCODING are "magical" in that they are necessary to enable XML mode.
    
    Hmm.  I find that XML mode works pretty well without those.  But maybe
    I haven't looked deeply enough.  Can someone enlighten me?
    
    Anyhow, in principle (aside from SGML_CATALOG_FILES and
    SGML_SEARCH_PATH), I agree there should be cmd-line equivalents.
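
    Until such switches exist, one way around the persistent-interpreter
    problem is to hand each parser run its own copy of the environment
    instead of changing the parent's. What follows is a minimal sketch,
    not anything SP or the Validator ships: the helper names
    sp_environment and run_parser are made up for illustration, the
    values "YES" and "XML" for SP_CHARSET_FIXED and SP_ENCODING are an
    assumption about what enables XML mode, and it assumes an onsgmls
    binary on the PATH that accepts -s to suppress normal output.

```python
import os
import subprocess


def sp_environment(catalog_files, search_path=None, xml_mode=False, base_env=None):
    """Build a private environment for one parser run (hypothetical helper).

    The parent's environment is copied, never modified, so nothing leaks
    from one invocation to the next in a persistent interpreter.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["SGML_CATALOG_FILES"] = os.pathsep.join(catalog_files)
    if search_path:
        env["SGML_SEARCH_PATH"] = os.pathsep.join(search_path)
    if xml_mode:
        # Assumed values; check them against your SP build's documentation.
        env["SP_CHARSET_FIXED"] = "YES"
        env["SP_ENCODING"] = "XML"
    else:
        # Drop settings that may have been inherited from the parent.
        env.pop("SP_CHARSET_FIXED", None)
        env.pop("SP_ENCODING", None)
    return env


def run_parser(document, env, command="onsgmls"):
    """Run the parser on one document with the given private environment."""
    return subprocess.run(
        [command, "-s", document],
        env=env,
        capture_output=True,
        text=True,
    )
```

    Something like run_parser("index.html", sp_environment(["/etc/sgml/catalog"]))
    would then validate one file (the catalog path is only an example),
    and the next call can use completely different settings.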
    
    > >>* Ability to say: "use this SGML Declaration and this DTD".
    > >>  -- SGML Open Catalogs are fine and dandy and all, but for some things it
    > >>     would be much less painful to say "use this" on the command line.
    > >
    > >I'd like to, but that'll be a longer-term thing.  I'd like it still better
    > >if someone with a much deeper knowledge of SGML than mine looked at it.
    > 
    > Et tu, Brute? Aren't there any real SGML gurus around that could help my
    > poor tortured brain -- and Nick's, apparently :-) -- tackle SGML issues? I
    > barely understand half of what the SP man pages are trying to tell me
    > because they speak in SGML-alese (i.e. in tongues for the good it does me)
    > and XML is "double the fun" (that was your cue Sean! ;D).
    
    Oh, I pointed out in my last email that DTDDECL in your SGML Open
    catalogs will fix this. We ship a lot of DTDs in Debian this way (but
    I had to hack Jade for Debian so Jade wouldn't bitch about the
    directive, which it doesn't support).
    
    Perhaps I should give an example:
    
    PUBLIC  "-//W3C//DTD XHTML 1.0 Strict//EN"        "xhtml1-strict.dtd"
    DTDDECL "-//W3C//DTD XHTML 1.0 Strict//EN"        "xhtml1.dcl"
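
    To make the pairing explicit: both entries key on the same public
    identifier, one naming the DTD and the other the SGML declaration
    to use with it. A rough sketch of reading such entries follows. The
    function read_catalog_entries is hypothetical and handles only
    one-line PUBLIC and DTDDECL entries like the two above; it is not
    a full catalog parser (no comments, no other keywords).

```python
import shlex


def read_catalog_entries(text):
    """Map each public identifier to (DTD system id, declaration system id).

    Hypothetical helper: only one-line PUBLIC and DTDDECL entries are
    understood, and every other line is ignored.
    """
    dtds = {}
    declarations = {}
    for raw_line in text.splitlines():
        # shlex.split handles the double-quoted identifiers for us.
        fields = shlex.split(raw_line)
        if len(fields) != 3:
            continue
        keyword, public_id, system_id = fields
        if keyword.upper() == "PUBLIC":
            dtds[public_id] = system_id
        elif keyword.upper() == "DTDDECL":
            declarations[public_id] = system_id
    # A DTD without a DTDDECL entry gets None for its declaration.
    return {pid: (dtds[pid], declarations.get(pid)) for pid in dtds}
```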
    
    
    > >> * Blue Sky: A Perl (XS) Module Interface
    
    > >I don't see myself getting involved with that.  If I'm hacking SP,
    > >that's because I don't want to wrap it in something else - like Perl.
    [...]
    > In consolation, it should be a pretty easy task if you A) know C++ and B) a
    > little about Perl XS modules. Perl comes with tools to automate parts of
    > the job and man pages that describe the process fairly well (at least, it
    > looked pretty good to my untrained eyes even if I can't do it myself ;D).
    > To all appearances it's mainly a mechanical job of doing data type
    > conversions from/to Perl/C++ and similar things.
    
    Yes, I agree it would be pretty easy and quite nice.  Even just
    getting the entity mgmt function from libsp (or libosp) would be
    pretty sweet.
    
    > >> * Blue Sky: Configurable Error Format
    > >>   -- The error messages are an exception for most SGML Processors, but
    > >>      for the Validator they are the norm. Being able to play tricks with
    > >>      the format and fields of the error output would be useful.
    > >>      Reporting context a bonus!
    > >
    > >I've looked at that a little, and I've implemented a compile-time option
    > >to switch between JJC's format and HTML-ised format for Code Valet.
    > >I'll be doing some more work in this field in due course.
    > 
    > Well, the reason I'm so gung ho on switching to OpenSP is that it has a
    > switch "-n" that outputs message numbers ("relevant clauses") with error
    > messages: "onsgmls:OSPF<0>:1:1:1:E: DOCTYPE Missing" meaning it detected
    > that there was no DOCTYPE on line one, character one, and this violates
    > clause number one (of some ISO standard presumably). Since we're wrapping
    > SP in a Perl CGI app, it's much easier to parse out the error _number_ (or
    > some other semi-unique identifier) than the free-form text message.
    
    Doesn't that solve this wishlist item, then?
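
    For what it's worth, a line shaped like the one quoted above splits
    apart with one regular expression. This is only a sketch: the field
    layout (program, source, line, column, message number, severity,
    text) is inferred from that single example rather than from the
    OpenSP documentation, and the names MESSAGE_RE and parse_message
    are made up here.

```python
import re

# Layout assumed from the one quoted example:
#   program:source:line:column:number:severity: text
MESSAGE_RE = re.compile(
    r"^(?P<program>[^:]+):"
    r"(?P<source>.+?):"
    r"(?P<line>\d+):(?P<column>\d+):"
    r"(?P<number>[0-9.]+):"
    r"(?P<severity>[A-Z]):\s*"
    r"(?P<text>.*)$"
)


def parse_message(line):
    """Split one parser message into a dict, or return None if it doesn't fit."""
    match = MESSAGE_RE.match(line.rstrip("\n"))
    if match is None:
        return None
    fields = match.groupdict()
    fields["line"] = int(fields["line"])
    fields["column"] = int(fields["column"])
    return fields
```

    A wrapper can then key its own explanations on fields["number"] and
    never look at the free-form text at all.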
    
    > Other useful things to have in an easily parseable format are stuff like
    > containing element (last opened element), asking for warnings about
    > "expected foo, but got bar, assuming baz" so we have a way to report when
    > someone forgot to close their TD or puts weird stuff in the HEAD section
    > that will implicitly close HEAD and open BODY.
    
    I use 'onsgmls -gues file' and that tells me containing element.
    
    > BTW, since I'm yelling about SGML gurus and C++... Did anyone ever have any
    > ideas about why some errors get reported only once with a HTML 4.01 Strict
    > DTD, but multiple times with the HTML 4.01 Transitional DTD? Either this is
    > an intentional difference in the two DTDs -- one that I can't find or
    > understand the point of (I didn't even know this was possible to express in
    > a DTD!) -- or a bug in all SP-based parsers. In particular, a bogus
    > attribute on the IMG element gets reported only once with strict.dtd, but
    > at every occurrence in loose.dtd, using lq-nsgmls, JJC/SP nsgmls, and
    > OpenJade's OpenSP.
    
    I think I saw that email come in but I haven't had time to look yet.
    
    FYI, did anyone but me notice that XHTML 1.0 is not clean according to
    onsgmls -Wall?  'xhtml1-strict.dtd' contains bogus stuff like an
    entity 'FrameTarget' which is defined but never used.
    
    I emailed the address given for comments, but have gotten no feedback.
    I wish the spec committees would at least try to validate their DTDs
    fully before publishing the spec...
    
    
    -- 
    .....Adam Di Carlo....adam@onShore.com.....<URL:http://www.onShore.com/>