
Re: *.xhtml filenames, mimetypes and mathML problem

From: Richard Kaye <R.W.Kaye@bham.ac.uk>
Date: Thu, 23 Feb 2006 15:41:41 +0000
To: www-math@w3.org
Message-Id: <200602231541.41465.R.W.Kaye@bham.ac.uk>

Many thanks to those that replied so quickly.

I am quite sure that my pages were not being indexed by google.
I was definitely using "application/xhtml+xml" earlier, so
I presume google doesn't like this mime type.

As a temporary measure I have changed the mime type for "xhtml" 
on my system to "text/xml" (which is what David's set-up uses) to see 
what happens. 
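
Before and after a change like this it helps to confirm what the server
really sends. The following is a minimal Python sketch, not part of the
original set-up: the names XML_TYPES, media_type, is_xml_media_type and
served_type are made up for illustration, and the list of XML types is
simply the three mentioned in this thread.

```python
from email.message import Message
import urllib.request

# The three XML media types discussed in this thread (an assumption,
# not an exhaustive list).
XML_TYPES = {"application/xhtml+xml", "application/xml", "text/xml"}


def media_type(content_type_header):
    """Return the bare, lower-cased media type from a Content-Type value."""
    msg = Message()
    msg["Content-Type"] = content_type_header
    return msg.get_content_type()


def is_xml_media_type(content_type_header):
    """True if the header names one of the XML types listed above."""
    return media_type(content_type_header) in XML_TYPES


def served_type(url):
    """Issue a HEAD request and return the media type the server reports."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request) as response:
        return response.headers.get_content_type()
```

Calling served_type on one of the pages and passing the result to
is_xml_media_type would show whether the page is being served with an
XML type or with text/html.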

But I note that there is a *different* problem with google and David's pages:
The google listing shows them as:
    File Format: Unrecognized - View as HTML
To many people that might look like a dire warning that you have to 
click on the "View as HTML" link rather than the link for the nasty
unrecognised format.  In fact the opposite is true and "View as HTML"
gives gibberish.  Oh dear....

I obviously need to look into content-negotiation.  I'm not an expert
in configuring Apache (and I have quite an old Apache here) so I don't 
want to launch into experiments on these rather important pages that 
do need to be accessible at all times.  I'll do some experiments on some
non-critical pages first.
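
For reference, the idea behind content negotiation can be sketched in a
few lines of Python. This only illustrates the decision the server would
make and is not Apache configuration; the function names parse_accept and
choose_content_type are hypothetical, and the policy (serve
"application/xhtml+xml" only to clients that explicitly list it, otherwise
fall back to "text/html") is an assumption. Whether it would change how
google treats the pages is untested.

```python
def parse_accept(header):
    """Return {media_type: q} for an HTTP Accept header value."""
    prefs = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        mtype = parts[0].lower()
        if not mtype:
            continue
        q = 1.0  # a type with no q parameter has quality 1
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        prefs[mtype] = q
    return prefs


def choose_content_type(accept_header):
    """Pick the Content-Type to send for an XHTML+MathML page."""
    prefs = parse_accept(accept_header or "")
    # Only clients that explicitly accept XHTML get the XML type;
    # a bare wildcard such as */* is not taken as support for it.
    if prefs.get("application/xhtml+xml", 0.0) > 0.0:
        return "application/xhtml+xml"
    return "text/html"
```

With this policy a browser that advertises "application/xhtml+xml" gets
the XML type, while a client sending only "*/*" gets "text/html", in which
case (as David notes below) the mathml will not display.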

Richard

On Thursday 23 February 2006 13:21, David Carlisle wrote:
> The extension on the file shouldn't matter (it doesn't matter to
> mozilla; with IE it does sometimes apparently have an effect).
> But it must be served with an xml mime type (text/xml, application/xml or
> application/xhtml+xml). If you serve the pages as text/html, which is
> probably the default on your server for .html pages, the mathml will not
> display.
>
>   BUT (and this is the problem) it seems that my home page at
>    http://mat140.bham.ac.uk/~richard/MSM1Bb/index.html
>   is indexed at google, but none of the others are.  I presume this is
> something to do with the mime type and/or filename I am using.  I must
> admit to being rather upset by this as the whole point of using XHTML and
> MathML is to make the pages *easier* to index and search rather than
> harder!
>
>
> Hmm we have a similar setup here (using .xml as the extension and
> text/xml as the mime type) and google seems to find things.
>
> eg
>
> http://www.google.co.uk/search?hl=en&q=site%3Anag.co.uk+filetype%3Axml+computes+the+QR+factorization+of+a+real+m+by+n+matrix.
>
>
> finds several references to the xhtml+mathml documents that describe
> routines using such a factori[sz]ation.
>
> google does list them as
>  File Format: Unrecognized
> but the links it provides to the xhtml files do work.
>
>
> It does appear to be the case that google isn't indexing your files.
> I must admit that it is probably by luck rather than by design that it
> is indexing mine, but perhaps if you compare the setups you will find
> the trick??
>
> If you are happy with your setup and don't want to change anything other
> than to get google to index the pages, perhaps submitting a sitemap
> http://www.google.com/webmasters/sitemaps/docs/en/about.html
> would help (but I've not tried this)
>
> David
Received on Thursday, 23 February 2006 15:47:27 UTC
