Re: Getting 503 Error

From: Renoir Boulanger <renoir@w3.org>
Date: Wed, 11 Sep 2013 15:30:59 -0400
Message-Id: <A72B794F-7D30-4B5D-98C8-84B472364B45@w3.org>
To: List WebPlatform public <public-webplatform@w3.org>
Hey everybody,

Here is a rather exhaustive heads-up on the 503 errors and where I am at the moment.

This follows up on the issue described in [1] and [5], which is logged as task [0].



503 errors. Is the situation solved?

I think it is now.

/o/  :D

Since I came back from vacation and did what is described in [1], I have not had to 'flush hosts' on any database server.

Three full days! The best we have had so far.

Let's see whether it holds once I am done with the Piwik update and it is back up again.

See: [0] for details



Improving the situation

On the performance side, I think we need to optimize our caching strategy on Fastly [2].

Besides caching, I am also planning to use Fastly's built-in features:
- custom error messages
- asset caching (images, JavaScript, CSS)
- CDN distribution (Fastly is /also/ a CDN)

These are things we are not using to their full potential. Yet.
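
To illustrate the asset caching idea, here is a minimal Python sketch of how a backend could pick a Cache-Control header by file type, so that a cache like Fastly can keep static assets much longer than wiki pages. The extension list, the TTL values and the function name are my own assumptions for illustration, not the actual configuration.

```python
from pathlib import PurePosixPath

# Hypothetical values for illustration; not the real WebPlatform settings.
STATIC_TTL_SECONDS = 7 * 24 * 3600   # one week for images, JavaScript, CSS
DYNAMIC_TTL_SECONDS = 300            # five minutes in the shared cache for pages
STATIC_EXTENSIONS = {".css", ".js", ".png", ".jpg", ".jpeg", ".gif", ".svg", ".ico"}


def cache_control_for(path: str) -> str:
    """Return a Cache-Control header value for a request path."""
    # Ignore the query string so "style.css?v=3" is still seen as CSS.
    suffix = PurePosixPath(path.split("?", 1)[0]).suffix.lower()
    if suffix in STATIC_EXTENSIONS:
        return f"public, max-age={STATIC_TTL_SECONDS}"
    # Everything else: browsers revalidate, the shared cache keeps it briefly.
    return f"public, max-age=0, s-maxage={DYNAMIC_TTL_SECONDS}"
```

For example, cache_control_for("/errors/style.css") would give the long TTL, while a wiki page path would get the short shared-cache one.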



Testing backend

I am currently [2] testing an alternate caching (Varnish) configuration on a separate Fastly host, test.webplatform.org, with two brand new app servers (app6, app7) dedicated to this test.

Note that although test.webplatform.org uses the same database and memcached configuration, the code serving the pages below runs on separate nodes (app6, app7) that are not used for production. This is a temporary setup until we get a REAL separate environment.

Here are some temporary views showing error messages handled by Fastly:
 - 500 error: http://test.webplatform.org/w/force500.php  
 - 503 error: http://test.webplatform.org/w/force503.php

Remember: those are two PHP files hosted on app6, not in source control, that emulate what we would see in a real situation.

Since app6 and app7 are not in prod, no worries.
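
If you want to check such pages from a script rather than a browser, here is a minimal Python sketch using only the standard library. The function name fetch_status is hypothetical; the point is only that urllib raises on 5xx responses, so the status and the error page body have to be read from the exception.

```python
import urllib.error
import urllib.request
from typing import Tuple


def fetch_status(url: str, timeout: float = 10.0) -> Tuple[int, bytes]:
    """Fetch a URL and return (HTTP status, body), even for error responses."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status, response.read()
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx; the exception still carries status and body,
        # which is where a custom error page served by the cache would show up.
        return error.code, error.read()
```

Calling fetch_status("http://test.webplatform.org/w/force503.php") should then report 503 together with whatever error page was served, as long as the test host is up.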



Optimization

Prior to this work session, I realized something interesting: the error pages and other static documents use the MediaWiki asset system, which in my experience is not a good idea.

I think it is better to serve contextual variants of the same code base and build them prior to deployment. That way we can isolate CSS/JS/images from the web application server and, most importantly, isolate them for appropriate caching and CDN distribution.
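
As a rough idea of what "build prior to deployment" could look like, here is a minimal Python sketch that copies static files into a build directory under content-hashed names and returns a manifest mapping original names to built ones. Content hashing is my own assumption here (a common way to make assets safe to cache for a long time), and the function and directory names are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path
from typing import Dict


def build_static_assets(source_dir: Path, build_dir: Path) -> Dict[str, str]:
    """Copy assets into build_dir under content-hashed names; return a manifest."""
    manifest: Dict[str, str] = {}
    for source in sorted(source_dir.rglob("*")):
        if not source.is_file():
            continue
        # A short content hash in the file name: a changed file gets a new URL,
        # so the CDN can cache each built file for a long time.
        digest = hashlib.sha256(source.read_bytes()).hexdigest()[:12]
        relative = source.relative_to(source_dir)
        hashed = relative.with_name(f"{relative.stem}.{digest}{relative.suffix}")
        target = build_dir / hashed
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(source, target)
        manifest[relative.as_posix()] = hashed.as_posix()
    return manifest
```

The built directory could then be served without going through the web application at all, and the pages would look up the hashed names in the manifest.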

Another problem is that when a backend server has an error (e.g. app4), the backend-generated asset management is most likely broken too, along with the CSS documents we need.

For this, I created a static CSS file [3]. It has to be made shorter; it most likely does not need ALL of the WPD styling for 90 lines of HTML.

I have a to-do item to make a better static version of the CSS [4].


  [0]: http://project.webplatform.org/infrastructure/issues/INFR-7
  [1]: http://lists.w3.org/Archives/Public/public-webplatform/2013Sep/0028.html
  [2]: http://project.webplatform.org/infrastructure/issues/INFR-43
  [3]: http://www.webplatform.org/errors/style.css
  [4]: http://project.webplatform.org/infrastructure/issues/INFR-42
  [5]: http://lists.w3.org/Archives/Public/public-webplatform/2013Jul/0128.html

If you have questions or suggestions, meet me on IRC freenode, I am 'renoirb' :)

Regards,

Renoir Boulanger  |  Developer operations engineer
W3C  |  Web Platform Project

http://w3.org/people/#renoirb  ✪  https://renoirboulanger.com/  ✪  @renoirb

Received on Wednesday, 11 September 2013 19:31:07 UTC