We are using TileCache (http://tilecache.org/) to create caches of Ordnance Survey data of various flavours, for use in high-demand web sites. We tend to pre-seed the caches and then access them directly from disk, as this gives the best overall performance. However, building the caches for the larger map scales is a significant task, requiring many days of processing and hundreds of gigabytes of storage.
This post addresses a particular problem – the pre-seeding routines failing intermittently with HTTP 502 errors.
Pre-seeding is carried out on the command line, issuing a command like this:
D:\Websites\UKBaseMap\scripts\tilecache\tilecache-2.11\tilecache_seed.py --bbox=0,0,600000,1300000 OSOpenOSGB 0 10
where OSOpenOSGB refers to a config section in tilecache.cfg (which in turn points at the WMS), and “0 10” means process levels 0 to 9.
The normal tilecache_seed.py looks like this:
================
#!/usr/bin/env python
# BSD Licensed, Copyright (c) 2006-2010 TileCache Contributors
"""This is intended to be run as a command line tool. See the accompanying
README file or man page for details."""
import TileCache.Client
TileCache.Client.main()
================
But we have found that this fails intermittently, raising errors like:
urllib2.HTTPError: HTTP Error 502: Bad Gateway
We do not know the root cause, but apparently the CGI MapServer WMS is simply failing to respond sometimes, and unfortunately this causes the cache seeding process to stop. We are seeing this at varying intervals, from every 5 minutes to every couple of hours, which is enough to completely disrupt the construction of a large cache.
So, in my first foray into Python goodness, I’ve enhanced tilecache_seed.py a little to recover from the failures and continue until the caching finishes without an error:
================
#!/usr/bin/env python
# BSD Licensed, Copyright (c) 2006-2010 TileCache Contributors
"""This is intended to be run as a command line tool. See the accompanying
README file or man page for details."""
# amended CF 22/2/2012 to resume after errors urllib2.HTTPError: HTTP Error 502: Bad Gateway
import time
import TileCache.Client
i = 1
while i > 0:
    try:
        i = 0
        TileCache.Client.main() # do the work
    except Exception: # not a bare except, so Ctrl-C can still stop the job
        i = 1
        print 'HTTPError occurred - resuming processing...'
        f = open('errorlog.txt', 'a') # open file for appending
        f.write('HTTP Error occurred at: ' + time.strftime('%x %X') + '\n')
        f.close()
        time.sleep( 1 ) # have a rest to let MapServer recover from whatever was troubling it
        continue # resume the loop
    else:
        print 'Processing completed!' # no error was encountered
================
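The same retry idea can be pulled out into a small reusable helper. What follows is only a sketch, not part of TileCache: the name run_with_retries and its parameters (retry_on, max_attempts, delay, sleep) are my own invention for illustration. It retries only on the exception types you name, so unrelated failures still stop the job, and it can optionally give up after a fixed number of attempts.

```python
import logging
import time


def run_with_retries(task, retry_on, max_attempts=None, delay=1.0, sleep=time.sleep):
    """Call task() until it returns without raising one of retry_on.

    Returns the number of failed attempts. If max_attempts is set and that
    many attempts have failed, the last exception is re-raised.
    Exceptions not listed in retry_on are never caught here.
    """
    failures = 0
    while True:
        try:
            task()
            return failures
        except retry_on as exc:
            failures += 1
            logging.warning("attempt %d failed: %s", failures, exc)
            if max_attempts is not None and failures >= max_attempts:
                raise  # give up and let the caller see the error
            sleep(delay)  # pause before the next attempt
```

In the seeding script this would presumably be called as run_with_retries(TileCache.Client.main, urllib2.HTTPError), which narrows the handler to the one error we actually expect from the WMS.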
This also logs the occurrence of errors into a log file like this:
HTTP Error occurred at: 02/23/12 23:50:00
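Since the log has one timestamped line per failure, it is easy to work out how far apart the failures are. Here is a rough sketch of how that might be done. The function names are hypothetical, and the timestamp layout is an assumption taken from the sample line above: '%x %X' is locale-dependent, so on another machine TIME_FORMAT may need changing.

```python
from datetime import datetime

LOG_PREFIX = "HTTP Error occurred at: "
# Assumed layout, matching the sample log line (month/day/year, 24-hour time).
TIME_FORMAT = "%m/%d/%y %H:%M:%S"


def failure_times(lines):
    """Return the timestamps of all error lines, in file order."""
    times = []
    for line in lines:
        line = line.strip()
        if line.startswith(LOG_PREFIX):
            stamp = line[len(LOG_PREFIX):]
            times.append(datetime.strptime(stamp, TIME_FORMAT))
    return times


def gaps_in_minutes(times):
    """Return the minutes elapsed between each pair of consecutive failures."""
    return [(b - a).total_seconds() / 60.0 for a, b in zip(times, times[1:])]
```

Passing the open errorlog.txt file to failure_times and the result to gaps_in_minutes gives a list of intervals, which shows at a glance whether the failures are clustering or spread out.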
The caching now bashes on regardless of intermittent errors. However, it is best to avoid the -f flag on the command line, as this restarts the job from the beginning and re-creates all tiles, so the job may never complete if errors are occurring regularly.
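To see why resuming works without the force flag but not with it, here is a toy model of a seeding pass. It is not TileCache's actual code: seed, cache and render are hypothetical names, and the cache is just a dictionary standing in for the tiles on disk.

```python
def seed(tile_ids, cache, render, force=False):
    """Render tiles into cache, skipping cached ones unless force is set.

    Returns the number of tiles actually rendered on this pass.
    """
    rendered = 0
    for tile_id in tile_ids:
        if not force and tile_id in cache:
            continue  # already cached by an earlier (interrupted) run
        cache[tile_id] = render(tile_id)
        rendered += 1
    return rendered
```

In this model, a pass that dies partway through leaves its finished tiles in the cache, so the next pass only has to render the remainder. With force=True every pass starts again from the first tile, which is why a forced job can keep restarting and never reach the end if errors are frequent.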