Case study: A Python-based CMS in a low-cost hosting environment
I recently replaced the simple (PHP-based) backend for my Good Web Hosting Info site with something written in Python that at least deserves the CMS tag.
I decided to use CGI for this because it is the easiest setup to find web hosting for. If I want to start more websites, I can just plug the CMS into a budget hosting account. If I later want to move to a better framework (like CherryPy), I can easily port it, because it is structured the way I have learned to structure CherryPy applications. The major difference is that I needed to write my own URL-to-function code.
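To give an idea of what URL-to-function code can look like under CGI, here is a minimal sketch. It is not the CMS's actual code: the handler names, the ROUTES table and the use of PATH_INFO are all assumptions made for illustration, and it is written for a current Python 3 rather than the Python 2.2 on the host.

```python
import os

def show_index():
    return "index page"

def show_article(article_id):
    return "article %s" % article_id

# Hypothetical route table: first URL segment -> handler function.
ROUTES = {
    "": show_index,
    "article": show_article,
}

def dispatch(path_info):
    """Map a PATH_INFO string such as '/article/42' to a handler call."""
    segments = [s for s in path_info.split("/") if s]
    name = segments[0] if segments else ""
    handler = ROUTES.get(name)
    if handler is None:
        return "404 Not Found", "no such page"
    try:
        # Remaining segments become positional arguments.
        body = handler(*segments[1:])
    except TypeError:
        # Wrong number of segments for this handler. Note that this
        # also hides TypeErrors raised inside the handler itself.
        return "404 Not Found", "no such page"
    return "200 OK", body

if __name__ == "__main__":
    # A CGI script writes headers, a blank line, then the body.
    status, body = dispatch(os.environ.get("PATH_INFO", "/"))
    print("Status: %s" % status)
    print("Content-Type: text/plain")
    print()
    print(body)
```

CherryPy does this mapping for you by walking an object tree, so keeping each page as a plain function or method is what makes a later port cheap.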
What it does
From the web interface I can add, edit and delete categories and articles, adjust the ordering of the categories and set a few configuration options, including page caching. The CMS also produces RSS feeds for all categories and writes them to static files.
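Writing a feed to a static file means the web server can deliver it without starting Python at all. A rough sketch of the idea, using the standard library's ElementTree; the function names and the article dictionaries are made up for this example, and the real CMS may well build its feeds differently.

```python
import xml.etree.ElementTree as ET

def build_rss(title, link, description, articles):
    """Return an RSS 2.0 document as a string (without XML declaration)."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for article in articles:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = article["title"]
        ET.SubElement(item, "link").text = article["link"]
    # ElementTree escapes characters such as '&' in the text nodes.
    return ET.tostring(rss, encoding="unicode")

def write_feed(path, xml_text):
    """Write the feed to a static file the web server can serve directly."""
    with open(path, "w", encoding="utf-8") as f:
        f.write('<?xml version="1.0" encoding="utf-8"?>\n')
        f.write(xml_text)
```

The feed file for a category would be rewritten whenever an article in that category is added, edited or deleted.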
How I built it
I started with a template manager and a view class that I have developed for CherryPy applications but they can just as well be used with any other framework. These two classes take all the stress out of building the interface and all that is left is to place the templates in the correct files. They are adapted to Cheetah but they can easily be modified to use most other template modules. I got a page up with PXTL within a few hours. Not bad considering that multiple templates are used for one page and this was my first time using PXTL.
With the view component out of the way, I could dive straight into the database module. I started with the SnakeSQL pure-Python database, and it worked well for what I told it to do, even if it is not of production quality yet. What stopped me from progressing with SnakeSQL is that it can’t store strings longer than 256 characters.
Since hosting availability was an important factor, the obvious database choice was MySQL. It doesn’t have all the features of Postgres and Firebird, but this is a simple CMS with only two database tables, so hosting availability matters far more than stored procedures and the like.
I have written my own module for form handling. It’s called hbform and represents each form with a Python class. It helps with validating user input, filling and manipulating forms, and generates xhtml code for the form controls. It can generate code for the whole form, or work together with a templating solution like Cheetah. I plan to release it some day…
CGI and efficiency
Besides the fact that the Python interpreter needs to be started for every request, there are other factors that slow down CGI-based programs. First, each and every module needs to be reloaded. Second, Cheetah templates are compiled to Python classes, so it’s smart to cache the compiled templates in some way. This turned out to be very hard with CGI.
With a persistent application, my template manager stores the templates in an in-memory dictionary. The obvious CGI alternative was to pickle the dictionary with the templates, but that didn’t work. I don’t remember the exact error, but it had something to do with a function type used by Cheetah that is not picklable. Bummer!
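The snippet below is not the Cheetah failure itself (I no longer have that error), only an illustration of the same class of problem in current Python 3: a dictionary of plain strings pickles fine, but as soon as a value holds a function that pickle cannot look up by name, the whole dump fails. The cache dictionaries and the can_pickle helper are invented for the example.

```python
import pickle

def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False

# Template source text is just a string, so this cache pickles fine.
plain_cache = {"index": "<h1>$title</h1>"}

# Stand-in for a compiled template: pickle stores functions by name,
# and an anonymous function has no importable name to store.
compiled_cache = {"index": lambda title: "<h1>%s</h1>" % title}

print(can_pickle(plain_cache))
print(can_pickle(compiled_cache))
```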
Generating a page from scratch takes around 0.5 seconds, and some rough profiling suggested that about 90% of this was used for template parsing. So I thought caching the templates was the key to speed. After struggling with this, I managed to write the generated Python classes for the templates to the file system and import them almost like regular Python modules. This resulted in only a small speed increase (~0.1 second). I also tried non-compiled templates in the form of PXTL, but that was even slower.
I was disappointed by this so I did more profiling and it turned out that ~0.1 second was originally used for template parsing and most of this was eliminated by template caching. But loading the Cheetah module took ~0.15 seconds. Strange, but I realized that caching the generated pages was a better approach. A few lines of code for pickling pages and the time to deliver a cached page is ~0.01 second. Problem solved :-)
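The page cache really is only a few lines. The sketch below shows the general shape rather than my exact code: the cache directory, the hashing of the URL into a file name and the render callback are assumptions for the example, and it targets a current Python 3.

```python
import hashlib
import os
import pickle
import tempfile

# Hypothetical cache location; a real site would use its own directory.
CACHE_DIR = os.path.join(tempfile.gettempdir(), "cms_page_cache")

def _cache_path(url):
    # Hash the URL so any path maps to a safe, flat file name.
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
    return os.path.join(CACHE_DIR, digest + ".pickle")

def get_page(url, render):
    """Return the cached page for url, calling render(url) on a miss."""
    path = _cache_path(url)
    try:
        with open(path, "rb") as f:
            return pickle.load(f)
    except (OSError, pickle.UnpicklingError, EOFError):
        pass  # missing or damaged cache file: fall through and rebuild
    page = render(url)
    os.makedirs(CACHE_DIR, exist_ok=True)
    with open(path, "wb") as f:
        pickle.dump(page, f)
    return page

def clear_cache():
    """Drop all cached pages, for example after an article is edited."""
    if os.path.isdir(CACHE_DIR):
        for name in os.listdir(CACHE_DIR):
            os.remove(os.path.join(CACHE_DIR, name))
```

The important detail is that the cache check happens before anything expensive is imported, so a cache hit never pays for loading the template module.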
Hosting environment compatibility
Good Web Hosting Info is hosted by Site5, and they have Python 2.2.2 and the MySQLdb module installed. I run 2.4.1 at home, so a few compatibility problems were to be expected. I ran into three: the datetime module, which was only introduced with 2.3, MySQLdb’s executemany function, and a warning about different versions of the C interface used for the NameMapper module in Cheetah. Could have been worse.
I did a user-level install of Cheetah. Maybe Site5 would have installed it for me, but I didn’t ask. I just copied the src directory of the Cheetah download and placed it in the base directory for my CMS. I removed namemapper.pyd (compiled C code) to avoid warnings in the error log about different versions of the C interface. There is a fallback for the C component written in Python, so this only costs a little speed. This installation procedure is not described in the Cheetah documentation, but it has worked so far.
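If you would rather keep bundled packages in a separate subdirectory than in the base directory itself, a few lines at the top of the CGI script are enough. This is a general Python sketch and not something from the Cheetah documentation; the add_local_packages helper and the "lib" directory name are made up for the example.

```python
import os
import sys

def add_local_packages(base_dir, subdir="lib"):
    """Put base_dir/subdir first on sys.path so bundled packages win."""
    path = os.path.join(os.path.abspath(base_dir), subdir)
    if path not in sys.path:
        sys.path.insert(0, path)
    return path

# Typical use at the top of a CGI script, before importing the package:
# add_local_packages(os.path.dirname(os.path.abspath(__file__)))
```

A pure-Python package copied into that directory can then be imported as usual, with no help from the hosting provider.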
Conclusion
With the help of some decent modules, developing my own simple, efficient and usable CMS in Python, suitable for a low-cost hosting environment, wasn’t too hard. I get exactly the functionality I want, and if I need something else it’s easy to add, since I know the code inside out. It’s also pleasing to run my own website on my own CMS.
Even if the Python support in budget hosting accounts is normally far from ideal, it’s often sufficient for typical web applications. In my case, Python 2.2.2 and MySQLdb were all that the web host supplied, but that was just what I needed. Even if you develop with a newer Python version (2.4.1), the incompatibilities don’t have to be a problem.
P.S.
If you just want better Python support in a shared hosting environment, try a more specialized hosting provider like Python-hosting.com or GrokThis.net.