Industrie Toulouse: Configurable Components

Tres Seaver: "Pluggable components are good; configurable components are better," from a series of slides speaking of "From CMF to Zope 3: Lessons Learned".

I've heard this line before about Zope 3, at least once (only once to my recallable memory), and it stuck out at me then, and sticks out even more now. Actually, today it hit like a bolt of lightning. It just might be in how concise the phrase is, due to it being a bullet point in a slide... No, it's more substantial than that.

Last week we hit a major performance issue with a web application that is in its transition phase. A page was taking twenty seconds or more to load, which is absolutely unacceptable. Theories and ideas were batted about, but it seemed to be the fault of one call that was being made a LOT of times. We started thinking about advanced caching schemes, and about how to start migrating some of the data access calls to read and flush the cache appropriately. But I thought "this is a query that could really be done once instead of so many times." That idea was (initially) shot down. Upon investigation, however, it turned out that due to some TAL/Dreamweaver quirkiness, the data access method was being called at least THREE times for every item in a product matrix, which on the page in question featured roughly 30 items - ninety database calls for only thirty uses of a call that could be reduced to one. I found the spot in the code where this single call could be inserted and put in a simple mapping structure that the client (the template) could then access in place of the original data access method. Boom! Major speedup - especially after the erroneous TAL was cleaned up (even if it had stayed, it would have had far less impact on performance).
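To make the fix concrete, here is a minimal sketch of the idea: one query up front, with a mapping handed to the template in place of per-item lookups. The table contents, class, and function names are all hypothetical stand-ins; the real data access method and schema are not shown here.

```python
# Hypothetical stand-in for the product table (30 items, as on the page in question).
PRODUCTS = {item_id: {"id": item_id, "price": 10 + item_id} for item_id in range(30)}


class CountingDataAccess:
    """Fake data access layer that counts queries so the two patterns can be compared."""

    def __init__(self, table):
        self.table = table
        self.query_count = 0

    def get_product(self, item_id):
        # One query per call: the pattern the template was triggering.
        self.query_count += 1
        return self.table[item_id]

    def get_all_products(self):
        # One query for the whole product matrix.
        self.query_count += 1
        return list(self.table.values())


def render_per_item(db, item_ids, calls_per_item=3):
    """Original pattern: the template hits the database several times per item."""
    rendered = []
    for item_id in item_ids:
        for _ in range(calls_per_item):
            product = db.get_product(item_id)
        rendered.append(product["price"])
    return rendered


def render_with_mapping(db, item_ids, calls_per_item=3):
    """Fixed pattern: query once, then let the template read from a mapping."""
    by_id = {row["id"]: row for row in db.get_all_products()}
    rendered = []
    for item_id in item_ids:
        for _ in range(calls_per_item):
            product = by_id[item_id]  # dictionary lookup, no database call
        rendered.append(product["price"])
    return rendered
```

Even with the sloppy triple access left in place, the mapping version issues a single query, which is why cleaning up the TAL mattered much less afterwards.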

But as I was applying this code and thinking of another site that this applied to, I realized that this script had interesting differences between the two sites (one in transition, one in early development) that used it. A problem that we've been tossing back and forth in our heads has been the issue of how best to deal with a fairly common Zope site scenario - a site built out of templates, scripts, and SQL calls, organized in folders in an object database. While this makes development and some degree of maintenance (especially emergency maintenance) delightfully easy, it's a pain when it comes to Software Configuration Management. It's fine when doing a one-off site, but becomes painful when trying to maintain many sites, with a couple in different development stages at the same time. For the past few months, I've been moving some fairly generic code into Python modules that could be imported into Python Scripts for utility purposes, both to help move common code into a real base framework (after many false starts) and to get it into CVS for source control. But that hasn't addressed how to really start moving these massive scripts out of the ZODB and into real, manageable Python code in source control. Not wanting to leap on anything until a good solution presented itself, we've waited and thought and brainstormed and waited while other more important projects took up valuable time. But today's events - reading over Tres' slides, thinking over some recent Zope 3 debates (that seem largely to focus on the need for configuration languages), seeing the release of ZConfig into the Python Package Index, and finally trying to deal with two code forks of unmanaged software - suddenly brought a solution (at least to this particular situation) to mind. A few heavy hours of coding later, it basically worked on the first try. The client page of the one (fairly large) script that I "componentized" was completely unaware of the change.

So what went into this? First, I knew that I needed Formulator on my side, but wanted to define my forms in Python code rather than through the web. Basically, I wanted a way of describing configuration properties that was closer to Zope 3 Schemas than to Zope 2's usable but less expressive PropertyManager system. Snooping around Formulator's CVS tree, I finally came across FormulatorDemoProduct, a rudimentary Zope 2 product that shows a way of using Formulator with Python-based product code. Using FormulatorDemoProduct as a base example, I came up with a small persistent object called "Configuration" that takes a list of Formulator Fields as an argument. This set of fields is used, as in the demo product code, to update a dictionary (a PersistentMapping in my case) upon successful validation of the form data. Using "import ... as" to shorten the name of StandardFields to the more Zope 3 style 'schema', a configuration definition can be built out of descriptions like the following:

from Products.Formulator import StandardFields as schema

LevelConfigurationFields = [
    schema.LinesField(
        "text_data_blobs",
        title="Text Data Blobs",
        description=(
            "Subcategory identifiers for entries "
            "in the generic blob table that map "
            "to this particular level's id."),
        required=0,
        width=20,
        height=5,
    ),
]
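The Configuration object itself is not shown here, so the following is only a rough sketch of the validate-then-update idea, written without Zope or Formulator so it can run on its own. The LinesField class below is a hypothetical stand-in rather than Formulator's real field, and a plain dict takes the place of the PersistentMapping.

```python
class ValidationError(Exception):
    """Raised when submitted form data fails a field's validation."""


class LinesField:
    """Hypothetical stand-in for a Formulator field: an id plus a validator."""

    def __init__(self, id, title="", required=0):
        self.id = id
        self.title = title
        self.required = required

    def validate(self, raw):
        # Split textarea-style input into stripped, non-empty lines.
        lines = [line.strip() for line in (raw or "").splitlines() if line.strip()]
        if self.required and not lines:
            raise ValidationError("%s is required" % self.id)
        return lines


class Configuration:
    """Holds a list of fields and the values that passed validation."""

    def __init__(self, fields):
        self.fields = list(fields)
        self.values = {}  # a PersistentMapping in the real, persistent object

    def update(self, form_data):
        # Validate every field first so a failure leaves stored values untouched.
        validated = {}
        for field in self.fields:
            validated[field.id] = field.validate(form_data.get(field.id))
        self.values.update(validated)
        return validated
```

Validating into a temporary dictionary before touching the stored values is a design choice of this sketch; it keeps a half-valid form submission from partially overwriting a working configuration.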

Next, ConfigurationHolder comes into play as the base class for the service objects that will be replacing certain folders in the system that contain nothing but scripts and SQL methods. A ConfigurationHolder gets a few special methods and a Zope Management screen for listing and accessing the configuration objects it contains, and each configuration object has a Zope Management screen built automatically from its Formulator data. The above example might exist as a subobject named level_config. Since all of this is built purely out of descriptions (a style preferred by systems such as Naked Objects, which was brought up early during Zope 3 development and, I assume, contributed heavily to the power of the Schema system), adding new configurations and configuration holders should be relatively easy. We have the ability to look at certain pieces of business logic, ask "what might change here?", and build new configurations from the answer without worrying about making a usable user interface for dealing with these changes.
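As a rough illustration of that arrangement, a holder can be little more than a registry of named configuration objects that subclasses draw on. Everything in this sketch is hypothetical (the method names, the LevelService subclass, the stripped-down Configuration), and the Zope management screens are left out entirely.

```python
class Configuration:
    """Minimal hypothetical configuration: an id and a dict of settings."""

    def __init__(self, id, **defaults):
        self.id = id
        self.values = dict(defaults)


class ConfigurationHolder:
    """Base class for service objects that own named configurations."""

    def __init__(self):
        self._configurations = {}

    def add_configuration(self, configuration):
        self._configurations[configuration.id] = configuration

    def list_configurations(self):
        # What a management screen would iterate over.
        return sorted(self._configurations)

    def get_configuration(self, id):
        return self._configurations[id]


class LevelService(ConfigurationHolder):
    """Hypothetical service object replacing a folder of scripts."""

    def __init__(self):
        super().__init__()
        self.add_configuration(
            Configuration("level_config", text_data_blobs=["intro"]))

    def blob_categories(self):
        # Business logic reads its tunable parts from the configuration.
        return self.get_configuration("level_config").values["text_data_blobs"]
```

Changing the stored values changes the behavior of blob_categories() without touching the service code, which is the point of asking "what might change here?" up front.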

Finally, there's the business and data access logic itself. Much of it exists in Python scripts of various sizes, dealing with many different aspects of the system. While Zope Python Scripts are great, flexible, etc., it's easy to fall into the same maintenance nightmare that one might run into with a large application written in poorly crafted PHP/*SP pages, where large scripts dominate the landscape instead of modules and classes. Once you have a migration route out of those large scripts, which sometimes can't be found until an application is far enough along to fully understand a business problem, methods of Refactoring can finally be applied. I've done some refactoring in script land already - about as much as can be sanely done - by keeping the scripts that actually respond to HTTP requests as small handlers that call into the larger "common" scripts. Usually these handlers make one call to a large script, then either respond to any exceptions raised or go on after a success. This allows the handlers to be tweaked to deal with UI changes without affecting the larger code of the back system, but ultimately I've been weighed down for months with the thought that "there's just too much back there". Now these scripts are being taken apart using common refactoring techniques such as Extract Method. The code is now in manageable chunks, code browsers can navigate it more easily, and performance has actually increased a little bit since the code no longer has to run in restricted Python mode. And then, like the little handlers on the public side of the site, the ConfigurationHolder becomes a simple front end, or Façade, to the complex business object behind it. As the interfaces of these Façades stabilize over the next few iterations of the software, I'm now (finally) confident of the software's adaptability in a way that doesn't compromise deployment and configuration management.
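The layering described above can be sketched in a few lines: a small handler makes one call and deals with success or failure, a Façade exposes that one call, and the logic behind it is split into small extracted methods. The order-placing example and all names here are invented purely for illustration and are not taken from the actual system.

```python
class OrderError(Exception):
    """Raised by the business logic when an order cannot be placed."""


class OrderLogic:
    """Business logic after Extract Method: small, separately testable pieces."""

    def validate(self, quantity):
        if quantity <= 0:
            raise OrderError("quantity must be positive")

    def total(self, unit_price, quantity):
        return unit_price * quantity


class OrderService:
    """Facade: one stable entry point in front of the logic behind it."""

    def __init__(self, logic=None):
        self._logic = logic or OrderLogic()

    def place_order(self, unit_price, quantity):
        self._logic.validate(quantity)
        return self._logic.total(unit_price, quantity)


def handle_order_request(service, form):
    """Small handler: one call into the back end, then react to the outcome."""
    try:
        total = service.place_order(form["unit_price"], form["quantity"])
    except OrderError as exc:
        return "error: %s" % exc
    return "ok: %s" % total
```

The handler can be reworked freely for UI changes, and the logic behind the Façade can keep being refactored, as long as place_order keeps its shape.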

It's a system that I always wanted to develop in a component oriented way, but couldn't see how to do it until just now. We've had a few component based elements show up in the design already, but I think I was focused too much on trying to componentize the data (all in a relational database) without focusing on the true meat of the system - the business logic that operates on all of that data.

Interestingly, we sort of did design the system as replaceable components, hoping initially that these folders of scripts would yield to something better. It wasn't until the realization of component configuration came along that it worked out though. Bringing us back to: "Pluggable components good; configurable components better." Whew! Next stop - XML import/export of configurations so that the different component configurations for customers can be tracked by the SCM process (such as it is).