Case study: Porting and modularizing an in-house Python application

May 30, 2011

Is it just a feeling, or has the number of technical posts decreased on the Planet? I definitely need to change that! (Update: The post got quite long, so I’ve shortened the syndicated version.)

These days, I’m usually working on my Diplomarbeit at the Institute of Theoretical Physics at the Technical University here in Dresden. (A Diplomarbeit is roughly comparable to a master’s thesis, though the preparation time is one year since there was no bachelor’s thesis before it.) Our working group has an in-house application for inspecting iterated functions, which is quite intuitively called Iterator. Like many scientific applications, it has a very sharply defined functional objective on the one hand, but contains tons of different tools and plugins on the other.

As a part-time project alongside my usual work, I’m working on a Qt port of the Iterator, which is currently based on wxWidgets. Of course, a port offers the opportunity to refactor badly engineered parts of the application. It clearly shows that all involved developers are first and foremost physicists who have not received any formal training in software design. Many antipatterns can be observed in the code. The most interesting is the existence of a god class called IteratorApp.

IteratorApp is not the application instance which drives the event loop, as one might expect from the name, but the main window. The iterator.gui.iteratorapp module imports nearly all other core modules, and instantiates them inside the IteratorApp class. The reference to the IteratorApp instance is then passed around to all components inside the IteratorApp, and also to all plugins loaded by it. The IteratorApp also contains most of the basic business logic: loading of plugins, creating and managing all widgets in the main window, etc.

There is evidence that it was even worse some years ago. People have already worked on extracting modules out of IteratorApp, but they are still closely tied to IteratorApp, and thus have direct connections to all other parts of the code. The most interesting thing about this design is that the Python programming language makes it very hard to discover the dependencies between modules. Consider the following Python code:


from external import BarComponent

class FooApp(object):
    def __init__(self):
        self.bar = BarComponent()

FooApp is our equivalent of IteratorApp here. Now when you have a reference “foo” somewhere, “foo.bar” will give you the corresponding BarComponent instance, which you can use directly if you need functions of BarComponent. You can also pass it on to other modules under different variable names. All this makes it quite hard to track efficiently which components the specific module you are looking at depends on. One can grep for “foo.bar” if “foo” is the established name for FooApp references. But if the BarComponent reference is passed on to some other code, this won’t help either. The problem is that the code using BarComponent does not need to import it explicitly, as FooApp does. This is a non-issue in C++, where used classes are usually forward-declared:


class BarComponent;

class FooApp
{
    public:
        BarComponent* bar() const;
};

If you use methods of “foo->bar()” somewhere in your code, you absolutely need to include the BarComponent header. The dependency of the component in question on BarComponent is then obvious. (Executive summary for this part: Forward declarations are good not only for reducing compilation time.) How can we transfer this advantage of C++ to Python?

The Qt port of the Iterator does not include something as powerful as the IteratorApp. However, in an application consisting mostly of plugins (i.e. the iterable functions, and the tools that the different group members use to solve their distinct problems), there must be some authority that keeps everything together, and which is passed to plugins. In the Qt Iterator, this position is taken by a new PluginLoader class.

The PluginLoader itself is – on purpose – very light-weight: Its tasks are restricted to creating application-global instances of other classes which are called application plugins. For example, the Qt main window is an application plugin. One can ask the plugin loader to load a specific plugin by calling it (i.e. its __call__ method) with the type of this plugin. For example, to instantiate a PluginLoader and get a MainWindow from it, the code is:

from iterator.common.pluginloader import PluginLoader
from iterator.qtgui.mainwindow import MainWindow

pl = PluginLoader()
mw = pl(MainWindow)

The plugin loader places a reference to itself in the “pl” attribute of every loaded plugin instance. This behavior is similar to the IteratorApp, which distributed “app” references all over the place. The important difference is that all business logic resides in application plugins. These can be obtained from the plugin loader at any time, but only with a syntax which requires naming the type. Naming the type requires the Python developer to import the type explicitly, thus making module inter-dependencies visible, e.g. to automated dependency graphing solutions. As a follow-up to this initial refactoring, such graphs can be used to decide on subsequent refactoring steps.
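To make the idea concrete, here is a minimal sketch of such a loader. It is not the actual Iterator code: the caching dictionary, the Settings plugin and the window_title method are assumptions made for illustration, and MainWindow is a plain stand-in for the real Qt class. Only the call syntax and the “pl” attribute are taken from the description above.

```python
class PluginLoader(object):
    """Sketch: creates one application-global instance per plugin type."""

    def __init__(self):
        self._instances = {}  # maps plugin type -> its single instance

    def __call__(self, plugin_type):
        # Create the plugin on first request, then always return the same one.
        if plugin_type not in self._instances:
            plugin = plugin_type()
            plugin.pl = self  # back-reference, as described in the text
            self._instances[plugin_type] = plugin
        return self._instances[plugin_type]


class Settings(object):
    """Hypothetical application plugin holding shared state."""

    def __init__(self):
        self.title = "Iterator"


class MainWindow(object):
    """Hypothetical stand-in for the real Qt main window plugin."""

    def window_title(self):
        # Asking for Settings means naming the type. If Settings lived in
        # another module, this file would need an explicit import for it.
        return self.pl(Settings).title


pl = PluginLoader()
mw = pl(MainWindow)
print(mw.window_title())
```

Note that in this sketch the “pl” attribute is only set after the constructor has run, so a plugin cannot request other plugins from inside its own __init__. Whether the real loader behaves that way is not stated above.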

Along with the plugin loader, I defined (and documented!) a simple protocol which plugins must follow. This protocol defines a common set of functions which plugins can implement, e.g. to be stopped at runtime or to get notified when other plugins are started or stopped. The latter allows for optional dependencies between modules. This protocol allows for some nifty features: The plugin loader installs a sys.excepthook and stops interface plugins which raise unhandled exceptions. The exception and its traceback are displayed directly in the interface (at the lower right in the screenshot).
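A protocol of this kind could be sketched roughly as follows. All hook names here (stop, plugin_started, plugin_stopped) are hypothetical, as is the way the exception hook finds the offending plugin by walking the traceback. The real protocol is not reproduced in this post, and this sketch stops any plugin rather than only interface plugins.

```python
import sys
import traceback


class PluginLoader(object):
    """Sketch of a loader with optional plugin hooks (hypothetical names)."""

    def __init__(self):
        self._instances = {}
        self.last_report = None  # this sketch stores the traceback text here

    def __call__(self, plugin_type):
        if plugin_type not in self._instances:
            plugin = plugin_type()
            plugin.pl = self
            self._notify("plugin_started", plugin)  # tell the older plugins
            self._instances[plugin_type] = plugin
        return self._instances[plugin_type]

    def stop(self, plugin_type):
        plugin = self._instances.pop(plugin_type, None)
        if plugin is not None:
            self._call_optional(plugin, "stop")
            self._notify("plugin_stopped", plugin)

    def _notify(self, hook, subject):
        for other in list(self._instances.values()):
            self._call_optional(other, hook, subject)

    @staticmethod
    def _call_optional(plugin, hook, *args):
        # Every hook is optional: plugins implement only what they need.
        method = getattr(plugin, hook, None)
        if method is not None:
            method(*args)

    def excepthook(self, exc_type, exc_value, tb):
        # Keep the formatted traceback, then stop the innermost loaded
        # plugin whose method appears in the traceback (an assumption).
        self.last_report = "".join(
            traceback.format_exception(exc_type, exc_value, tb))
        culprit = None
        while tb is not None:
            owner = tb.tb_frame.f_locals.get("self")
            if any(owner is p for p in self._instances.values()):
                culprit = owner
            tb = tb.tb_next
        if culprit is not None:
            self.stop(type(culprit))


class Logger(object):
    """Hypothetical plugin with an optional dependency on other plugins."""

    def __init__(self):
        self.events = []

    def plugin_started(self, plugin):
        self.events.append(("started", type(plugin).__name__))

    def plugin_stopped(self, plugin):
        self.events.append(("stopped", type(plugin).__name__))


class Plotter(object):
    """Hypothetical interface plugin that fails at runtime."""

    def draw(self):
        raise ValueError("bad data")


pl = PluginLoader()
log = pl(Logger)
plotter = pl(Plotter)
try:
    plotter.draw()
except ValueError:
    # A real application would set sys.excepthook = pl.excepthook instead.
    pl.excepthook(*sys.exc_info())
print(log.events)
```

Because the hooks are looked up with getattr, a plugin that does not care about its neighbours simply leaves them out, which is what keeps such dependencies optional.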

The PluginLoader has been designed with the migration path from wx to Qt in mind: It has no Qt dependency, and can also be used in the old Iterator. (Legacy installations might not have PyQt available, so Qt can only be an optional code path at this point.) In fact, some of the new application plugins, which are used for internal data management, are also used in the old IteratorApp now, although the data is additionally made available in the old attributes for compatibility.

For communication between application plugins, Qt signals/slots have been chosen. As these are obviously not available in wx or Python itself, I’ve written a basic, sufficiently source-compatible implementation which is used as a drop-in replacement in non-Qt environments.
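For readers who have not seen such a replacement, the following is a minimal sketch of what a pure-Python signal could look like. It is modelled on the connect/emit/disconnect interface of PyQt’s new-style signals, which is an assumption: the post does not say which signal syntax the Iterator uses. The class names Signal, BoundSignal and DataStore are made up for this example.

```python
import weakref


class BoundSignal(object):
    """The per-instance signal object that slots are connected to."""

    def __init__(self):
        self._slots = []

    def connect(self, slot):
        self._slots.append(slot)

    def disconnect(self, slot):
        self._slots.remove(slot)

    def emit(self, *args):
        # Iterate over a copy so that slots may disconnect themselves.
        for slot in list(self._slots):
            slot(*args)


class Signal(object):
    """Declared on the class; hands every instance its own BoundSignal."""

    def __init__(self, *arg_types):
        # The argument types are accepted for source compatibility only;
        # this sketch does not check them.
        self._arg_types = arg_types
        self._bound = weakref.WeakKeyDictionary()

    def __get__(self, instance, owner):
        if instance is None:
            return self
        try:
            return self._bound[instance]
        except KeyError:
            bound = self._bound[instance] = BoundSignal()
            return bound


class DataStore(object):
    """Hypothetical application plugin announcing changes via a signal."""

    changed = Signal(str)

    def set(self, key):
        self.changed.emit(key)


received = []
store = DataStore()
store.changed.connect(received.append)
store.set("alpha")
store.changed.disconnect(received.append)
store.set("beta")  # nobody is listening any more
print(received)
```

In a setup like the one described above, the platform detection would presumably try to import the Qt signal class first and fall back to a class like this one only if the import fails.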

Here’s the source code for the plugin loader (minus some highly application-specific parts like plugin shutdown, which can be easily inferred by copying the startup code) and the platform detection including optionally Qt-based signals and slots. However, I cannot upload text files or tar archives here in this blog, and any linked external source to which I upload the archive file can go away at any time, which I do not like. (I tend to get mad at 404 links.) I have therefore inserted the BZip2-compressed tar archive as Base64. To get the archive, copy the following character block into a file, e.g. “foo.txt”, and then run “base64 -d foo.txt | tar xjf -”.


QlpoOTFBWSZTWawqqUsADKD/pPzURwB49//fL+/9qv////ACAAABABhgDTy3zMBQbWew72mxqou8
7jWp6xSBUlaZ7zdm1p7tDEhATRpkZUeamJpDT1GgDR6gyAAD1DTT1GjQaAITEmlPTIJkmQA0GgAa
A0AAANANTGkqejU00GTQNA9QxAAAANABoaA0ASakImgmhqYahkepkoHpBp6mmh6jQ0AAMEZNBxoa
Bo0yNNGmQGJggABoDQGmQGBMgSIhNBNE00DU1PTKm9ARqGnkj1PEgaAA2p6Q9Q0f+/uH0zzeecsf
MDzKhAdUhJMeKdQekPCTTwPHxq2fkQ1iCaAgwgJJBKjIEWBOf0+hbnv6hGh6WzefYFqBjVkhkEyH
04xG5yhG/lSez3pwNXqCcEp3/oeAtyQ547/3Nu3HhTqgK/JlMSeJ9X79ot4/8mFbZE6UrP2TVtJs
Zzbzffe7Nl3RG7oeX27rljc9a760Nj198GP9rTv3dYjsQ0JCNgMQkKqARkCJII6VN83ydYVhnZpB
bTVgiq5WkOEsVUyhE2a5oUDrZxnrtwbBLjzbgSVpvIKgcMcAi6IqOMzsISIwZjjv7P3ASvZWg7n6
xHSLIi5W+CUzyTgWMnsulqxDqcG++B1dt5PqVVtvReKIqrfnZJOTkaC9asA6ar0GFzXU6g0ckLfC
1dYVHCw3JwhqEymzaHhiRmhUZIvl6bHDHzRrnrCC8Dgcy6jNpN2u2z7UDJmQzeJyVTjCRJ04bLtG
9rZ2Ota33lbJ3eW6174uEFHsubFxlsGE7V8NwMnqzsNcTIa434nTit1NKYBuG0u1OkRfc9ASsk8H
7x5VCkHbr1djN9S5BNzJ0Oi8c9lUkKU1QzTOpuEuN1AlZmxudefVBPL8SiQhNC1WvoR6rVgdhiuk
BqacOoQ1lX4cDZUQ9xkZ1ZNCVM1xyvcKdOEWg08waNN6VY7kdmTdRxB0sBTx+OILQwnSR5PA0OLc
oxxZz1TvWDK+uFO6Gq+GKwzI2iC7w5yluGM1BnigbRKPzhE0SAinYLnDBzUwlUWGVVo5h5oUhqJe
mDowwFQcxHmmHEyek2ZbKlnF7pacdmc627DtR4CSEBlhAkeQt42eZH3EkCfeu5dfWw4+mJScSBcE
Z/EkwRnTM7gAVHNVhGSRjisFUWCDlLjpGuctck4cK41rPTi2xqlHkrFKRDg4UK+nmGcpJYTEJz9v
ZTKHXZPTUDA6k1MXUy/xEGSRO+EcZ5jy3nDoHCT8o9OetS1sFQ3QzQYfVsGrnKGuyCijJgrzAS0p
ywHNLfyToz8MEtRBosyuxkZlKzwWpAnR9BfPkEWxIOI7rMY7ddoXDiQTbhEgww51+O5t9QIPGFPH
T9tAPWFg4MJYNPbiDyVw2B7rGAMtAaTINGWRgEXpVs+qmz64vYaXFFu424hotuk4B7dugVddGFrn
xTthk10l5oGJJiFojjvm4ssm9oTiXJz86neBCFwcILop4Q3azHDQ49UU3TGh709p5QD3DDHtnwhD
2GYiDhd5uhQJE0+05yhy1Shbz3/A9VViMRUH7jvP7ki14n1HwR+FTHxacJG07tGUQ4IzfwLKfRnu
LF8qLNgnRWBlNo9dUPzn4C3xn4jifjRjC52aoTidf0Y2p5FkxR3coF4RsRdCsYuDbpEoB2iwOExr
uR8KvZAgLwyBGQnEJLiEAa/XcU/PCGekUyeBayOARBhps9wj/SPasa/fk36MPzvGIBa+/YvQIoFW
L4BqjbJ6wsjVIuls7e+Sn6yDbnQXqoLiwvQN5Pu+9iet/WaVprldhrJDgc2eIUpXzlTpz4qQWZUY
NtNnV1nVhJIyzYsKLgs1plQaMhc9thwR+qd0nZwh28tlHoqTtSLJteHxh7SbIIxTOSbGKytSIPyn
l8VDNpDqHUX7uDBexdEUKloGdblNtDFWbfY1lKC9CUh5nrjVNQPbpU6TQk2OxWQUdtDla0Lj3t1B
lCUwVq1FhWMDwqoC53OaBou7jmWPLXDOCiyN62wbFrBwdU+TC7nAGiUfYaiMN9LEOODOU49XZh4X
FBCIIhQF1BfoMwbNRYxeEzQuyQipCJEqTUCJjDZlKJE5vV+rEr43pYXzDGxgBoKXYFreZSa5xeGu
Z2Glw6568Tg6g7KYPRSvesIatpNHOrzPdpNWY6jQzshuNZfG76iF2zWTKYgwc71y19GM8Bw31sg5
oyOFyaj5bzEwq1rO05ngCwC8PB0JkBVnw9pbk35Hg6xuuk2OlraDiuteZHLGdawMOkLwnlHljKlN
lIkMoDNhRhRLllQXIKcoHvWw6bi1+LtWX7YstOo92dxRSwoNg/sXmlRUpmEIfTySHDk3RbgGXT0/
et3sOg+OW7St6EnEGNwQexkmTCkujSpKsSGJyMBzvnyNLc9Q5vW1L8YZUYUD5fZaTFAYxaEkxopZ
U6eKgw7EAiDCdTeQnbDdDuEYKGSyVImxGoQ98GLILd9QxSRRgNhQSm3CZGyewZpMgBXY6LQs5LYQ
ItOm+6iwZw3JlDBZW5HwdPLhIAm8YHUKxYLEm/6Z3EOnx4w0nZugcz3wWAsiJtJiEXAj2o6tHTy4
7+qDx7PLUot3A60cF6A3ciRqxd0ngYWYWGF0LtgzEbkzPOho0NjBsaPjqRX59/RQOkG6hs5jmFPX
biDwXV0GMbLKBl4h0FeTXnY+b4MvFExQj62hYrLRlMjoA2SriV7SjzwGYS7SiH2UlEsuCk2SPauw
BUWRjpmyG21WEK9IWVhbwbGsfyHYAabtNIFpoV2xDdPcyKEUUUUFgCCQVYM9m7TXtq0763k27oC5
9kPTOg4uXDPnaouXu3qzVWh8HxGJcDzJi2RD742upoJNAvkIbMqZcVu2CmC0wplqzcXxarMshZMW
vamXQFEQgpA6IQoYbTKYYXGUddAUIKmVMLgiCmh0wmq+l80LQFa5gh3okUQEoqI+apAfSdoogwwa
MmlkkCxrqAiIy1qKJpe9rpOoPchOSd3bQdF15e2wbFgjztjSbUCe5HSmqlvJ1IJIokssniIzMGXD
dsxcYjHq4JZBtB6jjZnIY9zIZofmmhstlHeUquTDGqJcNRdOZKhnyCc/NxgIIw3+g13EOCvw4uLC
ZE8eiOzq6Tdv4IudS5G2INeD/wO+iRn5jwGLMarmv8h8kBf0h9vpNxtMZT0oZsJxRH7i+Wse/e0w
iNN2foPr2befmwxg0gClHsjUegjRrHCAkJBoEpYi4DKcpUcZjgmjO3zk83b58Vt2LvglQRFWOVU8
6FRFPRECnEGooyWvVM5WuGBj8tFLHDj5u9XudmFjhCg8O84Tz7ANooQsDVGYpX4hecRbVUvh9CxF
MgNg8WhGB7Fn9APBVKrIPJ1VKM4SpUOgxDgJFWkcgoQ5RQc9yoUKQiqHVQEWVYK8sFo6itEelYcS
0OxqFV17qeAjoZt3Y3YmIvDqRaD1iYqa0DlgqE+plvywCl2HW6gFzqOjpmHJYNOaekN0AaDGNbe2
F357CTFsM8JDRm0JrYI4w2NkhhJYKnLvCr48AeRe5vo67IGHNZiAuxl4h2i5ENLRrSikGsRmIMus
5ZxKybhujRngGFgKsWCTEOjupKDOKR/zVcQhugQA2QTeVI7u1mj0oiS1KNIgaLtoqFbmKCe6KijB
Zu1zYhvTF89m/GLlXsGvwL7N8bknINEaNDlCPE0CjTKEiSWPw2uK9kBEDsJMVRBrJahe4LVBELkY
gosAp7nVCGeXS+tyzpMsgGMNESa7Vg0A0srZ6IPXlqt3y5ELIIOedswCEfcZfMR0Y5rxIIJZa4tI
YI1MEyAYnPNYRw/4B/qm5wNYtdWBqn7QCKVlG+qq2yqKJHB5S12rXOa+4ulUIi5fnZxevdUyZJkk
u+oncHvVkHauOWvZAQ+Waus0ndFJo9Llb1C00a/ABpCpReE3W1UM+8iiMyJ363VUPIeokh/8XckU
4UJCsKqlLA==
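Where the base64 and tar command-line tools are not at hand, the same decoding step can be done with the Python standard library. This is a small sketch, not part of the archive. The function name and the target directory are placeholders chosen for this example.

```python
import base64
import io
import tarfile


def extract_base64_tar_bz2(text, target_dir):
    """Decode a Base64 text block and unpack the tar.bz2 archive inside it.

    Returns the list of member names found in the archive.
    """
    raw = base64.b64decode(text)  # line breaks in the pasted block are ignored
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r:bz2") as archive:
        archive.extractall(target_dir)
        return archive.getnames()


# Usage, assuming the character block above was saved as "foo.txt":
#
#     with open("foo.txt") as handle:
#         print(extract_base64_tar_bz2(handle.read(), "extracted"))
```

As with any archive from the web, it is worth looking at the member names before unpacking into a directory that contains other files.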


4 Responses to “Case study: Porting and modularizing an in-house Python application”

  1. Anon Says:

    “Is it just a feeling, or has the number of technical posts decreased on the Planet?”

    I’ve noticed this, and in general I feel that KDE is becoming like the GNOME of 3-4 years ago: at that time, there was so little actual development done in GNOME, that their blog was an absolute wasteland of personal stuff and recipes and (non-GNOME) conference reports etc.

    KDE’s own development momentum, as seen in the steadily declining commit activity (http://lists.kde.org/?l=kde-commits&r=1&w=2 – and this is after the switch to git, which usually results in a sharp *increase* of activity!) and the increasingly anemic digests is becoming more and more like that of the GNOME of yesteryear, with only a handful of vital (and usually commercially-backed) projects showing any real sign of health (I’m thinking here of Plasma, KDEPIM and KOffice).

    What’s weird is that it seems to have begun very suddenly roughly mid-way through February, like someone flicked a switch or something. Really odd.

