Python experiences, or: Why I like C++ more ;-)

July 4, 2009

So I’ve been coding some Python lately, because this is the language of choice for the “Computational Physics” course I’m attending. As the name says, it is more about physical and numerical problems than about programming, and the instructors chose Python as it’s easy to get into (most Physics students had not coded up to now).

I was quite excited about that, as I have wanted to learn Python for about a year. The course has its own mailing list where students can put questions to their fellow students (and, of course, the teachers), which gave me some insight into how programming beginners get into Python.

First off, Python is mostly good for beginners. “print” works on almost everything, and allows you to very easily inspect your program. You can load your work-in-progress code in IPython, and inspect the values of the computations. (I must admit though that I never used this feature, I’m just too used to inserting “print” statements everywhere and re-running the program.)

In the course of the course (pun intended), the only problem that many people ran into was copy-by-reference. When you’re working on some numerical problem, you’re dealing with number arrays most of the time (we were instructed to use NumPy, together with matplotlib for graphical output). And those are, unlike simple numbers, copied by reference. If you say “b = a”, where a is an array, and then modify the contents of b, you’ll also modify a, as b is just another name for a. The solution is “b = 1 * a”: the multiplication produces a new array, so b gets its own copy of the data (“b = a.copy()” says the same thing more explicitly).
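The aliasing behaviour described above can be reproduced in a few lines. This is only a minimal sketch, assuming a current NumPy imported as `np`; the variable names are made up for illustration.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])

alias = a                # no copy: both names refer to the same array object
alias[0] = 99.0          # so this write is visible through `a` as well

independent = a.copy()   # explicit copy with its own data buffer
independent[1] = -1.0    # leaves `a` untouched

scaled = 1 * a           # arithmetic builds a new array, hence the copy effect
scaled[2] = 0.0          # also leaves `a` untouched

print(a)                 # only the first element changed, via `alias`
print(alias is a)
print(np.shares_memory(a, independent), np.shares_memory(a, scaled))
```

Of the two ways to get an independent array, `a.copy()` states the intent directly, while `1 * a` relies on the side effect that arithmetic returns a fresh array.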

A big mistake which I (as a senior C++ programmer) made often in the first few weeks was the excessive use of for-loops with NumPy arrays. For example, the following two code pieces are equivalent:

import numpy

a = numpy.ndarray((N, N))
for x in xrange(N):
    for y in xrange(N):
        a[x, y] = x ** 2 + y ** 2

# or
a = numpy.arange(N)[:, numpy.newaxis] ** 2 + numpy.arange(N)[numpy.newaxis, :] ** 2

At first glance, the first one looks tidier than the second (if we do not worry about the double parentheses), but the first one is sllllllloooooooowwwww. For N = 2000, the first one needs 4.16 seconds on my system, while the second one is processed in 0.03 seconds. The simple reason is that the for-loops in the second snippet are hidden deep in NumPy’s implementation, which is written mostly in C (and some Fortran). In an actual example, I had to compare the numerical precision of various discrete integration algorithms. The first version needed 16 seconds to start up and calculate everything; after transforming all for-loops into NumPy operator expressions, this was reduced to 2 seconds (most of which was actually spent on loading all modules and initializing matplotlib).
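For readers who want to repeat the comparison, here is a small sketch in present-day Python 3 (so `range` instead of `xrange`). The function names and the reduced size `N = 300` are assumptions chosen so that the loop finishes quickly; the timings depend on the machine and are not meant to reproduce the numbers above.

```python
import time
import numpy as np

N = 300  # hypothetical size, kept small so the pure-Python loop finishes quickly

def squares_loop(n):
    """Fill the array element by element in interpreted Python."""
    out = np.empty((n, n))
    for x in range(n):
        for y in range(n):
            out[x, y] = x ** 2 + y ** 2
    return out

def squares_broadcast(n):
    """Same values via broadcasting: an (n, 1) column plus a (1, n) row."""
    idx = np.arange(n)
    return idx[:, np.newaxis] ** 2 + idx[np.newaxis, :] ** 2

start = time.perf_counter()
slow = squares_loop(N)
loop_seconds = time.perf_counter() - start

start = time.perf_counter()
fast = squares_broadcast(N)
broadcast_seconds = time.perf_counter() - start

print("same values:", np.array_equal(slow, fast))
print("loop: %.4f s, broadcast: %.4f s" % (loop_seconds, broadcast_seconds))
```

Both variants still touch all N × N elements; the broadcast version simply does that work inside NumPy's compiled code instead of the interpreter.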

But even this is just a problem of how accustomed you are to an API. I would not say that Python is inferior because it takes ages to execute for-loops in Python itself, rather than in underlying C libraries. Rather, Python is an elegant wrapper on top of many nice libraries, which helps to greatly reduce boilerplate code. Or, as I said to a fellow student, “every scripting language has its use case: Perl has extremely fast string operations, PHP is for website coding, and Python is for extremely rapidly escaping back to C code”.

Today I’ve been working on my final project. (We have to build a bigger application around some numerical or physical problem and present it to the rest of the class.) This has mostly been fun, as this was the first time I could escape to well-known APIs through the magical key called PyQt4. While I’ve been working on this project, I finally found a way to criticize Python: It’s not optimized for escaping back to C++ code, because the OOP concepts in Python and C++ are, well, completely different things. In C++, classes are blueprints which can be used to create objects. Compare it to civil engineering: Using a construction plan, you put together various materials until the building has emerged.

In Python, a class is essentially a special type of object that works as a function. When this class function is called, it creates a copy of itself and calls the __init__ method of this copy. Here we see that these concepts do not quite fit: It’s like one would just copy the construction plan of a building, lay it out at the building site, connect an air pump to it, inflate it and hope that a building emerges.

There is nothing bad about the class concept in Python in general; the bad thing for me is that it does not match the perspective on OOP that I have gained over the last ~10 years. And because it does not match the OOP model in C++, creating bindings seems to be more complicated. For example, I could not create a class that subclasses QObject and QGraphicsItem (a pattern which I regularly used e.g. in Kolf, to get a 2D representation together with signals/slots and properties), and I could not create a Borg-style class that derives from QTimer.

Apart from that, my criticism of Python boils down to the missing explicit namespace declarations (why am I bound to file and even directory structure here?), the missing visibility modifiers for class members (“public”, “protected” and “private”) and, most prominently, duck typing.

The last one is usually the most controversial, so I’ll just neutrally put down my two cents on this topic. If I can choose, I’ll only use strongly typed languages, as in: languages where each variable has a fixed type and unsafe conversions are not allowed, simply because this eliminates a very large class of problem sources. Actually, I would like a programming language which ensures that most problems never happen to me (i.e. a strongly typed language with garbage collection, bounds checking and stuff) _and_ runs fast, with very little overhead, and always with the possibility to bypass these checks if the overhead becomes too big. Sadly, I do not know of any language currently in existence that offers me such things.

I do not feel that I’m at a good point to end this blog post, so I’ll just show off some random bling-bling from my Python code at the end: The actual goal of the project is to implement simple pathfinding with genetic algorithms. That means that you start with some random paths, kill bad ones, and mutate and combine the good ones to produce possibly better paths (you know, “survival of the fittest”). I decided to make the thing a bit fancier by allowing obstacles to move around during the calculation, to see how the population reacts to them. This was the perfect moment to try something I’ve wanted to try for a long time: a fluent programming interface. These are programming interfaces where the statements read like sentences:

s = SimulationArea(500, 500)

s.addObject(WallObstacle(400, 0)
    .moveTo(-200, 100).immediately()
    .standStill().forSeconds(2)
    .moveTo(200, 100).forSeconds(5)
)
s.addObject(WallObstacle(400, 0)
    .moveTo(700, 200).immediately()
    .standStill().forSeconds(2)
    .moveTo(300, 200).forSeconds(5)
)

“Hey, wall, move to (700, 200) immediately, wait there for 2 seconds, then move to (300, 200) over 5 seconds.” You get the trick?
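How such a chain can work internally is easy to sketch: every method records something and returns `self`. The class below is a hypothetical stand-in, not the actual `WallObstacle` from the project; in particular, treating the constructor arguments as a start position and storing the steps as `(target, seconds)` pairs are assumptions made for illustration.

```python
class FluentObstacle:
    """Hypothetical stand-in for the post's WallObstacle; it only records a schedule."""

    def __init__(self, x, y):
        self.position = (x, y)   # assumed meaning: start position
        self.schedule = []       # finished steps: (target, duration_in_seconds)
        self._pending = None     # target that still waits for its duration

    def moveTo(self, x, y):
        self._pending = (x, y)
        return self              # returning self is what allows chaining

    def standStill(self):
        # Standing still is modelled as "move to wherever the last step ended".
        self._pending = self.schedule[-1][0] if self.schedule else self.position
        return self

    def forSeconds(self, seconds):
        if self._pending is None:
            raise ValueError("forSeconds() needs a preceding moveTo() or standStill()")
        self.schedule.append((self._pending, float(seconds)))
        self._pending = None
        return self

    def immediately(self):
        return self.forSeconds(0)


wall = (FluentObstacle(400, 0)
        .moveTo(700, 200).immediately()
        .standStill().forSeconds(2)
        .moveTo(300, 200).forSeconds(5))

for target, duration in wall.schedule:
    print(target, duration)
```

Splitting each sentence into a "what" call (`moveTo`, `standStill`) and a "how long" call (`immediately`, `forSeconds`) keeps the chain readable, at the price of a small amount of hidden state between the two calls.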

If someone’s interested in the full code (once it is finished), I can put it up somewhere (although it will be uninteresting for most of you).
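Until then, the "start with random paths, kill bad ones, mutate and combine the good ones" loop mentioned earlier can be sketched roughly as follows. This is not the project code: the endpoints, the number of waypoints, the population size, the mutation strength and the absence of obstacles are all simplifying assumptions.

```python
import math
import random

START, GOAL = (0.0, 0.0), (10.0, 10.0)   # hypothetical endpoints
WAYPOINTS = 4                            # free points per path (assumption)

def path_length(path):
    """Cost of a path: total length from START through the waypoints to GOAL."""
    points = [START] + path + [GOAL]
    return sum(math.hypot(q[0] - p[0], q[1] - p[1])
               for p, q in zip(points, points[1:]))

def random_path(rng):
    return [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(WAYPOINTS)]

def crossover(a, b, rng):
    """Take the first waypoints from one parent and the rest from the other."""
    cut = rng.randint(1, WAYPOINTS - 1)
    return a[:cut] + b[cut:]

def mutate(path, rng, sigma=0.5):
    """Nudge one randomly chosen waypoint by Gaussian noise."""
    child = list(path)
    i = rng.randrange(WAYPOINTS)
    x, y = child[i]
    child[i] = (x + rng.gauss(0, sigma), y + rng.gauss(0, sigma))
    return child

def evolve(generations=50, size=30, seed=1):
    rng = random.Random(seed)
    population = [random_path(rng) for _ in range(size)]
    best_costs = []
    for _ in range(generations):
        population.sort(key=path_length)
        survivors = population[:size // 2]      # kill the worse half
        children = []
        while len(survivors) + len(children) < size:
            parent_a, parent_b = rng.choice(survivors), rng.choice(survivors)
            children.append(mutate(crossover(parent_a, parent_b, rng), rng))
        population = survivors + children       # survivors stay unchanged
        best_costs.append(path_length(population[0]))
    return population, best_costs

population, best_costs = evolve()
print("best cost went from %.2f to %.2f" % (best_costs[0], best_costs[-1]))
```

Because the survivors are carried over unchanged, the best cost can never get worse from one generation to the next; moving obstacles would break that guarantee, since the cost of an old path could change between generations.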

Update: Mh, this blog post got much longer than I intended.

Posted by Stefan Majewsky
Filed in Programming


39 Responses to “Python experiences, or: Why I like C++ more ;-)”

Roberto Alsina Says:

July 4, 2009 at 22:17
The difference is not passing by reference or by value, you are having a C++ism.

The difference is mutable and immutable types.

If you pass an immutable type, of course you are not modifying the original object.

- Stefan Majewsky Says:
  
  July 5, 2009 at 08:34
  Doesn’t that boil down to the same? I would rather say “hardware-ism” instead of “C++-ism” because I compare this to how something would be written in Assembler, if they behave like that.
  
  - Roberto Alsina Says:
    
    July 7, 2009 at 12:23
    No, because immutable types are also “passed by reference”. The difference is what happens when you assign to them, not what happens when you pass them to the function.
    
  - Roberto Alsina Says:
    
    July 7, 2009 at 12:27
    Argh, I meant “mutable and immutable types are passed by reference” in the previous response.
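The point made in this thread can be checked with plain built-in types. The following is a small sketch with made-up function names: both a list and a tuple arrive in a function as the caller's own object, and the difference only shows up between rebinding a name and mutating an object.

```python
def received_same_object(arg, caller_id):
    """True if the function got the caller's object itself, not a copy."""
    return id(arg) == caller_id

def rebind(seq):
    seq = seq + seq[:1]      # creates a new object; only the local name changes
    return seq

def mutate(seq):
    seq.append(seq[0])       # alters the object the caller passed in

a_list = [1, 2, 3]
a_tuple = (1, 2, 3)

# Mutable and immutable objects alike are handed over by reference.
print(received_same_object(a_list, id(a_list)))
print(received_same_object(a_tuple, id(a_tuple)))

rebind(a_list)    # caller's list is unchanged
rebind(a_tuple)   # caller's tuple is unchanged
mutate(a_list)    # caller's list now has a fourth element
# mutate(a_tuple) would raise AttributeError: tuples have no append()

print(a_list, a_tuple)
```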
    
Jello B. Says:

July 4, 2009 at 22:43
Python has duck typing and strong typing.

>>> a=1
>>> b="1"
>>> print a+b
TypeError: unsupported operand type(s) for +: 'int' and 'str'

- Stefan Majewsky Says:
  
  July 5, 2009 at 08:15
  Perhaps I should not use the term “strong typing”. In the way I use it, it contains compile-time type checking, but the Wikipedia article suggests that there are multiple definitions.
  
  - illissius Says:
    
    July 6, 2009 at 11:44
    Yeah, that’s called “static typing”.
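The two notions discussed in this thread can be separated with a short sketch in current Python 3 syntax; the function names are made up. The type error is real (strong typing), but it only appears when the offending line actually runs (dynamic rather than static typing).

```python
def add(a, b):
    return a + b

try:
    add(1, "1")
except TypeError as exc:
    outcome = "TypeError: %s" % exc   # raised at run time, when the line executes
else:
    outcome = "no error"

explicit = add(1, int("1"))   # conversions have to be spelled out

def never_called():
    x = 1
    return x + "1"            # no complaint until this function actually runs

print(outcome)
print(explicit)
```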
    
Soap Says:

July 4, 2009 at 23:38
Strongly typed, garbage collecting _and_ runs fast? Sounds like Haskell to me.

Haskell has a stronger type system than C/C++, garbage collecting built in (it’s necessary for laziness), bounds checking isn’t done for you but is simple, as well as other stuff (ad-hoc and parametric polymorphism, functions as first class types, etc.).

It can get comparable speeds to C, because it is compiled and optimized (ever improving).

And, you can bypass the language features by interfacing with C.

Unfortunately, it requires learning a completely different programming style (purely functional) rather than continuing with OOP. Libraries are also improving rapidly.

- illissius Says:
  
  July 6, 2009 at 11:46
  Haskell is pretty mind-bending. In a good way.
  
Lucian Says:

July 4, 2009 at 23:48
Classes are templates in python just as much as in C++. But they’re better, because you can change them, pass them as arguments and do all sorts of useful things with them. Metaclasses are one important but intimidating use of their dynamicity and class decorators (in py3k) are meant for more casual use.

If you couldn’t subclass QObject, then PyQt4 is incomplete (or buggy). I routinely subclass gobject.GObject. Although Qt is the best GUI framework on earth, the Python bindings could be much better.

NOTHING can easily call into C++ code, sometimes not even other C++ code. Python is actually much better at this than other languages (Boost.python), only D 2.0 and C++ itself are better.

Any name starting with an underscore is private. Any name starting and ending with double underscores belongs to or is called by the interpreter. You can go around this and use private stuff if you want to, but you can do that in other languages as well (even C++), it’s just a bit harder.

Unsafe conversions are not allowed in Python. Try “23” + 2. It will complain, you’d have to use “23” + str(2). But you also don’t have to fight the type system all the time; and you get nice, implicit interfaces (looks like a duck, use it like a duck). Best of both worlds if you ask me (C is statically and weakly typed, Java is statically and (mostly) strongly typed, Haskell is statically and strongly typed and Python is dynamically and strongly typed).

- Stefan Majewsky Says:
  
  July 5, 2009 at 08:17
  I could subclass QObject, and I could subclass QGraphicsItem. But I couldn’t subclass both of them at one time (which works just fine in C++).
  
  “use private stuff if you want to, but you can do that in other languages as well (even C++), it’s just a bit harder.”
  
  How can you access a private variable in a C++ class, except for a public interface defined by this class? (“Friends” belong to the public interface of the class.)
  
  - David Boddie Says:
    
    July 5, 2009 at 22:51
    “For example, I could not create a class that subclasses QObject and QGraphicsItem (a pattern which I regularly used e.g. in Kolf, to get a 2D representation together with signals/slots and properties)…”
    
    That’s an unfortunate limitation of the bindings:
    
    http://www.riverbankcomputing.com/static/Docs/PyQt4/pyqt4ref.html#multiple-inheritance
    
    Fortunately, you can work around it, as other people have suggested.
    
    Of course, having two “object” hierarchies with their own parents and children has its own problems, as is shown by the existence of the now-obsolete children() method of QGraphicsTextItem:
    
    http://doc.trolltech.com/4.5/qgraphicsitem-obsolete.html#children
    
  - Thomas Bellman Says:
    
    July 6, 2009 at 20:52
    “How can you access a private variable in a C++ class, except for a public interface defined by this class?”
    
    Pointer arithmetic, or other stray pointers. It’s more difficult to willfully access a private member in C++ than in Python, but it is much easier to access it by accident in C++…
    
- lorg Says:
  
  July 7, 2009 at 13:51
  I think it’s more like:
  Every name starting with a single underscore is similar to c++’s protected.
  Every name starting with two underscores, but not ending with two underscores, is similar to c++’s private.
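The two conventions can be tried out directly. The class and attribute names below are invented for illustration; the only language feature involved is name mangling, which rewrites `__name` inside a class body to `_ClassName__name`.

```python
class Account:
    def __init__(self):
        self.owner = "public"        # part of the interface
        self._balance = 10           # single underscore: internal by convention only
        self.__pin = 1234            # double underscore: stored as _Account__pin

class Savings(Account):
    def peek(self):
        return self._balance         # works; nothing stops a subclass from using it

    def try_pin(self):
        return self.__pin            # mangled to _Savings__pin, which does not exist

acct = Savings()
print(acct.peek())
try:
    acct.try_pin()
except AttributeError as exc:
    print("AttributeError:", exc)
print(acct._Account__pin)            # still reachable by spelling out the mangled name
```

So the double underscore mainly protects against accidental clashes in subclasses; it is not an access barrier in the C++ sense.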
  
Robert Currie Says:

July 5, 2009 at 00:30
I remember doing a module of Fortran as part of my physics degree…

I ran into some interesting problems with creating and defining objects when I implemented modules in the programs I wrote…
Most of the problems arose from the fact that you cannot create and define a variable in a quick and easy way in Fortran, which doubled the code length (and the department marked me down for this adverse effect).

For all the ‘great’ things Fortran can do, I think most (younger) scientists who do serious programming thankfully tend to use C++, which is due to some of Fortran’s shortcomings compared to modern languages.

My former physics department is currently migrating to teach python (badly) which still makes me chuckle, and feel sorry for the poor students when they were only handed a document which was written about Fortran to complete a Python course

I still have the idea that numerical programming for science students should be replaced with ‘learn programming concepts’ which would be far more useful

- Stefan Majewsky Says:
  
  July 5, 2009 at 08:32
  “My former physics department is currently migrating to teach python (badly) which still makes me chuckle”
  
  Interestingly, my physics department has just decided to migrate to teach C++. As part of the switch to the new European bachelor/master system, it was decided to insert a general computer course into the first semester, because the students have to learn Linux (our computers are mostly running on Kubuntu and openSUSE), C++ (mostly general knowledge on numerical algorithms, results are written to files and plotted with gnuplot), and QtiPlot (basic requirement for lab courses today).
  
- nicooooo Says:
  
  July 7, 2009 at 18:07
  I found that using fortran for compute-intensive parts and Python for the rest is very convenient in a lot of tasks. f2py is very easy to use.
  
  Plus you get the better optimizations coming from static fortran arrays, for free…
  
charon Says:

July 5, 2009 at 00:44
Well, I don’t use numpy and of course Python is slower than C, but I notice that you are comparing different algorithms. In the first one you compute a power for every element, so N^2 powers; in the second one you create two “vectors” on which you compute the powers, so 2*N, and then you sum the first one with the transpose of the second one and numpy “does the magic”. So O(N^2) vs O(N) heavy operations.
Then I don’t get why you need visibility modifiers more than the starting underscore, which isn’t strong, but discourages accessing that name. Of course you can also just define getters and setters which are as simple as “obj.name”, so they look like a simple member variable access.

- Stefan Majewsky Says:
  
  July 5, 2009 at 08:21
  “Then I don’t get why you need visibility modifiers more than the starting underscore, which isn’t strong”
  
  Just because it is not strong. If I say that this and that are private members, I mean it.
  
  It seems to me like, in Python, subclasses tend to fiddle about in private members of their superclass instead of relying on the superclass to provide a well-defined interface to subclasses.
  
  - lorg Says:
    
    July 7, 2009 at 13:54
    I think you are complaining here about what is actually the way some people program.
    As a general rule, subclasses should avoid playing with their superclass’s private members. Still, as the Python saying goes, “we’re all consenting adults here” and if you really want to, you can do that.
    (By the way, it’s also possible in C++, but you have to *really* want to do that 🙂)
    
Kevin Bowling Says:

July 5, 2009 at 02:01
Thanks for sharing your thoughts on both languages. I think both have their uses, and you made it clear that you were slightly biased since you have been using C++ for a while. I’m on the other side of the fence. I’ve used python and a few other languages (Java, C, PHP) but C++ is on my long term to learn list.

The big problem I see with C++ is that it has a steep learning curve, and rich libraries are not centrally located like Python, Perl, PHP, Microsoft .NET languages, and Java. I started writing code in Python without any introduction, just referencing the manual. The only thing that I noted was that many structures had different names than C-style languages. Every time I glance at C++ code, I get discouraged from diving in because it looks like you really need to have high and low level understanding of the language, which is much different than C if you want to truly be a C++ programmer. I’m not talking object oriented design (which for the most part translates across OO languages w/o much effort), but the implementation of C++ syntax and practice. In other words, the payback would be a long investment and I wouldn’t be productive in C++ for months or even years.

I would be very interested in your project code. How do you think python stacks up for AI programming? I have some projects I’d like to do involving pathfinding, optimization, and searching.

- Stefan Majewsky Says:
  
  July 5, 2009 at 08:25
  “The big problem I see with C++ is that it has a steep learning curve, and rich libraries are not centrally located like Python, Perl, PHP, Microsoft .NET languages, and Java”
  
  The library problem is by concept: C++ is designed to be a light-weight language (after compilation!). Actually, only virtual functions and perhaps exception handling have an overhead when compared to C code, and this overhead is as minimal as possible for a maximum benefit.
  
  But when it comes to general-purpose libraries, you get 98% with Qt/kdelibs. In functionality, these should at least match the Python standard library.
  
  “I would be very interested in your project code. How do you think python stacks up for AI programming? I have some projects I’d like to do involving pathfinding, optimization, and searching.”
  
  I’m skeptical that it scales up, but if you’re comfortable with Python, you should at least try how far you can get.
  
  - James Says:
    
    July 10, 2010 at 23:35
    So much of AI programming is experimentation, so in my mind that means I want a language where I can get ideas down quickly and brainstorm. Later I can make the parts that need to be fast faster or rewrite them. Usually though I end up throwing out the whole idea anyway and going on to something else, so why would I want to spend time making it fast?
    
H.C. Says:

July 5, 2009 at 02:25
“[..] I would like [..] a strongly typed language with garbage collection, bounds checking and stuff […] _and_ runs fast […]”

The D Programming Language http://www.digitalmars.com/d/ seems to be what you are looking for.

- albatroz Says:
  
  July 5, 2009 at 21:56
  And some QT bindings for this language http://www.dsource.org/projects/qtd.
  
Leo S Says:

July 5, 2009 at 02:45
Pretty much my experiences as well. Python is pretty neat in that you can save quite some typing, but for larger programs I still prefer C++.. My main problem with Python is that you can’t be sure about the code until it runs. This gets increasingly annoying when to get a piece of code to run requires several steps in a GUI app, only to find out that I made a typo and it throws an exception. C++/Qt would catch all those silly mistakes at compile time, and let me focus on the real problems at runtime, instead of mixing them both.

Also for the QGraphicsItem/QObject derived class, I worked around it by deriving from only QGraphicsItem, and then having a QObject member variable for the signals (so self.signalHelper.emit('blah')). Since PyQt can hook up signals to any Python function, you can hook slots directly to your QGraphicsItem derived object.

- BennyM Says:
  
  July 6, 2009 at 16:33
  Then at least use pylint before running your code.
  It will find most stupid bugs for you by analysing your code (like nonexistent functions, …)
  
Troy Unrau Says:

July 5, 2009 at 03:24
I’ve also had the pleasure of using python for an identically named course in computational physics. Yes, it’s slow when looping on its own, but the numpy and scipy array stuff is quite nice, and friendly to the MATLAB folks too (well, except for the array indices starting at 0).

I’ve played with PyQt and I must say that, in contrast to your experiences moving from C++, as an experienced python programmer, I find PyQt very nice. And the multiple inheritance thing you mentioned doesn’t seem to be a huge problem, as long as you define things in the right order. At least, this works reasonably well for me.

class MyClass(QObject, QGraphicsItem):
    def __init__(self):
        QGraphicsItem.__init__(self)
        QObject.__init__(self)

I still run into the occasional problem in PyQt where the underlying C++isms show through. For example, in QImage, bits() returns a void pointer, which I have to cast into an int before I can pass it then to OpenGL. I’d rather these things had a proper python way of doing things… ah well…

Hope you had fun 🙂

- Stefan Majewsky Says:
  
  July 5, 2009 at 08:28
  I tried this code, but only the first initialized class works properly. For example, the following code fails on the last line:
  
  class MyClass(QObject, QGraphicsItem):
      def __init__(self):
          QObject.__init__(self)
          QGraphicsItem.__init__(self)
  
  a = MyClass()
  b = QGraphicsLineItem(0, 0, 100, 100, a)
  
  But when I put the QGraphicsItem constructor first, QObject signals/slots do not work anymore.
  
  - lorg Says:
    
    July 7, 2009 at 13:49
    This is not the correct way to do multiple inheritance in Python. If you are still interested, take a look at the documentation for super().
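For reference, cooperative multiple inheritance with super() looks roughly like this in pure Python (Python 3 syntax, invented class names). It illustrates the language mechanism only; it is not a claim that it lifts the PyQt4 binding limitation mentioned in another comment.

```python
class Base:
    def __init__(self):
        self.initialized = []

class Signals(Base):
    def __init__(self):
        super().__init__()
        self.initialized.append("Signals")

class Drawable(Base):
    def __init__(self):
        super().__init__()
        self.initialized.append("Drawable")

class Item(Signals, Drawable):
    def __init__(self):
        super().__init__()      # walks the MRO: Signals -> Drawable -> Base
        self.initialized.append("Item")

item = Item()
print([cls.__name__ for cls in Item.__mro__])
print(item.initialized)
```

Each `__init__` runs exactly once because every class delegates to the next one in the method resolution order instead of naming its parent explicitly.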
    
Esben Mose Hansen Says:

July 5, 2009 at 07:32
Re duck typing… that is supported by C++ with templates.

- illissius Says:
  
  July 6, 2009 at 12:04
  I’ve had that thought. But the fact that they’re resolved at compile time makes them much less useful for that purpose. In something like Ruby (haven’t tried Python) you can have a list of objects of completely different types and then call doSomething and doSomethingElse on each of them, and as long as they all have those methods, it’ll work — polymorphism without inheritance. In C++, the only way to store a list of anythings is with void*, and when you pass void* to a template it checks at compile time whether void has any of the methods you’re trying to call, which it doesn’t, and you get a compile error.
  
  I believe you could do duck typing for QObjects and slots using qt_metacall, but the syntax is ugly as hell.
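The "list of completely different types" case works the same way in Python as described for Ruby. A tiny sketch with invented classes that share no base class beyond `object`:

```python
class Duck:
    def speak(self):
        return "Quack"

class Robot:
    def speak(self):
        return "Beep"

class Rock:
    pass                      # has no speak() method at all

things = [Duck(), Robot(), Rock()]
results = []
for thing in things:
    try:
        results.append(thing.speak())   # resolved by name at run time
    except AttributeError:
        results.append("%s cannot speak" % type(thing).__name__)

print(results)
```

The flip side is visible in the last element: the missing method is only noticed when the call is actually attempted.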
  
Monkey Says:

July 5, 2009 at 09:00
http://www.thebestpageintheuniverse.net/c.cgi?u=puns

Foreigner Says:

July 5, 2009 at 09:06
Python doesn’t need visibility modifiers. We have properties that do the job stylishly.
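What properties look like in practice, as a minimal sketch with an invented `Temperature` class: callers write plain attribute access, while the class keeps control over validation and over which attributes are writable.

```python
class Temperature:
    def __init__(self, celsius):
        self.celsius = celsius           # goes through the setter below

    @property
    def celsius(self):
        return self._celsius

    @celsius.setter
    def celsius(self, value):
        if value < -273.15:
            raise ValueError("below absolute zero")
        self._celsius = value            # underscore: internal storage by convention

    @property
    def kelvin(self):                    # read-only: no setter defined
        return self._celsius + 273.15


t = Temperature(20.0)
t.celsius = 25.0                         # looks like a member access, runs the setter
print(t.celsius, t.kelvin)
try:
    t.kelvin = 0
except AttributeError:
    print("kelvin is read-only")
```

Note that `t._celsius` remains reachable from outside, so this controls the interface rather than enforcing privacy.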

The User Says:

July 5, 2009 at 11:08
You want to have garbage-collection, bound-checking and so on?
In C++ there are basically two ways:
-Use typedefs. You use a smart-pointer, an array class with checked bounds… and when you want to have more speed you change the typedef.
-Catch signals (like SIGSEGV) and throw an exception from there. Look here: http://forum.kde.org/viewtopic.php?f=83&t=49042
So you will have better error-handling for debugging. Of course you can’t implement reference-counting this way.
I think Garbage Collection is useless. Call delete in the destructor, that’s it. It’s not difficult to handle such things manually. Or use the Qt-object-trees.

OOP in Python is simply ugly.

- Stefan Majewsky Says:
  
  July 5, 2009 at 11:58
  This is exactly what some critics of C++ mean when they say: “C++ does not have features. It has meta-features.”
  
The User Says:

July 5, 2009 at 12:43
Of course there are a few limitations in C++:
-Templates are like Haskell (but more limited). There’s no simple way to handle typelists.
-The Preprocessor is a universal code-generator but even more restricted
-typeinfo doesn’t contain an int-typeid. That’s bad because normal implementations should have such an id. This makes the factory pattern more complicated
-typeinfo could also provide a builtin way to construct objects
-typetraits don’t provide full introspection, you’d need a separate code-generator (oh moc, why don’t you support plugins?)

But basically you have the most powerful language and you can create very intuitive, very flexible, very dynamic or very “meta” types.

Piet van Oostrum Says:

July 6, 2009 at 09:54
“… the OOP concepts in Python and C++ are, well, completely different things. In C++, classes are blueprints which can be used to create objects. Compare it to civil engineering: Using a construction plan, you put together various materials until the building has emerged.

In Python, a class is essentially a special type of object that works as a function. When this class function is called, it creates a copy of itself and calls the __init__ method of this copy. Here we see that these concepts do not quite fit: It’s like one would just copy the construction plan of a building, lay it out at the building site, connect an air pump to it, inflate it and hope that a building emerges.”

A class, when called, does not create a copy of itself. It creates an instance, which is similar to new in C++. I don’t understand why you get this impression.
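This is easy to verify interactively. A short sketch with an invented `Building` class, echoing the construction-plan analogy from the post:

```python
class Building:
    floors = 1                      # class attribute, stored on the class object

    def __init__(self, name):
        self.name = name            # instance attribute, stored on the new object

house = Building("house")

print(type(Building).__name__)      # the class is itself an object, of type `type`
print(type(house) is Building)      # the call returned an instance of the class
print(isinstance(house, type))      # and that instance is not itself a class
print("floors" in vars(house))      # class attributes are looked up, not copied
print(house.floors)
```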

Both languages have their strong and weak points and their application domains, so they can live happily together. Even in symbiosis if necessary.

By the way, your blog is hard to read with light gray letters on a not really white background, even when I blow up the font. Please consider using something with more contrast.
