(Like this article? Read more Wednesday Wisdom!)
Last week I came across a well-meaning LinkedIn post that sought to advise junior developers on which language to pick up in order to have a good chance at a job. The number one language on the list was Python. This is of course a mistake.
Or is it?
I would have to agree with the author of the post, though grudgingly, that Python is super popular and that good knowledge of Python will help you get a job. However, the ongoing popularity of Python is a problem and one that sets our industry back years for every year of its continuing use.
I base my fairly harsh judgment on many years of experience running large applications written in Python. Using Python for a large application is like building a nuclear reactor out of Lego bricks: Yes, it has bright colors and it was really simple to get started. But now the reactor has been running for a while, we have radiation leaks everywhere, and nobody dares touch anything. Instead, people stick new bricks on all over the place to keep the thing going. Really, the only thing to do is to encapsulate the reactor in concrete, let it cool down, and build a new one from proper building materials.
The problem with Python is of course that it is an interpreted language with dynamic and duck typing: We type in some code, save it, and then only find out at runtime whether a group of statements always works, sometimes works, or does not work at all, based on the data that we put into it. Typically when writing a function you are only partially in control of the data that goes into that function. That means that we need to have the discipline to check all the input data exhaustively, which history shows we are not very good at. Worse, Python's duck typing invites smart but terrible code that inevitably comes round and bites us in the ass.
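To make that concrete, here is a minimal hypothetical sketch (the names are mine, not from any real code base) of code that looks fine, runs fine on the happy path, and only reveals its type error when the wrong data flows in:

```python
class Item:
    def __init__(self, price):
        self.price = price

def total_price(items):
    # Anything with a .price attribute quacks like an item, so this
    # "works" regardless of whether price is a float or a string.
    return sum(item.price for item in items)

print(total_price([Item(9.99), Item(5.00)]))      # 14.99
print(total_price([Item("9.99"), Item("5.00")]))  # TypeError, at runtime
```

A statically typed language rejects the second call before the program ever ships; Python happily ships it and lets production find out.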
Through my years supporting large Python applications I have seen and experienced horrors that would not have happened had our application been written in a sane, compiled, and type safe language.
In one instance, some years ago, I managed to convince our organization to rewrite our system in Rust. It turned out to be a master stroke.
I have on many occasions launched new versions of a big Python application in production, only to immediately drown in errors, caused by exceptions that were in turn caused by Python code where a single glance confirmed that it could never ever work and consequently had never been run: Not on a developer workstation, not in a unit test, and not in an integration test.
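The mechanics are worth spelling out: Python resolves names only when a line actually executes, so a branch that no test ever reached ships without complaint. A hypothetical sketch of the pattern:

```python
def store(path, body):
    print(f"stored {len(body)} bytes at {path}")

def handle(method, path, body=None):
    if method == "GET":
        return f"contents of {path}"
    # Nobody ever sent a non-GET request on a workstation, in a unit
    # test, or in an integration test, so this line never ran...
    return sotre(path, body)  # typo for store(); NameError in production

handle("GET", "/ok")           # fine
handle("PUT", "/boom", b"xx")  # NameError: name 'sotre' is not defined
```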
There will be Python defenders who will say that this is not a deficiency of the language but of the code reviewing and testing strategy. They are wrong. The fact that it is theoretically possible to look at every line and test every input and every scenario does not mean that this is practically possible. One of the values of a good programming language is that you don't have to review and test every possible permutation of bits in memory, because it guarantees that a large number of these permutations cannot happen. If I have to review and test every "a = b + c" exhaustively, my program will never make it to production, which, if the program is written in Python, might actually be a good thing.
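To see why even "a = b + c" is not one thing in Python, consider how many behaviors that single line can have depending on what happens to flow into it at runtime:

```python
print(1 + 2)      # 3: integer addition
print("1" + "2")  # "12": string concatenation
print([1] + [2])  # [1, 2]: list concatenation
print(1 + 2.0)    # 3.0: silent promotion to float
print("1" + 2)    # TypeError: can only concatenate str (not "int") to str
```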
Quite often I have looked at Python functions and wondered if anything or anyone actually called them, and if so, with what arguments. In the absence of a compiler and linker that require a global view of the entire program, I often had to resort to full text searches of our code base to see if I could find any call sites. Unfortunately, even when that search yielded no results, I have still had the program crash on me when I removed the function. And even if the program doesn't immediately crash, there is often no telling whether it will crash in some edge condition (if the function is only used in that edge condition). You can never be sure, which is why it is safer to just leave the function in, leading to cruft.
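Dynamic dispatch makes even the full text search unreliable. In a hypothetical sketch like the following, searching for handle_payment finds the definition and zero call sites, yet the function runs every day:

```python
class Handlers:
    def handle_payment(self, event):
        return f"charged {event['amount']}"

def dispatch(handlers, event_type, event):
    # The call site is assembled from strings at runtime, so the name
    # "handle_payment" never appears verbatim anywhere in the code base.
    return getattr(handlers, f"handle_{event_type}")(event)

print(dispatch(Handlers(), "payment", {"amount": 100}))
```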
Python defenders will say that this is a case of bad programming patterns. Of course they are wrong (again), because the value of a good programming language is that it will not allow you to write programs that are structurally deficient. Surely, checking that all the code you will ever need actually exists before creating a binary is not too much to ask? And the bad programming patterns that create these problems often come into existence to deal with refactoring problems caused by Python's lack of strict typing and its willingness to figure out at runtime whether something might maybe make sense.
Another problem with Python is performance. My laptop has 10 CPU cores, of which a Python application can effectively use about 1.2: the Global Interpreter Lock ensures that only one thread at a time executes Python bytecode. What to do? Fortunately the 1980s can come to the rescue! I once worked with a service in Python that forked worker processes to handle requests, ensuring that all cores could be used.
Unfortunately these workers ran out of memory quickly, so we decided to have each worker terminate itself after handling a configurable number of requests, letting Linux do our memory management for us. This is not a Python problem per se, but Python makes it worse.
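For what it's worth, the standard library ships this 1980s pattern out of the box: multiprocessing.Pool forks one worker per core to get around the GIL, and its maxtasksperchild parameter is exactly the "recycle the worker after N requests and let the OS reclaim the memory" trick. A minimal sketch, with handle_request as a stand-in for real work:

```python
from multiprocessing import Pool

def handle_request(request_id):
    # Stand-in for a real handler that slowly accumulates memory.
    return len(str(request_id))

if __name__ == "__main__":
    # One process per core sidesteps the GIL; maxtasksperchild=100
    # makes every worker exit after 100 requests, so that exiting,
    # not Python, releases whatever memory it piled up.
    with Pool(processes=10, maxtasksperchild=100) as pool:
        print(sum(pool.map(handle_request, range(1_000))))
```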
The forking had another effect though: Python's use of reference counting defeated copy-on-write, because even memory blocks holding read-only variables were actually written to in order to manipulate the reference counts, thereby blowing up the combined physical memory footprint of the workers. We solved this by patching the interpreter to use a magic reference count value for all objects that were created by the master process and inherited by the workers, and then never touching reference counts that had the magic value. Super smart solution, but one that should be unnecessary. What use is a simple interpreted language if you need to hack the interpreter to make it actually work for you?
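Amusingly, CPython later grew official support for both halves of this hack: gc.freeze() (added for exactly this pre-fork workload) parks everything allocated so far in a permanent generation so the garbage collector stops writing to those pages, and PEP 683 (Python 3.12) introduced immortal objects whose reference count is a fixed magic value that is never updated, which is more or less the patch we wrote by hand. The pattern, sketched:

```python
import gc
import os

# Parent builds the big shared read-only state once.
SHARED = {i: str(i) * 10 for i in range(1_000_000)}

# Move existing objects into the GC's permanent generation so that
# collections in the children stop dirtying their pages (gc.freeze
# was added to CPython for exactly this pre-fork scenario).
gc.freeze()

for _ in range(4):
    if os.fork() == 0:  # child (POSIX only)
        # Merely reading SHARED still bumps reference counts and
        # faults copy-on-write pages in, unless the objects are
        # immortal (PEP 683, Python 3.12+).
        _ = SHARED[42]
        os._exit(0)

for _ in range(4):
    os.wait()
```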
Python has caused me a world of pain. It is slow as molasses and patently unsuitable for any program that is larger than about 100 lines of code.
Why do people keep using it?
The number one reason I get for people wanting to use Python is that it is "easy to use". Apart from the fact that this is not true, it is also a fallacy that shows a complete lack of understanding of how Total Cost of Ownership (TCO) works. For any serious piece of software that makes it to production, 80% or more of the cost is in the phase after initial development. That's the phase where we run, debug, add features, and refactor. Who cares that the first 20% was twice as fast if the other 80% is a hellhole of debugging and finding problems that any compiler from the 1960s could have caught? Do the math: halving the initial 20% saves you 10% of the total, while making the remaining 80% even 25% more painful costs you 20%. You are behind before you have shipped a single feature.
Many congratulations that your programming language is easy to use, but that is not your job. Your job is to write software that meets the requirements and that is reliable, easy to maintain, and fast. Python makes it very hard to write code that is reliable, easy to maintain, and fast.
Q: Then why do we use it?
A: Two words: false economy.
Unfortunately, writing software is hard. Good programming languages make it a bit easier to write halfway decent programs, typically at the cost of more complexity in the phase of flight that doesn't matter much overall but comes first, and which therefore weighs heavily on people's minds and has an extraordinary influence on our flawed project planning methodologies.
With Python you can open an editor, write a few statements, close the editor, and run the code. That seems very efficient, but upon inspection it's the equivalent of starting to build a skyscraper by putting up a few prefab walls (never mind the foundation) and immediately wanting to use the bathroom. That's not how we build skyscrapers and it's not how we should build software.
"But Python is great as glue code for other libraries". No, it is not. It's not great, it's there! There is nothing that makes Python particularly great for that other than an abundance of existing software to make it usable and lots of people already having Stockholm syndrome. Python is the equivalent of the QWERTY keyboard, it sucks and is a major contributor to RSI and carpal tunnel syndrome, but everyone is used to it already so let's build another keyboard with that layout.
Personally I switched all of my casual programming to Go. It's almost as easy to write as Python, is type safe, has a fast build system, and generates highly optimized native code binaries. Of course Go is not perfect (hint: no programming language is), but it is much better than Python if you want to write code that is reliable, fast, and where you have at least some chance at debugging and refactoring when the code has sprawled out of control.
And, not to put too fine a point on it, but if you can code Python but not Go (or another decent programming language), you probably have no business writing software for a living. I know it sounds harsh and I apologize for that, but writing software is a profession and hence you should have the knowledge, skill, and experience to use professional tools.
Python is terrible. But it has one redeeming feature: It was invented by a Dutchman! :-)
This was simply a satisfying and stress-relieving read after being forced to develop a program in Python; it's good to know that someone thinks the same. Spending time debugging in Python was an absolute nightmare.
Python is an absolutely horrifying language to build a proper program in. It's like trying to build a house on top of a base that could collapse at any point. At first it may seem that it doesn't take much effort to build the base for a program in Python, and indeed it doesn't. But low-effort work only results in a low-effort product. Languages such as Java, C/C++, or Rust definitely take more time to build the base in, but that base is extremely solid and can support your program over a much, much longer period of time.
And those languages definitely take a lot more time to properly understand. But once you do, you can see a solid and strong connection between everything. Python takes little to no time to understand, but your understanding is rather loose. And that will only backfire when you build large programs, as that loose understanding will leave you barely able to draw connections between anything, turning debugging into an absolute hellhole.
As a person who prefers to have a good understanding of whatever I'm doing, rather than "it magically happens", I can never see the "improvement in efficiency" when developing in Python. I only see a tenfold decrease in efficiency compared to C++. Hell, even JIT-compiled JavaScript is more efficient than Python for me.
And then we have how incredibly slow Python is. And then you'll have to install a shit ton of libraries to make it better. And then you realize that's like installing hundreds of fan-made mods into a game and hoping that it won't crash.
Meta's new Threads backend was written in Python using Cinder (https://github.com/facebookincubator/cinder), Meta's performance-oriented fork of CPython 3.10.