The misguided yearning for a better plan
You don't need a better plan, you need better execution of the plan you have
(Like this article? Read more Wednesday Wisdom!)
Last week, over some take-home Indian dinner, I was talking with my brother-in-law who runs a highly respected education consultancy. We were talking about ballot question number 2, which seeks to abolish the Massachusetts Comprehensive Assessment System as a high school graduation requirement.
At the risk of inviting some hate mail: I voted not to abolish the MCAS.
One thing led to another and we ended up discussing what separates good schools from bad schools. And then my brother-in-law said something very interesting: “Bad schools”, he said, “think that good schools have a better plan, but that is not at all what is going on. Good schools are running with the same plan, they just execute on it better.” This made a lot of sense to me; two schools in otherwise similar circumstances have access to the same teaching materials, the same pool of teachers, can pay these teachers approximately the same, get the same mix of students, are subject to the same regulations, et cetera. Why would one school be better than another? The answer: They execute better on the same plan.
The quest for a better plan is ubiquitous. I regularly get questions on how to improve aspects of engineering teams. “How do we make our system more reliable?”, people ask. “How do we drive down the pager load?”, “how do we improve our code quality?”, “how do we unflake our unit tests?” People who ask these questions think I have some secret and better plan for doing any of that. Nothing is further from the truth. I don’t have a secret plan and neither do I have secret knowledge. If I am successful, it’s only because I manage to execute better on the plan that everybody already knows about.
Execution is the only thing that matters. Great execution of a mediocre plan will lead to much better results than poor execution of a great plan. If you are failing, don’t start searching for a better plan. Instead, focus on better execution of the plan you already have. Only if you have near-flawless execution on the plan at hand and still want to improve can you start thinking about a better plan. You will also find that the many improvements you made executing on the less-than-optimal plan will stand you in good stead once you have found the better plan.
Great execution starts with great attention to the myriad of details that govern daily (work) life. There are a lot of those details in coding, but really in all aspects of work there are details that ultimately compound to either form high quality work or not. Most organizations get the high level of any plan right. There are code reviews, people write tests, there are weekly project meetings, there are performance reviews. But that doesn’t mean that any of these are any good. Whether or not a meeting is successful does not depend on whether you managed to get it on the calendar, found a meeting room, and got everyone involved in that room. Mutatis mutandis, whether your code is any good does not depend on the rule that a pull request must have a code review. These are the performative aspects of the plan, and unfortunately that’s where a lot of execution stalls.
Whenever you do something, anything, you make choices about how well you are going to do it. An important aspect to consider there is that people always seem to be ready to meet a low bar. And even if they are not, they might not know where the bar is and might need to be enlightened.
When I joined Google in 2006 I already knew C++. In fact, I had taught C++ on multiple platforms and one of my friends was on the C++ standardization committee and we talked about C++ regularly. I also knew how to program computers and had a long history of delivering working code. So when I needed to write a piece of C++ for Google Maps, I felt confident I could get it working. And I did! It was beautifully indented, the variables had reasonable names, memory management was mostly done well, and it came with unit tests too! However, when I sent the change in for a code review, I got about fifty comments back that ranged from a little bit pedantic to structural comments that would make my code more robust. “Ah, oh, wait”, I thought: “so this is the bar!?” From then on my code improved rapidly to meet the incredibly high coding standard that Google had.
In code reviews, people will often point out: “But it works”, as if that is the only matter of any concern. Of course it works! The fact that it works is table stakes. If it didn’t work, I wouldn’t even be looking at this code.
I mentioned code reviews because that is one of the great pain points in many organizations. Most organizations are aware that their code base is a bit of a mess and there is generally a lot of worry about how to get everyone to write higher quality code. This invariably prompts the search for a better plan: Implement readability, get automated code scanning tools in place, give everyone a book, whatever. “What to do?” engineering managers often ask themselves (and me). “How do I get people to write better code and do better code reviews?” The answer is of course simple: Just do it. We know how to. Don’t wonder why your people don’t know how to write good code or do high quality code reviews, wonder why they don’t do it!
The answer is, of course, incentives. People care about staying employed and will do pretty much anything that will keep them in that happy state. You need to make sure that you incentivize everyone to do the right thing and don’t incentivize them to do the wrong thing. Unfortunately, many companies seem to specialize in perverse incentives that prompt people to actively not do the right thing. Stories of perverse incentives abound, but every time you merge a ticking time bomb into your code in order to meet an arbitrary deadline, you have fallen victim to one.
At one of my previous employers I was asked to share some thoughts on how to improve the reliability of the control plane of an Internet-facing service. The reliability was well known to be terrible, the pager was on fire, and we constantly threw blood against metal to fix things. Fairly quickly, the discussion came to the question of whether it was better to have a separate reliability team or a reliability engineer in every team with some dotted lines. This is the kind of discussion that is in essence a search for a better plan. But, as my brother-in-law said: Good organizations don’t have a better plan, they execute better on the plan we all know about. I wrote a fairly unpopular comment on that doc that said as much because it really doesn’t matter whether you have a separate team that works with different stakeholders or a distributed team that is embedded with the stakeholders. And it specifically doesn’t matter when your reliability is terrible and things are on fire! In that state the only thing that matters is that you have the organizational willpower to dedicate time and effort to improving the situation. If you do, then either mode can work for you. Instead of worrying about how to structure your future reliability team, how’s about we start worrying about how to fix the problems we have and already know how to solve?
The one-team-versus-virtual-team discussion comes up quite often. Sure, in some organizations one model might work slightly better than the other, but org design is a two-way door and once you get your reliability effort going you can start worrying about how to best organize it given the ground truth for your organization.
So the reliability initiative went nowhere because nobody really cared about reliability. Things were on fire because the code quality was suboptimal and people were constantly racing to deliver on new features against tight (some might say: unreasonable) deadlines. Doing the grunt work for improving reliability didn’t get you promoted and might actually hurt your career chances. I care about reliability, but that doesn’t amount to much in an organization that doesn’t care because I am always careful not to kick a dead whale across the beach.
By the way, of course people said they cared, but I never give two hoots about what anyone says. If you want to know what people actually care about, check what they are doing, and ignore what they are saying.
There really is no secret information anymore. Everybody knows how to design systems, how to write high quality code, how to do code reviews, how to organize reliability engineering teams and how to create budgets for high-performing schools. The search for a better plan is just an admission that you do not know how to execute on the current plan.
Abandon your search for the better plan and start improving execution on the current one!
(We should talk about MCAS next time we meet, as I voted for removing that requirement for graduation and I would love to understand your reasoning.)
On planning vs execution, I think I agree with you in the spirit, but not the letter. In the spirit, no plan survives first contact with reality (yes, this is a paraphrase). And since the now old plan is dead, part of the solid execution is to refine the plan on the go. I agree that stopping and replanning isn’t the way to go.