(Like this article? Read more Wednesday Wisdom!)
One of my favorite psychological experiments is the "five monkeys" experiment. This experiment probably never actually happened, but its lessons are so good that it should have happened.
Pet peeve: “Lessons”, not “learnings”!
The experiment goes like this: A band of scientists put five monkeys in a closed room. The room has a clutch of bananas hanging from the ceiling.
I recently learned that I have apparently always peeled bananas incorrectly, because the experts peel them from the bottom.
Obviously, the monkeys think: “Hmmm, bananas!”. However, as soon as any of them reach for the bananas, the experimenters hose down all the monkeys with cold water. Pretty soon the monkeys figure out what the deal is and from then on they ignore the bananas. Now the experimenters replace one of the monkeys with a brand new one; the new monkey sees the bananas and thinks: "Hmmm, bananas!" Unfortunately, as soon as he or she reaches for the bananas, the other monkeys, who remember the cold water hose, beat up the new monkey. Cruel but effective. After two or three tries, the new monkey learns that apparently we don't touch the bananas around here, though s/he is probably not sure why.
Next, the experimenters replace another one of the original monkeys. Same thing happens: The new monkey is beaten to within an inch of their life as soon as they reach for the bananas. There is a twist though: The other new monkey wholeheartedly joins in the beating! S/he is probably still not sure why they are doing that, but it’s important to be part of a group.
This process is repeated until all the "old" monkeys have been replaced with new ones. There are now no monkeys left in the room who have experienced the cold water hose for themselves, but all of them beat up any monkey who makes a go for the bananas, because they have all learnt that we are not touching the bananas around here.
Remember: There is always money in the banana stand!
This story is so powerful because everyone recognizes this exact behavior in humans. Human groups exhibit all sorts of behaviors where nobody really knows why they are doing them, and nobody knows or remembers what the equivalent of the cold water hose was (or if it was ever there to begin with).
Sometimes these things are just silly but otherwise harmless. For instance, when I started getting into Unix, experienced system administrators told me to always issue the sync command twice before issuing a reboot:
# sync; sync; reboot
When I asked why, nobody was really sure, but that's just what everyone did.
There was an explanation of course: The sync command executes the sync(2) system call, which flushes all dirty buffer cache pages to disk. Naturally you shouldn't reboot the system until that has been done, because if you pull the plug while there are still unwritten changes in memory, that will lead to data loss, file system corruption, or both. According to POSIX.1-2001, the sync(2) system call might be asynchronous. So, depending on the system, when sync(2) returns you have no guarantee that the dirty pages have been written to disk.
Question for the audience: Does Linux guarantee that dirty pages have been written to disk when sync(2) returns?
History is a bit hazy here, but apparently on systems that did an asynchronous sync, you could have only one background write process scheduled and running. On these systems, a call to sync(2) would only return if the sync was successfully scheduled, which means that the second sync would return only if the first one had finished. So on these systems "sync; sync; reboot" made sense because the second sync would only return after the first sync had finished and you knew it was safe to reboot.
Newer Unix implementations either made sync(2) wait until the buffer was entirely flushed or synced the buffer cache as part of the reboot or halt commands. This meant that on newer systems the double sync was not needed anymore, but the behavior stuck.
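For programs (rather than administrators at a shell prompt) that need to know their own data has reached the disk, a common approach is to call fsync(2) on the specific file instead of relying on a system-wide sync(2). The sketch below is only an illustration in Python and is not part of the original folklore: the function name write_durably and the file name example.dat are made up for the example, and it assumes a platform where os.fsync is available. POSIX specifies that fsync() does not return until the transfer has completed or an error is detected; what the storage hardware does with its own caches is outside the scope of this sketch.

```python
import os
import tempfile


def write_durably(path: str, data: bytes) -> int:
    """Write data to path and return only after fsync(2) has returned.

    Hypothetical helper for illustration; returns the number of bytes written.
    """
    with open(path, "wb") as f:
        written = f.write(data)
        f.flush()             # move Python's userspace buffer into the kernel
        os.fsync(f.fileno())  # ask the kernel to write this file's dirty pages out
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "example.dat")
        count = write_durably(target, b"bananas\n")
        print(f"wrote {count} bytes to {target}")
        # os.sync() wraps sync(2) on Unix; whether it waits for the writes
        # to finish depends on the operating system, as described above.
        if hasattr(os, "sync"):
            os.sync()
```

The point of the design is the same as in the story: instead of repeating a ritual and hoping, the program waits on a call whose contract says the work is done.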
Another story: In my first job we would create some big report every month, leading to hundreds of pages of printed paper being sent to some department. At one point our manager asked us why we did that and the answer was that we didn’t know but we had always done that. He told us to stop printing and sending the report and then wait for something to happen. Of course nothing ever happened, nobody complained. To this day I don't know if the original requestor of the report had left or if someone suddenly noticed that the report didn't come anymore but didn't care enough to escalate.
As organizations we do a lot of stuff without really knowing why. When that happens, it always pays off to ask “Why?” and see if anyone knows and can provide an explanation. If not, this might be a good time to stop doing that thing altogether and see what happens.
A lot of the things we do are on checklists of some sort. I am a huge fan of checklists because they are the simplest and cheapest mechanism we have to ensure that we do not forget something important. For instance, while preparing for athletic races, I always use a checklist to make sure that I have gathered all the stuff that I am going to need and do all the stuff that I need to do, like applying the temporary race tattoo (with the bib number).
However, checklists need maintenance. As time moves on, items on the checklist might no longer be as relevant or become entirely useless. Most checklists only grow and items typically get added because something went wrong. Because of that, over time, the items on the list represent the collective failures and mistakes of the group of people using the checklist.
This is obvious in the checklists I run before flying an airplane: Literally every item on that list is there because of some incident.
Sometimes these checklists get so unwieldy that they become an obstacle to velocity. At Google, the combined launch checklists had become so unwieldy that it was getting very hard to launch anything.
This point was hilariously well made by a funny video called "I just want to serve 5 terabytes" (for those who have watched the video: I have Borgmon readability but I never charged peer bonus to use that power :-).
To deal with this we had a project, aptly called "Project Trainwreck", to improve launch velocity, mostly by simplifying launch checklists and removing items that were no longer applicable or that, all things considered, no longer delivered “bang for the buck”.
Of course removing items seen as safety measures is often unpopular because it increases risk. On top of that, most organizations are extremely good at inertia. For that reason, when proposing to stop doing something, you often get pushback.
In one organization I worked for we were always sending some documents to employees in certain circumstances. I was involved with automating the document flow and since the easiest thing to automate is something that doesn’t need to be done, I asked why we were sending these documents at all. "That is a legal requirement", someone answered. I love that kind of pushback… "Do you have a table that maps out this requirement per country, preferably with a link to an article of civil law, a case number, or an article from a collective bargaining agreement?" I asked.
Cue a long and frustrating mail thread in which people tried to convince me this was an important legal requirement and I kept asking them to back that up.
Things got really interesting because the Netherlands was on the list of countries with that "requirement" and I am quite certain that it is not required there, being from the Netherlands and having gotten a degree in law from a Dutch university.
Most commonly, the main reason for staying with current behavior is the risk of change. Surely nobody worried about risk is ever going to agree to changing something that seems to work, albeit slowly. Overcoming that inertia requires a lot of research, stamina, and energy. It requires difficult people who can ask annoying questions to perform that important work. I am a difficult person and I can ask annoying questions, so I am perfectly suited for that job :-)
So if you are wondering why something is being done, ask why. If you are getting pushback, try to figure out if the people who are pushing back actually know what they are talking about and ascertain if they themselves have any experience with the cold water hose and know what it is.
Or is this just something we do around here?
On the other hand, there's "Chesterton's fence" https://en.wikipedia.org/wiki/G._K._Chesterton#Chesterton's_fence -- but that would be problematical in the "five monkeys" experiment.
As to useless reports - at one place I worked, it was standard procedure in the IT group to verify that each report was necessary by periodically not printing it and seeing if anyone complained. (It was also standard procedure to consider each system for rewrite about every five years, on the theory that by that time the system was either not needed or had so many new requirements that a rewrite made more sense than maintenance.)
And a friend from one of the big agencies told me of an audit at a large electric utility where one person's job appeared to consist of receiving each month a large report, which he read and then threw out.
This reminds me of a job many moons ago at a public institution. A small but important (in the rules & regulations sense) process involved passing pieces of paper around among representatives of a couple of different functions (in this case, HR and a set of labor unions.)
HR got a new manager, a sort with some initiative, and soon there was a request to the IT department to make this process no longer involve paper—to create a web version of the workflow.
Easy peasy said and done.
A brief bit of time passes, then in comes a feature request: "please introduce a timeout after which an errand is automatically marked as `closed approved`".
Done.
Years later, I bump into two of the folks who were involved in the specification of this little system, and ask them whether it is still serving its purpose. They look at each other, one shrugs and says "guess so, I haven't looked at it in ages", to which the other agrees.
They had shrewdly unburdened themselves of a meaningless—but apparently required—process by requesting it be automated such that even a token bit of human attention could be elided.