The Log4j lessons: If it ain’t broke, fix it now!

by Lieuwe Jan Koning and Rob Maas


A blog series with the title The Log4j lessons might suggest that the fallout of the Log4j vulnerability is mostly behind us. Indeed, since the end of 2021, there has been a tremendous effort from technology vendors, SOCs and IT departments to mitigate this threat. But given the widespread usage of this open-source logging library and the well-publicized ease of the attack, it’s highly unlikely that we’ve heard the last of Log4j in 2022.

Why did Log4j have such an impact?

Although Log4j and its follow-up attack vectors are still a real threat for many organizations, it’s certainly not too early to draw lessons from this episode. What made Log4j different from other ‘classic’ 2021 vulnerabilities such as Citrix, Kaseya, and Hafnium (Exchange) is that it was much harder for organizations to determine whether they were vulnerable. It’s relatively easy to know if you’re using a particular version of Citrix NetScaler or Microsoft Exchange, but Log4j is not a product in itself. It is an open-source logging utility that was first released in 2001.

Because it’s open-source and convenient, Log4j is used under the hood of literally tens of thousands of applications, many of which are commercial software packages. In the first days after the publication of the Log4j CVE, the most pressing question for many IT departments was: where is Log4j in my infrastructure?
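As a first, partial answer to that question, the sketch below walks a directory tree and flags Java archives that bundle the Log4j 2 class behind the JNDI lookup. It is a minimal illustration, not a complete scanner: the function names and the nesting limit are our own assumptions, a hit is only a candidate for further investigation (patched releases still ship the class), and it cannot replace the advisories from your vendors.

```python
import io
import zipfile
from pathlib import Path

# Class in log4j-core that implements the JNDI lookup abused by CVE-2021-44228.
JNDI_CLASS = "org/apache/logging/log4j/core/lookup/JndiLookup.class"
ARCHIVE_SUFFIXES = (".jar", ".war", ".ear")


def archive_contains_jndi_lookup(source, depth=0, max_depth=3):
    """Return True if the archive bundles JndiLookup.class.

    `source` is a path or a file-like object. Nested archives (fat JARs,
    WAR files) are opened in memory up to `max_depth` levels deep; that
    limit is an arbitrary choice for this sketch.
    """
    try:
        with zipfile.ZipFile(source) as archive:
            for name in archive.namelist():
                if name.endswith(JNDI_CLASS):
                    return True
                if depth < max_depth and name.lower().endswith(ARCHIVE_SUFFIXES):
                    nested = io.BytesIO(archive.read(name))
                    if archive_contains_jndi_lookup(nested, depth + 1, max_depth):
                        return True
    except (zipfile.BadZipFile, OSError, RuntimeError):
        # Unreadable, corrupt, or encrypted archives are skipped, not reported.
        return False
    return False


def find_candidates(root):
    """Yield archive paths under `root` that bundle the JndiLookup class."""
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in ARCHIVE_SUFFIXES:
            if archive_contains_jndi_lookup(path):
                yield path
```

Calling `find_candidates("/opt")` (the path is only an example) yields the archives worth a closer look. Software that embeds the library in another form, such as appliance firmware, stays invisible to a file scan like this, which is exactly why vendor communication remained essential.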

Even organizations that had a comprehensive and up-to-date asset inventory of all software and utilities in use faced the same problem as their colleagues without such an inventory: in the end, you had to wait for official communication from your software and hardware vendors to find out whether they used the vulnerable versions of the library. And if they did, you had to apply a patched version of their software or firmware as soon as it became available.

Take “If it ain’t broke, don’t fix it” with a grain of salt

And that’s where our first Log4j lesson surfaces. Installing a new software update is traditionally a task that poses challenges.

Backup integrity, testing the new version, minimizing downtime, and the ever-present fear of introducing unforeseen problems by installing a new software version are just some of the issues related to software updates.

Now imagine installing that critical Log4j patch in a critical e-commerce application on Christmas Eve at 11:30 PM, knowing full well that the application has not been updated for a couple of years because of compatibility issues with your ERP system. The issue never rose high enough on your priority list, but on this particular evening it could be of critical importance.

Wouldn’t you wish you had applied earlier patches and tackled the resulting problems at a time when you were under less pressure, and had some spare time (and staff around) in case things escalated into serious trouble?

Would you feel confident with a surgeon who had practiced a relatively easy procedure just once or twice before? Of course not. You would prefer someone who had done the procedure hundreds of times, and who had learned from all the things that can and do go wrong.

Although the necessity of applying updates and patches as soon as possible never seems to be disputed, the adage “if it ain’t broke, don’t fix it” is equally popular in practice.

Usually, the practical consequences of running a few minor or even major versions behind are limited, especially if the bug fixes are minor. Until the moment comes when you HAVE to patch, because it’s the only way to mitigate a serious threat. That’s the worst time to find out that, for instance, you cannot update to a new version because you will be breaking something else, such as an outdated API dependency or a feature that was skipped a few versions back.

Patching excellence is key

What Log4j has taught us (again) is that we need to become good at patching: get used to it, and build the confidence that we can tackle the unforeseen consequences of patching and upgrading.

We shouldn’t look at patching with disdain, maybe unconsciously blaming the vendor or developers for the “repairs” that should not have been there in the first place. Patching is a natural and crucial part of software development, and we need to build a reliable patching machine with a perfect cold start and enough reliability and predictability to counter and mitigate things that get broken (they will). Fix it, even when it ain’t broke!

At ON2IT we try to practice what we preach by rigorously deploying new versions with automated recipes and manifests. This takes the guesswork out of updating, making updating major or minor versions as consistent, painless, and easy as possible.

This approach pays off especially in containerized cloud environments, but it works equally well in traditional on-prem infrastructures. Patching excellence is a key requirement for agile development. It’s also important to remember that the operational side of updating applications should be embedded in IT service management rather than being driven mainly by cybersecurity.

Address vulnerabilities as they become known

In another Log4j lessons blog post we’ll talk about vulnerability management, but it’s important to stress that patch hygiene is not only sensible in a cybersecurity context; it yields many other advantages as well.

However, it is a critical skill to address vulnerabilities as they become known. Shortening the time window between the disclosure of a new zero-day and the installation of the mitigating patch is probably one of the most critical factors in lowering risk and impact.
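One way to make that time window visible is simply to measure it per system. The sketch below is an illustration under our own assumptions: the function name, the system names, and all dates are hypothetical and do not describe any real environment.

```python
from datetime import date


def exposure_days(disclosed, patched, today):
    """Days a system was (or still is) exposed after disclosure.

    `patched` is None while the fix has not been installed yet, in which
    case the window is still growing and is measured up to `today`.
    """
    end = patched if patched is not None else today
    return max((end - disclosed).days, 0)


# Hypothetical example data: system name -> date the fix was installed.
disclosed = date(2022, 1, 1)
today = date(2022, 1, 10)
patch_dates = {
    "webshop": date(2022, 1, 4),
    "erp-connector": None,  # still unpatched
}

for system, patched in patch_dates.items():
    print(system, exposure_days(disclosed, patched, today))
```

Tracking this number over time shows whether the patch machine is actually getting faster, and which systems keep lagging behind.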

We all know this, but Log4j has shown us again that we don’t always act accordingly. So, embrace patching and fix that patch machine!

Lieuwe Jan Koning, CTO and Rob Maas, Lead Architect