Outside the world of open-source software, it’s likely few people would have heard about XZ Utils, a small but widely used tool for data compression in Linux systems. But late last week, security experts uncovered a serious and deliberate flaw that could leave networked Linux computers susceptible to malicious attacks.
The flaw has since been confirmed as a critical issue that could allow a knowledgeable hacker to gain control over vulnerable Linux systems. Because Linux is used throughout the world in email and web servers and application platforms, this vulnerability could have given the attacker silent access to vital information held on computers throughout the world – potentially including the device you’re using right now to read this.
Major software vulnerabilities, such as the SolarWinds hack and the Heartbleed bug, are nothing new – but this one is very different.
The XZ Utils hack attempt took advantage of the way open-source software development often works. Like many open-source projects, XZ Utils is a crucial and widely used tool – and it is maintained largely by a single volunteer, working in their spare time. This system has created huge benefits for the world in the form of free software, but it also carries unique risks.
Open source and XZ Utils
First of all, a brief refresher on open-source software. Most commercial software, such as the Windows operating system or the Instagram app, is “closed-source” – which means nobody except its creators can read or modify the source code. By contrast, with “open-source” software, the source code is openly available and people are free to do what they like with it.
Open-source software is very common, particularly in the “nuts and bolts” of software which consumers don’t see, and hugely valuable. One recent study estimated the total value of open-source software in use today at US$8.8 trillion.
Until around two years ago, the XZ Utils project was maintained by a developer called Lasse Collin. Around that time, an account using the name Jia Tan submitted an improvement to the software.
Not long after, some previously unknown accounts popped up to report bugs and submit feature requests to Collin, putting pressure on him to take on a helper in maintaining the project. Jia Tan was the logical candidate.
Over the next two years, Jia Tan became more and more involved and, we now know, introduced a carefully hidden weapon into the software’s source code.
The revised code secretly alters another piece of software, a ubiquitous network security tool called OpenSSH, so that it passes malicious code to a target system. As a result, a specific intruder will be able to run any code they like on the target machine.
The latest version of XZ Utils, containing the backdoor, was set to be included in popular Linux distributions and rolled out across the world. However, it was caught just in time when a Microsoft engineer investigated some minor memory irregularities on his system.
A rapid response
What does this incident mean for open-source software? Well, despite initial appearances, it doesn’t mean open-source software is insecure, unreliable or untrustworthy.
Because all the code is available for public scrutiny, developers around the world could rapidly begin analysing the backdoor and the history of how it was implemented. These efforts could be documented, distributed and shared, and the specific malicious code fragments could be identified and removed.
A response on this scale would not have been possible with closed-source software.
An attacker would need to take a somewhat different approach to target a closed-source tool, perhaps by posing as a company employee for a long period and exploiting the weaknesses of the closed-source software production system (such as bureaucracy, hierarchy, unclear reporting lines and poor knowledge sharing).
However, if they did achieve such a backdoor in proprietary software, there would be no chance of large-scale, distributed code auditing.
Lessons to be learned
This case is a valuable opportunity to learn about weaknesses and vulnerabilities of a different sort.
First, it demonstrates the ease with which online relations between anonymous users and developers can become toxic. In fact, the attack depended on the normalisation of these toxic interactions.
The social engineering part of the attack appears to have used anonymous “sockpuppet” accounts to guilt-trip and emotionally coerce the lead maintainer into accepting minor, seemingly innocuous code additions over a period of years, pressuring them to cede development control to Jia Tan.
One user account complained:
You ignore the many patches bit rotting away on this mailing list. Right now you choke your repo.
When the developer professed mental health issues, another account chided:
I am sorry about your mental health issues, but its important to be aware of your own limits.
Individually, such comments might appear innocuous, but in concert they become a mob.
We need to help developers and maintainers better understand the human aspects of coding, and the social relationships that affect, underpin or dictate how distributed code is produced. There is much work to be done, particularly to improve the recognition of the importance of mental health.
A second lesson is the importance of recognising “obfuscation”, a process often used by hackers to make software code and processes difficult to understand or reverse-engineer. Many universities do not teach this as part of a standard software engineering course.
Third, some systems may still be running the dangerous versions of XZ Utils. Many popular smart devices (such as refrigerators, wearables and home automation tools) run on Linux. These devices often reach an age at which it is no longer financially viable for their manufacturers to update their software – meaning they do not receive patches for newly discovered security holes.
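As a rough illustration of how an administrator might check for the affected releases, here is a minimal Python sketch. It assumes the backdoored releases are 5.6.0 and 5.6.1, the versions publicly reported under CVE-2024-3094, and that `xz --version` prints a line such as `xz (XZ Utils) 5.4.1`. The function names are hypothetical, chosen only for this example. A version string alone is not conclusive, because distributions may ship patched or reverted builds under other version labels.

```python
import re
import subprocess
from typing import Optional

# Assumption: the releases publicly reported as backdoored (CVE-2024-3094).
BACKDOORED_VERSIONS = {"5.6.0", "5.6.1"}


def parse_xz_version(version_output: str) -> Optional[str]:
    """Return the first x.y.z version number found in `xz --version` output."""
    match = re.search(r"\b(\d+\.\d+\.\d+)", version_output)
    return match.group(1) if match else None


def is_backdoored(version: Optional[str]) -> bool:
    """True only if the version is one of the reported backdoored releases."""
    return version in BACKDOORED_VERSIONS


def check_local_xz() -> str:
    """Run the locally installed xz and describe what was found."""
    try:
        result = subprocess.run(
            ["xz", "--version"], capture_output=True, text=True, check=True
        )
    except (FileNotFoundError, subprocess.CalledProcessError):
        return "xz was not found or could not be run"
    version = parse_xz_version(result.stdout)
    if version is None:
        return "could not determine the xz version"
    if is_backdoored(version):
        return f"xz {version} is a reported backdoored release; update or downgrade"
    return f"xz {version} is not one of the reported backdoored releases"
```

Calling `print(check_local_xz())` on a machine gives a one-line summary. On embedded devices without a shell or a Python interpreter, the same check would have to be done from the vendor's firmware information instead.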
And finally, whoever is behind the attack – some have speculated it may be a state actor – has had free access to a variety of codebases over a two-year period, perpetrating a careful and patient deception. Even now, that adversary will be learning from how system administrators, Linux distribution producers and codebase maintainers are reacting to the attack.
Where to from here?
Code maintainers around the world are now thinking about their vulnerabilities at a strategic and tactical level. It is not only their code itself they will be worrying about, but also their code distribution mechanisms and software assembly processes.
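One basic check on a distribution mechanism is to confirm that a downloaded release file matches the checksum its publisher lists. The Python sketch below shows the idea using SHA-256. The function names are hypothetical, and the expected digest is assumed to come from a source you already trust. A matching checksum only shows the file is the one the publisher released. It would not, on its own, have caught the XZ Utils backdoor, because the malicious release files were published by a trusted maintainer and differed from the public source repository.

```python
import hashlib
import hmac


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected_hex: str) -> bool:
    """Compare a file's digest with a published one (case-insensitive)."""
    actual = sha256_of_file(path)
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(actual, expected_hex.strip().lower())
```

A stronger check for cases like this one is to rebuild the release archive from the public repository and compare the two, which is the kind of software assembly audit maintainers are now considering.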
My colleague David Lacey, who runs the not-for-profit cybersecurity organisation IDCARE, often reminds me the situation facing cybersecurity professionals is well articulated by a statement from the IRA. After its 1984 bombing of the Grand Hotel in Brighton failed to kill the British prime minister, Margaret Thatcher, the terrorist organisation chillingly claimed:
Today we were unlucky, but remember we only have to be lucky once. You will have to be lucky always.
Sigi Goode, Professor of Information Systems, Australian National University. This article is republished from The Conversation under a Creative Commons license. Read the original article.
20 Comments
This is a great article. All software whether in the cloud, on servers, commercial or open source will have bugs and likely security flaws...
A bigger worry in my mind is that AI is being taught to code. It is only a matter of time until our AI makes a mistake (unchecked) or is competing with your AI to hack and fix key software. And it will do so far better without humans.
Given the increasing proliferation of intelligent cars, infrastructure and banking platforms, and how software now runs finance, communications, militaries and JIT supply chains for fuel, food etc., a global attack on core systems would leave entire countries paralyzed, with citizens unable to access food, oil or energy.
Will be an interesting few years.
"An attacker would need to take a somewhat different approach to target a closed-source tool, perhaps by posing as a company employee for a long period and exploiting the weaknesses of the closed-source software production system (such as bureaucracy, hierarchy, unclear reporting lines and poor knowledge sharing)."
Or more likely, foisting it upon them using legal means in the case of a state actor.
On the subject of open source vectors, a more entertaining example is the use of 'hallucinated' dependencies made real. Apparently ChatGPT and the like have been making up names of dependencies when presenting code. An enterprising researcher noticed that the same imaginary dependencies were being used again and again, so as an experiment, they went ahead and created an actual package with one of the made up names and then watched as it got picked up by lazy developers copying code. Alibaba was one place it ended up.
What a fascinating article. To think that manipulating one person could compromise OpenSSH - wow. I disagree that this reflects badly on open source. Open source has a conjecture and criticism mechanism by design. Multiple weaknesses were identified, and now hardening countermeasures can be implemented. Had this happened on a closed source project, it's likely that nobody would have even noticed.
Well for those in tech the risk has been well known and old hat that developing & maintaining the most highly used library dependencies is often unpaid volunteer work and that checking dependency updates does not happen in almost all use cases because that work is also often unpaid.
It has been well known that unless you have the skills to understand any dependency libraries the use of them blindly is a bit of an issue because the task of checking is then handed off to a community that is unlikely to also have the time and has little in the matter of trust. The risk just increases but not because of open source (at least then the faults have more potential to be caught by the people working for free instead of closed source) but because no developer at any stage is paid to spend the time checking references and updates. They just use them, often blindly, often with recommendations from sources you can barely trust.
As another commenter said ChatGPT is likely to increase the risk of blind, sometimes maliciously directed, use. The AI has even less understanding of source checking and often will spit out complete bollocks (especially since most AI code is complete rubbish).
Here's a paywalled NY Times story on the matter.
The average person, nay, the average programmer, would only be able to exploit this vulnerability in their wildest dreams. And among the very few with the skills that could, most couldn't be bothered. And among those left that could, they'll still need to devise a way to exploit it without being caught - which is actually pretty hard too ....
... Meanwhile the average person turns to ChatGPT and trusts the answers as truth.
Hmm ... What should we be more afraid of?
Apparently the attacker baked his public key into ssh and therefore only the attacker using his corresponding private key could actually exploit the vulnerability. But holy crap, they'd have root access. Super clever how they did it. They injected encrypted and compressed binary data into the source code via the xz.utils makefile. Nobody could’ve seen the hack by looking at the source code. Who bothers to check the makefile.
"Who bothers to check the makefile."
Depending on where the distribution came from, and/or its intended purpose, we always did. Ditto all config files prior to compilation & use. (How thorough the checks were tho is down to the skills of the individuals actually doing them and what time pressures they are under. I seriously doubt this one would have been found except by the best and paranoid.)
https://linuxunplugged.com/556 - The xz Backdoor Exposed
For more discussion of the hack and how cunning it was. Sounds like pure luck the guy that found it happened to be curious enough to track it down.
highly relevant prediction in a single frame visual
Also a good visual timeline and description of the attack mechanisms. It goes back years, and it was only by luck that someone checking a random performance issue managed to find it. They were never paid to check the dependencies they had automatically included by default.
https://cdn.arstechnica.net/wp-content/uploads/2024/04/xz-backdoor-grap…
"We are lucky that this was detected and that some competent people have moved in to analyse. I presume more analysis is still being done. (Thanks to them.)
What we don't know is how many other similar attacks have been deployed, and how many are in preparation.
What can decent people do to reduce our risks?
Thoughts that come to mind:
1. Some of those who do this, often thankless, maintenance work might like more support. This might be financial contributions, or people (who know what they're doing) reviewing code submissions. Those who incorporate these libraries into their own programs (and all users) should maybe think about this. If there were a "donate to the maintainers" button on a story like this, that would convert the immediate story into something of greater value, if the maintainer would like that.
2. Some of the maintainers might appreciate recognition. Some won't, but worth considering.
3. Some who use the libraries can improve the checking they do.
4. Unpleasant people who harass maintainers should be detected and treated appropriately." -MikeGale