Justin Pombrio

Preventing Log4j with Capabilities

A capability-safe language would have minimized the impact of, or even prevented, the log4j vulnerability.

If this is the first time you’re hearing about that vulnerability, you should go read about it instead of this post! And also go patch it, if you have Java software that uses or might transitively use log4j to log. It’s a doozy.

There are two issues surrounding this vulnerability that I’ll talk about in this post:

  1. It’s doing string interpolation on user-supplied strings.
  2. It’s accessing the network, without anyone realizing it might do that. (This is the part that capabilities would help with.)

The surface issue: String interpolation on user-supplied strings

The first issue is that log4j was performing string replacement on user-supplied strings, not just on the template strings written by the developer. For example, if the developer using log4j writes this:

logger.debug("user-name={}", userName);

Then log4j will substitute userName into {} as it should, but it will also perform string replacements inside userName. So if someone picks the user name {o}${o} because they think it looks like a pair of glasses, then this line of code will attempt to expand ${o} by looking up what o expands to in a developer-provided configuration file.

But that’s nonsensical: a user name is a string, not a logging template. The user who picked {o}${o} probably doesn’t even know how to program. They were not attempting to write a log4j string template; they were drawing a pair of glasses!

Contrast that with this code:

logger.debug("user-name=" + userName);

In this case, logger.debug() has no way of knowing that userName might contain user data: it was handed a single argument, and the first argument is meant to be a template. Thus it is appropriate for it to try to expand ${o} in the log message. Writing this code is a mistake by the developer using log4j, whereas the behavior of the "user-name={}" code is a bug in log4j.

If this bug weren’t present, there would probably be far fewer vulnerable applications in the wild, because many developers using log4j probably did use the form "user-name={}" instead of the form "user-name=" + userName.
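To make the difference between the two orders of operation concrete, here is a minimal sketch of a toy formatter. It is not log4j's actual code: the class ToyLogger, its methods, and the lookups map (standing in for the developer's configuration file) are all made up for illustration.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical toy formatter for illustration; this is not log4j's code.
final class ToyLogger {
    private static final Pattern LOOKUP = Pattern.compile("\\$\\{([^}]*)\\}");
    private final Map<String, String> lookups; // stands in for the developer's config file

    ToyLogger(Map<String, String> lookups) {
        this.lookups = lookups;
    }

    // Replace each ${key} with its configured value; unknown keys are left untouched.
    private String expandLookups(String text) {
        Matcher m = LOOKUP.matcher(text);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String value = lookups.getOrDefault(m.group(1), m.group());
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Fill each {} placeholder with the next argument, copied in verbatim.
    private static String substitute(String template, Object... args) {
        StringBuilder out = new StringBuilder();
        int from = 0;
        for (Object arg : args) {
            int at = template.indexOf("{}", from);
            if (at < 0) {
                break; // more arguments than placeholders
            }
            out.append(template, from, at).append(arg);
            from = at + 2;
        }
        return out.append(template.substring(from)).toString();
    }

    // Expected behavior: only the developer's template is treated as a template.
    public String format(String template, Object... args) {
        return substitute(expandLookups(template), args);
    }

    // The bug described above: user data gets expanded too.
    public String formatBuggy(String template, Object... args) {
        return expandLookups(substitute(template, args));
    }
}
```

If the lookups map sends o to EXPANDED, then format("user-name={}", "{o}${o}") leaves the glasses alone, while formatBuggy turns them into {o}EXPANDED.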

The deeper issue: Why is my logger using the network?

That was one issue. Another, deeper, issue is that log4j was fetching arbitrary Java code off the network and executing it, when no one expected it to.

I have, in my head, a little person who was born and raised in a world where capability-safe software is the default. And this person is yelling. He is yelling:

I hear this vulnerability affected pretty much everyone using log4j. But why did everyone pass the network to the logger? Sure, maybe some people wanted to use JNDI and LDAP or something, but most people didn’t, so why would those people give the logger the network?

The answer, of course, is that no one “gave” the logger access to the network. It just had access, because all code in Java has access to the network. You can tell by these type signatures in the Java net library:

public class URL {
    // Make a new URL. Anyone can do it.
    public URL(String);

    // Turn a URL into a URLConnection. Anyone can do it.
    // (Despite the name, this doesn't actually open the connection,
    //  it just makes a URLConnection object.)
    public URLConnection openConnection();
}

public abstract class URLConnection {
    // Actually open the connection. Anyone can do it.
    public abstract void connect();
}

(Links: new URL, openConnection, connect.)
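Put together, the three calls take only a few lines. This is a minimal sketch using the real java.net classes; the class name AmbientNetwork and its two helper methods are placeholders of mine.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical helper class; only URL and URLConnection are real Java APIs.
final class AmbientNetwork {
    // Any code, in any library, can build a connection object from just a string.
    static URLConnection prepare(String address) throws IOException {
        // Step 1: anyone can make a URL. (This constructor is deprecated since
        // Java 20 in favor of URI.create(address).toURL(), but still works.)
        URL url = new URL(address);
        // Step 2: anyone can make a URLConnection. Nothing is opened yet.
        return url.openConnection();
    }

    // Step 3: actually open the connection. No permission object was needed.
    static void reach(String address) throws IOException {
        prepare(address).connect();
    }
}
```

Nothing in the signature of reach tells a caller that it touches the network, beyond the IOException it may throw.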

By chaining these three methods together, arbitrary Java code can open a connection to any URL it wants to. This may not look strange to you or me, but it looks very problematic to the little capability-person in my head. He is saying:

Wait, these three methods together let you create a network connection from nothing? That’s a violation of the integrity of your type system!

It’s like… say there is an authentication package, that all authentication goes through, and you can try to authenticate a user, and if it passes you’ll get an AuthenticatedUser, and then you can use the AuthenticatedUser to perform more privileged actions.

For this to work well, it’s important that all authentication happens in the authentication package, and that only it can create an AuthenticatedUser. You can do this in Java, by making the constructors for AuthenticatedUser non-public and ensuring that they are only called in the authentication package, and only if the authentication succeeds. This can be a very useful abstraction in a large codebase: it tells you that (certain kinds of) authentication bugs can only happen inside the authentication package.

And this abstraction breaks if random code can conjure up an AuthenticatedUser from nothing, and use it to perform privileged actions.

Likewise, you shouldn’t be able to conjure up a network connection from nothing. Any network connection must ultimately originate from the Network object.

[…listening…]

Oh, you don’t have a Network object? So any random library code can just access the network, on its own. And this is true not only of your dependencies, but the dependencies of your dependencies. So the only way to check if your application might transitively access the network would be to like… search through the source code of all of your transitive dependencies? Wow. Just wow. And you’re wondering why you have so many vulner—

Let’s cut off my imaginary capabilities-person there; he’s getting a little snarky.

What he’s making fun of us for not having is a different API that looks like this (or something like it; there are a lot of ways to organize it):

public class URL {
    // Make a new URL. Anyone can do it.
    public URL(String);
}

// The ultimate source of all network access.
// This class is a capability. It grants access to all URLs.
class Network {
    // Turn a URL into a URLConnection.
    // You can only do this if you have a Network object.
    // Once it's done, the URLConnection grants the capability to open the
    // connection to that url.
    public URLConnection openConnection(URL);
}

// This class is a capability. It grants access to one particular URL.
abstract class URLConnection {
    // Actually open the connection.
    public abstract void connect();
}

This raises the question: who can construct a Network object? If anyone can just make one, then nothing substantial has changed. The log4j package (or really, its JNDI dependency) would privately construct a Network and otherwise do the same thing.

The trick is, there are no constructors for Network. Instead, there is exactly one Network object ever in existence, and it is passed in to the program at one location, perhaps as an argument to main. (Actually, if the operating system were capability-safe and Java were cooperating with it, then main would be given a Network if and only if the executable was given network access.) And likewise for similar capabilities like a Filesystem object:

public static void main(
    String[] args,
    Network network,
    Filesystem filesystem) {
      ...
}

The point is to use unforgeable Java objects to grant capabilities. Unforgeable means that arbitrary code can’t create one from nothing; in Java, this can be accomplished by giving the class no accessible constructors. If you pass some code a reference to one of these capability objects, directly or indirectly, you are granting it access to the resource it represents. This is the essence of capabilities: unforgeable objects that grant access to the resource they represent. It’s very simple.
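Here is a rough sketch of what such unforgeable objects could look like. Everything in it is hypothetical: the network is simulated by a list of URLs, and issuedByRuntime stands in for the language runtime handing the one Network to main. Plain Java does not enforce any of this, since the real java.net classes (and reflection) remain available to all code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical capability for the whole network. Connections are simulated by
// recording URLs in a list, so this sketch never touches a real socket.
final class Network {
    private final List<String> opened = new ArrayList<>();

    // Private constructor: code outside this class cannot write `new Network()`.
    private Network() {}

    // Assumption: stands in for the language runtime, which would create the
    // single Network and hand it to main. Plain Java has no such mechanism.
    static Network issuedByRuntime() {
        return new Network();
    }

    // Attenuation: derive a narrower capability for one particular URL.
    public Connection openConnection(String url) {
        return new Connection(this, url);
    }

    public List<String> openedUrls() {
        return List.copyOf(opened);
    }

    // Capability for a single URL; it can only be obtained from a Network.
    static final class Connection {
        private final Network network;
        private final String url;

        private Connection(Network network, String url) {
            this.network = network;
            this.url = url;
        }

        public void connect() {
            network.opened.add(url);
        }
    }
}

// Needs one URL, so its constructor asks for exactly that capability.
final class UpdateChecker {
    private final Network.Connection connection;

    UpdateChecker(Network.Connection connection) {
        this.connection = connection;
    }

    void check() {
        connection.connect();
    }
}

// Receives no capability, so it has no path to the network at all.
final class QuietLogger {
    private final List<String> lines = new ArrayList<>();

    void info(String message) {
        lines.add(message);
    }

    List<String> lines() {
        return List.copyOf(lines);
    }
}
```

The code holding the Network decides who gets what: UpdateChecker is handed a Connection for a single URL, and QuietLogger is handed nothing. You can see which components might use the network just by reading their constructor signatures.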

Let’s see how capability safety would influence log4j. First off, here’s what log4j’s interface looks like currently:

import org.apache.log4j.Logger;

public class Incrementer {
    private static final Logger LOGGER
      = Logger.getLogger(Incrementer.class);

    public int increment(int number) {
        LOGGER.info("Adding one");
        return number + 1;
    }
}

Notice that Network isn’t passed to LOGGER. In a capability-safe Java, that means log4j can’t access the network! So when the log4j maintainers considered implementing the JNDI feature that introduced the vulnerability, a few things could have happened next:

  1. They decide that it’s totally reasonable for a logging library to access the network all the time by default, and add a Network argument to the Logger.getLogger() method. Some users accept this and fall prey to the vulnerability, but more discerning users are concerned by this request for network access and switch to a simpler logging library that doesn’t require it, thus avoiding the vulnerability.
  2. They don’t think a logging library should be accessing the network at all, or feel that requiring a Network parameter would be frightening or inconvenient to users, and reject the feature.
  3. They think that the feature is worthwhile, but don’t want the breaking API change of modifying getLogger to require a Network. So instead they introduce a new method, perhaps called Logger.getLoggerWithNetwork(MyClass.class, network). Since most users don’t use this method and log4j can’t access the network without it, this prevents the great majority of vulnerabilities.

All three possibilities are better than what actually happened, which was that log4j suddenly gained the ability to access the network, but its API did not change to reflect this, so users did not notice. Thus, a capability-safe language would have saved, or at least mitigated, the day.
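For the third possibility, the API might look something like the sketch below. To be clear, none of this exists: log4j has no getLoggerWithNetwork method, and NetAccess, SketchLogger, and the ${remote:...} lookup syntax are names I made up for illustration.

```java
// Hypothetical stand-in for the Network capability; not a real Java or log4j type.
interface NetAccess {
    String fetch(String url);
}

// Hypothetical logger sketch; log4j has no getLoggerWithNetwork method.
final class SketchLogger {
    private static final String PREFIX = "${remote:"; // made-up lookup syntax
    private final String name;
    private final NetAccess network; // null means no capability was granted

    private SketchLogger(String name, NetAccess network) {
        this.name = name;
        this.network = network;
    }

    // The common case: this logger never holds a network capability.
    public static SketchLogger getLogger(Class<?> owner) {
        return new SketchLogger(owner.getSimpleName(), null);
    }

    // Opt-in: the caller visibly hands over network access.
    public static SketchLogger getLoggerWithNetwork(Class<?> owner, NetAccess network) {
        return new SketchLogger(owner.getSimpleName(), network);
    }

    // Render a template. The first ${remote:URL} lookup is resolved only if
    // this logger was given the capability; otherwise it stays literal text.
    public String render(String template) {
        int start = template.indexOf(PREFIX);
        int end = start < 0 ? -1 : template.indexOf('}', start);
        if (end < 0 || network == null) {
            return name + ": " + template;
        }
        String url = template.substring(start + PREFIX.length(), end);
        String fetched = network.fetch(url);
        return name + ": " + template.substring(0, start) + fetched + template.substring(end + 1);
    }
}
```

A logger obtained from getLogger has no way to fetch anything, no matter what ends up in the message, because it was never handed a NetAccess.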

So future language designers, please consider making your language capability-safe.

I’m not actually sure what’s a good reference for more reading, besides Mark Miller’s thesis if you really want to get into it. But here are some possibilities:

December 26, 2021