Skip to content

Fix bug #80602 Segfault when using DOMChildNode::before() #10682

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions NEWS
Original file line number Diff line number Diff line change
@@ -948,6 +948,9 @@ PHP NEWS
. Fixed bug #81433 (DOMElement::setIdAttribute() called twice may remove ID).
(Viktor Volkov)

. Fixed bug #80602 (Segfault when using DOMChildNode::before()).
(Nathan Freeman)

- FFI:
. Fixed bug #79576 ("TYPE *" shows unhelpful message when type is not
defined). (Dmitry)
87 changes: 78 additions & 9 deletions ext/dom/parentnode.c
Original file line number Diff line number Diff line change
@@ -181,6 +181,15 @@ xmlNode* dom_zvals_to_fragment(php_libxml_ref_obj *document, xmlNode *contextNod
return NULL;
}

/*
* xmlNewDocText function will always returns same address to the second parameter if the parameters are greater than or equal to three.
* If it's text, that's fine, but if it's an object, it can cause invalid pointer because many new nodes point to the same memory address.
* So we must copy the new node to avoid this situation.
*/
if (nodesc > 1) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might need a comment here to explain why the copy is necessary, with a reference to the libxml docs talking about the merging of TEXT nodes (in which case our node is freed). I also wonder if we can't check the type of the node to avoid some unnecessary copies.

Copy link
Contributor Author

@NathanFreeman NathanFreeman Mar 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example:

<?php 
declare(strict_types=1);

$doc = new \DOMDocument();
$doc->loadXML('<a>foo<last/></a>');
$target = $doc->documentElement->lastChild;
$target->before('bar', $doc->documentElement->firstChild, 'baz');

After debugging, I found that the xmlNewDocText function will always returns same address to the second parameter if the parameters are greater than or equal to three.
If it's text, that's fine, but if it's an object, it can cause invalid pointer because many new nodes point to the same memory address.
So we must copy the new node to avoid this situation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for explaining. Could you please add a short comment in the code above the if condition (may be even one line only referencing this PR comment) ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added comment in the code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes, I was just busy reviewing and extensively testing this right now. :)

newNode = xmlCopyNode(newNode, 1);
}

if (!xmlAddChild(fragment, newNode)) {
xmlFree(fragment);

@@ -302,7 +311,9 @@ void dom_parent_node_after(dom_object *context, zval *nodes, int nodesc)
{
xmlNode *prevsib = dom_object_get_node(context);
xmlNodePtr newchild, parentNode;
xmlNode *fragment;
xmlNode *fragment, *nextsib;
xmlDoc *doc;
bool afterlastchild;

int stricterror = dom_get_strict_error(context->document);

@@ -311,7 +322,10 @@ void dom_parent_node_after(dom_object *context, zval *nodes, int nodesc)
return;
}

doc = prevsib->doc;
parentNode = prevsib->parent;
nextsib = prevsib->next;
afterlastchild = (nextsib == NULL);
fragment = dom_zvals_to_fragment(context->document, parentNode, nodes, nodesc);

if (fragment == NULL) {
@@ -321,13 +335,42 @@ void dom_parent_node_after(dom_object *context, zval *nodes, int nodesc)
newchild = fragment->children;

if (newchild) {
fragment->last->next = prevsib->next;
prevsib->next = newchild;
// first node and last node are both both parameters to DOMElement::after() method so nextsib and prevsib are null.
if (!parentNode->children) {
prevsib = nextsib = NULL;
} else if (afterlastchild) {
/*
* The new node will be inserted after last node, prevsib is last node.
* The first node is the parameter to DOMElement::after() if parentNode->children == prevsib is true
* and prevsib does not change, otherwise prevsib is parentNode->last(first mode).
*/
prevsib = parentNode->children == prevsib ? prevsib : parentNode->last;
} else {
/*
* The new node will be inserted after first node, prevsib is first node.
* The first node is not the parameter to DOMElement::after() if parentNode->children == prevsib is true
* and prevsib does not change otherwise prevsib is null to mean that parentNode->children is the new node.
*/
prevsib = parentNode->children == prevsib ? prevsib : NULL;
}

newchild->prev = prevsib;
if (prevsib) {
fragment->last->next = prevsib->next;
if (prevsib->next) {
prevsib->next->prev = fragment->last;
}
prevsib->next = newchild;
} else {
parentNode->children = newchild;
if (nextsib) {
fragment->last->next = nextsib;
nextsib->prev = fragment->last;
}
}

newchild->prev = prevsib;
dom_fragment_assign_parent_node(parentNode, fragment);
dom_reconcile_ns(prevsib->doc, newchild);
dom_reconcile_ns(doc, newchild);
}

xmlFree(fragment);
@@ -337,10 +380,15 @@ void dom_parent_node_before(dom_object *context, zval *nodes, int nodesc)
{
xmlNode *nextsib = dom_object_get_node(context);
xmlNodePtr newchild, prevsib, parentNode;
xmlNode *fragment;
xmlNode *fragment, *afternextsib;
xmlDoc *doc;
bool beforefirstchild;

doc = nextsib->doc;
prevsib = nextsib->prev;
afternextsib = nextsib->next;
parentNode = nextsib->parent;
beforefirstchild = !prevsib;
fragment = dom_zvals_to_fragment(context->document, parentNode, nodes, nodesc);

if (fragment == NULL) {
@@ -350,19 +398,40 @@ void dom_parent_node_before(dom_object *context, zval *nodes, int nodesc)
newchild = fragment->children;

if (newchild) {
// first node and last node are both both parameters to DOMElement::before() method so nextsib is null.
if (!parentNode->children) {
nextsib = NULL;
} else if (beforefirstchild) {
/*
* The new node will be inserted before first node, nextsib is first node and afternextsib is last node.
* The first node is not the parameter to DOMElement::before() if parentNode->children == nextsib is true
* and nextsib does not change, otherwise nextsib is the last node.
*/
nextsib = parentNode->children == nextsib ? nextsib : afternextsib;
} else {
/*
* The new node will be inserted before last node, prevsib is first node and nestsib is last node.
* The first node is not the parameter to DOMElement::before() if parentNode->children == prevsib is true
* but last node may be, so use prevsib->next to determine the value of nextsib, otherwise nextsib does not change.
*/
nextsib = parentNode->children == prevsib ? prevsib->next : nextsib;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: as far as I can see, prevsib->next will always equal nextsib. So that means this check and assignment is redundant, no?

Copy link
Contributor Author

@NathanFreeman NathanFreeman Mar 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// ext/dom/parentnode.c:170
if (newNode->parent != NULL) {
	xmlUnlinkNode(newNode);
}

xmlUnlinkNode will detach the relation between with parent node and child node.

if (!parentNode->children) {
    nextsib = NULL;
}

It mean first node and child node are both parameters to DOMElement::before() method.
So nextslib is null.

if (beforefirstchild) {
	nextsib = parentNode->children == nextsib ? nextsib : afternextsib;
}

It mean that the new node will insert before $doc->documentElement->firstChild.
Let's discuss two possible scenarios.

  1. nextsib does not change means that $doc->documentElement->firstChild is not the parameter to DOMElement::before() method.
  2. nextsib = afternextsib means that $doc->documentElement->firstChild is the parameter to DOMElement::before() method and $doc->documentElement->firstChild will removed from node tree.

It means that the new node will insert before $doc->documentElement->lastChild if neither of the above conditions is satisfied.
Let's discuss two possible scenarios too.

  1. If parentNode->children == prevsib is true, it means that $doc->documentElement->firstChild is not the parameter to DOMElement::before() method so nextsib = prevsib->next.
    The reason we use prevsib->next is that $doc->documentElement->lastChild may be a parameter to DOMElement::before() and $doc->documentElement->lastChild will removed from node tree.
    In this case, prevsib->next is NULL.

  2. If parentNode->children == prevsib is false, it means that $doc->documentElement->firstChild is the parameter to DOMElement::before() method so nextsib does not change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay that seems to make sense, thanks for explaining.

}

if (parentNode->children == nextsib) {
parentNode->children = newchild;
} else {
prevsib->next = newchild;
}

fragment->last->next = nextsib;
nextsib->prev = fragment->last;
if (nextsib) {
nextsib->prev = fragment->last;
}

newchild->prev = prevsib;

dom_fragment_assign_parent_node(parentNode, fragment);

dom_reconcile_ns(nextsib->doc, newchild);
dom_reconcile_ns(doc, newchild);
}

xmlFree(fragment);
178 changes: 178 additions & 0 deletions ext/dom/tests/bug80602.phpt
Original file line number Diff line number Diff line change
@@ -0,0 +1,178 @@
--TEST--
Bug #80602 (Segfault when using DOMChildNode::before())
--FILE--
<?php
declare(strict_types=1);

$doc = new \DOMDocument();
$doc->loadXML('<a>foo<last/></a>');
$target = $doc->documentElement->firstChild;
$target->before($target);
echo $doc->saveXML($doc->documentElement).PHP_EOL;

$doc = new \DOMDocument();
$doc->loadXML('<a>foo<last/></a>');
$target = $doc->documentElement->lastChild;
$target->before($target);
echo $doc->saveXML($doc->documentElement).PHP_EOL;


$doc = new \DOMDocument();
$doc->loadXML('<a>foo<last/></a>');
$target = $doc->documentElement->firstChild;
$target->before($doc->documentElement->lastChild);
echo $doc->saveXML($doc->documentElement).PHP_EOL;

$doc = new \DOMDocument();
$doc->loadXML('<a>foo<last/></a>');
$target = $doc->documentElement->lastChild;
$target->before($doc->documentElement->firstChild);
echo $doc->saveXML($doc->documentElement).PHP_EOL;


$doc = new \DOMDocument();
$doc->loadXML('<a>foo<last/></a>');
$target = $doc->documentElement->firstChild;
$target->before($target, $doc->documentElement->lastChild);
echo $doc->saveXML($doc->documentElement).PHP_EOL;

$doc = new \DOMDocument();
$doc->loadXML('<a>foo<last/></a>');
$target = $doc->documentElement->firstChild;
$target->before($doc->documentElement->lastChild, $target);
echo $doc->saveXML($doc->documentElement).PHP_EOL;


$doc = new \DOMDocument();
$doc->loadXML('<a>foo<last/></a>');
$target = $doc->documentElement->lastChild;
$target->before($target, $doc->documentElement->firstChild);
echo $doc->saveXML($doc->documentElement).PHP_EOL;

$doc = new \DOMDocument();
$doc->loadXML('<a>foo<last/></a>');
$target = $doc->documentElement->lastChild;
$target->before($doc->documentElement->firstChild, $target);
echo $doc->saveXML($doc->documentElement).PHP_EOL;


$doc = new \DOMDocument();
$doc->loadXML('<a>foo<last/></a>');
$target = $doc->documentElement->firstChild;
$target->before('bar','baz');
echo $doc->saveXML($doc->documentElement).PHP_EOL;

$doc = new \DOMDocument();
$doc->loadXML('<a>foo<last/></a>');
$target = $doc->documentElement->lastChild;
$target->before('bar','baz');
echo $doc->saveXML($doc->documentElement).PHP_EOL;


$doc = new \DOMDocument();
$doc->loadXML('<a>foo<last/></a>');
$target = $doc->documentElement->firstChild;
$target->before($target, 'bar','baz');
echo $doc->saveXML($doc->documentElement).PHP_EOL;

$doc = new \DOMDocument();
$doc->loadXML('<a>foo<last/></a>');
$target = $doc->documentElement->firstChild;
$target->before('bar', $target, 'baz');
echo $doc->saveXML($doc->documentElement).PHP_EOL;

$doc = new \DOMDocument();
$doc->loadXML('<a>foo<last/></a>');
$target = $doc->documentElement->firstChild;
$target->before('bar', 'baz', $target);
echo $doc->saveXML($doc->documentElement).PHP_EOL;



$doc = new \DOMDocument();
$doc->loadXML('<a>foo<last/></a>');
$target = $doc->documentElement->lastChild;
$target->before($target, 'bar','baz');
echo $doc->saveXML($doc->documentElement).PHP_EOL;

$doc = new \DOMDocument();
$doc->loadXML('<a>foo<last/></a>');
$target = $doc->documentElement->lastChild;
$target->before('bar', $target, 'baz');
echo $doc->saveXML($doc->documentElement).PHP_EOL;

$doc = new \DOMDocument();
$doc->loadXML('<a>foo<last/></a>');
$target = $doc->documentElement->lastChild;
$target->before('bar', 'baz', $target);
echo $doc->saveXML($doc->documentElement).PHP_EOL;



$doc = new \DOMDocument();
$doc->loadXML('<a>foo<last/></a>');
$target = $doc->documentElement->firstChild;
$target->before('bar', $target, $doc->documentElement->lastChild);
echo $doc->saveXML($doc->documentElement).PHP_EOL;


$doc = new \DOMDocument();
$doc->loadXML('<a>foo<last/></a>');
$target = $doc->documentElement->firstChild;
$target->before($target, 'bar', $doc->documentElement->lastChild);
echo $doc->saveXML($doc->documentElement).PHP_EOL;


$doc = new \DOMDocument();
$doc->loadXML('<a>foo<last/></a>');
$target = $doc->documentElement->firstChild;
$target->before($target, $doc->documentElement->lastChild, 'bar');
echo $doc->saveXML($doc->documentElement).PHP_EOL;




$doc = new \DOMDocument();
$doc->loadXML('<a>foo<last/></a>');
$target = $doc->documentElement->lastChild;
$target->before('bar', $doc->documentElement->firstChild, $target);
echo $doc->saveXML($doc->documentElement).PHP_EOL;


$doc = new \DOMDocument();
$doc->loadXML('<a>foo<last/></a>');
$target = $doc->documentElement->lastChild;
$target->before($doc->documentElement->firstChild, 'bar', $target);
echo $doc->saveXML($doc->documentElement).PHP_EOL;


$doc = new \DOMDocument();
$doc->loadXML('<a>foo<last/></a>');
$target = $doc->documentElement->lastChild;
$target->before($doc->documentElement->firstChild, $target, 'bar');
echo $doc->saveXML($doc->documentElement).PHP_EOL;

?>
--EXPECTF--
<a>foo<last/></a>
<a>foo<last/></a>
<a><last/>foo</a>
<a>foo<last/></a>
<a>foo<last/></a>
<a><last/>foo</a>
<a><last/>foo</a>
<a>foo<last/></a>
<a>barbazfoo<last/></a>
<a>foobarbaz<last/></a>
<a>foobarbaz<last/></a>
<a>barfoobaz<last/></a>
<a>barbazfoo<last/></a>
<a>foo<last/>barbaz</a>
<a>foobar<last/>baz</a>
<a>foobarbaz<last/></a>
<a>barfoo<last/></a>
<a>foobar<last/></a>
<a>foo<last/>bar</a>
<a>barfoo<last/></a>
<a>foobar<last/></a>
<a>foo<last/>bar</a>
Loading