Skip to content

Fix bug #80602 Segfault when using DOMChildNode::before() #10682

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from

Conversation

NathanFreeman
Copy link
Contributor

https://bugs.php.net/bug.php?id=80602
#8729
I make a new pr base on PHP-8.1 branch because PHP-8.0 that is supported for critical security issues only.

@NathanFreeman
Copy link
Contributor Author

cc @beberlei

Copy link
Member

@nielsdos nielsdos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took me a while to understand all the possible paths in ::before & ::after, but I think your changes are generally correct. I'd just like some comments, especially why you need to do the copy with a reference to the docs. Otherwise, it's quite obscure.

I'm fairly limited in knowledge about this extension in PHP, so I might not be completely correct ofc...

@@ -181,6 +181,10 @@ xmlNode* dom_zvals_to_fragment(php_libxml_ref_obj *document, xmlNode *contextNod
return NULL;
}

if (nodesc > 1) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might need a comment here to explain why the copy is necessary, with a reference to the libxml docs talking about the merging of TEXT nodes (in which case our node is freed). I also wonder if we can't check the type of the node to avoid some unnecessary copies.

Copy link
Contributor Author

@NathanFreeman NathanFreeman Mar 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example:

<?php 
declare(strict_types=1);

$doc = new \DOMDocument();
$doc->loadXML('<a>foo<last/></a>');
$target = $doc->documentElement->lastChild;
$target->before('bar', $doc->documentElement->firstChild, 'baz');

After debugging, I found that the xmlNewDocText function will always returns same address to the second parameter if the parameters are greater than or equal to three.
If it's text, that's fine, but if it's an object, it can cause invalid pointer because many new nodes point to the same memory address.
So we must copy the new node to avoid this situation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for explaining. Could you please add a short comment in the code above the if condition (may be even one line only referencing this PR comment) ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added comment in the code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes, I was just busy reviewing and extensively testing this right now. :)

} else if (beforefirstchild) {
nextsib = parentNode->children == nextsib ? nextsib : afternextsib;
} else {
nextsib = parentNode->children == prevsib ? prevsib->next : nextsib;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: as far as I can see, prevsib->next will always equal nextsib. So that means this check and assignment is redundant, no?

Copy link
Contributor Author

@NathanFreeman NathanFreeman Mar 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// ext/dom/parentnode.c:170
if (newNode->parent != NULL) {
	xmlUnlinkNode(newNode);
}

xmlUnlinkNode will detach the relation between with parent node and child node.

if (!parentNode->children) {
    nextsib = NULL;
}

It mean first node and child node are both parameters to DOMElement::before() method.
So nextslib is null.

if (beforefirstchild) {
	nextsib = parentNode->children == nextsib ? nextsib : afternextsib;
}

It mean that the new node will insert before $doc->documentElement->firstChild.
Let's discuss two possible scenarios.

  1. nextsib does not change means that $doc->documentElement->firstChild is not the parameter to DOMElement::before() method.
  2. nextsib = afternextsib means that $doc->documentElement->firstChild is the parameter to DOMElement::before() method and $doc->documentElement->firstChild will removed from node tree.

It means that the new node will insert before $doc->documentElement->lastChild if neither of the above conditions is satisfied.
Let's discuss two possible scenarios too.

  1. If parentNode->children == prevsib is true, it means that $doc->documentElement->firstChild is not the parameter to DOMElement::before() method so nextsib = prevsib->next.
    The reason we use prevsib->next is that $doc->documentElement->lastChild may be a parameter to DOMElement::before() and $doc->documentElement->lastChild will removed from node tree.
    In this case, prevsib->next is NULL.

  2. If parentNode->children == prevsib is false, it means that $doc->documentElement->firstChild is the parameter to DOMElement::before() method so nextsib does not change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay that seems to make sense, thanks for explaining.

Copy link
Member

@nielsdos nielsdos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to recheck this patch later this week, and run some Valgrind tests on it to check for leaks etc

} else if (beforefirstchild) {
nextsib = parentNode->children == nextsib ? nextsib : afternextsib;
} else {
nextsib = parentNode->children == prevsib ? prevsib->next : nextsib;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay that seems to make sense, thanks for explaining.

Copy link
Member

@nielsdos nielsdos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tested the changes with Valgrind, and I've checked the semantics of the checks and it looks correct. Given the fact that the semantics were already (quickly) glanced over in the previous PR and deemed correct also gives me more confidence that this is correct. It's also in line with how other non-special cases behave w.r.t. node detaching.
The code reasoning also seems correct.
Thanks! I'll wait a bit longer to see if other feedback comes in and for the CI, and if it's all good I'll see if I can merge it in the coming week.

@NathanFreeman
Copy link
Contributor Author

Thank you @nielsdos

@nielsdos nielsdos closed this in 2d6decc Mar 30, 2023
@nielsdos
Copy link
Member

Merged it! Thanks :)

nielsdos added a commit to nielsdos/php-src that referenced this pull request Mar 30, 2023
Discovered this pre-existing problem while testing phpGH-10682.
Note: this problem existed *before* that PR.

* Not all paths throw a hierarchy request error
* xmlFreeNode must be used instead of xmlFree for the fragment to also
  free its children.
* Free up nodes that couldn't be added when xmlAddChild fails.

I unified the error handling code that's exactly the same with a goto to
prevent at least some of such problems in the future.
nielsdos added a commit to nielsdos/php-src that referenced this pull request Mar 30, 2023
Discovered this pre-existing problem while testing phpGH-10682.
Note: this problem existed *before* that PR.

* Not all paths throw a hierarchy request error
* xmlFreeNode must be used instead of xmlFree for the fragment to also
  free its children.
* Free up nodes that couldn't be added when xmlAddChild fails.

I unified the error handling code that's exactly the same with a goto to
prevent at least some of such problems in the future.
nielsdos added a commit to nielsdos/php-src that referenced this pull request Apr 3, 2023
Discovered this pre-existing problem while testing phpGH-10682.
Note: this problem existed *before* that PR.

* Not all paths throw a hierarchy request error
* xmlFreeNode must be used instead of xmlFree for the fragment to also
  free its children.
* Free up nodes that couldn't be added when xmlAddChild fails.

I unified the error handling code that's exactly the same with a goto to
prevent at least some of such problems in the future.

Closes phpGH-10981.
nielsdos added a commit that referenced this pull request Apr 3, 2023
Discovered this pre-existing problem while testing GH-10682.
Note: this problem existed *before* that PR.

* Not all paths throw a hierarchy request error
* xmlFreeNode must be used instead of xmlFree for the fragment to also
  free its children.
* Free up nodes that couldn't be added when xmlAddChild fails.

I unified the error handling code that's exactly the same with a goto to
prevent at least some of such problems in the future.

Closes GH-10981.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants