Commit Graph

1034 Commits

Author SHA1 Message Date
Niels Dossche
243fa9c143 Fix GH-12616: DOM: Removing XMLNS namespace node results in invalid default: prefix
The namespace data is freed and set to NULL, but there remain references
to the namespace declaration nodes. This (rightfully) confuses libxml2
because its invariants are broken. We also have to remove all remaining
references from the subtree. This fixes the data corruption bug.

Closes GH-12681.
2023-11-22 20:39:30 -06:00
Niels Dossche
6a76e5d0a2 Fix GH-12702: libxml2 2.12.0 issue building from src
Fixes GH-12702.

Co-authored-by: nono303 <github@nono303.net>
2023-11-22 20:39:30 -06:00
Niels Dossche
20c9c4a367 Fix validation logic of php:function() callbacks in dom and xsl
Two issues:
- Assumed that at least 1 argument (function name) was provided.
- Incorrect error path for the non-callable case.

Closes GH-12593.
2023-11-02 20:28:55 +01:00
icy17
900f0cab9f Fix null pointer dereferences in case of allocation failure
Closes GH-12506.
2023-10-24 19:34:47 +02:00
Niels Dossche
d7de0ceca6 Fix registerNodeClass with abstract class crashing
This always results in a segfault when trying to instantiate, so this never
worked. At least throw an error instead of segfaulting to prevent developers
from being confused.

Closes GH-12420.
2023-10-13 19:06:09 +02:00
Niels Dossche
24e5e4ec0d Fix GH-8996: DOMNode serialization on PHP ^8.1
PHP 8.1 introduced a seemingly unintentional BC break in ca94d55a19 by
blocking the (un)serialization of DOM objects.
This was done because the serialization never really worked and just
resulted in an empty object, which upon unserialization just resulted in
an object that you can't use.

Users can however implement their own serialization methods, but the
commit made that impossible as the ACC flag gets passed down to the
child class. An approach was tried in #10307 with a new ACC flag to
selectively allow serialization with subclasses if they implement the
right methods. However, that was found to be too ad hoc.

Instead, let's abuse how the __sleep and __wakeup methods work to throw
the exception instead. If the child class implements the __serialize /
__unserialize method, then the throwing methods won't be called.
Similarly, if the child class implements __sleep and __wakeup, then
they're overridden and it doesn't matter that they throw.

For the user, this PR has the exact same behaviour for (sub)classes that
don't implement the serialization methods: an exception will be thrown.
For code that previously implemented subclasses with these methods, this
approach will make that code work again. This approach should both
preserve BC and unbreak users' code.

Closes GH-12388.

For the test:
Co-authored-by: wazelin <contact@sergeimikhailov.com>
2023-10-09 22:10:05 +02:00
Niels Dossche
e127f87114 Restore old namespace reconciliation behaviour
The xmlDOMWrapReconcileNamespaces method we used to fix the namespace
corruption issues in 8.1.21/8.2.8 caused regressions.
Primarily, there is a similar corruption that the xmlReconciliateNs method
used to have in which a namespace is suddenly shifted
(SAML-Toolkits/php-saml#562) and the side-effect of removing redundant
namespaces causes problems when a specific serialization is required.

Closes GH-12308.
2023-09-27 22:32:01 +02:00
Niels Dossche
bffc74474b Add missing EXTENSIONS section to DOM tests
2023-08-26 18:37:42 +02:00
Niels Dossche
20ac42e1b0 Fix memory leak when setting an invalid DOMDocument encoding
Because the failure path did not release the string, there was a memory
leak.
As the only valid types for this function are IS_NULL and IS_STRING, and
IS_NULL is always rejected in practice, we solve the issue by not using
a function that increments the refcount in the first place.

Closes GH-12002.
2023-08-20 14:05:26 +02:00
Niels Dossche
d19e4da125 Fix segfault when DOMParentNode::prepend() is called when the child disappears
Closes GH-11906.
2023-08-08 20:06:39 +02:00
Niels Dossche
df6e8bd4fd Fix viable next sibling search for replaceWith
Closes GH-11888.
2023-08-07 20:23:06 +02:00
Niels Dossche
dddd309da4 Fix GH-11830: ParentNode methods should perform their checks upfront
Closes GH-11887.
2023-08-07 19:39:05 +02:00
Niels Dossche
08c4db7f36 Fix manually calling __construct() on DOM classes
Closes GH-11894.
2023-08-07 19:37:47 +02:00
Niels Dossche
6e468bbd3b Fix json_encode result on DOMDocument
According to https://www.php.net/manual/en/class.domdocument:
  When using json_encode() on a DOMDocument object the result will be
  that of encoding an empty object.

But this was broken in 8.1. The output was `{"config": null}`.
That's because the config property is defined with a default value of
NULL, hence it was included. The other properties are not included
because they don't have a default value, and nothing is ever written
to their backing field. Hence, the JSON encoder excludes them.
Similarly, `(array) $doc` would yield the same `config` key in the
array.

Closes GH-11840.
2023-08-01 17:28:51 +02:00
Ben Ramsey
ebbccb3dc6 Merge branch 'PHP-8.0' into PHP-8.1
2023-07-31 20:01:03 -05:00
Niels Dossche
62228a2568 Disable global state test on Windows
It looks like the config.w32 uses CHECK_HEADER_ADD_INCLUDE to add the include
path to libxml into the search path.
That doesn't happen in zend-test.
To add to the Windows trouble, libxml is statically linked in, ext/libxml can
only be built statically but ext/zend-test can be built both statically and
dynamically.
So the regression tests won't work in all possible configurations anyway on Windows.
All of this is no problem on Linux because it just uses dynamic linking
and pkg-config, without any magic.

Signed-off-by: Ben Ramsey <ramsey@php.net>
2023-07-31 19:55:10 -05:00
Derick Rethans
0870ebb862 Merge branch 'PHP-8.0' into PHP-8.1
2023-07-31 19:53:43 +01:00
Niels Dossche
c283c3ab0b Sanitize libxml2 globals before parsing
Fixes GHSA-3qrf-m4j2-pcrr.

To parse a document with libxml2, you first need to create a parsing context.
The parsing context contains parsing options (e.g. XML_NOENT to substitute
entities) that the application (in this case PHP) can set.
Unfortunately, libxml2 also supports setting default options globally.
For example, if you call xmlSubstituteEntitiesDefault(1) then the XML_NOENT
option will be added to the parsing options every time you create a parsing
context **even if the application never requested XML_NOENT**.

Third party extensions can override these globals, in particular the
substitute entity global. This causes entity substitution to be
unexpectedly active.

Fix it by setting the parsing options to a sane known value.
For API calls that depend on global state we introduce
PHP_LIBXML_SANITIZE_GLOBALS() and PHP_LIBXML_RESTORE_GLOBALS().
For other APIs that work directly with a context we introduce
php_libxml_sanitize_parse_ctxt_options().
2023-07-31 19:47:19 +01:00
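The save, sanitize, restore pattern described in c283c3ab0b can be sketched in plain C. This is only an illustration: `demo_substitute_entities_default`, `demo_parse_ctx`, `demo_new_ctx()` and `parse_with_known_options()` are hypothetical names that model a library-wide default leaking into new contexts. They are not the libxml2 API and not the actual PHP_LIBXML_SANITIZE_GLOBALS() implementation.

```c
/* Hypothetical process-wide default, standing in for a library global that
 * third-party code may have changed behind the application's back. */
static int demo_substitute_entities_default = 0;

/* Hypothetical parsing context that inherits the global default on creation. */
typedef struct {
    int substitute_entities;
} demo_parse_ctx;

static demo_parse_ctx demo_new_ctx(void)
{
    demo_parse_ctx ctx;
    ctx.substitute_entities = demo_substitute_entities_default;
    return ctx;
}

/* Save the global, force it to a known value while the context is created,
 * then restore it so other users of the global are not affected.
 * Returns the option value the context actually ended up with. */
static int parse_with_known_options(void)
{
    int saved = demo_substitute_entities_default;   /* save */
    demo_substitute_entities_default = 0;           /* sanitize */

    demo_parse_ctx ctx = demo_new_ctx();

    demo_substitute_entities_default = saved;       /* restore */
    return ctx.substitute_entities;
}
```

With the global set to 1 by some other component, a context created directly inherits 1, while `parse_with_known_options()` still yields 0 and leaves the global at 1 afterwards.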
Niels Dossche
bed0e54104 Fix DOM test
2023-07-26 18:05:24 +02:00
Niels Dossche
bf4e7bd3ed Fix GH-11791: Wrong default value of DOMDocument::xmlStandalone
At one point this was changed from a bool to an int in libxml2, with
negative values meaning it is unspecified. Because it is cast to a bool
this therefore returned true instead of the expected false.

Closes GH-11793.
2023-07-26 17:20:10 +02:00
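The cast problem behind bf4e7bd3ed can be shown with a small C sketch. The struct and function names are hypothetical stand-ins rather than the php-src or libxml2 definitions; the only thing taken from the commit message is that a negative value means "unspecified". The `> 0` comparison is one possible way to express the check, not necessarily the exact fix that was committed.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a document whose standalone flag is an int:
 * 1 = standalone="yes", 0 = standalone="no", negative = not specified. */
typedef struct {
    int standalone;
} demo_doc;

/* Problematic variant: any non-zero int, including a negative one,
 * converts to true. */
static bool standalone_by_cast(const demo_doc *doc)
{
    return (bool) doc->standalone;
}

/* Safer variant: only a strictly positive value counts as standalone. */
static bool standalone_checked(const demo_doc *doc)
{
    return doc->standalone > 0;
}
```

For a document with `standalone == -1`, the cast variant reports true while the checked variant reports the expected false.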
Niels Dossche
abb1d2e824 Fix empty argument cases for DOMParentNode methods
Closes GH-11768.
2023-07-24 18:58:39 +02:00
Niels Dossche
1cf2d216a2 Fix DOMCharacterData::replaceWith() with itself
Previously, when replacing the node with itself (or contained within
itself), the node disappeared.

Closes GH-11770.
2023-07-24 18:58:17 +02:00
Niels Dossche
168bc8146f Fix incorrect attribute existence check in DOMElement::setAttributeNodeNS()
Closes GH-11776.
2023-07-24 18:57:16 +02:00
Niels Dossche
d439ee18ed Fix DOMEntity field getter bugs
- publicId could crash PHP if none was provided
- notationName never worked

The fields of this class were untested. This new test file changes that.

Closes GH-11779.
2023-07-24 18:55:51 +02:00
Niels Dossche
5c26258eeb Handle fragments consisting of multiple children without a single root correctly
Closes GH-11698.
2023-07-13 16:09:04 +02:00
Niels Dossche
48b246e038 Add regression test for GH-11682
This bug was already fixed via 15ff830, but we really need more
test coverage.

Co-authored-by: Arne Blankerts <arne@blankerts.de>
2023-07-11 23:02:01 +02:00
Niels Dossche
15ff830373 Fix GH-11625: DOMElement::replaceWith() doesn't replace node with DOMDocumentFragment but just deletes node or causes wrapping <></> depending on libxml2 version
Depending on the libxml2 version, the behaviour is either to not
render the fragment correctly, or to wrap it inside <></>. Fix it by
unpacking fragments manually. This has the side effect that we need to
move the unlinking check in the replacement function to earlier because
the empty child list is now possible in non-error cases.
Also fixes a mistake in the linked list management.

Closes GH-11627.
2023-07-10 13:29:31 +02:00
nielsdos
c174ebfce0 Revert "Fix GH-11404: DOMDocument::savexml and friends ommit xmlns="" declaration for null namespace, creating incorrect xml representation of the DOM"
This reverts commit 7eb3e9cd17.

Although the fix follows the spec, it causes issues because a lot of old
code assumes the incorrect behaviour PHP has had for a long time.
We cannot do this yet, especially not in a stable release.
We revert this for the time being.
See GH-11428.
2023-06-19 19:37:46 +02:00
Niels Dossche
9f7d88802e Fix #80332: Completely broken array access functionality with DOMNamedNodeMap
The problem is the usage of zval_get_long(). In particular, if the
string is non-numeric the result of zval_get_long() will be 0 without
giving an error or warning. This is misleading for users: users get the
impression that they can use strings to access the map because it
coincidentally works for the first item (which is at index 0). Of
course, this fails with any other index which causes confusion and bugs.

This patch adds proper support for using string offsets while accessing
the map. It does so by detecting if it's a non-numeric string, and then
using the getNamedItem() method instead of item(). I had to split up the
array access implementation code for DOMNodeList and DOMNamedNodeMap
first to be able to do this.

Closes GH-11468.
2023-06-18 14:59:19 +02:00
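The dispatch idea from 9f7d88802e (numeric offsets go to item(), non-numeric strings go to getNamedItem()) can be sketched as follows. `strtol` is used here as a simplified stand-in for PHP's numeric-string handling, which has richer rules, and `lookup_kind`, `parse_index()` and `choose_lookup()` are hypothetical names for illustration only.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical result type, used only to show which lookup path is chosen. */
typedef enum { LOOKUP_BY_INDEX, LOOKUP_BY_NAME } lookup_kind;

/* Returns true and stores the value if the whole string is a decimal
 * integer. A blind conversion would instead turn "id" into 0 silently. */
static bool parse_index(const char *s, long *out)
{
    char *end;

    if (*s == '\0') {
        return false;
    }
    *out = strtol(s, &end, 10);
    /* Numeric only if the conversion consumed the entire string. */
    return *end == '\0';
}

/* Numeric offsets use the positional lookup, everything else the by-name one. */
static lookup_kind choose_lookup(const char *offset, long *index)
{
    if (parse_index(offset, index)) {
        return LOOKUP_BY_INDEX;
    }
    return LOOKUP_BY_NAME;
}
```

An offset such as "2" selects the positional path with index 2, while "id" selects the by-name path instead of silently collapsing to index 0.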
nielsdos
7eb3e9cd17 Fix GH-11404: DOMDocument::savexml and friends ommit xmlns="" declaration for null namespace, creating incorrect xml representation of the DOM
The NULL namespace is only correct when there is no default namespace
override. When there is, we need to manually set it to the empty string
namespace.

Closes GH-11428.
2023-06-17 13:36:00 +02:00
nielsdos
b30be40b86 Fix bug #55294 and #47530 and #47847: namespace reconciliation issues
We'll use the DOM wrapper version of libxml2 instead of the regular one.
It's conforming to the behaviour we expect of DOM.
Most of this patch is tests.

I based and extended the tests on the code attached with the aforementioned
bug reports. Therefore the credits for the tests:
Co-authored-by: hilse at web dot de
Co-authored-by: robin2008 at altruists dot org
Co-authored-by: sgunderson at bigfoot dot com

We'll also change the searching point of the internal reconciliation to
start at the top of the added tree to avoid redundant work now that the
function is changed.

Closes GH-11454.
2023-06-15 21:50:00 +02:00
Niels Dossche
10d94aca4c Fix "invalid state error" with cloned namespace declarations
Closes GH-11429.
2023-06-13 17:30:18 +02:00
Niels Dossche
e309fd8461 Fix lifetime issue with getAttributeNodeNS()
It's the same issue that I fixed previously in GH-11402, but in a
different place.

Closes GH-11422.
2023-06-13 17:29:37 +02:00
nielsdos
f2d673fb18 Fix #70359 and #78577: segfaults with DOMNameSpaceNode
* Fix type confusion and parent reference
* Manually manage the lifetime of the parent
* Add regression tests
* Break out to a helper, and apply the use-after-free fix to xpath

Closes GH-11402.
2023-06-09 21:35:55 +02:00
Niels Dossche
0e34ac864a Fix bug #77686: Removed elements are still returned by getElementById
From the moment an ID is created, libxml2's behaviour is to cache that element,
even if that element is not yet attached to the document. Similarly, only upon
destruction of the element the ID is actually removed by libxml2.
Since libxml2 has such behaviour deeply ingrained in the library, and uses the
cache for various purposes, it seems like a bad idea and lost cause to fight it.
Instead, we'll simply walk the tree upwards to check if the node is attached to
the document.

Closes GH-11369.
2023-06-04 16:20:34 +02:00
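The upward walk that 0e34ac864a describes can be sketched in a few lines of C. `demo_node` is a hypothetical minimal node type with only a parent pointer; the real libxml2 node structure is considerably larger.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical minimal tree node: only the parent link matters here. */
typedef struct demo_node {
    struct demo_node *parent;
} demo_node;

/* Follow parent pointers from node; the node is attached only if the walk
 * reaches the document node before running out of ancestors. */
static bool is_attached_to(const demo_node *node, const demo_node *doc)
{
    for (const demo_node *cur = node; cur != NULL; cur = cur->parent) {
        if (cur == doc) {
            return true;
        }
    }
    return false;
}
```

A cached element whose ancestor chain ends at a removed subtree root never reaches the document, so a lookup can skip it even though the cache still holds it.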
Niels Dossche
23f7002527 Fix bug #81642: DOMChildNode::replaceWith() bug when replacing a node with itself
Closes GH-11363.
2023-06-04 16:19:48 +02:00
Niels Dossche
b1d8e240e6 Fix bug #67440: append_node of a DOMDocumentFragment does not reconcile namespaces
The test was amended from the original issue report. For the test:
Co-authored-by: php@deep-freeze.ca

The problem is that the regular dom_reconcile_ns() only works on a
single node. We actually have to reconcile the whole tree in case a
fragment was added. This also required moving some code around such
that this special case could be handled separately.

Closes GH-11362.
2023-06-04 16:19:04 +02:00
nielsdos
7812772105 Fix GH-11347: Memory leak when calling a static method inside an xpath query
It's a type confusion bug. `zend_make_callable` may change the function name
of the fci to become an array, causing a crash in debug mode on
`zval_ptr_dtor_str(&fci.function_name);` in `dom_xpath_ext_function_php`.
On a production build it doesn't crash but only causes a leak, because
the array elements are not destroyed, only the array container itself
is. We can use the nogc variant because it cannot contain cycles, the
potential array can only contain 2 strings.

Closes GH-11350.
2023-05-31 17:14:57 +02:00
nielsdos
b374ec399d Fix DOMElement::append() and DOMElement::prepend() hierarchy checks
We could end up in an invalid hierarchy, resulting in infinite loops and
eventual crashes if we don't check for the DOM hierarchy validity.

Closes GH-11344.
2023-05-30 17:36:26 +02:00
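A hierarchy check of the kind b374ec399d refers to can be sketched like this. The node type and function name are hypothetical, and the sketch covers only the cycle condition (a node must not be inserted into itself or into one of its own descendants), not every rule the DOM specification imposes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical minimal tree node with only a parent link. */
typedef struct demo_node {
    struct demo_node *parent;
} demo_node;

/* Inserting child under parent is only safe if child is neither parent
 * itself nor one of parent's ancestors; otherwise the tree would contain
 * a cycle and any traversal could loop forever. */
static bool insertion_keeps_tree_acyclic(const demo_node *parent,
                                         const demo_node *child)
{
    for (const demo_node *cur = parent; cur != NULL; cur = cur->parent) {
        if (cur == child) {
            return false;
        }
    }
    return true;
}
```

Appending an ancestor to its own descendant is rejected, while appending an unrelated node is allowed.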
Niels Dossche
154c251013 Fix spec compliance error for DOMDocument::getElementsByTagNameNS
Spec link: https://dom.spec.whatwg.org/#concept-getelementsbytagnamens
Spec says we should match any namespace when '*' is provided. This was
however not the case: elements that didn't have a namespace were not
returned. This patch fixes the error by modifying the namespace check.

Closes GH-11343.
2023-05-30 17:35:38 +02:00
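The namespace check from 154c251013 can be illustrated with a small C sketch. `ns_matches()` is a hypothetical helper, not the function used in ext/dom; it assumes that a NULL `node_ns` means the element has no namespace and that an empty requested namespace is treated like no namespace.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* node_ns: namespace URI of the element, or NULL if it has none.
 * wanted:  namespace requested by the caller; "*" matches everything. */
static bool ns_matches(const char *node_ns, const char *wanted)
{
    /* Wildcard: match any namespace, including elements without one. */
    if (wanted != NULL && strcmp(wanted, "*") == 0) {
        return true;
    }
    /* No namespace requested: only namespace-less elements match. */
    if (wanted == NULL || *wanted == '\0') {
        return node_ns == NULL;
    }
    /* Otherwise the URIs have to be identical. */
    return node_ns != NULL && strcmp(node_ns, wanted) == 0;
}
```

The key case is `ns_matches(NULL, "*")`, which has to be true so that elements without a namespace are also returned.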
divinity76
761b9a44f8 Fix return value in stub file for DOMNodeList::item
Not explicitly documenting the possibility of returning DOMElement causes
the Intelephense linter (a popular PHP linter with ~9 million downloads:
https://marketplace.visualstudio.com/items?itemName=bmewburn.vscode-intelephense-client)
to think this code is bad:

  $xp->query("whatever")->item(0)->getAttribute("foo");

DOMNode does not have getAttribute (while DOMElement does).
Documenting the DOMElement return type should fix Intelephense's linter.

Closes GH-11342.
2023-05-29 18:49:26 +02:00
Niels Dossche
c473787abb Fix GH-10234: Setting DOMAttr::textContent results in an empty attribute value
We can't directly call xmlNodeSetContent, because it might encode the string
through xmlStringLenGetNodeList for types
XML_DOCUMENT_FRAG_NODE, XML_ELEMENT_NODE, XML_ATTRIBUTE_NODE.
In these cases we need to use a text node to avoid the encoding.
For the other cases, we *can* rely on xmlNodeSetContent because it is either
a no-op, or handles the content without encoding and clears the properties
field if needed.

The test was taken from the issue report, for the test:
Co-authored-by: ThomasWeinert <thomas@weinert.info>

Closes GH-10245.
2023-05-29 14:10:59 +02:00
nielsdos
cba335d61e Fix GH-11288 and GH-11289 and GH-11290 and GH-9142: DOMExceptions and segfaults with replaceWith
This replaces the implementation of before and after with one following
the spec very strictly, instead of trying to figure out the state we're
in by looking at the pointers. Also relaxes the condition on text node
copying to prevent working on a stale node pointer.

Closes GH-11299.
2023-05-25 23:04:19 +02:00
Niels Dossche
7c0dfc5cf5 Fix GH-11160: Few tests failed building with new libxml 2.11.0
It's possible to categorise the failures into 2 categories:
  - Changed error message. In this case we either duplicate the test and
    modify the error message or, if the change in error message is
    small, use the EXPECTF matchers to make the test compatible with both
    old and new versions of libxml2.
  - Missing warnings. This is caused by a change in libxml2 where the
    parser started using SAX APIs internally [1]. In this case the
    error_type passed to php_libxml_internal_error_handler() changed from
    PHP_LIBXML_ERROR to PHP_LIBXML_CTX_WARNING because it internally
    started to use the SAX handlers instead of the generic handlers.
    However, for the SAX handlers the current input stack is empty, so
    nothing is actually printed. I fixed this by falling back to a
    regular warning without a filename & line number reference, which
    mimics the old behaviour. Furthermore, this change now also shows
    an additional warning in a test which was previously hidden.

[1] 9a82b94a94

Closes GH-11162.
2023-05-06 23:10:07 +02:00
Niels Dossche
0579beb842 Fix incorrect error handling in dom_zvals_to_fragment()
Discovered this pre-existing problem while testing GH-10682.
Note: this problem existed *before* that PR.

* Not all paths throw a hierarchy request error
* xmlFreeNode must be used instead of xmlFree for the fragment to also
  free its children.
* Free up nodes that couldn't be added when xmlAddChild fails.

I unified the error handling code that's exactly the same with a goto to
prevent at least some of such problems in the future.

Closes GH-10981.
2023-04-03 21:21:35 +02:00
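The "single error path via goto" approach mentioned in 0579beb842 is a common C idiom and can be sketched generically. `demo_item`, `free_list()` and `build_list()` are hypothetical; the sketch models a container that must release everything added so far when a later element fails, and it does not reproduce dom_zvals_to_fragment() itself.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical singly linked list item. */
typedef struct demo_item {
    struct demo_item *next;
    int value;
} demo_item;

/* Frees the whole list, not just the head (compare: freeing a fragment
 * together with its children instead of only the fragment itself). */
static void free_list(demo_item *head)
{
    while (head != NULL) {
        demo_item *next = head->next;
        free(head);
        head = next;
    }
}

/* Builds a list from values. A negative value stands in for an invalid
 * argument. Every failure jumps to the same cleanup label, so no path can
 * forget to release what was already built. */
static demo_item *build_list(const int *values, size_t count)
{
    demo_item *head = NULL;
    demo_item *tail = NULL;

    for (size_t i = 0; i < count; i++) {
        if (values[i] < 0) {
            goto err;
        }
        demo_item *item = malloc(sizeof(*item));
        if (item == NULL) {
            goto err;
        }
        item->value = values[i];
        item->next = NULL;
        if (tail != NULL) {
            tail->next = item;
        } else {
            head = item;
        }
        tail = item;
    }
    return head;

err:
    free_list(head);
    return NULL;
}
```

Callers get either a complete list, which they release with `free_list()`, or NULL with nothing left allocated.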
NathanFreeman
2d6decc14c Fix bug #80602: Segfault when using DOMChildNode::before()
This furthermore fixes the logic error explained in
https://github.com/php/php-src/pull/8729#issuecomment-1161737132

Closes GH-10682.
2023-03-30 20:49:05 +02:00
Stanislav Malyshev
85d9278db2 Merge branch 'PHP-8.0' into PHP-8.1
2023-02-12 21:33:39 -07:00
Niels Dossche
ec10b28d64 Fix array overrun when appending slash to paths
Fix it by extending the array sizes by one character. As the input is
limited to the maximum path length, there will always be room to append
the slash. As the php_check_specific_open_basedir() simply uses the
strings to compare against each other, no new failures related to too
long paths are introduced.
We'll let the DOM and XML case handle a potentially too long path in the
library code.
2023-02-12 20:56:19 -07:00
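The buffer sizing argument in ec10b28d64 can be sketched as follows. `DEMO_MAXPATHLEN` and `with_trailing_slash()` are hypothetical, and the sketch assumes the limit counts path characters without the terminating NUL, so the destination needs room for the path, one slash and the terminator.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, deliberately small limit on the path length (without NUL). */
#define DEMO_MAXPATHLEN 16

/* Copies path into out and makes sure it ends in a slash.
 * out must hold DEMO_MAXPATHLEN + 2 bytes: the longest allowed path, one
 * extra character for the appended slash, and the terminating NUL.
 * Returns the resulting length, or 0 if the input is empty or too long. */
static size_t with_trailing_slash(const char *path,
                                  char out[DEMO_MAXPATHLEN + 2])
{
    size_t len = strlen(path);

    if (len == 0 || len > DEMO_MAXPATHLEN) {
        return 0;
    }
    memcpy(out, path, len);
    if (out[len - 1] != '/') {
        out[len++] = '/';   /* fits only because of the extra byte */
    }
    out[len] = '\0';
    return len;
}
```

With a buffer of only `DEMO_MAXPATHLEN + 1` bytes, a maximum-length path without a trailing slash would write the terminator one byte past the end, which is the kind of overrun the commit describes.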
George Peter Banyard
a4acba9e52 Add missing EXTENSION section to tests
2022-10-27 14:39:43 +01:00
Christoph M. Becker
9bd9e9a867 Merge branch 'PHP-8.0' into PHP-8.1
* PHP-8.0:
  Fix #79451: DOMDocument->replaceChild on doctype causes double free
2022-08-19 18:13:48 +02:00