There are two issues here:
- freeing a predefined entity declaration crashes (unique to 8.3 & master)
- using multiple entity references for a single entity declaration crashes
  (since forever)
The fix for the last issue is fairly easy to do on 8.3, but may require a
slightly different approach on 8.2. Therefore, for now this is 8.3-only.
Closes GH-13004.
I forgot to also update the document reference of attributes, so when no
variable references the document anymore but an attribute still does,
this can crash. Fix it by also updating the document references for
attributes.
Closes GH-13002.
Autotools emits a warning if the 3rd argument is empty. The call is now
wrapped in AC_CACHE_CHECK with a php_cv_* cache variable name, as the
docs recommend.
Closes GH-12966
While __php_mempcpy is only used by ext/standard/crypt_sha*, the
mempcpy "pattern" is used everywhere.
This commit removes __php_mempcpy, adds zend_mempcpy and transforms
open-coded parts into function calls.
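As an illustration of the pattern, here is a minimal sketch, not the code added by this commit. The names sketch_mempcpy and sketch_concat are hypothetical, and the real zend_mempcpy may differ in signature and may defer to a libc mempcpy where one is available. The point is only that returning the end of the copied region lets consecutive copies chain without manual pointer arithmetic.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a mempcpy-style helper: copy n bytes and
 * return a pointer just past the last byte written. */
static void *sketch_mempcpy(void *dest, const void *src, size_t n)
{
	return (char *)memcpy(dest, src, n) + n;
}

/* Hypothetical caller: concatenate a and b into buf, which must have room
 * for strlen(a) + strlen(b) + 1 bytes. Returns the resulting length. */
static size_t sketch_concat(char *buf, const char *a, const char *b)
{
	char *p = buf;

	/* Open-coded equivalent: memcpy(p, a, len); p += len; */
	p = sketch_mempcpy(p, a, strlen(a));
	p = sketch_mempcpy(p, b, strlen(b));
	*p = '\0';

	return (size_t)(p - buf);
}
```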
Thanks to Maurício Fauth for finding and reporting this bug.
The bug was introduced in October 2022. It originally only affected
text encodings which do not have a fixed byte width per character
and for which mbstring does not have an mblen_table. However, I recently
made another change to mbstring, such that mb_substr no longer relies
on the mblen_table even if one is available. Because of this change,
the bug introduced in October 2022 now affects a greater
number of text encodings, including UTF-8.
Because these functions are copied and not properly registered (which we
can't do), the observer code doesn't add the temporaries on startup.
Add them via a callback during startup.
Closes GH-12906.
For GB18030, it is not generally possible to identify character
boundaries without scanning through the entire string. Therefore,
implement mb_strcut using a strategy similar to the mblen_table-based
implementation in mbstring.c. The difference is that for GB18030, we
need to look at two leading bytes to determine the byte length of a
multi-byte character.
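The scanning idea can be sketched roughly as follows. This is a simplified illustration, not the code from mbstring: gb18030_char_len and gb18030_align_down are hypothetical names, and the handling of invalid or truncated input is an assumption made for the sketch. In GB18030, a lead byte in 0x81-0xFE starts a four-byte character if the second byte is a digit (0x30-0x39) and a two-byte character otherwise, so the second byte has to be inspected.

```c
#include <stddef.h>

/* Hypothetical helper: byte length of the GB18030 character starting at p.
 * Truncated trailing sequences are clamped to the bytes remaining. */
static size_t gb18030_char_len(const unsigned char *p, size_t remaining)
{
	if (remaining == 0) {
		return 0;
	}
	/* Single byte: anything that is not a lead byte, or a lone final byte. */
	if (p[0] < 0x81 || p[0] == 0xFF || remaining < 2) {
		return 1;
	}
	/* A second byte in 0x30-0x39 marks a four-byte character. */
	size_t len = (p[1] >= 0x30 && p[1] <= 0x39) ? 4 : 2;
	return len <= remaining ? len : remaining;
}

/* Hypothetical helper: largest character boundary that is <= offset,
 * found by scanning forward from the start of the string. */
static size_t gb18030_align_down(const unsigned char *s, size_t len, size_t offset)
{
	size_t pos = 0;

	if (offset > len) {
		offset = len;
	}
	while (pos < offset) {
		size_t n = gb18030_char_len(s + pos, len - pos);
		if (pos + n > offset) {
			break; /* offset falls inside this character */
		}
		pos += n;
	}
	return pos;
}
```

A cut routine along these lines would align both the start and the end offset down to a character boundary before copying the bytes in between.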
The new implementation is 4-5x faster for short strings, and more than
10x faster for long strings. (Part of the reason this new code has
such a great performance advantage is that it replaces code
based on the older text conversion filters provided by libmbfl, which
were quite slow.)
The behavior is the same as before for valid GB18030 strings; for
some invalid strings, mb_strcut will choose different 'cut' points
as compared to before. (Clang's libFuzzer was used to compare the
old and new implementations, searching for test cases where they had
different behavior; no such cases were found.)