`ap_get_brigade()` may fail for different reasons, and we must not
pretend that a partially read POST payload is fine; instead we report
a content length of zero what matches all other `read_post()` callbacks
of bundled SAPIs.
Closes GH-10059.
In b5ff87ca71, I made a number of adjustments to our conversion code
for CP1252. One of the adjustments was to make the mappings match those
published by the Unicode Consortium in the file CP1252.TXT. These do
not include mappings for the CP1252 bytes 0x81, 0x8D, 0x8F, 0x90, and
0x9D.
Rostyslav Gulka reported that this caused a problem. His application
stores binary JPEG data in an MS-SQL database. When they SELECT the
binary data out of the database, it is treated as CP1252 text and
automatically converted to UTF-8. To recover the original binary
data, they then do a conversion from UTF-8 to CP1252.
Obviously, that does not work if certain CP1252 bytes do not map to
any Unicode codepoint at all.
While this is a very unusual application of text encoding conversion,
and we might choose not to support it if there was no other basis for
including those mappings, it seems that Microsoft does actually include
them in the Win32 API as "best fit" mappings. These are extra mappings
from Unicode to other text encodings, which the Win32 API function
WideCharToMultiByte uses by default unless the WC_NO_BEST_FIT_CHARS
flag was passed.
A list of these "best fit" mappings for CP1252 can be found here:
https://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WindowsBestFit/bestfit1252.txt
A previous fix[1] was not sufficient to catch all potential file URIs,
because the patch did not cater to URL encoding. Properly parsing and
decoding the URI may yield a different result than the handling of
SQLite3, so we play it safe, and reject any file URIs if open_basedir
is configured.
[1] <https://bugs.php.net/bug.php?id=77967>
Closes GH-10018.
There are two issues to resolve:
1. The FCC is not refetch when trying to unregister a trampoline
2. Comparing the function pointer of trampolines is meaningless as they are reallocated, thus we need to compare the name of the function
Found while working on GH-8294
Closes GH-10033
Dialect 1 databases store and transfer `NUMERIC(15,2)` values as
doubles, which we need to cater to in `firebird_stmt_get_col()` to
avoid `ZEND_ASSUME(0)` to ever be triggered, since that may result
in undefined behavior.
Since adding a regression test would require to create a dialect 1
database, we go without it.
Closes GH-10021.
There might be a moment when the child log event is executed after
freeing a child. That could possibly happen if the child output is
triggered at the same as the terminating of the child. Then the output
event could be potentially processed after the terminating event which
would cause this kind of issue.
The issue might got more visible after introducing the log_stream on
a child because it is more likely that this cannot be dereferenced
after free. However it is very hard to reproduce this issue so there
is no test for this.
The fix basically prevents passing a child pointer and instead passes
the child PID and then looks the child up by the PID when it is being
processed. This is obviously slower but it is a safe way to do it and
the slow down should not be hopefully visible in a way that it would
overload a master process.
We need to overwrite the __toString magic method for SplFileObject, similarly to how DirectoryIterator overwrites it
Moreover, the custom cast handler is useless as we define __toString methods, so use the standard one instead.
Closes GH-9912
`.nFileIndexHigh` is a unsigned 32bit number. Casting that to `__int64`
and shifting left by 32bits triggers undefined behavior if the most
significant bit of `.nFileIndexHigh` is set. We could avoid that by
casting to `(__uint64)`, but in that case the whole clause doesn't have
an effect anymore, so we drop it altogether.
Closes GH-9958.
The existing implementation of mb_strcut extracts part of a
multi-byte encoded string by pulling out raw bytes and then running
them through a conversion filter to ensure that the output is valid
in the requested encoding.
If the conversion filter emits error markers when doing the final
'flush' operation which ends the conversion of the extracted bytes,
these error markers may (in some cases) be included in the output.
The conversion operation does not respect the value of
mb_substitute_character; rather, it always uses '?' as an error marker.
So this issue manifests itself as unwanted '?' characters being
inserted into the output.
This issue has existed for a long time, but became noticeable in PHP
8.1 because for at least some of the supported text encodings, mbstring
is now more strict about emitting error markers when strings end in an
illegal state.
The simplest fix is to suppress error markers during the final flush
operation.
While working on a fix for this problem, another problem with mb_strcut
was discovered; since it decides when to stop consuming bytes from
the input by looking at the byte length of its OUTPUT, anything which
causes extra bytes to be emitted to the output may cause mb_strcut to
not consume all the bytes in the requested range.
The one case where we DO emit extra output bytes is for encodings
which have a selectable mode, like ISO-2022-JP; if a string in such
an encoding ends in a mode which is not the default, we emit an ending
escape sequence which changes back to the default mode. This is done
so that concatenating strings in such encodings is safe.
However, as mentioned, this can cause the output of mb_strcut to be
shorter than it logically should be. This bug has existed for a long
time, and fixing it now will be a BC break, so we may not fix it right
away.
Therefore, tests for THIS fix which don't pass because of that OTHER
bug have been split out into a separate test file (gh9535b.phpt), and
that file has been marked XFAIL.
Directly referring to a constant of an undefined throws an exception;
there is not much point in `constant()` raising a fatal error in this
case.
Closes GH-9907.