PHP NEWS
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
?? ??? ????, PHP 8.3.0alpha1
- CLI:
. Added pdeathsig to builtin server to terminate workers when the master
process is killed. (ilutov)
- Core:
. Fixed bug GH-9388 (Improve unset property and __get type incompatibility
error message). (ilutov)
. SA_ONSTACK is now set for signal handlers to be friendlier to other
in-process code such as Go's cgo. (Kévin Dunglas)
. SA_ONSTACK is now set when signals are disabled. (Kévin Dunglas)
. Fix GH-9649: Signal handlers now do a no-op instead of crashing when
executed on threads not managed by TSRM. (Kévin Dunglas)
. Fixed potential NULL pointer dereference in Windows shm*() functions. (cmb)
. Added shadow stack support for fibers. (Chen Hu)
. Fix bug GH-9965 (Fix accidental caching of default arguments with side
effects). (ilutov)
. Implement GH-10217 (Use strlen() for determining the class_name length).
(Dennis Buteyn)
. Fix bug GH-8821 (Improve line numbers for errors in constant expressions).
(ilutov)
. Fix bug GH-10083 (Allow comments between & and parameter). (ilutov)
. Zend Max Execution Timers is now enabled by default for ZTS builds on
Linux. (Kévin Dunglas)
- Date:
. Implement More Appropriate Date/Time Exceptions RFC. (Derick)
- Exif:
. Removed unneeded codepaths in exif_process_TIFF_in_JPEG(). (nielsdos)
- Fileinfo:
. Upgrade bundled libmagic to 5.43. (Anatol)
- FPM:
. The status.listen shared pool now uses the same php_values (including
expose_php) and php_admin_value as the pool it is shared with. (dwxh)
- GD:
. Fixed bug #81739: OOB read due to insufficient input validation in
imageloadfont(). (CVE-2022-31630) (cmb)
- Hash:
. Fixed bug #81738: buffer overflow in hash_update() on long parameter.
(CVE-2022-37454) (nicky at mouha dot be)
- Intl:
. Added pattern format error infos for numfmt_set_pattern. (David Carlier)
. Added MIXED_NUMBERS and HIDDEN_OVERLAY constants for
the Spoofchecker class. (David Carlier)
. Updated datefmt_set_timezone/IntlDateformatter::setTimezone return type.
(David Carlier)
- JSON:
. Added json_validate(). (Juan Morales)
- MBString:
. mb_detect_encoding is better able to identify the correct encoding for
Turkish text. (Alex Dowad)
. mb_detect_encoding's "non-strict" mode now behaves as described in the
documentation. Previously, it would return false if the very first byte
of the input string was invalid in all candidate encodings. (Alex Dowad)
. mb_strtolower, mb_strtotitle, and mb_convert_case implement conditional
casing rules for the Greek letter sigma. For mb_convert_case, conditional
casing only applies to MB_CASE_LOWER and MB_CASE_TITLE modes, not to
MB_CASE_LOWER_SIMPLE and MB_CASE_TITLE_SIMPLE. (Alex Dowad)
. mb_detect_encoding is better able to identify UTF-8 and UTF-16 strings
with a byte-order mark. (Alex Dowad)
. mb_decode_mimeheader interprets underscores in QPrint-encoded MIME
encoded words as required by RFC 2047; they are converted to spaces.
Underscores must be encoded as "=5F" in such MIME encoded words.
(Alex Dowad)
- Opcache:
. Added start, restart and force restart time to opcache's
phpinfo section. (Mikhail Galanin)
. Fix GH-9139: Allow FFI in opcache.preload when opcache.preload_user=root.
(Arnaud, Kapitan Oczywisty)
. Made opcache.preload_user always optional in the cli and phpdbg SAPIs.
(Arnaud)
. Allows W/X bits on page creation on FreeBSD despite system settings.
(David Carlier)
. Added memfd api usage, on Linux, for zend_shared_alloc_create_lock()
to create an abstract anonymous file for the opcache's lock. (Max Kellermann)
- PCNTL:
. SA_ONSTACK is now set for pcntl_signal. (Kévin Dunglas)
. Added SIGINFO constant. (David Carlier)
- Posix:
. Added posix_sysconf. (David Carlier)
. Added posix_pathconf. (David Carlier)
. Added posix_fpathconf. (David Carlier)
. Fixed zend_parse_arg_long's bool pointer argument assignment. (Cristian Rodriguez)
- Random:
. Added Randomizer::getBytesFromString(). (Joshua Rüsweg)
. Added Randomizer::nextFloat(), ::getFloat(), and IntervalBoundary. (timwolla)
. Fix GH-10292 (Made the default value of the first param of srand() and
mt_srand() nullable). (kocsismate)
. Enable getrandom() for NetBSD (from 10.x). (David Carlier)
- Reflection:
. Fix GH-9470 (ReflectionMethod constructor should not find private parent
method). (ilutov)
. Fix GH-10259 (ReflectionClass::getStaticProperties doesn't need null return
type). (kocsismate)
. Fix segfault when using ReflectionFiber suspended by an internal function.
(danog)
- Sockets:
. Added SO_ATTACH_REUSEPORT_CBPF socket option, to give tighter control
over socket binding for a cpu core. (David Carlier)
. Added SKF_AD_QUEUE for cbpf filters. (David Carlier)
. Added socket_atmark, for use when send/recv need MSG_OOB. (David Carlier)
. Added TCP_QUICKACK constant, to give tighter control over
ACK delays. (David Carlier)
. Added DONTFRAGMENT support for path MTU discovery purpose. (David Carlier)
. Added AF_DIVERT for raw socket for divert ports. (David Carlier)
. Added SOL_UDPLITE, UDPLITE_RECV_CSCOV and UDPLITE_SEND_CSCOV for UDP-Lite
protocol support. (David Carlier)
. Added SO_RERROR, SO_ZEROIZE and SO_SPLICE NetBSD and OpenBSD constants.
(David Carlier)
. Added TCP_REPAIR to quietly close a connection. (David Carlier)
- Standard:
. E_NOTICEs emitted by unserialize() have been promoted to E_WARNING. (timwolla)
. Make array_pad's $length warning less confusing. (nielsdos)
. strtok() now emits an E_WARNING when both arguments are not provided when
starting tokenisation. (David Carlier)
. password_hash() will now chain the original RandomException to the ValueError
on salt generation failure. (timwolla)
. Fix GH-10239 (proc_close after proc_get_status always returns -1). (nielsdos)
- Streams:
. Fixed bug #51056: blocking fread() will block even if data is available.
(Jakub Zelenka)
- XSLTProcessor:
. Fixed bug #69168 (DomNode::getNodePath() returns invalid path). (nielsdos)
<<< NOTE: Insert NEWS from last stable release here prior to actual release! >>>