php-src

mirror of https://github.com/php/php-src.git synced 2024-09-22 10:27:25 +00:00

Author	SHA1	Message	Date
Alex Dowad	9ac49c0dd3	New implementation of mb_convert_kana mb_convert_kana now uses the new text encoding conversion filters. Microbenchmarking shows speed gains of 50%-150% across various text encodings and input string lengths. The behavior is the same as the old mb_convert_kana except for one fix: if the 'zero codepoint' U+0000 appeared in the input, the old implementation would sometimes drop it, not passing it through to the output. This is now fixed.	2022-07-20 07:44:19 +02:00
Alex Dowad	321dbd0413	Implement fast text conversion interface for ISO-2022-JP-2004 There were bugs in the legacy implementation. Lots of them. It did not properly track whether it has switched to JISX 0213 plane 1 or plane 2. If it processes a character in plane 1 and then immediately one in plane 2, it failed to emit the escape code to switch to plane 2. Further, when converting codepoints from 0x80-0xFF to ISO-2022-JP-2004, the legacy implementation would totally disregard which mode it was operating in. Such codepoints would pass through directly to the output without any escape sequences being emitted. If that was not enough, all the legacy implementations of JISX 0213:2004 encodings had another common bug; their 'flush function' did not call the next flush function in the chain of conversion filters. So if any of these encodings were converted to an encoding where the flush function was needed to finish the output string, then the output would be truncated.	2022-05-28 21:53:36 +02:00
Alex Dowad	e5fdd5cef2	Implement fast text conversion interface for EUC-JP-2004 All the legacy implementations of JISX 0213:2004 encodings had a common bug; their 'flush function' did not call the next flush function in the chain of conversion filters. So if any of these encodings were converted to an encoding where the flush function was needed to finish the output string, then the output would be truncated.	2022-05-28 21:53:36 +02:00
Alex Dowad	e2459857af	Remove duplicate implementation of CP932 from mbstring Sigh. Double sigh. After fruitlessly searching the Internet for information on this mysterious text encoding called "SJIS-open", I wrote a script to try converting every Unicode codepoint from 0-0xFFFF and compare the results from different variants of Shift-JIS, to see which one "SJIS-open" would be most similar to. The result? It's just CP932. There is no difference at all. So why do we have two implementations of CP932 in mbstring? In case somebody, somewhere is using "SJIS-open" (or its aliases "SJIS-win" or "SJIS-ms"), add these as aliases to CP932 so existing code will continue to work.	2021-06-17 13:12:40 +02:00
Alex Dowad	e169ad3b61	Consolidate all single-byte encodings in one source file We can squeeze out a lot of duplicated code in this way.	2020-11-11 11:18:59 +02:00
Alex Dowad	3e7acf901d	Remove mbstring identify filters mbstring had an 'identify filter' for almost every supported text encoding which was used when auto-detecting the most likely encoding for a string. It would run over the string and set a 'flag' if it saw anything which did not appear likely to be the encoding in question. One problem with this scheme was that encodings which merely appeared less likely to be the correct one were completely rejected, even if there was no better candidate. Another problem was that the 'identify filters' had a huge amount of code duplication with the 'conversion filters'. Eliminate the identify filters. Instead, when auto-detecting text encoding, use conversion filters to see whether the input string is valid in candidate encodings or not. At the same time, watch the type of codepoints which the string decodes to and mark it as less likely if non-printable characters (ESC, form feed, bell, etc.) or 'private use area' codepoints are seen. Interestingly, one old test case in which JIS text was misidentified as UTF-8 (and this wrong behavior was enshrined in the test) was 'fixed' and the JIS string is now auto-detected as JIS.	2020-11-09 13:45:17 +02:00
Alex Dowad	cc03c54c36	Remove useless byte{2,4}{be,le} encodings from mbstring There is no meaningful difference between these and UCS-{2,4}. They are just a little bit more lax about passing errors silently. They also have no known use. Alias to UCS-{2,4} in case someone, somewhere is using them.	2020-11-09 13:45:16 +02:00
Alex Dowad	62317d592f	Remove redundant includes from mbstring (and make sure correct config.h is used) Very interesting... it turns out that when Valgrind support was enabled, `#include "config.h"` from within mbstring was actually including the file "config.h" from Valgrind, and not the one from mbstring!! This is because -I/usr/include/valgrind was added to the compiler invocation _before_ -Iext/mbstring/libmbfl. Make sure we actually include the file which was intended.	2020-08-31 23:17:58 +02:00
Alex Dowad	d4ef7ef11d	Inline unneeded indirection for mbstring memory management All memory allocation and deallocation for mbstring bounces through a table of function pointers before going to emalloc/efree/etc. But this is unnecessary. The allocators are never swapped out. Better to just call them directly.	2020-08-31 23:16:09 +02:00
Christoph M. Becker	737c1b492c	Put oniguruma include path to proper CFLAGS	2019-07-19 20:04:47 +02:00
Christoph M. Becker	504cd03fc3	Move Oniguruma related config stuff to where it belongs Oniguruma is exclusively used by ext/mbstring, and only if mbregex is enabled. Therefore it is unnecessary and confusing to have Oniguruma related config stuff scattered elsewhere. While we're at it, we also remove the referral to the bundled libonig which is removed as of PHP 7.4.0, and the duplicated call to `PHP_INSTALL_HEADERS()`.	2019-07-19 19:30:41 +02:00
Peter Kokot	359a78b16c	Remove unused defines Used in php-src in the past, but since removed and no longer used: - HAVE_CURL_EASY_STRERROR - HAVE_CURL_MULTI_STRERROR - HAVE_NEW_MIME2TEXT - HAVE_MBSTR_CN - HAVE_MBSTR_JA - HAVE_MBSTR_KR - HAVE_MBSTR_RU - HAVE_MBSTR_TW Part of oniguruma, which doesn't use these anymore: - NOT_RUBY - HAVE_STDARG_PROTOTYPES Unused: - HAVE_MPIR Closes GH-4427	2019-07-18 02:21:39 +02:00
Anatol Belski	e10349152b	Sync with ZEND_ENABLE_STATIC_TSRMLS_CACHE enablement in ext/mbstring	2019-03-12 21:33:43 +01:00
Anatol Belski	2d7658959e	Unbundle oniguruma in config.w32	2019-02-11 14:53:19 +01:00
Nikita Popov	d1c1481081	Unbundle oniguruma And also switch detection over to pkg-config.	2019-02-11 14:53:19 +01:00
Peter Kokot	7dd62811ce	Remove HAVE_STDLIB_H The C89 and later standard defines the `<stdlib.h>` header as part of the standard headers [1] and on current systems it is always present and the `HAVE_STDLIB_H` symbol can be removed. Also Autoconf suggests doing this and relying on C89 or above [2] and [3]. [1] https://port70.net/~nsz/c/c89/c89-draft.html#4.1.2 [2] http://git.savannah.gnu.org/cgit/autoconf.git/tree/lib/autoconf/headers.m4 [3] https://www.gnu.org/software/autoconf/manual/autoconf-2.69/autoconf.html	2018-09-16 20:53:53 +02:00
Peter Kokot	8d3f8ca12a	Remove unused Git attributes ident The $Id$ keywords were used in Subversion where they can be substituted with filename, last revision number change, last changed date, and last user who changed it. In Git this functionality is different and can be done with the Git attribute ident. These need to be defined manually for each file in the .gitattributes file and are afterwards replaced with the 40-character hexadecimal blob object name, which is based only on the particular file contents. This patch simplifies handling of $Id$ keywords by removing them since they are not used anymore.	2018-07-25 00:53:25 +02:00
Christoph M. Becker	d48b233991	Update to Oniguruma 6.7.1 We also apply the still relevant parts of `oniguruma.patch` and update the patch accordingly.	2018-03-10 01:07:00 +01:00
Peter Kokot	5c5bd30339	Remove --with-libmbfl configure option The bundled libmbfl library is no longer API or ABI compatible with the (currently unmaintained) upstream library. As such, building against an external libmbfl is no longer possible.	2017-10-28 16:11:30 +02:00
Anatol Belski	2a76d2282a	upgrade to Oniguruma 6.1.2	2016-11-25 22:00:53 +01:00
Anatol Belski	864cd82ace	Merge remote-tracking branch 'origin/master' into native-tls * origin/master: updated NEWS refactored the mbstring config.w32 Update NEWS Fixed compilation warnings Fixed bug #68504 --with-libmbfl configure option not present on Windows Changed "finally" handling. Removed EX(fast_ret) and EX(delayed_exception). Allocate and use additional IS_TMP_VAR slot on VM stack instead. the darwin specific test fails for me with the same output which is the expected for the original test I couldn't find anybody who managed to see this test passing, but I found a bunch of other reports on qa.php.net/reports and on google which do see this test failing on mac. if this change causes you to have this test failing on Mac, please drop me a mail so we can improve the current test so it passes for everybody. #68446 is fixed Reimplemented silence operator (@) handling on exceptions. Now each silence region is stored in op_array->brk_cont_array. On exception ZEND_HANDLE_EXCEPTION handler traverses this array and restores original EG(error_reporting) if exception occurred inside a "silence" region. remove the NEWS entries for the reverted stuff typo fix go back with phpdbg to the state of 5.6.3, reverting the controversial commits(remote debugging/xml protocol) 5.5.21 now New label length test Fix ext/filter/tests/033.phpt Fix filter_list test FILTER_VALIDATE_DOMAIN and RFC conformance for FILTER_VALIDATE_URL Conflicts: ext/mbstring/config.w32	2014-11-27 15:59:43 +01:00
Anatol Belski	42af411620	refactored the mbstring config.w32	2014-11-27 13:37:00 +01:00
Anatol Belski	3ec8730e89	Fixed bug #68504 --with-libmbfl configure option not present on Windows	2014-11-27 09:14:47 +01:00
Anatol Belski	0490a32249	more exts converted for static tsrm ls pointer mbstring, pcre, reflection	2014-10-15 19:19:23 +02:00
Rui Hirokawa	4122ef275c	added iso2022jp-mobile and emoji unsupported in unicode 6.0.	2011-08-24 15:28:44 +00:00
Pierre Joye	60dd9e0bd9	- fix typo & build	2011-08-22 07:39:09 +00:00
Rui Hirokawa	c746cf5dc9	updated libmbfl to 1.3.2 (JISX-0213:2004 support).	2011-08-20 07:24:04 +00:00
Rui Hirokawa	484e6b8fb3	added gb18030 encoding to mbstring/libmbfl.	2011-08-14 14:09:11 +00:00
Rui Hirokawa	1ec46d3fe3	fixed win32 build.	2011-08-13 12:53:40 +00:00
Rui Hirokawa	52948b534c	added new files of libmbfl 1.3.0.	2011-08-02 02:50:11 +00:00
Pierre Joye	a7ffa09e18	- add PHP_INSTALL_HEADERS to all parts (core&exts) exposing headers, generate the install-headers cmd	2010-12-11 22:18:10 +00:00
Moriyoshi Koizumi	872f07aa5e	- Fix win32 build. (notified by Rob. Thanks)	2010-03-15 14:19:51 +00:00
Moriyoshi Koizumi	d9dda48f8a	- Update the bundled libmbfl to the latest on upstream.	2010-03-12 04:55:37 +00:00
Kalle Sommer Nielsen	4b17fee3b9	Fixed static build of mbstring on Windows (makes static build of exif possible too)	2009-06-11 23:37:51 +00:00
Jani Taskinen	a0f3cf5cc4	MFB: Thanks to the "maintainers" who are too lazy to commit FIRST to HEAD!	2009-04-20 17:06:03 +00:00
Moriyoshi Koizumi	935fa7a97e	- Fix win32 build	2008-07-24 16:59:53 +00:00
Antony Dovgal	d7ab2da30b	there is no such file	2007-07-16 19:07:22 +00:00
Frank M. Kromann	16ccbf0c0c	MFB: Fix win32 build	2007-02-04 00:23:32 +00:00
Frank M. Kromann	af741730f4	Fix win32 build	2006-11-04 17:25:37 +00:00
Rui Hirokawa	bcf3a3311d	added turkish language support for libmbfl.	2005-12-23 13:53:30 +00:00
Moriyoshi Koizumi	542901d705	- Add Armenian encoding / NLS (patch by Hayk Chamyan)	2005-03-22 22:22:11 +00:00
Moriyoshi Koizumi	5b5e012bc2	- Update libmbfl (fixes bug #30549 and #31911 ). - Update oniguruma to 3.7.0	2005-02-20 22:18:09 +00:00
Wez Furlong	a8757b11e6	Enable mbregex in win32 build	2004-04-08 11:01:51 +00:00
Moriyoshi Koizumi	a91e44c830	- Add missing include path.	2004-03-03 10:27:19 +00:00
Moriyoshi Koizumi	9e9d7d1743	- proper DLL linkage specifier. # oniguruma.h:34- # # #ifndef ONIG_EXTERN # #if defined(_WIN32) && !defined(__CYGWIN__) # #if defined(EXPORT) || defined(RUBY_EXPORT) # #define ONIG_EXTERN extern __declspec(dllexport) # #else # #define ONIG_EXTERN extern __declspec(dllimport) # #endif # #endif # #endif	2004-03-02 22:38:21 +00:00
Moriyoshi Koizumi	bc4d64477a	- Fix typo.	2004-03-02 20:18:14 +00:00
Moriyoshi Koizumi	1dfd0bd901	- Really fix the build. # Should be fixed now :|	2004-03-02 15:59:30 +00:00
Edin Kadribasic	f067c0479f	Temporary fix for win32 build	2004-03-02 11:50:10 +00:00
Moriyoshi Koizumi	03bdd13560	- Fix win32 build. # Thanks Nuno Lopes & Derick for letting me know.	2004-03-01 20:25:33 +00:00
Wez Furlong	05b9b20ed8	Add new (optional!) win32 build infrastructure. Will follow up to internals@ shortly.	2003-12-02 23:17:04 +00:00

50 Commits