php-src

mirror of https://github.com/php/php-src.git synced 2024-09-22 02:17:32 +00:00

Author	SHA1	Message	Date
Peter Kokot	1ad08256f3	Sync leading and final newlines in source code files This patch adds missing newlines, trims multiple redundant final newlines into a single one, and trims redundant leading newlines. According to POSIX, a line is a sequence of zero or more non-'<newline>' characters plus a terminating '<newline>' character. [1] Files should normally have at least one final newline character. C89 [2] and later standards [3] mention a final newline: "A source file that is not empty shall end in a new-line character, which shall not be immediately preceded by a backslash character." Although it is not mandatory for all files to have a final newline fixed, a more consistent and homogeneous approach leads to fewer issues with commit differences and a better development experience in certain text editors and IDEs. [1] http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_206 [2] https://port70.net/~nsz/c/c89/c89-draft.html#2.1.1.2 [3] https://port70.net/~nsz/c/c99/n1256.html#5.1.1.2	2018-10-14 12:56:38 +02:00
Peter Kokot	37c329d715	Trim trailing whitespace in source code files	2018-10-13 14:17:28 +02:00
Peter Kokot	02294f0c84	Make PHP development tools files and scripts executable This patch makes several scripts and PHP development tools files executable and adds proper shebangs to the PHP scripts. The `#!/usr/bin/env php` shebang allows running the script via `./script.php` and uses env to locate the PHP binary on the system. The script can still be run with a user-defined PHP location using `php script.php`.	2018-08-29 20:58:17 +02:00
Nikita Popov	f4a1d9c821	Fixed bug #65544 and #71298	2017-07-28 14:57:08 +02:00
Nikita Popov	582a65b06f	Implement full case mapping Implement full case mapping according to SpecialCasing.txt and also full case folding according to CaseFolding.txt (F). There are a number of caveats: * Only language-agnostic and unconditional full case mapping is implemented. The only language-agnostic conditional case mapping rule relates to Greek sigma in final position (Final_Sigma). Correctly handling this requires both arbitrary lookahead and lookbehind, which would require some larger changes to how the case mapping is implemented. This is a possible future extension. * The only language-specific handling that is implemented is for Turkish dotted/undotted Is, if the ISO-8859-9 encoding is used. This matches the previous behavior and makes sure that no codepoints not supported by the encoding are produced. A future extension would be to also handle the Turkish mappings specified by SpecialCasing.txt based on the mbfl internal language. * Full case folding is implemented, but case-insensitive mb_* operations continue to use simple case folding. The reason is that full case folding of the haystack string may change the position at which a match occurred. This would have to be mapped back into the position in the original string. * mb_convert_case() exposes both the full and the simple case mapping / folding, where full is the default. The constants are: * MB_CASE_LOWER (used by mb_strtolower) * MB_CASE_UPPER (used by mb_strtoupper) * MB_CASE_TITLE * MB_CASE_FOLD * MB_CASE_LOWER_SIMPLE * MB_CASE_UPPER_SIMPLE * MB_CASE_TITLE_SIMPLE * MB_CASE_FOLD_SIMPLE (used by case-insensitive operations)	2017-07-28 12:32:50 +02:00
Nikita Popov	9ac7c1e71d	Use case-folding for case insensitive comparisons Instead of using lowercasing.	2017-07-28 12:32:50 +02:00
Nikita Popov	80a0601fe5	Use MPH for case maps Instead of performing a binary search, use a hashtable to store the case maps. In particular a minimal perfect hash construction is used, which does not require collision resolution (but does use an auxiliary table for the hash perturbation).	2017-07-28 12:32:50 +02:00
Nikita Popov	eacd70f762	Don't store titlecase if same as uppercase The totitle code already has a fallback for that case.	2017-07-28 12:32:50 +02:00
Nikita Popov	cedfc2f426	Drop implementation-specific character properties No point in keeping around non-standard character properties if we're not using them and most are not even being populated.	2017-07-28 12:32:50 +02:00
Nikita Popov	8ace7045e9	Handle character ranges in ucgendat generically In particular, the previous implementation did not account for Tangut Ideographs and CJK Ideograph extensions C through F.	2017-07-25 18:48:12 +02:00
Nikita Popov	0c0e35fedc	Port ucgendat to PHP Implemented such that the output is identical, including some quirks that should be fixed subsequently.	2017-07-25 18:48:12 +02:00
Nikita Popov	4bd61ec7ad	Fix handling of some special ranges in ucgendat * Han Ideographs go up to U+9FEA. * CJK Compatibility Ideographs are no longer specified as a special range in remotely recent versions of Unicode. * Surrogate properties should be assigned to U+D800-U+DFFF, not to U+10000-U+1FFFF.	2017-07-25 18:48:12 +02:00
Nikita Popov	3c6b2512cb	Change layout of case mapping table Previously the case mapping table was segregated by the type of the character (upper, lower, title) and always stored the other two variants (key, other1, other2). Now the table is segregated by the target type (key, other). As only very few characters have more than one target this only slightly increases the size of the table. The advantage of this layout is that we only need to perform a single table lookup in the case table. Previously, depending on the case that was hit, either one lookup in the property table, or two lookups in the property table and one lookup in the case table were required. This changes the layout from libunicode in the OpenLDAP project -- however, the last commit there was over 10 years ago, so I don't see value in keeping this in sync.	2017-07-23 18:33:15 +02:00
Nikita Popov	24cfbfd56f	Update ucgendat for more bidi properties Handle them the same way as others -- by classifying as Other Neutral.	2017-07-23 16:03:11 +02:00
Nikita Popov	077e61fad3	Fixed bug #69267 completely ucgendat.c was assuming that a title-case character is a character that has both lower and upper-case variants. However, there are title-case characters that only have a lower-case variant. Use the Lt general character property to determine where in the case map the character should be placed instead.	2017-07-23 15:30:17 +02:00
Nikita Popov	0e4af9192f	Partial fix for bug #69267 This pulls in 60a25c72ba389f53b0621ca250bc99f3b295d43f from the OpenLDAP project.	2017-07-23 14:47:21 +02:00
olshevskiy87	8bdec7a248	fix typos Signed-off-by: olshevskiy87 <olshevskiy87@bk.ru>	2015-05-13 22:28:35 +04:00
Stanislav Malyshev	b7a7b1a624	trailing whitespace removal	2015-01-10 15:07:38 -08:00
Gustavo André dos Santos Lopes	99807e9a72	- Moved ucgendat.c to a separate directory and included the OpenLDAP license there, as required by the license itself.	2010-10-05 02:34:35 +00:00

19 Commits
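
Note on 582a65b06f: the message says case-insensitive mb_* operations keep using simple case folding because full folding of the haystack can shift the position of a match. The sketch below illustrates that effect only. It is written in Python, not PHP, and uses Python's built-in `str.casefold()` as a stand-in for full case folding. The `simple_fold` helper is hypothetical and only approximates simple folding by refusing any fold that changes the length. None of this is the mbstring implementation.

```python
# Illustration only: Python built-ins stand in for full case folding;
# this is not the mbstring code from the commit.

def simple_fold(text: str) -> str:
    """Hypothetical approximation of simple case folding:
    fold a character only when the result is still one code point."""
    out = []
    for ch in text:
        folded = ch.casefold()
        out.append(folded if len(folded) == 1 else ch)
    return "".join(out)

haystack = "Straße X"
needle = "x"

full = haystack.casefold()      # full folding expands "ß" to "ss"
simple = simple_fold(haystack)  # one-to-one, so the length is preserved

# Position of the match in each folded haystack.
full_pos = full.find(needle.casefold())
simple_pos = simple.find(simple_fold(needle))

print(len(haystack), len(full), len(simple))
print(full_pos, simple_pos)
```

With simple folding the reported offset indexes directly into the original string. With full folding the offset refers to the expanded string and would have to be mapped back, which is the extra work the commit message describes.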