mirror of
https://github.com/php/php-src.git
synced 2024-10-01 06:46:08 +00:00
7e61cffb0d
Updated with UnicodeData-6.0.0d7.txt and included the source of the generator program with the distribution.

The replaced tables, generated circa 2002, seem to reflect Unicode 3.2. I was unable to generate the same property offsets with Unicode 3.2 data, but all the tests I made indicate php_unicode_is_prop() is returning the correct values. The replaced file merely says it used a "modified version" of ucgendat, which is not very helpful. The results I got were not significantly different, only slightly higher offsets at two properties, which were carried over to the subsequent properties.

I was, however, able to replicate the casing table precisely. Beyond omitting most of the tables, using a slightly different layout, and multiplying the casing table offsets by 3, the extent of the "modifications" is unclear.

The test suite showed no regressions; however, it tests the modified portion of the extension very poorly.
24 lines
702 B
PHP
--TEST--
Bug #52981 (Unicode properties are outdated (from Unicode 3.2))
--SKIPIF--
<?php extension_loaded('mbstring') or die('skip mbstring not available'); ?>
--FILE--
<?php
function test($str)
{
    $upper = mb_strtoupper($str, 'UTF-8');
    $len = strlen($upper);
    for ($i = 0; $i < $len; ++$i) echo dechex(ord($upper[$i])) . ' ';
    echo "\n";
}

// OK
test("\xF0\x90\x90\xB8"); // U+10438 DESERET SMALL LETTER H (added in 3.1.0, March 2001)
// not OK
test("\xE2\xB0\xB0"); // U+2C30 GLAGOLITIC SMALL LETTER AZU (added in 4.1.0, March 2005)
test("\xD4\xA5"); // U+0525 CYRILLIC SMALL LETTER PE WITH DESCENDER (added in 5.2.0, October 2009)
--EXPECTF--
f0 90 90 90
e2 b0 80
d4 a4