Fix incorrect check in cs_8559_5 in map_from_unicode()

Inside the ISO-8859-5 branch, the condition `code == 0x0450 || code == 0x045D`
is always false, because the enclosing range check rejects every code above
0x044F before that condition is reached.
According to the BMP coverage in the encoding spec for ISO-8859-5
(https://encoding.spec.whatwg.org/iso-8859-5-bmp.html), the range of
valid characters is 0x0401 - 0x045F (except for 0x040D, 0x0450, 0x045D).
The current check has an upper bound of 0x044F instead of 0x045F, so the
valid code points in 0x0451 - 0x045F are never mapped.
Fix this by raising the upper bound to 0x045F.
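
For illustration, the corrected check can be sketched as a standalone C
function. This is a minimal sketch, not the php-src code: the name
`cyrillic_to_iso_8859_5` and the `bool` return type are assumptions made
for this example, and only the Cyrillic range is handled.

```c
#include <stdbool.h>

/* Hypothetical standalone helper (not the php-src function): maps a Unicode
 * code point from the Cyrillic range to its ISO-8859-5 byte.
 * Returns false when the code point has no byte in this range. */
static bool cyrillic_to_iso_8859_5(unsigned code, unsigned char *res)
{
    /* Corrected bounds: 0x0401 through 0x045F inclusive. */
    if (code < 0x0401 || code > 0x045F) {
        return false;
    }
    /* These three would land on 0xAD, 0xF0 and 0xFD, which ISO-8859-5
     * assigns to other characters, so they are rejected. */
    if (code == 0x040D || code == 0x0450 || code == 0x045D) {
        return false;
    }
    *res = (unsigned char)(code - 0x360);
    return true;
}
```

With the old upper bound of 0x044F, a call with 0x0451 would fail; with
the bound above it yields 0xF1, which matches the updated test
expectations.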

Closes GH-10399

Signed-off-by: George Peter Banyard <girgias@php.net>
Niels Dossche 2023-01-21 13:50:36 +01:00 committed by George Peter Banyard
parent b7a158a19b
commit a8c8fb2564
3 changed files with 16 additions and 15 deletions

NEWS
@@ -15,6 +15,7 @@ PHP NEWS
 - Standard:
   . Fixed bug GH-10292 (Made the default value of the first param of srand() and
     mt_srand() unknown). (kocsismate)
+  . Fix incorrect check in cs_8559_5 in map_from_unicode(). (nielsdos)
 02 Feb 2023, PHP 8.1.15

@@ -477,7 +477,7 @@ static inline int map_from_unicode(unsigned code, enum entity_charset charset, u
         *res = 0xF0; /* numero sign */
     } else if (code == 0xA7) {
         *res = 0xFD; /* section sign */
-    } else if (code >= 0x0401 && code <= 0x044F) {
+    } else if (code >= 0x0401 && code <= 0x045F) {
         if (code == 0x040D || code == 0x0450 || code == 0x045D)
             return FAILURE;
         *res = code - 0x360;

@@ -358,47 +358,47 @@ CYRILLIC SMALL LETTER YA: &#x44F; => ef
 NUMERO SIGN: &#x2116; => f0
 &#xF0; => &#xF0;
-CYRILLIC SMALL LETTER IO: &#x451; => 2623783435313b
+CYRILLIC SMALL LETTER IO: &#x451; => f1
 &#xF1; => &#xF1;
-CYRILLIC SMALL LETTER DJE: &#x452; => 2623783435323b
+CYRILLIC SMALL LETTER DJE: &#x452; => f2
 &#xF2; => &#xF2;
-CYRILLIC SMALL LETTER GJE: &#x453; => 2623783435333b
+CYRILLIC SMALL LETTER GJE: &#x453; => f3
 &#xF3; => &#xF3;
-CYRILLIC SMALL LETTER UKRAINIAN IE: &#x454; => 2623783435343b
+CYRILLIC SMALL LETTER UKRAINIAN IE: &#x454; => f4
 &#xF4; => &#xF4;
-CYRILLIC SMALL LETTER DZE: &#x455; => 2623783435353b
+CYRILLIC SMALL LETTER DZE: &#x455; => f5
 &#xF5; => &#xF5;
-CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I: &#x456; => 2623783435363b
+CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I: &#x456; => f6
 &#xF6; => &#xF6;
-CYRILLIC SMALL LETTER YI: &#x457; => 2623783435373b
+CYRILLIC SMALL LETTER YI: &#x457; => f7
 &#xF7; => &#xF7;
-CYRILLIC SMALL LETTER JE: &#x458; => 2623783435383b
+CYRILLIC SMALL LETTER JE: &#x458; => f8
 &#xF8; => &#xF8;
-CYRILLIC SMALL LETTER LJE: &#x459; => 2623783435393b
+CYRILLIC SMALL LETTER LJE: &#x459; => f9
 &#xF9; => &#xF9;
-CYRILLIC SMALL LETTER NJE: &#x45A; => 2623783435413b
+CYRILLIC SMALL LETTER NJE: &#x45A; => fa
 &#xFA; => &#xFA;
-CYRILLIC SMALL LETTER TSHE: &#x45B; => 2623783435423b
+CYRILLIC SMALL LETTER TSHE: &#x45B; => fb
 &#xFB; => &#xFB;
-CYRILLIC SMALL LETTER KJE: &#x45C; => 2623783435433b
+CYRILLIC SMALL LETTER KJE: &#x45C; => fc
 &#xFC; => &#xFC;
 SECTION SIGN: &#xA7; => fd
 &#xFD; => &#xFD;
-CYRILLIC SMALL LETTER SHORT U: &#x45E; => 2623783435453b
+CYRILLIC SMALL LETTER SHORT U: &#x45E; => fe
 &#xFE; => &#xFE;
-CYRILLIC SMALL LETTER DZHE: &#x45F; => 2623783435463b
+CYRILLIC SMALL LETTER DZHE: &#x45F; => ff
 &#xFF; => &#xFF;
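
The old expected values in this test, such as 2623783435313b, read as the
lowercase hex of the bytes of the entity text itself ("&#x451;"), which
suggests the entity was left unconverted while the mapping failed. The new
values are the hex of the single ISO-8859-5 byte. The sketch below shows
that relationship, assuming the test prints a hex dump of the result;
`hex_encode` is a hypothetical helper written for this example, not part
of the test suite.

```c
#include <string.h>

/* Hypothetical helper: writes the lowercase hex of each byte of `s` into
 * `out`. `out` must hold at least 2 * strlen(s) + 1 bytes. */
static void hex_encode(const char *s, char *out)
{
    static const char digits[] = "0123456789abcdef";
    size_t n = strlen(s);
    for (size_t i = 0; i < n; i++) {
        unsigned char c = (unsigned char)s[i];
        out[2 * i] = digits[c >> 4];       /* high nibble */
        out[2 * i + 1] = digits[c & 0x0F]; /* low nibble */
    }
    out[2 * n] = '\0';
}
```

Encoding the seven-character string "&#x451;" this way gives
2623783435313b, while encoding the single byte 0xF1 gives f1.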