Fix incorrect check in cs_8559_5 in map_from_unicode()

The condition `code == 0x0450 || code == 0x045D` is always false because
the enclosing range check on code stops at 0x044F.
According to the BMP coverage table for ISO-8859-5 in the Encoding Standard
(https://encoding.spec.whatwg.org/iso-8859-5-bmp.html), the range of
valid characters is 0x0401 - 0x045F (except for 0x040D, 0x0450, 0x045D).
Fix this by raising the upper bound of the check from 0x044F to 0x045F.
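
As a rough illustration of the corrected check, the Cyrillic branch can be
pulled out into a stand-alone function. The name cyrillic_to_8859_5 and the
0 / -1 return convention are assumptions made for this sketch; the real
logic is a branch inside map_from_unicode() and returns SUCCESS / FAILURE.

```c
#include <stddef.h>

/* Hypothetical stand-alone version of the ISO-8859-5 Cyrillic branch.
 * Returns 0 and stores the single-byte value in *res on success,
 * or -1 if the code point has no ISO-8859-5 mapping in this range. */
static int cyrillic_to_8859_5(unsigned code, unsigned *res)
{
    /* Corrected upper bound: 0x045F rather than 0x044F. */
    if (code >= 0x0401 && code <= 0x045F) {
        /* With the old bound, 0x0450 and 0x045D never reached this test. */
        if (code == 0x040D || code == 0x0450 || code == 0x045D)
            return -1;
        *res = code - 0x360; /* e.g. 0x0451 maps to 0xF1 */
        return 0;
    }
    return -1;
}
```

With the old bound of 0x044F, every code point from 0x0451 to 0x045F fell
through this branch, which matches the test expectations that change in
this commit.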

Closes GH-10399

Signed-off-by: George Peter Banyard <girgias@php.net>
Niels Dossche 2023-01-21 13:50:36 +01:00 committed by George Peter Banyard
parent b7a158a19b
commit a8c8fb2564
3 changed files with 16 additions and 15 deletions

NEWS

@@ -15,6 +15,7 @@ PHP NEWS
 - Standard:
   . Fixed bug GH-10292 (Made the default value of the first param of srand() and
     mt_srand() unknown). (kocsismate)
+  . Fix incorrect check in cs_8559_5 in map_from_unicode(). (nielsdos)
 02 Feb 2023, PHP 8.1.15


@@ -477,7 +477,7 @@ static inline int map_from_unicode(unsigned code, enum entity_charset charset, u
             *res = 0xF0; /* numero sign */
         } else if (code == 0xA7) {
             *res = 0xFD; /* section sign */
-        } else if (code >= 0x0401 && code <= 0x044F) {
+        } else if (code >= 0x0401 && code <= 0x045F) {
             if (code == 0x040D || code == 0x0450 || code == 0x045D)
                 return FAILURE;
             *res = code - 0x360;


@@ -358,47 +358,47 @@ CYRILLIC SMALL LETTER YA: &#x44F; => ef
 NUMERO SIGN: &#x2116; => f0
 &#xF0; => &#xF0;
-CYRILLIC SMALL LETTER IO: &#x451; => 2623783435313b
+CYRILLIC SMALL LETTER IO: &#x451; => f1
 &#xF1; => &#xF1;
-CYRILLIC SMALL LETTER DJE: &#x452; => 2623783435323b
+CYRILLIC SMALL LETTER DJE: &#x452; => f2
 &#xF2; => &#xF2;
-CYRILLIC SMALL LETTER GJE: &#x453; => 2623783435333b
+CYRILLIC SMALL LETTER GJE: &#x453; => f3
 &#xF3; => &#xF3;
-CYRILLIC SMALL LETTER UKRAINIAN IE: &#x454; => 2623783435343b
+CYRILLIC SMALL LETTER UKRAINIAN IE: &#x454; => f4
 &#xF4; => &#xF4;
-CYRILLIC SMALL LETTER DZE: &#x455; => 2623783435353b
+CYRILLIC SMALL LETTER DZE: &#x455; => f5
 &#xF5; => &#xF5;
-CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I: &#x456; => 2623783435363b
+CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I: &#x456; => f6
 &#xF6; => &#xF6;
-CYRILLIC SMALL LETTER YI: &#x457; => 2623783435373b
+CYRILLIC SMALL LETTER YI: &#x457; => f7
 &#xF7; => &#xF7;
-CYRILLIC SMALL LETTER JE: &#x458; => 2623783435383b
+CYRILLIC SMALL LETTER JE: &#x458; => f8
 &#xF8; => &#xF8;
-CYRILLIC SMALL LETTER LJE: &#x459; => 2623783435393b
+CYRILLIC SMALL LETTER LJE: &#x459; => f9
 &#xF9; => &#xF9;
-CYRILLIC SMALL LETTER NJE: &#x45A; => 2623783435413b
+CYRILLIC SMALL LETTER NJE: &#x45A; => fa
 &#xFA; => &#xFA;
-CYRILLIC SMALL LETTER TSHE: &#x45B; => 2623783435423b
+CYRILLIC SMALL LETTER TSHE: &#x45B; => fb
 &#xFB; => &#xFB;
-CYRILLIC SMALL LETTER KJE: &#x45C; => 2623783435433b
+CYRILLIC SMALL LETTER KJE: &#x45C; => fc
 &#xFC; => &#xFC;
 SECTION SIGN: &#xA7; => fd
 &#xFD; => &#xFD;
-CYRILLIC SMALL LETTER SHORT U: &#x45E; => 2623783435453b
+CYRILLIC SMALL LETTER SHORT U: &#x45E; => fe
 &#xFE; => &#xFE;
-CYRILLIC SMALL LETTER DZHE: &#x45F; => 2623783435463b
+CYRILLIC SMALL LETTER DZHE: &#x45F; => ff
 &#xFF; => &#xFF;
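
The old expected values in this test diff, such as 2623783435313b, are the
hex bytes of the literal text "&#x451;". That suggests the entity was left
undecoded before the fix, assuming the test prints the result as a hex
dump. A minimal sketch to check the byte values follows; hex_dump is a
hypothetical helper written for this illustration, not part of PHP.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes the lowercase hex form of every byte of
 * `in` into `out`. Returns 0 on success, -1 if `out` is too small. */
static int hex_dump(const char *in, char *out, size_t out_size)
{
    size_t len = strlen(in);

    if (out_size < 2 * len + 1)
        return -1;
    out[0] = '\0'; /* covers the empty-input case */
    for (size_t i = 0; i < len; i++)
        snprintf(out + 2 * i, 3, "%02x", (unsigned char)in[i]);
    return 0;
}
```

Dumping "&#x451;" this way gives the old expected value, while the new
expected value f1 is the single ISO-8859-5 byte 0x0451 - 0x360.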