205 Commits

Author SHA1 Message Date
Anatol Belski
0413a6e7a3 fix datatype mismatches 2014-10-23 10:30:04 +02:00
Anatol Belski
2be8fdcfd6 updated the comment to charset_hint 2014-09-19 19:45:38 +02:00
Johannes Schlüter
d0cb715373 s/PHP 5/PHP 7/ 2014-09-19 18:33:14 +02:00
Anatol Belski
6fab907920 there can be only one ... of the identical expressions 2014-09-19 09:26:07 +02:00
Anatol Belski
3234480827 first shot to make 's' work with size_t 2014-08-27 20:49:31 +02:00
Xinchen Hui
0a712e1913 Unused variable 2014-08-26 11:49:32 +08:00
Anatol Belski
b9514bb8fd master renames phase 6 2014-08-25 21:26:42 +02:00
Anatol Belski
4d997f63d9 master renames phase 3 2014-08-25 20:22:49 +02:00
Anatol Belski
c3e3c98ec6 master renames phase 1 2014-08-25 19:24:55 +02:00
Anatol Belski
745a71be33 yet more fixes to zpp 2014-08-20 14:46:14 +02:00
Anatol Belski
f2182ab845 some more pure naming replacements 2014-08-17 21:16:27 +02:00
Anatol Belski
5bb25776a0 further fixes on core 2014-08-16 15:34:04 +02:00
Anatol Belski
7534bf125a fix set_time_limit, substr and some more 2014-08-16 14:46:31 +02:00
Anatol Belski
1169de3e61 fix some cases with fast zpp 2014-08-16 14:00:02 +02:00
Anatol Belski
cb25136f4e fix macros in the 5 basic extensions 2014-08-16 11:37:14 +02:00
Dmitry Stogov
27f38798a1 Fast parameter parsing API
This API is experimental. It may be changed or removed.
It should be used only for really often used functions.
(Keep the original parsing code and wrap usage with #ifndef FAST_ZPP)
2014-07-11 16:32:20 +04:00
Dmitry Stogov
cd4b4dfc4d Merge branch 'master' into refactoring2
Conflicts:
	Zend/zend_hash.c
	ext/date/php_date.c
2014-05-05 13:02:43 +04:00
Anatol Belski
0d5121a3c7 fixed ZEND_DEBUG usage 2014-05-05 00:50:51 +02:00
Dmitry Stogov
f9927a6c97 Merge mainstream 'master' branch into refactoring
During merge I had to revert:
	Nikita's patch for php_splice() (it probably needs to be applied again)
	Bob Weinand's patches related to constant expression handling (we need to review them carefully)
	I also reverted all our attempts to support sapi/phpdbg (we didn't test it anyway)

Conflicts:
	Zend/zend.h
	Zend/zend_API.c
	Zend/zend_ast.c
	Zend/zend_compile.c
	Zend/zend_compile.h
	Zend/zend_constants.c
	Zend/zend_exceptions.c
	Zend/zend_execute.c
	Zend/zend_execute.h
	Zend/zend_execute_API.c
	Zend/zend_hash.c
	Zend/zend_highlight.c
	Zend/zend_language_parser.y
	Zend/zend_language_scanner.c
	Zend/zend_language_scanner_defs.h
	Zend/zend_variables.c
	Zend/zend_vm_def.h
	Zend/zend_vm_execute.h
	ext/date/php_date.c
	ext/dom/documenttype.c
	ext/hash/hash.c
	ext/iconv/iconv.c
	ext/mbstring/tests/zend_multibyte-10.phpt
	ext/mbstring/tests/zend_multibyte-11.phpt
	ext/mbstring/tests/zend_multibyte-12.phpt
	ext/mysql/php_mysql.c
	ext/mysqli/mysqli.c
	ext/mysqlnd/mysqlnd_reverse_api.c
	ext/mysqlnd/php_mysqlnd.c
	ext/opcache/ZendAccelerator.c
	ext/opcache/zend_accelerator_util_funcs.c
	ext/opcache/zend_persist.c
	ext/opcache/zend_persist_calc.c
	ext/pcre/php_pcre.c
	ext/pdo/pdo_dbh.c
	ext/pdo/pdo_stmt.c
	ext/pdo_pgsql/pgsql_driver.c
	ext/pgsql/pgsql.c
	ext/reflection/php_reflection.c
	ext/session/session.c
	ext/spl/spl_array.c
	ext/spl/spl_observer.c
	ext/standard/array.c
	ext/standard/basic_functions.c
	ext/standard/html.c
	ext/standard/mail.c
	ext/standard/php_array.h
	ext/standard/proc_open.c
	ext/standard/streamsfuncs.c
	ext/standard/user_filters.c
	ext/standard/var_unserializer.c
	ext/standard/var_unserializer.re
	main/php_variables.c
	sapi/phpdbg/phpdbg.c
	sapi/phpdbg/phpdbg_bp.c
	sapi/phpdbg/phpdbg_frame.c
	sapi/phpdbg/phpdbg_help.c
	sapi/phpdbg/phpdbg_list.c
	sapi/phpdbg/phpdbg_print.c
	sapi/phpdbg/phpdbg_prompt.c
2014-04-26 00:32:51 +04:00
Dmitry Stogov
f9b26bc39a Cleanup (2-nd round) 2014-04-15 21:56:30 +04:00
Dmitry Stogov
050d7e38ad Cleanup (1-st round) 2014-04-15 15:40:40 +04:00
Yasuo Ohgaki
a84e5dc37d Remove unneeded string copy.
Allow setting '' (empty string) values for internal/input/output_encoding for better compatibility, i.e. runtime INI value changes.
More compliance with the RFC. Improve/add encoding handling tests, i.e. detect the encoding rather than setting it automagically.
2014-03-27 17:20:57 +09:00
Yasuo Ohgaki
e1fe76f28a Add default_charset handling 2014-03-20 10:50:32 +09:00
Xinchen Hui
4fb57d7f4f Fixed wrong size of key length 2014-02-24 16:24:08 +08:00
Dmitry Stogov
2b9b9afa7a Use better data structures (incomplete) 2014-02-17 17:59:18 +04:00
Dmitry Stogov
40e053e7f3 Use better data structures (incomplete) 2014-02-13 17:54:23 +04:00
Yasuo Ohgaki
cbd108abf1 Implement RFC https://wiki.php.net/rfc/default_encoding 2014-02-13 11:54:52 +09:00
Xinchen Hui
c081ce628f Bump year 2014-01-03 11:08:10 +08:00
Christopher Jones
9ad97cd489 Reduce (some) compile noise of 'unused variable' and 'may be used uninitialized' warnings. 2013-08-14 20:36:50 -07:00
Gustavo Lopes
77ee200097 Fix bug #64011 (get_html_translation_table())
get_html_translation_table() with encoding ISO-8859-1 and HTMLENTITIES
was broken. Only entities for characters U+0000 to U+0040 were being
included in the result.
2013-01-18 12:10:27 +01:00
Xinchen Hui
0a7395e009 Happy New Year 2013-01-01 16:28:54 +08:00
Gustavo André dos Santos Lopes
cfdd6c5788 MFH: 7dcada1 for 5.4
- Fixed possible unsigned int wrap around in html.c. Note that 5.3 has the same
  (potential) problem; even though the code is substantially different, the
  variable name and the fashion it was incremented was kept.
2012-03-19 16:36:21 +00:00
Gustavo André dos Santos Lopes
ed98579924 - Fixed bug #61374: html_entity_decode tries to decode code points that don't
exist in ISO-8859-1.
2012-03-13 18:08:30 +00:00
Gustavo André dos Santos Lopes
d4cf399cc4 - Merge r323056 (see bug #60965). 2012-02-05 09:59:33 +00:00
Felipe Pena
4e19825281 - Year++ 2012-01-01 13:15:04 +00:00
Gustavo André dos Santos Lopes
79bb42548d - Less GCC warnings; code less readable, yay!
- Fixed html_tables.h generation in 64-bit archs.
- Closes bug #55394 - Patch to suppress initialization warnings in html.c
#signed/unsigned mismatches for another day
#regenerated tables on another commit
2011-08-31 05:45:02 +00:00
Xinchen Hui
5540b64a3d Eliminated compiler's warnings 2011-08-10 11:59:11 +00:00
Gustavo André dos Santos Lopes
a61534eab8 - Elided unused argument in internal linkage function. 2011-08-09 00:40:45 +00:00
Gustavo André dos Santos Lopes
547a96090f - Fixed bug #54332 (trunk only, null pointer deref due to information loss on long to int conversion)
- Fixed some int* pointers being passed as size_t*.
2011-03-20 15:15:08 +00:00
Gustavo André dos Santos Lopes
4a946a91e5 - Fixed CHARSET_UNICODE_COMPAT (ISO-8859-1 is compatible in the relevant sense).
- Fixed usage of zend_multibyte_get_internal_encoding (its return cannot be
  cast to char*).
- Change tests to reflect that charset detection now relies on
  internal_encoding, not on current_internal_encoding.
  NOTE: This fixes the changes in rev 306077, but it remains that that change
  introduced a BC break. I assumed it was intentional
2011-01-25 10:57:07 +00:00
Felipe Pena
0203cc3d44 - Year++ 2011-01-01 02:17:06 +00:00
Dmitry Stogov
755c2cd0d8 Removed compile time dependency from ext/mbstring 2010-12-08 11:27:34 +00:00
Pierrick Charron
71dfe80e05 Remove unused variables 2010-11-17 17:55:18 +00:00
Gustavo André dos Santos Lopes
e69b1ff2c4 - Fixed bug #49687 (utf8_decode vulnerabilities and deficiencies in the number
of reported malformed sequences). (Gustavo)
#Made a public interface for get_next_char/utf-8 in trunk to use in utf8_decode.
#In PHP 5.3, trunk's get_next_char was copied to xml.c because 5.3's
#get_next_char is different and is not prepared to recover appropriately from
#errors.
2010-10-27 18:13:25 +00:00
Ilia Alshanetsky
18fa045e75 Code cleanup & CS 2010-10-25 16:46:55 +00:00
Gustavo André dos Santos Lopes
20e2c5fc33 - Fixed uninitialized and 1 character short local variable. 2010-10-24 21:19:04 +00:00
Gustavo André dos Santos Lopes
91727cb844 - Completed rewrite of html.c. Except for determine_charset, almost nothing
remains.
- Fixed bug on determine_charset that was preventing correct detection in
  combination with internal mbstring encoding "none", "pass" or "auto".
- Added profiles for entity encode/decode for HTML 4.01, XHTML 1.0, XML 1.0
  and HTML 5. Added the constants ENT_HTML401, ENT_XML1, ENT_XHTML and
  ENT_HTML5.
- htmlentities()/htmlspecialchars(), when told not to double encode, verify
  the correctness of the existing entities more thoroughly.
  It is checked whether the numerical entity represents a valid unicode code
  point (number is between 0 and 0x10FFFF). If using the flag ENT_DISALLOWED,
  it is also checked whether that numerical entity is valid in selected
  document. In HTML 4.01, all the numerical entities that represent a Unicode
  code point (up to U+10FFFF) are valid, but that's not the case with other
  document types. If the entity is not valid, & is encoded to &amp;.
  For named entities, the check is also more thorough. While before the only
  check would be to determine if the entity was constituted by alphanumeric
  characters, now it is checked whether that entity is necessarily defined for
  the target document type. Otherwise, & is encoded to &amp;.
- For html_entity_decode(), only valid numerical and named entities (as defined
  above for htmlentities()/htmlspecialchars() + !double_encode) are decoded.
  But there is in this case one additional check. Entities that represent
  non-SGML or otherwise invalid characters are not decoded. Note that, in
  HTML5, U+000D is a valid literal character, but the entity &#x0D; is not
  valid and is therefore not decoded.
- The hash tables lazily created for decoding in html_entity_decode() that were
  added recently were substituted by static hash tables. Instead of 1 hash
  table per encoding, there's only one hash table per document type defined in
  terms of unicode code points. This means that for charsets other than UTF-8
  and ISO-8859-1, a conversion to unicode code points is necessary before
  decoding.
- On the encoding side, the ad hoc ranges of entities of the translation
  tables, which mapped (in general) non-unicode code points to HTML entities
  were replaced by three-stage tables for HTML 4 and HTML 5. These mapping
  tables are defined only in terms of unicode code points, so a conversion
  is necessary for charsets other than UTF-8 and ISO-8859-1. Even so, the
  multi-stage table is much faster than the previous method, by a factor
  of 5; the conversion to unicode is a small penalty because it's just a
  simple table lookup.
  XML 1.0/htmlspecialchars() uses a simple table instead of a three-stage
  table.
- Added the flag ENT_SUBSTITUTE, which makes htmlentities()/htmlspecialchars()
  replace the invalid multibyte sequences with U+FFFD (UTF-8) or &#xFFFD;
  (other encodings).
- Added the flag ENT_DISALLOWED. Implements FR #52860. Characters that cannot
  appear literally are replaced by U+FFFD (UTF-8) or &#xFFFD; (otherwise).
  An alternative implementation would be to encode those characters into
  numerical entities, but that would only work in HTML 4.01 due to limitations
  on the values of numerical entities in other document types. See also the
  effects on htmlentities()/htmlspecialchars() with !double_encode above.
2010-10-24 15:01:02 +00:00
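
The three-stage lookup described in commit 91727cb844 can be illustrated with a minimal sketch. This is not the code from html.c: the 12/6/6 bit split, the macro names, the table names and the four sample entities are all assumptions chosen for illustration. It shows only the general technique, where a code point is split into three indices and a missing block at any stage ends the lookup early.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical split of a code point into three indices (assumed widths). */
#define STAGE1_INDEX(cp) ((cp) >> 12)          /* top bits: 0x000..0x10F */
#define STAGE2_INDEX(cp) (((cp) >> 6) & 0x3F)  /* middle 6 bits */
#define STAGE3_INDEX(cp) ((cp) & 0x3F)         /* low 6 bits */
#define STAGE1_SIZE 0x110

typedef const char *stage3_row[64];        /* entity names for 64 code points */
typedef const stage3_row *stage2_row[64];  /* 64 pointers to stage-3 rows */

/* Sample data only: four entities, everything else is left NULL. */
static const stage3_row row_0080 = { [0x20] = "nbsp", [0x29] = "copy" };
static const stage3_row row_00C0 = { [0x29] = "eacute" };
static const stage3_row row_2080 = { [0x2C] = "euro" };

static const stage2_row block_0000 = { [0x02] = &row_0080, [0x03] = &row_00C0 };
static const stage2_row block_2000 = { [0x02] = &row_2080 };

static const stage2_row *const stage1[STAGE1_SIZE] = {
    [0x00] = &block_0000,
    [0x02] = &block_2000,
};

/* Returns the entity name for cp, or NULL when no entity is defined. */
const char *entity_name_for(uint32_t cp)
{
    if (cp > 0x10FFFF) {
        return NULL; /* not a Unicode code point */
    }
    assert(STAGE1_INDEX(cp) < STAGE1_SIZE);

    const stage2_row *s2 = stage1[STAGE1_INDEX(cp)];
    if (s2 == NULL) {
        return NULL; /* whole 4096-code-point block has no entities */
    }
    const stage3_row *s3 = (*s2)[STAGE2_INDEX(cp)];
    if (s3 == NULL) {
        return NULL; /* whole 64-code-point row has no entities */
    }
    return (*s3)[STAGE3_INDEX(cp)];
}
```

In this sketch a lookup costs at most three array accesses, and blocks without entities take no space beyond a NULL pointer, which is the usual reason for choosing a multi-stage table over a sorted range list.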
Gustavo André dos Santos Lopes
bfcb754eae - Fixed get_next_char(), used by htmlentities/htmlspecialchars, accepting
certain ill-formed UTF-8 sequences.
2010-10-14 19:14:06 +00:00
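
Commit bfcb754eae does not say which ill-formed sequences get_next_char() used to accept. As a rough illustration of what a strict decoder has to reject, the sketch below refuses overlong encodings, surrogates, values above U+10FFFF, stray continuation bytes and truncated input. The function name utf8_next, the UTF8_INVALID sentinel and the "skip one byte on error" policy are assumptions made for this example, not PHP's actual interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define UTF8_INVALID 0xFFFFFFFFu /* hypothetical error sentinel */

/*
 * Decodes one code point starting at s[*pos] and advances *pos past it.
 * On ill-formed input, returns UTF8_INVALID and advances *pos by one byte.
 */
uint32_t utf8_next(const unsigned char *s, size_t len, size_t *pos)
{
    assert(s != NULL && pos != NULL && *pos < len);

    size_t i = *pos;
    unsigned char c = s[i];
    uint32_t cp, min;
    size_t need;

    if (c < 0x80) {                         /* ASCII */
        *pos = i + 1;
        return c;
    } else if (c >= 0xC2 && c <= 0xDF) {    /* 2 bytes; C0/C1 are always overlong */
        need = 1; cp = c & 0x1F; min = 0x80;
    } else if ((c & 0xF0) == 0xE0) {        /* 3 bytes */
        need = 2; cp = c & 0x0F; min = 0x800;
    } else if (c >= 0xF0 && c <= 0xF4) {    /* 4 bytes; F5..FF exceed U+10FFFF */
        need = 3; cp = c & 0x07; min = 0x10000;
    } else {                                /* stray continuation or invalid lead */
        *pos = i + 1;
        return UTF8_INVALID;
    }

    if (len - i - 1 < need) {               /* truncated sequence */
        *pos = i + 1;
        return UTF8_INVALID;
    }
    for (size_t k = 1; k <= need; k++) {
        if ((s[i + k] & 0xC0) != 0x80) {    /* not a continuation byte */
            *pos = i + 1;
            return UTF8_INVALID;
        }
        cp = (cp << 6) | (uint32_t)(s[i + k] & 0x3F);
    }
    /* Reject overlong forms, surrogates and out-of-range values. */
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
        *pos = i + 1;
        return UTF8_INVALID;
    }
    *pos = i + need + 1;
    return cp;
}
```

Advancing by a single byte after an error lets the caller resynchronise on the next possible lead byte; a caller implementing something like ENT_SUBSTITUTE could emit U+FFFD at that point and continue.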
Gustavo André dos Santos Lopes
4de6c3a948 - Added a 3rd parameter to get_html_translation_table. It now takes a charset
hint, like htmlentities et al.
- Fixed bug #49407 (get_html_translation_table doesn't handle UTF-8).
- Fixed bug #25927 (get_html_translation_table calls the ' &#39; instead of
  &#039;).
- Fixed tests for get_html_translation_table and unified the Windows and
  non-Windows versions of the tests.
2010-10-12 02:51:11 +00:00
Gustavo André dos Santos Lopes
f4a896c209 - PHP uses a big endian representation when it converts the
code unit sequences to integers so as to store the entity
  maps. Code in traverse_for_entities assumed little
  endian. Fixed.
  (in practice, due to the absence of unicode and entity
  mappings for multi-byte encodings -- except UTF-8 --, this
  doesn't matter, so the relevant code was commented out for
  performance reasons).
2010-10-11 22:26:10 +00:00