Commit Graph

70 Commits

Author SHA1 Message Date
Ilija Tovilo
6335264c07
Fix header errors when parsed standalone (#14272)
This is annoying for multiple reasons:

1. LSPs can show many errors for these files.
2. LSP can stop parsing these files completely if there are too many errors,
   resulting in spotty LSP features.
2024-05-20 22:30:38 +02:00
Christoph M. Becker
bf1cfc0753
Revert GH-10300
Cf. <https://github.com/php/php-src/pull/10220#issuecomment-1383739816>.

This reverts commit 68ada76f9a.
his reverts commit 45384c6e20.
This reverts commit ef7fbfd710.
This reverts commit 9b9ea0d7c6.
This reverts commit f15747c26b.
This reverts commit e883ba93c4.
This reverts commit 7e87551c37.
This reverts commit 921274d2b8.
This reverts commit fc1f528e5e.
This reverts commit 0961715cda.
This reverts commit a93f264526.
This reverts commit 72dd94e1c6.
This reverts commit 29b2dc8964.
This reverts commit 05c7653bba.
This reverts commit 5190e5c260.
This reverts commit 6b55bf228c.
This reverts commit 184b4a12d3.
This reverts commit 4c31b7888a.
This reverts commit d44e9680f0.
This reverts commit 4069a5c43f.
2023-01-16 12:22:54 +01:00
Max Kellermann
6b55bf228c Zend/zend_language_scanner: include cleanup 2023-01-15 15:07:58 +00:00
Nikita Popov
cc3e03c512 Fix parsing of semi-reserved tokens at offset > 4 GB
To avoid increasing the size of parser stack elements by storing
size_t offset and length, this instead only stores the start
offset (or rather pointer now) and determines the length of the
identifier in zend_lex_tstring.
2021-01-25 14:37:36 +01:00
Nikita Popov
3370b5fd87 Accept zend_string in zend_prepare_string_for_scanning 2021-01-21 10:31:32 +01:00
Nikita Popov
3e01f5afb1 Replace zend_bool uses with bool
We're starting to see a mix between uses of zend_bool and bool.
Replace all usages with the standard bool type everywhere.

Of course, zend_bool is retained as an alias.
2021-01-15 12:33:06 +01:00
George Peter Banyard
fa8d9b1183 Improve type declarations for Zend APIs
Voidification of Zend API which always succeeded
Use bool argument types instead of int for boolean arguments
Use bool return type for functions which return true/false (1/0)
Use zend_result return type for functions which return SUCCESS/FAILURE as they don't follow normal boolean semantics

Closes GH-6002
2020-08-28 15:41:27 +02:00
Nikita Popov
5571765609 Forbid use of <?= as a semi-reserved identifier
One of the weirdest pieces of PHP code I've ever seen. In terms
of tokens, this gets internally translated to

    use x as y; echo as my_echo;

On master it crashes because this "echo" does not have attached
identifier metadata. Make sure it is added and then reject the
use of "<?=" as an identifier inside zend_lex_tstring.

Fixes oss-fuzz #23547.
2020-06-19 09:29:58 +02:00
Nikita Popov
b03cafd19c Fix bug #77966: Cannot alias a method named "namespace"
This is a bit tricky: In this cases we have "namespace as", which
means that we will only recognize "namespace" as an identifier when
the lookahead token is already at the "as". This means that
zend_lex_tstring picks up the wrong identifier.

We solve this by actually assigning the identifier as the semantic
value on the parser stack -- as in almost all cases we will not
actually need the identifier, this is just an (offset, size)
reference, not a copy of the string.

Additionally, we need to teach the lexer feedback mechanism used
by tokenizer TOKEN_PARSE mode to apply feedback to something
other than the very last token. To that purpose we pass through
the token text and check the tokens in reverse order to find the
right one.

Closes GH-5668.
2020-06-08 12:55:14 +02:00
Alex Dowad
80598f1250 Syntax errors caused by unclosed {, [, ( mention specific location
Aside from a few very specific syntax errors for which detailed exceptions are
thrown, generally PHP just emits the default error messages generated by bison on syntax
error. These messages are very uninformative; they just say "Unexpected ... at line ...".

This is most problematic with constructs which can span an arbitrary number of lines, such
as blocks of code delimited by { }, 'if' conditions delimited by ( ), and so on. If a closing
delimiter is missed, the block will run for the entire remainder of the source file (which
could be thousands of lines), and then at the end, a parse error will be thrown with the
dreaded words: "Unexpected end of file".

Therefore, track the positions of opening and closing delimiters and ensure that they match
up correctly. If any mismatch or missing delimiter is detected, immediately throw a parse
error which points the user to the offending line. This is best done in the *lexer* and not
in the parser.

Thanks to Nikita Popov and George Peter Banyard for suggesting improvements.

Fixes bug #79368.
Closes GH-5364.
2020-04-14 11:22:23 +02:00
twosee
61f78de486 Constify some char* arguments or return values of ZEND_API
Closes GH-4247.
2019-06-12 16:49:32 +02:00
Peter Kokot
92ac598aab Remove local variables
This patch removes the so called local variables defined per
file basis for certain editors to properly show tab width, and
similar settings. These are mainly used by Vim and Emacs editors
yet with recent changes the once working definitions don't work
anymore in Vim without custom plugins or additional configuration.
Neither are these settings synced across the PHP code base.

A simpler and better approach is EditorConfig and fixing code
using some code style fixing tools in the future instead.

This patch also removes the so called modelines for Vim. Modelines
allow Vim editor specifically to set some editor configuration such as
syntax highlighting, indentation style and tab width to be set in the
first line or the last 5 lines per file basis. Since the php test
files have syntax highlighting already set in most editors properly and
EditorConfig takes care of the indentation settings, this patch removes
these as well for the Vim 6.0 and newer versions.

With the removal of local variables for certain editors such as
Emacs and Vim, the footer is also probably not needed anymore when
creating extensions using ext_skel.php script.

Additionally, Vim modelines for setting php syntax and some editor
settings has been removed from some *.phpt files.  All these are
mostly not relevant for phpt files neither work properly in the
middle of the file.
2019-02-03 21:03:00 +01:00
Zeev Suraski
a81202ac49 Adios, yearly copyright ranges 2019-01-30 11:48:28 +01:00
Zeev Suraski
54dc07f3dc Update email addresses. We're still @Zend, but future proofing it... 2018-11-01 17:20:07 +02:00
Peter Kokot
8d3f8ca12a Remove unused Git attributes ident
The $Id$ keywords were used in Subversion where they can be substituted
with filename, last revision number change, last changed date, and last
user who changed it.

In Git this functionality is different and can be done with Git attribute
ident. These need to be defined manually for each file in the
.gitattributes file and are afterwards replaced with 40-character
hexadecimal blob object name which is based only on the particular file
contents.

This patch simplifies handling of $Id$ keywords by removing them since
they are not used anymore.
2018-07-25 00:53:25 +02:00
Thomas Punt
4887357269 Implement flexible heredoc/nowdoc syntax
RFC: https://wiki.php.net/rfc/flexible_heredoc_nowdoc_syntaxes

* The ending label no longer has to be followed by a semicolon or
  newline. Any non-label character is fine.
* The ending label may be indented. The indentation will be stripped
  from all lines in the heredoc/nowdoc string.

Lexing of heredoc strings performs a scan-ahead to determine the
indentation of the ending label, so that the correct amount of
indentation can be removed when calculting the semantic values for
use by the parser. This makes the implementation quite a bit more
complicated than we would like :/
2018-04-13 21:35:37 +02:00
Xinchen Hui
a6519d0514 year++ 2018-01-02 12:57:58 +08:00
Anatol Belski
bc5811f361 further sync for vim mode lines 2017-07-04 18:12:45 +02:00
Sammy Kaye Powers
9e29f841ce Update copyright headers to 2017 2017-01-02 09:30:12 -06:00
Anatol Belski
b204b3abd1 further normalizations, uint vs uint32_t
fix merge mistake

yet one more replacement run
2016-11-26 17:29:01 +01:00
Nikita Popov
07af6ba898 Make sure TOKEN_PARSE mode is thread safe
Introduce an on_event_context passed to the on_event hook. Use this
context to pass along the token array. Previously this was stored
in a non-tls global :/
2016-07-23 00:00:13 +02:00
Nikita Popov
d5a38280be Drop dup declare with inconsistent linkage
This is already declared in zend_stream.h as ZEND_API.
2016-05-02 11:56:28 +02:00
Xinchen Hui
97a9470d97 bump year which is missed in rev 49493a2 2016-01-02 17:56:11 +08:00
Márcio Almada
110759386e ext tokenizer port + cleanup unused lexer states
we basically added a mechanism to store the token stream during parsing
and exposed the entire parser stack on the tokenizer extension through
an opt in flag: token_get_all($src, TOKEN_PARSE).

this change allows easy future language enhancements regarding context
aware parsing & scanning without further maintance on the tokenizer
extension while solves known inconsistencies "parseless" tokenizer
extension has when it handles `__halt_compiler()` presence.
2015-04-30 03:03:29 -03:00
Dmitry Stogov
b494aa0ba0 Fixed compiler reenterability 2015-01-22 20:39:34 +03:00
Xinchen Hui
fc33f52d8c bump year 2015-01-15 23:27:30 +08:00
Stanislav Malyshev
b7a7b1a624 trailing whitespace removal 2015-01-10 15:07:38 -08:00
Anatol Belski
bdeb220f48 first shot remove TSRMLS_* things 2014-12-13 23:06:14 +01:00
Dmitry Stogov
f4cfaf36e2 Use better data structures (incomplete) 2014-02-10 10:04:30 +04:00
Xinchen Hui
c081ce628f Bump year 2014-01-03 11:08:10 +08:00
Stanislav Malyshev
c793a65690 Merge branch 'PHP-5.4' into PHP-5.5
* PHP-5.4:
  non living code related typo fixes

Conflicts:
	Zend/zend_compile.c
2013-08-04 16:06:24 -07:00
Veres Lajos
8d86597d73 non living code related typo fixes 2013-08-04 16:05:36 -07:00
Xinchen Hui
a666285bc2 Happy New Year 2013-01-01 16:37:09 +08:00
Xinchen Hui
0a7395e009 Happy New Year 2013-01-01 16:28:54 +08:00
Nikita Popov
4cf90e06c9 Fix lexing of nested heredoc strings in token_get_all()
This fixes bug #60097.

Before two global variables CG(heredoc) and CG(heredoc_len) were used to
track the current heredoc label. In order to support nested heredoc
strings the *previous* heredoc label was assigned as the token value of
T_START_HEREDOC and the language_parser.y assigned that to CG(heredoc).

This created a dependency of the lexer on the parser. Thus the
token_get_all() function, which accesses the lexer directly without
also running the parser, was not able to tokenize nested heredoc strings
(and leaked memory). Same applies for the source-code highlighting
functions.

The new approach is to maintain a heredoc_label_stack in the lexer, which
contains all active heredoc labels.

As it is no longer required, T_START_HEREDOC and T_END_HEREDOC now don't
carry a token value anymore.

In order to make the work with zend_ptr_stack in this context more
convenient I added a new function zend_ptr_stack_top(), which retrieves the
top element of the stack (similar to zend_stack_top()).
2012-03-31 21:53:30 +02:00
Felipe Pena
8775a37559 - Year++ 2012-01-01 13:15:04 +00:00
Felipe Pena
4e19825281 - Year++ 2012-01-01 13:15:04 +00:00
Dmitry Stogov
4a25a7740d Fixed ZE specific compile warnings (Bug #55629) 2011-09-13 13:29:35 +00:00
Dmitry Stogov
e43ff1359e Fixed ZE specific compile warnings (Bug #55629) 2011-09-13 13:29:35 +00:00
Felipe Pena
0203cc3d44 - Year++ 2011-01-01 02:17:06 +00:00
Moriyoshi Koizumi
a304a0fc04 - Avoid allocating extra buffers. This makes parsing with zend.multibyte enabled as fast as with it disabled. 2010-12-20 03:16:09 +00:00
Moriyoshi Koizumi
bbf3d43c1e * Refactor zend_multibyte facility.
Now mbstring.script_encoding is superseded by zend.script_encoding.
2010-12-19 16:36:37 +00:00
Dmitry Stogov
ab93d8c621 Added multibyte suppport by default. Previosly php had to be compiled with --enable-zend-multibyte. Now it can be enabled or disabled throug zend.multibyte directive in php.ini 2010-11-24 05:41:23 +00:00
Sebastian Bergmann
d2281d1dff sed -i "s#1998-2009#1998-2010#g" **/*.c **/*.h **/*.php 2010-01-05 20:46:53 +00:00
Sebastian Bergmann
08659c2dcd MFH: Bump copyright year, 3 of 3. 2008-12-31 11:15:49 +00:00
Moriyoshi Koizumi
4f42ed39c0 - Revived zend multibyte 2008-07-24 22:21:41 +00:00
Rui Hirokawa
c3286f32ef implemented again zend-multibyte for PHP 5.3 2008-06-29 08:21:35 +00:00
Marcus Boerger
af316021e8 - Rewrite scanner to be based on re2c instead of flex
The full patch is available as:
  http://php.net/~helly/php-re2c-5.3-20080316.diff.txt
  This is against php-re2c repository version 98
  An older patch against version 97 is available under:
  http://php.net/~helly/php-re2c-97-20080316.diff.txt
2008-03-16 21:06:55 +00:00
Sebastian Bergmann
d1dded8751 MFH: Bump copyright year, 2 of 2. 2007-12-31 07:17:19 +00:00
Sebastian Bergmann
4223aa4d5e MFH: Bump year. 2007-01-01 09:36:18 +00:00