Don't treat "readonly" as a keyword if followed by "(". This
allows using it as a global function name. In particular, this
function has been used by WordPress.
This does not allow other uses of "readonly", in particular it
cannot be used as a class name, unlike "enum". The reason is that
we'd still have to recognize it as a keyword when using in a type
position:
class Test {
public ReadOnly $foo;
}
This should now be interpreted as a readonly property, not as a
read-write property with type `ReadOnly`. As such, a class with
name `ReadOnly`, while unambiguous in most other circumstances,
would not be usable as property or parameter type. For that
reason, we do not support it at all.
Add support for readonly properties, for which only a single
initializing assignment from the declaring scope is allowed.
RFC: https://wiki.php.net/rfc/readonly_properties_v2
Closes GH-7089.
When a method is inherited, the static variables will now always
use the initial values, rather than the values at the time of
inheritance. As such, behavior no longer depends on whether
inheritance happens before or after a method has been called.
This is implemented by always keeping static_variables as the
original values, and static_variables_ptr as the modified copy.
Closes GH-6705.
To avoid increasing the size of parser stack elements by storing
size_t offset and length, this instead only stores the start
offset (or rather pointer now) and determines the length of the
identifier in zend_lex_tstring.
We're starting to see a mix between uses of zend_bool and bool.
Replace all usages with the standard bool type everywhere.
Of course, zend_bool is retained as an alias.
Don't truncate the file length to unsigned int...
I have no idea whether that fully fixes the problem because the
process gets OOM killed before finishing, but at least the
immediate parse error is gone now.
For php-ast interning the file name is an effective memory leak,
see php-ast#134.
I don't think there's any reason to do this. At some point this
was needed due to bugs in the interned string mechanism that
caused issues if the string was later interned, e.g. through a
__FILE__ reference. These issues have since been resolved.
In conjunction with the filenames_table removal in c4016ecd44
this means that filenames now need to be refcounted like normal
strings. In particular the filename reference in op_arrays and CEs
are refcounted.
Voidification of Zend API which always succeeded
Use bool argument types instead of int for boolean arguments
Use bool return type for functions which return true/false (1/0)
Use zend_result return type for functions which return SUCCESS/FAILURE as they don't follow normal boolean semantics
Closes GH-6002
The primary issue was already resolved in 7c3e487289,
but the particular example used in this bug report ran into an
additional issue on PHP 8, because I forgot to drop a number of
zend_bailout calls when switch require failure to throw.
Unconditionally strip shebang lines when using the CLI SAPI,
independently of whether they occur in the primary or non-primary
script. It's unlikely that someone intentionally wants to print
that shebang line when including a script, and this regularly
causes issues when scripts are used in multiple contexts, e.g.
for direct invocation and as a phar bootstrap.
Don't include a trailing newline in T_COMMENT tokens, instead leave
it for a following T_WHITESPACE token. The newline does not belong
to the comment logically, and this makes for an ugly special case,
as other tokens do not include trailing newlines.
Whitespace-sensitive tooling will want to either forward or backward
emulate this change.
Closes GH-5182.
One of the weirdest pieces of PHP code I've ever seen. In terms
of tokens, this gets internally translated to
use x as y; echo as my_echo;
On master it crashes because this "echo" does not have attached
identifier metadata. Make sure it is added and then reject the
use of "<?=" as an identifier inside zend_lex_tstring.
Fixes oss-fuzz #23547.
This is a bit tricky: In this cases we have "namespace as", which
means that we will only recognize "namespace" as an identifier when
the lookahead token is already at the "as". This means that
zend_lex_tstring picks up the wrong identifier.
We solve this by actually assigning the identifier as the semantic
value on the parser stack -- as in almost all cases we will not
actually need the identifier, this is just an (offset, size)
reference, not a copy of the string.
Additionally, we need to teach the lexer feedback mechanism used
by tokenizer TOKEN_PARSE mode to apply feedback to something
other than the very last token. To that purpose we pass through
the token text and check the tokens in reverse order to find the
right one.
Closes GH-5668.
Aside from a few very specific syntax errors for which detailed exceptions are
thrown, generally PHP just emits the default error messages generated by bison on syntax
error. These messages are very uninformative; they just say "Unexpected ... at line ...".
This is most problematic with constructs which can span an arbitrary number of lines, such
as blocks of code delimited by { }, 'if' conditions delimited by ( ), and so on. If a closing
delimiter is missed, the block will run for the entire remainder of the source file (which
could be thousands of lines), and then at the end, a parse error will be thrown with the
dreaded words: "Unexpected end of file".
Therefore, track the positions of opening and closing delimiters and ensure that they match
up correctly. If any mismatch or missing delimiter is detected, immediately throw a parse
error which points the user to the offending line. This is best done in the *lexer* and not
in the parser.
Thanks to Nikita Popov and George Peter Banyard for suggesting improvements.
Fixes bug #79368.
Closes GH-5364.
Closes GH-5353. From now on, PHP will have reflection information
about default values of parameters of internal functions.
Co-authored-by: Nikita Popov <nikita.ppv@gmail.com>