php-src/UPGRADING.INTERNALS

325 lines
15 KiB
Plaintext

PHP 7.0 INTERNALS UPGRADE NOTES
0. Wiki Examples
1. Internal API changes
e. New data types
f. zend_parse_parameters() specs
g. sprintf() formats
h. HashTable API
i. New portable macros for large file support
j. New portable macros for integers
k. get_class_entry object handler info
l. get_class_name object handler info
m. Other portable macros info
n. ZEND_ENGINE_2 removal
o. Updated final class modifier
p. TSRM changes
q. gc_collect_cycles() is now hookable
r. PDO uses size_t for lengths
s. Resource API changes
t. Optimized strings concatenation.
u. Streams changes
v. Sort algorithm changes
2. Build system changes
a. Unix build system changes
b. Windows build system changes
3. Module changes
================
0. Wiki Examples
================
The wiki contains multiple examples and further explanations of the internal
changes. See: https://wiki.php.net/phpng-upgrading
========================
1. Internal API changes
========================
e. New data types
String
Besides the old way of accepting the strings with 's', the new 'S' ZPP spec
was introduced. It expects an argument of the type zend_string *. String lengths
in it are bound to the size_t datatype.
Integer types
Integers do no more depend on the firm 'long' type. Instead a platform
dependent integer type is used, it is called zend_long. That datatype is
defined dynamically to guarantee the consistent 64 bit support. The zval
field representing user land integer it bound to zend_long.
Signed integer is defined as zend_long, unsigned integer as zend_ulong
inside Zend.
Other datatypes
zend_off_t - portable off_t analogue
zend_stat_t - portable 'struct stat' analogue
These datatypes are declared to be portable across platforms. Thus, direct
usage of the functions like fseek, stat, etc. as well as direct usage of
off_t and struct stat is strongly not recommended. Instead the portable
macros should be used.
zend_fseek - portable fseek equivalent
zend_ftell - portable ftell equivalent
zend_lseek - portable lseek equivalent
zend_fstat - portable fstat equivalent
zend_stat - portable stat equivalent
f. zend_parse_parameters() specs
The new spec 'S' introduced, which expects an argument of type zend_string *.
The 'l' spec expects a parameter of the type zend_long, not long anymore.
The 's' spec expects parameters of the type char * and size_t, no int anymore.
g. sprintf() formats
New printf modifier 'p' was implemented to platform independently output zend_long,
zend_ulong and php_size_t datatypes. That modifier can be used with 'd', 'u', 'x' and 'o'
printf format specs with spprintf, snprintf and the wrapping printf implementations.
%pu is sufficient for both zend_ulong and php_size_t. the code using %p spec to output
pointer address might need to be enclosed into #ifdef when it unlickily followed by 'd',
'u', 'x' or 'o'.
The only exceptions are the snprintf and zend_sprintf functions yet, because in some cases
they can use the implemenations available on the system, not the PHP one. With snprintf the
macros ZEND_INT_FMT and ZEND_UINT_FMT should be used.
h. HashTable API
Datatype for array indexes was changed to zend_ulong, for string keys to zend_string *.
i. New portable macros for large file support
Function(s) Alias Comment
stat, _stat64 zend_stat for use with zend_stat_t
fstat, _fstat64 zend_fstat for use with zend_stat_t
lseek, _lseeki64 zend_lseek for use with zend_off_t
ftell, _ftelli64 zend_ftell for use with zend_off_t
fseek, _fseeki64 zend_fseek for use with zend_off_t
j. New portable macros for integers
Function(s) Alias Comment
snprintf with "%ld" or "%lld", _ltoa_s, _i64toa_s ZEND_LTOA for use with zend_long
atol, atoll, _atoi64 ZEND_ATOL for use with zend_long
strtol, strtoll, _strtoi64 ZEND_STRTOL for use with zend_long
strtoul, strtoull, _strtoui64 ZEND_STRTOUL for use with zend_long
abs, llabs, _abs64 ZEND_ABS for use with zend_long
- ZEND_LONG_MAX replaces LONG_MAX where appropriate
- ZEND_LONG_MIN replaces LONG_MIN where appropriate
- ZEND_ULONG_MAX replaces ULONG_MAX where appropriate
- SIZEOF_ZEND_LONG reworked SIZEOF_ZEND_LONG representing the size of zend_long datatype
- ZEND_SIZE_MAX Max value of size_t
- Z_L casts an integral constant to zend_long
- Z_UL casts an integral constant to zend_ulong
The macro ZEND_ENABLE_ZVAL_LONG64 reveals whether zval operates on 64 or 32 bit integer.
k. The get_class_entry object handler is no longer available. Instead
zend_object.ce is always used.
l. The get_class_name object handler is now only used for displaying class
names in debugging functions like var_dump(). It is no longer used in
get_class(), get_parent_class() or similar.
The handler is now obligatory, no longer accepts a `parent` argument and
must return a non-NULL zend_string*, which will be released by the caller.
m. Other portable macros info
ZEND_SECURE_ZERO - zeroes chunk of memory
ZEND_VALID_SOCKET - validates a php_socket_t variable
ZEND_FASTCALL is defined to use __vectorcall convention on VS2013 and above
ZEND_NORETURN is defined as __declspec(noreturn) on VS
n. The ZEND_ENGINE_2 macro has been removed. A ZEND_ENGINE_3 macro has been added.
o. Removed ZEND_ACC_FINAL_CLASS in favour of ZEND_ACC_FINAL, turning final class
modifier now a different class entry flag. Update your extensions.
p. TSRM changes
The TSRM layer undergone significant changes. It is not necessary anymore to
pass the tsrm_ls pointer explicitly. The TSRMLS_* macros should not be used
in any new developments.
The recommended way accessing globals in TS builds is by implementing it
to use a local thread unique pointer to the corresponding data pool. These
are the new macros that serve to achieve this goal:
- ZEND_TSRMG - based on TSRMG and if static pointer for data pool access is
implemented, will use it
- ZEND_TSRMLS_CACHE_EXTERN - to be used in a header shared across multiple
C/C++ files
- ZEND_TSRMLS_CACHE_DEFINE - to be used in the main extension C/C++ file
- ZEND_TSRMLS_CACHE_UPDATE - to be integrated at the top of the globals
ctor or RINIT function
- ZEND_ENABLE_STATIC_TSRMLS_CACHE - to be used in the config.[m4|w32] for
enabling the globals access per thread unique pointer
- TSRMLS_CACHE - expands to the variable name which links to the thread
unique data pool
Note that usually it will work without implementing the globals access per
a thread unique pointer, however it might cause a slowdown. The reason is
that the access model to the extension/SAPI own globals as well as the
access to the core globals is determined in the compilation phase depending
on whether ZEND_ENABLE_STATIC_TSRMLS_CACHE macro is defined.
Due to the current compiler evolution state, the TSRMLS_CACHE pointer
(when implemented) has to be present and updated in every binary unit which
is not compiled statically into PHP. So any DLL, DSO, EXE, etc. has to take
care about defining and updating it's TSRMLS cache. An update should happen
before any globals was accessed.
Porting an extension or SAPI is usually as easy as removing all the TSRMLS_*
occurrences and integrating the macros mentioned above. However if tsrm_ls
is used explicitly, its usage can be considered obsolete in most cases.
Additionally, if an extension triggers its own threads, TSRMLS_CACHE shouldn't
be passed to that threads directly.
When working with CTOR/DTOR for the extension globals, only the data passed
as parameters should be touched. For example, don't use code like
free(MYEXT_G(vars)); inside globals destructor, as it possibly can destroy
the data from a wrong thread. Instead use the pointer to the data delivered
into the destructor. So the previous example could look like
free(my_struct->vars); given the income was casted to (struct my_struct *).
A new macro was introduced to simplify the declaration to the extension
globals. A simplified declaration looks as follows
#define MYEXT_G(v) ZEND_MODULE_GLOBALS_ACCESSOR(myext, v)
Another new macro ZEND_MODULE_GLOBALS_BULK(myext) delivers the corresponding
globals struct as a whole.
Two new storage specifiers was introduced.
ZEND_EXT_TLS - expand to an appropriate thread specific storage specifier in
the thread safe build, expands to empty in a non thread safe
build.
ZEND_TLS - expands to an appropriate thread specific storage specifier
with static visibility in the thread safe build, expands to
static specifier in a non thread safe build.
Variables declared with these storage specifiers can not be shared across
threads. While ZEND_TLS enforces the local scope visibility, ZEND_EXT_TLS
pertains to the visiblity across several compilation units. In both cases,
there's no portable way to achieve the visibility across shared objects.
Thus, these specifiers are not compatible with ZEND_API and PHPAPI specifiers.
q. gc_collect_cycles() is now a function pointer, and can be replaced in the
same manner as zend_execute_ex() if needed (for example, to include the
time spent in the garbage collector in a profiler). The default
implementation has been renamed to zend_gc_collect_cycles(), and is
exported with ZEND_API.
r. In accordance with general use of size_t as string length, all PDO API
functions now use size_t for string length.
s. Removed ZEND_REGISTER/FETCH_RESOURCE, use zend_fetch_resource and
and zend_register_resource instead, zend_fetch_resource accepts a zend_resource *
as first argument, if you still need to fetch a resource from zval, use
zend_fetch_resource_ex. More details can be found in Zend/zend_list.c.
t. Optimized strings concatenation.
ZEND_ADD_STRING/VAR/CHAR are replaced with ZEND_ROPE_INIT, ZEND_ROPE_ADD,
ZEND_ROPE_END.
Instead of reallocation and copying string on each ZEND_ADD_STRING/VAR/CAHR,
collect all the strings and then allocate and construct the resulting string once.
u. Streams changes
- It is possible to do blocking reads on Windows pipes. It can be done either
using context stream option PHP_STREAM_OPTION_PIPE_BLOCKING or
adding STREAM_USE_BLOCKING_PIPE when opening the stream.
v. Sort algorithm changes
- Improved zend_qsort(using hybrid sorting algo) for better performance,
and also renamed zend_qsort to zend_sort.
- Added stable sorting algo zend_insert_sort.
w. Tick functions internal API change
Tick functions have different declaration now. They expect a single
single parameter while before they were parameterless. When registering the
tick function a value for this parameter must be passed. This value will be
provided to the tick function on every execution.
========================
2. Build system changes
========================
a. Unix build system changes
b. Windows build system changes
- Besides Visual Studio, building with Clang or Intel Composer is now
possible. To enable an alternative toolset, add the option
--with-toolset=[vs,clang,icc] to the configure line. The default
toolset is vs. Still clang or icc need the correct environment
which involves many tools from the vs toolset.
The toolset option is supported by phpize as well.
AWARENESS The only recommended and supported toolset to produce production
ready binaries is Visual Studio. Still other compilers can be used now for
testing and analyzing purposes.
- configure.js now produces response files which are passed to the linker
and library manager. This solves the issues with the long command lines
which can exceed the OS limit.
- with clang toolset, an option --with-uncritical-warn-choke is available,
which suppresses the most frequent false positive warnings
- the --with-mp option will by default utilize all the available cores. It's
enabled by default for release builds and can be disabled with the special
"disable" keyword.
========================
3. Module changes
========================
Session:
- PS_MOD_UPDATE_TIMESTAMP() session save handler definition is added. New
save handler should use PS_MOD_UPDATE_TIMESTAMP() definition. Save handler
must not refer/change PS() variables directly from save handler. Use
parameters.
- PS_MOD()/PS_MOD_SID() macro exist only for transition. Beware these
save handlers are less secure and slower than PS_MOD_UPDATE_TIMESTAMP().
- PS(invalid_session_id) was removed. It was never worked as it supposed.
To report invalid session ID, use PS_VALIDATE_SID() handler.
- PS_VALIDATE_SID() defines session ID validation handler. This handler
must validate if requested session ID exists in session data storage or not.
Do not access PS(id) directly, but use this handler and it's parameter.
- PS_UPDATE_TIMESTAMP() defines timestamp updating handler. This handler
must update session data timestamp for GC if it is needed. e.g. Memcache
updates timestamp on read, so it does not need to update timestamp. Return
SUCCESS simply for this case.
- PS_CREATE_SID() should check session ID collision. Return NULL for failure.
- More documentation can be found in ext/session/mod_files.c as comments.
mod_files.c may be used as reference implementation.
Openssl:
- several ext/openssl functions require the inclusion of the applink shim
as documented in the OpenSSL FAQ https://www.openssl.org/support/faq.html#PROG2
to properly work on Windows. While this primarily affects the OpenSSL
functionality, the shim needs to be included into the executable file
which loads the extension DLL, not into the extension DLL itself. Alternatively
it can be compiled and linked with the main program as a separate object file.
More information and explanation in the linked OpenSSL FAQ.
Thus, this primarily affects any WAMP or other redistributions bundling the
official PHP release zipballs or building from sources. The applink shim is
already integrated with all the PHP executables from the official distribution
starting with 7.0.0beta1.