conversions of filename entries.
Normal path conversions will simply use this converter.
Certain other protocols (such as http) which specify a
required character set (utf8) may override the conversion
by defining a path_encode() and/or path_decode() wrapper ops method.
Update userspace function file_get_contents().
Note: fgc()'s second parameter (use_include_path) has been changed
to be a bitmask "flags" parameter instead.
For the most commonly used values (TRUE, 1) this will continue functioning
as expected since the value of FILE_USE_INCLUDE_PATH is (coincidentally) 1.
The impact on other values should be noted in the migration6 guide.
This change makes it possible to allow fgc() to return binary file
contents (default) or unicode transcoded contents (using FILE_TEXT flag).
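The compatibility note above can be illustrated with a minimal sketch of how a bitmask "flags" argument is decoded. FILE_USE_INCLUDE_PATH is the real constant (value 1); EXAMPLE_FILE_TEXT and example_decode_flags() are hypothetical names with an arbitrary value, used only for illustration and not taken from this commit.

```php
<?php
// Hypothetical flag value, for illustration only (not the real FILE_TEXT).
const EXAMPLE_FILE_TEXT = 32;

// Hypothetical helper: decode a bitmask "flags" argument into booleans.
function example_decode_flags($flags): array
{
    // Legacy callers may still pass TRUE/FALSE; TRUE casts to 1.
    $flags = (int) $flags;

    return [
        'use_include_path' => ($flags & FILE_USE_INCLUDE_PATH) !== 0,
        'text'             => ($flags & EXAMPLE_FILE_TEXT) !== 0,
    ];
}

// TRUE behaves as before because FILE_USE_INCLUDE_PATH is 1.
$legacy = example_decode_flags(true);

// Flags can now be combined.
$combined = example_decode_flags(FILE_USE_INCLUDE_PATH | EXAMPLE_FILE_TEXT);

// A truthy value other than 1 no longer implies use_include_path.
$other = example_decode_flags(2);
```

The last call shows the kind of case a migration note would need to cover: a value such as 2 used to be treated as "true" but does not set the include-path bit in a bitmask.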
This updates userspace functions fpassthru() and readfile().
UG(output_encoding) is used by php_stream_passthru() to translate
unicode stream contents back to an outputtable character set.
Note: readfile()'s second parameter (use_include_path) has been changed
to be a bitmask "flags" parameter instead.
For the most commonly used values (TRUE, 1) this will continue functioning
as expected since the value of FILE_USE_INCLUDE_PATH is (coincidentally) 1.
The impact on other values should be noted in the migration6 guide.
This change makes it possible to allow readfile() to output binary file
contents (default) or unicode transcoded contents (using FILE_TEXT flag).
This moves unicode conversion to the filter layer
(rather than at the lower streams layer).
unicode_filter.c has been moved from ext/unicode to main/streams
as it's an integral part of the streams unicode conversion process.
There are now three ways to set encoding on a stream:
(1) By context
$ctx = stream_context_create(NULL,array('encoding'=>'latin1'));
$fp = fopen('somefile', 'r+t', false, $ctx);
(2) By stream_encoding()
$fp = fopen('somefile', 'r+');
stream_encoding($fp, 'latin1');
(3) By filter
$fp = fopen('somefile', 'r+');
stream_filter_append($fp, 'unicode.from.latin1', STREAM_FILTER_READ);
stream_filter_append($fp, 'unicode.to.latin1', STREAM_FILTER_WRITE);
Note: Methods 1 and 2 are convenience wrappers around method 3.
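For comparison, released PHP versions ship an iconv-based conversion filter that attaches to a stream in the same way as method 3. The sketch below uses that filter (convert.iconv.*), not the unicode.* filters described here, and assumes the iconv extension is loaded; the in-memory stream and the sample bytes are illustrative only.

```php
<?php
// Use an in-memory stream so the sketch needs no files on disk.
$fp = fopen('php://memory', 'w+b');

// Write latin1 bytes: 0xE5 is "å" in ISO-8859-1.
fwrite($fp, "1234\xE567890");
rewind($fp);

// Attach a read filter that transcodes latin1 to UTF-8 as data is read.
// This is the stock iconv filter, swapped in for 'unicode.from.latin1'.
stream_filter_append($fp, 'convert.iconv.ISO-8859-1/UTF-8', STREAM_FILTER_READ);

// Everything read from the stream now passes through the filter.
$text = stream_get_contents($fp);
fclose($fp);

// Hex dump of the transcoded bytes, for inspection.
echo bin2hex($text), "\n";
```

Attaching the conversion as a filter keeps the stream's raw bytes untouched underneath, which is the same layering that methods 1 and 2 wrap.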
Use the global conversion error handlers for output conversion (for now).
We may want to make this customizable on a per-stream basis
via a context param later on...
fgets() will work now as will anything which calls one of the
_php_stream_get_line() family of functions.
The one exception here is when the legacy defines are used on a unicode
stream. At the moment they'll simply return NULL; I'll update these
to do sloppy conversion in a bit.
'make (u)test' still doesn't work, but it now fails in a different way.
<?php
declare(encoding="latin1");
$a = "1234å67890";
file_put_contents( "/tmp/testuc.1", $a);
file_put_contents( "/tmp/testuc.2", (string) $a);
$context = stream_context_create();
stream_context_set_params($context, array( "output_encoding" => "latin1" ) );
file_put_contents( "/tmp/testuc.3", $a, FILE_TEXT, $context);
file_put_contents( "/tmp/testuc.4", (string) $a, FILE_TEXT, $context);
?>
But it still throws a warning on the ".3" case (/tmp/testuc.3). It's a
small design issue that I didn't want to touch right now.
Don't be frightened by the size of this commit.
A significant portion of it is restoring the read buffer semantics back
to what PHP4/5 use (or a close approximation thereof).
See main/streams/streams.c and ext/standard/file.c for a set of
UTODO comments covering work yet to be done.
- use the same type (int) for zval.value.usr.len and zval.value.str.len
- use union "zstr" as char*/UChar* mixture instead of void*
- Z_UNISTR() and Z_UNILEN() no longer check for Z_TYPE()
- nuke int32_t from ZE (not finished)