86 Commits

Author SHA1 Message Date
Andrei Zmievski
72df9946d2 - Remove support for code units in TextIterator (people shouldn't be
examining individual code units anyway)
- Add offset() method.
- Add optional locale parameter to the constructor.
2006-06-24 18:18:38 +00:00
Andrei Zmievski
e875205714 Implement user conversion error handler support. It works like a
normal error handler, in that it can return false to make the default one
take over. The handler signature is:
  user_handler($direction, $encoding, $char_byte, $offset, $message)

Also removed support for using exceptions in default error handler.
2006-06-21 20:17:21 +00:00
Andrei Zmievski
37972451f8 Implement unicode_set_error_handler() / unicode_restore_error_handler().
The error handler doesn't do anything yet. (vaporware)
2006-06-20 23:00:02 +00:00
Andrei Zmievski
927c5f2eb7 Rename to str_transliterate(). 2006-06-15 17:37:48 +00:00
Dmitry Stogov
f6bdedcb5b Fixed ZTS build 2006-06-15 10:03:52 +00:00
Andrei Zmievski
a093762a6f transliterate() 2006-06-13 23:46:04 +00:00
Andrei Zmievski
e2a1d7a3e1 Add char_enum_types(). 2006-05-09 18:21:27 +00:00
Andrei Zmievski
fad88da96e Fix and adjust. 2006-05-09 00:15:45 +00:00
Andrei Zmievski
f0dec5c4a4 Add char_enum_names(). 2006-05-09 00:06:08 +00:00
Andrei Zmievski
cbe16953e8 Been a long day.. 2006-05-08 23:01:20 +00:00
Andrei Zmievski
8a938324e3 And going, and going... 2006-05-08 22:23:57 +00:00
Andrei Zmievski
002b28e5cc A few more property functions. 2006-05-08 21:54:44 +00:00
Edin Kadribasic
1483647359 Added constants.c to the windows build 2006-05-08 21:06:34 +00:00
Frank M. Kromann
e7a6d29e95 Adding property.c to windows build 2006-05-05 21:37:08 +00:00
Andrei Zmievski
80f849ac1a Register slightly under half a metric ton of constants. 2006-05-05 20:56:21 +00:00
Andrei Zmievski
49dbb7710a Implement char_from_name(). 2006-05-04 21:22:30 +00:00
Andrei Zmievski
c631205e0c Some more work on property/names stuff. 2006-05-04 18:37:12 +00:00
Andrei Zmievski
2c42e06895 Fix locale functions naming problem. 2006-05-04 16:49:33 +00:00
Andrei Zmievski
349d4a7de9 Change prefix to char_ and rename some functions. 2006-05-04 00:01:34 +00:00
Andrei Zmievski
bbde23e247 Some more property functions.
# I am pondering a different prefix..
2006-05-03 22:03:10 +00:00
Andrei Zmievski
70acfbe14e *** empty log message *** 2006-05-03 06:36:53 +00:00
Andrei Zmievski
84c9b4f290 Some additional binary property functions. 2006-05-02 22:43:52 +00:00
Andrei Zmievski
aaed3ca1b0 FALSE on empty string. 2006-05-02 21:49:16 +00:00
Andrei Zmievski
f0640426cb Implement C/POSIX migration functions. 2006-05-02 21:39:15 +00:00
Andrei Zmievski
675ecc637b Add skeleton for character property file. Also remove some HAVE_UNICODE
tests since it's non optional.
2006-05-02 20:58:30 +00:00
Andrei Zmievski
76d6cca78e Add collator_set_default(). 2006-04-21 21:10:01 +00:00
Andrei Zmievski
ec57be524b Hmm, ZEND_FENTRY() is the only one that allows flags to be added.. 2006-04-21 19:40:57 +00:00
Andrei Zmievski
2bbced4bce Rename i18_loc_* to locale_*. 2006-04-21 19:35:26 +00:00
Andrei Zmievski
24988a088c Implement collator_get_default() and simplify/fix the underlying code.
# Derick, objects aren't that difficult.. :)
2006-04-21 18:25:16 +00:00
Andrei Zmievski
16c55fb25a Move to refcounted implementation of collators. 2006-04-20 21:56:43 +00:00
Frank M. Kromann
9b93bc74cd fix build on Win32 2006-04-20 03:41:33 +00:00
Andrei Zmievski
85c36036e5 Update protos. 2006-04-18 21:36:38 +00:00
Sara Golemon
30a2bd1d11 Another (and hopefully last) major streams commit.
This moves unicode conversion to the filter layer
(rather than at the lower streams layer)
unicode_filter.c has been moved from ext/unicode to main/streams
as it's an integral part of the streams unicode conversion process.

There are now three ways to set encoding on a stream:

(1) By context
$ctx = stream_context_create(NULL,array('encoding'=>'latin1'));
$fp = fopen('somefile', 'r+t', false, $ctx);

(2) By stream_encoding()
$fp = fopen('somefile', 'r+');
stream_encoding($fp, 'latin1');

(3) By filter
$fp = fopen('somefile', 'r+');
stream_filter_append($fp, 'unicode.from.latin1', STREAM_FILTER_READ);
stream_filter_append($fp, 'unicode.to.latin1', STREAM_FILTER_WRITE);

Note: Methods 1 and 2 are convenience wrappers around method 3.
2006-03-29 01:20:43 +00:00
Andrei Zmievski
3eee3a5fd6 Fix collator instantiation. 2006-03-28 04:33:29 +00:00
Andrei Zmievski
cbbfebc428 Fix typos. 2006-03-28 03:28:08 +00:00
Andrei Zmievski
b36d2dfef6 Rewrite unicode_encode() and unicode_decode() functions. Apply the new
conversion error semantics.
2006-03-27 03:19:30 +00:00
Andrei Zmievski
db50082fe9 Add unicode_get_error_mode() and unicode_get_subst_char(). 2006-03-26 21:22:59 +00:00
Derick Rethans
ad6a972de3 - Implemented basic collation support. For some reason "new Collator" gives segfaults when the object's collation resource is used.
- The following example shows what is implemented:

<?php
$orig = $strings = array(
    'côte',
    'cote',
    'côté',
    'coté',
    'fluße',
    'flüße',
);

echo "German phonebook:\n";
$c = collator_create( "de@collation=phonebook" );
foreach($c->sort($strings) as $string) {
    echo $string, "\n";
}
echo $c->getAttribute(Collator::FRENCH_COLLATION) == Collator::ON
    ? "With" : "Without", " french accent sorting order\n";

echo "\nFrench with options:\n";
$c = collator_create( "fr" );
$c->setAttribute(Collator::CASE_FIRST, Collator::UPPER_FIRST);
$c->setAttribute(Collator::CASE_LEVEL, Collator::ON);
$c->setStrength(Collator::SECONDARY);
foreach($c->sort($strings) as $string) {
    echo $string, "\n";
}
echo $c->getAttribute(Collator::FRENCH_COLLATION) == Collator::ON
    ? "With" : "Without", " french accent sorting order\n";
?>
2006-03-26 11:06:24 +00:00
Andrei Zmievski
1709428494 Implement to-Unicode conversion error behavior. Note the adjusted APIs. 2006-03-26 06:19:24 +00:00
Andrei Zmievski
c254b21cca Add protos. 2006-03-26 03:33:10 +00:00
Andrei Zmievski
930bde5897 * Remove unicode.from_error_mode and unicode.from_subst_char from INI
settings.
* Add unicode_set_error_mode() and unicode_set_subst_char() functions to
  manipulate these global settings.
2006-03-26 01:48:33 +00:00
Andrei Zmievski
fe0cccc003 Use intern->type for break iterator. 2006-03-24 21:06:36 +00:00
Antony Dovgal
9cee8be28e first check for NULL, then use the pointer 2006-03-24 10:21:56 +00:00
Derick Rethans
3056defb26 - Moved strtotitle to ext/standard and implemented the fallback case to
non-unicode with ucwords. There is also an implementation for unicode ucwords
  but that returns different results than strtotitle as it uppercases the
  first character of every word, and doesn't *titlecase* a word. The test case
  shows that.
2006-03-22 10:20:20 +00:00
Derick Rethans
7f7300ae0b - Update windows file too (not tested, but should work). 2006-03-21 13:57:16 +00:00
Derick Rethans
c86cf4fbea - Make ext/unicode an extension that is always there and cannot be disabled. 2006-03-21 13:56:50 +00:00
Sara Golemon
48798021b5 Refactor streams layer for PHP6.
Don't be frightened by the size of this commit.
A significant portion of it is restoring the read buffer semantics back
to what PHP4/5 use.  (Or a close approximation thereof).

See main/streams/streams.c and ext/standard/file.c for a set of
UTODO comments covering work yet to be done.
2006-03-13 04:40:11 +00:00
Andrei Zmievski
20301a153f Should use word break iteration instead of title, as title one has been
deprecated since Unicode 3.2.
2006-03-02 20:40:45 +00:00
Dmitry Stogov
c366cc6d1a Nuke int32_t (everywhere except streams layer) and signed/unsigned warnings 2006-03-02 13:12:45 +00:00
Dmitry Stogov
e3b7f3fd0d Unicode support: MS Visual C compatibility 2006-02-26 11:57:14 +00:00