Update to Oniguruma 6.8.2

Christoph M. Becker 2018-05-26 13:38:30 +02:00
parent e265a96d93
commit 2c4556ee43
50 changed files with 1284 additions and 322 deletions

NEWS

@ -108,7 +108,7 @@ PHP NEWS
on Windows). (Nikita)
. Fixed bug #76319 (mb_strtolower with invalid UTF-8 causes segmentation
fault). (Nikita)
. Update to Oniguruma 6.8.1. (cmb)
. Updated to Oniguruma 6.8.2. (cmb)
- ODBC:
. Removed support for ODBCRouter. (Kalle)


@ -1,28 +1,26 @@
Oniguruma LICENSE
-----------------
/*-
* Copyright (c) 2002-2015 K.Kosako <kkosako0@gmail.com>
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*/
Copyright (c) 2002-2018 K.Kosako <kkosako0@gmail.com>
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
SUCH DAMAGE.


@ -1,5 +1,16 @@
History
2018/04/17: Version 6.8.2
2018/04/13: add doc/CALLOUTS.API.ja
2018/04/10: add doc/CALLOUTS.API
2018/04/10: fix #87: Read unknown address in onig_error_code_to_str()
2018/04/06: fix #86: typedef StateCheckNumType is unused
2018/04/02: update automake 1.16.1
2018/03/30: fix #84: stack-buffer-overflow in mbc_enc_len
2018/03/28: PR #83: Improve CMake build
2018/03/21: switch uses of UChar to OnigUChar in oniguruma.h (#80)
2018/03/19: Version 6.8.1
2018/03/19: update LTVERSION from 4:0:0 to 5:0:0


@ -1,10 +1,10 @@
README 2018/01/14
README 2018/04/05
Oniguruma ---- (C) K.Kosako
https://github.com/kkos/oniguruma
FIXED Security Issues:
FIXED Security Issues (in Oniguruma 6.3.0):
CVE-2017-9224, CVE-2017-9225, CVE-2017-9226
CVE-2017-9227, CVE-2017-9228, CVE-2017-9229


@ -3,11 +3,6 @@ Oniguruma
https://github.com/kkos/oniguruma
FIXED Security Issues:
--------------------------
**CVE-2017-9224, CVE-2017-9225, CVE-2017-9226**
**CVE-2017-9227, CVE-2017-9228, CVE-2017-9229**
Oniguruma is a modern and flexible regular expressions library. It
encompasses features from different regular expression implementations
that traditionally exist in different languages. It comes close to
@ -39,6 +34,14 @@ Supported character encodings:
* CP1251: contributed by Byte
New feature of version 6.8.2
--------------------------
* Fix: #80 UChar in header causes issue
* NEW API: onig_set_callout_user_data_of_match_param() (* omission in 6.8.0)
* add doc/CALLOUTS.API and doc/CALLOUTS.API.ja
New feature of version 6.8.1
--------------------------
@ -51,9 +54,10 @@ New feature of version 6.8.0
* Retry-limit-in-match function enabled by default
* NEW: configure option --enable-posix-api=no (* enabled by default)
* NEW API: onig_search_with_param(), onig_match_with_param()
* NEW: Callouts of contents (?{...contents...}) (?{...}\[X<>]) (?{{....}})
* NEW: Callouts of contents (?{...contents...}) (?{...}\[tag]\[X<>]) (?{{...}})
* NEW: Callouts of name (*name) (*name\[tag]{args...})
* NEW: Builtin callouts (*FAIL) (*MISMATCH) (*ERROR{n}) (*COUNT) (*MAX{n}) etc..
* Examples of Callouts program: [callout.c](sample/callout.c), [count.c](sample/count.c), [echo.c](sample/echo.c)
(* Callout function API is experimental level and isn't fixed definitely yet. Undocumented now)
@ -107,6 +111,12 @@ New feature of version 6.3.0
--------------------------
* NEW: octal codepoint \o{.....}
* Fixed CVE-2017-9224
* Fixed CVE-2017-9225
* Fixed CVE-2017-9226
* Fixed CVE-2017-9227
* Fixed CVE-2017-9228
* Fixed CVE-2017-9229
New feature of version 6.1.2


@ -0,0 +1,385 @@
Callouts API Version 6.8.2 2018/04/14
#include <oniguruma.h>
(1) Callout functions
(2) Set/Get functions for Callouts of contents
(3) Set functions for Callouts of name
(4) User data
(5) Get values from OnigCalloutArgs
(6) Tag
(7) Callout data (used in callout functions)
(8) Callout data (used in applications)
(9) Miscellaneous functions
(1) Callout functions
type: OnigCalloutFunc
typedef int (*OnigCalloutFunc)(OnigCalloutArgs* args, void* user_data);
If 0 (NULL) is set as the callout function, no callout function is called.
* Callout function return value (int)
ONIG_CALLOUT_FAIL(1): fail
ONIG_CALLOUT_SUCCESS(0): success
less than -1: error code (terminate search/match)
ONIG_CALLOUT_FAIL/SUCCESS values are ignored in retractions,
because retraction is part of the recovery process after a failure.
* Example of callout function
extern int always_success(OnigCalloutArgs* args, void* user_data)
{
return ONIG_CALLOUT_SUCCESS;
}
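As an illustration of the return-value rules above, the following sketch classifies a callout's return value. It is a model only: callout_outcome and classify_callout_return are hypothetical names that do not exist in oniguruma.h, the two constants are redefined locally with the values quoted above, and return values the text does not describe (-1, or greater than 1) are reported as unspecified rather than guessed. It also assumes that error codes are honored in both directions, since the text only says that FAIL/SUCCESS are ignored in retractions.

```c
#include <assert.h>

/* Local copies of the values quoted above, so that the sketch compiles
   without oniguruma.h. */
#define CALLOUT_SUCCESS 0
#define CALLOUT_FAIL    1

/* Hypothetical type, not part of the Oniguruma API. */
typedef enum {
  OUTCOME_SUCCESS,
  OUTCOME_FAIL,
  OUTCOME_IGNORED,     /* SUCCESS/FAIL returned during retraction */
  OUTCOME_ERROR,       /* less than -1: search/match is terminated */
  OUTCOME_UNSPECIFIED  /* -1 or greater than 1: not covered by the text */
} callout_outcome;

/* in_retraction: 1 if the callout ran during retraction, 0 if in progress. */
callout_outcome classify_callout_return(int r, int in_retraction)
{
  assert(in_retraction == 0 || in_retraction == 1);

  /* Assumption: error codes terminate the search in both directions. */
  if (r < -1) return OUTCOME_ERROR;

  if (r == CALLOUT_SUCCESS || r == CALLOUT_FAIL) {
    if (in_retraction) return OUTCOME_IGNORED;
    return (r == CALLOUT_SUCCESS) ? OUTCOME_SUCCESS : OUTCOME_FAIL;
  }
  return OUTCOME_UNSPECIFIED;
}
```

A caller would pass the value returned by its callout together with the direction it was invoked in.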
(2) Set/Get functions for Callouts of contents
# OnigCalloutFunc onig_get_progress_callout(void)
Get a function for callouts of contents in progress.
# int onig_set_progress_callout(OnigCalloutFunc f)
Set a function for callouts of contents in progress.
This value is set in onig_initialize_match_param() as the default
callout function.
normal return: ONIG_NORMAL
# OnigCalloutFunc onig_get_retraction_callout(void)
Get a function for callouts of contents in retraction (backtrack).
# int onig_set_retraction_callout(OnigCalloutFunc f)
Set a function for callouts of contents in retraction (backtrack).
This value is set in onig_initialize_match_param() as the default
callout function.
normal return: ONIG_NORMAL
# int onig_set_progress_callout_of_match_param(OnigMatchParam* mp, OnigCalloutFunc f)
Set a function for callouts of contents in progress.
arguments
1 mp: match-param pointer
2 f: function
normal return: ONIG_NORMAL
# int onig_set_retraction_callout_of_match_param(OnigMatchParam* mp, OnigCalloutFunc f)
Set a function for callouts of contents in retraction (backtrack).
arguments
1 mp: match-param pointer
2 f: function
normal return: ONIG_NORMAL
(3) Set functions for Callouts of name
# int onig_set_callout_of_name(OnigEncoding enc, OnigCalloutType type, OnigUChar* name, OnigUChar* name_end, int callout_in, OnigCalloutFunc callout, OnigCalloutFunc end_callout, int arg_num, unsigned int arg_types[], int opt_arg_num, OnigValue opt_defaults[])
Set a function for callouts of name.
Allowed name string characters: _ A-Z a-z 0-9 (* first character: _ A-Z a-z)
(enc, name) pair is used as key value to find callout function.
You have to call this function for every encoding used in your applications.
But if enc is ASCII compatible and (enc, name) entry is not found,
then (ASCII, name) entry is used.
Therefore, if you use ASCII compatible encodings only, it is enough to call
this function one time for (ASCII, name).
arguments
1 enc: character encoding
2 type: callout type (currently only ONIG_CALLOUT_TYPE_SINGLE is supported)
3 name: name string address (the string is encoded by enc)
4 name_end: name string end address
5 callout_in: direction (ONIG_CALLOUT_IN_PROGRESS/RETRACTION/BOTH)
6 callout: callout function
7 end_callout: * not used currently (set 0)
8 arg_num: number of arguments (* limited by ONIG_CALLOUT_MAX_ARGS_NUM == 4)
9 arg_types: type array of arguments
10 opt_arg_num: number of optional arguments
11 opt_defaults: default values array of optional arguments
normal return: ONIG_NORMAL
error:
ONIGERR_INVALID_CALLOUT_NAME
ONIGERR_INVALID_ARGUMENT
ONIGERR_INVALID_CALLOUT_ARG
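The (enc, name) lookup with its fallback to (ASCII, name) can be sketched as follows. This is a self-contained model, not the library's implementation: enc_id, callout_entry, find_exact and find_callout are hypothetical names, the table is a plain array, and func_id merely stands in for a registered callout function. The sketch treats UTF-8 as ASCII compatible and UTF-16BE as not ASCII compatible.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; none of these names come from oniguruma.h. */
typedef enum { ENC_ASCII, ENC_UTF8, ENC_UTF16_BE } enc_id;

typedef struct {
  enc_id      enc;
  const char* name;
  int         func_id;  /* stands in for the registered callout function */
} callout_entry;

int enc_is_ascii_compatible(enc_id enc)
{
  return enc == ENC_ASCII || enc == ENC_UTF8;
}

/* Exact (enc, name) lookup; returns NULL when there is no such entry. */
const callout_entry* find_exact(const callout_entry* tab, size_t n,
                                enc_id enc, const char* name)
{
  size_t i;
  for (i = 0; i < n; i++) {
    if (tab[i].enc == enc && strcmp(tab[i].name, name) == 0)
      return &tab[i];
  }
  return NULL;
}

/* Lookup with the fallback described above: if (enc, name) is missing and
   enc is ASCII compatible, the (ASCII, name) entry is used instead. */
const callout_entry* find_callout(const callout_entry* tab, size_t n,
                                  enc_id enc, const char* name)
{
  const callout_entry* e;
  assert(tab != NULL && name != NULL);

  e = find_exact(tab, n, enc, name);
  if (e == NULL && enc != ENC_ASCII && enc_is_ascii_compatible(enc))
    e = find_exact(tab, n, ENC_ASCII, name);
  return e;
}
```

With a single (ASCII, "FOO") entry, a lookup for (UTF-8, "FOO") succeeds through the fallback, while a lookup for (UTF-16BE, "FOO") finds nothing and would need its own registration.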
(4) User data
# int onig_set_callout_user_data_of_match_param(OnigMatchParam* param, void* user_data)
Set the user_data value that is passed as the second argument of the callout function.
normal return: ONIG_NORMAL
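The user_data pointer is the usual C way of handing per-search state to a callback. The sketch below shows the pattern with local stand-ins so that it compiles without oniguruma.h: struct callout_args is an opaque placeholder for OnigCalloutArgs, callout_func mirrors the shape of OnigCalloutFunc, and call_budget, limited_callout and invoke are hypothetical names. That invoke() reports success for a NULL function is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

struct callout_args;  /* opaque placeholder for OnigCalloutArgs */
typedef int (*callout_func)(struct callout_args* args, void* user_data);

#define CALLOUT_SUCCESS 0
#define CALLOUT_FAIL    1

/* Hypothetical application state passed through user_data. */
typedef struct {
  int calls;  /* number of times the callout has run */
  int limit;  /* the callout fails once calls exceeds this */
} call_budget;

/* Callout-shaped function: succeeds for the first 'limit' calls. */
int limited_callout(struct callout_args* args, void* user_data)
{
  call_budget* b = (call_budget*)user_data;
  (void)args;  /* unused in this sketch */
  assert(b != NULL);

  b->calls++;
  return (b->calls <= b->limit) ? CALLOUT_SUCCESS : CALLOUT_FAIL;
}

/* Stand-in for the engine invoking a callout. A NULL function is never
   called; treating that case as success is an assumption of this sketch. */
int invoke(callout_func f, void* user_data)
{
  if (f == NULL) return CALLOUT_SUCCESS;
  return f(NULL, user_data);
}
```

In a real program the same call_budget pointer would be registered with onig_set_callout_user_data_of_match_param() and the function with one of the setters in section (2).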
(5) Get values from OnigCalloutArgs
# int onig_get_callout_num_by_callout_args(OnigCalloutArgs* args)
Returns callout number of this callout.
"Callout number" is an identifier of callout in a regex pattern.
# OnigCalloutIn onig_get_callout_in_by_callout_args(OnigCalloutArgs* args)
Returns the direction of this callout.
(ONIG_CALLOUT_IN_PROGRESS or ONIG_CALLOUT_IN_RETRACTION)
# int onig_get_name_id_by_callout_args(OnigCalloutArgs* args)
Returns the name identifier of this callout.
If this callout is callout of contents, then returns ONIG_NON_NAME_ID.
# const OnigUChar* onig_get_contents_by_callout_args(OnigCalloutArgs* args)
Returns the contents string of this callout. (NULL terminated string)
If this callout is callout of name, then returns NULL.
# const OnigUChar* onig_get_contents_end_by_callout_args(OnigCalloutArgs* args)
Returns the end of contents string of this callout.
If this callout is callout of name, then returns NULL.
# int onig_get_args_num_by_callout_args(OnigCalloutArgs* args)
Returns the number of args of this callout.
It includes optional arguments that are not passed in the regex pattern.
If this callout is callout of contents, then returns
ONIGERR_INVALID_ARGUMENT.
# int onig_get_passed_args_num_by_callout_args(OnigCalloutArgs* args)
Returns the number of args actually passed in the regex pattern.
If this callout is callout of contents, then returns
ONIGERR_INVALID_ARGUMENT.
# int onig_get_arg_by_callout_args(OnigCalloutArgs* args, int index, OnigType* type, OnigValue* val)
Returns a value and a type of the callout argument.
If this callout is callout of contents, then returns
ONIGERR_INVALID_ARGUMENT.
normal return: ONIG_NORMAL
# const OnigUChar* onig_get_string_by_callout_args(OnigCalloutArgs* args)
Returns the subject string address.
This is the second argument(str) of onig_search().
# const OnigUChar* onig_get_string_end_by_callout_args(OnigCalloutArgs* args)
Returns the end address of subject string.
This is the third argument(end) of onig_search().
# const OnigUChar* onig_get_start_by_callout_args(OnigCalloutArgs* args)
Returns the start address of subject string in current match process.
# const OnigUChar* onig_get_right_range_by_callout_args(OnigCalloutArgs* args)
Returns the right range address of subject string.
# const OnigUChar* onig_get_current_by_callout_args(OnigCalloutArgs* args)
Returns the current address of subject string in current match process.
# OnigRegex onig_get_regex_by_callout_args(OnigCalloutArgs* args)
Returns the regex object address of this callout.
# unsigned long onig_get_retry_counter_by_callout_args(OnigCalloutArgs* args)
Returns the current counter value for retry-limit-in-match.
(6) Tag
"Tag" is a name assigned to a callout in regexp pattern.
Allowed tag string characters: _ A-Z a-z 0-9 (* first character: _ A-Z a-z)
# int onig_callout_tag_is_exist_at_callout_num(OnigRegex reg, int callout_num)
Returns 1 if tag is assigned for the callout, else returns 0.
# int onig_get_callout_num_by_tag(OnigRegex reg, const OnigUChar* tag, const OnigUChar* tag_end)
Returns the callout number for the tag.
# const OnigUChar* onig_get_callout_tag_start(OnigRegex reg, int callout_num)
Returns the start address of tag string for the callout.
(NULL terminated string)
# const OnigUChar* onig_get_callout_tag_end(OnigRegex reg, int callout_num)
Returns the end address of tag string for the callout.
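The character rule for tags (the same rule is given for callout names in section (3)) can be checked with a few lines of C. The helper below is hypothetical and not part of the Oniguruma API; it assumes a single-byte, ASCII-compatible encoding, and rejecting the empty string is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the Oniguruma API.
   Returns 1 if [s, end) consists of _ A-Z a-z 0-9 and starts with
   _ A-Z a-z, otherwise 0. Assumes one byte per character. */
int is_valid_callout_ident(const unsigned char* s, const unsigned char* end)
{
  const unsigned char* p;
  assert(s != NULL && end != NULL && s <= end);

  if (s == end) return 0;  /* assumption: an empty name/tag is rejected */

  for (p = s; p < end; p++) {
    unsigned char c = *p;
    int is_word  = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || c == '_';
    int is_digit = (c >= '0' && c <= '9');

    /* digits are allowed everywhere except in the first position */
    if (!is_word && !(is_digit && p != s)) return 0;
  }
  return 1;
}
```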
(7) Callout data (used in callout functions)
"Callout data" is an area of ONIG_CALLOUT_DATA_SLOT_NUM (5) values that exists
for each callout in each search process.
Each value in a callout's area is identified by a "slot" number (0 - 4).
Callout function implementers may use callout data for any purpose.
# int onig_get_callout_data_by_callout_args(OnigCalloutArgs* args, int callout_num, int slot, OnigType* type, OnigValue* val)
Returns the callout data value/type for a callout slot indicated by
callout_num/slot.
normal return: ONIG_NORMAL
1: not yet set (type is ONIG_TYPE_VOID)
< 0: error code
# int onig_get_callout_data_by_callout_args_self(OnigCalloutArgs* args, int slot, OnigType* type, OnigValue* val)
Returns self callout data value/type.
normal return: ONIG_NORMAL
1: not yet set (type is ONIG_TYPE_VOID)
< 0: error code
# int onig_set_callout_data_by_callout_args(OnigCalloutArgs* args, int callout_num, int slot, OnigType type, OnigValue* val)
Set the callout data value/type for a callout slot indicated by callout_num/slot.
normal return: ONIG_NORMAL
< 0: error code
# int onig_set_callout_data_by_callout_args_self(OnigCalloutArgs* args, int slot, OnigType type, OnigValue* val)
Set self callout data value/type for a callout slot indicated by slot.
normal return: ONIG_NORMAL
< 0: error code
# int onig_get_callout_data_by_callout_args_self_dont_clear_old(OnigCalloutArgs* args, int slot, OnigType* type, OnigValue* val)
This function is almost the same as onig_get_callout_data_by_callout_args_self().
However, it does not clear values that were set in a previous failed match process.
The other onig_get_callout_data_xxxx() functions clear all values that were set
in a previous failed match process.
For example, the builtin callout (*TOTAL_COUNT) is implemented with this
function to accumulate the counts of all match processes in a search process.
The builtin callout (*COUNT) returns the count of the last successful match process only,
because it doesn't use this function.
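The difference between the clearing getters and the dont_clear_old variant can be modelled in a few lines. The sketch below is not the library's implementation: callout_data, data_get, data_get_dont_clear_old, data_set, count_per_match and count_total are hypothetical names, only a LONG value type is modelled, and each match process is identified by an integer match_id starting at 1. It also assumes that setting a value marks the data as belonging to the current match process without clearing other slots.

```c
#include <assert.h>
#include <stddef.h>

#define SLOT_NUM 5  /* ONIG_CALLOUT_DATA_SLOT_NUM, per the text above */

typedef enum { T_VOID = 0, T_LONG } slot_type;  /* simplified type tag */

typedef struct {
  int       last_match_id;  /* match process that last touched the data */
  slot_type type[SLOT_NUM];
  long      val[SLOT_NUM];
} callout_data;

static void clear_slots(callout_data* d)
{
  int i;
  for (i = 0; i < SLOT_NUM; i++) { d->type[i] = T_VOID; d->val[i] = 0; }
}

void data_init(callout_data* d)
{
  assert(d != NULL);
  d->last_match_id = 0;  /* 0 = no match process yet */
  clear_slots(d);
}

/* Returns 0 on success, 1 if the slot is not yet set, -1 on a bad slot. */
static int get_impl(callout_data* d, int match_id, int slot,
                    slot_type* type, long* val, int clear_old)
{
  if (slot < 0 || slot >= SLOT_NUM) return -1;
  if (clear_old && d->last_match_id != match_id) {
    clear_slots(d);  /* drop values left by an earlier match process */
    d->last_match_id = match_id;
  }
  *type = d->type[slot];
  if (d->type[slot] == T_VOID) return 1;
  *val = d->val[slot];
  return 0;
}

int data_get(callout_data* d, int match_id, int slot,
             slot_type* type, long* val)
{
  return get_impl(d, match_id, slot, type, val, 1);
}

int data_get_dont_clear_old(callout_data* d, int match_id, int slot,
                            slot_type* type, long* val)
{
  return get_impl(d, match_id, slot, type, val, 0);
}

int data_set(callout_data* d, int match_id, int slot, long val)
{
  if (slot < 0 || slot >= SLOT_NUM) return -1;
  d->type[slot] = T_LONG;
  d->val[slot]  = val;
  d->last_match_id = match_id;
  return 0;
}

/* Counter in the style of (*COUNT): restarts in every match process. */
long count_per_match(callout_data* d, int match_id)
{
  slot_type t;
  long v = 0;
  if (data_get(d, match_id, 0, &t, &v) != 0) v = 0;
  data_set(d, match_id, 0, v + 1);
  return v + 1;
}

/* Counter in the style of (*TOTAL_COUNT): accumulates over the search. */
long count_total(callout_data* d, int match_id)
{
  slot_type t;
  long v = 0;
  if (data_get_dont_clear_old(d, match_id, 0, &t, &v) != 0) v = 0;
  data_set(d, match_id, 0, v + 1);
  return v + 1;
}
```

Calling each counter twice in match process 1 and once in match process 2 shows the difference: the per-match counter starts again at 1, while the total counter continues to 3.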
(8) Callout data (used in applications)
# int onig_get_callout_data(OnigRegex reg, OnigMatchParam* mp, int callout_num, int slot, OnigType* type, OnigValue* val)
Returns the callout data value/type for a callout slot indicated by
callout_num/slot.
normal return: ONIG_NORMAL
1: not yet set (type is ONIG_TYPE_VOID)
< 0: error code
# int onig_get_callout_data_by_tag(OnigRegex reg, OnigMatchParam* mp, const OnigUChar* tag, const OnigUChar* tag_end, int slot, OnigType* type, OnigValue* val)
Returns the callout data value/type for a callout slot indicated by tag/slot.
normal return: ONIG_NORMAL
1: not yet set (type is ONIG_TYPE_VOID)
< 0: error code
# int onig_set_callout_data(OnigRegex reg, OnigMatchParam* mp, int callout_num, int slot, OnigType type, OnigValue* val)
Set the callout data value/type for a callout slot indicated by callout_num/slot.
normal return: ONIG_NORMAL
< 0: error code
# int onig_set_callout_data_by_tag(OnigRegex reg, OnigMatchParam* mp, const OnigUChar* tag, const OnigUChar* tag_end, int slot, OnigType type, OnigValue* val)
Set the callout data value/type for a callout slot indicated by tag/slot.
normal return: ONIG_NORMAL
< 0: error code
# int onig_get_callout_data_dont_clear_old(OnigRegex reg, OnigMatchParam* mp, int callout_num, int slot, OnigType* type, OnigValue* val)
There is no need to use this function.
It will be removed.
(9) Miscellaneous functions
# OnigUChar* onig_get_callout_name_by_name_id(int name_id)
Returns the callout name for the name id.
If an invalid name id is passed, returns 0.
# int onig_get_capture_range_in_callout(OnigCalloutArgs* args, int mem_num, int* begin, int* end)
Returns the current capture range position.
Positions are byte offsets from the start of the subject string.
For uncaptured mem_num, ONIG_REGION_NOTPOS is set.
# int onig_get_used_stack_size_in_callout(OnigCalloutArgs* args, int* used_num, int* used_bytes)
Returns current used match-stack size.
used_num: number of match-stack elements
used_bytes: used byte size of match-stack
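Because capture positions are byte offsets into the subject string, a callout that wants the captured text has to copy it out itself. The helper below is a hypothetical sketch of that step: copy_capture is not an Oniguruma function, and REGION_NOTPOS is a local constant assumed to have the same value (-1) as ONIG_REGION_NOTPOS.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define REGION_NOTPOS (-1)  /* assumed equal to ONIG_REGION_NOTPOS */

/* Hypothetical helper: copies subject[begin, end) into buf and adds a
   terminating NUL. Returns the number of bytes copied, or -1 if the group
   is uncaptured, the range is malformed, or buf is too small. */
int copy_capture(const char* subject, int begin, int end,
                 char* buf, size_t buf_size)
{
  size_t len;
  assert(subject != NULL && buf != NULL);

  if (begin == REGION_NOTPOS || end == REGION_NOTPOS) return -1;
  if (begin < 0 || end < begin) return -1;

  len = (size_t)(end - begin);
  if (len + 1 > buf_size) return -1;  /* no room for the bytes plus NUL */

  memcpy(buf, subject + begin, len);
  buf[len] = '\0';
  return (int)len;
}
```

The begin and end arguments would be the values filled in by onig_get_capture_range_in_callout() for the requested mem_num.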
//END


@ -0,0 +1,382 @@
Callouts API Version 6.8.2 2018/04/13
#include <oniguruma.h>
(1) 呼び出し関数
(2) 内容の呼び出し関数の設定/取得
(3) 名前の呼び出し関数の設定
(4) ユーザデータ
(5) OnigCalloutArgsからの値の取得
(6) 名札
(7) 呼び出しデータ (呼び出し関数内から使用される)
(8) 呼び出しデータ (アプリケーションから使用される)
(9) その他の関数
(1) 呼び出し関数
型: OnigCalloutFunc
typedef int (*OnigCalloutFunc)(OnigCalloutArgs* args, void* user_data);
若し呼び出し関数として0(NULL)がセットされると、呼ばれることはない
* 呼び出し関数の戻り値 (int)
ONIG_CALLOUT_FAIL(1): 失敗
ONIG_CALLOUT_SUCCESS(0): 成功
-1未満: エラーコード (検索/照合の終了)
ONIG_CALLOUT_FAIL/SUCCESSは、後退中の呼び出しでは無視される。
後退は失敗の回復過程なので。
* 呼び出し関数の例
extern int always_success(OnigCalloutArgs* args, void* user_data)
{
return ONIG_CALLOUT_SUCCESS;
}
(2) 内容の呼び出し関数の設定/取得
# OnigCalloutFunc onig_get_progress_callout(void)
内容の呼び出し関数(前進)を返す
# int onig_set_progress_callout(OnigCalloutFunc f)
内容の呼び出し関数(前進)をセットする。
この値はonig_initialize_match_param()の中でデフォルトの呼び出し関数として
セットされる。
正常終了: ONIG_NORMAL
# OnigCalloutFunc onig_get_retraction_callout(void)
内容の呼び出し関数(後退)を返す
# int onig_set_retraction_callout(OnigCalloutFunc f)
内容の呼び出し関数(後退)をセットする。
この値はonig_initialize_match_param()の中でデフォルトの呼び出し関数として
セットされる。
正常終了: ONIG_NORMAL
# int onig_set_progress_callout_of_match_param(OnigMatchParam* mp, OnigCalloutFunc f)
内容の呼び出し関数(前進)をセットする。
引数
1 mp: match-paramアドレス
2 f: 関数
正常終了: ONIG_NORMAL
# int onig_set_retraction_callout_of_match_param(OnigMatchParam* mp, OnigCalloutFunc f)
内容の呼び出し関数(後退)をセットする。
引数
1 mp: match-paramアドレス
2 f: 関数
正常終了: ONIG_NORMAL
(3) 名前の呼び出し関数の設定
# int onig_set_callout_of_name(OnigEncoding enc, OnigCalloutType type, OnigUChar* name, OnigUChar* name_end, int callout_in, OnigCalloutFunc callout, OnigCalloutFunc end_callout, int arg_num, unsigned int arg_types[], int opt_arg_num, OnigValue opt_defaults[])
名前の呼び出し関数をセットする。
名前に許される文字: _ A-Z a-z 0-9 (* 最初の文字: _ A-Z a-z)
(enc, name)のペアが、呼び出し関数を見つけるためのキーとして使用される。
アプリケーションで使用される各エンコーディングに対してこの関数を呼ぶ必要がある。
しかし若しencエンコーディングがASCII互換であり、(enc, name)に対するエントリが
見つからない場合には、(ASCII, name)エントリが参照される。
従って、若しASCII互換エンコーディングのみ使用している場合には、この関数を(ASCII, name)
について一回呼べば十分である。
引数
1 enc: 文字エンコーディング
2 type: 呼び出し型 (現在は ONIG_CALLOUT_TYPE_SINGLE のみサポート)
3 name: 名前のアドレス (encでエンコーディングされている文字列)
4 name_end: 名前の終端アドレス
5 callout_in: 方向フラグ (ONIG_CALLOUT_IN_PROGRESS/RETRACTION/BOTH)
6 callout: 呼び出し関数
7 end_callout: *まだ使用していない (0をセット)
8 arg_num: 引数の数 (* 最大値 ONIG_CALLOUT_MAX_ARGS_NUM == 4)
9 arg_types: 引数の型の配列
10 opt_arg_num: オプション引数の数
11 opt_defaults: オプション引数のデフォルト値
正常終了: ONIG_NORMAL
error:
ONIGERR_INVALID_CALLOUT_NAME
ONIGERR_INVALID_ARGUMENT
ONIGERR_INVALID_CALLOUT_ARG
(4) ユーザデータ
# int onig_set_callout_user_data_of_match_param(OnigMatchParam* param, void* user_data)
呼び出し関数の引数として渡されるユーザデータをセットする。
正常終了: ONIG_NORMAL
(5) OnigCalloutArgsからの値の取得
# int onig_get_callout_num_by_callout_args(OnigCalloutArgs* args)
この呼び出しの呼び出し番号を返す。
"呼び出し番号"とは、正規表現パターンの中の呼び出しに対する識別子である。
# OnigCalloutIn onig_get_callout_in_by_callout_args(OnigCalloutArgs* args)
この呼び出しが起きた時の方向(前進中/後退中)を返す。
(ONIG_CALLOUT_IN_PROGRESS か ONIG_CALLOUT_IN_RETRACTION)
# int onig_get_name_id_by_callout_args(OnigCalloutArgs* args)
この呼び出しの名前(name)の識別子を返す。
若しこの呼び出しが内容の呼び出しのときには、ONIG_NON_NAME_IDが返される。
# const OnigUChar* onig_get_contents_by_callout_args(OnigCalloutArgs* args)
この呼び出しの内容文字列(NULL終端あり)を返す。
若しこの呼び出しが名前の呼び出しのときには、NULLを返す。
# const OnigUChar* onig_get_contents_end_by_callout_args(OnigCalloutArgs* args)
この呼び出しの内容(contents)の終端を返す。
若しこの呼び出しが名前の呼び出しのときには、NULLを返す。
# int onig_get_args_num_by_callout_args(OnigCalloutArgs* args)
この呼び出しの引数の数を返す。
正規表現パターンの中で渡されなかったオプション引数も含む。
若しこの呼び出しが内容の呼び出しのときには、ONIGERR_INVALID_ARGUMENTが返される。
# int onig_get_passed_args_num_by_callout_args(OnigCalloutArgs* args)
この呼び出しの本当に渡された引数の数を返す。
若しこの呼び出しが内容の呼び出しのときには、ONIGERR_INVALID_ARGUMENTが返される。
# int onig_get_arg_by_callout_args(OnigCalloutArgs* args, int index, OnigType* type, OnigValue* val)
この呼び出しの一個の引数の値と型を返す。
若しこの呼び出しが内容の呼び出しのときには、ONIGERR_INVALID_ARGUMENTが返される。
正常終了: ONIG_NORMAL
# const OnigUChar* onig_get_string_by_callout_args(OnigCalloutArgs* args)
対象文字列のアドレスを返す。
onig_search()の二番目の引数(str)である。
# const OnigUChar* onig_get_string_end_by_callout_args(OnigCalloutArgs* args)
対象文字列の終端アドレスを返す。
onig_search()の三番目の引数(end)である。
# const OnigUChar* onig_get_start_by_callout_args(OnigCalloutArgs* args)
対象文字列の現在の照合処理開始アドレスを返す。
# const OnigUChar* onig_get_right_range_by_callout_args(OnigCalloutArgs* args)
対象文字列の現在の照合範囲アドレスを返す。
# const OnigUChar* onig_get_current_by_callout_args(OnigCalloutArgs* args)
対象文字列の現在の照合位置アドレスを返す。
# OnigRegex onig_get_regex_by_callout_args(OnigCalloutArgs* args)
この呼び出しの正規表現オブジェクトのアドレスを返す。
# unsigned long onig_get_retry_counter_by_callout_args(OnigCalloutArgs* args)
retry-limit-in-matchのためのリトライカウンタの現在値を返す。
(6) 名札
"Tag" (名札)とは、正規表現パターンの中で呼び出しに割り当てられた名前である。
tag文字列に使用できる文字: _ A-Z a-z 0-9 (* 先頭の文字: _ A-Z a-z)
# int onig_callout_tag_is_exist_at_callout_num(OnigRegex reg, int callout_num)
その呼び出しにtagが割り当てられていれば1を返す、そうでなければ0を返す。
# const OnigUChar* onig_get_callout_tag_start(OnigRegex reg, int callout_num)
その呼び出しに対するtag文字列(NULL終端あり)の先頭アドレスを返す。
# const OnigUChar* onig_get_callout_tag_end(OnigRegex reg, int callout_num)
その呼び出しに対するtag文字列の終端アドレスを返す。
# int onig_get_callout_num_by_tag(OnigRegex reg, const OnigUChar* tag, const OnigUChar* tag_end)
そのtagに対する呼び出し番号を返す。
(7) 呼び出しデータ (呼び出し関数内から使用される)
"呼び出しデータ" (callout data)とは、
それぞれの呼び出しに対してそれぞれの検索処理の中で割り当てられた、
ONIG_CALLOUT_DATA_SLOT_NUM(== 5)個の値の領域である。
一個の呼び出しに対する各値の領域は、"スロット"(slot)番号(0 - 4)によって示される。
呼び出しデータは呼び出し関数の実装者によって任意の目的に使用される。
# int onig_get_callout_data_by_callout_args(OnigCalloutArgs* args, int callout_num, int slot, OnigType* type, OnigValue* val)
callout_num/slotによって示された呼び出しスロットに対するデータの値/型を返す。
正常終了: ONIG_NORMAL
1: 値が未セット (typeは ONIG_TYPE_VOID)
< 0: エラーコード
# int onig_get_callout_data_by_callout_args_self(OnigCalloutArgs* args, int slot, OnigType* type, OnigValue* val)
自分自身の呼び出しのslotによって示されたスロットに対するデータの値/型を返す。
正常終了: ONIG_NORMAL
1: 値が未セット (typeは ONIG_TYPE_VOID)
< 0: エラーコード
# int onig_set_callout_data_by_callout_args(OnigCalloutArgs* args, int callout_num, int slot, OnigType type, OnigValue* val)
callout_num/slotによって示された呼び出しスロットに対する値/型をセットする。。
正常終了: ONIG_NORMAL
< 0: エラーコード
# int onig_set_callout_data_by_callout_args_self(OnigCalloutArgs* args, int slot, OnigType type, OnigValue* val)
自分自身の呼び出しのslotによって示されたスロットに対する値/型をセットする。。
正常終了: ONIG_NORMAL
< 0: エラーコード
# int onig_get_callout_data_by_callout_args_self_dont_clear_old(OnigCalloutArgs* args, int slot, OnigType* type, OnigValue* val)
この関数は、onig_get_callout_data_by_callout_args_self()とほぼ同じである。
しかしこの関数は、現在の照合処理以前の失敗した照合処理の中でセットされた値を
クリアしない。
他のonig_get_callout_data_xxxx()関数は、以前の失敗した照合処理の中でセットされた値を
クリアする。
例えば、組み込み呼び出し(*TOTAL_COUNT)は、検索処理の中の全ての照合処理の積算カウントを
得るためにこの関数を使用して実装されている。
組み込む呼び出し(*COUNT)は、この関数を使用しないので、最後の成功した照合処理だけの
カウントを返す。
(8) 呼び出しデータ (アプリケーションから使用される)
# int onig_get_callout_data(OnigRegex reg, OnigMatchParam* mp, int callout_num, int slot, OnigType* type, OnigValue* val)
callout_num/slotによって示された呼び出しスロットに対するデータの値/型を返す。
正常終了: ONIG_NORMAL
1: 値が未セット (typeは ONIG_TYPE_VOID)
< 0: エラーコード
# int onig_get_callout_data_by_tag(OnigRegex reg, OnigMatchParam* mp, const OnigUChar* tag, const OnigUChar* tag_end, int slot, OnigType* type, OnigValue* val)
tag/slotによって示された呼び出しスロットに対するデータの値/型を返す。
正常終了: ONIG_NORMAL
1: 値が未セット (typeは ONIG_TYPE_VOID)
< 0: エラーコード
# int onig_set_callout_data(OnigRegex reg, OnigMatchParam* mp, int callout_num, int slot, OnigType type, OnigValue* val)
callout_num/slotによって示された呼び出しスロットに対する値/型をセットする。。
正常終了: ONIG_NORMAL
< 0: エラーコード
# int onig_set_callout_data_by_tag(OnigRegex reg, OnigMatchParam* mp, const OnigUChar* tag, const OnigUChar* tag_end, int slot, OnigType type, OnigValue* val)
tag/slotによって示された呼び出しスロットに対する値/型をセットする。。
正常終了: ONIG_NORMAL
< 0: エラーコード
# int onig_get_callout_data_dont_clear_old(OnigRegex reg, OnigMatchParam* mp, int callout_num, int slot, OnigType* type, OnigValue* val)
この関数を使用する必要はないと思われる。
廃止予定。
(9) その他の関数
# OnigUChar* onig_get_callout_name_by_name_id(int name_id)
名前の識別子に対する名前を返す。
不正な識別子が渡された場合には0を返す。
# int onig_get_capture_range_in_callout(OnigCalloutArgs* args, int mem_num, int* begin, int* end)
現在の捕獲範囲を返す。
位置は、対象文字列に対するバイト単位で表される。
未捕獲のmem_numに対しては、ONIG_REGION_NOTPOSがセットされる。
# int onig_get_used_stack_size_in_callout(OnigCalloutArgs* args, int* used_num, int* used_bytes)
現在使用されている照合処理用スタックサイズを返す。
used_num: 要素数
used_bytes: バイト数
//END


@ -1,4 +1,4 @@
CALLOUTS.BUILTIN 2018/03/19
CALLOUTS.BUILTIN 2018/03/26
* FAIL (progress)
@ -12,15 +12,15 @@ CALLOUTS.BUILTIN 2018/03/19
(*MISMATCH)
Terminate Match process.
Continue Search process.
Terminates Match process.
Continues Search process.
* ERROR (progress)
(*ERROR{n::LONG})
Terminate Search/Match process.
Terminates Search/Match process.
Return value is the argument 'n'. (The value must be less than -1)
'n' is an optional argument. (default value is ONIG_ABORT)
@ -28,12 +28,20 @@ CALLOUTS.BUILTIN 2018/03/19
* MAX (progress/retraction)
(*MAX{n::LONG})
(*MAX{n::LONG/TAG, c::CHAR})
Restrict the maximum count of success.
Restricts the maximum count of success(default), progress or retraction.
If 'n' is a tag, the slot 0 value of that tag is used.
Depending on the 'c' argument, the slot 0 value changes.
'c' is an optional argument, default value is 'X'.
(* success count = progress count - retraction count)
ex. "(?:(*COUNT[T]{X})a)*(?:(*MAX{T})c)*"
[callout data]
slot 0: current success count.
slot 0: '>': progress count, '<': retraction count, 'X': success count (default)
* COUNT (progress/retraction)
@ -42,15 +50,13 @@ CALLOUTS.BUILTIN 2018/03/19
Counter.
Depends on 'c' argument, the slot 0 value changes.
'c' is an optional argument, deefault value is '>'.
'c' is an optional argument, default value is '>'.
[callout data]
slot 0: '>': progress count, '<': retraction count, 'X': success count
slot 0: '>': progress count (default), '<': retraction count, 'X': success count
slot 1: progress count
slot 2: retraction count
(* success count = progress count - retraction count)
** If option ONIG_OPTION_FIND_LONGEST or ONIG_OPTION_FIND_NOT_EMPTY is used,
counts are not accurate.
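The slot layout listed for COUNT, together with the formula "success count = progress count - retraction count" given in this file, can be modelled as below. The model is illustrative only: count_model and its functions are hypothetical names, and the counters are driven by hand instead of by a regex engine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the COUNT callout data; not library code. */
typedef struct {
  char c;           /* the 'c' argument: '>', '<' or 'X' */
  long progress;    /* number of calls in progress */
  long retraction;  /* number of calls in retraction */
} count_model;

void count_model_init(count_model* m, char c)
{
  assert(m != NULL);
  assert(c == '>' || c == '<' || c == 'X');
  m->c = c;
  m->progress = 0;
  m->retraction = 0;
}

void count_model_progress(count_model* m)   { m->progress++; }
void count_model_retraction(count_model* m) { m->retraction++; }

/* Value of a slot as described above; -1 for slots the text does not list. */
long count_model_slot(const count_model* m, int slot)
{
  switch (slot) {
  case 0:
    if (m->c == '<') return m->retraction;
    if (m->c == 'X') return m->progress - m->retraction;  /* success count */
    return m->progress;                                   /* '>' */
  case 1: return m->progress;
  case 2: return m->retraction;
  default: return -1;
  }
}
```

After three calls in progress and one in retraction, slot 0 holds 3 for '>', 1 for '<' and 2 for 'X', while slots 1 and 2 hold 3 and 1 regardless of 'c'.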
@ -61,10 +67,10 @@ CALLOUTS.BUILTIN 2018/03/19
It is almost the same as COUNT.
But the counts are accumulated over a whole search process.
'c' is an optional argument, deefault value is '>'.
'c' is an optional argument, default value is '>'.
[callout data]
slot 0: '>': progress count, '<': retraction count, 'X': success count
slot 0: '>': progress count (default), '<': retraction count, 'X': success count
slot 1: progress count
slot 2: retraction count
@ -76,7 +82,8 @@ CALLOUTS.BUILTIN 2018/03/19
(*CMP{x::TAG/LONG, op::STRING, y::TAG/LONG})
Compare x value and y value with op operator.
Compares x value and y value with op operator.
If x or y is a tag, the slot 0 value of that tag is used.
op: '==', '!=', '>', '<', '>=', '<='


@ -1,4 +1,4 @@
CALLOUTS.BUILTIN.ja 2018/03/19
CALLOUTS.BUILTIN.ja 2018/03/26
* FAIL (前進)
@ -27,12 +27,17 @@ CALLOUTS.BUILTIN.ja 2018/03/19
* MAX (前進/後退)
(*MAX{n::LONG})
(*MAX{n::LONG/TAG, c::CHAR})
成功回数を制限する
成功(デフォルト)、前進または後退回数を制限する
'n'がTAGのときは、そのTAGのcalloutのslot 0の値が使用される
'c'引数の値によって、slot 0の値が変化する
'c'はオプション引数で、デフォルト値は'X'
例: "(?:(*COUNT[T]{X})a)*(?:(*MAX{T})c)*"
[callout data]
slot 0: 現在の成功回数
slot 0: '>': 前進回数, '<': 後退回数, 'X': 成功回数(デフォルト)
* COUNT (前進/後退)
@ -44,7 +49,7 @@ CALLOUTS.BUILTIN.ja 2018/03/19
'c'はオプション引数で、デフォルト値は'>'
[callout data]
slot 0: '>': 前進回数, '<': 後退回数, 'X': 成功回数
slot 0: '>': 前進回数(デフォルト), '<': 後退回数, 'X': 成功回数
slot 1: 前進回数
slot 2: 後退回数
@ -63,7 +68,7 @@ CALLOUTS.BUILTIN.ja 2018/03/19
'c'はオプション引数で、デフォルト値は'>'
[callout data]
slot 0: '>': 前進回数, '<': 後退回数, 'X': 成功回数
slot 0: '>': 前進回数(デフォルト), '<': 後退回数, 'X': 成功回数
slot 1: 前進回数
slot 2: 後退回数


@ -1,4 +1,4 @@
Oniguruma Regular Expressions Version 6.8.0 2018/03/08
Oniguruma Regular Expressions Version 6.8.0 2018/04/13
syntax: ONIG_SYNTAX_ONIGURUMA (default)
@ -266,19 +266,32 @@ syntax: ONIG_SYNTAX_ONIGURUMA (default)
<Callouts>
* Callouts of contents
(?{...contents...}) callouts in progress
(?{...contents...}D) D is a direction flag char. ('X' or '<' or '>')
D = 'X': progress and retraction, '<': retraction only
'>': progress only (default)
(?{...contents...}) callout in progress
(?{...contents...}D) D is a direction flag char
D = 'X': in progress and retraction
'<': in retraction only
'>': in progress only
(?{...contents...}[tag]) tag assigned
(?{...contents...}[tag]D)
* Escape characters have no effects in contents.
* contents is not allowed to start with '{'.
(?{{{...contents...}}}) a run of n consecutive '}' characters is allowed in
contents delimited by (n+1) consecutive braces {{{...}}}.
Allowed tag string characters: _ A-Z a-z 0-9 (* first character: _ A-Z a-z)
* Callouts of name
(*name)
(*name{args...}) with args
(*name[tag]) tag assigned
(*name[tag]{args...})
Allowed name string characters: _ A-Z a-z 0-9 (* first character: _ A-Z a-z)
Allowed tag string characters: _ A-Z a-z 0-9 (* first character: _ A-Z a-z)
<Absent functions>


@ -1,4 +1,4 @@
鬼車 正規表現 Version 6.8.0 2018/03/08
鬼車 正規表現 Version 6.8.0 2018/04/13
使用文法: ONIG_SYNTAX_ONIGURUMA (既定値)
@ -269,17 +269,31 @@
* 内容の呼び出し
(?{...contents...}) 前進中のみの呼び出し
(?{...contents...}D) Dは方向指定文字 ('X' or '<' or '>')
D = 'X': 前進および後退, '<' 後退のみ, '>': 前進のみ
(?{...contents...}D) Dは方向指定文字
D = 'X': 前進中および後退中
'<': 後退中のみ
'>': 前進中のみ
(?{...contents...}[tag]) 名札付き
(?{...contents...}[tag]D)
* エスケープ文字はcontentsの中で何の機能も持たない
* contentsは、'{'文字で始まってはならない
(?{{{...contents...}}}) contentsの中のn個連続の'}'は、(n+1)個連続の{{{...}}}
の中で許される
tagに許される文字: _ A-Z a-z 0-9 (* 最初の文字: _ A-Z a-z)
* 名前の呼び出し
(*name)
(*name{args...}) 引数付き
(*name[tag]) 名札付き
(*name[tag]{args...})
nameに許される文字: _ A-Z a-z 0-9 (* 最初の文字: _ A-Z a-z)
tag に許される文字: _ A-Z a-z 0-9 (* 最初の文字: _ A-Z a-z)
<不在機能群>
@ -296,7 +310,7 @@
例 (?~|345|\d*) "12345678" ==> "12", "1", ""
(?~|不在式) 不在停止 (* 原作)
この演算子を通過した後は、対象文字列の適合範囲の最後
この演算子を通過した後は、対象文字列の適合範囲が
<不在式>に適合する文字列を含まない範囲に制限される。
(?~|) 範囲消去


@ -8,7 +8,7 @@
<h1>Oniguruma</h1> (<a href="index_ja.html">Japanese</a>)
<p>
(c) K.Kosako, updated at: 2018/03/19
(c) K.Kosako, updated at: 2018/04/14
</p>
<dl>
@ -16,6 +16,7 @@
<dt><b>What's new</b>
</font>
<ul>
<li>2018/04/17: Version 6.8.2 released.</li>
<li>2018/03/19: Version 6.8.1 released.</li>
<li>2018/03/16: Version 6.8.0 released.</li>
<li>2018/01/26: Version 6.7.1 released.</li>


@ -8,7 +8,7 @@
<h1>鬼車</h1>
<p>
(c) K.Kosako, 最終更新: 2018/03/19
(c) K.Kosako, 最終更新: 2018/04/14
</p>
<dl>
@ -16,6 +16,7 @@
<dt><b>更新情報</b>
</font>
<ul>
<li>2018/04/17: Version 6.8.2 リリース</li>
<li>2018/03/19: Version 6.8.1 リリース</li>
<li>2018/03/16: Version 6.8.0 リリース</li>
<li>2018/01/26: Version 6.7.1 リリース</li>


@ -37,16 +37,19 @@ init(void)
int id;
OnigEncoding enc;
char* name;
unsigned int t_long;
unsigned int args[4];
OnigValue opts[4];
enc = ONIG_ENCODING_ASCII;
t_long = ONIG_TYPE_LONG;
name = "FAIL"; BC0_P(name, fail);
name = "MISMATCH"; BC0_P(name, mismatch);
name = "MAX"; BC_B(name, max, 1, &t_long);
name = "MAX";
args[0] = ONIG_TYPE_TAG | ONIG_TYPE_LONG;
args[1] = ONIG_TYPE_CHAR;
opts[0].c = 'X';
BC_B_O(name, max, 2, args, 1, opts);
name = "ERROR";
args[0] = ONIG_TYPE_LONG; opts[0].l = ONIG_ABORT;
@ -110,5 +113,6 @@ OnigEncodingType OnigEncodingASCII = {
init,
0, /* is_initialized */
onigenc_always_true_is_valid_mbc_string,
0, 0, 0
ENC_FLAG_ASCII_COMPATIBLE,
0, 0
};


@ -187,5 +187,6 @@ OnigEncodingType OnigEncodingBIG5 = {
NULL, /* init */
NULL, /* is_initialized */
is_valid_mbc_string,
0, 0, 0
ENC_FLAG_ASCII_COMPATIBLE,
0, 0
};


@ -200,5 +200,6 @@ OnigEncodingType OnigEncodingCP1251 = {
NULL, /* init */
NULL, /* is_initialized */
onigenc_always_true_is_valid_mbc_string,
0, 0, 0
ENC_FLAG_ASCII_COMPATIBLE,
0, 0
};


@ -307,5 +307,6 @@ OnigEncodingType OnigEncodingEUC_JP = {
NULL, /* init */
NULL, /* is_initialized */
is_valid_mbc_string,
0, 0, 0
ENC_FLAG_ASCII_COMPATIBLE,
0, 0
};


@ -185,5 +185,6 @@ OnigEncodingType OnigEncodingEUC_CN = {
NULL, /* init */
NULL, /* is_initialized */
is_valid_mbc_string,
0, 0, 0
ENC_FLAG_ASCII_COMPATIBLE,
0, 0
};


@ -168,5 +168,6 @@ OnigEncodingType OnigEncodingEUC_TW = {
NULL, /* init */
NULL, /* is_initialized */
is_valid_mbc_string,
0, 0, 0
ENC_FLAG_ASCII_COMPATIBLE,
0, 0
};


@ -89,25 +89,25 @@ is_valid_mbc_string(const UChar* p, const UChar* end)
p++;
if (p >= end) return FALSE;
if (*p < 0x40) {
if (*p < 0x30 || *p > 0x39)
return FALSE;
if (*p < 0x30 || *p > 0x39)
return FALSE;
p++;
if (p >= end) return FALSE;
if (*p < 0x81 || *p == 0xff) return FALSE;
p++;
if (p >= end) return FALSE;
if (*p < 0x81 || *p == 0xff) return FALSE;
p++;
if (p >= end) return FALSE;
if (*p < 0x30 || *p > 0x39)
return FALSE;
p++;
if (p >= end) return FALSE;
if (*p < 0x30 || *p > 0x39)
return FALSE;
p++;
p++;
}
else if (*p == 0x7f || *p == 0xff) {
return FALSE;
return FALSE;
}
else {
p++;
p++;
}
}
}
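Review note: the hunk above only re-indents the GB18030 validity check, so the byte ranges are unchanged. As a reading aid, here is a minimal standalone sketch of the checks visible in the hunk. The handling of the first byte (ASCII below 0x80, and 0x80/0xff rejected as lead bytes) is not shown in the diff and is an assumption, as are all `sketch_` names.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char UChar;

/* Returns 1 if [p, end) is a well-formed GB18030 byte sequence, else 0. */
int sketch_gb18030_is_valid(const UChar* p, const UChar* end)
{
  while (p < end) {
    if (*p < 0x80) {                      /* assumption: single-byte ASCII */
      p++;
    }
    else if (*p == 0x80 || *p == 0xff) {  /* assumption: not a lead byte */
      return 0;
    }
    else {
      p++;                                /* move to the second byte */
      if (p >= end) return 0;             /* truncated character */
      if (*p < 0x40) {
        /* four-byte form: lead, digit, 0x81..0xfe, digit */
        if (*p < 0x30 || *p > 0x39) return 0;
        p++;
        if (p >= end) return 0;
        if (*p < 0x81 || *p == 0xff) return 0;
        p++;
        if (p >= end) return 0;
        if (*p < 0x30 || *p > 0x39) return 0;
        p++;
      }
      else if (*p == 0x7f || *p == 0xff) {
        return 0;                         /* invalid second byte */
      }
      else {
        p++;                              /* two-byte form */
      }
    }
  }
  return 1;
}

/* Convenience wrapper (hypothetical) so callers can pass a literal. */
int sketch_gb18030_valid_n(const char* s, size_t n)
{
  const UChar* p = (const UChar*)s;
  return sketch_gb18030_is_valid(p, p + n);
}
```

For example, `sketch_gb18030_valid_n("\x81\x40", 2)` accepts a two-byte character, while a lone lead byte or a second byte of 0x7f is rejected.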
@ -138,7 +138,7 @@ gb18030_mbc_case_fold(OnigCaseFoldType flag, const UChar** pp, const UChar* end,
#if 0
static int
gb18030_is_mbc_ambiguous(OnigCaseFoldType flag,
const UChar** pp, const UChar* end)
const UChar** pp, const UChar* end)
{
return onigenc_mbn_is_mbc_ambiguous(ONIG_ENCODING_GB18030, flag, pp, end);
}
@ -197,16 +197,16 @@ gb18030_left_adjust_char_head(const UChar* start, const UChar* s)
case S_START:
switch (GB18030_MAP[*p]) {
case C1:
return (UChar *)s;
return (UChar *)s;
case C2:
state = S_one_C2; /* C2 */
break;
state = S_one_C2; /* C2 */
break;
case C4:
state = S_one_C4; /* C4 */
break;
state = S_one_C4; /* C4 */
break;
case CM:
state = S_one_CM; /* CM */
break;
state = S_one_CM; /* CM */
break;
}
break;
case S_one_C2: /* C2 */
@ -214,10 +214,10 @@ gb18030_left_adjust_char_head(const UChar* start, const UChar* s)
case C1:
case C2:
case C4:
return (UChar *)s;
return (UChar *)s;
case CM:
state = S_odd_CM_one_CX; /* CM C2 */
break;
state = S_odd_CM_one_CX; /* CM C2 */
break;
}
break;
case S_one_C4: /* C4 */
@ -225,23 +225,23 @@ gb18030_left_adjust_char_head(const UChar* start, const UChar* s)
case C1:
case C2:
case C4:
return (UChar *)s;
return (UChar *)s;
case CM:
state = S_one_CMC4;
break;
state = S_one_CMC4;
break;
}
break;
case S_one_CM: /* CM */
switch (GB18030_MAP[*p]) {
case C1:
case C2:
return (UChar *)s;
return (UChar *)s;
case C4:
state = S_odd_C4CM;
break;
state = S_odd_C4CM;
break;
case CM:
state = S_odd_CM_one_CX; /* CM CM */
break;
state = S_odd_CM_one_CX; /* CM CM */
break;
}
break;
@ -250,10 +250,10 @@ gb18030_left_adjust_char_head(const UChar* start, const UChar* s)
case C1:
case C2:
case C4:
return (UChar *)(s - 1);
return (UChar *)(s - 1);
case CM:
state = S_even_CM_one_CX;
break;
state = S_even_CM_one_CX;
break;
}
break;
case S_even_CM_one_CX: /* CM CM C2 */ /* CM CM CM */ /* CM CM C4 */
@ -261,10 +261,10 @@ gb18030_left_adjust_char_head(const UChar* start, const UChar* s)
case C1:
case C2:
case C4:
return (UChar *)s;
return (UChar *)s;
case CM:
state = S_odd_CM_one_CX;
break;
state = S_odd_CM_one_CX;
break;
}
break;
@ -272,26 +272,26 @@ gb18030_left_adjust_char_head(const UChar* start, const UChar* s)
switch (GB18030_MAP[*p]) {
case C1:
case C2:
return (UChar *)(s - 1);
return (UChar *)(s - 1);
case C4:
state = S_one_C4_odd_CMC4; /* C4 CM C4 */
break;
state = S_one_C4_odd_CMC4; /* C4 CM C4 */
break;
case CM:
state = S_even_CM_one_CX; /* CM CM C4 */
break;
state = S_even_CM_one_CX; /* CM CM C4 */
break;
}
break;
case S_odd_CMC4: /* CM C4 CM C4 CM C4 */
switch (GB18030_MAP[*p]) {
case C1:
case C2:
return (UChar *)(s - 1);
return (UChar *)(s - 1);
case C4:
state = S_one_C4_odd_CMC4;
break;
state = S_one_C4_odd_CMC4;
break;
case CM:
state = S_odd_CM_odd_CMC4;
break;
state = S_odd_CM_odd_CMC4;
break;
}
break;
case S_one_C4_odd_CMC4: /* C4 CM C4 */
@ -299,23 +299,23 @@ gb18030_left_adjust_char_head(const UChar* start, const UChar* s)
case C1:
case C2:
case C4:
return (UChar *)(s - 1);
return (UChar *)(s - 1);
case CM:
state = S_even_CMC4; /* CM C4 CM C4 */
break;
state = S_even_CMC4; /* CM C4 CM C4 */
break;
}
break;
case S_even_CMC4: /* CM C4 CM C4 */
switch (GB18030_MAP[*p]) {
case C1:
case C2:
return (UChar *)(s - 3);
return (UChar *)(s - 3);
case C4:
state = S_one_C4_even_CMC4;
break;
state = S_one_C4_even_CMC4;
break;
case CM:
state = S_odd_CM_even_CMC4;
break;
state = S_odd_CM_even_CMC4;
break;
}
break;
case S_one_C4_even_CMC4: /* C4 CM C4 CM C4 */
@ -323,10 +323,10 @@ gb18030_left_adjust_char_head(const UChar* start, const UChar* s)
case C1:
case C2:
case C4:
return (UChar *)(s - 3);
return (UChar *)(s - 3);
case CM:
state = S_odd_CMC4;
break;
state = S_odd_CMC4;
break;
}
break;
@ -335,10 +335,10 @@ gb18030_left_adjust_char_head(const UChar* start, const UChar* s)
case C1:
case C2:
case C4:
return (UChar *)(s - 3);
return (UChar *)(s - 3);
case CM:
state = S_even_CM_odd_CMC4;
break;
state = S_even_CM_odd_CMC4;
break;
}
break;
case S_even_CM_odd_CMC4: /* CM CM CM C4 CM C4 CM C4 */
@ -346,10 +346,10 @@ gb18030_left_adjust_char_head(const UChar* start, const UChar* s)
case C1:
case C2:
case C4:
return (UChar *)(s - 1);
return (UChar *)(s - 1);
case CM:
state = S_odd_CM_odd_CMC4;
break;
state = S_odd_CM_odd_CMC4;
break;
}
break;
@ -358,10 +358,10 @@ gb18030_left_adjust_char_head(const UChar* start, const UChar* s)
case C1:
case C2:
case C4:
return (UChar *)(s - 1);
return (UChar *)(s - 1);
case CM:
state = S_even_CM_even_CMC4;
break;
state = S_even_CM_even_CMC4;
break;
}
break;
case S_even_CM_even_CMC4: /* CM CM CM C4 CM C4 */
@ -369,10 +369,10 @@ gb18030_left_adjust_char_head(const UChar* start, const UChar* s)
case C1:
case C2:
case C4:
return (UChar *)(s - 3);
return (UChar *)(s - 3);
case CM:
state = S_odd_CM_even_CMC4;
break;
state = S_odd_CM_even_CMC4;
break;
}
break;
@ -381,23 +381,23 @@ gb18030_left_adjust_char_head(const UChar* start, const UChar* s)
case C1:
case C2:
case C4:
return (UChar *)s;
return (UChar *)s;
case CM:
state = S_one_CM_odd_C4CM; /* CM C4 CM */
break;
state = S_one_CM_odd_C4CM; /* CM C4 CM */
break;
}
break;
case S_one_CM_odd_C4CM: /* CM C4 CM */ /* CM C4 CM C4 CM C4 CM */
switch (GB18030_MAP[*p]) {
case C1:
case C2:
return (UChar *)(s - 2); /* |CM C4 CM */
return (UChar *)(s - 2); /* |CM C4 CM */
case C4:
state = S_even_C4CM;
break;
state = S_even_C4CM;
break;
case CM:
state = S_even_CM_odd_C4CM;
break;
state = S_even_CM_odd_C4CM;
break;
}
break;
case S_even_C4CM: /* C4 CM C4 CM */
@ -405,23 +405,23 @@ gb18030_left_adjust_char_head(const UChar* start, const UChar* s)
case C1:
case C2:
case C4:
return (UChar *)(s - 2); /* C4|CM C4 CM */
return (UChar *)(s - 2); /* C4|CM C4 CM */
case CM:
state = S_one_CM_even_C4CM;
break;
state = S_one_CM_even_C4CM;
break;
}
break;
case S_one_CM_even_C4CM: /* CM C4 CM C4 CM */
switch (GB18030_MAP[*p]) {
case C1:
case C2:
return (UChar *)(s - 0); /*|CM C4 CM C4|CM */
return (UChar *)(s - 0); /*|CM C4 CM C4|CM */
case C4:
state = S_odd_C4CM;
break;
state = S_odd_C4CM;
break;
case CM:
state = S_even_CM_even_C4CM;
break;
state = S_even_CM_even_C4CM;
break;
}
break;
@ -430,10 +430,10 @@ gb18030_left_adjust_char_head(const UChar* start, const UChar* s)
case C1:
case C2:
case C4:
return (UChar *)(s - 0); /* |CM CM|C4|CM */
return (UChar *)(s - 0); /* |CM CM|C4|CM */
case CM:
state = S_odd_CM_odd_C4CM;
break;
state = S_odd_CM_odd_C4CM;
break;
}
break;
case S_odd_CM_odd_C4CM: /* CM CM CM C4 CM */
@ -441,10 +441,10 @@ gb18030_left_adjust_char_head(const UChar* start, const UChar* s)
case C1:
case C2:
case C4:
return (UChar *)(s - 2); /* |CM CM|CM C4 CM */
return (UChar *)(s - 2); /* |CM CM|CM C4 CM */
case CM:
state = S_even_CM_odd_C4CM;
break;
state = S_even_CM_odd_C4CM;
break;
}
break;
@ -453,10 +453,10 @@ gb18030_left_adjust_char_head(const UChar* start, const UChar* s)
case C1:
case C2:
case C4:
return (UChar *)(s - 2); /* |CM CM|C4|CM C4 CM */
return (UChar *)(s - 2); /* |CM CM|C4|CM C4 CM */
case CM:
state = S_odd_CM_even_C4CM;
break;
state = S_odd_CM_even_C4CM;
break;
}
break;
case S_odd_CM_even_C4CM: /* CM CM CM C4 CM C4 CM */
@ -464,10 +464,10 @@ gb18030_left_adjust_char_head(const UChar* start, const UChar* s)
case C1:
case C2:
case C4:
return (UChar *)(s - 0); /* |CM CM|CM C4 CM C4|CM */
return (UChar *)(s - 0); /* |CM CM|CM C4 CM C4|CM */
case CM:
state = S_even_CM_even_C4CM;
break;
state = S_even_CM_even_C4CM;
break;
}
break;
}
@ -535,5 +535,6 @@ OnigEncodingType OnigEncodingGB18030 = {
NULL, /* init */
NULL, /* is_initialized */
is_valid_mbc_string,
0, 0, 0
ENC_FLAG_ASCII_COMPATIBLE,
0, 0
};


@ -272,5 +272,6 @@ OnigEncodingType OnigEncodingISO_8859_1 = {
NULL, /* init */
NULL, /* is_initialized */
onigenc_always_true_is_valid_mbc_string,
0, 0, 0
ENC_FLAG_ASCII_COMPATIBLE,
0, 0
};


@ -239,5 +239,6 @@ OnigEncodingType OnigEncodingISO_8859_10 = {
NULL, /* init */
NULL, /* is_initialized */
onigenc_always_true_is_valid_mbc_string,
0, 0, 0
ENC_FLAG_ASCII_COMPATIBLE,
0, 0
};


@ -96,5 +96,6 @@ OnigEncodingType OnigEncodingISO_8859_11 = {
NULL, /* init */
NULL, /* is_initialized */
onigenc_always_true_is_valid_mbc_string,
0, 0, 0
ENC_FLAG_ASCII_COMPATIBLE,
0, 0
};


@ -228,5 +228,6 @@ OnigEncodingType OnigEncodingISO_8859_13 = {
NULL, /* init */
NULL, /* is_initialized */
onigenc_always_true_is_valid_mbc_string,
0, 0, 0
ENC_FLAG_ASCII_COMPATIBLE,
0, 0
};


@ -241,5 +241,6 @@ OnigEncodingType OnigEncodingISO_8859_14 = {
NULL, /* init */
NULL, /* is_initialized */
onigenc_always_true_is_valid_mbc_string,
0, 0, 0
ENC_FLAG_ASCII_COMPATIBLE,
0, 0
};


@ -235,5 +235,6 @@ OnigEncodingType OnigEncodingISO_8859_15 = {
NULL, /* init */
NULL, /* is_initialized */
onigenc_always_true_is_valid_mbc_string,
0, 0, 0
ENC_FLAG_ASCII_COMPATIBLE,
0, 0
};


@ -237,5 +237,6 @@ OnigEncodingType OnigEncodingISO_8859_16 = {
NULL, /* init */
NULL, /* is_initialized */
onigenc_always_true_is_valid_mbc_string,
0, 0, 0
ENC_FLAG_ASCII_COMPATIBLE,
0, 0
};


@ -235,5 +235,6 @@ OnigEncodingType OnigEncodingISO_8859_2 = {
NULL, /* init */
NULL, /* is_initialized */
onigenc_always_true_is_valid_mbc_string,
0, 0, 0
ENC_FLAG_ASCII_COMPATIBLE,
0, 0
};


@ -235,5 +235,6 @@ OnigEncodingType OnigEncodingISO_8859_3 = {
NULL, /* init */
NULL, /* is_initialized */
onigenc_always_true_is_valid_mbc_string,
0, 0, 0
ENC_FLAG_ASCII_COMPATIBLE,
0, 0
};


@ -237,5 +237,6 @@ OnigEncodingType OnigEncodingISO_8859_4 = {
NULL, /* init */
NULL, /* is_initialized */
onigenc_always_true_is_valid_mbc_string,
0, 0, 0
ENC_FLAG_ASCII_COMPATIBLE,
0, 0
};


@ -226,5 +226,6 @@ OnigEncodingType OnigEncodingISO_8859_5 = {
NULL, /* init */
NULL, /* is_initialized */
onigenc_always_true_is_valid_mbc_string,
0, 0, 0
ENC_FLAG_ASCII_COMPATIBLE,
0, 0
};


@ -96,5 +96,6 @@ OnigEncodingType OnigEncodingISO_8859_6 = {
NULL, /* init */
NULL, /* is_initialized */
onigenc_always_true_is_valid_mbc_string,
0, 0, 0
ENC_FLAG_ASCII_COMPATIBLE,
0, 0
};


@ -222,5 +222,6 @@ OnigEncodingType OnigEncodingISO_8859_7 = {
NULL, /* init */
NULL, /* is_initialized */
onigenc_always_true_is_valid_mbc_string,
0, 0, 0
ENC_FLAG_ASCII_COMPATIBLE,
0, 0
};


@ -96,5 +96,6 @@ OnigEncodingType OnigEncodingISO_8859_8 = {
NULL, /* init */
NULL, /* is_initialized */
onigenc_always_true_is_valid_mbc_string,
0, 0, 0
ENC_FLAG_ASCII_COMPATIBLE,
0, 0
};


@ -228,5 +228,6 @@ OnigEncodingType OnigEncodingISO_8859_9 = {
NULL, /* init */
NULL, /* is_initialized */
onigenc_always_true_is_valid_mbc_string,
0, 0, 0
ENC_FLAG_ASCII_COMPATIBLE,
0, 0
};


@ -250,5 +250,6 @@ OnigEncodingType OnigEncodingKOI8 = {
NULL, /* init */
NULL, /* is_initialized */
onigenc_always_true_is_valid_mbc_string,
0, 0, 0
ENC_FLAG_ASCII_COMPATIBLE,
0, 0
};


@ -212,5 +212,6 @@ OnigEncodingType OnigEncodingKOI8_R = {
NULL, /* init */
NULL, /* is_initialized */
onigenc_always_true_is_valid_mbc_string,
0, 0, 0
ENC_FLAG_ASCII_COMPATIBLE,
0, 0
};


@ -36,7 +36,7 @@ extern "C" {
#define ONIGURUMA
#define ONIGURUMA_VERSION_MAJOR 6
#define ONIGURUMA_VERSION_MINOR 8
#define ONIGURUMA_VERSION_TEENY 1
#define ONIGURUMA_VERSION_TEENY 2
#define ONIGURUMA_VERSION_INT 60801
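Review note: this hunk bumps `ONIGURUMA_VERSION_TEENY` to 2 while `ONIGURUMA_VERSION_INT` stays at 60801. The header does not state how the integer is derived. The sketch below assumes the rule major*10000 + minor*100 + teeny, which reproduces 60801 for 6.8.1; the function name is hypothetical.

```c
#include <assert.h>

/* Assumed packing rule; not stated anywhere in the diff. */
int sketch_version_int(int major, int minor, int teeny)
{
  return major * 10000 + minor * 100 + teeny;
}

void sketch_version_demo(void)
{
  /* Matches the value present in the header for 6.8.1. */
  assert(sketch_version_int(6, 8, 1) == 60801);
  /* Under the same assumed rule, 6.8.2 would map to a different value. */
  assert(sketch_version_int(6, 8, 2) == 60802);
}
```

If the assumed rule holds, callers that compare against `ONIGURUMA_VERSION_INT` cannot distinguish 6.8.1 from 6.8.2 with this header; checking `ONIGURUMA_VERSION_TEENY` directly would avoid relying on it.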
@ -115,7 +115,7 @@ typedef struct {
OnigCodePoint one_or_more_time;
OnigCodePoint anychar_anytime;
} OnigMetaCharTableType;
typedef int (*OnigApplyAllCaseFoldFunc)(OnigCodePoint from, OnigCodePoint* to, int to_len, void* arg);
typedef struct OnigEncodingTypeST {
@ -344,7 +344,7 @@ int onigenc_str_bytelen_null P_((OnigEncoding enc, const OnigUChar* p));
ONIG_EXTERN
int onigenc_is_valid_mbc_string P_((OnigEncoding enc, const OnigUChar* s, const OnigUChar* end));
ONIG_EXTERN
UChar* onigenc_strdup P_((OnigEncoding enc, const UChar* s, const UChar* end));
OnigUChar* onigenc_strdup P_((OnigEncoding enc, const OnigUChar* s, const OnigUChar* end));
/* PART: regular expression */
@ -549,7 +549,7 @@ ONIG_EXTERN OnigSyntaxType* OnigDefaultSyntax;
#define ONIGERR_SPECIFIED_ENCODING_CANT_CONVERT_TO_WIDE_CHAR -22
#define ONIGERR_FAIL_TO_INITIALIZE -23
/* general error */
#define ONIGERR_INVALID_ARGUMENT -30
#define ONIGERR_INVALID_ARGUMENT -30
/* syntax error */
#define ONIGERR_END_PATTERN_AT_LEFT_BRACE -100
#define ONIGERR_END_PATTERN_AT_LEFT_BRACKET -101
@ -894,6 +894,8 @@ ONIG_EXTERN
int onig_set_progress_callout_of_match_param P_((OnigMatchParam* param, OnigCalloutFunc f));
ONIG_EXTERN
int onig_set_retraction_callout_of_match_param P_((OnigMatchParam* param, OnigCalloutFunc f));
ONIG_EXTERN
int onig_set_callout_user_data_of_match_param P_((OnigMatchParam* param, void* user_data));
/* for callout functions */
ONIG_EXTERN
@ -905,15 +907,15 @@ OnigCalloutFunc onig_get_retraction_callout P_((void));
ONIG_EXTERN
int onig_set_retraction_callout P_((OnigCalloutFunc f));
ONIG_EXTERN
int onig_set_callout_of_name P_((OnigEncoding enc, OnigCalloutType type, OnigUChar* name, OnigUChar* name_end, int callout_in, OnigCalloutFunc callout, OnigCalloutFunc end_callout, int arg_num, unsigned int arg_types[], int optional_arg_num, OnigValue opt_defaults[])); /* name: single-byte string */
int onig_set_callout_of_name P_((OnigEncoding enc, OnigCalloutType type, OnigUChar* name, OnigUChar* name_end, int callout_in, OnigCalloutFunc callout, OnigCalloutFunc end_callout, int arg_num, unsigned int arg_types[], int optional_arg_num, OnigValue opt_defaults[]));
ONIG_EXTERN
OnigUChar* onig_get_callout_name_by_name_id P_((int id));
ONIG_EXTERN
int onig_get_callout_num_by_tag P_((OnigRegex reg, const UChar* tag, const UChar* tag_end));
int onig_get_callout_num_by_tag P_((OnigRegex reg, const OnigUChar* tag, const OnigUChar* tag_end));
ONIG_EXTERN
int onig_get_callout_data_by_tag P_((OnigRegex reg, OnigMatchParam* mp, const UChar* tag, const UChar* tag_end, int slot, OnigType* type, OnigValue* val));
int onig_get_callout_data_by_tag P_((OnigRegex reg, OnigMatchParam* mp, const OnigUChar* tag, const OnigUChar* tag_end, int slot, OnigType* type, OnigValue* val));
ONIG_EXTERN
int onig_set_callout_data_by_tag P_((OnigRegex reg, OnigMatchParam* mp, const UChar* tag, const UChar* tag_end, int slot, OnigType type, OnigValue* val));
int onig_set_callout_data_by_tag P_((OnigRegex reg, OnigMatchParam* mp, const OnigUChar* tag, const OnigUChar* tag_end, int slot, OnigType type, OnigValue* val));
/* used in callout functions */
ONIG_EXTERN


@ -2450,7 +2450,6 @@ is_exclusive(Node* x, Node* y, regex_t* reg)
if (NODE_STRING_LEN(x) == 0)
break;
//c = *(xs->s);
switch (ytype) {
case NODE_CTYPE:
switch (CTYPE_(y)->ctype) {
@ -2758,7 +2757,7 @@ tree_min_len(Node* node, ScanEnv* env)
len = en->min_len;
else {
if (NODE_IS_MARK1(node))
len = 0; // recursive
len = 0; /* recursive */
else {
NODE_STATUS_ADD(node, NST_MARK1);
len = tree_min_len(NODE_BODY(node), env);
@ -3763,7 +3762,7 @@ expand_case_fold_string(Node* node, regex_t* reg)
return r;
}
#ifdef USE_INSISTENT_CHECK_CAPTURES_STATUS_IN_ENDLESS_REPEAT
#ifdef USE_INSISTENT_CHECK_CAPTURES_IN_EMPTY_REPEAT
static enum QuantBodyEmpty
quantifiers_memory_node_info(Node* node)
{
@ -3847,7 +3846,7 @@ quantifiers_memory_node_info(Node* node)
return r;
}
#endif /* USE_INSISTENT_CHECK_CAPTURES_STATUS_IN_ENDLESS_REPEAT */
#endif /* USE_INSISTENT_CHECK_CAPTURES_IN_EMPTY_REPEAT */
#define IN_ALT (1<<0)
@ -4375,7 +4374,7 @@ setup_quant(Node* node, regex_t* reg, int state, ScanEnv* env)
if (IS_REPEAT_INFINITE(qn->upper) || qn->upper >= 1) {
d = tree_min_len(body, env);
if (d == 0) {
#ifdef USE_INSISTENT_CHECK_CAPTURES_STATUS_IN_ENDLESS_REPEAT
#ifdef USE_INSISTENT_CHECK_CAPTURES_IN_EMPTY_REPEAT
qn->body_empty_info = quantifiers_memory_node_info(body);
if (qn->body_empty_info == QUANT_BODY_IS_EMPTY_REC) {
if (NODE_TYPE(body) == NODE_ENCLOSURE &&
@ -5979,7 +5978,10 @@ onig_compile(regex_t* reg, const UChar* pattern, const UChar* pattern_end,
#endif
root = 0;
if (IS_NOT_NULL(einfo)) einfo->par = (UChar* )NULL;
if (IS_NOT_NULL(einfo)) {
einfo->enc = reg->enc;
einfo->par = (UChar* )NULL;
}
#ifdef ONIG_DEBUG
print_enc_string(stderr, reg->enc, pattern, pattern_end);
@ -6124,7 +6126,6 @@ onig_compile(regex_t* reg, const UChar* pattern, const UChar* pattern_end,
err:
if (IS_NOT_NULL(scan_env.error)) {
if (IS_NOT_NULL(einfo)) {
einfo->enc = scan_env.enc;
einfo->par = scan_env.error;
einfo->par_end = scan_env.error_end;
}


@ -120,6 +120,10 @@ struct PropertyNameCtype {
#define ONIG_ENCODING_INIT_DEFAULT ONIG_ENCODING_ASCII
#define ENC_FLAG_ASCII_COMPATIBLE (1<<0)
#define ENC_FLAG_UNICODE (1<<1)
/* for encoding system implementation (internal) */
extern int onigenc_end(void);
extern int onigenc_ascii_apply_all_case_fold P_((OnigCaseFoldType flag, OnigApplyAllCaseFoldFunc f, void* arg));
@ -156,7 +160,7 @@ extern int onigenc_mb4_code_to_mbc P_((OnigEncoding enc, OnigCodePoint code, UCh
extern int onigenc_mb4_is_code_ctype P_((OnigEncoding enc, OnigCodePoint code, unsigned int ctype));
extern struct PropertyNameCtype* euc_jp_lookup_property_name P_((register const char *str, register unsigned int len));
extern struct PropertyNameCtype* sjis_lookup_property_name P_((register const char *str, register unsigned int len));
//extern const struct PropertyNameCtype* unicode_lookup_property_name P_((register const char *str, register unsigned int len));
/* extern const struct PropertyNameCtype* unicode_lookup_property_name P_((register const char *str, register unsigned int len)); */
/* in enc/unicode.c */
extern int onigenc_unicode_is_code_ctype P_((OnigCodePoint code, unsigned int ctype));
@ -250,8 +254,9 @@ extern const unsigned short OnigEncAsciiCtypeTable[];
ONIGENC_IS_ASCII_CODE_CTYPE(code, ONIGENC_CTYPE_LOWER))
#define ONIGENC_IS_UNICODE_ENCODING(enc) \
((enc)->is_code_ctype == onigenc_unicode_is_code_ctype)
(((enc)->flag & ENC_FLAG_UNICODE) != 0)
#define ONIGENC_IS_ASCII_COMPATIBLE_ENCODING(enc) ((enc)->min_enc_len == 1)
#define ONIGENC_IS_ASCII_COMPATIBLE_ENCODING(enc) \
(((enc)->flag & ENC_FLAG_ASCII_COMPATIBLE) != 0)
#endif /* REGENC_H */
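Review note: the two macros above switch from inferring encoding properties (function-pointer identity, `min_enc_len == 1`) to reading an explicit `flag` field. The sketch below illustrates the difference with a cut-down, hypothetical struct. The flag values and both flag-based macros are copied from the hunk; the `Sketch` type, the sample encodings, and their flag assignments are assumptions for illustration only.

```c
#include <assert.h>

/* Flag values copied from the hunk above. */
#define ENC_FLAG_ASCII_COMPATIBLE  (1<<0)
#define ENC_FLAG_UNICODE           (1<<1)

/* Hypothetical stand-in for OnigEncodingType: only the fields the
   old and new checks look at. */
typedef struct {
  int          min_enc_len;
  unsigned int flag;
} SketchEncoding;

/* Old test (removed line): inferred from the minimum character length. */
#define SKETCH_OLD_IS_ASCII_COMPATIBLE(enc)  ((enc)->min_enc_len == 1)

/* New tests (added lines): read the explicit flag. */
#define SKETCH_IS_ASCII_COMPATIBLE(enc) \
  (((enc)->flag & ENC_FLAG_ASCII_COMPATIBLE) != 0)
#define SKETCH_IS_UNICODE(enc) \
  (((enc)->flag & ENC_FLAG_UNICODE) != 0)

/* Sample encodings; the flag assignments are assumptions. */
const SketchEncoding sketch_latin1 = { 1, ENC_FLAG_ASCII_COMPATIBLE };
const SketchEncoding sketch_utf8   = { 1, ENC_FLAG_ASCII_COMPATIBLE | ENC_FLAG_UNICODE };
const SketchEncoding sketch_utf16  = { 2, ENC_FLAG_UNICODE };
/* A single-byte code page that does not share ASCII's code points. */
const SketchEncoding sketch_ebcdic_like = { 1, 0 };

void sketch_flag_demo(void)
{
  assert(SKETCH_IS_ASCII_COMPATIBLE(&sketch_latin1));
  assert(SKETCH_IS_ASCII_COMPATIBLE(&sketch_utf8));
  assert(SKETCH_IS_UNICODE(&sketch_utf8));
  assert(!SKETCH_IS_ASCII_COMPATIBLE(&sketch_utf16));

  /* The length-based test would accept the EBCDIC-like code page;
     the flag-based test does not. */
  assert(SKETCH_OLD_IS_ASCII_COMPATIBLE(&sketch_ebcdic_like));
  assert(!SKETCH_IS_ASCII_COMPATIBLE(&sketch_ebcdic_like));
}
```

The explicit flag costs one initializer per encoding table, which is what the repeated `ENC_FLAG_ASCII_COMPATIBLE,` hunks in the `enc/` files supply.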


@ -52,9 +52,9 @@ typedef struct {
struct OnigMatchParamStruct {
unsigned int match_stack_limit;
unsigned long retry_limit_in_match;
#ifdef USE_CALLOUT
OnigCalloutFunc progress_callout_of_contents;
OnigCalloutFunc retraction_callout_of_contents;
#ifdef USE_CALLOUT
int match_at_call_counter;
void* callout_user_data;
CalloutData* callout_data;
@ -81,15 +81,34 @@ onig_set_retry_limit_in_match_of_match_param(OnigMatchParam* param,
extern int
onig_set_progress_callout_of_match_param(OnigMatchParam* param, OnigCalloutFunc f)
{
#ifdef USE_CALLOUT
param->progress_callout_of_contents = f;
return ONIG_NORMAL;
#else
return ONIG_NO_SUPPORT_CONFIG;
#endif
}
extern int
onig_set_retraction_callout_of_match_param(OnigMatchParam* param, OnigCalloutFunc f)
{
#ifdef USE_CALLOUT
param->retraction_callout_of_contents = f;
return ONIG_NORMAL;
#else
return ONIG_NO_SUPPORT_CONFIG;
#endif
}
extern int
onig_set_callout_user_data_of_match_param(OnigMatchParam* param, void* user_data)
{
#ifdef USE_CALLOUT
param->callout_user_data = user_data;
return ONIG_NORMAL;
#else
return ONIG_NO_SUPPORT_CONFIG;
#endif
}
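Review note: all three setters above now share one shape: store the value and return `ONIG_NORMAL` when `USE_CALLOUT` is defined, otherwise return `ONIG_NO_SUPPORT_CONFIG`. A minimal standalone sketch of that shape follows. The struct, the function name, and the numeric return codes are stand-ins chosen for this sketch, not the library's definitions.

```c
#include <assert.h>
#include <stddef.h>

#define USE_CALLOUT  /* remove this line to exercise the unsupported branch */

/* Stand-in return codes; the values are assumptions for this sketch. */
#define SKETCH_NORMAL              0
#define SKETCH_NO_SUPPORT_CONFIG (-2)

/* Hypothetical cut-down match-param struct. */
typedef struct {
  unsigned int match_stack_limit;
#ifdef USE_CALLOUT
  void* callout_user_data;
#endif
} SketchMatchParam;

int sketch_set_callout_user_data(SketchMatchParam* param, void* user_data)
{
#ifdef USE_CALLOUT
  param->callout_user_data = user_data;
  return SKETCH_NORMAL;
#else
  (void)param;
  (void)user_data;
  return SKETCH_NO_SUPPORT_CONFIG;
#endif
}

void sketch_user_data_demo(void)
{
  SketchMatchParam mp = { 0 };
  int counter = 0;
  int r = sketch_set_callout_user_data(&mp, &counter);
#ifdef USE_CALLOUT
  assert(r == SKETCH_NORMAL);
  assert(mp.callout_user_data == (void*)&counter);
#else
  assert(r == SKETCH_NO_SUPPORT_CONFIG);
#endif
}
```

Returning an error code in the disabled configuration lets callers detect the missing feature at run time instead of silently writing to fields that the struct no longer has.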
@ -114,19 +133,21 @@ typedef struct {
#ifdef ONIG_DEBUG
/* arguments type */
#define ARG_SPECIAL -1
#define ARG_NON 0
#define ARG_RELADDR 1
#define ARG_ABSADDR 2
#define ARG_LENGTH 3
#define ARG_MEMNUM 4
#define ARG_OPTION 5
#define ARG_MODE 6
typedef enum {
ARG_SPECIAL = -1,
ARG_NON = 0,
ARG_RELADDR = 1,
ARG_ABSADDR = 2,
ARG_LENGTH = 3,
ARG_MEMNUM = 4,
ARG_OPTION = 5,
ARG_MODE = 6
} OpArgType;
typedef struct {
short int opcode;
char* name;
short int arg_type;
OpArgType arg_type;
} OpInfoType;
static OpInfoType OpInfo[] = {
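Review note: replacing the `ARG_*` macros with the `OpArgType` enum gives the argument type a name the compiler can check. With an enum operand, GCC and Clang can warn about unhandled enumerators in a `switch`, which is presumably why a later hunk adds `default: break;`. The sketch below copies the enum from the hunk; the naming function is hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Copied from the hunk above. */
typedef enum {
  ARG_SPECIAL = -1,
  ARG_NON     =  0,
  ARG_RELADDR =  1,
  ARG_ABSADDR =  2,
  ARG_LENGTH  =  3,
  ARG_MEMNUM  =  4,
  ARG_OPTION  =  5,
  ARG_MODE    =  6
} OpArgType;

/* Hypothetical helper: a printable name for operand-carrying types. */
const char* sketch_arg_type_name(OpArgType t)
{
  switch (t) {
  case ARG_RELADDR: return "reladdr";
  case ARG_ABSADDR: return "absaddr";
  case ARG_LENGTH:  return "length";
  case ARG_MEMNUM:  return "memnum";
  case ARG_OPTION:  return "option";
  case ARG_MODE:    return "mode";
  default:
    break;  /* ARG_SPECIAL and ARG_NON carry no printable operand */
  }
  return "";
}

void sketch_arg_type_demo(void)
{
  assert(strcmp(sketch_arg_type_name(ARG_MODE), "mode") == 0);
  assert(strcmp(sketch_arg_type_name(ARG_NON), "") == 0);
}
```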
@ -295,11 +316,12 @@ extern void
onig_print_compiled_byte_code(FILE* f, UChar* bp, UChar** nextp, UChar* start,
OnigEncoding enc)
{
int i, n, arg_type;
int i, n;
OpArgType arg_type;
RelAddrType addr;
LengthType len;
MemNumType mem;
OnigCodePoint code;
LengthType len;
MemNumType mem;
OnigCodePoint code;
OnigOptionType option;
ModeType mode;
UChar *q;
@ -336,12 +358,13 @@ onig_print_compiled_byte_code(FILE* f, UChar* bp, UChar** nextp, UChar* start,
fprintf(f, ":%d", option);
}
break;
case ARG_MODE:
mode = *((ModeType* )bp);
bp += SIZE_MODE;
fprintf(f, ":%d", mode);
break;
default:
break;
}
}
else {
@ -546,7 +569,7 @@ onig_print_compiled_byte_code(FILE* f, UChar* bp, UChar** nextp, UChar* start,
#ifdef USE_CALLOUT
case OP_CALLOUT_CONTENTS:
{
GET_MEMNUM_INC(mem, bp); // number
GET_MEMNUM_INC(mem, bp); /* number */
fprintf(f, ":%d", mem);
}
break;
@ -555,8 +578,8 @@ onig_print_compiled_byte_code(FILE* f, UChar* bp, UChar** nextp, UChar* start,
{
int id;
GET_MEMNUM_INC(id, bp); // id
GET_MEMNUM_INC(mem, bp); // number
GET_MEMNUM_INC(id, bp); /* id */
GET_MEMNUM_INC(mem, bp); /* number */
fprintf(f, ":%d:%d", id, mem);
}
@ -959,8 +982,8 @@ typedef struct _StackType {
struct {
UChar *pstr; /* start/end position */
/* Following information is set, if this stack type is MEM-START */
StackIndex start; /* prev. info (for backtrack "(...)*" ) */
StackIndex end; /* prev. info (for backtrack "(...)*" ) */
StackIndex prev_start; /* prev. info (for backtrack "(...)*" ) */
StackIndex prev_end; /* prev. info (for backtrack "(...)*" ) */
} mem;
struct {
UChar *pstr; /* start position */
@ -996,7 +1019,7 @@ struct OnigCalloutArgsStruct {
const OnigUChar* string_end;
const OnigUChar* start;
const OnigUChar* right_range;
const OnigUChar* current; // current matching position
const OnigUChar* current; /* current matching position */
unsigned long retry_in_match_counter;
/* invisible to users */
@ -1127,7 +1150,7 @@ onig_get_retry_limit_in_match(void)
#ifdef USE_RETRY_LIMIT_IN_MATCH
return RetryLimitInMatch;
#else
//return ONIG_NO_SUPPORT_CONFIG;
/* return ONIG_NO_SUPPORT_CONFIG; */
return 0;
#endif
}
@ -1520,11 +1543,11 @@ stack_double(int is_alloca, char** arg_alloc_base,
STACK_ENSURE(1);\
stk->type = STK_MEM_START;\
stk->zid = (mnum);\
stk->u.mem.pstr = (s);\
stk->u.mem.start = mem_start_stk[mnum];\
stk->u.mem.end = mem_end_stk[mnum];\
mem_start_stk[mnum] = GET_STACK_INDEX(stk);\
mem_end_stk[mnum] = INVALID_STACK_INDEX;\
stk->u.mem.pstr = (s);\
stk->u.mem.prev_start = mem_start_stk[mnum];\
stk->u.mem.prev_end = mem_end_stk[mnum];\
mem_start_stk[mnum] = GET_STACK_INDEX(stk);\
mem_end_stk[mnum] = INVALID_STACK_INDEX;\
STACK_INC;\
} while(0)
@ -1532,9 +1555,9 @@ stack_double(int is_alloca, char** arg_alloc_base,
STACK_ENSURE(1);\
stk->type = STK_MEM_END;\
stk->zid = (mnum);\
stk->u.mem.pstr = (s);\
stk->u.mem.start = mem_start_stk[mnum];\
stk->u.mem.end = mem_end_stk[mnum];\
stk->u.mem.pstr = (s);\
stk->u.mem.prev_start = mem_start_stk[mnum];\
stk->u.mem.prev_end = mem_end_stk[mnum];\
mem_end_stk[mnum] = GET_STACK_INDEX(stk);\
STACK_INC;\
} while(0)
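Review note: the rename from `start`/`end` to `prev_start`/`prev_end` makes explicit that a MEM stack entry stores the capture indices that were current before the push, so that a pop can restore them. The sketch below models only that save/restore step. It uses plain array indices instead of the real `StackIndex` and pointer values, and every `Sketch` name is hypothetical.

```c
#include <assert.h>

#define SKETCH_INVALID_INDEX (-1)
#define SKETCH_STACK_MAX     16
#define SKETCH_NUM_MEM       4

typedef struct {
  int zid;         /* capture group number */
  int pos;         /* subject position (stand-in for pstr) */
  int prev_start;  /* mem_start[zid] as it was before this push */
  int prev_end;    /* mem_end[zid] as it was before this push */
} SketchMemEntry;

typedef struct {
  SketchMemEntry stack[SKETCH_STACK_MAX];
  int top;
  int mem_start[SKETCH_NUM_MEM];
  int mem_end[SKETCH_NUM_MEM];
} SketchState;

void sketch_init(SketchState* st)
{
  int i;
  st->top = 0;
  for (i = 0; i < SKETCH_NUM_MEM; i++) {
    st->mem_start[i] = SKETCH_INVALID_INDEX;
    st->mem_end[i]   = SKETCH_INVALID_INDEX;
  }
}

/* Mirrors the shape of STACK_PUSH_MEM_START: save, then overwrite. */
void sketch_push_mem_start(SketchState* st, int zid, int pos)
{
  SketchMemEntry* e;
  assert(st->top < SKETCH_STACK_MAX);
  e = &st->stack[st->top];
  e->zid = zid;
  e->pos = pos;
  e->prev_start = st->mem_start[zid];
  e->prev_end   = st->mem_end[zid];
  st->mem_start[zid] = st->top;
  st->mem_end[zid]   = SKETCH_INVALID_INDEX;
  st->top++;
}

/* Mirrors the pop path: restore what the entry saved. */
void sketch_pop(SketchState* st)
{
  SketchMemEntry* e;
  assert(st->top > 0);
  st->top--;
  e = &st->stack[st->top];
  st->mem_start[e->zid] = e->prev_start;
  st->mem_end[e->zid]   = e->prev_end;
}

void sketch_mem_demo(void)
{
  SketchState st;
  sketch_init(&st);
  sketch_push_mem_start(&st, 1, 0);   /* first iteration of "(...)*" */
  assert(st.mem_start[1] == 0);
  st.mem_end[1] = 3;                  /* pretend the group closed */
  sketch_push_mem_start(&st, 1, 3);   /* second iteration */
  assert(st.mem_start[1] == 1);
  assert(st.mem_end[1] == SKETCH_INVALID_INDEX);
  sketch_pop(&st);                    /* backtrack the second iteration */
  assert(st.mem_start[1] == 0);
  assert(st.mem_end[1] == 3);
}
```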
@ -1758,8 +1781,8 @@ stack_double(int is_alloca, char** arg_alloc_base,
STACK_BASE_CHECK(stk, "STACK_POP 2"); \
if ((stk->type & STK_MASK_POP_USED) != 0) break;\
else if (stk->type == STK_MEM_START) {\
mem_start_stk[stk->zid] = stk->u.mem.start;\
mem_end_stk[stk->zid] = stk->u.mem.end;\
mem_start_stk[stk->zid] = stk->u.mem.prev_start;\
mem_end_stk[stk->zid] = stk->u.mem.prev_end;\
}\
}\
break;\
@ -1770,15 +1793,15 @@ stack_double(int is_alloca, char** arg_alloc_base,
if ((stk->type & STK_MASK_POP_USED) != 0) break;\
else if ((stk->type & STK_MASK_POP_HANDLED) != 0) {\
if (stk->type == STK_MEM_START) {\
mem_start_stk[stk->zid] = stk->u.mem.start;\
mem_end_stk[stk->zid] = stk->u.mem.end;\
mem_start_stk[stk->zid] = stk->u.mem.prev_start;\
mem_end_stk[stk->zid] = stk->u.mem.prev_end;\
}\
else if (stk->type == STK_REPEAT_INC) {\
STACK_AT(stk->u.repeat_inc.si)->u.repeat.count--;\
}\
else if (stk->type == STK_MEM_END) {\
mem_start_stk[stk->zid] = stk->u.mem.start;\
mem_end_stk[stk->zid] = stk->u.mem.end;\
mem_start_stk[stk->zid] = stk->u.mem.prev_start;\
mem_end_stk[stk->zid] = stk->u.mem.prev_end;\
}\
POP_CALLOUT_CASE\
}\
@ -1795,15 +1818,15 @@ stack_double(int is_alloca, char** arg_alloc_base,
if (stk->type == (til_type)) break;\
else {\
if (stk->type == STK_MEM_START) {\
mem_start_stk[stk->zid] = stk->u.mem.start;\
mem_end_stk[stk->zid] = stk->u.mem.end;\
mem_start_stk[stk->zid] = stk->u.mem.prev_start;\
mem_end_stk[stk->zid] = stk->u.mem.prev_end;\
}\
else if (stk->type == STK_REPEAT_INC) {\
STACK_AT(stk->u.repeat_inc.si)->u.repeat.count--;\
}\
else if (stk->type == STK_MEM_END) {\
mem_start_stk[stk->zid] = stk->u.mem.start;\
mem_end_stk[stk->zid] = stk->u.mem.end;\
mem_start_stk[stk->zid] = stk->u.mem.prev_start;\
mem_end_stk[stk->zid] = stk->u.mem.prev_end;\
}\
/* Don't call callout here because negation of total success by (?!..) (?<!..) */\
}\
@ -1849,12 +1872,24 @@ stack_double(int is_alloca, char** arg_alloc_base,
}\
} while(0)
#ifdef USE_INSISTENT_CHECK_CAPTURES_STATUS_IN_ENDLESS_REPEAT
#define STACK_EMPTY_CHECK_MEMST(isnull,sid,s,reg) do {\
#define STACK_MEM_START_GET_PREV_END_ADDR(k /* STK_MEM_START*/, reg, addr) do {\
if (k->u.mem.prev_end == INVALID_STACK_INDEX) {\
(addr) = 0;\
}\
else {\
if (MEM_STATUS_AT((reg)->bt_mem_end, k->zid))\
(addr) = STACK_AT(k->u.mem.prev_end)->u.mem.pstr;\
else\
(addr) = (UChar* )k->u.mem.prev_end;\
}\
} while (0)
#ifdef USE_INSISTENT_CHECK_CAPTURES_IN_EMPTY_REPEAT
#define STACK_EMPTY_CHECK_MEM(isnull,sid,s,reg) do {\
StackType* k = stk;\
while (1) {\
k--;\
STACK_BASE_CHECK(k, "STACK_EMPTY_CHECK_MEMST"); \
STACK_BASE_CHECK(k, "STACK_EMPTY_CHECK_MEM"); \
if (k->type == STK_EMPTY_CHECK_START) {\
if (k->zid == (sid)) {\
if (k->u.empty_check.pstr != (s)) {\
@ -1866,15 +1901,11 @@ stack_double(int is_alloca, char** arg_alloc_base,
(isnull) = 1;\
while (k < stk) {\
if (k->type == STK_MEM_START) {\
if (k->u.mem.end == INVALID_STACK_INDEX) {\
STACK_MEM_START_GET_PREV_END_ADDR(k, reg, endp);\
if (endp == 0) {\
(isnull) = 0; break;\
}\
if (MEM_STATUS_AT(reg->bt_mem_end, k->zid))\
endp = STACK_AT(k->u.mem.end)->u.mem.pstr;\
else\
endp = (UChar* )k->u.mem.end;\
/*fprintf(stderr, "num: %d, pstr: %p, endp: %p\n", k->u.mem.num, STACK_AT(k->u.mem.start)->u.mem.pstr, endp);*/ \
if (STACK_AT(k->u.mem.start)->u.mem.pstr != endp) {\
else if (STACK_AT(k->u.mem.prev_start)->u.mem.pstr != endp) {\
(isnull) = 0; break;\
}\
else if (endp != s) {\
@ -1890,12 +1921,12 @@ stack_double(int is_alloca, char** arg_alloc_base,
}\
} while(0)
#define STACK_EMPTY_CHECK_MEMST_REC(isnull,sid,s,reg) do {\
#define STACK_EMPTY_CHECK_MEM_REC(isnull,sid,s,reg) do {\
int level = 0;\
StackType* k = stk;\
while (1) {\
k--;\
STACK_BASE_CHECK(k, "STACK_EMPTY_CHECK_MEMST_REC"); \
STACK_BASE_CHECK(k, "STACK_EMPTY_CHECK_MEM_REC");\
if (k->type == STK_EMPTY_CHECK_START) {\
if (k->zid == (sid)) {\
if (level == 0) {\
@ -1908,20 +1939,25 @@ stack_double(int is_alloca, char** arg_alloc_base,
(isnull) = 1;\
while (k < stk) {\
if (k->type == STK_MEM_START) {\
if (k->u.mem.end == INVALID_STACK_INDEX) {\
(isnull) = 0; break;\
}\
if (MEM_STATUS_AT(reg->bt_mem_end, k->zid))\
endp = STACK_AT(k->u.mem.end)->u.mem.pstr;\
else\
endp = (UChar* )k->u.mem.end;\
if (STACK_AT(k->u.mem.start)->u.mem.pstr != endp) {\
(isnull) = 0; break;\
}\
else if (endp != s) {\
(isnull) = -1; /* empty, but position changed */ \
if (level == 0) {\
STACK_MEM_START_GET_PREV_END_ADDR(k, reg, endp);\
if (endp == 0) {\
(isnull) = 0; break;\
}\
else if (STACK_AT(k->u.mem.prev_start)->u.mem.pstr != endp) { \
(isnull) = 0; break;\
}\
else if (endp != s) {\
(isnull) = -1; /* empty, but position changed */\
}\
}\
}\
else if (k->type == STK_EMPTY_CHECK_START) {\
if (k->zid == (sid)) level++;\
}\
else if (k->type == STK_EMPTY_CHECK_END) {\
if (k->zid == (sid)) level--;\
}\
k++;\
}\
break;\
@ -1958,7 +1994,7 @@ stack_double(int is_alloca, char** arg_alloc_base,
}\
}\
} while(0)
#endif /* USE_INSISTENT_CHECK_CAPTURES_STATUS_IN_ENDLESS_REPEAT */
#endif /* USE_INSISTENT_CHECK_CAPTURES_IN_EMPTY_REPEAT */
#define STACK_GET_REPEAT(sid, k) do {\
int level = 0;\
@ -2348,7 +2384,6 @@ match_at(regex_t* reg, const UChar* str, const UChar* end,
retry_limit_in_match = msa->retry_limit_in_match;
#endif
//n = reg->num_repeat + reg->num_mem * 2;
pop_level = reg->stack_pop_level;
num_mem = reg->num_mem;
STACK_INIT(INIT_MATCH_STACK_SIZE);
@ -2996,7 +3031,7 @@ match_at(regex_t* reg, const UChar* str, const UChar* end,
case OP_WORD_BOUNDARY: SOP_IN(OP_WORD_BOUNDARY);
{
ModeType mode;
GET_MODE_INC(mode, p); // ascii_mode
GET_MODE_INC(mode, p); /* ascii_mode */
if (ON_STR_BEGIN(s)) {
DATA_ENSURE(1);
@ -3020,7 +3055,7 @@ match_at(regex_t* reg, const UChar* str, const UChar* end,
case OP_NO_WORD_BOUNDARY: SOP_IN(OP_NO_WORD_BOUNDARY);
{
ModeType mode;
GET_MODE_INC(mode, p); // ascii_mode
GET_MODE_INC(mode, p); /* ascii_mode */
if (ON_STR_BEGIN(s)) {
if (DATA_ENSURE_CHECK1 && IS_MBC_WORD_ASCII_MODE(encode, s, end, mode))
@ -3044,7 +3079,7 @@ match_at(regex_t* reg, const UChar* str, const UChar* end,
case OP_WORD_BEGIN: SOP_IN(OP_WORD_BEGIN);
{
ModeType mode;
GET_MODE_INC(mode, p); // ascii_mode
GET_MODE_INC(mode, p); /* ascii_mode */
if (DATA_ENSURE_CHECK1 && IS_MBC_WORD_ASCII_MODE(encode, s, end, mode)) {
if (ON_STR_BEGIN(s) || !IS_MBC_WORD_ASCII_MODE(encode, sprev, end, mode)) {
@ -3059,7 +3094,7 @@ match_at(regex_t* reg, const UChar* str, const UChar* end,
case OP_WORD_END: SOP_IN(OP_WORD_END);
{
ModeType mode;
GET_MODE_INC(mode, p); // ascii_mode
GET_MODE_INC(mode, p); /* ascii_mode */
if (!ON_STR_BEGIN(s) && IS_MBC_WORD_ASCII_MODE(encode, sprev, end, mode)) {
if (ON_STR_END(s) || ! IS_MBC_WORD_ASCII_MODE(encode, s, end, mode)) {
@ -3395,9 +3430,10 @@ match_at(regex_t* reg, const UChar* str, const UChar* end,
sprev = s;
if (backref_match_at_nested_level(reg, stk, stk_base, ic
, case_fold_flag, (int )level, (int )tlen, p, &s, end)) {
while (sprev + (len = enclen(encode, sprev)) < s)
sprev += len;
if (sprev < end) {
while (sprev + (len = enclen(encode, sprev)) < s)
sprev += len;
}
p += (SIZE_MEMNUM * tlen);
}
else
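Review note: the added `if (sprev < end)` guard presumably keeps `enclen()` from being evaluated at `end` when the nested backreference matches at the very end of the subject. The sketch below reproduces the guarded loop with a UTF-8 length function standing in for `enclen(encode, ...)`; both function names are hypothetical.

```c
#include <assert.h>

typedef unsigned char UChar;

/* Hypothetical stand-in for enclen(): UTF-8 length from the lead byte. */
int sketch_enclen(const UChar* p)
{
  if (*p < 0x80) return 1;
  if ((*p & 0xe0) == 0xc0) return 2;
  if ((*p & 0xf0) == 0xe0) return 3;
  if ((*p & 0xf8) == 0xf0) return 4;
  return 1;  /* invalid lead byte: advance one byte */
}

/* Advance sprev to the head of the last character that starts before s.
   The guard ensures *sprev is never read when sprev == end. */
const UChar* sketch_prev_head(const UChar* sprev, const UChar* s,
                              const UChar* end)
{
  int len;
  if (sprev < end) {
    while (sprev + (len = sketch_enclen(sprev)) < s)
      sprev += len;
  }
  return sprev;
}

void sketch_prev_head_demo(void)
{
  /* "a", U+00E9 (two bytes), "b" */
  const UChar buf[] = { 0x61, 0xc3, 0xa9, 0x62 };
  const UChar* end = buf + 4;

  /* Match consumed everything: sprev ends on the head of "b". */
  assert(sketch_prev_head(buf, end, end) == buf + 3);
  /* Empty match at end of subject: returned unchanged, nothing is read. */
  assert(sketch_prev_head(end, end, end) == end);
}
```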
@ -3504,16 +3540,16 @@ match_at(regex_t* reg, const UChar* str, const UChar* end,
continue;
break;
#ifdef USE_INSISTENT_CHECK_CAPTURES_STATUS_IN_ENDLESS_REPEAT
#ifdef USE_INSISTENT_CHECK_CAPTURES_IN_EMPTY_REPEAT
case OP_EMPTY_CHECK_END_MEMST: SOP_IN(OP_EMPTY_CHECK_END_MEMST);
{
int is_empty;
GET_MEMNUM_INC(mem, p); /* mem: null check id */
STACK_EMPTY_CHECK_MEMST(is_empty, mem, s, reg);
STACK_EMPTY_CHECK_MEM(is_empty, mem, s, reg);
if (is_empty) {
#ifdef ONIG_DEBUG_MATCH
fprintf(stderr, "EMPTY_CHECK_END_MEMST: skip id:%d, s:%p\n", (int)mem, s);
fprintf(stderr, "EMPTY_CHECK_END_MEM: skip id:%d, s:%p\n", (int)mem, s);
#endif
if (is_empty == -1) goto fail;
goto empty_check_found;
@ -3531,14 +3567,14 @@ match_at(regex_t* reg, const UChar* str, const UChar* end,
int is_empty;
GET_MEMNUM_INC(mem, p); /* mem: null check id */
#ifdef USE_INSISTENT_CHECK_CAPTURES_STATUS_IN_ENDLESS_REPEAT
STACK_EMPTY_CHECK_MEMST_REC(is_empty, mem, s, reg);
#ifdef USE_INSISTENT_CHECK_CAPTURES_IN_EMPTY_REPEAT
STACK_EMPTY_CHECK_MEM_REC(is_empty, mem, s, reg);
#else
STACK_EMPTY_CHECK_REC(is_empty, mem, s);
#endif
if (is_empty) {
#ifdef ONIG_DEBUG_MATCH
fprintf(stderr, "EMPTY_CHECK_END_MEMST_PUSH: skip id:%d, s:%p\n",
fprintf(stderr, "EMPTY_CHECK_END_MEM_PUSH: skip id:%d, s:%p\n",
(int )mem, s);
#endif
if (is_empty == -1) goto fail;
@ -3577,8 +3613,8 @@ match_at(regex_t* reg, const UChar* str, const UChar* end,
case OP_POP_OUT: SOP_IN(OP_POP_OUT);
STACK_POP_ONE;
// for stop backtrack
//CHECK_RETRY_LIMIT_IN_MATCH;
/* for stop backtrack */
/* CHECK_RETRY_LIMIT_IN_MATCH; */
SOP_OUT;
continue;
break;
@ -5137,7 +5173,7 @@ onig_get_args_num_by_callout_args(OnigCalloutArgs* args)
num = args->num;
e = onig_reg_callout_list_at(args->regex, num);
if (IS_NULL(e)) return 0;
if (IS_NULL(e)) return ONIGERR_INVALID_ARGUMENT;
if (e->of == ONIG_CALLOUT_OF_NAME) {
return e->u.arg.num;
}
@ -5153,7 +5189,7 @@ onig_get_passed_args_num_by_callout_args(OnigCalloutArgs* args)
num = args->num;
e = onig_reg_callout_list_at(args->regex, num);
if (IS_NULL(e)) return 0;
if (IS_NULL(e)) return ONIGERR_INVALID_ARGUMENT;
if (e->of == ONIG_CALLOUT_OF_NAME) {
return e->u.arg.passed_num;
}
@ -5170,7 +5206,7 @@ onig_get_arg_by_callout_args(OnigCalloutArgs* args, int index,
num = args->num;
e = onig_reg_callout_list_at(args->regex, num);
if (IS_NULL(e)) return 0;
if (IS_NULL(e)) return ONIGERR_INVALID_ARGUMENT;
if (e->of == ONIG_CALLOUT_OF_NAME) {
if (IS_NOT_NULL(type)) *type = e->u.arg.types[index];
if (IS_NOT_NULL(val)) *val = e->u.arg.vals[index];
@ -5393,6 +5429,8 @@ onig_builtin_max(OnigCalloutArgs* args, void* user_data ARG_UNUSED)
{
int r;
int slot;
long max_val;
OnigCodePoint count_type;
OnigType type;
OnigValue val;
OnigValue aval;
@ -5411,13 +5449,38 @@ onig_builtin_max(OnigCalloutArgs* args, void* user_data ARG_UNUSED)
r = onig_get_arg_by_callout_args(args, 0, &type, &aval);
if (r != ONIG_NORMAL) return r;
if (type == ONIG_TYPE_TAG) {
r = onig_get_callout_data_by_callout_args(args, aval.tag, 0, &type, &aval);
if (r < ONIG_NORMAL) return r;
else if (r > ONIG_NORMAL)
max_val = 0L;
else
max_val = aval.l;
}
else { /* LONG */
max_val = aval.l;
}
r = onig_get_arg_by_callout_args(args, 1, &type, &aval);
if (r != ONIG_NORMAL) return r;
count_type = aval.c;
if (count_type != '>' && count_type != 'X' && count_type != '<')
return ONIGERR_INVALID_CALLOUT_ARG;
if (args->in == ONIG_CALLOUT_IN_RETRACTION) {
val.l--;
if (count_type == '<') {
if (val.l >= max_val) return ONIG_CALLOUT_FAIL;
val.l++;
}
else if (count_type == 'X')
val.l--;
}
else {
if (val.l >= aval.l) return ONIG_CALLOUT_FAIL;
val.l++;
if (count_type != '<') {
if (val.l >= max_val) return ONIG_CALLOUT_FAIL;
val.l++;
}
}
r = onig_set_callout_data_by_callout_args_self(args, slot, ONIG_TYPE_LONG, &val);

@ -59,7 +59,7 @@
#define USE_CALL
#define USE_CALLOUT
#define USE_BACKREF_WITH_LEVEL /* \k<name+n>, \k<name-n> */
#define USE_INSISTENT_CHECK_CAPTURES_STATUS_IN_ENDLESS_REPEAT /* /(?:()|())*\2/ */
#define USE_INSISTENT_CHECK_CAPTURES_IN_EMPTY_REPEAT /* /(?:()|())*\2/ */
#define USE_NEWLINE_AT_END_OF_STRING_HAS_EMPTY_LINE /* /\n$/ =~ "\n" */
#define USE_WARNING_REDUNDANT_NESTED_REPEAT_OPERATOR
@ -710,7 +710,6 @@ typedef int AbsAddrType;
typedef int LengthType;
typedef int RepeatNumType;
typedef int MemNumType;
typedef short int StateCheckNumType;
typedef void* PointerType;
typedef int SaveType;
typedef int UpdateVarType;

@ -525,7 +525,7 @@ onig_st_insert_strend(hash_table_type* table, const UChar* str_key,
typedef struct {
OnigEncoding enc;
int type; // callout type: single or not
int type; /* callout type: single or not */
UChar* s;
UChar* end;
} st_callout_name_key;
@ -1583,7 +1583,7 @@ onig_set_callout_of_name(OnigEncoding enc, OnigCalloutType callout_type,
}
}
r = id; // return id
r = id;
return r;
}
@ -1637,24 +1637,36 @@ onig_get_callout_tag_end(regex_t* reg, int callout_num)
extern OnigCalloutType
onig_get_callout_type_by_name_id(int name_id)
{
if (name_id < 0 || name_id >= GlobalCalloutNameList->n)
return 0;
return GlobalCalloutNameList->v[name_id].type;
}
extern OnigCalloutFunc
onig_get_callout_start_func_by_name_id(int name_id)
{
if (name_id < 0 || name_id >= GlobalCalloutNameList->n)
return 0;
return GlobalCalloutNameList->v[name_id].start_func;
}
extern OnigCalloutFunc
onig_get_callout_end_func_by_name_id(int name_id)
{
if (name_id < 0 || name_id >= GlobalCalloutNameList->n)
return 0;
return GlobalCalloutNameList->v[name_id].end_func;
}
extern int
onig_get_callout_in_by_name_id(int name_id)
{
if (name_id < 0 || name_id >= GlobalCalloutNameList->n)
return 0;
return GlobalCalloutNameList->v[name_id].in;
}
@ -1685,6 +1697,9 @@ get_callout_opt_default_by_name_id(int name_id, int index)
extern UChar*
onig_get_callout_name_by_name_id(int name_id)
{
if (name_id < 0 || name_id >= GlobalCalloutNameList->n)
return 0;
return GlobalCalloutNameList->v[name_id].name;
}
@ -2689,7 +2704,7 @@ make_absent_engine(Node** node, int pre_save_right_id, Node* absent,
for (i = 0; i < 4; i++) ns[i] = NULL_NODE;
ns[1] = absent;
ns[3] = step_one; // for err
ns[3] = step_one; /* for err */
r = node_new_save_gimmick(&ns[0], SAVE_S, env);
if (r != 0) goto err;
@ -5341,8 +5356,11 @@ fetch_token(OnigToken* tok, UChar** src, UChar* end, ScanEnv* env)
if (num_type != IS_NOT_NUM) {
if (num_type == IS_REL_NUM) {
gnum = backref_rel_to_abs(gnum, env);
if (gnum < 0)
if (gnum < 0) {
onig_scan_env_set_error_string(env, ONIGERR_UNDEFINED_NAME_REFERENCE,
prev, name_end);
return ONIGERR_UNDEFINED_GROUP_REFERENCE;
}
}
tok->u.call.by_number = 1;
tok->u.call.gnum = gnum;
@ -5563,8 +5581,11 @@ fetch_token(OnigToken* tok, UChar** src, UChar* end, ScanEnv* env)
else {
if (num_type == IS_REL_NUM) {
gnum = backref_rel_to_abs(gnum, env);
if (gnum < 0)
if (gnum < 0) {
onig_scan_env_set_error_string(env,
ONIGERR_UNDEFINED_NAME_REFERENCE, name, name_end);
return ONIGERR_UNDEFINED_GROUP_REFERENCE;
}
}
tok->u.call.by_number = 1;
tok->u.call.gnum = gnum;
@ -6583,7 +6604,6 @@ parse_callout_of_contents(Node** np, int cterm, UChar** src, UChar* end, ScanEnv
PFETCH_S(c);
}
else if (c == '>') { /* no needs (default) */
//in = ONIG_CALLOUT_IN_PROGRESS;
if (PEND) return ONIGERR_END_PATTERN_IN_GROUP;
PFETCH_S(c);
}
@ -6823,7 +6843,7 @@ parse_callout_of_name(Node** np, int cterm, UChar** src, UChar* end, ScanEnv* en
OnigEncoding enc = env->enc;
UChar* p = *src;
//PFETCH_READY;
/* PFETCH_READY; */
if (PEND) return ONIGERR_INVALID_CALLOUT_PATTERN;
node = 0;
@ -7053,12 +7073,12 @@ parse_enclosure(Node** np, OnigToken* tok, int term, UChar** src, UChar* end,
if (PEND) return ONIGERR_END_PATTERN_IN_GROUP;
if (PPEEK_IS('|')) { // (?~|generator|absent)
if (PPEEK_IS('|')) { /* (?~|generator|absent) */
PINC;
if (PEND) return ONIGERR_END_PATTERN_IN_GROUP;
head_bar = 1;
if (PPEEK_IS(')')) { // (?~|) : range clear
if (PPEEK_IS(')')) { /* (?~|) : range clear */
PINC;
r = make_range_clear(np, env);
if (r != 0) return r;
@ -7083,7 +7103,7 @@ parse_enclosure(Node** np, OnigToken* tok, int term, UChar** src, UChar* end,
if (NODE_TYPE(top) != NODE_ALT || IS_NULL(NODE_CDR(top))) {
expr = NULL_NODE;
is_range_cutter = 1;
//return ONIGERR_INVALID_ABSENT_GROUP_GENERATOR_PATTERN;
/* return ONIGERR_INVALID_ABSENT_GROUP_GENERATOR_PATTERN; */
}
else {
absent = NODE_CAR(top);
@ -7778,7 +7798,7 @@ parse_exp(Node** np, OnigToken* tok, int term, UChar** src, UChar* end,
len = 1;
while (1) {
if (len >= ONIGENC_MBC_MINLEN(env->enc)) {
if (len == enclen(env->enc, STR_(*np)->s)) {//should not enclen_end()
if (len == enclen(env->enc, STR_(*np)->s)) {/* should not enclen_end() */
r = fetch_token(tok, src, end, env);
NODE_STRING_CLEAR_RAW(*np);
goto string_end;

@ -337,5 +337,7 @@ OnigEncodingType OnigEncodingSJIS = {
is_allowed_reverse_match,
NULL, /* init */
NULL, /* is_initialized */
is_valid_mbc_string
is_valid_mbc_string,
ENC_FLAG_ASCII_COMPATIBLE,
0, 0
};

@ -335,7 +335,7 @@ onigenc_unicode_get_case_fold_codes_by_str(OnigEncoding enc,
n++;
}
}
code = items[0].code[0]; // for multi-code to unfold search.
code = items[0].code[0]; /* for multi-code to unfold search. */
}
else if ((flag & INTERNAL_ONIGENC_CASE_FOLD_MULTI_CHAR) != 0) {
OnigCodePoint cs[3][4];

@ -38,16 +38,19 @@ init(void)
int id;
OnigEncoding enc;
char* name;
unsigned int t_long;
unsigned int args[4];
OnigValue opts[4];
enc = ONIG_ENCODING_UTF16_BE;
t_long = ONIG_TYPE_LONG;
name = "\000F\000A\000I\000L\000\000"; BC0_P(name, fail);
name = "\000M\000I\000S\000M\000A\000T\000C\000H\000\000"; BC0_P(name, mismatch);
name = "\000M\000A\000X\000\000"; BC_B(name, max, 1, &t_long);
name = "\000M\000A\000X\000\000";
args[0] = ONIG_TYPE_TAG | ONIG_TYPE_LONG;
args[1] = ONIG_TYPE_CHAR;
opts[0].c = 'X';
BC_B_O(name, max, 2, args, 1, opts);
name = "\000E\000R\000R\000O\000R\000\000";
args[0] = ONIG_TYPE_LONG; opts[0].l = ONIG_ABORT;
@ -274,5 +277,7 @@ OnigEncodingType OnigEncodingUTF16_BE = {
onigenc_always_false_is_allowed_reverse_match,
init,
0, /* is_initialized */
is_valid_mbc_string
is_valid_mbc_string,
ENC_FLAG_UNICODE,
0, 0
};

@ -36,16 +36,19 @@ init(void)
int id;
OnigEncoding enc;
char* name;
unsigned int t_long;
unsigned int args[4];
OnigValue opts[4];
enc = ONIG_ENCODING_UTF16_LE;
t_long = ONIG_TYPE_LONG;
name = "F\000A\000I\000L\000\000\000"; BC0_P(name, fail);
name = "M\000I\000S\000M\000A\000T\000C\000H\000\000\000"; BC0_P(name, mismatch);
name = "M\000A\000X\000\000\000"; BC_B(name, max, 1, &t_long);
name = "M\000A\000X\000\000\000";
args[0] = ONIG_TYPE_TAG | ONIG_TYPE_LONG;
args[1] = ONIG_TYPE_CHAR;
opts[0].c = 'X';
BC_B_O(name, max, 2, args, 1, opts);
name = "E\000R\000R\000O\000R\000\000\000";
args[0] = ONIG_TYPE_LONG; opts[0].l = ONIG_ABORT;
@ -282,5 +285,7 @@ OnigEncodingType OnigEncodingUTF16_LE = {
onigenc_always_false_is_allowed_reverse_match,
init,
0, /* is_initialized */
is_valid_mbc_string
is_valid_mbc_string,
ENC_FLAG_UNICODE,
0, 0
};

@ -190,5 +190,7 @@ OnigEncodingType OnigEncodingUTF32_BE = {
onigenc_always_false_is_allowed_reverse_match,
NULL, /* init */
NULL, /* is_initialized */
is_valid_mbc_string
is_valid_mbc_string,
ENC_FLAG_UNICODE,
0, 0
};

@ -190,5 +190,7 @@ OnigEncodingType OnigEncodingUTF32_LE = {
onigenc_always_false_is_allowed_reverse_match,
NULL, /* init */
NULL, /* is_initialized */
is_valid_mbc_string
is_valid_mbc_string,
ENC_FLAG_UNICODE,
0, 0
};

@ -29,7 +29,7 @@
#include "regenc.h"
//#define USE_INVALID_CODE_SCHEME
/* #define USE_INVALID_CODE_SCHEME */
#ifdef USE_INVALID_CODE_SCHEME
/* virtual codepoint values for invalid encoding byte 0xfe and 0xff */
@ -296,5 +296,7 @@ OnigEncodingType OnigEncodingUTF8 = {
onigenc_always_true_is_allowed_reverse_match,
NULL, /* init */
NULL, /* is_initialized */
is_valid_mbc_string
is_valid_mbc_string,
ENC_FLAG_ASCII_COMPATIBLE|ENC_FLAG_UNICODE,
0, 0
};