Update Oniguruma to 6.9.1

Stanislav Malyshev 2019-08-24 23:53:35 -07:00
parent 5704eca6f7
commit d3f2cfe20a
56 changed files with 1725 additions and 1531 deletions

View File

@ -1,5 +1,25 @@
History
2018/12/11: Version 6.9.1
2018/10/08: use ENC_FLAG_SKIP_OFFSET_XXX values
2018/10/06: UTF-8 supports code range from 0x0000 to 0x10FFFF
(https://tools.ietf.org/html/rfc3629)
2018/10/05: speed improvement
2018/10/03: use OPTIMIZE_STR_CASE_FOLD_FAST
2018/10/01: convert CRLF line endings to LF
2018/09/27: set SIZEOF_SIZE_T for windows platforms
2018/09/22: use Sunday quick search algorithm instead of Boyer-Moor-Horspool
2018/09/20: introduce threaded code into match_at()
2018/09/17: remove HAVE_STRINGS_H
2018/09/16: remove HAVE_PROTOTYPES and HAVE_STDARG_PROTOTYPES
2018/09/14: add a command line option '-gc' for make_unicode_property_data.py.
2018/09/08: remove AC_HEADER_STDC
2018/09/06: remove AC_OUTPUT macro call
2018/09/06: remove AC_FUNC_MEMCMP, AC_HEADER_TIME, AC_C_CONST, HAVE__SETJMP and
HAVE_STRING_H
2018/09/05: remove HAVE_LIMITS_H, HAVE_FLOAT_H and HAVE_STDLIB_H
2018/09/03: Version 6.9.0
2018/08/24: add Unicode Emoji properties
@ -394,12 +414,12 @@ History
2006/11/07: [dist] remove test.rb, testconv.rb and testconvu.rb.
2006/11/07: [bug] get_case_fold_codes_by_str() should handle 'Ss' and 'sS'
combination for ess-tsett.
2006/11/07: [impl] apply_all_case_fold() doesn't need to return all
2006/11/07: [impl] apply_all_case_fold() doesn't need to return all
case character combination for multi-character folding.
(ONIGENC_CASE_FOLD_MULTI_CHAR)
2006/11/07: [bug] (thanks Byte)
add { 0xa3, 0xb3 } to CaseFoldMap[] for KOI8-R.
2006/11/06: [spec] change ONIG_OPTION_FIND_LONGEST to search all of
2006/11/06: [spec] change ONIG_OPTION_FIND_LONGEST to search all of
the string range.
add USE_FIND_LONGEST_SEARCH_ALL_OF_RANGE.
2006/11/02: [impl] re-implement expand_case_fold_string() for
@ -667,7 +687,7 @@ History
2006/05/11: [test] success in ruby 1.9.0 (2006-03-01) [i686-linux].
2006/05/11: [bug] (thanks Yuji Kaneda)
dead-lock in onig_end().
dead-lock in onig_end().
2006/05/11: [dist] update index.html.
2006/05/08: Version 4.0.3
@ -719,7 +739,7 @@ History
use GNU libtool/automake.
change configure.in and add Makefile.am, sample/Makefile.am.
add AUTHORS file.
2006/01/24: [dist] test programs return exit code -1 when test fails.
2006/01/24: [dist] test programs return exit code -1 when test fails.
2006/01/24: [bug] (thanks KIMURA Koichi)
invalid syntax definition in ONIG_SYNTAX_GREP.
ONIG_SYN_OP_BRACE_INTERVAL
@ -737,7 +757,7 @@ History
2005/11/24: [test] success in ruby 1.9.0 (2005-08-09) [i686-linux].
2005/11/21: [test] success in ruby 1.9.0 (2005-11-20) [i386-cygwin].
2005/11/21: [bug] (thanks Allan Odgaard)
utf-8 character comments in extended mode leads
utf-8 character comments in extended mode leads
invalid result.
ex. /(?x)(?<= # <any-utf-8 multibyte char>o\n~) /
fix onigenc_unicode_is_code_ctype() and
@ -819,7 +839,7 @@ History
add new character encoding ONIG_ENCODING_GB18030.
2005/06/30: [bug] invalid ctype check for multibyte encodings.
("graph", "print")
fix onigenc_mb2/4_is_code_ctype(),
fix onigenc_mb2/4_is_code_ctype(),
eucjp_is_code_ctype() and sjis_is_code_ctype().
2005/06/30: [bug] invalid conversion from code point to mbc in
onigenc_mb4_code_to_mbc().
@ -894,7 +914,7 @@ History
remove oniggnu.h from make 19.
2005/03/01: [bug] (thanks matz) [ruby-dev:25778]
uninitialized member (OptEnv.backrefed_status)
was used.
was used.
2005/02/19: Version 3.7.0
@ -945,7 +965,7 @@ History
2005/01/19: [bug] (thanks Isao Sonobe)
callback function argument name_end of onig_foreach_name()
was wrong.
name key of name table should be null terminated for
name key of name table should be null terminated for
character encoding length.
add strdup_with_null(), rename onig_strdup() to k_strdup().
use e->name_len in i_names().
@ -1217,7 +1237,7 @@ History
RelAddrType, AbsAddrType and LengthType change
from short int to int type for the very long string match.
2004/06/14: [bug] (thanks Greg A. Woods)
fix nmatch argument of regexec() is smaller than
fix nmatch argument of regexec() is smaller than
reg->num_mem + 1 case. (POSIX API)
2004/06/14: [spec] (thanks Greg A. Woods)
set pmatch to NULL if nmatch is 0 in regexec(). (POSIX API)
@ -1397,7 +1417,7 @@ History
2004/02/23: [new] support ISO-8859-10. (ONIG_ENCODING_ISO_8859_10)
2004/02/20: [bug] fix iso_8859_4_mbc_is_case_ambig().
2004/02/20: [new] support ISO-8859-9. (ONIG_ENCODING_ISO_8859_9)
2004/02/19: [bug] correct ctype tables for ISO-8859-3, ISO-8859-4,
2004/02/19: [bug] correct ctype tables for ISO-8859-3, ISO-8859-4,
ISO-8859-6, ISO-8859-7, ISO-8859-8, KOI8_R.
2004/02/18: [bug] wrong replaced name OnigSyntaxGnuOnigex.
2004/02/17: [spec] check capture status for empty infinite loop.
@ -1570,7 +1590,7 @@ History
2003/11/11: [spec] add syntax op. REG_SYN_OP_VARIABLE_META_CHARS.
2003/11/11: [spec] rename REG_SYN_OP_ESC_CAPITAL_Q_QUOTE to
REG_SYN_OP2_ESC_CAPITAL_Q_QUOTE,
REG_SYN_OP_QMARK_GROUP_EFFECT to
REG_SYN_OP_QMARK_GROUP_EFFECT to
REG_SYN_OP2_QMARK_GROUP_EFFECT.
2003/11/06: [impl] define THREAD_PASS as rb_thread_schedule() in Ruby mode.
2003/11/05: [spec] add syntax behavior REG_SYN_WARN_REDUNDANT_NESTED_REPEAT.
@ -1587,7 +1607,7 @@ History
2003/10/03: [bug] (thanks nobu) [ruby-dev:21472]
sub-anchor of optimization map info was wrong
in concat_left_node_opt_info().
ex. /^(x?y)/ = "xy" fail.
ex. /^(x?y)/ = "xy" fail.
2003/09/17: Version 1.9.4
@ -1650,7 +1670,7 @@ History
2003/09/01: [dist] update doc/RE and doc/RE.ja.
2003/08/26: [bug] (thanks Guy Decoux)
should not double free node at the case TK_CC_CC_OPEN
in parse_char_class().
in parse_char_class().
2003/08/19: Version 1.9.3
@ -1662,8 +1682,8 @@ History
REG_SYN_OP2_ATMARK_CAPTURE_HISTORY.
2003/08/18: [spec] (thanks nobu)
don't use IMPORT in oniguruma.h and onigposix.h.
2003/08/18: [impl] (thanks nobu) change error output to stdout in testconv.rb.
2003/08/18: [inst] (thanks nobu) lacked $(srcdir) in Makefile.in.
2003/08/18: [impl] (thanks nobu) change error output to stdout in testconv.rb.
2003/08/18: [inst] (thanks nobu) lacked $(srcdir) in Makefile.in.
2003/08/18: [bug] REG_MBLEN_TABLE[SJIS][0xFD-0xFF] should be 1.
2003/08/18: [bug] (thanks nobu) mbctab_sjis[0x80] should be 0.
2003/08/18: [bug] (thanks nobu)
@ -1692,7 +1712,7 @@ History
2003/07/29: [new] add regex_get_encoding(), regex_get_options() and
regex_get_syntax().
2003/07/25: [spec] (thanks akr)
change group(...) to shy-group(?:...) if named group is
change group(...) to shy-group(?:...) if named group is
used in the pattern.
add REG_SYN_CAPTURE_ONLY_NAMED_GROUP.
2003/07/24: [spec] rename REG_OPTION_CAPTURE_ONLY_NAMED_GROUP to
@ -1720,7 +1740,7 @@ History
set option status to effect memory in optimize_node_left().
2003/07/07: [impl] add opcode OP_ANYCHAR_ML, OP_ANYCHAR_ML_STAR and
OP_ANYCHAR_ML_START_PEEK_NEXT.
2003/07/07: [bug] (thanks nobu) REG_MBLEN_TABLE[SJIS][0x80] should be 1.
2003/07/07: [bug] (thanks nobu) REG_MBLEN_TABLE[SJIS][0x80] should be 1.
2003/07/07: [spec] rename REG_SYN_OP_QUOTE to REG_SYN_OP_ESC_Q_QUOTE.
2003/07/04: Version 1.9.1
@ -1783,7 +1803,7 @@ History
2003/06/12: [spec] add syntax behavior REG_SYN_WARN_FOR_CC_OP_NOT_ESCAPEED.
2003/06/12: [spec] invalid POSIX bracket should be error. ex. [[:upper :]]
2003/06/11: [new] char-class in char-class (as Java(TM)).
2003/06/11: [spec] change AND operator in char-class from &&[..] to &&.
2003/06/11: [spec] change AND operator in char-class from &&[..] to &&.
2003/06/04: [spec] {n,m}+ should not be possessive operator.
ex. a{3}+ should be (?:a{3})+
2003/06/03: [bug] should compare strings with min-length in is_not_included().
@ -1947,7 +1967,7 @@ History
2003/02/26: [impl] add -win option to testconv.rb.
2003/02/25: [spec] allow to assign same name to different group.
add OP_BACKREF_MULTI.
2003/02/24: [impl] reduce redundant repeat of empty target.
2003/02/24: [impl] reduce redundant repeat of empty target.
ex. /()*/ ==> /()?/, /()+/ ==> /()/, /(?:)+/ ==> //
2003/02/24: [impl] change condition in regex_is_allow_reverse_match().
2003/02/24: [impl] convert i(/../, ...) functions in testconv.rb.
@ -2016,7 +2036,7 @@ History
2003/02/04: [bug] typo miss in regex_region_copy().
2003/02/04: [impl] change THREAD_PASS macro. (regint.h)
2003/02/04: [dist] add API document file doc/API.
2003/02/04: [tune] if sub_anchor has ANCHOR_BEGIN_LINE then
2003/02/04: [tune] if sub_anchor has ANCHOR_BEGIN_LINE then
set REG_OPTIMIZE_EXACT_BM in set_optimize_exact_info().
2003/02/04: [spec] reimplement regex_clone() and it is obsoleted.
2003/02/04: [bug] add REGERR_OVER_THREAD_PASS_LIMIT_COUNT
@ -2136,7 +2156,7 @@ History
2002/04/01: [dist] add COPYING.
2002/03/30: [spec] warn redundant nested repeat operator
in Ruby verbose mode. ex. (?:a*)?
2002/03/30: [spec] nested repeat operator error check should be
2002/03/30: [spec] nested repeat operator error check should be
same with GNU regex. (thanks Guy Decoux)
2002/03/30: [new] add \x{hexadecimal-wide-char}. (thanks matz)
2002/03/27: [bug] MBCTYPE_XXX symbol values should be same with GNU regex.
@ -2199,7 +2219,7 @@ History
ex. /(?:abc){10}/
2002/03/06: [new] add a symbol REG_TRANSTABLE_USE_DEFAULT in regex.h.
2002/03/06: [impl] rename RegDefaultCharCode to RegDefaultCharEncoding.
2002/03/06: [bug] if pattern has NULL(\000) char, infinite loop happens
2002/03/06: [bug] if pattern has NULL(\000) char, infinite loop happens
in ScanMakeNode(). (beware of strchr(). thanks Nobu)
2002/03/06: [bug] range argument of ForwardSearchRange() is wrong.
ex. /\A.a/, /\G.a/ mismatched with "aa". (thanks Nobu)
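The 2018/09/22 entry above names the Sunday quick search algorithm as the replacement for Boyer-Moore-Horspool. The following is a minimal, generic sketch of that technique for byte strings; it is not Oniguruma's implementation, and the name `demo_sunday_search` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Generic Sunday quick search (illustration only, not Oniguruma code).
   Returns the offset of the first occurrence of pat in text, or -1. */
static long demo_sunday_search(const unsigned char* text, size_t n,
                               const unsigned char* pat, size_t m)
{
  size_t skip[256];
  size_t i, pos;

  assert(text != NULL && pat != NULL);
  if (m == 0) return 0;
  if (m > n) return -1;

  /* A byte that does not occur in the pattern lets the window jump
     completely past it: shift by m + 1. */
  for (i = 0; i < 256; i++) skip[i] = m + 1;
  /* Otherwise shift so that the last occurrence of the byte in the
     pattern lines up with the text byte just after the window. */
  for (i = 0; i < m; i++) skip[pat[i]] = m - i;

  pos = 0;
  while (pos + m <= n) {
    if (memcmp(text + pos, pat, m) == 0) return (long )pos;
    if (pos + m >= n) break;          /* no byte after the window */
    pos += skip[text[pos + m]];       /* decide the shift from that byte */
  }
  return -1;
}
```

The difference from Boyer-Moore-Horspool is the byte consulted for the shift: Horspool looks at the last byte inside the current window, while Sunday's variant looks at the byte immediately after it.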

View File

@ -94,7 +94,7 @@ Usage
See doc/API for Oniguruma API.
If you want to disable UChar type (== unsigned char) definition
in oniguruma.h, define ONIG_ESCAPE_UCHAR_COLLISION and then
in oniguruma.h, define ONIG_ESCAPE_UCHAR_COLLISION and then
include oniguruma.h.
If you want to disable regex_t type definition in oniguruma.h,

View File

@ -1,4 +1,6 @@
[![Build Status](https://travis-ci.org/kkos/oniguruma.svg?branch=master)](https://travis-ci.org/kkos/oniguruma)
[![Code Quality: Cpp](https://img.shields.io/lgtm/grade/cpp/g/kkos/oniguruma.svg?logo=lgtm&logoWidth=18)](https://lgtm.com/projects/g/kkos/oniguruma/context:cpp)
[![Total Alerts](https://img.shields.io/lgtm/alerts/g/kkos/oniguruma.svg?logo=lgtm&logoWidth=18)](https://lgtm.com/projects/g/kkos/oniguruma/alerts)
Oniguruma
=========
@ -24,6 +26,12 @@ Supported character encodings:
* CP1251: contributed by Byte
New feature of version 6.9.1
--------------------------
* Speed improvement (* especially UTF-8)
New feature of version 6.9.0
--------------------------
@ -193,7 +201,7 @@ Usage
See doc/API for Oniguruma API.
If you want to disable UChar type (== unsigned char) definition
in oniguruma.h, define ONIG_ESCAPE_UCHAR_COLLISION and then
in oniguruma.h, define ONIG_ESCAPE_UCHAR_COLLISION and then
include oniguruma.h.
If you want to disable regex_t type definition in oniguruma.h,
@ -294,4 +302,4 @@ Source Files
|utf32_le.c |UTF-32LE encoding |
|unicode.c |common codes of Unicode encoding |
|unicode_fold_data.c|Unicode folding data |
|windows/testc.c |Test program for Windowns (VC++) |
|windows/testc.c |Test program for Windows (VC++) |

View File

@ -182,7 +182,7 @@ Oniguruma API Version 6.8.0 2018/03/13
ci->target_enc: target string character encoding.
ci->syntax: address of pattern syntax definition.
ci->option: compile time option.
ci->case_fold_flag: character matching case fold bit flag for
ci->case_fold_flag: character matching case fold bit flag for
ONIG_OPTION_IGNORECASE mode.
ONIGENC_CASE_FOLD_MIN: minimum

View File

@ -54,7 +54,7 @@
\t, \n, \v, \f, \r, \x20
In the case of Unicode:
U+0009, U+000A, U+000B, U+000C, U+000D, U+0085(NEL),
U+0009, U+000A, U+000B, U+000C, U+000D, U+0085(NEL),
General_Category -- Line_Separator
-- Paragraph_Separator
-- Space_Separator

View File

@ -8,7 +8,7 @@
<h1>Oniguruma</h1> (<a href="index_ja.html">Japanese</a>)
<p>
(c) K.Kosako, updated at: 2018/08/31
(c) K.Kosako, updated at: 2018/12/06
</p>
<dl>
@ -16,6 +16,7 @@
<dt><b>What's new</b>
</font>
<ul>
<li>2018/12/11: Version 6.9.1 released.</li>
<li>2018/09/03: Version 6.9.0 released.</li>
<li>2018/04/17: Version 6.8.2 released.</li>
<li>2018/03/19: Version 6.8.1 released.</li>

View File

@ -8,7 +8,7 @@
<h1>鬼車</h1>
<p>
(c) K.Kosako, last updated: 2018/09/03
(c) K.Kosako, last updated: 2018/12/06
</p>
<dl>
@ -16,6 +16,7 @@
<dt><b>What's new</b>
</font>
<ul>
<li>2018/12/11: Version 6.9.1 released.</li>
<li>2018/09/03: Version 6.9.0 released.</li>
<li>2018/04/17: Version 6.8.2 released.</li>
<li>2018/03/19: Version 6.8.1 released.</li>

View File

@ -113,6 +113,6 @@ OnigEncodingType OnigEncodingASCII = {
init,
0, /* is_initialized */
onigenc_always_true_is_valid_mbc_string,
ENC_FLAG_ASCII_COMPATIBLE,
ENC_FLAG_ASCII_COMPATIBLE|ENC_FLAG_SKIP_OFFSET_1,
0, 0
};

View File

@ -151,7 +151,7 @@ big5_left_adjust_char_head(const UChar* start, const UChar* s)
p++;
break;
}
}
}
}
len = enclen(ONIG_ENCODING_BIG5, p);
if (p + len > s) return (UChar* )p;
@ -187,6 +187,6 @@ OnigEncodingType OnigEncodingBIG5 = {
NULL, /* init */
NULL, /* is_initialized */
is_valid_mbc_string,
ENC_FLAG_ASCII_COMPATIBLE,
ENC_FLAG_ASCII_COMPATIBLE|ENC_FLAG_SKIP_OFFSET_1,
0, 0
};

View File

@ -200,6 +200,6 @@ OnigEncodingType OnigEncodingCP1251 = {
NULL, /* init */
NULL, /* is_initialized */
onigenc_always_true_is_valid_mbc_string,
ENC_FLAG_ASCII_COMPATIBLE,
ENC_FLAG_ASCII_COMPATIBLE|ENC_FLAG_SKIP_OFFSET_1,
0, 0
};

View File

@ -151,7 +151,7 @@ code_to_mbc(OnigCodePoint code, UChar *buf)
#if 1
if (enclen(ONIG_ENCODING_EUC_JP, buf) != (p - buf))
return ONIGERR_INVALID_CODE_POINT_VALUE;
#endif
#endif
return (int )(p - buf);
}
@ -307,6 +307,6 @@ OnigEncodingType OnigEncodingEUC_JP = {
NULL, /* init */
NULL, /* is_initialized */
is_valid_mbc_string,
ENC_FLAG_ASCII_COMPATIBLE,
ENC_FLAG_ASCII_COMPATIBLE|ENC_FLAG_SKIP_OFFSET_1_OR_0,
0, 0
};

View File

@ -161,7 +161,9 @@ OnigEncodingType OnigEncodingEUC_KR = {
euckr_is_allowed_reverse_match,
NULL, /* init */
NULL, /* is_initialized */
is_valid_mbc_string
is_valid_mbc_string,
ENC_FLAG_ASCII_COMPATIBLE|ENC_FLAG_SKIP_OFFSET_1_OR_0,
0, 0
};
/* Same with OnigEncodingEUC_KR except the name */
@ -185,6 +187,6 @@ OnigEncodingType OnigEncodingEUC_CN = {
NULL, /* init */
NULL, /* is_initialized */
is_valid_mbc_string,
ENC_FLAG_ASCII_COMPATIBLE,
ENC_FLAG_ASCII_COMPATIBLE|ENC_FLAG_SKIP_OFFSET_1_OR_0,
0, 0
};

View File

@ -168,6 +168,6 @@ OnigEncodingType OnigEncodingEUC_TW = {
NULL, /* init */
NULL, /* is_initialized */
is_valid_mbc_string,
ENC_FLAG_ASCII_COMPATIBLE,
ENC_FLAG_ASCII_COMPATIBLE|ENC_FLAG_SKIP_OFFSET_1,
0, 0
};

View File

@ -535,6 +535,6 @@ OnigEncodingType OnigEncodingGB18030 = {
NULL, /* init */
NULL, /* is_initialized */
is_valid_mbc_string,
ENC_FLAG_ASCII_COMPATIBLE,
ENC_FLAG_ASCII_COMPATIBLE|ENC_FLAG_SKIP_OFFSET_1,
0, 0
};

View File

@ -272,6 +272,6 @@ OnigEncodingType OnigEncodingISO_8859_1 = {
NULL, /* init */
NULL, /* is_initialized */
onigenc_always_true_is_valid_mbc_string,
ENC_FLAG_ASCII_COMPATIBLE,
ENC_FLAG_ASCII_COMPATIBLE|ENC_FLAG_SKIP_OFFSET_1,
0, 0
};

View File

@ -239,6 +239,6 @@ OnigEncodingType OnigEncodingISO_8859_10 = {
NULL, /* init */
NULL, /* is_initialized */
onigenc_always_true_is_valid_mbc_string,
ENC_FLAG_ASCII_COMPATIBLE,
ENC_FLAG_ASCII_COMPATIBLE|ENC_FLAG_SKIP_OFFSET_1,
0, 0
};

View File

@ -96,6 +96,6 @@ OnigEncodingType OnigEncodingISO_8859_11 = {
NULL, /* init */
NULL, /* is_initialized */
onigenc_always_true_is_valid_mbc_string,
ENC_FLAG_ASCII_COMPATIBLE,
ENC_FLAG_ASCII_COMPATIBLE|ENC_FLAG_SKIP_OFFSET_1,
0, 0
};

View File

@ -228,6 +228,6 @@ OnigEncodingType OnigEncodingISO_8859_13 = {
NULL, /* init */
NULL, /* is_initialized */
onigenc_always_true_is_valid_mbc_string,
ENC_FLAG_ASCII_COMPATIBLE,
ENC_FLAG_ASCII_COMPATIBLE|ENC_FLAG_SKIP_OFFSET_1,
0, 0
};

View File

@ -241,6 +241,6 @@ OnigEncodingType OnigEncodingISO_8859_14 = {
NULL, /* init */
NULL, /* is_initialized */
onigenc_always_true_is_valid_mbc_string,
ENC_FLAG_ASCII_COMPATIBLE,
ENC_FLAG_ASCII_COMPATIBLE|ENC_FLAG_SKIP_OFFSET_1,
0, 0
};

View File

@ -235,6 +235,6 @@ OnigEncodingType OnigEncodingISO_8859_15 = {
NULL, /* init */
NULL, /* is_initialized */
onigenc_always_true_is_valid_mbc_string,
ENC_FLAG_ASCII_COMPATIBLE,
ENC_FLAG_ASCII_COMPATIBLE|ENC_FLAG_SKIP_OFFSET_1,
0, 0
};

View File

@ -237,6 +237,6 @@ OnigEncodingType OnigEncodingISO_8859_16 = {
NULL, /* init */
NULL, /* is_initialized */
onigenc_always_true_is_valid_mbc_string,
ENC_FLAG_ASCII_COMPATIBLE,
ENC_FLAG_ASCII_COMPATIBLE|ENC_FLAG_SKIP_OFFSET_1,
0, 0
};

View File

@ -235,6 +235,6 @@ OnigEncodingType OnigEncodingISO_8859_2 = {
NULL, /* init */
NULL, /* is_initialized */
onigenc_always_true_is_valid_mbc_string,
ENC_FLAG_ASCII_COMPATIBLE,
ENC_FLAG_ASCII_COMPATIBLE|ENC_FLAG_SKIP_OFFSET_1,
0, 0
};

View File

@ -235,6 +235,6 @@ OnigEncodingType OnigEncodingISO_8859_3 = {
NULL, /* init */
NULL, /* is_initialized */
onigenc_always_true_is_valid_mbc_string,
ENC_FLAG_ASCII_COMPATIBLE,
ENC_FLAG_ASCII_COMPATIBLE|ENC_FLAG_SKIP_OFFSET_1,
0, 0
};

View File

@ -237,6 +237,6 @@ OnigEncodingType OnigEncodingISO_8859_4 = {
NULL, /* init */
NULL, /* is_initialized */
onigenc_always_true_is_valid_mbc_string,
ENC_FLAG_ASCII_COMPATIBLE,
ENC_FLAG_ASCII_COMPATIBLE|ENC_FLAG_SKIP_OFFSET_1,
0, 0
};

View File

@ -226,6 +226,6 @@ OnigEncodingType OnigEncodingISO_8859_5 = {
NULL, /* init */
NULL, /* is_initialized */
onigenc_always_true_is_valid_mbc_string,
ENC_FLAG_ASCII_COMPATIBLE,
ENC_FLAG_ASCII_COMPATIBLE|ENC_FLAG_SKIP_OFFSET_1,
0, 0
};

View File

@ -96,6 +96,6 @@ OnigEncodingType OnigEncodingISO_8859_6 = {
NULL, /* init */
NULL, /* is_initialized */
onigenc_always_true_is_valid_mbc_string,
ENC_FLAG_ASCII_COMPATIBLE,
ENC_FLAG_ASCII_COMPATIBLE|ENC_FLAG_SKIP_OFFSET_1,
0, 0
};

View File

@ -222,6 +222,6 @@ OnigEncodingType OnigEncodingISO_8859_7 = {
NULL, /* init */
NULL, /* is_initialized */
onigenc_always_true_is_valid_mbc_string,
ENC_FLAG_ASCII_COMPATIBLE,
ENC_FLAG_ASCII_COMPATIBLE|ENC_FLAG_SKIP_OFFSET_1,
0, 0
};

View File

@ -96,6 +96,6 @@ OnigEncodingType OnigEncodingISO_8859_8 = {
NULL, /* init */
NULL, /* is_initialized */
onigenc_always_true_is_valid_mbc_string,
ENC_FLAG_ASCII_COMPATIBLE,
ENC_FLAG_ASCII_COMPATIBLE|ENC_FLAG_SKIP_OFFSET_1,
0, 0
};

View File

@ -228,6 +228,6 @@ OnigEncodingType OnigEncodingISO_8859_9 = {
NULL, /* init */
NULL, /* is_initialized */
onigenc_always_true_is_valid_mbc_string,
ENC_FLAG_ASCII_COMPATIBLE,
ENC_FLAG_ASCII_COMPATIBLE|ENC_FLAG_SKIP_OFFSET_1,
0, 0
};

View File

@ -250,6 +250,6 @@ OnigEncodingType OnigEncodingKOI8 = {
NULL, /* init */
NULL, /* is_initialized */
onigenc_always_true_is_valid_mbc_string,
ENC_FLAG_ASCII_COMPATIBLE,
ENC_FLAG_ASCII_COMPATIBLE|ENC_FLAG_SKIP_OFFSET_1,
0, 0
};

View File

@ -212,6 +212,6 @@ OnigEncodingType OnigEncodingKOI8_R = {
NULL, /* init */
NULL, /* is_initialized */
onigenc_always_true_is_valid_mbc_string,
ENC_FLAG_ASCII_COMPATIBLE,
ENC_FLAG_ASCII_COMPATIBLE|ENC_FLAG_SKIP_OFFSET_1,
0, 0
};

View File

@ -36,9 +36,9 @@ extern "C" {
#define ONIGURUMA
#define ONIGURUMA_VERSION_MAJOR 6
#define ONIGURUMA_VERSION_MINOR 9
#define ONIGURUMA_VERSION_TEENY 0
#define ONIGURUMA_VERSION_TEENY 1
#define ONIGURUMA_VERSION_INT 60900
#define ONIGURUMA_VERSION_INT 60901
#ifndef P_
#if defined(__STDC__) || defined(_WIN32)
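The two values of `ONIGURUMA_VERSION_INT` in this hunk (60900 and 60901) are consistent with the encoding major * 10000 + minor * 100 + teeny. That formula is inferred from the numbers shown here rather than stated in the header, so the sketch below should be read as an assumption; `demo_version_int` is a hypothetical helper.

```c
#include <assert.h>

/* Hypothetical helper: composes a version integer the way the values
   in this hunk suggest (major * 10000 + minor * 100 + teeny). */
static int demo_version_int(int major, int minor, int teeny)
{
  /* Each of minor and teeny is assumed to occupy two decimal digits. */
  assert(minor >= 0 && minor < 100);
  assert(teeny >= 0 && teeny < 100);
  return major * 10000 + minor * 100 + teeny;
}
```

Under that assumption, a caller could gate code on the new release with a preprocessor check such as `#if ONIGURUMA_VERSION_INT >= 60901`.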

File diff suppressed because it is too large

View File

@ -231,7 +231,7 @@ onigenc_strlen(OnigEncoding enc, const UChar* p, const UChar* end)
{
int n = 0;
UChar* q = (UChar* )p;
while (q < end) {
q += ONIGENC_MBC_ENC_LEN(enc, q);
n++;
@ -244,7 +244,7 @@ onigenc_strlen_null(OnigEncoding enc, const UChar* s)
{
int n = 0;
UChar* p = (UChar* )s;
while (1) {
if (*p == '\0') {
UChar* q;

View File

@ -121,8 +121,20 @@ struct PropertyNameCtype {
#define ONIG_ENCODING_INIT_DEFAULT ONIG_ENCODING_ASCII
#define ENC_SKIP_OFFSET_1_OR_0 7
#define ENC_FLAG_ASCII_COMPATIBLE (1<<0)
#define ENC_FLAG_UNICODE (1<<1)
#define ENC_FLAG_SKIP_OFFSET_MASK (7<<2)
#define ENC_FLAG_SKIP_OFFSET_0 0
#define ENC_FLAG_SKIP_OFFSET_1 (1<<2)
#define ENC_FLAG_SKIP_OFFSET_2 (2<<2)
#define ENC_FLAG_SKIP_OFFSET_3 (3<<2)
#define ENC_FLAG_SKIP_OFFSET_4 (4<<2)
#define ENC_FLAG_SKIP_OFFSET_1_OR_0 (ENC_SKIP_OFFSET_1_OR_0<<2)
#define ENC_GET_SKIP_OFFSET(enc) \
(((enc)->flag & ENC_FLAG_SKIP_OFFSET_MASK)>>2)
/* for encoding system implementation (internal) */
@ -197,7 +209,7 @@ extern int onigenc_egcb_is_break_position P_((OnigEncoding enc, UChar* p, UChar*
else if ((buk)->fold_len == 3)\
addr = OnigUnicodeFolds3 + (buk)->index;\
else\
addr = 0;\
return ONIGERR_INVALID_CODE_POINT_VALUE;\
} while (0)
extern OnigCodePoint OnigUnicodeFolds1[];
@ -252,7 +264,7 @@ extern const unsigned short OnigEncAsciiCtypeTable[];
#define ONIGENC_IS_ASCII_CODE_CASE_AMBIG(code) \
(ONIGENC_IS_ASCII_CODE_CTYPE(code, ONIGENC_CTYPE_UPPER) ||\
ONIGENC_IS_ASCII_CODE_CTYPE(code, ONIGENC_CTYPE_LOWER))
#define ONIGENC_IS_UNICODE_ENCODING(enc) \
(((enc)->flag & ENC_FLAG_UNICODE) != 0)
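The `ENC_FLAG_SKIP_OFFSET_*` definitions in the first hunk of this file pack a three-bit field into bits 2 to 4 of the encoding's `flag` member, next to the existing `ENC_FLAG_ASCII_COMPATIBLE` and `ENC_FLAG_UNICODE` bits. The sketch below copies those macros and reads the field back out; `DemoEncoding` and `demo_skip_offset` are hypothetical stand-ins, since the real `OnigEncodingType` has many more members.

```c
#include <assert.h>

/* Flag layout copied from the hunk above. */
#define ENC_SKIP_OFFSET_1_OR_0        7
#define ENC_FLAG_ASCII_COMPATIBLE     (1<<0)
#define ENC_FLAG_UNICODE              (1<<1)
#define ENC_FLAG_SKIP_OFFSET_MASK     (7<<2)
#define ENC_FLAG_SKIP_OFFSET_1        (1<<2)
#define ENC_FLAG_SKIP_OFFSET_1_OR_0   (ENC_SKIP_OFFSET_1_OR_0<<2)

#define ENC_GET_SKIP_OFFSET(enc) \
  (((enc)->flag & ENC_FLAG_SKIP_OFFSET_MASK)>>2)

/* Hypothetical stand-in for OnigEncodingType: only the flag member. */
typedef struct { unsigned int flag; } DemoEncoding;

/* Returns the skip-offset field stored in the given flag word. */
static int demo_skip_offset(unsigned int flag)
{
  DemoEncoding enc;

  /* The mask covers exactly three bits, so values 0..7 fit. */
  assert((ENC_FLAG_SKIP_OFFSET_MASK >> 2) == 7);
  enc.flag = flag;
  return (int )ENC_GET_SKIP_OFFSET(&enc);
}
```

With this layout, the single-byte encodings in this commit (`ENC_FLAG_ASCII_COMPATIBLE|ENC_FLAG_SKIP_OFFSET_1`) yield 1, and EUC-JP, EUC-KR and EUC-CN yield the sentinel value 7 (`ENC_SKIP_OFFSET_1_OR_0`).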

View File

@ -30,13 +30,7 @@
#include "regint.h"
#include <stdio.h> /* for vsnprintf() */
#ifdef HAVE_STDARG_PROTOTYPES
#include <stdarg.h>
#define va_init_list(a,b) va_start(a,b)
#else
#include <varargs.h>
#define va_init_list(a,b) va_start(a)
#endif
extern UChar*
onig_error_code_to_format(int code)
@ -247,7 +241,7 @@ static int to_ascii(OnigEncoding enc, UChar *s, UChar *end,
if (len >= buf_size) break;
}
*is_over = ((p < end) ? 1 : 0);
*is_over = p < end;
}
else {
len = MIN((int )(end - s), buf_size);
@ -262,15 +256,7 @@ static int to_ascii(OnigEncoding enc, UChar *s, UChar *end,
/* for ONIG_MAX_ERROR_MESSAGE_LEN */
#define MAX_ERROR_PAR_LEN 30
extern int
#ifdef HAVE_STDARG_PROTOTYPES
onig_error_code_to_str(UChar* s, int code, ...)
#else
onig_error_code_to_str(s, code, va_alist)
UChar* s;
int code;
va_dcl
#endif
extern int onig_error_code_to_str(UChar* s, int code, ...)
{
UChar *p, *q;
OnigErrorInfo* einfo;
@ -278,7 +264,7 @@ onig_error_code_to_str(s, code, va_alist)
UChar parbuf[MAX_ERROR_PAR_LEN];
va_list vargs;
va_init_list(vargs, code);
va_start(vargs, code);
switch (code) {
case ONIGERR_UNDEFINED_NAME_REFERENCE:
@ -330,27 +316,15 @@ onig_error_code_to_str(s, code, va_alist)
}
void
#ifdef HAVE_STDARG_PROTOTYPES
onig_snprintf_with_pattern(UChar buf[], int bufsize, OnigEncoding enc,
UChar* pat, UChar* pat_end, const UChar *fmt, ...)
#else
onig_snprintf_with_pattern(buf, bufsize, enc, pat, pat_end, fmt, va_alist)
UChar buf[];
int bufsize;
OnigEncoding enc;
UChar* pat;
UChar* pat_end;
const UChar *fmt;
va_dcl
#endif
void onig_snprintf_with_pattern(UChar buf[], int bufsize, OnigEncoding enc,
UChar* pat, UChar* pat_end, const UChar *fmt, ...)
{
int n, need, len;
UChar *p, *s, *bp;
UChar bs[6];
va_list args;
va_init_list(args, fmt);
va_start(args, fmt);
n = xvsnprintf((char* )buf, bufsize, (const char* )fmt, args);
va_end(args);
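The hunks in this file drop the pre-ANSI `<varargs.h>` path and call `va_start()` with the last named parameter directly. A minimal sketch of that standard `<stdarg.h>` pattern follows; `demo_format` is a hypothetical helper that uses `vsnprintf()` in place of the library's `xvsnprintf` macro.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of Oniguruma): formats into buf and
   returns the number of characters the full output needs. */
static int demo_format(char* buf, size_t bufsize, const char* fmt, ...)
{
  va_list args;
  int n;

  assert(buf != NULL && fmt != NULL);
  va_start(args, fmt);   /* second argument: the last named parameter */
  n = vsnprintf(buf, bufsize, fmt, args);
  va_end(args);          /* every va_start needs a matching va_end */
  return n;
}
```

Because `<stdarg.h>` is part of standard C, the `HAVE_STDARG_PROTOTYPES` switch and the `va_init_list` wrapper are no longer needed.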

File diff suppressed because it is too large

View File

@ -171,9 +171,7 @@ onig_new_deluxe(regex_t** reg, const UChar* pattern, const UChar* pattern_end,
if (IS_NOT_NULL(einfo)) einfo->par = (UChar* )NULL;
if (ci->pattern_enc != ci->target_enc) {
r = conv_encoding(ci->pattern_enc, ci->target_enc, pattern, pattern_end,
&cpat, &cpat_end);
if (r != 0) return r;
return ONIGERR_NOT_SUPPORTED_ENCODING_COMBINATION;
}
else {
cpat = (UChar* )pattern;

View File

@ -62,7 +62,6 @@
#define USE_INSISTENT_CHECK_CAPTURES_IN_EMPTY_REPEAT /* /(?:()|())*\2/ */
#define USE_NEWLINE_AT_END_OF_STRING_HAS_EMPTY_LINE /* /\n$/ =~ "\n" */
#define USE_WARNING_REDUNDANT_NESTED_REPEAT_OPERATOR
#define USE_RETRY_LIMIT_IN_MATCH
/* internal config */
@ -70,28 +69,14 @@
#define USE_QUANT_PEEK_NEXT
#define USE_ST_LIBRARY
#define USE_WORD_BEGIN_END /* "\<", "\>" */
#define USE_CAPTURE_HISTORY
#define USE_VARIABLE_META_CHARS
#define USE_POSIX_API_REGION_OPTION
#define USE_FIND_LONGEST_SEARCH_ALL_OF_RANGE
#include "regenc.h"
#ifdef __cplusplus
# ifndef HAVE_STDARG_PROTOTYPES
# define HAVE_STDARG_PROTOTYPES 1
# endif
#endif
/* escape Mac OS X/Xcode 2.4/gcc 4.0.1 problem */
#if defined(__APPLE__) && defined(__GNUC__) && __GNUC__ >= 4
# ifndef HAVE_STDARG_PROTOTYPES
# define HAVE_STDARG_PROTOTYPES 1
# endif
#endif
#ifdef HAVE_STDARG_H
# ifndef HAVE_STDARG_PROTOTYPES
# define HAVE_STDARG_PROTOTYPES 1
# endif
#endif
#define INIT_MATCH_STACK_SIZE 160
#define DEFAULT_MATCH_STACK_LIMIT_SIZE 0 /* unlimited */
#define DEFAULT_RETRY_LIMIT_IN_MATCH 10000000
@ -103,12 +88,6 @@
#undef ONIG_ESCAPE_UCHAR_COLLISION
#endif
#define USE_WORD_BEGIN_END /* "\<", "\>" */
#define USE_CAPTURE_HISTORY
#define USE_VARIABLE_META_CHARS
#define USE_POSIX_API_REGION_OPTION
#define USE_FIND_LONGEST_SEARCH_ALL_OF_RANGE
#define xmalloc malloc
#define xrealloc realloc
#define xcalloc calloc
@ -152,14 +131,8 @@
#include <stddef.h>
#ifdef HAVE_LIMITS_H
#include <limits.h>
#endif
#ifdef HAVE_STDLIB_H
#include <stdlib.h>
#endif
#ifdef HAVE_STDINT_H
#include <stdint.h>
@ -169,11 +142,7 @@
#include <alloca.h>
#endif
#ifdef HAVE_STRING_H
# include <string.h>
#else
# include <strings.h>
#endif
#include <string.h>
#include <ctype.h>
#ifdef HAVE_SYS_TYPES_H
@ -217,6 +186,7 @@ typedef unsigned int uintptr_t;
#define CHECK_NULL_RETURN_MEMERR(p) if (IS_NULL(p)) return ONIGERR_MEMORY
#define NULL_UCHARP ((UChar* )0)
#define CHAR_MAP_SIZE 256
#define INFINITE_LEN ONIG_INFINITE_DISTANCE
#ifdef PLATFORM_UNALIGNED_WORD_ACCESS
@ -292,9 +262,6 @@ typedef struct {
#endif
} RegexExt;
#define REG_EXTP(reg) ((RegexExt* )((reg)->chain))
#define REG_EXTPL(reg) ((reg)->chain)
struct re_pattern_buffer {
/* common members of BBuf(bytes-buffer) */
unsigned char* p; /* compiled pattern */
@ -304,7 +271,6 @@ struct re_pattern_buffer {
int num_mem; /* used memory(...) num counted from 1 */
int num_repeat; /* OP_REPEAT/OP_REPEAT_NG id-counter */
int num_null_check; /* OP_EMPTY_CHECK_START/END id counter */
int num_comb_exp_check; /* no longer used (combination explosion check) */
int num_call; /* number of subexp call */
unsigned int capture_history; /* (?@...) flag (1-31) */
unsigned int bt_mem_start; /* need backtrack flag */
@ -323,19 +289,16 @@ struct re_pattern_buffer {
int optimize; /* optimize flag */
int threshold_len; /* search str-length for apply optimize */
int anchor; /* BEGIN_BUF, BEGIN_POS, (SEMI_)END_BUF */
OnigLen anchor_dmin; /* (SEMI_)END_BUF anchor distance */
OnigLen anchor_dmax; /* (SEMI_)END_BUF anchor distance */
OnigLen anchor_dmin; /* (SEMI_)END_BUF anchor distance */
OnigLen anchor_dmax; /* (SEMI_)END_BUF anchor distance */
int sub_anchor; /* start-anchor for exact or map */
unsigned char *exact;
unsigned char *exact_end;
unsigned char map[ONIG_CHAR_TABLE_SIZE]; /* used as BM skip or char-map */
int *int_map; /* BM skip for exact_len > 255 */
int *int_map_backward; /* BM skip for backward search */
OnigLen dmin; /* min-distance of exact or map */
OnigLen dmax; /* max-distance of exact or map */
/* regex_t link chain */
struct re_pattern_buffer* chain; /* escape compile-conflict */
unsigned char map[CHAR_MAP_SIZE]; /* used as BMH skip or char-map */
int map_offset;
OnigLen dmin; /* min-distance of exact or map */
OnigLen dmax; /* max-distance of exact or map */
RegexExt* extp;
};
@ -348,12 +311,13 @@ enum StackPopLevel {
/* optimize flags */
enum OptimizeType {
OPTIMIZE_NONE = 0,
OPTIMIZE_EXACT = 1, /* Slow Search */
OPTIMIZE_EXACT_BM = 2, /* Boyer Moore Search */
OPTIMIZE_EXACT_BM_NO_REV = 3, /* BM (but not simple match) */
OPTIMIZE_EXACT_IC = 4, /* Slow Search (ignore case) */
OPTIMIZE_MAP = 5 /* char map */
OPTIMIZE_NONE = 0,
OPTIMIZE_STR, /* Slow Search */
OPTIMIZE_STR_FAST, /* Sunday quick search / BMH */
OPTIMIZE_STR_FAST_STEP_FORWARD, /* Sunday quick search / BMH */
OPTIMIZE_STR_CASE_FOLD_FAST, /* Sunday quick search / BMH (ignore case) */
OPTIMIZE_STR_CASE_FOLD, /* Slow Search (ignore case) */
OPTIMIZE_MAP /* char map */
};
/* bit status */
@ -541,32 +505,32 @@ typedef struct _BBuf {
/* has body */
#define ANCHOR_PREC_READ (1<<0)
#define ANCHOR_PREC_READ_NOT (1<<1)
#define ANCHOR_LOOK_BEHIND (1<<2)
#define ANCHOR_LOOK_BEHIND_NOT (1<<3)
#define ANCR_PREC_READ (1<<0)
#define ANCR_PREC_READ_NOT (1<<1)
#define ANCR_LOOK_BEHIND (1<<2)
#define ANCR_LOOK_BEHIND_NOT (1<<3)
/* no body */
#define ANCHOR_BEGIN_BUF (1<<4)
#define ANCHOR_BEGIN_LINE (1<<5)
#define ANCHOR_BEGIN_POSITION (1<<6)
#define ANCHOR_END_BUF (1<<7)
#define ANCHOR_SEMI_END_BUF (1<<8)
#define ANCHOR_END_LINE (1<<9)
#define ANCHOR_WORD_BOUNDARY (1<<10)
#define ANCHOR_NO_WORD_BOUNDARY (1<<11)
#define ANCHOR_WORD_BEGIN (1<<12)
#define ANCHOR_WORD_END (1<<13)
#define ANCHOR_ANYCHAR_INF (1<<14)
#define ANCHOR_ANYCHAR_INF_ML (1<<15)
#define ANCHOR_EXTENDED_GRAPHEME_CLUSTER_BOUNDARY (1<<16)
#define ANCHOR_NO_EXTENDED_GRAPHEME_CLUSTER_BOUNDARY (1<<17)
#define ANCR_BEGIN_BUF (1<<4)
#define ANCR_BEGIN_LINE (1<<5)
#define ANCR_BEGIN_POSITION (1<<6)
#define ANCR_END_BUF (1<<7)
#define ANCR_SEMI_END_BUF (1<<8)
#define ANCR_END_LINE (1<<9)
#define ANCR_WORD_BOUNDARY (1<<10)
#define ANCR_NO_WORD_BOUNDARY (1<<11)
#define ANCR_WORD_BEGIN (1<<12)
#define ANCR_WORD_END (1<<13)
#define ANCR_ANYCHAR_INF (1<<14)
#define ANCR_ANYCHAR_INF_ML (1<<15)
#define ANCR_EXTENDED_GRAPHEME_CLUSTER_BOUNDARY (1<<16)
#define ANCR_NO_EXTENDED_GRAPHEME_CLUSTER_BOUNDARY (1<<17)
#define ANCHOR_HAS_BODY(a) ((a)->type < ANCHOR_BEGIN_BUF)
#define ANCHOR_HAS_BODY(a) ((a)->type < ANCR_BEGIN_BUF)
#define IS_WORD_ANCHOR_TYPE(type) \
((type) == ANCHOR_WORD_BOUNDARY || (type) == ANCHOR_NO_WORD_BOUNDARY || \
(type) == ANCHOR_WORD_BEGIN || (type) == ANCHOR_WORD_END)
((type) == ANCR_WORD_BOUNDARY || (type) == ANCR_NO_WORD_BOUNDARY || \
(type) == ANCR_WORD_BEGIN || (type) == ANCR_WORD_END)
/* operation code */
enum OpCode {
@ -851,6 +815,7 @@ extern void onig_transfer P_((regex_t* to, regex_t* from));
extern int onig_is_code_in_cc_len P_((int enclen, OnigCodePoint code, void* /* CClassNode* */ cc));
extern RegexExt* onig_get_regex_ext(regex_t* reg);
extern int onig_ext_set_pattern(regex_t* reg, const UChar* pattern, const UChar* pattern_end);
extern int onig_positive_int_multiply(int x, int y);
#ifdef USE_CALLOUT
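The `onig_positive_int_multiply()` declaration added in this file pairs with the static `positive_int_multiply()` that this commit removes elsewhere. The new definition sits in a suppressed part of the diff, so the sketch below only reproduces the removed static version under a hypothetical name, on the assumption that the logic was moved unchanged.

```c
#include <assert.h>
#include <limits.h>

/* Overflow-checked multiply for non-negative ints, mirroring the
   removed static function. Returns -1 when the product might not fit. */
static int demo_positive_int_multiply(int x, int y)
{
  assert(x >= 0 && y >= 0);
  if (x == 0 || y == 0) return 0;

  /* Dividing first avoids ever computing an overflowing x * y.
     The strict '<' is conservative: it also rejects the boundary
     case x == INT_MAX / y, where the product would still fit. */
  if (x < INT_MAX / y)
    return x * y;
  else
    return -1;
}
```

Callers can treat a negative result as "too large" and return an error instead of continuing with a wrapped value.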

View File

@ -71,7 +71,7 @@ OnigSyntaxType OnigSyntaxOniguruma = {
ONIG_SYN_OP2_CCLASS_SET_OP | ONIG_SYN_OP2_ESC_CAPITAL_C_BAR_CONTROL |
ONIG_SYN_OP2_ESC_CAPITAL_M_BAR_META | ONIG_SYN_OP2_ESC_V_VTAB |
ONIG_SYN_OP2_ESC_H_XDIGIT | ONIG_SYN_OP2_ESC_U_HEX4 )
, ( SYN_GNU_REGEX_BV |
, ( SYN_GNU_REGEX_BV |
ONIG_SYN_ALLOW_INTERVAL_LOW_ABBREV |
ONIG_SYN_DIFFERENT_LEN_ALT_LOOK_BEHIND |
ONIG_SYN_CAPTURE_ONLY_NAMED_GROUP |
@ -113,7 +113,7 @@ OnigSyntaxType OnigSyntaxRuby = {
ONIG_SYN_OP2_CCLASS_SET_OP | ONIG_SYN_OP2_ESC_CAPITAL_C_BAR_CONTROL |
ONIG_SYN_OP2_ESC_CAPITAL_M_BAR_META | ONIG_SYN_OP2_ESC_V_VTAB |
ONIG_SYN_OP2_ESC_H_XDIGIT | ONIG_SYN_OP2_ESC_U_HEX4 )
, ( SYN_GNU_REGEX_BV |
, ( SYN_GNU_REGEX_BV |
ONIG_SYN_ALLOW_INTERVAL_LOW_ABBREV |
ONIG_SYN_DIFFERENT_LEN_ALT_LOOK_BEHIND |
ONIG_SYN_CAPTURE_ONLY_NAMED_GROUP |
@ -198,17 +198,6 @@ onig_set_parse_depth_limit(unsigned int depth)
return 0;
}
static int
positive_int_multiply(int x, int y)
{
if (x == 0 || y == 0) return 0;
if (x < INT_MAX / y)
return x * y;
else
return -1;
}
static void
bbuf_free(BBuf* bbuf)
{
@ -969,6 +958,7 @@ name_add(regex_t* reg, UChar* name, UChar* name_end, int backref, ScanEnv* env)
#ifdef USE_ST_LIBRARY
if (IS_NULL(t)) {
t = onig_st_init_strend_table_with_size(INIT_NAMES_ALLOC_NUM);
CHECK_NULL_RETURN_MEMERR(t);
reg->name_table = (void* )t;
}
e = (NameEntry* )xmalloc(sizeof(NameEntry));
@ -1375,6 +1365,7 @@ callout_name_entry(CalloutNameEntry** rentry, OnigEncoding enc,
#ifdef USE_ST_LIBRARY
if (IS_NULL(t)) {
t = onig_st_init_callout_name_table_with_size(INIT_NAMES_ALLOC_NUM);
CHECK_NULL_RETURN_MEMERR(t);
GlobalCalloutNameTable = t;
}
e = (CalloutNameEntry* )xmalloc(sizeof(CalloutNameEntry));
@ -1574,6 +1565,7 @@ onig_set_callout_of_name(OnigEncoding enc, OnigCalloutType callout_type,
}
for (i = arg_num - opt_arg_num, j = 0; i < arg_num; i++, j++) {
if (fe->arg_types[i] == ONIG_TYPE_STRING) {
if (IS_NULL(opt_defaults)) return ONIGERR_INVALID_ARGUMENT;
OnigValue* val = opt_defaults + j;
UChar* ds = onigenc_strdup(enc, val->s.start, val->s.end);
CHECK_NULL_RETURN_MEMERR(ds);
@ -1619,6 +1611,7 @@ onig_get_callout_start_func(regex_t* reg, int callout_num)
CalloutListEntry* e;
e = onig_reg_callout_list_at(reg, callout_num);
CHECK_NULL_RETURN(e);
return e->start_func;
}
@ -1626,6 +1619,7 @@ extern const UChar*
onig_get_callout_tag_start(regex_t* reg, int callout_num)
{
CalloutListEntry* e = onig_reg_callout_list_at(reg, callout_num);
CHECK_NULL_RETURN(e);
return e->tag_start;
}
@ -1633,6 +1627,7 @@ extern const UChar*
onig_get_callout_tag_end(regex_t* reg, int callout_num)
{
CalloutListEntry* e = onig_reg_callout_list_at(reg, callout_num);
CHECK_NULL_RETURN(e);
return e->tag_end;
}
@ -1739,7 +1734,7 @@ setup_ext_callout_list_values(regex_t* reg)
int i, j;
RegexExt* ext;
ext = REG_EXTP(reg);
ext = reg->extp;
if (IS_NOT_NULL(ext->tag_table)) {
onig_st_foreach((CalloutTagTable *)ext->tag_table, i_callout_callout_list_set,
(st_data_t )ext);
@ -1769,13 +1764,13 @@ setup_ext_callout_list_values(regex_t* reg)
extern int
onig_callout_tag_is_exist_at_callout_num(regex_t* reg, int callout_num)
{
RegexExt* ext = REG_EXTP(reg);
RegexExt* ext = reg->extp;
if (IS_NULL(ext) || IS_NULL(ext->callout_list)) return 0;
if (callout_num > ext->callout_num) return 0;
return (ext->callout_list[callout_num].flag &
CALLOUT_TAG_LIST_FLAG_TAG_EXIST) != 0 ? 1 : 0;
CALLOUT_TAG_LIST_FLAG_TAG_EXIST) != 0;
}
static int
@ -1817,7 +1812,7 @@ onig_get_callout_num_by_tag(regex_t* reg,
RegexExt* ext;
CalloutTagVal e;
ext = REG_EXTP(reg);
ext = reg->extp;
if (IS_NULL(ext) || IS_NULL(ext->tag_table))
return ONIGERR_INVALID_CALLOUT_TAG_NAME;
@ -1904,9 +1899,11 @@ callout_tag_entry(regex_t* reg, UChar* name, UChar* name_end,
if (r != ONIG_NORMAL) return r;
ext = onig_get_regex_ext(reg);
CHECK_NULL_RETURN_MEMERR(ext);
r = callout_tag_entry_raw(ext->tag_table, name, name_end, entry_val);
e = onig_reg_callout_list_at(reg, (int )entry_val);
CHECK_NULL_RETURN_MEMERR(e);
e->tag_start = name;
e->tag_end = name_end;
@ -2011,7 +2008,7 @@ onig_node_free(Node* node)
switch (NODE_TYPE(node)) {
case NODE_STRING:
if (STR_(node)->capa != 0 &&
if (STR_(node)->capacity != 0 &&
IS_NOT_NULL(STR_(node)->s) && STR_(node)->s != STR_(node)->buf) {
xfree(STR_(node)->s);
}
@ -2043,13 +2040,13 @@ onig_node_free(Node* node)
xfree(BACKREF_(node)->back_dynamic);
break;
case NODE_ENCLOSURE:
case NODE_BAG:
if (NODE_BODY(node))
onig_node_free(NODE_BODY(node));
{
EnclosureNode* en = ENCLOSURE_(node);
if (en->type == ENCLOSURE_IF_ELSE) {
BagNode* en = BAG_(node);
if (en->type == BAG_IF_ELSE) {
onig_node_free(en->te.Then);
onig_node_free(en->te.Else);
}
@ -2085,6 +2082,7 @@ node_new(void)
Node* node;
node = (Node* )xmalloc(sizeof(Node));
CHECK_NULL_RETURN(node);
xmemset(node, 0, sizeof(*node));
#ifdef DEBUG_NODE_FREE
@ -2141,6 +2139,8 @@ node_new_anychar_with_fixed_option(OnigOptionType option)
Node* node;
node = node_new_anychar();
CHECK_NULL_RETURN(node);
ct = CTYPE_(node);
ct->options = option;
NODE_STATUS_ADD(node, FIXED_OPTION);
@ -2384,62 +2384,62 @@ node_new_quantifier(int lower, int upper, int by_number)
}
static Node*
node_new_enclosure(enum EnclosureType type)
node_new_bag(enum BagType type)
{
Node* node = node_new();
CHECK_NULL_RETURN(node);
NODE_SET_TYPE(node, NODE_ENCLOSURE);
ENCLOSURE_(node)->type = type;
NODE_SET_TYPE(node, NODE_BAG);
BAG_(node)->type = type;
switch (type) {
case ENCLOSURE_MEMORY:
ENCLOSURE_(node)->m.regnum = 0;
ENCLOSURE_(node)->m.called_addr = -1;
ENCLOSURE_(node)->m.entry_count = 1;
ENCLOSURE_(node)->m.called_state = 0;
case BAG_MEMORY:
BAG_(node)->m.regnum = 0;
BAG_(node)->m.called_addr = -1;
BAG_(node)->m.entry_count = 1;
BAG_(node)->m.called_state = 0;
break;
case ENCLOSURE_OPTION:
ENCLOSURE_(node)->o.options = 0;
case BAG_OPTION:
BAG_(node)->o.options = 0;
break;
case ENCLOSURE_STOP_BACKTRACK:
case BAG_STOP_BACKTRACK:
break;
case ENCLOSURE_IF_ELSE:
ENCLOSURE_(node)->te.Then = 0;
ENCLOSURE_(node)->te.Else = 0;
case BAG_IF_ELSE:
BAG_(node)->te.Then = 0;
BAG_(node)->te.Else = 0;
break;
}
ENCLOSURE_(node)->opt_count = 0;
BAG_(node)->opt_count = 0;
return node;
}
extern Node*
onig_node_new_enclosure(int type)
onig_node_new_bag(enum BagType type)
{
return node_new_enclosure(type);
return node_new_bag(type);
}
static Node*
node_new_enclosure_if_else(Node* cond, Node* Then, Node* Else)
node_new_bag_if_else(Node* cond, Node* Then, Node* Else)
{
Node* n;
n = node_new_enclosure(ENCLOSURE_IF_ELSE);
n = node_new_bag(BAG_IF_ELSE);
CHECK_NULL_RETURN(n);
NODE_BODY(n) = cond;
ENCLOSURE_(n)->te.Then = Then;
ENCLOSURE_(n)->te.Else = Else;
BAG_(n)->te.Then = Then;
BAG_(n)->te.Else = Else;
return n;
}
static Node*
node_new_memory(int is_named)
{
Node* node = node_new_enclosure(ENCLOSURE_MEMORY);
Node* node = node_new_bag(BAG_MEMORY);
CHECK_NULL_RETURN(node);
if (is_named != 0)
NODE_STATUS_ADD(node, NAMED_GROUP);
@ -2450,12 +2450,37 @@ node_new_memory(int is_named)
static Node*
node_new_option(OnigOptionType option)
{
Node* node = node_new_enclosure(ENCLOSURE_OPTION);
Node* node = node_new_bag(BAG_OPTION);
CHECK_NULL_RETURN(node);
ENCLOSURE_(node)->o.options = option;
BAG_(node)->o.options = option;
return node;
}
static Node*
node_new_group(Node* content)
{
Node* node;
node = node_new();
CHECK_NULL_RETURN(node);
NODE_SET_TYPE(node, NODE_LIST);
NODE_CAR(node) = content;
NODE_CDR(node) = NULL_NODE;
return node;
}
static Node*
node_drop_group(Node* group)
{
Node* content;
content = NODE_CAR(group);
NODE_CAR(group) = NULL_NODE;
onig_node_free(group);
return content;
}
static int
node_new_fail(Node** node, ScanEnv* env)
{
@ -2546,7 +2571,7 @@ onig_free_reg_callout_list(int n, CalloutListEntry* list)
extern CalloutListEntry*
onig_reg_callout_list_at(regex_t* reg, int num)
{
RegexExt* ext = REG_EXTP(reg);
RegexExt* ext = reg->extp;
CHECK_NULL_RETURN(ext);
if (num <= 0 || num > ext->callout_num)
@ -2637,7 +2662,7 @@ make_extended_grapheme_cluster(Node** node, ScanEnv* env)
ns[1] = NULL_NODE;
r = ONIGERR_MEMORY;
ns[0] = onig_node_new_anchor(ANCHOR_NO_EXTENDED_GRAPHEME_CLUSTER_BOUNDARY, 0);
ns[0] = onig_node_new_anchor(ANCR_NO_EXTENDED_GRAPHEME_CLUSTER_BOUNDARY, 0);
if (IS_NULL(ns[0])) goto err;
r = node_new_true_anychar(&ns[1], env);
@ -2664,7 +2689,7 @@ make_extended_grapheme_cluster(Node** node, ScanEnv* env)
ns[0] = x;
ns[1] = NULL_NODE;
x = node_new_enclosure(ENCLOSURE_STOP_BACKTRACK);
x = node_new_bag(BAG_STOP_BACKTRACK);
if (IS_NULL(x)) goto err;
NODE_BODY(x) = ns[0];
@ -2724,7 +2749,7 @@ make_absent_engine(Node** node, int pre_save_right_id, Node* absent,
ns[0] = x;
if (possessive != 0) {
x = node_new_enclosure(ENCLOSURE_STOP_BACKTRACK);
x = node_new_bag(BAG_STOP_BACKTRACK);
if (IS_NULL(x)) goto err0;
NODE_BODY(x) = ns[0];
@ -2876,11 +2901,11 @@ is_simple_one_char_repeat(Node* node, Node** rquant, Node** rbody,
quant = node;
}
else {
if (NODE_TYPE(node) == NODE_ENCLOSURE) {
EnclosureNode* en = ENCLOSURE_(node);
if (en->type == ENCLOSURE_STOP_BACKTRACK) {
if (NODE_TYPE(node) == NODE_BAG) {
BagNode* en = BAG_(node);
if (en->type == BAG_STOP_BACKTRACK) {
*is_possessive = 1;
quant = NODE_ENCLOSURE_BODY(en);
quant = NODE_BAG_BODY(en);
if (NODE_TYPE(quant) != NODE_QUANT)
return 0;
}
@ -3057,7 +3082,7 @@ make_absent_tree(Node** node, Node* absent, Node* expr, int is_range_cutter,
else {
r = make_absent_tail(&ns[5], &ns[6], id1, env);
if (r != 0) goto err;
x = make_list(7, ns);
if (IS_NULL(x)) goto err0;
}
@ -3069,7 +3094,7 @@ make_absent_tree(Node** node, Node* absent, Node* expr, int is_range_cutter,
r = ONIGERR_MEMORY;
err:
for (i = 0; i < 7; i++) onig_node_free(ns[i]);
return r;
return r;
}
extern int
@ -3080,11 +3105,11 @@ onig_node_str_cat(Node* node, const UChar* s, const UChar* end)
if (addlen > 0) {
int len = (int )(STR_(node)->end - STR_(node)->s);
if (STR_(node)->capa > 0 || (len + addlen > NODE_STRING_BUF_SIZE - 1)) {
if (STR_(node)->capacity > 0 || (len + addlen > NODE_STRING_BUF_SIZE - 1)) {
UChar* p;
int capa = len + addlen + NODE_STRING_MARGIN;
if (capa <= STR_(node)->capa) {
if (capa <= STR_(node)->capacity) {
onig_strcpy(STR_(node)->s + len, s, end);
}
else {
@ -3095,8 +3120,8 @@ onig_node_str_cat(Node* node, const UChar* s, const UChar* end)
p = strcat_capa(STR_(node)->s, STR_(node)->end, s, end, capa);
CHECK_NULL_RETURN_MEMERR(p);
STR_(node)->s = p;
STR_(node)->capa = capa;
STR_(node)->s = p;
STR_(node)->capacity = capa;
}
}
else {
@ -3128,24 +3153,24 @@ extern void
onig_node_conv_to_str_node(Node* node, int flag)
{
NODE_SET_TYPE(node, NODE_STRING);
STR_(node)->flag = flag;
STR_(node)->capa = 0;
STR_(node)->s = STR_(node)->buf;
STR_(node)->end = STR_(node)->buf;
STR_(node)->flag = flag;
STR_(node)->capacity = 0;
STR_(node)->s = STR_(node)->buf;
STR_(node)->end = STR_(node)->buf;
}
extern void
onig_node_str_clear(Node* node)
{
if (STR_(node)->capa != 0 &&
if (STR_(node)->capacity != 0 &&
IS_NOT_NULL(STR_(node)->s) && STR_(node)->s != STR_(node)->buf) {
xfree(STR_(node)->s);
}
STR_(node)->capa = 0;
STR_(node)->flag = 0;
STR_(node)->s = STR_(node)->buf;
STR_(node)->end = STR_(node)->buf;
STR_(node)->capacity = 0;
STR_(node)->flag = 0;
STR_(node)->s = STR_(node)->buf;
STR_(node)->end = STR_(node)->buf;
}
static Node*
@ -3155,10 +3180,10 @@ node_new_str(const UChar* s, const UChar* end)
CHECK_NULL_RETURN(node);
NODE_SET_TYPE(node, NODE_STRING);
STR_(node)->capa = 0;
STR_(node)->flag = 0;
STR_(node)->s = STR_(node)->buf;
STR_(node)->end = STR_(node)->buf;
STR_(node)->capacity = 0;
STR_(node)->flag = 0;
STR_(node)->s = STR_(node)->buf;
STR_(node)->end = STR_(node)->buf;
if (onig_node_str_cat(node, s, end)) {
onig_node_free(node);
return NULL;
@ -3176,6 +3201,7 @@ static Node*
node_new_str_raw(UChar* s, UChar* end)
{
Node* node = node_new_str(s, end);
CHECK_NULL_RETURN(node);
NODE_STRING_SET_RAW(node);
return node;
}
@ -3208,6 +3234,7 @@ str_node_split_last_char(Node* node, OnigEncoding enc)
p = onigenc_get_prev_char_head(enc, sn->s, sn->end);
if (p && p > sn->s) { /* can be split. */
rn = node_new_str(p, sn->end);
CHECK_NULL_RETURN(rn);
if (NODE_STRING_IS_RAW(node))
NODE_STRING_SET_RAW(rn);
@ -3798,7 +3825,7 @@ is_invalid_quantifier_target(Node* node)
return 1;
break;
case NODE_ENCLOSURE:
case NODE_BAG:
/* allow enclosed elements */
/* return is_invalid_quantifier_target(NODE_BODY(node)); */
break;
@ -3880,7 +3907,7 @@ onig_reduce_nested_quantifier(Node* pnode, Node* cnode)
if (pnum < 0 || cnum < 0) {
if ((p->lower == p->upper) && ! IS_REPEAT_INFINITE(p->upper)) {
if ((c->lower == c->upper) && ! IS_REPEAT_INFINITE(c->upper)) {
int n = positive_int_multiply(p->lower, c->lower);
int n = onig_positive_int_multiply(p->lower, c->lower);
if (n >= 0) {
p->lower = p->upper = n;
NODE_BODY(pnode) = NODE_BODY(cnode);
@ -3975,7 +4002,7 @@ node_new_general_newline(Node** node, ScanEnv* env)
if (r != 0) goto err1;
}
x = node_new_enclosure_if_else(crnl, 0, ncc);
x = node_new_bag_if_else(crnl, 0, ncc);
if (IS_NULL(x)) goto err1;
*node = x;
@ -4555,7 +4582,7 @@ find_str_position(OnigCodePoint s[], int n, UChar* from, UChar* to,
OnigCodePoint x;
UChar *q;
UChar *p = from;
while (p < to) {
x = ONIGENC_MBC_TO_CODE(enc, p, to);
q = p + enclen(enc, p);
@ -4704,12 +4731,12 @@ fetch_token_in_cc(OnigToken* tok, UChar** src, UChar* end, ScanEnv* env)
IS_SYNTAX_OP2(syn, ONIG_SYN_OP2_ESC_P_BRACE_CHAR_PROPERTY)) {
PINC;
tok->type = TK_CHAR_PROPERTY;
tok->u.prop.not = (c == 'P' ? 1 : 0);
tok->u.prop.not = c == 'P';
if (!PEND && IS_SYNTAX_OP2(syn, ONIG_SYN_OP2_ESC_P_BRACE_CIRCUMFLEX_NOT)) {
PFETCH(c2);
if (c2 == '^') {
tok->u.prop.not = (tok->u.prop.not == 0 ? 1 : 0);
tok->u.prop.not = tok->u.prop.not == 0;
}
else
PUNFETCH;
@ -4989,38 +5016,38 @@ fetch_token(OnigToken* tok, UChar** src, UChar* end, ScanEnv* env)
case 'b':
if (! IS_SYNTAX_OP(syn, ONIG_SYN_OP_ESC_B_WORD_BOUND)) break;
tok->type = TK_ANCHOR;
tok->u.anchor = ANCHOR_WORD_BOUNDARY;
tok->u.anchor = ANCR_WORD_BOUNDARY;
break;
case 'B':
if (! IS_SYNTAX_OP(syn, ONIG_SYN_OP_ESC_B_WORD_BOUND)) break;
tok->type = TK_ANCHOR;
tok->u.anchor = ANCHOR_NO_WORD_BOUNDARY;
tok->u.anchor = ANCR_NO_WORD_BOUNDARY;
break;
case 'y':
if (! IS_SYNTAX_OP(syn, ONIG_SYN_OP2_ESC_X_Y_GRAPHEME_CLUSTER)) break;
tok->type = TK_ANCHOR;
tok->u.anchor = ANCHOR_EXTENDED_GRAPHEME_CLUSTER_BOUNDARY;
tok->u.anchor = ANCR_EXTENDED_GRAPHEME_CLUSTER_BOUNDARY;
break;
case 'Y':
if (! IS_SYNTAX_OP(syn, ONIG_SYN_OP2_ESC_X_Y_GRAPHEME_CLUSTER)) break;
tok->type = TK_ANCHOR;
tok->u.anchor = ANCHOR_NO_EXTENDED_GRAPHEME_CLUSTER_BOUNDARY;
tok->u.anchor = ANCR_NO_EXTENDED_GRAPHEME_CLUSTER_BOUNDARY;
break;
#ifdef USE_WORD_BEGIN_END
case '<':
if (! IS_SYNTAX_OP(syn, ONIG_SYN_OP_ESC_LTGT_WORD_BEGIN_END)) break;
tok->type = TK_ANCHOR;
tok->u.anchor = ANCHOR_WORD_BEGIN;
tok->u.anchor = ANCR_WORD_BEGIN;
break;
case '>':
if (! IS_SYNTAX_OP(syn, ONIG_SYN_OP_ESC_LTGT_WORD_BEGIN_END)) break;
tok->type = TK_ANCHOR;
tok->u.anchor = ANCHOR_WORD_END;
tok->u.anchor = ANCR_WORD_END;
break;
#endif
@ -5095,26 +5122,26 @@ fetch_token(OnigToken* tok, UChar** src, UChar* end, ScanEnv* env)
if (! IS_SYNTAX_OP(syn, ONIG_SYN_OP_ESC_AZ_BUF_ANCHOR)) break;
begin_buf:
tok->type = TK_ANCHOR;
tok->u.subtype = ANCHOR_BEGIN_BUF;
tok->u.subtype = ANCR_BEGIN_BUF;
break;
case 'Z':
if (! IS_SYNTAX_OP(syn, ONIG_SYN_OP_ESC_AZ_BUF_ANCHOR)) break;
tok->type = TK_ANCHOR;
tok->u.subtype = ANCHOR_SEMI_END_BUF;
tok->u.subtype = ANCR_SEMI_END_BUF;
break;
case 'z':
if (! IS_SYNTAX_OP(syn, ONIG_SYN_OP_ESC_AZ_BUF_ANCHOR)) break;
end_buf:
tok->type = TK_ANCHOR;
tok->u.subtype = ANCHOR_END_BUF;
tok->u.subtype = ANCR_END_BUF;
break;
case 'G':
if (! IS_SYNTAX_OP(syn, ONIG_SYN_OP_ESC_CAPITAL_G_BEGIN_ANCHOR)) break;
tok->type = TK_ANCHOR;
tok->u.subtype = ANCHOR_BEGIN_POSITION;
tok->u.subtype = ANCR_BEGIN_POSITION;
break;
case '`':
@ -5217,7 +5244,7 @@ fetch_token(OnigToken* tok, UChar** src, UChar* end, ScanEnv* env)
goto skip_backref;
}
if (IS_SYNTAX_OP(syn, ONIG_SYN_OP_DECIMAL_BACKREF) &&
if (IS_SYNTAX_OP(syn, ONIG_SYN_OP_DECIMAL_BACKREF) &&
(num <= env->num_mem || num <= 9)) { /* This spec. from GNU regex */
if (IS_SYNTAX_BV(syn, ONIG_SYN_STRICT_CHECK_BACKREF)) {
if (num > env->num_mem || IS_NULL(SCANENV_MEMENV(env)[num].node))
@ -5385,13 +5412,13 @@ fetch_token(OnigToken* tok, UChar** src, UChar* end, ScanEnv* env)
IS_SYNTAX_OP2(syn, ONIG_SYN_OP2_ESC_P_BRACE_CHAR_PROPERTY)) {
PINC;
tok->type = TK_CHAR_PROPERTY;
tok->u.prop.not = (c == 'P' ? 1 : 0);
tok->u.prop.not = c == 'P';
if (!PEND &&
IS_SYNTAX_OP2(syn, ONIG_SYN_OP2_ESC_P_BRACE_CIRCUMFLEX_NOT)) {
PFETCH(c);
if (c == '^') {
tok->u.prop.not = (tok->u.prop.not == 0 ? 1 : 0);
tok->u.prop.not = tok->u.prop.not == 0;
}
else
PUNFETCH;
@ -5611,14 +5638,14 @@ fetch_token(OnigToken* tok, UChar** src, UChar* end, ScanEnv* env)
if (! IS_SYNTAX_OP(syn, ONIG_SYN_OP_LINE_ANCHOR)) break;
tok->type = TK_ANCHOR;
tok->u.subtype = (IS_SINGLELINE(env->options)
? ANCHOR_BEGIN_BUF : ANCHOR_BEGIN_LINE);
? ANCR_BEGIN_BUF : ANCR_BEGIN_LINE);
break;
case '$':
if (! IS_SYNTAX_OP(syn, ONIG_SYN_OP_LINE_ANCHOR)) break;
tok->type = TK_ANCHOR;
tok->u.subtype = (IS_SINGLELINE(env->options)
? ANCHOR_SEMI_END_BUF : ANCHOR_END_LINE);
? ANCR_SEMI_END_BUF : ANCR_END_LINE);
break;
case '[':
@ -6514,7 +6541,7 @@ parse_char_class(Node** np, OnigToken* tok, UChar** src, UChar* end, ScanEnv* en
}
static int parse_subexp(Node** top, OnigToken* tok, int term,
UChar** src, UChar* end, ScanEnv* env);
UChar** src, UChar* end, ScanEnv* env, int group_head);
#ifdef USE_CALLOUT
@ -6610,6 +6637,7 @@ parse_callout_of_contents(Node** np, int cterm, UChar** src, UChar* end, ScanEnv
if (r != 0) return r;
ext = onig_get_regex_ext(env->reg);
CHECK_NULL_RETURN_MEMERR(ext);
if (IS_NULL(ext->pattern)) {
r = onig_ext_set_pattern(env->reg, env->pattern, env->pattern_end);
if (r != ONIG_NORMAL) return r;
@ -6630,6 +6658,11 @@ parse_callout_of_contents(Node** np, int cterm, UChar** src, UChar* end, ScanEnv
}
e = onig_reg_callout_list_at(env->reg, num);
if (IS_NULL(e)) {
xfree(contents);
return ONIGERR_MEMORY;
}
e->of = ONIG_CALLOUT_OF_CONTENTS;
e->in = in;
e->name_id = ONIG_NON_NAME_ID;
@ -6925,6 +6958,7 @@ parse_callout_of_name(Node** np, int cterm, UChar** src, UChar* end, ScanEnv* en
if (r != 0) return r;
ext = onig_get_regex_ext(env->reg);
CHECK_NULL_RETURN_MEMERR(ext);
if (IS_NULL(ext->pattern)) {
r = onig_ext_set_pattern(env->reg, env->pattern, env->pattern_end);
if (r != ONIG_NORMAL) return r;
@ -6939,6 +6973,8 @@ parse_callout_of_name(Node** np, int cterm, UChar** src, UChar* end, ScanEnv* en
if (r != ONIG_NORMAL) return r;
e = onig_reg_callout_list_at(env->reg, num);
CHECK_NULL_RETURN_MEMERR(e);
e->of = ONIG_CALLOUT_OF_NAME;
e->in = in;
e->name_id = name_id;
@ -6962,8 +6998,8 @@ parse_callout_of_name(Node** np, int cterm, UChar** src, UChar* end, ScanEnv* en
#endif
static int
parse_enclosure(Node** np, OnigToken* tok, int term, UChar** src, UChar* end,
ScanEnv* env)
parse_bag(Node** np, OnigToken* tok, int term, UChar** src, UChar* end,
ScanEnv* env)
{
int r, num;
Node *target;
@ -6990,20 +7026,20 @@ parse_enclosure(Node** np, OnigToken* tok, int term, UChar** src, UChar* end,
group:
r = fetch_token(tok, &p, end, env);
if (r < 0) return r;
r = parse_subexp(np, tok, term, &p, end, env);
r = parse_subexp(np, tok, term, &p, end, env, 0);
if (r < 0) return r;
*src = p;
return 1; /* group */
break;
case '=':
*np = onig_node_new_anchor(ANCHOR_PREC_READ, 0);
*np = onig_node_new_anchor(ANCR_PREC_READ, 0);
break;
case '!': /* preceding read */
*np = onig_node_new_anchor(ANCHOR_PREC_READ_NOT, 0);
*np = onig_node_new_anchor(ANCR_PREC_READ_NOT, 0);
break;
case '>': /* (?>...) stop backtrack */
*np = node_new_enclosure(ENCLOSURE_STOP_BACKTRACK);
*np = node_new_bag(BAG_STOP_BACKTRACK);
break;
case '\'':
@ -7018,9 +7054,9 @@ parse_enclosure(Node** np, OnigToken* tok, int term, UChar** src, UChar* end,
if (PEND) return ONIGERR_END_PATTERN_WITH_UNMATCHED_PARENTHESIS;
PFETCH(c);
if (c == '=')
*np = onig_node_new_anchor(ANCHOR_LOOK_BEHIND, 0);
*np = onig_node_new_anchor(ANCR_LOOK_BEHIND, 0);
else if (c == '!')
*np = onig_node_new_anchor(ANCHOR_LOOK_BEHIND_NOT, 0);
*np = onig_node_new_anchor(ANCR_LOOK_BEHIND_NOT, 0);
else {
if (IS_SYNTAX_OP2(env->syntax, ONIG_SYN_OP2_QMARK_LT_NAMED_GROUP)) {
UChar *name;
@ -7048,7 +7084,7 @@ parse_enclosure(Node** np, OnigToken* tok, int term, UChar** src, UChar* end,
if (r != 0) return r;
*np = node_new_memory(1);
CHECK_NULL_RETURN_MEMERR(*np);
ENCLOSURE_(*np)->m.regnum = num;
BAG_(*np)->m.regnum = num;
if (list_capture != 0)
MEM_STATUS_ON_SIMPLE(env->capture_history, num);
env->num_named++;
@ -7085,7 +7121,7 @@ parse_enclosure(Node** np, OnigToken* tok, int term, UChar** src, UChar* end,
r = fetch_token(tok, &p, end, env);
if (r < 0) return r;
r = parse_subexp(&absent, tok, term, &p, end, env);
r = parse_subexp(&absent, tok, term, &p, end, env, 1);
if (r < 0) {
onig_node_free(absent);
return r;
@ -7263,7 +7299,7 @@ parse_enclosure(Node** np, OnigToken* tok, int term, UChar** src, UChar* end,
condition_is_checker = 0;
r = fetch_token(tok, &p, end, env);
if (r < 0) return r;
r = parse_subexp(&condition, tok, term, &p, end, env);
r = parse_subexp(&condition, tok, term, &p, end, env, 0);
if (r < 0) {
onig_node_free(condition);
return r;
@ -7304,7 +7340,7 @@ parse_enclosure(Node** np, OnigToken* tok, int term, UChar** src, UChar* end,
onig_node_free(condition);
return r;
}
r = parse_subexp(&target, tok, term, &p, end, env);
r = parse_subexp(&target, tok, term, &p, end, env, 1);
if (r < 0) {
onig_node_free(condition);
onig_node_free(target);
@ -7332,7 +7368,7 @@ parse_enclosure(Node** np, OnigToken* tok, int term, UChar** src, UChar* end,
}
}
*np = node_new_enclosure_if_else(condition, Then, Else);
*np = node_new_bag_if_else(condition, Then, Else);
if (IS_NULL(*np)) {
onig_node_free(condition);
onig_node_free(Then);
@ -7367,7 +7403,7 @@ parse_enclosure(Node** np, OnigToken* tok, int term, UChar** src, UChar* end,
else if (num >= (int )MEM_STATUS_BITS_NUM) {
return ONIGERR_GROUP_NUMBER_OVER_FOR_CAPTURE_HISTORY;
}
ENCLOSURE_(*np)->m.regnum = num;
BAG_(*np)->m.regnum = num;
MEM_STATUS_ON_SIMPLE(env->capture_history, num);
}
else {
@ -7436,7 +7472,7 @@ parse_enclosure(Node** np, OnigToken* tok, int term, UChar** src, UChar* end,
env->options = option;
r = fetch_token(tok, &p, end, env);
if (r < 0) return r;
r = parse_subexp(&target, tok, term, &p, end, env);
r = parse_subexp(&target, tok, term, &p, end, env, 0);
env->options = prev;
if (r < 0) {
onig_node_free(target);
@ -7477,13 +7513,13 @@ parse_enclosure(Node** np, OnigToken* tok, int term, UChar** src, UChar* end,
CHECK_NULL_RETURN_MEMERR(*np);
num = scan_env_add_mem_entry(env);
if (num < 0) return num;
ENCLOSURE_(*np)->m.regnum = num;
BAG_(*np)->m.regnum = num;
}
CHECK_NULL_RETURN_MEMERR(*np);
r = fetch_token(tok, &p, end, env);
if (r < 0) return r;
r = parse_subexp(&target, tok, term, &p, end, env);
r = parse_subexp(&target, tok, term, &p, end, env, 0);
if (r < 0) {
onig_node_free(target);
return r;
@ -7491,10 +7527,10 @@ parse_enclosure(Node** np, OnigToken* tok, int term, UChar** src, UChar* end,
NODE_BODY(*np) = target;
if (NODE_TYPE(*np) == NODE_ENCLOSURE) {
if (ENCLOSURE_(*np)->type == ENCLOSURE_MEMORY) {
if (NODE_TYPE(*np) == NODE_BAG) {
if (BAG_(*np)->type == BAG_MEMORY) {
/* Don't move this to previous of parse_subexp() */
r = scan_env_set_mem_node(env, ENCLOSURE_(*np)->m.regnum, *np);
r = scan_env_set_mem_node(env, BAG_(*np)->m.regnum, *np);
if (r != 0) return r;
}
}
@ -7523,7 +7559,7 @@ set_quantifier(Node* qnode, Node* target, int group, ScanEnv* env)
switch (NODE_TYPE(target)) {
case NODE_STRING:
if (! group) {
if (group == 0) {
if (str_node_can_be_split(target, env->enc)) {
Node* n = str_node_split_last_char(target, env->enc);
if (IS_NOT_NULL(n)) {
@ -7715,7 +7751,7 @@ i_apply_case_fold(OnigCodePoint from, OnigCodePoint to[], int to_len, void* arg)
static int
parse_exp(Node** np, OnigToken* tok, int term, UChar** src, UChar* end,
ScanEnv* env)
ScanEnv* env, int group_head)
{
int r, len, group = 0;
Node* qn;
@ -7729,22 +7765,35 @@ parse_exp(Node** np, OnigToken* tok, int term, UChar** src, UChar* end,
case TK_ALT:
case TK_EOT:
end_of_token:
*np = node_new_empty();
return tok->type;
*np = node_new_empty();
CHECK_NULL_RETURN_MEMERR(*np);
return tok->type;
break;
case TK_SUBEXP_OPEN:
r = parse_enclosure(np, tok, TK_SUBEXP_CLOSE, src, end, env);
r = parse_bag(np, tok, TK_SUBEXP_CLOSE, src, end, env);
if (r < 0) return r;
if (r == 1) group = 1;
if (r == 1) { /* group */
if (group_head == 0)
group = 1;
else {
Node* target = *np;
*np = node_new_group(target);
if (IS_NULL(*np)) {
onig_node_free(target);
return ONIGERR_MEMORY;
}
group = 2;
}
}
else if (r == 2) { /* option only */
Node* target;
OnigOptionType prev = env->options;
env->options = ENCLOSURE_(*np)->o.options;
env->options = BAG_(*np)->o.options;
r = fetch_token(tok, src, end, env);
if (r < 0) return r;
r = parse_subexp(&target, tok, term, src, end, env);
r = parse_subexp(&target, tok, term, src, end, env, 0);
env->options = prev;
if (r < 0) {
onig_node_free(target);
@ -7973,6 +8022,7 @@ parse_exp(Node** np, OnigToken* tok, int term, UChar** src, UChar* end,
int ascii_mode =
IS_WORD_ASCII(env->options) && IS_WORD_ANCHOR_TYPE(tok->u.anchor) ? 1 : 0;
*np = onig_node_new_anchor(tok->u.anchor, ascii_mode);
CHECK_NULL_RETURN_MEMERR(*np);
}
break;
@ -7981,8 +8031,10 @@ parse_exp(Node** np, OnigToken* tok, int term, UChar** src, UChar* end,
if (IS_SYNTAX_BV(env->syntax, ONIG_SYN_CONTEXT_INDEP_REPEAT_OPS)) {
if (IS_SYNTAX_BV(env->syntax, ONIG_SYN_CONTEXT_INVALID_REPEAT_OPS))
return ONIGERR_TARGET_OF_REPEAT_OPERATOR_NOT_SPECIFIED;
else
else {
*np = node_new_empty();
CHECK_NULL_RETURN_MEMERR(*np);
}
}
else {
goto tk_byte;
@ -8028,14 +8080,23 @@ parse_exp(Node** np, OnigToken* tok, int term, UChar** src, UChar* end,
repeat:
if (r == TK_OP_REPEAT || r == TK_INTERVAL) {
Node* target;
if (is_invalid_quantifier_target(*targetp))
return ONIGERR_TARGET_OF_REPEAT_OPERATOR_INVALID;
qn = node_new_quantifier(tok->u.repeat.lower, tok->u.repeat.upper,
(r == TK_INTERVAL ? 1 : 0));
r == TK_INTERVAL);
CHECK_NULL_RETURN_MEMERR(qn);
QUANT_(qn)->greedy = tok->u.repeat.greedy;
r = set_quantifier(qn, *targetp, group, env);
if (group == 2) {
target = node_drop_group(*np);
*np = NULL_NODE;
}
else {
target = *targetp;
}
r = set_quantifier(qn, target, group, env);
if (r < 0) {
onig_node_free(qn);
return r;
@ -8043,7 +8104,7 @@ parse_exp(Node** np, OnigToken* tok, int term, UChar** src, UChar* end,
if (tok->u.repeat.possessive != 0) {
Node* en;
en = node_new_enclosure(ENCLOSURE_STOP_BACKTRACK);
en = node_new_bag(BAG_STOP_BACKTRACK);
if (IS_NULL(en)) {
onig_node_free(qn);
return ONIGERR_MEMORY;
@ -8082,13 +8143,13 @@ parse_exp(Node** np, OnigToken* tok, int term, UChar** src, UChar* end,
static int
parse_branch(Node** top, OnigToken* tok, int term, UChar** src, UChar* end,
ScanEnv* env)
ScanEnv* env, int group_head)
{
int r;
Node *node, **headp;
*top = NULL;
r = parse_exp(&node, tok, term, src, end, env);
r = parse_exp(&node, tok, term, src, end, env, group_head);
if (r < 0) {
onig_node_free(node);
return r;
@ -8099,9 +8160,14 @@ parse_branch(Node** top, OnigToken* tok, int term, UChar** src, UChar* end,
}
else {
*top = node_new_list(node, NULL);
if (IS_NULL(*top)) {
onig_node_free(node);
return ONIGERR_MEMORY;
}
headp = &(NODE_CDR(*top));
while (r != TK_EOT && r != term && r != TK_ALT) {
r = parse_exp(&node, tok, term, src, end, env);
r = parse_exp(&node, tok, term, src, end, env, 0);
if (r < 0) {
onig_node_free(node);
return r;
@ -8125,7 +8191,7 @@ parse_branch(Node** top, OnigToken* tok, int term, UChar** src, UChar* end,
/* term_tok: TK_EOT or TK_SUBEXP_CLOSE */
static int
parse_subexp(Node** top, OnigToken* tok, int term, UChar** src, UChar* end,
ScanEnv* env)
ScanEnv* env, int group_head)
{
int r;
Node *node, **headp;
@ -8134,7 +8200,8 @@ parse_subexp(Node** top, OnigToken* tok, int term, UChar** src, UChar* end,
env->parse_depth++;
if (env->parse_depth > ParseDepthLimit)
return ONIGERR_PARSE_DEPTH_LIMIT_OVER;
r = parse_branch(&node, tok, term, src, end, env);
r = parse_branch(&node, tok, term, src, end, env, group_head);
if (r < 0) {
onig_node_free(node);
return r;
@ -8145,16 +8212,27 @@ parse_subexp(Node** top, OnigToken* tok, int term, UChar** src, UChar* end,
}
else if (r == TK_ALT) {
*top = onig_node_new_alt(node, NULL);
if (IS_NULL(*top)) {
onig_node_free(node);
return ONIGERR_MEMORY;
}
headp = &(NODE_CDR(*top));
while (r == TK_ALT) {
r = fetch_token(tok, src, end, env);
if (r < 0) return r;
r = parse_branch(&node, tok, term, src, end, env);
r = parse_branch(&node, tok, term, src, end, env, 0);
if (r < 0) {
onig_node_free(node);
return r;
}
*headp = onig_node_new_alt(node, NULL);
if (IS_NULL(*headp)) {
onig_node_free(node);
onig_node_free(*top);
return ONIGERR_MEMORY;
}
headp = &(NODE_CDR(*headp));
}
@ -8182,7 +8260,7 @@ parse_regexp(Node** top, UChar** src, UChar* end, ScanEnv* env)
r = fetch_token(&tok, src, end, env);
if (r < 0) return r;
r = parse_subexp(top, &tok, TK_EOT, src, end, env);
r = parse_subexp(top, &tok, TK_EOT, src, end, env, 0);
if (r < 0) return r;
return 0;
@ -8198,7 +8276,7 @@ make_call_zero_body(Node* node, ScanEnv* env, Node** rnode)
CHECK_NULL_RETURN_MEMERR(x);
NODE_BODY(x) = node;
ENCLOSURE_(x)->m.regnum = 0;
BAG_(x)->m.regnum = 0;
r = scan_env_set_mem_node(env, 0, x);
if (r != 0) {
onig_node_free(x);
@ -8254,7 +8332,7 @@ onig_parse_tree(Node** root, const UChar* pattern, const UChar* end,
reg->num_mem = env->num_mem;
#ifdef USE_CALLOUT
ext = REG_EXTP(reg);
ext = reg->extp;
if (IS_NOT_NULL(ext) && ext->callout_num > 0) {
r = setup_ext_callout_list_values(reg);
}


@ -31,6 +31,10 @@
#include "regint.h"
#define NODE_STRING_MARGIN 16
#define NODE_STRING_BUF_SIZE 24 /* sizeof(CClassNode) - sizeof(int)*4 */
#define NODE_BACKREFS_SIZE 6
/* node type */
typedef enum {
NODE_STRING = 0,
@ -38,7 +42,7 @@ typedef enum {
NODE_CTYPE = 2,
NODE_BACKREF = 3,
NODE_QUANT = 4,
NODE_ENCLOSURE = 5,
NODE_BAG = 5,
NODE_ANCHOR = 6,
NODE_LIST = 7,
NODE_ALT = 8,
@ -46,95 +50,23 @@ typedef enum {
NODE_GIMMICK = 10
} NodeType;
enum BagType {
BAG_MEMORY = 0,
BAG_OPTION = 1,
BAG_STOP_BACKTRACK = 2,
BAG_IF_ELSE = 3,
};
enum GimmickType {
GIMMICK_FAIL = 0,
GIMMICK_KEEP = 1,
GIMMICK_SAVE = 2,
GIMMICK_FAIL = 0,
GIMMICK_KEEP = 1,
GIMMICK_SAVE = 2,
GIMMICK_UPDATE_VAR = 3,
#ifdef USE_CALLOUT
GIMMICK_CALLOUT = 4,
GIMMICK_CALLOUT = 4,
#endif
};
/* node type bit */
#define NODE_TYPE2BIT(type) (1<<(type))
#define NODE_BIT_STRING NODE_TYPE2BIT(NODE_STRING)
#define NODE_BIT_CCLASS NODE_TYPE2BIT(NODE_CCLASS)
#define NODE_BIT_CTYPE NODE_TYPE2BIT(NODE_CTYPE)
#define NODE_BIT_BACKREF NODE_TYPE2BIT(NODE_BACKREF)
#define NODE_BIT_QUANT NODE_TYPE2BIT(NODE_QUANT)
#define NODE_BIT_ENCLOSURE NODE_TYPE2BIT(NODE_ENCLOSURE)
#define NODE_BIT_ANCHOR NODE_TYPE2BIT(NODE_ANCHOR)
#define NODE_BIT_LIST NODE_TYPE2BIT(NODE_LIST)
#define NODE_BIT_ALT NODE_TYPE2BIT(NODE_ALT)
#define NODE_BIT_CALL NODE_TYPE2BIT(NODE_CALL)
#define NODE_BIT_GIMMICK NODE_TYPE2BIT(NODE_GIMMICK)
#define NODE_IS_SIMPLE_TYPE(node) \
((NODE_TYPE2BIT(NODE_TYPE(node)) & \
(NODE_BIT_STRING | NODE_BIT_CCLASS | NODE_BIT_CTYPE | NODE_BIT_BACKREF)) != 0)
#define NODE_TYPE(node) ((node)->u.base.node_type)
#define NODE_SET_TYPE(node, ntype) (node)->u.base.node_type = (ntype)
#define STR_(node) (&((node)->u.str))
#define CCLASS_(node) (&((node)->u.cclass))
#define CTYPE_(node) (&((node)->u.ctype))
#define BACKREF_(node) (&((node)->u.backref))
#define QUANT_(node) (&((node)->u.quant))
#define ENCLOSURE_(node) (&((node)->u.enclosure))
#define ANCHOR_(node) (&((node)->u.anchor))
#define CONS_(node) (&((node)->u.cons))
#define CALL_(node) (&((node)->u.call))
#define GIMMICK_(node) (&((node)->u.gimmick))
#define NODE_CAR(node) (CONS_(node)->car)
#define NODE_CDR(node) (CONS_(node)->cdr)
#define CTYPE_ANYCHAR -1
#define NODE_IS_ANYCHAR(node) \
(NODE_TYPE(node) == NODE_CTYPE && CTYPE_(node)->ctype == CTYPE_ANYCHAR)
#define CTYPE_OPTION(node, reg) \
(NODE_IS_FIXED_OPTION(node) ? CTYPE_(node)->options : reg->options)
#define ANCHOR_ANYCHAR_INF_MASK (ANCHOR_ANYCHAR_INF | ANCHOR_ANYCHAR_INF_ML)
#define ANCHOR_END_BUF_MASK (ANCHOR_END_BUF | ANCHOR_SEMI_END_BUF)
enum EnclosureType {
ENCLOSURE_MEMORY = 0,
ENCLOSURE_OPTION = 1,
ENCLOSURE_STOP_BACKTRACK = 2,
ENCLOSURE_IF_ELSE = 3,
};
#define NODE_STRING_MARGIN 16
#define NODE_STRING_BUF_SIZE 24 /* sizeof(CClassNode) - sizeof(int)*4 */
#define NODE_BACKREFS_SIZE 6
#define NODE_STRING_RAW (1<<0) /* by backslashed number */
#define NODE_STRING_AMBIG (1<<1)
#define NODE_STRING_DONT_GET_OPT_INFO (1<<2)
#define NODE_STRING_LEN(node) (int )((node)->u.str.end - (node)->u.str.s)
#define NODE_STRING_SET_RAW(node) (node)->u.str.flag |= NODE_STRING_RAW
#define NODE_STRING_CLEAR_RAW(node) (node)->u.str.flag &= ~NODE_STRING_RAW
#define NODE_STRING_SET_AMBIG(node) (node)->u.str.flag |= NODE_STRING_AMBIG
#define NODE_STRING_SET_DONT_GET_OPT_INFO(node) \
(node)->u.str.flag |= NODE_STRING_DONT_GET_OPT_INFO
#define NODE_STRING_IS_RAW(node) \
(((node)->u.str.flag & NODE_STRING_RAW) != 0)
#define NODE_STRING_IS_AMBIG(node) \
(((node)->u.str.flag & NODE_STRING_AMBIG) != 0)
#define NODE_STRING_IS_DONT_GET_OPT_INFO(node) \
(((node)->u.str.flag & NODE_STRING_DONT_GET_OPT_INFO) != 0)
#define BACKREFS_P(br) \
(IS_NOT_NULL((br)->back_dynamic) ? (br)->back_dynamic : (br)->back_static)
enum QuantBodyEmpty {
QUANT_BODY_IS_NOT_EMPTY = 0,
QUANT_BODY_IS_EMPTY = 1,
@ -142,65 +74,6 @@ enum QuantBodyEmpty {
QUANT_BODY_IS_EMPTY_REC = 3
};
/* node status bits */
#define NODE_ST_MIN_FIXED (1<<0)
#define NODE_ST_MAX_FIXED (1<<1)
#define NODE_ST_CLEN_FIXED (1<<2)
#define NODE_ST_MARK1 (1<<3)
#define NODE_ST_MARK2 (1<<4)
#define NODE_ST_STOP_BT_SIMPLE_REPEAT (1<<5)
#define NODE_ST_RECURSION (1<<6)
#define NODE_ST_CALLED (1<<7)
#define NODE_ST_ADDR_FIXED (1<<8)
#define NODE_ST_NAMED_GROUP (1<<9)
#define NODE_ST_IN_REAL_REPEAT (1<<10) /* STK_REPEAT is nested in stack. */
#define NODE_ST_IN_ZERO_REPEAT (1<<11) /* (....){0} */
#define NODE_ST_IN_MULTI_ENTRY (1<<12)
#define NODE_ST_NEST_LEVEL (1<<13)
#define NODE_ST_BY_NUMBER (1<<14) /* {n,m} */
#define NODE_ST_BY_NAME (1<<15) /* backref by name */
#define NODE_ST_BACKREF (1<<16)
#define NODE_ST_CHECKER (1<<17)
#define NODE_ST_FIXED_OPTION (1<<18)
#define NODE_ST_PROHIBIT_RECURSION (1<<19)
#define NODE_ST_SUPER (1<<20)
#define NODE_STATUS(node) (((Node* )node)->u.base.status)
#define NODE_STATUS_ADD(node,f) (NODE_STATUS(node) |= (NODE_ST_ ## f))
#define NODE_STATUS_REMOVE(node,f) (NODE_STATUS(node) &= ~(NODE_ST_ ## f))
#define NODE_IS_BY_NUMBER(node) ((NODE_STATUS(node) & NODE_ST_BY_NUMBER) != 0)
#define NODE_IS_IN_REAL_REPEAT(node) ((NODE_STATUS(node) & NODE_ST_IN_REAL_REPEAT) != 0)
#define NODE_IS_CALLED(node) ((NODE_STATUS(node) & NODE_ST_CALLED) != 0)
#define NODE_IS_IN_MULTI_ENTRY(node) ((NODE_STATUS(node) & NODE_ST_IN_MULTI_ENTRY) != 0)
#define NODE_IS_RECURSION(node) ((NODE_STATUS(node) & NODE_ST_RECURSION) != 0)
#define NODE_IS_IN_ZERO_REPEAT(node) ((NODE_STATUS(node) & NODE_ST_IN_ZERO_REPEAT) != 0)
#define NODE_IS_NAMED_GROUP(node) ((NODE_STATUS(node) & NODE_ST_NAMED_GROUP) != 0)
#define NODE_IS_ADDR_FIXED(node) ((NODE_STATUS(node) & NODE_ST_ADDR_FIXED) != 0)
#define NODE_IS_CLEN_FIXED(node) ((NODE_STATUS(node) & NODE_ST_CLEN_FIXED) != 0)
#define NODE_IS_MIN_FIXED(node) ((NODE_STATUS(node) & NODE_ST_MIN_FIXED) != 0)
#define NODE_IS_MAX_FIXED(node) ((NODE_STATUS(node) & NODE_ST_MAX_FIXED) != 0)
#define NODE_IS_MARK1(node) ((NODE_STATUS(node) & NODE_ST_MARK1) != 0)
#define NODE_IS_MARK2(node) ((NODE_STATUS(node) & NODE_ST_MARK2) != 0)
#define NODE_IS_NEST_LEVEL(node) ((NODE_STATUS(node) & NODE_ST_NEST_LEVEL) != 0)
#define NODE_IS_BY_NAME(node) ((NODE_STATUS(node) & NODE_ST_BY_NAME) != 0)
#define NODE_IS_BACKREF(node) ((NODE_STATUS(node) & NODE_ST_BACKREF) != 0)
#define NODE_IS_CHECKER(node) ((NODE_STATUS(node) & NODE_ST_CHECKER) != 0)
#define NODE_IS_FIXED_OPTION(node) ((NODE_STATUS(node) & NODE_ST_FIXED_OPTION) != 0)
#define NODE_IS_SUPER(node) ((NODE_STATUS(node) & NODE_ST_SUPER) != 0)
#define NODE_IS_PROHIBIT_RECURSION(node) \
((NODE_STATUS(node) & NODE_ST_PROHIBIT_RECURSION) != 0)
#define NODE_IS_STOP_BT_SIMPLE_REPEAT(node) \
((NODE_STATUS(node) & NODE_ST_STOP_BT_SIMPLE_REPEAT) != 0)
#define NODE_BODY(node) ((node)->u.base.body)
#define NODE_QUANT_BODY(node) ((node)->body)
#define NODE_ENCLOSURE_BODY(node) ((node)->body)
#define NODE_CALL_BODY(node) ((node)->body)
#define NODE_ANCHOR_BODY(node) ((node)->body)
typedef struct {
NodeType node_type;
int status;
@ -208,7 +81,7 @@ typedef struct {
UChar* s;
UChar* end;
unsigned int flag;
int capa; /* (allocated size - 1) or 0: use buf[] */
int capacity; /* (allocated size - 1) or 0: use buf[] */
UChar buf[NODE_STRING_BUF_SIZE];
} StrNode;
@ -240,7 +113,7 @@ typedef struct {
int status;
struct _Node* body;
enum EnclosureType type;
enum BagType type;
union {
struct {
int regnum;
@ -262,7 +135,7 @@ typedef struct {
OnigLen max_len; /* max length (byte) */
int char_len; /* character length */
int opt_count; /* referenced count in optimize_nodes() */
} EnclosureNode;
} BagNode;
#ifdef USE_CALL
@ -280,7 +153,7 @@ typedef struct {
typedef struct {
NodeType node_type;
int status;
struct _Node* body; /* to EnclosureNode : ENCLOSURE_MEMORY */
struct _Node* body; /* to BagNode : BAG_MEMORY */
int by_number;
int group_num;
@ -350,7 +223,7 @@ typedef struct _Node {
StrNode str;
CClassNode cclass;
QuantNode quant;
EnclosureNode enclosure;
BagNode bag;
BackRefNode backref;
AnchorNode anchor;
ConsAltNode cons;
@ -362,9 +235,138 @@ typedef struct _Node {
} u;
} Node;
#define NULL_NODE ((Node* )0)
/* node type bit */
#define NODE_TYPE2BIT(type) (1<<(type))
#define NODE_BIT_STRING NODE_TYPE2BIT(NODE_STRING)
#define NODE_BIT_CCLASS NODE_TYPE2BIT(NODE_CCLASS)
#define NODE_BIT_CTYPE NODE_TYPE2BIT(NODE_CTYPE)
#define NODE_BIT_BACKREF NODE_TYPE2BIT(NODE_BACKREF)
#define NODE_BIT_QUANT NODE_TYPE2BIT(NODE_QUANT)
#define NODE_BIT_BAG NODE_TYPE2BIT(NODE_BAG)
#define NODE_BIT_ANCHOR NODE_TYPE2BIT(NODE_ANCHOR)
#define NODE_BIT_LIST NODE_TYPE2BIT(NODE_LIST)
#define NODE_BIT_ALT NODE_TYPE2BIT(NODE_ALT)
#define NODE_BIT_CALL NODE_TYPE2BIT(NODE_CALL)
#define NODE_BIT_GIMMICK NODE_TYPE2BIT(NODE_GIMMICK)
#define NODE_IS_SIMPLE_TYPE(node) \
((NODE_TYPE2BIT(NODE_TYPE(node)) & \
(NODE_BIT_STRING | NODE_BIT_CCLASS | NODE_BIT_CTYPE | NODE_BIT_BACKREF)) != 0)
#define NODE_TYPE(node) ((node)->u.base.node_type)
#define NODE_SET_TYPE(node, ntype) (node)->u.base.node_type = (ntype)
#define STR_(node) (&((node)->u.str))
#define CCLASS_(node) (&((node)->u.cclass))
#define CTYPE_(node) (&((node)->u.ctype))
#define BACKREF_(node) (&((node)->u.backref))
#define QUANT_(node) (&((node)->u.quant))
#define BAG_(node) (&((node)->u.bag))
#define ANCHOR_(node) (&((node)->u.anchor))
#define CONS_(node) (&((node)->u.cons))
#define CALL_(node) (&((node)->u.call))
#define GIMMICK_(node) (&((node)->u.gimmick))
#define NODE_CAR(node) (CONS_(node)->car)
#define NODE_CDR(node) (CONS_(node)->cdr)
#define CTYPE_ANYCHAR -1
#define NODE_IS_ANYCHAR(node) \
(NODE_TYPE(node) == NODE_CTYPE && CTYPE_(node)->ctype == CTYPE_ANYCHAR)
#define CTYPE_OPTION(node, reg) \
(NODE_IS_FIXED_OPTION(node) ? CTYPE_(node)->options : reg->options)
#define ANCR_ANYCHAR_INF_MASK (ANCR_ANYCHAR_INF | ANCR_ANYCHAR_INF_ML)
#define ANCR_END_BUF_MASK (ANCR_END_BUF | ANCR_SEMI_END_BUF)
#define NODE_STRING_RAW (1<<0) /* by backslashed number */
#define NODE_STRING_AMBIG (1<<1)
#define NODE_STRING_GOOD_AMBIG (1<<2)
#define NODE_STRING_DONT_GET_OPT_INFO (1<<3)
#define NODE_STRING_LEN(node) (int )((node)->u.str.end - (node)->u.str.s)
#define NODE_STRING_SET_RAW(node) (node)->u.str.flag |= NODE_STRING_RAW
#define NODE_STRING_CLEAR_RAW(node) (node)->u.str.flag &= ~NODE_STRING_RAW
#define NODE_STRING_SET_AMBIG(node) (node)->u.str.flag |= NODE_STRING_AMBIG
#define NODE_STRING_SET_GOOD_AMBIG(node) (node)->u.str.flag |= NODE_STRING_GOOD_AMBIG
#define NODE_STRING_SET_DONT_GET_OPT_INFO(node) \
(node)->u.str.flag |= NODE_STRING_DONT_GET_OPT_INFO
#define NODE_STRING_IS_RAW(node) \
(((node)->u.str.flag & NODE_STRING_RAW) != 0)
#define NODE_STRING_IS_AMBIG(node) \
(((node)->u.str.flag & NODE_STRING_AMBIG) != 0)
#define NODE_STRING_IS_GOOD_AMBIG(node) \
(((node)->u.str.flag & NODE_STRING_GOOD_AMBIG) != 0)
#define NODE_STRING_IS_DONT_GET_OPT_INFO(node) \
(((node)->u.str.flag & NODE_STRING_DONT_GET_OPT_INFO) != 0)
#define BACKREFS_P(br) \
(IS_NOT_NULL((br)->back_dynamic) ? (br)->back_dynamic : (br)->back_static)
/* node status bits */
#define NODE_ST_MIN_FIXED (1<<0)
#define NODE_ST_MAX_FIXED (1<<1)
#define NODE_ST_CLEN_FIXED (1<<2)
#define NODE_ST_MARK1 (1<<3)
#define NODE_ST_MARK2 (1<<4)
#define NODE_ST_STOP_BT_SIMPLE_REPEAT (1<<5)
#define NODE_ST_RECURSION (1<<6)
#define NODE_ST_CALLED (1<<7)
#define NODE_ST_ADDR_FIXED (1<<8)
#define NODE_ST_NAMED_GROUP (1<<9)
#define NODE_ST_IN_REAL_REPEAT (1<<10) /* STK_REPEAT is nested in stack. */
#define NODE_ST_IN_ZERO_REPEAT (1<<11) /* (....){0} */
#define NODE_ST_IN_MULTI_ENTRY (1<<12)
#define NODE_ST_NEST_LEVEL (1<<13)
#define NODE_ST_BY_NUMBER (1<<14) /* {n,m} */
#define NODE_ST_BY_NAME (1<<15) /* backref by name */
#define NODE_ST_BACKREF (1<<16)
#define NODE_ST_CHECKER (1<<17)
#define NODE_ST_FIXED_OPTION (1<<18)
#define NODE_ST_PROHIBIT_RECURSION (1<<19)
#define NODE_ST_SUPER (1<<20)
#define NODE_STATUS(node) (((Node* )node)->u.base.status)
#define NODE_STATUS_ADD(node,f) (NODE_STATUS(node) |= (NODE_ST_ ## f))
#define NODE_STATUS_REMOVE(node,f) (NODE_STATUS(node) &= ~(NODE_ST_ ## f))
#define NODE_IS_BY_NUMBER(node) ((NODE_STATUS(node) & NODE_ST_BY_NUMBER) != 0)
#define NODE_IS_IN_REAL_REPEAT(node) ((NODE_STATUS(node) & NODE_ST_IN_REAL_REPEAT) != 0)
#define NODE_IS_CALLED(node) ((NODE_STATUS(node) & NODE_ST_CALLED) != 0)
#define NODE_IS_IN_MULTI_ENTRY(node) ((NODE_STATUS(node) & NODE_ST_IN_MULTI_ENTRY) != 0)
#define NODE_IS_RECURSION(node) ((NODE_STATUS(node) & NODE_ST_RECURSION) != 0)
#define NODE_IS_IN_ZERO_REPEAT(node) ((NODE_STATUS(node) & NODE_ST_IN_ZERO_REPEAT) != 0)
#define NODE_IS_NAMED_GROUP(node) ((NODE_STATUS(node) & NODE_ST_NAMED_GROUP) != 0)
#define NODE_IS_ADDR_FIXED(node) ((NODE_STATUS(node) & NODE_ST_ADDR_FIXED) != 0)
#define NODE_IS_CLEN_FIXED(node) ((NODE_STATUS(node) & NODE_ST_CLEN_FIXED) != 0)
#define NODE_IS_MIN_FIXED(node) ((NODE_STATUS(node) & NODE_ST_MIN_FIXED) != 0)
#define NODE_IS_MAX_FIXED(node) ((NODE_STATUS(node) & NODE_ST_MAX_FIXED) != 0)
#define NODE_IS_MARK1(node) ((NODE_STATUS(node) & NODE_ST_MARK1) != 0)
#define NODE_IS_MARK2(node) ((NODE_STATUS(node) & NODE_ST_MARK2) != 0)
#define NODE_IS_NEST_LEVEL(node) ((NODE_STATUS(node) & NODE_ST_NEST_LEVEL) != 0)
#define NODE_IS_BY_NAME(node) ((NODE_STATUS(node) & NODE_ST_BY_NAME) != 0)
#define NODE_IS_BACKREF(node) ((NODE_STATUS(node) & NODE_ST_BACKREF) != 0)
#define NODE_IS_CHECKER(node) ((NODE_STATUS(node) & NODE_ST_CHECKER) != 0)
#define NODE_IS_FIXED_OPTION(node) ((NODE_STATUS(node) & NODE_ST_FIXED_OPTION) != 0)
#define NODE_IS_SUPER(node) ((NODE_STATUS(node) & NODE_ST_SUPER) != 0)
#define NODE_IS_PROHIBIT_RECURSION(node) \
((NODE_STATUS(node) & NODE_ST_PROHIBIT_RECURSION) != 0)
#define NODE_IS_STOP_BT_SIMPLE_REPEAT(node) \
((NODE_STATUS(node) & NODE_ST_STOP_BT_SIMPLE_REPEAT) != 0)
#define NODE_BODY(node) ((node)->u.base.body)
#define NODE_QUANT_BODY(node) ((node)->body)
#define NODE_BAG_BODY(node) ((node)->body)
#define NODE_CALL_BODY(node) ((node)->body)
#define NODE_ANCHOR_BODY(node) ((node)->body)
#define SCANENV_MEMENV_SIZE 8
#define SCANENV_MEMENV(senv) \
(IS_NOT_NULL((senv)->mem_env_dynamic) ? \
@ -434,7 +436,7 @@ extern void onig_node_conv_to_str_node P_((Node* node, int raw));
extern int onig_node_str_cat P_((Node* node, const UChar* s, const UChar* end));
extern int onig_node_str_set P_((Node* node, const UChar* s, const UChar* end));
extern void onig_node_free P_((Node* node));
extern Node* onig_node_new_enclosure P_((int type));
extern Node* onig_node_new_bag P_((enum BagType type));
extern Node* onig_node_new_anchor P_((int type, int ascii_mode));
extern Node* onig_node_new_str P_((const UChar* s, const UChar* end));
extern Node* onig_node_new_list P_((Node* left, Node* right));

@ -37,11 +37,7 @@
#include "config.h"
#include "onigposix.h"
#ifdef HAVE_STRING_H
# include <string.h>
#else
# include <strings.h>
#endif
#include <string.h>
#if defined(__GNUC__)
# define ARG_UNUSED __attribute__ ((unused))

@ -67,8 +67,8 @@ OnigSyntaxType OnigSyntaxPosixExtended = {
ONIG_SYN_OP_BRACE_INTERVAL |
ONIG_SYN_OP_PLUS_ONE_INF | ONIG_SYN_OP_QMARK_ZERO_ONE | ONIG_SYN_OP_VBAR_ALT )
, 0
, ( ONIG_SYN_CONTEXT_INDEP_ANCHORS |
ONIG_SYN_CONTEXT_INDEP_REPEAT_OPS | ONIG_SYN_CONTEXT_INVALID_REPEAT_OPS |
, ( ONIG_SYN_CONTEXT_INDEP_ANCHORS |
ONIG_SYN_CONTEXT_INDEP_REPEAT_OPS | ONIG_SYN_CONTEXT_INVALID_REPEAT_OPS |
ONIG_SYN_ALLOW_UNMATCHED_CLOSE_SUBEXP |
ONIG_SYN_ALLOW_DOUBLE_RANGE_OP_IN_CC )
, ( ONIG_OPTION_SINGLELINE | ONIG_OPTION_MULTILINE )

@ -113,10 +113,7 @@ static int
code_to_mbclen(OnigCodePoint code)
{
if (code < 256) {
if (EncLen_SJIS[(int )code] == 1)
return 1;
else
return 0;
return EncLen_SJIS[(int )code] == 1;
}
else if (code <= 0xffff) {
return 2;
@ -188,7 +185,7 @@ is_mbc_ambiguous(OnigCaseFoldType flag,
const UChar** pp, const UChar* end)
{
return onigenc_mbn_is_mbc_ambiguous(ONIG_ENCODING_SJIS, flag, pp, end);
}
#endif
@ -223,7 +220,7 @@ left_adjust_char_head(const UChar* start, const UChar* s)
p++;
break;
}
}
}
}
len = enclen(ONIG_ENCODING_SJIS, p);
if (p + len > s) return (UChar* )p;
@ -338,6 +335,6 @@ OnigEncodingType OnigEncodingSJIS = {
NULL, /* init */
NULL, /* is_initialized */
is_valid_mbc_string,
ENC_FLAG_ASCII_COMPATIBLE,
ENC_FLAG_ASCII_COMPATIBLE|ENC_FLAG_SKIP_OFFSET_1_OR_0,
0, 0
};

@ -657,8 +657,7 @@ onigenc_egcb_is_break_position(OnigEncoding enc, UChar* p, UChar* prev,
#ifdef USE_UNICODE_EXTENDED_GRAPHEME_CLUSTER
if (! ONIGENC_IS_UNICODE_ENCODING(enc)) {
if (from == 0x000d && to == 0x000a) return 0;
else return 1;
return from != 0x000d || to != 0x000a;
}
btype = unicode_egcb_is_break_2code(from, to);
@ -701,8 +700,7 @@ onigenc_egcb_is_break_position(OnigEncoding enc, UChar* p, UChar* prev,
return 1;
#else
if (from == 0x000d && to == 0x000a) return 0;
else return 1;
return from != 0x000d || to != 0x000a;
#endif /* USE_UNICODE_EXTENDED_GRAPHEME_CLUSTER */
}
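
The two rewritten returns in onigenc_egcb_is_break_position() rely on De Morgan's law: "not (from is CR and to is LF)" equals "from is not CR, or to is not LF". A minimal sketch of that equivalence follows. It is not part of the commit, and the function names are hypothetical.

```c
#include <assert.h>

/* Old shape of the check: explicit branch, 0 only for a CR LF pair. */
int is_break_branchy_sketch(unsigned long from, unsigned long to)
{
  if (from == 0x000d && to == 0x000a) return 0;
  else return 1;
}

/* New shape: !(A && B) rewritten as (!A || !B); a C comparison yields 0 or 1. */
int is_break_compact_sketch(unsigned long from, unsigned long to)
{
  return from != 0x000d || to != 0x000a;
}

/* Compare both forms on all four CR/LF combinations. */
void is_break_selfcheck(void)
{
  static const unsigned long v[2] = { 0x000d, 0x000a };
  int i, j;
  for (i = 0; i < 2; i++)
    for (j = 0; j < 2; j++)
      assert(is_break_branchy_sketch(v[i], v[j]) ==
             is_break_compact_sketch(v[i], v[j]));
}
```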
@ -729,6 +727,7 @@ onig_unicode_define_user_property(const char* name, OnigCodePoint* ranges)
int len;
int c;
char* s;
UChar* uname;
if (UserDefinedPropertyNum >= USER_DEFINED_PROPERTY_MAX_NUM)
return ONIGERR_TOO_MANY_USER_DEFINED_OBJECTS;
@ -741,10 +740,11 @@ onig_unicode_define_user_property(const char* name, OnigCodePoint* ranges)
if (s == 0)
return ONIGERR_MEMORY;
uname = (UChar* )name;
n = 0;
for (i = 0; i < len; i++) {
c = name[i];
if (c <= 0 || c >= 0x80) {
c = uname[i];
if (c < 0x20 || c >= 0x80) {
xfree(s);
return ONIGERR_INVALID_CHAR_PROPERTY_NAME;
}
@ -758,6 +758,10 @@ onig_unicode_define_user_property(const char* name, OnigCodePoint* ranges)
if (UserDefinedPropertyTable == 0) {
UserDefinedPropertyTable = onig_st_init_strend_table_with_size(10);
if (IS_NULL(UserDefinedPropertyTable)) {
xfree(s);
return ONIGERR_MEMORY;
}
}
e = UserDefinedPropertyRanges + UserDefinedPropertyNum;
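
The name check in onig_unicode_define_user_property() now reads each byte through a UChar pointer and rejects anything below 0x20 or at or above 0x80. A minimal sketch of just that validation step follows, assuming a NUL-terminated name. The function name is hypothetical, and the rest of the real loop (not shown in the hunk) is left out.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if every byte of name is in 0x20..0x7f, else 0.
   Reading through unsigned char keeps each byte in 0..255 whether or not
   plain char is signed on the target platform. */
int property_name_bytes_ok_sketch(const char* name)
{
  const unsigned char* uname = (const unsigned char* )name;
  size_t i, len = strlen(name);

  for (i = 0; i < len; i++) {
    int c = uname[i];
    if (c < 0x20 || c >= 0x80) return 0;  /* control or non-ASCII byte */
  }
  return 1;
}

void property_name_selfcheck(void)
{
  assert(property_name_bytes_ok_sketch("HandakuonMark") == 1);
  assert(property_name_bytes_ok_sketch("bad\tname") == 0);
}
```

Compared with the old `c <= 0 || c >= 0x80` test, the new bound also rejects control characters such as tab and newline.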

@ -2988,5 +2988,3 @@ onigenc_unicode_fold1_key(OnigCodePoint codes[])
}
return -1;
}

@ -225,5 +225,3 @@ onigenc_unicode_fold2_key(OnigCodePoint codes[])
}
return -1;
}

@ -135,5 +135,3 @@ onigenc_unicode_fold3_key(OnigCodePoint codes[])
}
return -1;
}

@ -1513,4 +1513,3 @@ OnigCodePoint OnigUnicodeFolds3[] = {
/* ----- LOCALE ----- */
#define FOLDS3_END_INDEX 72
};

@ -3283,5 +3283,3 @@ onigenc_unicode_unfold_key(OnigCodePoint code)
}
return 0;
}

@ -280,6 +280,6 @@ OnigEncodingType OnigEncodingUTF16_BE = {
init,
0, /* is_initialized */
is_valid_mbc_string,
ENC_FLAG_UNICODE,
ENC_FLAG_UNICODE|ENC_FLAG_SKIP_OFFSET_2,
0, 0
};

@ -287,6 +287,6 @@ OnigEncodingType OnigEncodingUTF16_LE = {
init,
0, /* is_initialized */
is_valid_mbc_string,
ENC_FLAG_UNICODE,
ENC_FLAG_UNICODE|ENC_FLAG_SKIP_OFFSET_1,
0, 0
};

@ -192,6 +192,6 @@ OnigEncodingType OnigEncodingUTF32_BE = {
NULL, /* init */
NULL, /* is_initialized */
is_valid_mbc_string,
ENC_FLAG_UNICODE,
ENC_FLAG_UNICODE|ENC_FLAG_SKIP_OFFSET_4,
0, 0
};

@ -192,6 +192,6 @@ OnigEncodingType OnigEncodingUTF32_LE = {
NULL, /* init */
NULL, /* is_initialized */
is_valid_mbc_string,
ENC_FLAG_UNICODE,
ENC_FLAG_UNICODE|ENC_FLAG_SKIP_OFFSET_1,
0, 0
};

@ -57,7 +57,7 @@ static const int EncLen_UTF8[] = {
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 1, 1
4, 4, 4, 4, 4, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
};
static int
@ -280,7 +280,7 @@ get_case_fold_codes_by_str(OnigCaseFoldType flag,
OnigEncodingType OnigEncodingUTF8 = {
mbc_enc_len,
"UTF-8", /* name */
6, /* max enc length */
4, /* max enc length */
1, /* min enc length */
onigenc_is_mbc_newline_0x0a,
mbc_to_code,
@ -297,6 +297,6 @@ OnigEncodingType OnigEncodingUTF8 = {
NULL, /* init */
NULL, /* is_initialized */
is_valid_mbc_string,
ENC_FLAG_ASCII_COMPATIBLE|ENC_FLAG_UNICODE,
ENC_FLAG_ASCII_COMPATIBLE|ENC_FLAG_UNICODE|ENC_FLAG_SKIP_OFFSET_1_OR_0,
0, 0
};
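
The EncLen_UTF8 change above limits UTF-8 to the RFC 3629 range (up to 0x10FFFF): lead bytes 0xF0..0xF4 stay at length 4, and 0xF5..0xFF drop to length 1 instead of starting 5- or 6-byte sequences. A minimal sketch of the lengths the new table implies follows. The function name is hypothetical, and treating bytes 0x80..0xBF as length 1 is an assumption, since those rows are not shown in the hunk.

```c
#include <assert.h>

/* Hypothetical helper, not Oniguruma API: sequence length implied by a
   UTF-8 lead byte, mirroring the updated EncLen_UTF8 rows in this diff. */
int utf8_lead_len_sketch(unsigned char b)
{
  if (b < 0xC0) return 1;   /* ASCII; 0x80..0xBF assumed to map to 1 */
  if (b < 0xE0) return 2;   /* 0xC0..0xDF */
  if (b < 0xF0) return 3;   /* 0xE0..0xEF */
  if (b <= 0xF4) return 4;  /* 0xF0..0xF4 reach up to U+10FFFF */
  return 1;                 /* 0xF5..0xFF are no longer multi-byte leads */
}

void utf8_lead_len_selfcheck(void)
{
  assert(utf8_lead_len_sketch(0xF4) == 4);
  assert(utf8_lead_len_sketch(0xF8) == 1);  /* was 5 before the change */
}
```

With no lead byte mapping to more than 4, the "max enc length" field of OnigEncodingUTF8 can drop from 6 to 4, as the hunk shows.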

@ -2,7 +2,7 @@
/* edit configure.ac to change version number */
#define PHP_MAJOR_VERSION 7
#define PHP_MINOR_VERSION 3
#define PHP_RELEASE_VERSION 10
#define PHP_RELEASE_VERSION 8
#define PHP_EXTRA_VERSION "-dev"
#define PHP_VERSION "7.3.10-dev"
#define PHP_VERSION_ID 70310
#define PHP_VERSION "7.3.8-dev"
#define PHP_VERSION_ID 70308