# Oniguruma syntax (operator) configuration

_Documented for Oniguruma 6.9.3 (2019/08/08)_


----------


## Overview

This document details how to configure Oniguruma's syntax by describing the desired
syntax operators and behaviors in an instance of the `OnigSyntaxType` struct, just like
the built-in Oniguruma syntaxes do.

Configuration operators are bit flags, and are broken into multiple groups, somewhat arbitrarily,
because Oniguruma takes its configuration as a trio of 32-bit `unsigned int` values, assigned as
the first three fields in an `OnigSyntaxType` struct:

```C
typedef struct {
  unsigned int   op;
  unsigned int   op2;
  unsigned int   behavior;
  OnigOptionType options;   /* default option */
  OnigMetaCharTableType meta_char_table;
} OnigSyntaxType;
```

The first group of configuration flags (`op`) roughly corresponds to the
configuration for "basic regex."  The second group (`op2`) roughly corresponds
to the configuration for "advanced regex."  And the third group (`behavior`)
describes more-or-less what to do for broken input, bad input, or other corner-case
regular expressions whose meaning is not well-defined.  These three groups of
flags are described in full below, and tables of their usages for various syntaxes
follow.

The `options` field describes the default compile options to use if the caller does
not specify any options when invoking `onig_new()`.

The `meta_char_table` field is used exclusively by the `ONIG_SYN_OP_VARIABLE_META_CHARACTERS`
option, which allows the various regex metacharacters, like `*` and `?`, to be replaced
with alternates (for example, SQL typically uses `%` instead of `.*` and `_` instead of `.`).


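As a rough illustration of the struct layout described above, the sketch below fills in an `OnigSyntaxType` by hand. The name `SketchSyntax` and the particular selection of flags are assumptions made for this example, not a built-in syntax; the `meta_char_table` entries are all left at their inert values because `ONIG_SYN_OP_VARIABLE_META_CHARACTERS` is not set. A pointer to such a struct would then be passed as the syntax argument of `onig_new()`.

```c
#include <oniguruma.h>

/* Hypothetical syntax for illustration only: a small "extended"-style
   dialect.  The name and the choice of flags are assumptions. */
static OnigSyntaxType SketchSyntax = {
  /* op: basic constructs */
  ( ONIG_SYN_OP_DOT_ANYCHAR | ONIG_SYN_OP_ASTERISK_ZERO_INF |
    ONIG_SYN_OP_PLUS_ONE_INF | ONIG_SYN_OP_QMARK_ZERO_ONE |
    ONIG_SYN_OP_VBAR_ALT | ONIG_SYN_OP_LPAREN_SUBEXP |
    ONIG_SYN_OP_BRACKET_CC | ONIG_SYN_OP_LINE_ANCHOR ),
  /* op2: no advanced constructs */
  0,
  /* behavior: corner-case handling */
  ONIG_SYN_CONTEXT_INDEP_REPEAT_OPS,
  /* options: default compile options */
  ONIG_OPTION_NONE,
  /* meta_char_table: only consulted when
     ONIG_SYN_OP_VARIABLE_META_CHARACTERS is set */
  {
    (OnigCodePoint )'\\',        /* esc */
    ONIG_INEFFECTIVE_META_CHAR,  /* anychar */
    ONIG_INEFFECTIVE_META_CHAR,  /* anytime */
    ONIG_INEFFECTIVE_META_CHAR,  /* zero or one time */
    ONIG_INEFFECTIVE_META_CHAR,  /* one or more time */
    ONIG_INEFFECTIVE_META_CHAR   /* anychar anytime */
  }
};
```
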
----------


## Group One Flags (op)


This group contains "basic regex" constructs, features common to most regex systems.


### 0. ONIG_SYN_OP_VARIABLE_META_CHARACTERS

_Set in: none_

Enables support for `onig_set_meta_char()`, which allows you to provide alternate
characters to be used in place of the six metacharacters below (shown with their
normal regex equivalents):

   - `ONIG_META_CHAR_ESCAPE`: `\`
   - `ONIG_META_CHAR_ANYCHAR`: `.`
   - `ONIG_META_CHAR_ANYTIME`: `*`
   - `ONIG_META_CHAR_ZERO_OR_ONE_TIME`: `?`
   - `ONIG_META_CHAR_ONE_OR_MORE_TIME`: `+`
   - `ONIG_META_CHAR_ANYCHAR_ANYTIME`: Equivalent in normal regex to `.*`, but supported
      explicitly so that Oniguruma can support matching SQL `%` wildcards or shell `*` wildcards.

If this flag is set, then the values defined using `onig_set_meta_char()` will be used;
if this flag is clear, then the default regex characters will be used instead, and
data set by `onig_set_meta_char()` will be ignored.


### 1. ONIG_SYN_OP_DOT_ANYCHAR (enable `.`)

_Set in: PosixBasic, PosixExtended, Emacs, Grep, GnuRegex, Java, Perl, Perl_NG, Ruby, Oniguruma_

Enables support for the standard `.` metacharacter, meaning "any one character."  You
usually want this flag on unless you have turned on `ONIG_SYN_OP_VARIABLE_META_CHARACTERS`
so that you can use a metacharacter other than `.` instead.


### 2. ONIG_SYN_OP_ASTERISK_ZERO_INF (enable `r*`)

_Set in: PosixBasic, PosixExtended, Emacs, Grep, GnuRegex, Perl, Java, Perl_NG, Ruby, Oniguruma_

Enables support for the standard `r*` metacharacter, meaning "zero or more r's."
You usually want this flag set unless you have turned on `ONIG_SYN_OP_VARIABLE_META_CHARACTERS`
so that you can use a metacharacter other than `*` instead.


### 3. ONIG_SYN_OP_ESC_ASTERISK_ZERO_INF (enable `r\*`)

_Set in: none_

Enables support for an escaped `r\*` metacharacter, meaning "zero or more r's."  This is
useful if you have disabled support for the normal `r*` metacharacter because you want `*`
to simply match a literal `*` character, but you still want some way of activating "zero or more"
behavior.


### 4. ONIG_SYN_OP_PLUS_ONE_INF (enable `r+`)

_Set in: PosixExtended, Emacs, GnuRegex, Perl, Java, Perl_NG, Ruby, Oniguruma_

Enables support for the standard `r+` metacharacter, meaning "one or more r's."
You usually want this flag set unless you have turned on `ONIG_SYN_OP_VARIABLE_META_CHARACTERS`
so that you can use a metacharacter other than `+` instead.


### 5. ONIG_SYN_OP_ESC_PLUS_ONE_INF (enable `r\+`)

_Set in: Grep_

Enables support for an escaped `r\+` metacharacter, meaning "one or more r's."  This is
useful if you have disabled support for the normal `r+` metacharacter because you want `+`
to simply match a literal `+` character, but you still want some way of activating "one or more"
behavior.


### 6. ONIG_SYN_OP_QMARK_ZERO_ONE (enable `r?`)

_Set in: PosixExtended, Emacs, GnuRegex, Perl, Java, Perl_NG, Ruby, Oniguruma_

Enables support for the standard `r?` metacharacter, meaning "zero or one r" or "an optional r."
You usually want this flag set unless you have turned on `ONIG_SYN_OP_VARIABLE_META_CHARACTERS`
so that you can use a metacharacter other than `?` instead.


### 7. ONIG_SYN_OP_ESC_QMARK_ZERO_ONE (enable `r\?`)

_Set in: Grep_

Enables support for an escaped `r\?` metacharacter, meaning "zero or one r" or "an optional
r."  This is useful if you have disabled support for the normal `r?` metacharacter because
you want `?` to simply match a literal `?` character, but you still want some way of activating
"optional" behavior.


### 8. ONIG_SYN_OP_BRACE_INTERVAL (enable `r{l,u}`)

_Set in: PosixExtended, GnuRegex, Perl, Java, Perl_NG, Ruby, Oniguruma_

Enables support for the `r{lower,upper}` range form, common to more advanced
regex engines, which lets you specify precisely a minimum and maximum range on how many r's
must match (and not simply "zero or more").

This form also allows `r{count}` to specify a precise count of r's that must match.

This form also allows `r{lower,}` to be equivalent to `r{lower,infinity}`.

If and only if the `ONIG_SYN_ALLOW_INTERVAL_LOW_ABBREV` behavior flag is set,
this form also allows `r{,upper}` to be equivalent to `r{0,upper}`; otherwise,
`r{,upper}` will be treated as an error.


### 9. ONIG_SYN_OP_ESC_BRACE_INTERVAL (enable `\{` and `\}`)

_Set in: PosixBasic, Emacs, Grep_

Enables support for an escaped `r\{lower,upper\}` range form.  This is useful if you
have disabled support for the normal `r{...}` range form and want curly braces to simply
match literal curly brace characters, but you still want some way of activating
"range" behavior.


### 10. ONIG_SYN_OP_VBAR_ALT (enable `r|s`)

_Set in: PosixExtended, GnuRegex, Perl, Java, Perl_NG, Ruby, Oniguruma_

Enables support for the common `r|s` alternation operator.  You usually want this
flag set.


### 11. ONIG_SYN_OP_ESC_VBAR_ALT (enable `\|`)

_Set in: Emacs, Grep_

Enables support for an escaped `r\|s` alternation form.  This is useful if you
have disabled support for the normal `r|s` alternation form and want `|` to simply
match a literal `|` character, but you still want some way of activating "alternate" behavior.


### 12. ONIG_SYN_OP_LPAREN_SUBEXP (enable `(r)`)

_Set in: PosixExtended, GnuRegex, Perl, Java, Perl_NG, Ruby, Oniguruma_

Enables support for the common `(...)` grouping-and-capturing operators.  You usually
want this flag set.


### 13. ONIG_SYN_OP_ESC_LPAREN_SUBEXP (enable `\(` and `\)`)

_Set in: PosixBasic, Emacs, Grep_

Enables support for escaped `\(...\)` grouping-and-capturing operators.  This is useful if you
have disabled support for the normal `(...)` grouping-and-capturing operators and want
parentheses to simply match literal parenthesis characters, but you still want some way of
activating "grouping" or "capturing" behavior.


### 14. ONIG_SYN_OP_ESC_AZ_BUF_ANCHOR (enable `\A` and `\Z` and `\z`)

_Set in: GnuRegex, Perl, Java, Perl_NG, Ruby, Oniguruma_

Enables support for the anchor escapes `\A` (start-of-string), `\Z` (end-of-string or
newline-at-end-of-string), and `\z` (end-of-string).

(If the escape metacharacter has been changed from the default of `\`, this
option will recognize that metacharacter instead.)


### 15. ONIG_SYN_OP_ESC_CAPITAL_G_BEGIN_ANCHOR (enable `\G`)

_Set in: GnuRegex, Perl, Java, Perl_NG, Ruby, Oniguruma_

Enables support for the special anchor `\G` (start-of-previous-match).

(If the escape metacharacter has been changed from the default of `\`, this
option will recognize that metacharacter instead.)

Note that `OnigRegex`/`regex_t` are not stateful objects, and do _not_ record
the location of the previous match.  The `\G` anchor uses the `start` parameter
explicitly passed to `onig_search()` (or `onig_search_with_param()`) to determine
the "start of the previous match," so if the caller always passes the start of
the entire buffer as the function's `start` parameter, then `\G` will behave
exactly the same as `\A`.


### 16. ONIG_SYN_OP_DECIMAL_BACKREF (enable `\num`)

_Set in: PosixBasic, PosixExtended, Emacs, Grep, GnuRegex, Perl, Java, Perl_NG, Ruby, Oniguruma_

Enables support for back references to prior capture groups `(...)` using
the common `\num` syntax (like `\3`).

If this flag is clear, then a numeric escape like `\3` will either be treated as a literal `3`,
or, if `ONIG_SYN_OP_ESC_OCTAL3` is set, will be treated as an octal character code `\3`.

You usually want this enabled, and it is enabled by default in every built-in syntax.


### 17. ONIG_SYN_OP_BRACKET_CC (enable `[...]`)

_Set in: PosixBasic, PosixExtended, Emacs, Grep, GnuRegex, Perl, Java, Perl_NG, Ruby, Oniguruma_

Enables support for recognizing character classes, like `[a-z]`.  If this flag is not set, `[`
and `]` will be treated as ordinary literal characters instead of as metacharacters.

You usually want this enabled, and it is enabled by default in every built-in syntax.


### 18. ONIG_SYN_OP_ESC_W_WORD (enable `\w` and `\W`)

_Set in: Grep, GnuRegex, Perl, Java, Perl_NG, Ruby, Oniguruma_

Enables support for the common `\w` and `\W` shorthand forms.  These match "word characters,"
whose meaning varies depending on the encoding being used.

In ASCII encoding, `\w` is equivalent to `[A-Za-z0-9_]`.

In most other encodings, `\w` matches many more characters, including accented letters, Greek letters,
Cyrillic letters, Braille letters and numbers, Runic letters, Hebrew letters, Arabic letters and numerals,
Chinese Han ideographs, Japanese Katakana and Hiragana, Korean Hangul, and generally any symbol that
could qualify as a phonetic "letter" or counting "number" in any language.  (Note that emoji are _not_
considered "word characters.")

`\W` always matches the opposite of whatever `\w` matches.


### 19. ONIG_SYN_OP_ESC_LTGT_WORD_BEGIN_END (enable `\<` and `\>`)

_Set in: Grep, GnuRegex_

Enables support for the GNU-specific `\<` and `\>` word-boundary metacharacters.  These work like
the `\b` word-boundary metacharacter, but only match at one end of the word or the other:  `\<`
only matches at a transition from a non-word character to a word character (i.e., at the start
of a word), and `\>` only matches at a transition from a word character to a non-word character
(i.e., at the end of a word).

Most regex syntaxes do _not_ support these metacharacters.


### 20. ONIG_SYN_OP_ESC_B_WORD_BOUND (enable `\b` and `\B`)

_Set in: Grep, GnuRegex, Perl, Java, Perl_NG, Ruby, Oniguruma_

Enables support for the common `\b` and `\B` word-boundary metacharacters.  The `\b` metacharacter
matches a zero-width position at a transition from word-characters to non-word-characters, or vice
versa.  The `\B` metacharacter matches at all positions _not_ matched by `\b`.

See details in `ONIG_SYN_OP_ESC_W_WORD` above for an explanation as to which characters
are considered "word characters."


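The relationship between `\b`, `\<`, and `\>` described above can be sketched in a few lines of plain C. This is only an ASCII-only model of the rules as stated in this document, not Oniguruma's implementation; the function names are hypothetical, and the start and end of the buffer are assumed to count as non-word positions.

```c
#include <stdbool.h>
#include <stddef.h>

/* ASCII-only word test, equivalent to [A-Za-z0-9_]. */
static bool is_word_char(unsigned char c) {
  return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
         (c >= '0' && c <= '9') || c == '_';
}

/* Is there a word character immediately before position pos? */
static bool word_before(const char *s, size_t len, size_t pos) {
  return pos > 0 && pos <= len && is_word_char((unsigned char)s[pos - 1]);
}

/* Is there a word character at position pos? */
static bool word_after(const char *s, size_t len, size_t pos) {
  return pos < len && is_word_char((unsigned char)s[pos]);
}

/* \< : transition from non-word to word (start of a word). */
bool at_word_start(const char *s, size_t len, size_t pos) {
  return !word_before(s, len, pos) && word_after(s, len, pos);
}

/* \> : transition from word to non-word (end of a word). */
bool at_word_end(const char *s, size_t len, size_t pos) {
  return word_before(s, len, pos) && !word_after(s, len, pos);
}

/* \b : either kind of transition; \B is simply its negation. */
bool at_word_boundary(const char *s, size_t len, size_t pos) {
  return word_before(s, len, pos) != word_after(s, len, pos);
}
```

In the string `hi there`, for instance, position 0 is a word start, position 2 is a word end, and position 1 is not a boundary at all.
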
### 21. ONIG_SYN_OP_ESC_S_WHITE_SPACE (enable `\s` and `\S`)

_Set in: GnuRegex, Perl, Java, Perl_NG, Ruby, Oniguruma_

Enables support for the common `\s` and `\S` whitespace-matching metacharacters.

The `\s` metacharacter in ASCII encoding is exactly equivalent to the character class
`[\t\n\v\f\r ]`, or character codes 9 through 13 (inclusive), and 32.

The `\s` metacharacter in Unicode is exactly equivalent to the character class
`[\t\n\v\f\r \x85\xA0\x{1680}\x{2000}-\x{200A}\x{2028}-\x{2029}\x{202F}\x{205F}\x{3000}]`; that is, it matches
the same as ASCII, plus U+0085 (next line), U+00A0 (nonbreaking space), U+1680 (Ogham space mark),
U+2000 (en quad) through U+200A (hair space) (this range includes several widths of Unicode spaces),
U+2028 (line separator) through U+2029 (paragraph separator),
U+202F (narrow no-break space), U+205F (medium mathematical space), and U+3000 (CJK ideographic space).

All non-Unicode encodings are handled by converting their code points to the appropriate
Unicode-equivalent code points, and then matching according to Unicode rules.

`\S` always matches any one character that is _not_ in the set matched by `\s`.


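The two `\s` sets listed above translate directly into code-point range checks. The following is a minimal sketch of those lists as written in this document; the function names are hypothetical and this is not how Oniguruma itself stores the sets.

```c
#include <stdbool.h>
#include <stdint.h>

/* ASCII \s: [\t\n\v\f\r ], i.e. codes 9 through 13, and 32. */
bool is_ascii_space(uint32_t cp) {
  return (cp >= 9 && cp <= 13) || cp == 32;
}

/* Unicode \s: the ASCII set plus the additional code points listed above. */
bool is_unicode_space(uint32_t cp) {
  return is_ascii_space(cp) ||
         cp == 0x0085 || cp == 0x00A0 || cp == 0x1680 ||
         (cp >= 0x2000 && cp <= 0x200A) ||
         cp == 0x2028 || cp == 0x2029 ||
         cp == 0x202F || cp == 0x205F || cp == 0x3000;
}
```

Note that U+200B (zero width space) falls just outside the U+2000 through U+200A range, so this sketch does not treat it as whitespace.
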
### 22. ONIG_SYN_OP_ESC_D_DIGIT (enable `\d` and `\D`)

_Set in: GnuRegex, Perl, Java, Perl_NG, Ruby, Oniguruma_

Enables support for the common `\d` and `\D` digit-matching metacharacters.

The `\d` metacharacter in ASCII encoding is exactly equivalent to the character class
`[0-9]`, or character codes 48 through 57 (inclusive).

The `\d` metacharacter in Unicode matches `[0-9]`, as well as digits in Arabic, Devanagari,
Bengali, Laotian, Mongolian, CJK fullwidth numerals, and many more.

All non-Unicode encodings are handled by converting their code points to the appropriate
Unicode-equivalent code points, and then matching according to Unicode rules.

`\D` always matches any one character that is _not_ in the set matched by `\d`.


### 23. ONIG_SYN_OP_LINE_ANCHOR (enable `^r` and `r$`)

_Set in: Emacs, Grep, GnuRegex, Perl, Java, Perl_NG, Ruby, Oniguruma_

Enables support for the common `^` and `$` line-anchor metacharacters.

In single-line mode, `^` matches the start of the input buffer, and `$` matches
the end of the input buffer.  In multi-line mode, `^` matches if the preceding
character is `\n`; and `$` matches if the following character is `\n`.

(Note that Oniguruma does not recognize other newline types:  It only matches
`^` and `$` against `\n`:  not `\r`, not `\r\n`, not the U+2028 line separator,
and not any other form.)


### 24. ONIG_SYN_OP_POSIX_BRACKET (enable POSIX `[:xxxx:]`)

_Set in: PosixBasic, PosixExtended, Grep, GnuRegex, Perl, Java, Perl_NG, Ruby, Oniguruma_

Enables support for the POSIX `[:xxxx:]` character classes, like `[:alpha:]` and `[:digit:]`.
The supported POSIX character classes are `alnum`, `alpha`, `blank`, `cntrl`, `digit`,
`graph`, `lower`, `print`, `punct`, `space`, `upper`, `xdigit`, `ascii`, `word`.


### 25. ONIG_SYN_OP_QMARK_NON_GREEDY (enable `r??`, `r*?`, `r+?`, and `r{n,m}?`)

_Set in: Perl, Java, Perl_NG, Ruby, Oniguruma_

Enables support for lazy (non-greedy) quantifiers: That is, if you append a `?` after
another quantifier such as `?`, `*`, `+`, or `{n,m}`, Oniguruma will try to match
as _little_ as possible instead of as _much_ as possible.


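To make the "as little as possible" behavior concrete, the sketch below hand-codes the two patterns `a.*b` (greedy) and `a.*?b` (lazy), both anchored at the start of the subject. It is an illustration of the semantics only, with hypothetical function names, and it assumes the subject contains no newlines (so `.` can match every character).

```c
#include <stddef.h>
#include <string.h>

/* Length matched by the greedy pattern a.*b at the start of s, or -1.
   Greedy: .* runs as far as it can, so the match ends at the LAST 'b'. */
long match_greedy(const char *s) {
  if (s[0] != 'a') return -1;
  const char *b = strrchr(s + 1, 'b');
  return b ? (long)(b - s) + 1 : -1;
}

/* Length matched by the lazy pattern a.*?b at the start of s, or -1.
   Lazy: .*? stops as soon as it can, so the match ends at the FIRST 'b'. */
long match_lazy(const char *s) {
  if (s[0] != 'a') return -1;
  const char *b = strchr(s + 1, 'b');
  return b ? (long)(b - s) + 1 : -1;
}
```

Against the subject `axbxb`, the greedy form consumes all five characters while the lazy form stops after the first three.
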
### 26. ONIG_SYN_OP_ESC_CONTROL_CHARS (enable `\n`, `\r`, `\t`, etc.)

_Set in: PosixBasic, PosixExtended, Java, Perl, Perl_NG, Ruby, Oniguruma_

Enables support for C-style control-code escapes, like `\n` and `\r`.  Specifically,
this recognizes `\a` (7), `\b` (8), `\t` (9), `\n` (10), `\f` (12), `\r` (13), and
`\e` (27).  If `ONIG_SYN_OP2_ESC_V_VTAB` is enabled (see below), this also enables
support for recognizing `\v` as code point 11.


### 27. ONIG_SYN_OP_ESC_C_CONTROL (enable `\cx` control codes)

_Set in: Java, Perl, Perl_NG, Ruby, Oniguruma_

Enables support for named control-code escapes, like `\cm` or `\cM` for code point 13.
In this shorthand form, control codes may be specified by `\c` (for "Control")
followed by an alphabetic letter, a-z or A-Z, indicating which code point to represent
(1 through 26).  So `\cA` is code point 1, and `\cZ` is code point 26.


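The letter-to-code-point mapping for `\cx` amounts to keeping the low five bits of the letter. A minimal sketch, assuming ASCII letters and using a hypothetical function name:

```c
/* Returns the control code for \c<letter> (1 through 26),
   or -1 if the argument is not an ASCII letter. */
int control_code(char letter) {
  if ((letter >= 'a' && letter <= 'z') || (letter >= 'A' && letter <= 'Z')) {
    return letter & 0x1F;  /* 'A' (0x41) and 'a' (0x61) both give 1 */
  }
  return -1;
}
```
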
### 28. ONIG_SYN_OP_ESC_OCTAL3 (enable `\OOO` octal codes)

_Set in: Java, Perl, Perl_NG, Ruby, Oniguruma_

Enables support for octal-style escapes of up to three digits, like `\1` for code
point 1, and `\177` for code point 127.  Octal values greater than 255 will result
in an error message.


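The "up to three digits, at most 255" rule can be sketched as a small parser. This models the rule as stated above rather than Oniguruma's own parsing code; the function name and the `-1` error convention are assumptions.

```c
/* Parses up to three octal digits at s (the text after the backslash).
   Stores the number of digits consumed in *used.
   Returns the code point, or -1 if there are no octal digits
   or the value exceeds 255. */
int parse_octal3(const char *s, int *used) {
  int value = 0;
  int n = 0;
  while (n < 3 && s[n] >= '0' && s[n] <= '7') {
    value = value * 8 + (s[n] - '0');
    n++;
  }
  *used = n;
  if (n == 0 || value > 255) return -1;
  return value;
}
```

With this sketch, `177` yields 127, while `400` (decimal 256) is rejected; a fourth digit, as in `1777`, is left unconsumed.
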
### 29. ONIG_SYN_OP_ESC_X_HEX2 (enable `\xHH` hex codes)

_Set in: Java, Perl, Perl_NG, Ruby, Oniguruma_

Enables support for hexadecimal-style escapes of up to two digits, like `\x1` for code
point 1, and `\x7F` for code point 127.


### 30. ONIG_SYN_OP_ESC_X_BRACE_HEX8 (enable `\x{7HHHHHHH}` hex codes)

_Set in: Perl, Perl_NG, Ruby, Oniguruma_

Enables support for brace-wrapped hexadecimal-style escapes of up to eight digits,
like `\x{1}` for code point 1, and `\x{FFFE}` for code point 65534.


### 31. ONIG_SYN_OP_ESC_O_BRACE_OCTAL (enable `\o{1OOOOOOOOOO}` octal codes)

_Set in: Perl, Perl_NG, Ruby, Oniguruma_

Enables support for brace-wrapped octal-style escapes of up to eleven digits,
like `\o{1}` for code point 1, and `\o{177776}` for code point 65534.

(New feature as of Oniguruma 6.3.)


----------


## Group Two Flags (op2)


This group contains support for lesser-known regex syntax constructs.


### 0. ONIG_SYN_OP2_ESC_CAPITAL_Q_QUOTE (enable `\Q...\E`)

_Set in: Java, Perl, Perl_NG_

Enables support for "quoted" parts of a pattern:  Between `\Q` and `\E`, all
syntax parsing is turned off, so that metacharacters like `*` and `+` will no
longer be treated as metacharacters, and instead will be matched as literal
`*` and `+`, as if they had been escaped with `\*` and `\+`.


### 1. ONIG_SYN_OP2_QMARK_GROUP_EFFECT (enable `(?...)`)

_Set in: Java, Perl, Perl_NG, Ruby, Oniguruma_

Enables support for the fairly-common `(?...)` grouping operator, which
controls precedence but which does _not_ capture its contents.


### 2. ONIG_SYN_OP2_OPTION_PERL (enable options `(?imsx)` and `(?-imsx)`)

_Set in: Java, Perl, Perl_NG_

Enables support of regex options. (i,m,s,x)
The supported toggle-able options for this flag are:

  - `i` - Case-insensitivity
  - `m` - Multi-line mode (`^` and `$` match at `\n` as well as start/end of buffer)
  - `s` - Single-line mode (`.` can match `\n`)
  - `x` - Extended pattern (free-formatting: whitespace will be ignored)


### 3. ONIG_SYN_OP2_OPTION_RUBY (enable options `(?imx)` and `(?-imx)`)

_Set in: Ruby, Oniguruma_

Enables support of regex options. (i,m,x)
The supported toggle-able options for this flag are:

  - `i` - Case-insensitivity
  - `m` - Multi-line mode (`.` can match `\n`)
  - `x` - Extended pattern (free-formatting: whitespace will be ignored)


### 4. ONIG_SYN_OP2_PLUS_POSSESSIVE_REPEAT (enable `r?+`, `r*+`, and `r++`)

_Set in: Ruby, Oniguruma_

Enables support for the _possessive_ quantifiers `?+`, `*+`, and `++`, which
work similarly to `?` and `*` and `+`, respectively, but which do not backtrack
after matching:  Like the normal greedy quantifiers, they match as much as
possible, but they do not attempt to match _less_ than their maximum possible
extent if subsequent parts of the pattern fail to match.


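The difference between a greedy and a possessive quantifier only shows up when the rest of the pattern fails after the quantifier has taken its maximum. The sketch below hand-codes `a*X` (greedy, with backtracking) and `a*+X` (possessive) for a single following character `X`, anchored at the start of the subject. The function names are hypothetical and this is a model of the semantics, not of Oniguruma's matcher.

```c
#include <stdbool.h>
#include <stddef.h>

/* Number of consecutive 'a' characters at the start of s. */
static size_t count_leading_a(const char *s) {
  size_t n = 0;
  while (s[n] == 'a') n++;
  return n;
}

/* Greedy a*X: take as many a's as possible, then give them back
   one at a time until the following character matches. */
bool match_greedy_star(const char *s, char next) {
  size_t n = count_leading_a(s);
  for (size_t k = n + 1; k-- > 0; ) {  /* k = n, n-1, ..., 0 */
    if (s[k] == next) return true;
  }
  return false;
}

/* Possessive a*+X: take as many a's as possible and never give any back. */
bool match_possessive_star(const char *s, char next) {
  size_t n = count_leading_a(s);
  return s[n] == next;
}
```

For the subject `aaa`, the greedy `a*a` succeeds by giving back one `a`, while the possessive `a*+a` fails; for `aaab` with `X` as `b`, both succeed.
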
### 5. ONIG_SYN_OP2_PLUS_POSSESSIVE_INTERVAL (enable `r{n,m}+`)

_Set in: Java_

Enables support for the _possessive_ quantifier `{n,m}+`, which
works similarly to `{n,m}`, but which does not backtrack
after matching:  Like the normal greedy quantifier, it matches as much as
possible, but it does not attempt to match _less_ than its maximum possible
extent if subsequent parts of the pattern fail to match.


### 6. ONIG_SYN_OP2_CCLASS_SET_OP (enable `&&` within `[...]`)

_Set in: Java, Ruby, Oniguruma_

Enables support for character-class _intersection_.  For example, with this
feature enabled, you can write `[a-z&&[^aeiou]]` to produce a character class
of only consonants, or `[\0-\37&&[^\n\r]]` to produce a character class of
all control codes _except_ newlines.


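Character-class intersection is simply a logical AND of two membership tests. The two example classes above could be sketched in C as follows (hypothetical function names, ASCII only):

```c
#include <stdbool.h>
#include <string.h>

/* [a-z&&[^aeiou]] : in a-z AND not a vowel. */
bool is_lower_consonant(char c) {
  bool in_a_z = (c >= 'a' && c <= 'z');
  return in_a_z && strchr("aeiou", c) == NULL;
}

/* [\0-\37&&[^\n\r]] : a control code (octal 0 through 37) AND not a newline. */
bool is_non_newline_control(unsigned char c) {
  bool is_control = (c <= 037);
  return is_control && c != '\n' && c != '\r';
}
```
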
### 7. ONIG_SYN_OP2_QMARK_LT_NAMED_GROUP (enable named captures `(?<name>...)`)

_Set in: Perl_NG, Ruby, Oniguruma_

Enables support for _naming_ capture groups, so that instead of having to
refer to captures by position (like `\3` or `$3`), you can refer to them by names
(like `server` and `path`).  This supports the Perl/Ruby naming syntaxes `(?<name>...)`
and `(?'name'...)`, but not the Python `(?P<name>...)` syntax.


### 8. ONIG_SYN_OP2_ESC_K_NAMED_BACKREF (enable named backreferences `\k<name>`)

_Set in: Perl_NG, Ruby, Oniguruma_

Enables support for substituted backreferences by name, not just by position.
This supports using `\k'name'` in addition to supporting `\k<name>`.  This also
supports an Oniguruma-specific extension that lets you specify the _distance_ of
the match, if the capture matched multiple times, by writing `\k<name+n>` or
`\k<name-n>`.


### 9. ONIG_SYN_OP2_ESC_G_SUBEXP_CALL (enable backreferences `\g<name>` and `\g<n>`)

_Set in: Perl_NG, Ruby, Oniguruma_

Enables support for substituted backreferences by both name and position using
the same syntax.  This supports using `\g'name'` and `\g'1'` in addition to
supporting `\g<name>` and `\g<1>`.


### 10. ONIG_SYN_OP2_ATMARK_CAPTURE_HISTORY (enable `(?@...)` and `(?@<name>...)`)

_Set in: none_

Enables support for _capture history_, which can answer via the `onig_*capture*()`
functions exactly which captures were matched, how many times, and where in the
input they were matched, by placing `?@` in front of the capture.  Per Oniguruma's
regex syntax documentation (appendix A-5):

`/(?@a)*/.match("aaa")` ==> `[<0-1>, <1-2>, <2-3>]`

This can require substantial memory, is primarily useful for debugging, and is not
enabled by default in any syntax.


### 11. ONIG_SYN_OP2_ESC_CAPITAL_C_BAR_CONTROL (enable `\C-x`)

_Set in: Ruby, Oniguruma_

Enables support for Ruby legacy control-code escapes, like `\C-m` or `\C-M` for code point 13.
In this shorthand form, control codes may be specified by `\C-` (for "Control")
followed by a single character (or equivalent), indicating which code point to represent,
based on that character's lowest five bits.  So, like `\c`, you can represent code point 10
with `\C-j`, but you can also represent it with `\C-*`.

See also `ONIG_SYN_OP_ESC_C_CONTROL`, which enables the more-common `\cx` syntax.


### 12. ONIG_SYN_OP2_ESC_CAPITAL_M_BAR_META (enable `\M-x`)

_Set in: Ruby, Oniguruma_

Enables support for Ruby legacy meta-code escapes.  When you write `\M-x`, Oniguruma
will match an `x` whose 8th bit is set (i.e., the character code of `x` will be or'ed
with `0x80`).  So, for example, you can match `\x81` using `\x81`, or you can write
`\M-\1`.  This is mostly useful when working with legacy 8-bit character encodings.


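Both legacy escapes from this section and the preceding `\C-x` section reduce to a single bit operation on the character code. A minimal sketch, with hypothetical function names:

```c
/* \C-x : keep only the lowest five bits of x. */
unsigned char ruby_control_code(unsigned char x) {
  return x & 0x1F;
}

/* \M-x : set the 8th bit of x (OR with 0x80). */
unsigned char ruby_meta_code(unsigned char x) {
  return x | 0x80;
}
```

Here `j` (0x6A) and `*` (0x2A) share the same low five bits, which is why `\C-j` and `\C-*` both denote code point 10, and `\M-\1` is 0x01 OR 0x80, or 0x81.
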
### 13. ONIG_SYN_OP2_ESC_V_VTAB (enable `\v` as vertical tab)

_Set in: Java, Ruby, Oniguruma_

Enables support for a C-style `\v` escape code, meaning "vertical tab."  If enabled,
`\v` will be equivalent to ASCII code point 11.


### 14. ONIG_SYN_OP2_ESC_U_HEX4 (enable `\uHHHH` for Unicode)

_Set in: Java, Ruby, Oniguruma_

Enables support for a Java-style `\uHHHH` escape code for representing Unicode
code-points by number, using up to four hexadecimal digits (up to `\uFFFF`).  So,
for example, `\u221E` will match an infinity symbol, `∞`.

For code points larger than four digits, like the emoji `🚡` (aerial tramway, or code
point U+1F6A1), you must either represent the character directly using an encoding like
UTF-8, or you must enable support for `ONIG_SYN_OP_ESC_X_BRACE_HEX8` or
`ONIG_SYN_OP_ESC_O_BRACE_OCTAL`, which support more than four digits.

(New feature as of Oniguruma 6.7.)


### 15. ONIG_SYN_OP2_ESC_GNU_BUF_ANCHOR (enable ``\` `` and `\'` anchors)

_Set in: Emacs_

This flag makes the ``\` `` and `\'` escapes function identically to
`\A` and `\z`, respectively (when `ONIG_SYN_OP_ESC_AZ_BUF_ANCHOR` is enabled).

These anchor forms are very obscure, and rarely supported by other regex libraries.


### 16. ONIG_SYN_OP2_ESC_P_BRACE_CHAR_PROPERTY (enable `\p{...}` and `\P{...}`)

_Set in: Java, Perl, Perl_NG, Ruby, Oniguruma_

Enables support for an alternate syntax for POSIX character classes; instead of
writing `[:alpha:]` when this is enabled, you can instead write `\p{alpha}`.

See also `ONIG_SYN_OP_POSIX_BRACKET` for the classic POSIX form.


### 17. ONIG_SYN_OP2_ESC_P_BRACE_CIRCUMFLEX_NOT (enable `\p{^...}` and `\P{^...}`)

_Set in: Perl, Perl_NG, Ruby, Oniguruma_

Enables support for an alternate syntax for POSIX character classes; instead of
writing `[:^alpha:]` when this is enabled, you can instead write `\p{^alpha}`.

See also `ONIG_SYN_OP_POSIX_BRACKET` for the classic POSIX form.


### 18. ONIG_SYN_OP2_CHAR_PROPERTY_PREFIX_IS

_(not presently used)_


### 19. ONIG_SYN_OP2_ESC_H_XDIGIT (enable `\h` and `\H`)

_Set in: Ruby, Oniguruma_

Enables support for the Ruby-specific shorthand `\h` and `\H` metacharacters.
Much as `\d` matches decimal digits, `\h` matches hexadecimal digits, that is,
characters in `[0-9a-fA-F]`.

`\H` matches the opposite of whatever `\h` matches.


### 20. ONIG_SYN_OP2_INEFFECTIVE_ESCAPE (disable `\`)

_Set in: As-is_

If set, this disables all escape codes, shorthands, and metacharacters that start
with `\` (or whatever the configured escape character is), allowing `\` to be treated
as a literal `\`.

You usually do not want this flag to be enabled.


### 21. ONIG_SYN_OP2_QMARK_LPAREN_IF_ELSE (enable `(?(...)then|else)`)

_Set in: Perl, Perl_NG, Ruby, Oniguruma_

Enables support for conditional inclusion of subsequent regex patterns based on whether
a prior named or numbered capture matched, or based on whether a pattern will
match.  This supports many different forms, including:

  - `(?(<foo>)then|else)` - condition based on a capture by name.
  - `(?('foo')then|else)` - condition based on a capture by name.
  - `(?(3)then|else)` - condition based on a capture by number.
  - `(?(+3)then|else)` - forward conditional to a future match, by relative position.
  - `(?(-3)then|else)` - backward conditional to a prior match, by relative position.
  - `(?(foo)then|else)` - this matches a pattern `foo`. (foo is any sub-expression)

(New feature as of Oniguruma 6.5.)


### 22. ONIG_SYN_OP2_ESC_CAPITAL_K_KEEP (enable `\K`)

_Set in: Perl, Perl_NG, Ruby, Oniguruma_

Enables support for `\K`, which excludes all content before it from the overall
regex match (i.e., capture #0).  So, for example, pattern `foo\Kbar` would match
`foobar`, but capture #0 would only include `bar`.

(New feature as of Oniguruma 6.5.)


### 23. ONIG_SYN_OP2_ESC_CAPITAL_R_GENERAL_NEWLINE (enable `\R`)

_Set in: Perl, Perl_NG, Ruby, Oniguruma_

Enables support for `\R`, the "general newline" shorthand, which matches
`(\r\n|[\n\v\f\r\u0085\u2028\u2029])` (obviously, the Unicode values cannot be
matched in ASCII encodings).

(New feature as of Oniguruma 6.5.)


### 24. ONIG_SYN_OP2_ESC_CAPITAL_N_O_SUPER_DOT (enable `\N` and `\O`)

_Set in: Perl, Perl_NG, Oniguruma_

Enables support for `\N` and `\O`.  `\N` is "not a line break," which is much
like the standard `.` metacharacter, except that while `.` can be affected by
the single-line setting, `\N` always matches exactly one character that is not
one of the various line-break characters (like `\n` and `\r`).

`\O` matches exactly one character, regardless of whether single-line or
multi-line mode is enabled or disabled.

(New feature as of Oniguruma 6.5.)


### 25. ONIG_SYN_OP2_QMARK_TILDE_ABSENT_GROUP (enable `(?~...)`)

_Set in: Ruby, Oniguruma_

Enables support for the `(?~r)` "absent operator" syntax, which matches
as much as possible as long as the result _doesn't_ match pattern `r`.  This is
_not_ the same as negative lookahead or negative lookbehind.

Among the most useful examples of this is `\/\*(?~\*\/)\*\/`, which matches
C-style comments by simply saying "starts with /*, ends with */, and _doesn't_
contain a */ in between."

A full explanation of this feature is complicated, but it is useful, and an
excellent article about it is [available on Medium](https://medium.com/rubyinside/the-new-absent-operator-in-ruby-s-regular-expressions-7c3ef6cd0b99).

(New feature as of Oniguruma 6.5.)


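For the C-comment example above, the effect of the absent operator can be approximated by hand: after the opening `/*`, the body may run only up to the first `*/`. The sketch below models what `\/\*(?~\*\/)\*\/` matches when anchored at the start of the subject; the function name is hypothetical, and this is an illustration of the result rather than of how Oniguruma evaluates `(?~...)`.

```c
#include <stddef.h>
#include <string.h>

/* Returns the length of the C comment starting at s, or 0 if s does not
   begin with a complete comment.  The body may not contain the closing
   delimiter, so the match always ends at the FIRST closing delimiter. */
size_t c_comment_length(const char *s) {
  if (strncmp(s, "/*", 2) != 0) return 0;
  const char *close = strstr(s + 2, "*/");
  return close ? (size_t)(close - s) + 2 : 0;
}
```

Given `/* a */ b */`, the result covers only the first seven characters, `/* a */`, where a greedy `\/\*.*\*\/` would have run on to the final `*/`.
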
### 26. ONIG_SYN_OP2_ESC_X_Y_TEXT_SEGMENT (enable `\X` and `\Y` and `\y`)

_Set in: Perl, Perl_NG, Ruby, Oniguruma_

`\X` is another variation on `.`, designed to support Unicode, in that it matches
a full _grapheme cluster_.  In Unicode, `à` can be encoded as one code point,
`U+00E0`, or as two, `U+0061 U+0300`.  If those are further escaped using UTF-8,
the former becomes two bytes, and the latter becomes three.  Unfortunately, `.`
would naively match only one or two bytes, depending on the encoding, and would
likely incorrectly match anything from just `a` to a broken half of a code point.
`\X` is designed to fix this:  It matches the full `à`, no matter how `à` is
encoded or decomposed.

`\y` matches a cluster boundary, i.e., a zero-width position between
graphemes, somewhat like `\b` matches boundaries between words.  `\Y` matches
the _opposite_ of `\y`, that is, a zero-width position between code points in
the _middle_ of a grapheme.

(New feature as of Oniguruma 6.6.)


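The byte counts quoted above for the two encodings of `à` can be checked with a small UTF-8 encoder. This is a generic sketch of standard UTF-8 encoding with a hypothetical function name; it is not part of Oniguruma's API.

```c
#include <stddef.h>
#include <stdint.h>

/* Encodes code point cp as UTF-8 into out (which must hold 4 bytes).
   Returns the number of bytes written, or 0 if cp is not encodable. */
size_t utf8_encode(uint32_t cp, unsigned char *out) {
  if (cp < 0x80) {
    out[0] = (unsigned char)cp;
    return 1;
  }
  if (cp < 0x800) {
    out[0] = (unsigned char)(0xC0 | (cp >> 6));
    out[1] = (unsigned char)(0x80 | (cp & 0x3F));
    return 2;
  }
  if (cp < 0x10000) {
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;  /* surrogates are invalid */
    out[0] = (unsigned char)(0xE0 | (cp >> 12));
    out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[2] = (unsigned char)(0x80 | (cp & 0x3F));
    return 3;
  }
  if (cp <= 0x10FFFF) {
    out[0] = (unsigned char)(0xF0 | (cp >> 18));
    out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (cp & 0x3F));
    return 4;
  }
  return 0;
}
```

The precomposed U+00E0 encodes to two bytes (`C3 A0`), while the decomposed pair U+0061 U+0300 encodes to one byte plus two (`61 CC 80`), three in total; `\X` is meant to treat either sequence as a single unit.
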
753### 27. ONIG_SYN_OP2_QMARK_PERL_SUBEXP_CALL (enable `(?R)` and `(?&name)`)
754
755_Set in: Perl_NG_
756
757Enables support for substituted backreferences by both name and position using
758Perl-5-specific syntax.  This supports using `(?R3)` and `(?&name)` to reference
759previous (and future) matches, similar to the more-common `\g<3>` and `\g<name>`
760backreferences.
761
762(New feature as of Oniguruma 6.7.)
763
764
### 28. ONIG_SYN_OP2_QMARK_BRACE_CALLOUT_CONTENTS (enable `(?{...})`)

_Set in: Perl, Perl_NG, Oniguruma_

Enables support for Perl-style "callouts":  points in a pattern at which a callback
function is invoked.  When `(?{foo})` is reached in a pattern, the callback
function set in `onig_set_progress_callout()` will be invoked, and be able to perform
custom computation during the pattern match (and during backtracking).

Full documentation for this advanced feature can be found in the Oniguruma
`docs/CALLOUT.md` file, with an example in `samples/callout.c`.

(New feature as of Oniguruma 6.8.)


### 29. ONIG_SYN_OP2_ASTERISK_CALLOUT_NAME (enable `(*name)`)

_Set in: Perl, Perl_NG, Oniguruma_

Enables support for Perl-style "callouts":  points in a pattern at which a callback
function is invoked.  When `(*foo)` is reached in a pattern, the callback
function set in `onig_set_callout_of_name()` will be invoked, passing the given name
`foo` to it, and it can perform custom computation during the pattern match (and
during backtracking).

Full documentation for this advanced feature can be found in the Oniguruma
`docs/CALLOUT.md` file, with an example in `samples/callout.c`.

(New feature as of Oniguruma 6.8.)


### 30. ONIG_SYN_OP2_OPTION_ONIGURUMA (enable options `(?imxWSDPy)` and `(?-imxWDSP)`)

_Set in: Oniguruma_

Enables support for the inline regex options `i`, `m`, `x`, `W`, `S`, `D`, `P`, and `y`.

(New feature as of Oniguruma 6.9.2.)

  - `i` - Case-insensitivity
  - `m` - Multi-line mode (`.` can match `\n`)
  - `x` - Extended pattern (free-formatting: whitespace will be ignored)
  - `W` - ASCII-only word.
  - `D` - ASCII-only digit.
  - `S` - ASCII-only space.
  - `P` - ASCII-only POSIX properties. (includes W,D,S)
  - `y` - Text segment mode for `\X`, `\y`, and `\Y` (`y{g}` for extended grapheme clusters, `y{w}` for words)

----------


## Syntax Flags (syn)


This group contains rules to handle corner cases and constructs that are errors in
some syntaxes but not in others.

### 0. ONIG_SYN_CONTEXT_INDEP_REPEAT_OPS (independent `?`, `*`, `+`, `{n,m}`)

_Set in: PosixExtended, GnuRegex, Java, Perl, Perl_NG, Ruby, Oniguruma_

This flag specifies how to handle operators like `?` and `*` when they aren't
directly attached to an operand, as in `^*` or `(*)`:  Are they an error, are
they discarded, or are they taken as literals?  If this flag is clear, they
are taken as literals; otherwise, the ONIG_SYN_CONTEXT_INVALID_REPEAT_OPS flag
determines if they are errors or if they are discarded.

### 1. ONIG_SYN_CONTEXT_INVALID_REPEAT_OPS (error or ignore independent operators)

_Set in: PosixExtended, GnuRegex, Java, Perl, Perl_NG, Ruby, Oniguruma_

If ONIG_SYN_CONTEXT_INDEP_REPEAT_OPS is set, this flag controls what happens when
independent operators appear in a pattern:  If this flag is set, then independent
operators produce an error message; if this flag is clear, then independent
operators are silently discarded.

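Taken together, flags 0 and 1 select one of three outcomes for an independent repeat
operator.  The sketch below spells out that decision; it models the prose above and is
not Oniguruma's parser.  The `enum` and `classify_indep_repeat` are hypothetical names,
and the fallback `#define`s assume that each flag's ID is its bit position.

```c
#ifndef ONIG_SYN_CONTEXT_INDEP_REPEAT_OPS
/* Assumed values, used only when oniguruma.h is not included. */
#define ONIG_SYN_CONTEXT_INDEP_REPEAT_OPS    (1U << 0)
#define ONIG_SYN_CONTEXT_INVALID_REPEAT_OPS  (1U << 1)
#endif

/* Hypothetical names for the three outcomes described above. */
enum indep_repeat_outcome {
  INDEP_REPEAT_LITERAL,  /* the `*` in `^*` is an ordinary character */
  INDEP_REPEAT_ERROR,    /* the pattern is rejected with an error */
  INDEP_REPEAT_DISCARD   /* the operator is silently dropped */
};

/* Map a syntax's `behavior` word to the outcome for an independent
   repeat operator. */
static enum indep_repeat_outcome classify_indep_repeat(unsigned int behavior)
{
  if ((behavior & ONIG_SYN_CONTEXT_INDEP_REPEAT_OPS) == 0)
    return INDEP_REPEAT_LITERAL;
  if ((behavior & ONIG_SYN_CONTEXT_INVALID_REPEAT_OPS) != 0)
    return INDEP_REPEAT_ERROR;
  return INDEP_REPEAT_DISCARD;
}
```

Note that flag 1 has no effect on its own:  with flag 0 clear, the operator is a literal
whether or not flag 1 is set.
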
### 2. ONIG_SYN_ALLOW_UNMATCHED_CLOSE_SUBEXP (allow `...)...`)

_Set in: PosixExtended_

This flag, if set, causes a `)` character without a preceding `(` to be treated as
a literal `)`, equivalent to `\)`.  If this flag is clear, then an unmatched `)`
character will produce an error message.

### 3. ONIG_SYN_ALLOW_INVALID_INTERVAL (allow `{???`)

_Set in: GnuRegex, Java, Perl, Perl_NG, Ruby, Oniguruma_

This flag, if set, causes an invalid interval, like `foo{bar}` or `foo{}`, to be
accepted, with the malformed `{...}` taken as literal characters rather than as a
repeat operator.  If clear, an invalid interval will produce an error message.

### 4. ONIG_SYN_ALLOW_INTERVAL_LOW_ABBREV (allow `{,n}` to mean `{0,n}`)

_Set in: Ruby, Oniguruma_

If this flag is set, then `r{,n}` will be treated as equivalent to writing
`r{0,n}`.  If this flag is clear, then `r{,n}` will produce an error message.

Note that regardless of whether this flag is set or clear, if
ONIG_SYN_OP_BRACE_INTERVAL is enabled, then `r{n,}` will always be legal:  This
flag *only* controls the behavior of the opposite form, `r{,n}`.

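The asymmetry between `{n,}` and `{,n}` can be made concrete with a toy interval
parser.  This is a sketch of the rule as stated above, not Oniguruma's own code:
`parse_interval` and `INTERVAL_INFINITE` are hypothetical, the fallback `#define`
assumes ID 4 is bit 4, and overflow and `lo > hi` checks are left out.

```c
#include <ctype.h>

#ifndef ONIG_SYN_ALLOW_INTERVAL_LOW_ABBREV
#define ONIG_SYN_ALLOW_INTERVAL_LOW_ABBREV  (1U << 4)  /* assumed value */
#endif

#define INTERVAL_INFINITE (-1)  /* hypothetical marker for an open upper bound */

/* Parse "{n}", "{n,}", "{n,m}" or "{,m}" at the start of `s`.
   Returns 0 and fills *lo and *hi on success, -1 on error. */
static int parse_interval(const char *s, unsigned int behavior,
                          int *lo, int *hi)
{
  int have_lo = 0, have_hi = 0, comma = 0;
  int a = 0, b = 0;

  if (*s++ != '{') return -1;
  while (isdigit((unsigned char)*s)) { a = a * 10 + (*s++ - '0'); have_lo = 1; }
  if (*s == ',') {
    comma = 1;
    s++;
    while (isdigit((unsigned char)*s)) { b = b * 10 + (*s++ - '0'); have_hi = 1; }
  }
  if (*s != '}') return -1;

  if (!have_lo) {
    /* "{}" and "{,}" are rejected outright in this sketch. */
    if (!comma || !have_hi) return -1;
    /* "{,m}" needs the abbreviation flag; with it, the lower bound is 0. */
    if ((behavior & ONIG_SYN_ALLOW_INTERVAL_LOW_ABBREV) == 0) return -1;
    a = 0;
  }
  *lo = a;
  *hi = !comma ? a : (have_hi ? b : INTERVAL_INFINITE);
  return 0;
}
```

With the flag set, `"{,3}"` parses as the bounds 0 and 3; with it clear, the same input
is an error, while `"{2,}"` parses in both cases.
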
### 5. ONIG_SYN_STRICT_CHECK_BACKREF (error on invalid backrefs)

_Set in: none_

If this flag is set, an invalid backref, like `\1` in a pattern with no captures,
will produce an error.  If this flag is clear, then an invalid backref will be
equivalent to the empty string.

No built-in syntax has this flag enabled.

### 6. ONIG_SYN_DIFFERENT_LEN_ALT_LOOK_BEHIND (allow `(?<=a|bc)`)

_Set in: Java, Ruby, Oniguruma_

If this flag is set, the alternatives in a lookbehind pattern may have different
lengths.  If this flag is clear, every alternative in a lookbehind pattern must have
the same length.

Oniguruma can handle either form, but not all regex engines can, so for compatibility,
leaving this flag clear makes Oniguruma reject patterns that those engines could not
handle.

### 7. ONIG_SYN_CAPTURE_ONLY_NAMED_GROUP (prefer `\k<name>` over `\3`)

_Set in: Perl_NG, Ruby, Oniguruma_

If this flag is set on the syntax *and* ONIG_OPTION_CAPTURE_GROUP is set when calling
Oniguruma, then if a name is used on any capture, all captures must also use names:  A
single use of a named capture prohibits the use of numbered captures.

### 8. ONIG_SYN_ALLOW_MULTIPLEX_DEFINITION_NAME (allow `(?<x>)...(?<x>)`)

_Set in: Perl_NG, Ruby, Oniguruma_

If this flag is set, multiple capture groups may use the same name.  If this flag is
clear, then reuse of a name will produce an error message.

### 9. ONIG_SYN_FIXED_INTERVAL_IS_GREEDY_ONLY (`a{n}?` is equivalent to `(?:a{n})?`)

_Set in: Ruby, Oniguruma_

If this flag is set, then intervals of a fixed size will ignore a lazy (non-greedy)
`?` quantifier and treat it as an optional match (an ordinary `r?`), since "match as
little as possible" is meaningless for a fixed-size interval.  If this flag is clear,
then `r{n}?` will mean the same as `r{n}`, and the useless `?` will be discarded.

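The two readings of `a{n}?` differ only in whether zero repetitions are accepted.  A
minimal sketch of that difference follows; `fixed_interval_qmark_accepts` is a
hypothetical helper, and the fallback `#define` assumes ID 9 is bit 9.

```c
#ifndef ONIG_SYN_FIXED_INTERVAL_IS_GREEDY_ONLY
#define ONIG_SYN_FIXED_INTERVAL_IS_GREEDY_ONLY  (1U << 9)  /* assumed value */
#endif

/* Does a run of exactly `count` copies of `a` match the whole of `a{n}?`
   under the given `behavior` word?  Returns 1 for yes, 0 for no. */
static int fixed_interval_qmark_accepts(int n, int count, unsigned int behavior)
{
  if ((behavior & ONIG_SYN_FIXED_INTERVAL_IS_GREEDY_ONLY) != 0)
    return count == 0 || count == n;  /* read as (?:a{n})? */
  return count == n;                  /* read as a{n}; the `?` is dropped */
}
```

For `a{3}?`, the empty string is accepted only when the flag is set; `aaa` is accepted
either way.
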
### 20. ONIG_SYN_NOT_NEWLINE_IN_NEGATIVE_CC (add `\n` to `[^...]`)

_Set in: Grep_

If this flag is set, all newline characters (like `\n`) will be excluded from a negative
character class automatically, as if the pattern had been written as `[^...\n]`.  If this
flag is clear, negative character classes do not automatically exclude newlines, and
only exclude those characters and ranges written in them.

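As an illustration of the rule, the sketch below tests a character against a negated
class made only of literal characters.  `negated_class_matches` is a hypothetical
helper, ranges and escapes are deliberately left out, and the fallback `#define`
assumes ID 20 is bit 20.

```c
#include <assert.h>
#include <string.h>

#ifndef ONIG_SYN_NOT_NEWLINE_IN_NEGATIVE_CC
#define ONIG_SYN_NOT_NEWLINE_IN_NEGATIVE_CC  (1U << 20)  /* assumed value */
#endif

/* Does the negated class [^members] match the character `c`?
   `members` lists the literal characters written inside the class.
   `c` must not be NUL, because strchr() would find the terminator. */
static int negated_class_matches(const char *members, char c,
                                 unsigned int behavior)
{
  assert(c != '\0');
  if (c == '\n' && (behavior & ONIG_SYN_NOT_NEWLINE_IN_NEGATIVE_CC) != 0)
    return 0;                         /* newline is implicitly excluded */
  return strchr(members, c) == NULL;  /* matches anything not listed */
}
```

With the flag set, `[^abc]` no longer matches `\n`; every other result is unchanged.
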
### 21. ONIG_SYN_BACKSLASH_ESCAPE_IN_CC (allow `[...\w...]`)

_Set in: GnuRegex, Java, Perl, Perl_NG, Ruby, Oniguruma_

If this flag is set, shorthands like `\w` are allowed to describe characters in character
classes.  If this flag is clear, shorthands like `\w` are treated as a redundantly-escaped
literal `w`.

### 22. ONIG_SYN_ALLOW_EMPTY_RANGE_IN_CC (silently discard `[z-a]`)

_Set in: Emacs, Grep_

If this flag is set, then character ranges like `[z-a]` that are broken or contain no
characters will be silently ignored.  If this flag is clear, then broken or empty
character ranges will produce an error message.

### 23. ONIG_SYN_ALLOW_DOUBLE_RANGE_OP_IN_CC (treat `[0-9-a]` as `[0-9\-a]`)

_Set in: PosixExtended, GnuRegex, Java, Perl, Perl_NG, Ruby, Oniguruma_

If this flag is set, then a trailing `-` after a character range will be taken as a
literal `-`, as if it had been escaped as `\-`.  If this flag is clear, then a trailing
`-` after a character range will produce an error message.

### 24. ONIG_SYN_WARN_CC_OP_NOT_ESCAPED (warn on `[[...]` and `[-x]`)

_Set in: Ruby, Oniguruma_

If this flag is set, Oniguruma will be stricter about warning for bad forms in
character classes:  `[[...]` will produce a warning, but `[\[...]` will not;
`[-x]` will produce a warning, but `[\-x]` will not; `[x&&-y]` will produce a warning,
while `[x&&\-y]` will not; and so on.  If this flag is clear, all of these warnings
will be silently discarded.

### 25. ONIG_SYN_WARN_REDUNDANT_NESTED_REPEAT (warn on `(?:a*)+`)

_Set in: Ruby, Oniguruma_

If this flag is set, Oniguruma will warn about nested repeat operators that have no
meaning, like `(?:a*)+`.  If this flag is clear, Oniguruma will allow the nested repeat
operators without warning about them.

### 26. ONIG_SYN_ALLOW_INVALID_CODE_END_OF_RANGE_IN_CC (allow `[a-\x{7fffffff}]`)

_Set in: Oniguruma_

If this flag is set, then an invalid code point is allowed at the end of a range in a
character class.

### 31. ONIG_SYN_CONTEXT_INDEP_ANCHORS

_Set in: PosixExtended, GnuRegex, Java, Perl, Perl_NG, Ruby, Oniguruma_

Not currently used, and does nothing.  (But still set in several syntaxes for some
reason.)

----------

## Usage tables

These tables show which of the built-in syntaxes use which flags and options, for easy comparison between them.

### Group One Flags (op)

| ID    | Option                                        | PosB  | PosEx | Emacs | Grep  | Gnu   | Java  | Perl  | PeNG  | Ruby  | Onig  |
| ----- | --------------------------------------------- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- |
| 0     | `ONIG_SYN_OP_VARIABLE_META_CHARACTERS`        | -     | -     | -     | -     | -     | -     | -     | -     | -     | -     |
| 1     | `ONIG_SYN_OP_DOT_ANYCHAR`                     | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   |
| 2     | `ONIG_SYN_OP_ASTERISK_ZERO_INF`               | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   |
| 3     | `ONIG_SYN_OP_ESC_ASTERISK_ZERO_INF`           | -     | -     | -     | -     | -     | -     | -     | -     | -     | -     |
| 4     | `ONIG_SYN_OP_PLUS_ONE_INF`                    | -     | Yes   | Yes   | -     | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   |
| 5     | `ONIG_SYN_OP_ESC_PLUS_ONE_INF`                | -     | -     | -     | Yes   | -     | -     | -     | -     | -     | -     |
| 6     | `ONIG_SYN_OP_QMARK_ZERO_ONE`                  | -     | Yes   | Yes   | -     | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   |
| 7     | `ONIG_SYN_OP_ESC_QMARK_ZERO_ONE`              | -     | -     | -     | Yes   | -     | -     | -     | -     | -     | -     |
| 8     | `ONIG_SYN_OP_BRACE_INTERVAL`                  | -     | Yes   | -     | -     | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   |
| 9     | `ONIG_SYN_OP_ESC_BRACE_INTERVAL`              | Yes   | -     | Yes   | Yes   | -     | -     | -     | -     | -     | -     |
| 10    | `ONIG_SYN_OP_VBAR_ALT`                        | -     | Yes   | -     | -     | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   |
| 11    | `ONIG_SYN_OP_ESC_VBAR_ALT`                    | -     | -     | Yes   | Yes   | -     | -     | -     | -     | -     | -     |
| 12    | `ONIG_SYN_OP_LPAREN_SUBEXP`                   | -     | Yes   | -     | -     | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   |
| 13    | `ONIG_SYN_OP_ESC_LPAREN_SUBEXP`               | Yes   | -     | Yes   | Yes   | -     | -     | -     | -     | -     | -     |
| 14    | `ONIG_SYN_OP_ESC_AZ_BUF_ANCHOR`               | -     | -     | -     | -     | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   |
| 15    | `ONIG_SYN_OP_ESC_CAPITAL_G_BEGIN_ANCHOR`      | -     | -     | -     | -     | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   |
| 16    | `ONIG_SYN_OP_DECIMAL_BACKREF`                 | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   |
| 17    | `ONIG_SYN_OP_BRACKET_CC`                      | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   |
| 18    | `ONIG_SYN_OP_ESC_W_WORD`                      | -     | -     | -     | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   |
| 19    | `ONIG_SYN_OP_ESC_LTGT_WORD_BEGIN_END`         | -     | -     | -     | Yes   | Yes   | -     | -     | -     | -     | -     |
| 20    | `ONIG_SYN_OP_ESC_B_WORD_BOUND`                | -     | -     | -     | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   |
| 21    | `ONIG_SYN_OP_ESC_S_WHITE_SPACE`               | -     | -     | -     | -     | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   |
| 22    | `ONIG_SYN_OP_ESC_D_DIGIT`                     | -     | -     | -     | -     | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   |
| 23    | `ONIG_SYN_OP_LINE_ANCHOR`                     | -     | -     | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   |
| 24    | `ONIG_SYN_OP_POSIX_BRACKET`                   | Yes   | Yes   | Yes   | -     | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   |
| 25    | `ONIG_SYN_OP_QMARK_NON_GREEDY`                | -     | -     | -     | -     | -     | Yes   | Yes   | Yes   | Yes   | Yes   |
| 26    | `ONIG_SYN_OP_ESC_CONTROL_CHARS`               | Yes   | Yes   | -     | -     | -     | Yes   | Yes   | Yes   | Yes   | Yes   |
| 27    | `ONIG_SYN_OP_ESC_C_CONTROL`                   | -     | -     | -     | -     | -     | Yes   | Yes   | Yes   | Yes   | Yes   |
| 28    | `ONIG_SYN_OP_ESC_OCTAL3`                      | -     | -     | -     | -     | -     | Yes   | Yes   | Yes   | Yes   | Yes   |
| 29    | `ONIG_SYN_OP_ESC_X_HEX2`                      | -     | -     | -     | -     | -     | Yes   | Yes   | Yes   | Yes   | Yes   |
| 30    | `ONIG_SYN_OP_ESC_X_BRACE_HEX8`                | -     | -     | -     | -     | -     | -     | Yes   | Yes   | Yes   | Yes   |
| 31    | `ONIG_SYN_OP_ESC_O_BRACE_OCTAL`               | -     | -     | -     | -     | -     | -     | Yes   | Yes   | Yes   | Yes   |

### Group Two Flags (op2)

| ID    | Option                                        | PosB  | PosEx | Emacs | Grep  | Gnu   | Java  | Perl  | PeNG  | Ruby  | Onig  |
| ----- | --------------------------------------------- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- |
| 0     | `ONIG_SYN_OP2_ESC_CAPITAL_Q_QUOTE`            | -     | -     | -     | -     | -     | Yes   | Yes   | Yes   | -     | -     |
| 1     | `ONIG_SYN_OP2_QMARK_GROUP_EFFECT`             | -     | -     | -     | -     | -     | Yes   | Yes   | Yes   | Yes   | Yes   |
| 2     | `ONIG_SYN_OP2_OPTION_PERL`                    | -     | -     | -     | -     | -     | Yes   | Yes   | Yes   | -     | -     |
| 3     | `ONIG_SYN_OP2_OPTION_RUBY`                    | -     | -     | -     | -     | -     | -     | -     | -     | Yes   | -     |
| 4     | `ONIG_SYN_OP2_PLUS_POSSESSIVE_REPEAT`         | -     | -     | -     | -     | -     | -     | -     | -     | Yes   | Yes   |
| 5     | `ONIG_SYN_OP2_PLUS_POSSESSIVE_INTERVAL`       | -     | -     | -     | -     | -     | Yes   | -     | -     | -     | -     |
| 6     | `ONIG_SYN_OP2_CCLASS_SET_OP`                  | -     | -     | -     | -     | -     | -     | -     | Yes   | Yes   | Yes   |
| 7     | `ONIG_SYN_OP2_QMARK_LT_NAMED_GROUP`           | -     | -     | -     | -     | -     | -     | -     | Yes   | Yes   | Yes   |
| 8     | `ONIG_SYN_OP2_ESC_K_NAMED_BACKREF`            | -     | -     | -     | -     | -     | -     | -     | Yes   | Yes   | Yes   |
| 9     | `ONIG_SYN_OP2_ESC_G_SUBEXP_CALL`              | -     | -     | -     | -     | -     | -     | -     | Yes   | Yes   | Yes   |
| 10    | `ONIG_SYN_OP2_ATMARK_CAPTURE_HISTORY`         | -     | -     | -     | -     | -     | -     | -     | -     | -     | -     |
| 11    | `ONIG_SYN_OP2_ESC_CAPITAL_C_BAR_CONTROL`      | -     | -     | -     | -     | -     | -     | -     | -     | Yes   | Yes   |
| 12    | `ONIG_SYN_OP2_ESC_CAPITAL_M_BAR_META`         | -     | -     | -     | -     | -     | -     | -     | -     | Yes   | Yes   |
| 13    | `ONIG_SYN_OP2_ESC_V_VTAB`                     | -     | -     | -     | -     | -     | Yes   | -     | -     | Yes   | Yes   |
| 14    | `ONIG_SYN_OP2_ESC_U_HEX4`                     | -     | -     | -     | -     | -     | Yes   | -     | -     | Yes   | Yes   |
| 15    | `ONIG_SYN_OP2_ESC_GNU_BUF_ANCHOR`             | -     | -     | Yes   | -     | -     | -     | -     | -     | -     | -     |
| 16    | `ONIG_SYN_OP2_ESC_P_BRACE_CHAR_PROPERTY`      | -     | -     | -     | -     | -     | Yes   | Yes   | Yes   | Yes   | Yes   |
| 17    | `ONIG_SYN_OP2_ESC_P_BRACE_CIRCUMFLEX_NOT`     | -     | -     | -     | -     | -     | -     | Yes   | Yes   | Yes   | Yes   |
| 18    | `ONIG_SYN_OP2_CHAR_PROPERTY_PREFIX_IS`        | -     | -     | -     | -     | -     | -     | -     | -     | -     | -     |
| 19    | `ONIG_SYN_OP2_ESC_H_XDIGIT`                   | -     | -     | -     | -     | -     | -     | -     | -     | Yes   | Yes   |
| 20    | `ONIG_SYN_OP2_INEFFECTIVE_ESCAPE`             | -     | -     | -     | -     | -     | -     | -     | -     | -     | -     |
| 21    | `ONIG_SYN_OP2_QMARK_LPAREN_IF_ELSE`           | -     | -     | -     | -     | -     | -     | Yes   | Yes   | Yes   | Yes   |
| 22    | `ONIG_SYN_OP2_ESC_CAPITAL_K_KEEP`             | -     | -     | -     | -     | -     | -     | Yes   | Yes   | Yes   | Yes   |
| 23    | `ONIG_SYN_OP2_ESC_CAPITAL_R_GENERAL_NEWLINE`  | -     | -     | -     | -     | -     | -     | Yes   | Yes   | Yes   | Yes   |
| 24    | `ONIG_SYN_OP2_ESC_CAPITAL_N_O_SUPER_DOT`      | -     | -     | -     | -     | -     | -     | Yes   | Yes   | -     | Yes   |
| 25    | `ONIG_SYN_OP2_QMARK_TILDE_ABSENT_GROUP`       | -     | -     | -     | -     | -     | -     | -     | -     | Yes   | Yes   |
| 26    | `ONIG_SYN_OP2_ESC_X_Y_TEXT_SEGMENT`           | -     | -     | -     | -     | -     | -     | Yes   | Yes   | Yes   | Yes   |
| 27    | `ONIG_SYN_OP2_QMARK_PERL_SUBEXP_CALL`         | -     | -     | -     | -     | -     | -     | -     | Yes   | -     | -     |
| 28    | `ONIG_SYN_OP2_QMARK_BRACE_CALLOUT_CONTENTS`   | -     | -     | -     | -     | -     | -     | Yes   | Yes   | -     | Yes   |
| 29    | `ONIG_SYN_OP2_ASTERISK_CALLOUT_NAME`          | -     | -     | -     | -     | -     | -     | Yes   | Yes   | -     | Yes   |
| 30    | `ONIG_SYN_OP2_OPTION_ONIGURUMA`               | -     | -     | -     | -     | -     | -     | -     | -     | -     | Yes   |

### Syntax Flags (syn)

| ID    | Option                                        | PosB  | PosEx | Emacs | Grep  | Gnu   | Java  | Perl  | PeNG  | Ruby  | Onig  |
| ----- | --------------------------------------------- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- |
| 0     | `ONIG_SYN_CONTEXT_INDEP_REPEAT_OPS`           | -     | Yes   | -     | -     | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   |
| 1     | `ONIG_SYN_CONTEXT_INVALID_REPEAT_OPS`         | -     | Yes   | -     | -     | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   |
| 2     | `ONIG_SYN_ALLOW_UNMATCHED_CLOSE_SUBEXP`       | -     | Yes   | -     | -     | -     | -     | -     | -     | -     | -     |
| 3     | `ONIG_SYN_ALLOW_INVALID_INTERVAL`             | -     | -     | -     | -     | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   |
| 4     | `ONIG_SYN_ALLOW_INTERVAL_LOW_ABBREV`          | -     | -     | -     | -     | -     | -     | -     | -     | Yes   | Yes   |
| 5     | `ONIG_SYN_STRICT_CHECK_BACKREF`               | -     | -     | -     | -     | -     | -     | -     | -     | -     | -     |
| 6     | `ONIG_SYN_DIFFERENT_LEN_ALT_LOOK_BEHIND`      | -     | -     | -     | -     | -     | Yes   | -     | -     | Yes   | Yes   |
| 7     | `ONIG_SYN_CAPTURE_ONLY_NAMED_GROUP`           | -     | -     | -     | -     | -     | -     | -     | Yes   | Yes   | Yes   |
| 8     | `ONIG_SYN_ALLOW_MULTIPLEX_DEFINITION_NAME`    | -     | -     | -     | -     | -     | -     | -     | Yes   | Yes   | Yes   |
| 9     | `ONIG_SYN_FIXED_INTERVAL_IS_GREEDY_ONLY`      | -     | -     | -     | -     | -     | -     | -     | -     | Yes   | Yes   |
| 20    | `ONIG_SYN_NOT_NEWLINE_IN_NEGATIVE_CC`         | -     | -     | -     | Yes   | -     | -     | -     | -     | -     | -     |
| 21    | `ONIG_SYN_BACKSLASH_ESCAPE_IN_CC`             | -     | -     | -     | -     | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   |
| 22    | `ONIG_SYN_ALLOW_EMPTY_RANGE_IN_CC`            | -     | -     | Yes   | Yes   | -     | -     | -     | -     | -     | -     |
| 23    | `ONIG_SYN_ALLOW_DOUBLE_RANGE_OP_IN_CC`        | -     | Yes   | -     | -     | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   |
| 24    | `ONIG_SYN_WARN_CC_OP_NOT_ESCAPED`             | -     | -     | -     | -     | -     | -     | -     | -     | Yes   | Yes   |
| 25    | `ONIG_SYN_WARN_REDUNDANT_NESTED_REPEAT`       | -     | -     | -     | -     | -     | -     | -     | -     | Yes   | Yes   |
| 26    | `ONIG_SYN_ALLOW_INVALID_CODE_END_OF_RANGE_IN_CC` | -     | -     | -     | -     | -     | -     | -     | -     | -     | Yes   |
| 31    | `ONIG_SYN_CONTEXT_INDEP_ANCHORS`              | -     | Yes   | -     | -     | Yes   | Yes   | Yes   | Yes   | Yes   | Yes   |

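A column of these tables can be turned back into a field value by OR-ing together its
"Yes" rows.  The sketch below does this for the Grep column of the Syntax Flags table.
It assumes that each ID is the flag's bit position within its 32-bit field (the
fallback `#define`s encode that assumption); `grep_behavior` and `syn_flag_is_set` are
hypothetical names used only for illustration.

```c
#ifndef ONIG_SYN_NOT_NEWLINE_IN_NEGATIVE_CC
/* Assumed values, used only when oniguruma.h is not included. */
#define ONIG_SYN_NOT_NEWLINE_IN_NEGATIVE_CC  (1U << 20)
#define ONIG_SYN_ALLOW_EMPTY_RANGE_IN_CC     (1U << 22)
#endif

/* The two "Yes" cells in the Grep column of the Syntax Flags table. */
static const unsigned int grep_behavior =
    ONIG_SYN_NOT_NEWLINE_IN_NEGATIVE_CC | ONIG_SYN_ALLOW_EMPTY_RANGE_IN_CC;

/* Look up one cell of a table: is flag number `id` set in `field`? */
static int syn_flag_is_set(unsigned int field, unsigned int id)
{
  return id < 32 && ((field >> id) & 1U) != 0;
}
```

Under that assumption `grep_behavior` is `0x00500000`, and `syn_flag_is_set` reports
IDs 20 and 22 as set and every other ID as clear.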