README  2018/04/05

Oniguruma  ----   (C) K.Kosako

https://github.com/kkos/oniguruma

FIXED Security Issues (in Oniguruma 6.3.0):
  CVE-2017-9224, CVE-2017-9225, CVE-2017-9226
  CVE-2017-9227, CVE-2017-9228, CVE-2017-9229

---
Oniguruma is a modern and flexible regular expression library. It
encompasses features from different regular expression implementations
that traditionally exist in different languages. It comes close to
being a complete superset of all regular expression features found
in other regular expression implementations.

Its features include:
* Character encoding can be specified per regular expression object.
* Several regular expression types are supported:
  * POSIX
  * Grep
  * GNU Regex
  * Perl
  * Java
  * Ruby
  * Emacs

Supported character encodings:

  ASCII, UTF-8, UTF-16BE, UTF-16LE, UTF-32BE, UTF-32LE,
  EUC-JP, EUC-TW, EUC-KR, EUC-CN,
  Shift_JIS, Big5, GB18030, KOI8-R, CP1251,
  ISO-8859-1, ISO-8859-2, ISO-8859-3, ISO-8859-4, ISO-8859-5,
  ISO-8859-6, ISO-8859-7, ISO-8859-8, ISO-8859-9, ISO-8859-10,
  ISO-8859-11, ISO-8859-13, ISO-8859-14, ISO-8859-15, ISO-8859-16

* GB18030: contributed by KUBO Takehiro
* CP1251:  contributed by Byte
------------------------------------------------------------

License

   BSD license.


Install

 Case 1: Unix and Cygwin platform

   1. autoreconf -vfi   (only needed if the configure script is not found)

   2. ./configure
   3. make
   4. make install

   * uninstall

     make uninstall

   * configuration check

     onig-config --cflags
     onig-config --libs
     onig-config --prefix
     onig-config --exec-prefix



 Case 2: Windows 64/32bit platform (Visual Studio)

   Run make_win64 or make_win32, which produces:

      src/onig_s.lib:  static link library
      src/onig.dll:    dynamic link library

  * test (ASCII/Shift_JIS)
      1. cd src
      2. copy ..\windows\testc.c .
      3. nmake -f Makefile.windows ctest

  (Tested with Visual Studio Community 2015.)



Regular Expressions

  See doc/RE (or doc/RE.ja for Japanese).


Usage

  Include oniguruma.h in your program. (Oniguruma API)
  See doc/API for the Oniguruma API.

  To disable the definition of the UChar type (== unsigned char)
  in oniguruma.h, define ONIG_ESCAPE_UCHAR_COLLISION before
  including oniguruma.h.

  To disable the definition of the regex_t type in oniguruma.h,
  define ONIG_ESCAPE_REGEX_T_COLLISION before including oniguruma.h.

  Example of a compile/link command line on Unix or Cygwin
  (assuming prefix == /usr/local):

    cc sample.c -L/usr/local/lib -lonig


  To use the static link library (onig_s.lib) on Win32,
  add the option -DONIG_EXTERN=extern to the C compiler.



Sample Programs

  sample/simple.c    minimal example (Oniguruma API)
  sample/names.c     example of the named group callback.
  sample/encode.c    example of several encodings.
  sample/listcap.c   example of the capture history.
  sample/posix.c     POSIX API sample.
  sample/sql.c       example of the variable meta characters.
                     (SQL-like pattern matching)
  sample/user_property.c  example of a user-defined Unicode property.
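
  For orientation before reading sample/posix.c, here is a rough sketch
  of the POSIX-style calls (regcomp, regexec, regerror, regfree) that
  onigposix.h declares. USE_ONIGPOSIX is a hypothetical macro invented
  for this sketch so that it also builds against the system <regex.h>
  when Oniguruma is not installed. posix_find() is likewise a
  hypothetical helper. sample/posix.c remains the authoritative example
  and may perform additional setup, such as selecting an encoding.

```c
/* Sketch only: POSIX-style matching.
 * Define USE_ONIGPOSIX (hypothetical, not an Oniguruma macro) to build
 * against Oniguruma's onigposix.h and link with -lonig; otherwise the
 * system <regex.h> is used. */
#include <assert.h>
#include <stdio.h>
#ifdef USE_ONIGPOSIX
#include <onigposix.h>
#else
#include <regex.h>
#endif

/* Returns 1 and stores the match offsets in *so / *eo,
 * 0 if nothing matched, -1 on error. */
int posix_find(const char *pattern, const char *text, int *so, int *eo)
{
    regex_t reg;
    regmatch_t pm[1];
    char msg[128];
    int r;

    assert(pattern != NULL && text != NULL && so != NULL && eo != NULL);

    r = regcomp(&reg, pattern, REG_EXTENDED);
    if (r != 0) {
        regerror(r, &reg, msg, sizeof(msg));
        fprintf(stderr, "regcomp: %s\n", msg);
        return -1;
    }

    r = regexec(&reg, text, 1, pm, 0);
    if (r == 0) {
        *so = (int)pm[0].rm_so;  /* start offset of the whole match */
        *eo = (int)pm[0].rm_eo;  /* offset just past the match      */
    } else if (r != REG_NOMATCH) {
        regerror(r, &reg, msg, sizeof(msg));
        fprintf(stderr, "regexec: %s\n", msg);
    }

    regfree(&reg);
    return r == 0 ? 1 : (r == REG_NOMATCH ? 0 : -1);
}
```

  For example, posix_find("b+", "aabbbc", &so, &eo) reports a match
  from offset 2 to offset 5.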

Test Programs

  sample/syntax.c    Perl, Java and ASIS syntax test.
  sample/crnl.c      --enable-crnl-as-line-terminator test


Source Files

  oniguruma.h        Oniguruma API header file. (public)
  onig-config.in     configuration check program template.

  regenc.h           character encodings framework header file.
  regint.h           internal definitions
  regparse.h         internal definitions for regparse.c and regcomp.c
  regcomp.c          compiling and optimization functions
  regenc.c           character encodings framework.
  regerror.c         error message function
  regext.c           extended API functions. (deluxe version API)
  regexec.c          search and match functions
  regparse.c         parsing functions.
  regsyntax.c        pattern syntax functions and built-in syntax definitions.
  regtrav.c          capture history tree data traverse functions.
  regversion.c       version info function.
  st.h               hash table functions header file
  st.c               hash table functions

  oniggnu.h          GNU regex API header file. (public)
  reggnu.c           GNU regex API functions

  onigposix.h        POSIX API header file. (public)
  regposerr.c        POSIX error message function.
  regposix.c         POSIX API functions.

  mktable.c          character type table generator.
  ascii.c            ASCII encoding.
  euc_jp.c           EUC-JP encoding.
  euc_tw.c           EUC-TW encoding.
  euc_kr.c           EUC-KR, EUC-CN encoding.
  sjis.c             Shift_JIS encoding.
  big5.c             Big5      encoding.
  gb18030.c          GB18030   encoding.
  koi8.c             KOI8      encoding.
  koi8_r.c           KOI8-R    encoding.
  cp1251.c           CP1251    encoding.
  iso8859_1.c        ISO-8859-1  encoding. (Latin-1)
  iso8859_2.c        ISO-8859-2  encoding. (Latin-2)
  iso8859_3.c        ISO-8859-3  encoding. (Latin-3)
  iso8859_4.c        ISO-8859-4  encoding. (Latin-4)
  iso8859_5.c        ISO-8859-5  encoding. (Cyrillic)
  iso8859_6.c        ISO-8859-6  encoding. (Arabic)
  iso8859_7.c        ISO-8859-7  encoding. (Greek)
  iso8859_8.c        ISO-8859-8  encoding. (Hebrew)
  iso8859_9.c        ISO-8859-9  encoding. (Latin-5 or Turkish)
  iso8859_10.c       ISO-8859-10 encoding. (Latin-6 or Nordic)
  iso8859_11.c       ISO-8859-11 encoding. (Thai)
  iso8859_13.c       ISO-8859-13 encoding. (Latin-7 or Baltic Rim)
  iso8859_14.c       ISO-8859-14 encoding. (Latin-8 or Celtic)
  iso8859_15.c       ISO-8859-15 encoding. (Latin-9 or West European with Euro)
  iso8859_16.c       ISO-8859-16 encoding.
                     (Latin-10 or South-Eastern European with Euro)
  utf8.c             UTF-8    encoding.
  utf16_be.c         UTF-16BE encoding.
  utf16_le.c         UTF-16LE encoding.
  utf32_be.c         UTF-32BE encoding.
  utf32_le.c         UTF-32LE encoding.
  unicode.c          common codes of Unicode encoding.

  win32/Makefile     Makefile for Win32 (VC++)
  win32/config.h     config.h for Win32


I'm thankful to Akinori MUSHA.
