README  2007/05/31

Oniguruma  ----   (C) K.Kosako <sndgk393 AT ybb DOT ne DOT jp>

http://www.geocities.jp/kosako3/oniguruma/

Oniguruma is a regular expressions library.
Its distinguishing characteristic is that a different character encoding
can be specified for each regular expression object.

Supported character encodings:

  ASCII, UTF-8, UTF-16BE, UTF-16LE, UTF-32BE, UTF-32LE,
  EUC-JP, EUC-TW, EUC-KR, EUC-CN,
  Shift_JIS, Big5, GB18030, KOI8-R, CP1251,
  ISO-8859-1, ISO-8859-2, ISO-8859-3, ISO-8859-4, ISO-8859-5,
  ISO-8859-6, ISO-8859-7, ISO-8859-8, ISO-8859-9, ISO-8859-10,
  ISO-8859-11, ISO-8859-13, ISO-8859-14, ISO-8859-15, ISO-8859-16

* GB18030: contributed by KUBO Takehiro
* CP1251:  contributed by Byte
------------------------------------------------------------

License

   BSD license.


Install

 Case 1: Unix and Cygwin platforms

   1. ./configure
   2. make
   3. make install

   * uninstall

     make uninstall

   * test (ASCII/EUC-JP)

     make atest

   * configuration check

     onig-config --cflags
     onig-config --libs
     onig-config --prefix
     onig-config --exec-prefix



 Case 2: Win32 platform (VC++)

   1. copy win32\Makefile Makefile
   2. copy win32\config.h config.h
   3. nmake

      onig_s.lib:  static link library
      onig.dll:    dynamic link library

   * test (ASCII/Shift_JIS)

     4. copy win32\testc.c testc.c
     5. nmake ctest



Regular Expressions

  See doc/RE (or doc/RE.ja for Japanese).


Usage

  Include oniguruma.h in your program. (Oniguruma API)
  See doc/API for the Oniguruma API.

  To disable the definition of the UChar type (== unsigned char)
  in oniguruma.h, define ONIG_ESCAPE_UCHAR_COLLISION before
  including oniguruma.h.

  To disable the definition of the regex_t type in oniguruma.h,
  define ONIG_ESCAPE_REGEX_T_COLLISION before including oniguruma.h.

  Example of the compile/link command line on Unix or Cygwin
  (when prefix == /usr/local):

    cc sample.c -L/usr/local/lib -lonig


  To use the static link library (onig_s.lib) on Win32,
  add the option -DONIG_EXTERN=extern to the C compiler.



Sample Programs

  sample/simple.c    minimal example (Oniguruma API)
  sample/names.c     example of the named group callback.
  sample/encode.c    example of some encodings.
  sample/listcap.c   example of the capture history.
  sample/posix.c     POSIX API sample.
  sample/sql.c       example of the variable meta characters.
                     (SQL-like pattern matching)

Test Programs
  sample/syntax.c    Perl, Java and ASIS syntax test.
  sample/crnl.c      --enable-crnl-as-line-terminator test


Source Files

  oniguruma.h        Oniguruma API header file. (public)
  onig-config.in     configuration check program template.

  regenc.h           character encodings framework header file.
  regint.h           internal definitions
  regparse.h         internal definitions for regparse.c and regcomp.c
  regcomp.c          compiling and optimization functions
  regenc.c           character encodings framework.
  regerror.c         error message function
  regext.c           extended API functions. (deluxe version API)
  regexec.c          search and match functions
  regparse.c         parsing functions.
  regsyntax.c        pattern syntax functions and built-in syntax definitions.
  regtrav.c          capture history tree data traverse functions.
  regversion.c       version info function.
  st.h               hash table functions header file
  st.c               hash table functions

  oniggnu.h          GNU regex API header file. (public)
  reggnu.c           GNU regex API functions

  onigposix.h        POSIX API header file. (public)
  regposerr.c        POSIX error message function.
  regposix.c         POSIX API functions.

  enc/mktable.c      character type table generator.
  enc/ascii.c        ASCII encoding.
  enc/euc_jp.c       EUC-JP encoding.
  enc/euc_tw.c       EUC-TW encoding.
  enc/euc_kr.c       EUC-KR, EUC-CN encoding.
  enc/sjis.c         Shift_JIS encoding.
  enc/big5.c         Big5      encoding.
  enc/gb18030.c      GB18030   encoding.
  enc/koi8.c         KOI8      encoding.
  enc/koi8_r.c       KOI8-R    encoding.
  enc/cp1251.c       CP1251    encoding.
  enc/iso8859_1.c    ISO-8859-1  encoding. (Latin-1)
  enc/iso8859_2.c    ISO-8859-2  encoding. (Latin-2)
  enc/iso8859_3.c    ISO-8859-3  encoding. (Latin-3)
  enc/iso8859_4.c    ISO-8859-4  encoding. (Latin-4)
  enc/iso8859_5.c    ISO-8859-5  encoding. (Cyrillic)
  enc/iso8859_6.c    ISO-8859-6  encoding. (Arabic)
  enc/iso8859_7.c    ISO-8859-7  encoding. (Greek)
  enc/iso8859_8.c    ISO-8859-8  encoding. (Hebrew)
  enc/iso8859_9.c    ISO-8859-9  encoding. (Latin-5 or Turkish)
  enc/iso8859_10.c   ISO-8859-10 encoding. (Latin-6 or Nordic)
  enc/iso8859_11.c   ISO-8859-11 encoding. (Thai)
  enc/iso8859_13.c   ISO-8859-13 encoding. (Latin-7 or Baltic Rim)
  enc/iso8859_14.c   ISO-8859-14 encoding. (Latin-8 or Celtic)
  enc/iso8859_15.c   ISO-8859-15 encoding. (Latin-9 or West European with Euro)
  enc/iso8859_16.c   ISO-8859-16 encoding.
                     (Latin-10 or South-Eastern European with Euro)
  enc/utf8.c         UTF-8    encoding.
  enc/utf16_be.c     UTF-16BE encoding.
  enc/utf16_le.c     UTF-16LE encoding.
  enc/utf32_be.c     UTF-32BE encoding.
  enc/utf32_le.c     UTF-32LE encoding.
  enc/unicode.c      Unicode information data.

  win32/Makefile     Makefile for Win32 (VC++)
  win32/config.h     config.h for Win32



ToDo

  ? case fold flag: Katakana <-> Hiragana.
  ? add ONIG_OPTION_NOTBOS/NOTEOS. (\A, \z, \Z)
 ?? \X (== \PM\pM*)
 ?? implement syntax behavior ONIG_SYN_CONTEXT_INDEP_ANCHORS.
 ?? transmission stopper. (return ONIG_STOP from match_at())

And I'm thankful to Akinori MUSHA.


Mail Address: K.Kosako <sndgk393 AT ybb DOT ne DOT jp>
