README  2007/06/18

Oniguruma  ----   (C) K.Kosako <sndgk393 AT ybb DOT ne DOT jp>

http://www.geocities.jp/kosako3/oniguruma/
http://www.freebsd.org/cgi/cvsweb.cgi/ports/devel/oniguruma/

Oniguruma is a regular expressions library.
Its distinguishing feature is that a different character encoding
can be specified for each regular expression object.

Supported character encodings:

  ASCII, UTF-8, UTF-16BE, UTF-16LE, UTF-32BE, UTF-32LE,
  EUC-JP, EUC-TW, EUC-KR, EUC-CN,
  Shift_JIS, Big5, GB 18030, KOI8-R, KOI8,
  ISO-8859-1, ISO-8859-2, ISO-8859-3, ISO-8859-4, ISO-8859-5,
  ISO-8859-6, ISO-8859-7, ISO-8859-8, ISO-8859-9, ISO-8859-10,
  ISO-8859-11, ISO-8859-13, ISO-8859-14, ISO-8859-15, ISO-8859-16

* GB 18030: contributed by KUBO Takehiro
* KOI8 is not included in the library archive by the default setup.
  (Edit the Makefile if you want to use it.)
------------------------------------------------------------

Install

 Case 1: Unix and Cygwin platform

   1. ./configure
   2. make
   3. make install

   * uninstall

     make uninstall

   * test (ASCII/EUC-JP)

     make atest

   * configuration check

     onig-config --cflags
     onig-config --libs
     onig-config --prefix
     onig-config --exec-prefix



 Case 2: Win32 platform (VC++)

   1. copy win32\Makefile Makefile
   2. copy win32\config.h config.h
   3. nmake

      onig_s.lib:  static link library
      onig.dll:    dynamic link library

   * test (ASCII/Shift_JIS)
      4. copy win32\testc.c testc.c
      5. nmake ctest



License

   When this software is used as part of Ruby or is distributed with Ruby,
   it follows the license of Ruby.
   In all other cases it follows the BSD license.



Regular Expressions

  See doc/RE (or doc/RE.ja for Japanese).


Usage

  Include oniguruma.h in your program. (Oniguruma API)
  See doc/API for Oniguruma API.

  If you want to disable the definition of the UChar type
  (== unsigned char) in oniguruma.h, define ONIG_ESCAPE_UCHAR_COLLISION
  before including oniguruma.h.

  If you want to disable the definition of the regex_t type in oniguruma.h,
  define ONIG_ESCAPE_REGEX_T_COLLISION before including oniguruma.h.

  Example of the compile/link command line on Unix or Cygwin
  (when prefix == /usr/local):

    cc sample.c -L/usr/local/lib -lonig


  If you want to use the static link library (onig_s.lib) on Win32,
  add the option -DONIG_EXTERN=extern to the C compiler.



Sample Programs

  sample/simple.c    example of the minimum (Oniguruma API)
  sample/names.c     example of the named group callback.
  sample/encode.c    example of some encodings.
  sample/listcap.c   example of the capture history.
  sample/posix.c     POSIX API sample.
  sample/sql.c       example of the variable meta characters.
                     (SQL-like pattern matching)
  sample/syntax.c    Perl, Java and ASIS syntax test.


Source Files

  oniguruma.h        Oniguruma API header file. (public)
  onig-config.in     configuration check program template.

  regenc.h           character encodings framework header file.
  regint.h           internal definitions
  regparse.h         internal definitions for regparse.c and regcomp.c
  regcomp.c          compiling and optimization functions
  regenc.c           character encodings framework.
  regerror.c         error message function
  regext.c           extended API functions. (deluxe version API)
  regexec.c          search and match functions
  regparse.c         parsing functions.
  regsyntax.c        pattern syntax functions and built-in syntax definitions.
  regtrav.c          capture history tree data traverse functions.
  regversion.c       version info function.
  st.h               hash table functions header file
  st.c               hash table functions

  oniggnu.h          GNU regex API header file. (public)
  reggnu.c           GNU regex API functions

  onigposix.h        POSIX API header file. (public)
  regposerr.c        POSIX error message function.
  regposix.c         POSIX API functions.

  enc/mktable.c      character type table generator.
  enc/ascii.c        ASCII encoding.
  enc/euc_jp.c       EUC-JP encoding.
  enc/euc_tw.c       EUC-TW encoding.
  enc/euc_kr.c       EUC-KR, EUC-CN encoding.
  enc/sjis.c         Shift_JIS encoding.
  enc/big5.c         Big5 encoding.
  enc/gb18030.c      GB 18030 encoding (contributed by KUBO Takehiro)
  enc/koi8.c         KOI8 encoding.
  enc/koi8_r.c       KOI8-R encoding.
  enc/iso8859_1.c    ISO-8859-1 encoding. (Latin-1)
  enc/iso8859_2.c    ISO-8859-2 encoding. (Latin-2)
  enc/iso8859_3.c    ISO-8859-3 encoding. (Latin-3)
  enc/iso8859_4.c    ISO-8859-4 encoding. (Latin-4)
  enc/iso8859_5.c    ISO-8859-5 encoding. (Cyrillic)
  enc/iso8859_6.c    ISO-8859-6 encoding. (Arabic)
  enc/iso8859_7.c    ISO-8859-7 encoding. (Greek)
  enc/iso8859_8.c    ISO-8859-8 encoding. (Hebrew)
  enc/iso8859_9.c    ISO-8859-9 encoding. (Latin-5 or Turkish)
  enc/iso8859_10.c   ISO-8859-10 encoding. (Latin-6 or Nordic)
  enc/iso8859_11.c   ISO-8859-11 encoding. (Thai)
  enc/iso8859_13.c   ISO-8859-13 encoding. (Latin-7 or Baltic Rim)
  enc/iso8859_14.c   ISO-8859-14 encoding. (Latin-8 or Celtic)
  enc/iso8859_15.c   ISO-8859-15 encoding. (Latin-9 or West European with Euro)
  enc/iso8859_16.c   ISO-8859-16 encoding.
                     (Latin-10 or South-Eastern European with Euro)
  enc/utf8.c         UTF-8 encoding.
  enc/utf16_be.c     UTF-16BE encoding.
  enc/utf16_le.c     UTF-16LE encoding.
  enc/utf32_be.c     UTF-32BE encoding.
  enc/utf32_le.c     UTF-32LE encoding.
  enc/unicode.c      Unicode information data.

  win32/Makefile     Makefile for Win32 (VC++)
  win32/config.h     config.h for Win32



API differences from the Japanized GNU regex (version 0.12) of Ruby 1.8/1.6

   + re_compile_fastmap() is removed.
   + re_alloc_pattern() is added.



I'm thankful to Akinori MUSHA.


Mail Address: K.Kosako <sndgk393 AT ybb DOT ne DOT jp>