Commit | Line | Data |
---|---|---|
01c223d0 BOFG |
1 | <h2><a name="query">Query</a></h2> |
2 | ||
20dc6258 IK |
3 | <h3><a name="query-ignored">Long messages and words are ignored</a></h3> |
4 | <p> | |
5 | Messages longer than 100,000 letters or 500,000 bytes are ignored. Words longer than 40 characters are ignored. Attachments are ignored. | |
6 | </p> | |
7 | ||
01c223d0 BOFG |
8 | <h3><a name="query-term">Single term query</a></h3> |
9 | <p> | |
10 | A query consisting of a single term retrieves all | |
11 | documents which contain that term. e.g., | |
12 | </p> | |
13 | ||
14 | <p class="example"> | |
15 | namazu | |
16 | </p> | |
17 | ||
18 | <h3><a name="query-and">AND query</a></h3> | |
19 | ||
20 | <p> | |
21 | A query consisting of two or more terms retrieves all | |
22 | documents which contain all of the terms. You can insert the | |
23 | <code class="operator">and</code> operator between the terms. e.g., | |
24 | </p> | |
25 | ||
26 | <p class="example"> | |
27 | Linux and Netscape | |
28 | </p> | |
29 | ||
30 | <p> | |
31 | You can omit the <code class="operator">and</code> operator. Terms which are | |
32 | separated by one or more spaces are treated as an AND query. | |
33 | </p> | |
34 | ||
35 | <h3><a name="query-or">OR query</a></h3> | |
36 | <p> | |
37 | A query consisting of two or more terms retrieves all | |
38 | documents which contain any of the terms. You can insert the | |
39 | <code class="operator">or</code> operator between the terms. | |
40 | e.g., | |
41 | </p> | |
42 | ||
43 | <p class="example"> | |
44 | Linux or FreeBSD | |
45 | </p> | |
46 | ||
47 | <h3><a name="query-not">NOT query</a></h3> | |
48 | <p> | |
49 | A query consisting of two or more terms retrieves all | |
50 | documents which contain the first term but do not contain the | |
51 | following terms. You can insert the <code class="operator">not</code> | |
52 | operator between the terms to form a NOT query. e.g., | |
53 | </p> | |
54 | ||
55 | <p class="example"> | |
56 | Linux not UNIX | |
57 | </p> | |
58 | ||
59 | ||
60 | <h3><a name="query-grouping">Grouping</a></h3> | |
61 | <p> | |
62 | You can group queries by surrounding them with | |
63 | parentheses. Each parenthesis should be separated from adjacent terms by one or | |
64 | more spaces. e.g., | |
65 | </p> | |
66 | ||
67 | <p class="example"> | |
68 | ( Linux or FreeBSD ) and Netscape not Windows | |
69 | </p> | |
70 | ||
71 | <h3><a name="query-phrase">Phrase searching</a></h3> | |
72 | <p> | |
73 | You can search for a phrase which consists of two or more terms | |
74 | by surrounding them with double quotes like | |
75 | <code class="operator">"..."</code> or with braces like <code class="operator">{...}</code>. | |
76 | In Namazu, the precision of phrase searching is not 100%, | |
77 | so it occasionally returns wrong results. e.g., | |
78 | </p> | |
79 | ||
80 | <p class="example"> | |
81 | {GNU Emacs} | |
82 | </p> | |
83 | ||
84 | <!-- foo | |
85 | <p> | |
86 | You must choose the latter with Tknamazu or namazu.el. | |
87 | </p> | |
88 | --> | |
89 | ||
90 | <h3><a name="query-substring">Substring matching</a></h3> | |
91 | <p> | |
92 | There are three types of substring matching. | |
93 | </p> | |
94 | ||
95 | <dl> | |
96 | <dt>Prefix matching | |
97 | <dd><code class="example">inter*</code> (terms which begin with <code>inter</code>) | |
98 | <dt>Inside matching | |
99 | <dd><code class="example">*text*</code> (terms which contain <code>text</code>) | |
100 | <dt>Suffix matching | |
101 | <dd><code class="example">*net</code> (terms which end | |
102 | with <code>net</code>) | |
103 | </dl> | |
104 | ||
105 | ||
106 | <h3><a name="query-regex">Regular expressions</a></h3> | |
107 | ||
108 | <p> | |
109 | You can use regular expressions for pattern matching. The | |
110 | regular expressions must be surrounded by slashes like <code | |
111 | class="operator">/.../</code>. Namazu uses <a | |
112 | href="http://www.ruby-lang.org/">Ruby</a>'s | |
113 | regular expression engine. It offers a generally <a | |
114 | href="http://www.perl.com/">Perl</a>-compatible flavor. | |
115 | e.g., | |
116 | </p> | |
117 | ||
118 | <p class="example"> | |
119 | /pro(gram|blem)s?/ | |
120 | </p> | |
121 | ||
122 | ||
123 | <h3><a name="query-field">Field-specified searching</a></h3> | |
124 | <p> | |
125 | You can limit your search to specific fields such as | |
126 | <code>Subject:</code>, <code>From:</code>, | |
127 | <code>Message-Id:</code>. It's especially convenient for | |
128 | Mail/News documents. e.g., | |
129 | </p> | |
130 | ||
131 | <ul> | |
132 | <li><code class="example">+subject:Linux</code><br> | |
133 | (Retrieving all documents which contain <code>Linux</code> | |
134 | in a <code>Subject:</code> field) | |
135 | ||
136 | <li><code class="example">+subject:"GNU Emacs"</code><br> | |
137 | (Retrieving all documents which contain <code>GNU Emacs</code> | |
138 | in a <code>Subject:</code> field) | |
139 | ||
140 | <li><code class="example">+from:foo@bar.jp</code><br> | |
141 | (Retrieving all documents which contain <code>foo@bar.jp</code> | |
142 | in a <code>From:</code> field) | |
143 | ||
144 | ||
145 | <li><code class="example">+message-id:&lt;199801240555.OAA18737@foo.bar.jp&gt;</code><br> | |
146 | (Retrieving the document which contains the specified | |
147 | <code>Message-Id:</code>) | |
148 | </ul> | |
149 | ||
150 | <h3><a name="query-notes">Notes</a></h3> | |
151 | ||
152 | <ul> | |
153 | <li>In all queries, Namazu ignores case distinctions of | |
154 | alphabetic characters. In other words, Namazu always does | |
155 | case-insensitive pattern matching. | |
156 | ||
157 | ||
158 | <li>Japanese phrases are automatically segmented into | |
159 | morphemes and are handled as a <a | |
160 | href="#query-phrase">phrase search</a>. This processing | |
161 | occasionally causes invalid segmentation. | |
162 | ||
163 | ||
164 | <li>Letters, numbers and some symbols (those duplicated in | |
165 | ASCII) which are defined in JIS X 0208 (Japanese | |
166 | Industrial Standards) are handled as ASCII characters. | |
167 | ||
168 | <li>Namazu can handle a term which contains symbols, such as | |
169 | <code>TCP/IP</code>. Since this handling isn't complete, | |
170 | you can write <code>TCP and IP</code> instead of | |
171 | <code>TCP/IP</code>, but it may cause noisy results. | |
172 | ||
173 | ||
174 | <li>Substring matching and field-specified searching take | |
175 | more time than other methods. | |
176 | ||
177 | <li>If you want to use <code class="operator">and</code>, | |
178 | <code class="operator">or</code> or <code | |
179 | class="operator">not</code> as ordinary search terms, you can | |
180 | surround each of them with double quotes like <code | |
181 | class="operator">"..."</code> or braces like <code | |
182 | class="operator">{...}</code>. | |
183 | ||
184 | <!-- foo | |
185 | You must choose the latter with Tknamazu or namazu.el. | |
186 | --> | |
187 | ||
188 | </ul> | |
189 |
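
The query forms described in this section can be pictured as set operations over the documents that match each term. The following is only a minimal sketch under stated assumptions, not Namazu's implementation: the in-memory `DOCS` table and the helpers `words`, `term_pattern`, `search` and `regex_search` are all hypothetical names introduced for illustration, and Python's `re` module stands in for the Ruby regular expression engine that Namazu uses.

```python
import re

# Hypothetical toy index: document id -> text. This is NOT Namazu's index format.
DOCS = {
    1: "Linux and Netscape on the Internet",
    2: "FreeBSD networking with TCP/IP",
    3: "Linux text editors: GNU Emacs",
    4: "Windows programs and problems",
}


def words(text):
    """Lower-case the text and split it into terms (matching ignores case)."""
    return re.findall(r"[a-z0-9/]+", text.lower())


def term_pattern(term):
    """Turn a term with an optional leading/trailing '*' into a regex.

    'inter*' -> prefix matching, '*net' -> suffix matching,
    '*text*' -> inside matching, 'linux' -> exact term.
    """
    open_start = term.startswith("*")
    open_end = term.endswith("*")
    core = re.escape(term.strip("*").lower())
    return re.compile((".*" if open_start else "") + core + (".*" if open_end else ""))


def search(term):
    """Return the set of document ids containing a term that matches."""
    pat = term_pattern(term)
    return {doc_id for doc_id, text in DOCS.items()
            if any(pat.fullmatch(w) for w in words(text))}


def regex_search(pattern):
    """Return the ids of documents with a term matching the regex (the /.../ form)."""
    pat = re.compile(pattern, re.IGNORECASE)
    return {doc_id for doc_id, text in DOCS.items()
            if any(pat.search(w) for w in words(text))}


# AND, OR and NOT correspond to set intersection, union and difference.
and_result = search("Linux") & search("Netscape")
or_result = search("Linux") | search("FreeBSD")
not_result = search("Linux") - search("Netscape")

# Grouping: ( Linux or FreeBSD ) and Netscape not Windows
grouped = ((search("Linux") | search("FreeBSD")) & search("Netscape")) - search("Windows")

print(and_result, or_result, not_result, grouped)
print(search("inter*"), search("*text*"), search("*net"))
print(regex_search(r"pro(gram|blem)s?"))
```

Modelling each term as a set of document ids keeps the three operators and grouping down to ordinary set expressions. Phrase searching, field-specified searching and Japanese segmentation are deliberately left out of this sketch.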