Commit | Line | Data |
---|---|---|
bfab7d84 IK |
1 | <h2><a name="query">Query</a></h2> |
2 | ||
3 | <h3><a name="query-ignored">Long messages and words are ignored</a></h3> | |
4 | <p> | |
5 | Messages longer than 100,000 letters or 500,000 bytes are ignored. Words | |
6 | longer than 40 characters are ignored. Attachments are ignored. | |
7 | </p> | |
8 | ||
9 | <h3><a name="query-term">Single term query</a></h3> | |
10 | <p> | |
11 | A query consisting of a single term retrieves all | |
12 | documents that contain that term, e.g., | |
13 | </p> | |
14 | ||
15 | <p class="example"> | |
16 | namazu | |
17 | </p> | |
18 | ||
19 | <h3><a name="query-and">AND query</a></h3> | |
20 | ||
21 | <p> | |
22 | A query of two or more terms retrieves all | |
23 | documents that contain every term. You can insert the | |
24 | <code class="operator">and</code> operator between the terms, e.g., | |
25 | </p> | |
26 | ||
27 | <p class="example"> | |
28 | Linux and Netscape | |
29 | </p> | |
30 | ||
31 | <p> | |
32 | You can omit the <code class="operator">and</code> operator. Terms that are | |
33 | separated by one or more spaces are treated as an AND query. | |
34 | </p> | |
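 | <p> | |
 | For instance, assuming the implicit AND described above, the following query | |
 | should return the same documents as <code>Linux and Netscape</code>: | |
 | </p> | |
 | <p class="example"> | |
 | Linux Netscape | |
 | </p> | |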
35 | ||
36 | <h3><a name="query-or">OR query</a></h3> | |
37 | <p> | |
38 | A query of two or more terms retrieves all | |
39 | documents that contain at least one of the terms. You can insert the | |
40 | <code class="operator">or</code> operator between the terms, | |
41 | e.g., | |
42 | </p> | |
43 | ||
44 | <p class="example"> | |
45 | Linux or FreeBSD | |
46 | </p> | |
47 | ||
48 | <h3><a name="query-not">NOT query</a></h3> | |
49 | <p> | |
50 | A query of two or more terms retrieves all | |
51 | documents that contain the first term but do not contain the | |
52 | following terms. You can insert the <code class="operator">not</code> | |
53 | operator between the terms to form a NOT query, e.g., | |
54 | </p> | |
55 | ||
56 | <p class="example"> | |
57 | Linux not UNIX | |
58 | </p> | |
59 | ||
60 | ||
61 | <h3><a name="query-grouping">Grouping</a></h3> | |
62 | <p> | |
63 | You can group queries by surrounding them with | |
64 | parentheses. Each parenthesis must be separated from its neighbors by one or | |
65 | more spaces, e.g., | |
66 | </p> | |
67 | ||
68 | <p class="example"> | |
69 | ( Linux or FreeBSD ) and Netscape not Windows | |
70 | </p> | |
71 | ||
72 | <h3><a name="query-phrase">Phrase searching</a></h3> | |
73 | <p> | |
74 | You can search for a phrase that consists of two or more terms | |
75 | by surrounding them with double quotes like | |
76 | <code class="operator">"..."</code> or with braces like <code class="operator">{...}</code>. | |
77 | In Namazu, phrase searching is not 100% precise, | |
78 | so it occasionally returns wrong results. e.g., | |
79 | </p> | |
80 | ||
81 | <p class="example"> | |
82 | {GNU Emacs} | |
83 | </p> | |
84 | ||
85 | <!-- foo | |
86 | <p> | |
87 | You must choose the latter with Tknamazu or namazu.el. | |
88 | </p> | |
89 | --> | |
90 | ||
91 | <h3><a name="query-substring">Substring matching</a></h3> | |
92 | <p> | |
93 | There are three types of substring matching. | |
94 | </p> | |
95 | ||
96 | <dl> | |
97 | <dt>Prefix matching | |
98 | <dd><code class="example">inter*</code> (terms which begin with <code>inter</code>) | |
99 | <dt>Inside matching | |
100 | <dd><code class="example">*text*</code> (terms which contain <code>text</code>) | |
101 | <dt>Suffix matching | |
102 | <dd><code class="example">*net</code> (terms which end | |
103 | with <code>net</code>) | |
104 | </dl> | |
105 | ||
106 | ||
107 | <h3><a name="query-regex">Regular expressions</a></h3> | |
108 | ||
109 | <p> | |
110 | You can use regular expressions for pattern matching. The | |
111 | regular expressions must be surrounded by slashes like <code | |
112 | class="operator">/.../</code>. Namazu uses <a | |
113 | href="http://www.ruby-lang.org/">Ruby</a>'s | |
114 | regular expression engine. It offers a generally <a | |
115 | href="http://www.perl.com/">Perl</a>-compatible flavor. | |
116 | e.g., | |
117 | </p> | |
118 | ||
119 | <p class="example"> | |
120 | /pro(gram|blem)s?/ | |
121 | </p> | |
122 | ||
123 | ||
124 | <h3><a name="query-field">Field-specified searching</a></h3> | |
125 | <p> | |
126 | You can limit your search to specific fields such as | |
127 | <code>Subject:</code>, <code>From:</code>, | |
128 | <code>Message-Id:</code>. It's especially convenient for | |
129 | Mail/News documents. e.g., | |
130 | </p> | |
131 | ||
132 | <ul> | |
133 | <li><code class="example">+subject:Linux</code><br> | |
134 | (Retrieving all documents which contain <code>Linux</code> | |
135 | in a <code>Subject:</code> field) | |
136 | ||
137 | <li><code class="example">+subject:"GNU Emacs"</code><br> | |
138 | (Retrieving all documents which contain <code>GNU Emacs</code> | |
139 | in a <code>Subject:</code> field) | |
140 | ||
141 | <li><code class="example">+from:foo@bar.jp</code><br> | |
142 | (Retrieving all documents which contain <code>foo@bar.jp</code> | |
143 | in a <code>From:</code> field) | |
144 | ||
145 | ||
146 | <li><code class="example">+message-id:&lt;199801240555.OAA18737@foo.bar.jp&gt;</code><br> | |
147 | (Retrieving the document which contains the specified | |
148 | <code>Message-Id:</code>) | |
149 | </ul> | |
150 | ||
151 | <h3><a name="query-notes">Notes</a></h3> | |
152 | ||
153 | <ul> | |
154 | <li>In all queries, Namazu ignores the case of | |
155 | alphabetic characters. In other words, Namazu always does | |
156 | case-insensitive pattern matching. | |
157 | ||
158 | ||
159 | <li>Japanese phrases are automatically segmented into | |
160 | morphemes and handled as <a | |
161 | href="#query-phrase">phrase searches</a>. This processing | |
162 | occasionally produces invalid segmentation. | |
163 | ||
164 | ||
165 | <li>Letters, numbers and some symbols defined in JIS X 0208 | |
166 | (Japanese Industrial Standards) that duplicate ASCII | |
167 | characters are handled as ASCII characters. | |
168 | ||
169 | <li>Namazu can handle a term which contains symbols like | |
170 | <code>TCP/IP</code>. Since this handling isn't complete, | |
171 | you can write <code>TCP and IP</code> instead of | |
172 | <code>TCP/IP</code>, but it may produce noisy results. | |
173 | ||
174 | ||
175 | <li>Substring matching and field-specified searching take | |
176 | more time than other methods. | |
177 | ||
178 | <li>If you want to use <code class="operator">and</code>, | |
179 | <code class="operator">or</code> or <code | |
180 | class="operator">not</code> simply as terms, you can | |
181 | surround each of them with double quotes like <code | |
182 | class="operator">"..."</code> or braces like <code | |
183 | class="operator">{...}</code>. | |
184 | ||
185 | <!-- foo | |
186 | You must choose the latter with Tknamazu or namazu.el. | |
187 | --> | |
188 | ||
189 | </ul> | |
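 | <p> | |
 | As an illustration of the last note, a query like the following should search for | |
 | the literal word <code>not</code> rather than apply the operator: | |
 | </p> | |
 | <p class="example"> | |
 | {not} | |
 | </p> | |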
190 |