3 frak transforms collections of strings into regular expressions for
4 matching those strings. The primary goal of this library is to
5 generate regular expressions from a known set of inputs which avoid
6 backtracking as much as possible. It is available as a [command line
7 utility](#command-line-usage) and for the [browser](#browser-usage)
8 as a JavaScript library.
12 Add frak as a dependency to your `project.clj` file.
18 ## Clojure(Script) usage
23 user> (frak/pattern ["foo" "bar" "baz" "quux"])
24 #"(?:ba[rz]|foo|quux)"
25 user> (frak/pattern ["Clojure" "Clojars" "ClojureScript"])
26 #"Cloj(?:ure(?:Script)?|ars)"
27 user> (frak/pattern ["skill" "skills" "skull" "skulls"])
33 frak can be used from the command line with either Leiningen or NodeJS.
37 Use the `lein run` command:
40 $ lein run -e foo bar baz quux
46 Compile the NodeJS version
49 $ lein do cljx once, cljsbuild once node
51 $ bin/frak -e foo bar baz quux
57 To use frak as a standalone library in the browser with JavaScript
58 compile the browser version:
61 $ lein do cljx once, cljsbuild once browser
62 $ mv ./target/js/frak.min.js <destination>
65 Try it using this HTML:
73 <pre>Input: <span id="input"></span></pre>
74 <pre>Output: <span id="output"></span></pre>
75 <script src="http://code.jquery.com/jquery-2.0.3.min.js"></script>
76 <script src="frak.min.js"></script>
78 var strings = ["foo", "bar", "baz", "quux"];
79 // It's a good idea to use the `"exact?"` option.
80 var pattern = frak.pattern(strings, {"exact?": true})
81 jQuery("#input").text(strings.join(" "));
82 jQuery("#output").text(pattern);
88 For even more fun try it with [AngularJS](http://angularjs.org/)!
92 A frak pattern is constructed from a trie of characters and a
93 renderer which processes it. As characters are added to the trie, data
94 such as such as which characters are terminal are stored in it's
97 During the rendering process frak analyzes each branch and attempts to
98 emit the most concise regular expression possible. Additional post
99 operations are applied after rendering to improve the expression where
104 [Here's](https://github.com/guns/vim-clojure-static/blob/249328ee659190babe2b14cd119f972b21b80538/syntax/clojure.vim#L91-L92)
107 ## And now for something completely different
109 Let's build a regular expression for matching any word in
110 `/usr/share/dict/words`.
113 user> (require '[clojure.java.io :as io])
116 (-> (io/file "/usr/share/dict/words")
120 user> (def word-re (frak/pattern words))
122 user> (every? #(re-matches word-re %) words)
126 The last two operations will take a moment since there are over
127 235,000 words to consider.
129 You can view the full expression
130 [here](https://gist.github.com/noprompt/6106573/raw/fcb683834bb2e171618ca91bf0b234014b5b957d/word-re.clj)
131 (it's approximately `1.5M`!).
136 (use 'criterium.core)
139 (-> (io/file "/usr/share/dict/words")
144 "Create a naive regular expression pattern for matching every string
148 (clojure.string/join "|")
152 ;; Shuffle 10000 words and build a naive and frak pattern from them.
153 (def ws (shuffle (take 10000 words)))
155 (def n-pat (naive-pattern ws))
156 (def f-pat (frak/pattern ws))
158 ;; Verify the naive pattern matches everything it was constructed from.
159 (every? #(re-matches n-pat %) ws)
162 ;; Shuffle the words again since the naive pattern is built in the
163 ;; same order as it's inputs.
164 (def ws' (shuffle ws))
170 (bench (doseq [w ws'] (re-matches n-pat w)))
171 ;; Execution time mean : 1.499489 sec
172 ;; Execution time std-deviation : 181.365166 ms
173 ;; Execution time lower quantile : 1.337817 sec ( 2.5%)
174 ;; Execution time upper quantile : 1.828733 sec (97.5%)
178 (bench (doseq [w ws'] (re-matches f-pat w)))
179 ;; Execution time mean : 155.515855 ms
180 ;; Execution time std-deviation : 5.663346 ms
181 ;; Execution time lower quantile : 148.168855 ms ( 2.5%)
182 ;; Execution time upper quantile : 164.164294 ms (97.5%)