From: Joel Holdbrooks
Date: Tue, 30 Jul 2013 17:35:43 +0000 (-0700)
Subject: Add more information to README
X-Git-Url: https://vcs.fsf.org/?a=commitdiff_plain;h=24a60b46572481612440c63643bae42c6d8d6944;p=frak.git

Add more information to README
---

diff --git a/README.md b/README.md
index 4779eac..0795ed1 100644
--- a/README.md
+++ b/README.md
@@ -26,10 +26,10 @@ user> (frak/pattern ["Clojure" "Clojars" "ClojureScript"])
 
 ## How?
 
-A frak pattern is constructed from a very stupid trie of characters.
-As characters are added to it, meta data is stored in it's branches.
-The meta data contains information such as which branches are terminal
-and a record of characters which have "visited" the branch.
+A frak pattern is constructed from a trie of characters. As characters
+are added to it, meta data is stored in it's branches containing
+information such as which branches are terminal and a record of
+characters which have "visited" the branch.
 
 During the rendering process frak will prefer branch characters that
 have "visited" the most. In the example above, you will notice the
@@ -48,6 +48,16 @@ user> (frak/pattern ["bit" "bat" "ban" "bot" "bar" "box"])
 [Here's](https://github.com/guns/vim-clojure-static/blob/249328ee659190babe2b14cd119f972b21b80538/syntax/clojure.vim#L91-L92)
 why. Also because.
 
+## Next
+
+While the patterns currently generated by frak are correct, there is
+potential for improvement; the word trie could be converted to a
+directed acyclic word graph (DAWG).
+
+By using a DAWG it might be possible to produce more efficient
+patterns which consider both common prefixes and suffixes. This might
+reduce both backtracking and overall pattern size.
+
 ## And now for something completely different
 
 Let's build a regular expression for matching any word in