Tuesday, January 27, 2015

Live queries in Comprehend

Someone recently asked me if it's possible to have Comprehend fire events when the database is updated and new matches are encountered. This is indeed possible with not too much effort. The trick is to combine forward matching and mutable indexed sets.

To start, suppose we are maintaining a set of pairs and we are interested in all triples [a b c] such that [a b] and [b c] are in our database. We define a function that prints such matches iff they are new since the function was last called:
(require '[comprehend :as c])
(require '[comprehend.mutable :as cm])

(defn report-new-matches [m-s]
  "Prints new paths of length 3 since report-new-matches was last
  called on m-s. Prints zero matches on the first invocation."
  (let [!results (volatile! nil)]
    (dosync
      (vreset! !results (c/comprehend :mark :report-new-matches
                                      @m-s
                                      [a b]
                                      [b c]
                                      [a b c]))
      (cm/mark m-s :report-new-matches))
    (doseq [x @!results]
      (println "Matched" x))))
This function runs a transaction that comprises two steps:
  1. Run a query on the mutable indexed set
  2. Update the mutable indexed set with a marker
The transaction ensures that the set is not mutated between step 1 and step 2. After the transaction completes, report-new-matches prints the results.

We now hook up report-new-matches as the 'io' argument of the mutable indexed set constructor:
(declare m-s)
(defn live-reporter
  ([] nil)
  ([s diff] (report-new-matches m-s)))
(def m-s (cm/mutable-indexed-set live-reporter))
(report-new-matches m-s) ; init
(I admit the circular dependency between the definitions of live-reporter and m-s is currently a little ugly. I might tweak the signature of 'io' functions to remedy this problem.)

That's it! Let's see what happens when we add elements:
(reduce cm/conj m-s [[1 2] [2 3] [1 3] [3 4]])
; Matched [1 3 4]
; Matched [1 2 3]
; Matched [2 3 4]

(cm/conj m-s [0 1])
; Matched [0 1 3]
; Matched [0 1 2]
At some point the comprehend.mutable API will be amended with a macro to make this even easier. In the meantime, as you can see, it's straightforward to role out your own solution.

Thursday, January 1, 2015

Comprehend is now a lot faster

In a previous blog post I announced Comprehend, a Clojure library for pattern matching on indexed sets.

In its original implementation, when an item was added to an indexed set, Comprehend would translate the item’s relevant properties to predicate logic and it would store the resulting relations in an indexed core.logic database. When Comprehend was subsequently given a query, it would translate the query to predicate logic, and then have core.logic unify the database and the (translated) query.

The above approach came with quite a bit of overhead, both in terms of performance and in terms of memory use. That is why I have now rewritten the entire Comprehend engine.

The new engine uses a custom multi-threaded unifier. Indexes are created lazily, as they are needed, and are stored in a cache.

Comprehend currently uses a soft cache by default. This means that the garbage collector is in charge of evicting indexes. This seems to work well in practice, but if that’s not to your liking—don’t worry—it’s trivial to swap in an alternative cache from core.cache. Moreover, Comprehend should have enough hooks to allow for a custom cache that intelligently evicts indexes as items are added or removed from the indexed set.

I tried the new engine on a few queries and found the overhead small enough. Even for relatively small sets, Comprehend returns results quickly and I found it quite smart about what indexes (not) to create.