Clojure Performance Guarantees
Jarkko Oranen a.k.a. Chousuke posted an excellent table summarizing the performance characteristics of functions operating on different Clojure data structures:
hash-map | sorted-map | hash-set | sorted-set | vector | queue | list | lazy seq | |
---|---|---|---|---|---|---|---|---|
conj | near-constant | logarithmic | near-constant | logarithmic | constant (tail) | constant (tail) | constant (head) | constant (head) |
assoc | near-constant | logarithmic | - | - | near-constant | - | - | - |
dissoc | near-constant | logarithmic | - | - | - | - | - | - |
disj | - | - | near-constant | logarithmic | - | - | - | - |
nth | - | - | - | - | near-constant | linear | linear | linear |
get | near-constant | logarithmic | near-constant | logarithmic | near-constant | - | - | - |
pop | - | - | - | - | constant (tail) | constant (head) | constant (head) | constant (head) |
peek | - | - | - | - | constant (tail) | constant (head) | constant (head) | constant (head) |
count | constant | constant | constant | constant | constant | constant | constant | linear |
He also provided an explanation of the terminology used; I've expanded it a little for the big-O-challenged (such as me):
- constant means O(1) complexity, in other words, the time required for the operation is constant and independent of the number of elements contained in the collection. If you think of s simple linked list, you can imagine that adding something to its front always involves the same number of steps, regardless of the number of elements. This is the fastest you can hope for.
- linear is the slowest that can happen (in the Clojure libs, that is): This is O(n) complexity, meaning that the time required for applying this function on a list with 1000 elements will take a 100 times as long as applying it to a list of 10. In other words: this is slow.
- logarithmic means that the time required requires log n hops, which is pretty fast (e.g. doing this operation on a list of 1,000,000 items takes 6 times as long as doing it on a list with 10)
- near-constant performance characteristics refer to the places where the underlying tree structure for most (all?) of Clojure's collections shows its benefits: The number of hops required (as Rich puts it) is log32 n. (Update: I've erased the naïve explanation I had here, will follow up with a better one soon.)
- Some operations are, of course not supported on some structures.
On a related note, the Clojure IRC channel is simply great.
Any chance you’d consider switching from English terms to big-O notation? The big-O notation is easier to identify.
Minor: s/list/map/g (logarithmic paragraph). Reason: there are no logarithmic operations on lists.