2008-08-17

Lazy evaluation

Let's make an array of all Fibonacci numbers. All of them? Yes, all of them!

Of course it's impossible, we would have to spend infinite time computing them and use infinite memory to store them. But we can trick users of our array into thinking it does have all the numbers precomputed in it. We will use lazy evaluation.

How it works? Well, our array will not really be an Array from which we can read fields at random. It will be an object, and each time user wants to get some number from it, we will peek into the user's request, and compute the number, if not already computed. This simple concept is the definition of lazy evaluation (opposed to eager evaluation, which is precomputing values before they are needed).

The object that will look like the array will be a module. We don't want to create a class, because we wouldn't want to instantiate it. It will work just like this:
Fi[10]
will yield
55
, for instance (we assume
Fi[0]==0 and Fi[1]==1
).

Let's start with the code, as usually.
module Fi

class << self

def initialize
@fi=[0,1]
end

def [](n)
raise ArgumentError,"Incorrect index!" unless n.is_a? Integer and n>=0
get(n)
end

def each
i=0
loop\
{
yield(get(i))
i+=1
}
end

private

def get(n)
want(n)
@fi[n]
end

def want(n)
until @fi[n]
@fi<<@fi[-1]+@fi[-2]
end
end

end

initialize

end
Note that this is not a good implementation, a good implementation would use Matrix form to compute them much faster, but we're talking about laziness here, so let's leave the good implementation to the eager folks (well, if you do it, it will probably increase your Ruby skills, so try, if you dare). Also what I want to demonstrate doesn't get lost between the matrices in this simple implementation, so let's stick with it for now.

So, we declared a
module
, and then there's this funny
class << self
in it. Well, for now, just remember that that's the way (one of the ways) to turn a module into sort of a singleton class - a structure that has functions and maintains its state, but cannot be freely instantiated. Also note how it is initialized: we defined initializer inside the class, but we have to call in explicitely after the class definition in the module (in fact we could call this function anything else).

After this introduction, just remember that the module can be treated like an object. We define this object's most important method:
[]
(defining a method
[](params)
enables us to use the object as an Array - one more teaspoon of the syntactic sugar).

After validating the argument, we call the function
get
, which calls
want
and after that reads the value from the field
@fi
, which is the underlying real array of Fibonacci numbers. The function
want
simply ensures that there are at least
n
numbers counted by counting consecutive numbers until the one we want is inserted into the array (
@fi[-1]
is of course the last element of the array - another teaspoon, very handy).

Time to test our module:
irb(main):042:0> Fi[10]
=> 55
Exactly what we wanted! Note that when we ask for
Fi[100000]
, the console hungs for a moment before giving us the result, but when we want the some number again, it works immediately (if you see a delay, it's caused exclusively by the time needed to print the output, to see the real time the computation takes, try
Fi[100000];nil
to suppress the output).

Now get back to our code and have a look at the
each
method. Arrays have methd
each
that calls the passed block once for each element. Our object is a bit like Array so why not add this option? The only problem is, of course, that the function will never finish, unless user breaks it like this:
Fi.each\
{ |f|
puts f
break if f>100000
}
In fact the
break
breaks the loop inside the method, and lets it finish.

Again, our
each
is implemented badly. It should precompute some number of numbers ahead to reduce the number of calls to
get
. Well, calling
get
and not
[]
is already an optimisation, because thanks to this we don't validate the argument each time inside
each
.

We could also implement
each_with_index
and others, also
map
could be implemented, but it would have to behave differently from what it does for Arrays, because if we
break
inside
map
of an Array object, it returns
nil
, and our
map
would have to return the already computed part of the array. But definitely the most important change would be to introduce the matrix form to the calculations, so that if we ask for just one number, not all previous numbers would have to be calculated. I leave it as an exercise for the reader.

No comments: