Ruby (Al2O3::Cr)

2009-02-28

Gmail Notifier using IMAP

A simple program to check if there are any new messages in our Gmail inbox.

We will connect to Gmail using the IMAP protocol and get a list of new (unread) messages from the mail server. Here's more or less the code that does it:

require 'net/imap' 
 
class GNotifier 
 
  GMAIL_IMAP="imap.gmail.com" 
  
  LOGIN="login@gmail.com" # or just "login" 
  PASSWORD="password" 
  
  def initialize 
    @envs={} 
  end 
 
  def check 
    begin 
      unless @imap 
        @imap=Net::IMAP::new(GMAIL_IMAP,993,true,nil,false) 
        @imap.login(LOGIN,PASSWORD) 
      end 
      @imap.select("INBOX") 
      ids=@imap.search(["NOT","SEEN"]) 
      uids=ids.empty?? [] : @imap.fetch(ids,"UID").map{|e| e.attr["UID"]} 
      @envs.reject!{|uid,env| not uids.include?(uid)} 
      new_uids=uids-@envs.keys 
      if new_uids.empty? 
        new_envs=[] 
      else 
        new_envs=@imap.uid_fetch(new_uids,"ENVELOPE").map{|e| e.attr["ENVELOPE"]} 
        new_uids.each_with_index{|uid,i| @envs[uid]=new_envs[i]} 
      end 
      new_mail(new_envs) unless new_envs.empty? 
    rescue ThreadError, Errno::ECONNABORTED, Timeout::Error, IOError => e 
      @imap=nil 
      retry 
    end 
  end 
  
  def new_mail(new_m) 
    # ... 
  end 
  
end

The code is just a draft but shows the most important part. Let's explain it a bit.

First,

@imap

is the instance of the IMAP connector. It is created once (in the first call to

check

), and if nothing goes wrong, all subsequent calls to

check

do not create a new connection and do not log into the mail system, but use the previously created one. It is deleted and renewed in case of an error, though (in the

rescue

clause).

The field

@envs

is a hash holding envelopes of each new message in the inbox associated with this message's UID (unique identifier). At the beginning, it is empty.

So how do we fetch new mail? First we call

@imap.search

to get the IDs (message sequence numbers, not UIDs; don't mix up the two) of all messages that do not have the SEEN flag set. Then we fetch these messages' UIDs (the ternary operator is here because

fetch

fails with empty

ids

).

So now we have the UIDs of all new messages, and we can compare them with the list of messages that we have already fetched. First, we

@envs.reject!

all messages that were new but now are not (this means that they have been deleted or marked as read; which of the two doesn't matter to us). Then we compute the list of

new_uids

, the UIDs of messages that are new for the first time (they were not on our list), and for those messages we get some more info, the ENVELOPE, into

new_envs

and then add them to

@envs

. Finally, we call

new_mail

and pass it all the newly arrived mail. This method can also be left unimplemented if we just want to know what new messages lie on the server (this info is in

@envs

of course) and do not necessarily want a notification when a newly arrived message shows up.
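The bookkeeping described above can be tried without a mail server. The following is only a sketch: update_envelopes is a made-up helper, not part of the notifier, and plain strings stand in for the envelopes that uid_fetch would return.

```ruby
# Hypothetical helper: the same bookkeeping as in check, but on plain
# arrays of UIDs instead of a live IMAP connection.
def update_envelopes(envs, unread_uids)
  # Forget messages that are no longer unread (read or deleted).
  envs.reject! { |uid, _env| !unread_uids.include?(uid) }
  # UIDs that show up for the first time.
  new_uids = unread_uids - envs.keys
  # Stand-in for uid_fetch: a real run would store ENVELOPE structs here.
  new_uids.each { |uid| envs[uid] = "envelope of message #{uid}" }
  new_uids
end

envs = {}
first  = update_envelopes(envs, [101, 102])  # both messages are new
second = update_envelopes(envs, [102, 103])  # 101 was read, 103 arrived
```

After the second call only 103 is reported as new, and the hash holds exactly the two messages that are still unread.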

Some technical details
When creating the connector, we could have written just

Net::IMAP::new(GMAIL_IMAP,993,true)

but it will not work in Ruby 1.9, where the last parameter (verify, which turns on checking of the server's SSL certificate) is true by default.

The line

@imap.select("INBOX")

could be moved into the conditional above it, but then somehow not all new messages can be accessed by IMAP. Selecting the mailbox again sort of refreshes the inbox.

The ENVELOPE attribute that we download from the server contains information that would be on the envelope of a regular letter: sender, receiver(s), date, also subject. All accessed simply by method calls. Helpful link: Envelope.
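As a rough illustration of "accessed simply by method calls", here is a sketch of what new_mail might do with each envelope. The two Structs are stand-ins that copy the field names of Net::IMAP::Envelope and Net::IMAP::Address so that the example runs without a server, and describe is a hypothetical helper.

```ruby
# Stand-ins mimicking the field names of Net::IMAP::Address and
# Net::IMAP::Envelope (assumption: only these fields are used below).
Address  = Struct.new(:name, :route, :mailbox, :host)
Envelope = Struct.new(:date, :subject, :from, :sender, :reply_to,
                      :to, :cc, :bcc, :in_reply_to, :message_id)

# Hypothetical formatter that new_mail could call for every envelope.
def describe(env)
  sender = env.from.first
  # Fall back to the raw address when the sender has no display name.
  who = sender.name || "#{sender.mailbox}@#{sender.host}"
  "#{who}: #{env.subject} (#{env.date})"
end

env = Envelope.new("Sat, 28 Feb 2009 12:00:00 +0100", "Hello",
                   [Address.new("Alice", nil, "alice", "example.com")])
summary = describe(env)
```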

If you prefer to download the whole message and not just the envelope, then use the BODY attribute instead. If you want something more specific, look into the documentation, for example here: Net::IMAP. Note that Gmail does not support some of the commands, like

sort

for instance.

2009-02-22

Enumerator

@ 03:00

Do you remember the Fiber? If not, better have a look there before reading on. I will show another use of fibres. This time we won't see them, but they are there, in the guts of the

Enumerator

.

One can use enumerators in either of the two main ways:

Virtual array
If you have read the post about fibres, you are already familiar with virtual arrays. An enumerator is an easier and a bit more automated way to create a virtual array.

Let's say we'd like to observe a HOTPO sequence starting at any chosen number. As we know, the sequence might be infinite, so it wouldn't be very wise to create an array holding all the elements. But we can create an enumerator to iterate over them, like this:

def hotpo(v)
  Enumerator::new\
  { |y|
    loop\
    {
      y<<v
      break if v==1
      v=(v&1>0) ? 3*v+1 : v/2
    }
  }
end

hotpo(27).each{|x| print x," "}

The function takes the first value of the sequence as an argument, and returns an enumerator. We create an enumerator of this type by passing it a block and putting the sequentially computed values in the block argument. So: the

y

is the virtual array itself, say good evening.

The loop is not infinite, at least not in any of the known cases, because no infinite HOTPO sequence has been found. But if we remove the break, we can see that the program doesn't hang, even though it has an infinite loop in it! Well, it does hang, but it still produces output, so it's no more hung than an open word processor.

This behaviour is very similar to that presented in the post about fibres, and that's because the enumerator uses them. If you have understood fibres well, you could try to implement this kind of enumerator yourself - just as a small exercise for the reader.
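One possible take on that exercise, as a rough sketch only: the class FibreEnumerator, its Yielder helper, the DONE marker and the method hotpo_via_fibre are all made up here, and the real Enumerator is certainly more elaborate.

```ruby
class FibreEnumerator
  DONE = Object.new  # unique marker meaning "the block has finished"

  # Object handed to the block; << passes one value out of the fibre.
  class Yielder
    def <<(value)
      Fiber.yield(value)
      self
    end
  end

  def initialize(&block)
    @block = block
  end

  def each
    fibre = Fiber.new do
      @block.call(Yielder.new)
      DONE  # final value of the fibre, filtered out below
    end
    loop do
      value = fibre.resume
      break if value.equal?(DONE)
      yield value
    end
    self
  end
end

# The HOTPO sequence again, this time on top of the sketch above.
def hotpo_via_fibre(v)
  FibreEnumerator.new do |y|
    loop do
      y << v
      break if v == 1
      v = v.odd? ? 3 * v + 1 : v / 2
    end
  end
end

seen = []
hotpo_via_fibre(6).each { |x| seen << x }
```

Each value is computed only when the consumer asks for it, so a block with an endless loop would still deliver its elements one by one.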

View of an enumerable
The other way of using enumerators (the more standard way, I'd say) is to create them from existing enumerable objects. The idea of an enumerator is to allow only very limited access to the underlying enumerable.

For example,

an_array.each

(without passing a block) is an enumerator which can be regarded as a safe read-only view of the array. It does not allow the user to call any other method of the array but

each

and its derived methods. So you can call

an_array.each.each{...}

or

an_array.each.map{...}

, but a call to

an_array.each.map!{...}

is rejected with a NoMethodError, because the enumerator has no such method, so the underlying object cannot be modified this way. But still

an_array.map!.each{...}

is able to do so.

The general idea of using chained calls of enumerating methods is:
- If the first method is

each

, then any non-modifying method can be used as the second call.
- If the first method is something else, the valid second methods are

each

which simply forwards the call to the first method, and

with_index

, which does the same but also passes the element index to the block.

So the following, even though perfectly legal, makes no sense:

an_array.select.map{...}

. One of the methods should be neutral, and the neutral methods are

each

and

each_with_index

(or just

with_index

). So apart from making a safe view, the main advantage of using enumerators is the possibility to call

an_array.map.with_index{|e,i| ...}

or even

an_array.select.with_index{|e,i| ...}

.

Note that the methods

all?

and

any?

do not return an enumerator when called with no block (they return true or false right away), so you cannot chain with_index onto them to make these checks with an index.
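A few of these chained calls, put together as a small sketch. The variable names are arbitrary, and the last line shows one possible workaround for index-aware all? checks that goes through each_with_index instead.

```ruby
letters = %w[a b c d]

# map and select combined with the element index
labelled = letters.map.with_index { |e, i| "#{i}:#{e}" }
even_pos = letters.select.with_index { |_e, i| i.even? }

# map! without a block returns an enumerator that still modifies the array
numbers = [1, 2, 3]
numbers.map!.each { |x| x * 2 }

# all? with an index: start from each_with_index, then call all? with a block
in_place = [0, 1, 2].each_with_index.all? { |e, i| e == i }
```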

2009-02-21

ASCII Art

@ 17:53

Ruby 1.9 introduces some nice ways to write less code and make it look more mysterious at the same time. It's enough to have a look at the following pieces of ASCII art; each line is a valid expression in Ruby 1.9:


->(){}[]
0-->(){0}[]<--0
{x: :x}
{:+@=>->{:-@}}

What do they mean? First, there's a new syntax for defining lambdas:

->(args){body}

, and when defining a lambda with no args and no body, and then calling it by

[]

, you obtain the first line. There's also another way to call a lambda or a proc now:

some_lambda.(args)

. This allows us to write

->(){}.()

, if someone finds this even more confusing than the first option.

The second line should not be a problem now, it says

(0 - ->{0}.call) < -(-0)

. Yes, in Ruby even

--------1

is a valid expression. It's as they say in primary school: minus and minus gives plus.

There's also a new syntax for defining hashes that have symbols as keys:

{k:val}

is equivalent to

{:k=>val}

. In our example, however, one has to put whitespace between the two colons or else the interpreter is confused, because two colons together form another token, the one used for calling a method or getting a constant.

And the last line is just some creative nothing. It uses the symbols

:+@

and

:-@

which are normally used as method names for unary plus and minus. See for yourself:

5.-@()

gives

-5

.
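To check these claims, the expressions can be assigned to variables and inspected, for instance in irb. This is only a sketch; the variable names are arbitrary.

```ruby
empty_call = ->(){}[]            # lambda with no body, called with []: nil
dot_call   = ->(){}.()           # the same lambda, called with .(): nil
comparison = (0-->(){0}[]<--0)   # (0 - 0) < 0, which is false
minuses    = --------1           # eight minus signs cancel out: 1
short_hash = {x: :x}             # the same hash as {:x=>:x}
odd_hash   = {:+@=>->{:-@}}      # key :+@, value a lambda returning :-@
negated    = 5.-@()              # unary minus called as a method: -5
```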

There are a lot of articles (and blog entries on various Ruby blogs) that cover the differences between Ruby 1.8 and Ruby 1.9, so if you're interested, just look for them and you'll find them easily. One that is worth reading if you'd like to know more than is usually contained in short presentations:

Ruby 1.8 vs Ruby 1.9

And a nice wrap up: Useful Ruby 1.9 links

2009-02-20

Fiber

@ 15:41

I haven't been here for quite a while. That's simply because I'm not a blogging kind of guy.

Let's have a look at one of the new features in Ruby 1.9.

class Fiber
You can think of a fibre as a separate thread, though not one that is running all the time in the background, but rather one that is responsible for some specialised task and is activated only to do some job and return some results.

Also, a fibre is not a thread.

Let's skip to some code:

FILE=__FILE__

require 'fiber'

reader=Fiber::new\
{
  File::open(FILE){|f| f.each_line{|l| Fiber.yield(l) unless l.strip.empty?}}
}

puts "Let's get a line: #{reader.resume}"
puts "Let's get another line: #{reader.resume}"
puts "And the rest:"
puts reader.resume while reader.alive?

This produces the following output:


Let's get a line: FILE=__FILE__
Let's get another line: require 'fiber'
And the rest:
reader=Fiber::new\
{
  File::open(FILE){|f| f.each_line{|l| Fiber.yield(l) unless l.strip.empty?}}
}
puts "Let's get a line: #{reader.resume}"
puts "Let's get another line: #{reader.resume}"
puts "And the rest:"
puts reader.resume while reader.alive?
#<File:0xb114bc>

So first: let's have a look at the fibre code (I use the word fibre and not fiber because I prefer British English; the truth is each time I want to use the class

Fiber

I first spell it

Fibre

and must correct later; I'll make an alias one day). The fibre opens the file and calls

Fiber.yield

passing each consecutive nonempty line to it. Think of it like this: it creates a virtual array containing all the elements that you put there, and

Fiber.yield

puts an element in it. So we have a virtual array containing all the nonempty lines of code from

__FILE__

.

Of course the array does not exist - it is only a way to imagine how the fibre works. This fact saves memory - you don't have to load all the lines into memory before accessing them. Think of how to write this simple program without a fibre (assume that the file you're going to display is around 1 GB and you cannot simply load it all into memory) - I'm quite sure that using a fibre is one of the best ways to do it.

Now, how do we read from this virtual array? To read an element we call

reader.resume

. Simple, isn't it? Analyse the output to see that it did what we had expected: first it printed the first line, then the second (nonempty) line, and then the rest.

There's only one mysterious thing at the end:

#<File:0xb114bc>

. The explanation is: the virtual array created by a fibre is filled by all calls to

Fiber.yield

, and when the fibre finishes, its final value (the value of the last operation within the fibre block) is also added to the array. In our case, the last (and only) operation is opening the file, and

File::open

called with a block returns the value of that block, which here is the file object returned by each_line, so it was also added to the array. One can like this feature or not, but one has to live with it. If we don't want this line of output, we can change the end of the code to this:

puts "And the rest:"
loop\
{
  l=reader.resume
  break unless reader.alive?
  puts l
}

Now it works like it should. More lines of code but oh well.
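The same effect can be seen in isolation with a tiny fibre. This is only a sketch with made-up names, showing that the last value collected is the block's final value rather than something passed to Fiber.yield.

```ruby
require 'fiber'  # for alive?, as in the examples above

counter = Fiber.new do
  Fiber.yield 1
  Fiber.yield 2
  :final_value  # value of the last expression in the block
end

values = []
# The fibre is still alive after its last Fiber.yield, so one more
# resume is made, and that one returns the block's final value.
values << counter.resume while counter.alive?
```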

fiber.resume(*args)
We've seen that if you pass an argument to

Fiber.yield

then it becomes the value of the

fiber.resume

call. This enables you to pass data from the body of the fibre to the outer world. Passing data is also possible in the opposite direction, and is by no means harder. As you might have already guessed: arguments passed to

fiber.resume

are the value of

Fiber.yield

. So if we wanted a writer instead of a reader:

FILE="test.txt"

require 'fiber'

writer=Fiber::new\
{
  File::open(FILE,"w")\
  { |f|
    loop\
    {
      l=Fiber.yield
      break unless l
      f.puts l
    }
    f.puts "---"
  }
}

writer.resume "Line 1"
writer.resume "Line 2"
writer.resume "Line 3"
writer.resume

Why do we have to create a

loop

inside the fibre and break from it? Simply because now it's the outer world that decides when to finish the fibre. It signals the fibre to close the file (and add the

"---"

just for our information that the file was closed properly). If we remove the last line of the code (the one that calls the writer with no argument), we'll see that the file won't have

---

added at the end. It will be properly closed due to the finaliser hidden inside

File::open

, but it will be closed and released no sooner than the whole program ends, so in general it is a good idea to close the file explicitly.

But wait, there's no line 1 in the file! Yes, it's not there, and here's why: the first call to

writer.resume

did not correspond with a call to

Fiber.yield

from within the fibre because at the time of this call the fibre has not yet been started, so it was not waiting on

Fiber.yield

but at the beginning of its block. So "Line 1" just activated the fibre, but was not saved to the file.

What's the solution? First: to get the value of the first call to

writer.resume

you have to add arguments to the fibre block itself. So one of the solutions is like this:

writer=Fiber::new\
{ |l0|
  File::open(FILE,"w")\
  { |f|
    f.puts(l0)
    loop\
    {
      l=Fiber.yield
      break unless l
      f.puts l
    }
    f.puts "---"
  }
}

But it doesn't look too nice, nor is it. In our case a better solution might be to treat the first call as a special case and pass the file name in it, like this:

FILE="test.txt"

require 'fiber'

writer=Fiber::new\
{ |file|
  File::open(file,"w")\
  { |f|
    loop\
    {
      l=Fiber.yield
      break unless l
      f.puts l
    }
    f.puts "---"
  }
}

writer.resume FILE
writer.resume "Line 1"
writer.resume "Line 2"
writer.resume "Line 3"
writer.resume

For most cases I'd use this form.

Of course there are many more uses of

Fiber

, including ones that pass values in both directions simultaneously, not just in one of them as in the above examples.
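A minimal sketch of such two-way traffic, with made-up names: every resume passes a number into the fibre, and the running total comes back out as the value of that same call.

```ruby
summer = Fiber.new do |first|
  total = first
  loop do
    # Hand the current total out; the next resume's argument comes back in.
    total += Fiber.yield(total)
  end
end

a = summer.resume(5)   # starts the fibre, total is 5
b = summer.resume(10)  # total is 15
c = summer.resume(1)   # total is 16
```

The first resume only delivers the block argument, exactly as in the writer example, which is why the initial value is handled separately from the loop.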

Producer - Consumer
Here's one more way of looking at the whole fibre thing: it's sort of the producer - consumer pattern, with a queue of size limited to zero. In this way the element is produced no sooner than it is needed and most of the time there are zero elements waiting on the queue. And since the fibre is not a thread, there are no synchronisation problems and so on.

I hereby certify the Fiber class for everyday use.
