Ruby (Al2O3::Cr)

2008-09-16

StringBuffer

Today, a potentially useful code sample: class StringBuffer. The class is a buffer into which you can write strings using various printing methods, and from which you can read all the data in the order of arrival. It is thread-safe, so it can be useful for one-way communication between two threads (or two such objects for two-way communication).

The class will use StringIO, a standard-library object that has several printing methods (we will add two more) and allows random-access reading and writing. We will remember the read position in the buffer and always write at the end. Data that has already been read is discarded once enough of it accumulates, to prevent high (infinite) memory usage.
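To make the position juggling concrete, here is a minimal sketch using a bare StringIO, without any threads or locking. The variable names (io, read_pos, first, rest) are made up for illustration and are not part of the class below.

```ruby
require 'stringio'

io = StringIO.new
read_pos = 0                 # where the next read should start

# Writing: always jump to the end first, so new data is appended.
io.pos = io.length
io.print "abc"
io.pos = io.length
io.print "def"

# Reading: jump back to the saved read position, then save the new one.
io.pos = read_pos
first = io.read(2)           # the two oldest characters
read_pos = io.pos

# A later write must not be disturbed by the read that happened in between.
io.pos = io.length
io.print "gh"

io.pos = read_pos
rest = io.read               # everything that has not been read yet
read_pos = io.pos

puts first                   # ab
puts rest                    # cdefgh
```

The same two moves (seek to the end before writing, seek to the saved position before reading) are what the generated methods in the class do inside the mutex.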

Here's the code, with comments inside this time.

require 'pp' # allows method pp (pretty print); test in irb
require 'stringio'
require 'thread'

# Thread synchronizer. A thread calls +wait+ and is stopped
# until the +timeout+ passes or until the method +signal+
# of this object is called to release all waiting threads.
class Synchronizer

  def initialize
    @waiting=[]
    @mutex=Mutex::new
  end
  
  attr_reader :mutex # in case somebody wants to use it
  
  def wait(timeout=nil)
    thr=Thread.current
    begin
      # be sure to add myself to the list of waiting threads
      @mutex.synchronize{@waiting<<thr}
      # sleep given time or forever (yes, it does it)
      sleep(*[timeout].compact)
    ensure
      # be sure to remove myself
      @mutex.synchronize{@waiting.delete(thr)}
    end
  end
  
  def signal
    # wake up all waiting threads
    @mutex.synchronize{@waiting.each{|t| t.wakeup}}
  end

  # Check if the thread is currently waiting
  def waiting_thread?(thr)
    raise ArgumentError,"Argument must be Thread!"\
          unless thr.is_a? Thread
    # read the list under the mutex, like every other access to it
    @mutex.synchronize{@waiting.include?(thr)}
  end
  
end

# We add two methods to the +StringIO+.
class StringIO
  
  # Inspect into the buffer (self).
  def p(*args)
    puts args.map{|a| a.inspect}
  end
  
  # Pretty print into buffer (self).
  def pp(*args)
    args.each{|a| PP.pp(a,self)}
    nil 
  end
  
end

class StringBuffer
  
  # List of writing methods.
  WRITE_METHODS=[:write,:<<,:print,:puts,:putc,:printf,:p,:pp]
  
  # List of reading methods.
  READ_METHODS=[:read,:gets,:getc,:readchar,:readline]
  
  # If this much data is in the buffer, empty it.
  TRUNC_LENGTH=1000
  
  # dynamically define all writing methods
  WRITE_METHODS.each\
  { |wm|
    class_eval(
      <<-METHOD
        def #{wm}(*args)
          # safely (mutex)
          @mutex.synchronize\
          {
            # move the +StringIO+ pointer to the end
            @buff.pos=@buff.length
            # call the same method on the internal buffer
            ret=@buff.#{wm}(*args)
            # signal the synchronizer in case
            # some thread was waiting for data
            @synchronizer.signal unless empty?
            ret
          }
        end
      METHOD
    )
  }

  # dynamically define read methods
  READ_METHODS.each\
  { |rm|
    class_eval(
      <<-METHOD
        def #{rm}(*args)
          @mutex.synchronize\
          {
            # move pointer to the saved position of last read
            @buff.pos=@r
            # perform the read
            ret=@buff.#{rm}(*args)
            # save the new pointer
            @r=@buff.pos
            # call +trunc+ if there is at least +TRUNC_LENGTH+
            # bytes of unnecessary data
            trunc if @r>=TRUNC_LENGTH
            ret
          }
        end
      METHOD
    )
  }

  def initialize
    @buff=StringIO::new
    # read position
    @r=0
    @mutex=Mutex::new
    # synchronizer for threads waiting for data
    @synchronizer=Synchronizer::new
  end
  
  attr_reader :synchronizer
  
  def length
    @buff.length-@r
  end
  
  def eof?
    length==0
  end
  
  alias empty? eof?
  
  def wait_for_data(timeout=nil)
    @synchronizer.wait(timeout) if empty?
    self unless empty?
  end
  
private

  def trunc
    @buff.string=@buff.string[@r..-1]
    @r=0
  end
  
end

A small test

irb(main):002:0> s=StringBuffer::new
=> #<StringBuffer:0x2bf4f44 @mutex=#<Mutex:0x2bf4ef4>, @r=0,
# @buff=#<StringIO:0x2bf4f1c>,
# @synchronizer=#<Synchronizer:0x2bf4ee0 @mutex=#<Mutex:0x2bf4e54>,
# @waiting=[]>>
irb(main):003:0> Thread::new{loop{sleep 1;s.print "X"}}
=> #<Thread:0x2bf0b60 sleep>
irb(main):004:0> loop{s.wait_for_data;puts s.read}
XXXXXXXXXX
X
X
X
X
X
X
X

The first line with lots of X's appears because that many of them had accumulated in the buffer before I started the reading loop.
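The Synchronizer above rests on two Thread primitives: sleep without an argument (sleep until woken) and Thread#wakeup. The following is a small sketch of just that mechanism, outside the class; the names waiter and result are made up for illustration.

```ruby
# [timeout].compact drops a nil timeout, so sleep gets one argument or none.
no_timeout   = [nil].compact   # []     -> sleep()     sleeps until woken
with_timeout = [0.5].compact   # [0.5]  -> sleep(0.5)  sleeps at most 0.5 s

waiter = Thread.new do
  sleep                        # no timeout: wait for a wakeup
  :woken
end

# Wait until the thread is really asleep, then wake it up.
sleep 0.01 until waiter.status == "sleep"
waiter.wakeup

result = waiter.value          # joins the thread and returns the block's value
puts result.inspect            # :woken
```

Note that a wakeup sent before the thread goes to sleep is simply lost, which is why the sketch polls the thread's status first.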

2008-09-06

class Array

@ 18:04

A handful of methods that could be added to the class Array:

class Array

  def sum
    s=0
    each{|e| s+=e}
    s
  end

  def mul
    m=1
    each{|e| m*=e}
    m
  end

  def mean
    sum.to_f/length
  end
  
  def map_with_index
    i=-1
    map{|e| yield(e,i+=1)}
  end
  
  def map_with_index!
    i=-1
    map!{|e| yield(e,i+=1)}
  end
  
  def any_with_index?
    each_with_index{|e,i| return true if yield(e,i)}
    false
  end
  
  def all_with_index?
    each_with_index{|e,i| return false unless yield(e,i)}
    true
  end
  
  def find_index
    each_with_index{|v,i| return i if yield(v)}
    nil # nothing found (otherwise each_with_index would return self)
  end
  
  def find_indices
    ret=[]
    each_with_index{|v,i| ret<<i if yield(v)}
    ret
  end
  
  def select_by_index(*indices)
    ret=[]
    indices.each{|ind| ret<<self[ind]}
    ret
  end
  
  alias find_indexes find_indices
  
  def to_hash
    raise "Cannot convert to Hash!" unless all?\
    { |e|
      e.respond_to? :length and e.length==2 and e.respond_to? :[]
    }
    h={}
    each{|e| h[e[0]]=e[1]}
    h
  end
  
  def keys_to_hash
    h={}
    each{|e| h[e]=yield(e)}
    h
  end
  
  def keys_with_index_to_hash
    h={}
    each_with_index{|e,i| h[e]=yield(e,i)}
    h
  end

  def with(a2)
    ensure_same_length(a2)
    map_with_index{|e,i| [e,a2[i]]}
  end

  def with_to_hash(a2)
    ensure_same_length(a2)
    h={}
    each_with_index{|e,i| h[e]=a2[i]}
    h
  end
  
  def count_all
    h={}
    each\
    { |e|
      h[e]||=0
      h[e]+=1
    }
    h
  end
  
  def group_by
    h={}
    each\
    { |e|
      g=yield(e)
      h[g]||=[]
      h[g]<< e
    }
    h.map{|g,ee| ee}
  end
  
  alias contain? include?
  alias has? include?
  
  def rand
    a=to_a
    a[Kernel.rand(a.length)] unless a.empty?
  end
  
private

  def ensure_same_length(arg)
    raise ArgumentError,"Argument must be of the same length!"\
          unless arg.respond_to? :length and length==arg.length\
          and arg.respond_to? :[]
  end
  
end

They are not perfect, but I use them quite a lot.

Some of them (or similar methods) will be present in Ruby 1.9. For example, the method inject (also available under the name reduce) will accept the name of an operation, working like this:

[1,4,5].reduce(:*)           #=> 20 # 1*4*5
["a","b","dd"].reduce(:+)    #=> "abdd"

As you see, this is better than my sum and mul, because it works for any type for which the operation is defined. This reduce is not hard to implement either, but it will probably work a bit faster when included in Ruby core.
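As a rough illustration of how little code such a reduce needs, here is a sketch under the hypothetical name my_reduce (chosen so that the built-in reduce stays untouched). It accepts either a symbol naming a binary method or a block, and uses the first element as the starting value.

```ruby
class Array
  # my_reduce is a made-up name for this sketch, not a core method.
  def my_reduce(op = nil)
    acc = nil
    first = true
    each do |e|
      if first
        acc = e                    # the first element seeds the accumulator
        first = false
      elsif op
        acc = acc.send(op, e)      # symbol form: call the named method
      else
        acc = yield(acc, e)        # block form
      end
    end
    acc                            # nil for an empty array
  end
end

p [1, 4, 5].my_reduce(:*)               # 20
p ["a", "b", "dd"].my_reduce(:+)        # "abdd"
p [1, 4, 5].my_reduce { |a, e| a + e }  # 10
p [].my_reduce(:+)                      # nil
```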

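Ruby 1.9 is also going to have group_by. It groups the elements the same way as mine, but it returns a Hash that maps each block result to its group, whereas mine returns only the array of groups.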
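A quick check of the built-in group_by (Ruby 1.8.7 and later) is worthwhile, because its return value is a Hash; calling values on the result gives just the groups. The variable names here are made up for illustration.

```ruby
words = ["Ruby", "Al2", "O3", "Cr"]

by_length = words.group_by { |w| w.length }
p by_length          # {4=>["Ruby"], 3=>["Al2"], 2=>["O3", "Cr"]}

groups = by_length.values
p groups             # [["Ruby"], ["Al2"], ["O3", "Cr"]]
```

The order shown assumes Ruby 1.9 or later, where a Hash keeps insertion order.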

Move to Enumerable
One more enhancement that can be made to the above code is to move all the methods into the module Enumerable (just write module Enumerable instead of class Array at the top). It allows you to use these methods with other enumerable types too, like Hash. You'll have to test the methods, though: not all of them make sense for structures whose elements are not ordered, and some of them (mean, for example, which calls length) rely on methods that Enumerable itself does not provide.
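Here is a small sketch of the move, using count_all as the example because it needs nothing but each. Once it lives in Enumerable, it works on ranges, enumerators and hashes as well as arrays.

```ruby
module Enumerable
  # Same idea as Array#count_all above, but defined for every Enumerable.
  def count_all
    h = {}
    each { |e| h[e] = (h[e] || 0) + 1 }
    h
  end
end

p [1, 2, 1].count_all            # {1=>2, 2=>1}
p (1..3).count_all               # {1=>1, 2=>1, 3=>1}
p "hello".each_char.count_all    # {"h"=>1, "e"=>1, "l"=>2, "o"=>1}

# With a Hash, each element is a [key, value] pair:
p({:a => 1}.count_all)           # {[:a, 1]=>1}
```

The last line shows why testing is needed: for a Hash the method counts key-value pairs, which may or may not be what you want.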

Add to load path
If you create some files that you'd like to be easily accessible in your Ruby programs, you can add their directory to the Ruby load path, so that you will be able to require your files without giving the full path. Under Windows, just go to environment variables, and add
RUBYLIB = P:/ath/To/Your/Dir
The path will be automatically added to the Ruby load path each time Ruby starts, which can be verified by typing $: (or $LOAD_PATH) in irb and looking for your path.
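If you would rather not touch environment variables, the load path can also be extended from inside a running program, since $LOAD_PATH is an ordinary array. The sketch below uses a temporary directory and a made-up file name (my_lib_demo.rb) so that it runs anywhere; in real use you would put your own directory there.

```ruby
require 'tmpdir'

Dir.mktmpdir do |dir|
  # A throwaway library file; name and content are made up for this sketch.
  File.open(File.join(dir, "my_lib_demo.rb"), "w") do |f|
    f.puts "MY_LIB_DEMO_LOADED = true"
  end

  $LOAD_PATH.unshift(dir)      # affects only the current Ruby process
  require 'my_lib_demo'        # found without giving the full path
  $LOAD_PATH.delete(dir)       # tidy up; the file stays loaded
end

puts MY_LIB_DEMO_LOADED        # true
```

Unlike RUBYLIB, this change is gone when the program exits.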

If you want some of your files to be loaded even without the need to require them, you can add them to the environment variable RUBYOPT. This variable may already contain -rubygems. If you want the file P:/ath/To/Your/Dir/start.rb to be loaded at startup, change the variable to
-rubygems -rstart
Each word starting with -r makes Ruby load the file named by the rest of the word. Ruby will find your file because you have already added its directory to the Ruby load path. If you want to load more files at startup, it is best to require them from within your first file.

As you might have guessed, there is a file named ubygems.rb that the original content of the variable causes to load. The strange name was in fact chosen only to make the whole option read sensibly. All it does is load rubygems.rb, which initialises the Gems engine, enabling programs to use additional libraries.

2008-09-01

{block}

@ 21:34

Blocks: the most powerful of the basic features of Ruby.

A block is a way to pass a bit of code into a function, to let the function execute it if it wants to, and as many times as it wants to. You already know some examples, like [1,2,5].each{|e| puts e}, where the function each calls the block three times - once for each element of self.

Let's learn how to write a function that takes a block. I'd like to have a method of the class Array that converts the array to a Hash, where the original array elements become keys, and the values are computed inside the block. Example of how it is supposed to work:

[1,5,3].keys_to_hash{|k| k**2}
#=> {1=>1,5=>25,3=>9}

["Ruby","Al2","O3","Cr"].keys_to_hash{|k| k.length}
#=> {"Al2"=>3,"O3"=>2,"Ruby"=>4,"Cr"=>2}
# remember that Hash does not maintain the order of elements,
# so they might get reordered when printed in irb

So, our function definitely takes a block, and executes it once for each element, and collects the return values of the block as values in the hash. The code that does it is like this:

class Array
  def keys_to_hash
    raise LocalJumpError,"Block not given!" unless block_given?
    h={}
    each\
    { |e|
      h[e]=yield(e)
    }
    h
  end
end

First we raise an exception if the method was called without a block. This line is not obligatory, as the exception would be raised anyway at the moment when we try to execute the block, so I raise it here mostly to show you how to check if a block is given.
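Both behaviours mentioned above can be seen in a few lines. This is only a sketch; the method names with_check and without_check are made up for illustration.

```ruby
def with_check
  return :no_block unless block_given?   # ask before yielding
  yield
end

def without_check
  yield                                   # raises if no block was passed
end

p with_check                  # :no_block
p with_check { :called }      # :called

error = begin
  without_check
rescue LocalJumpError => e
  e
end
p error.class                 # LocalJumpError
```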

Then we create an empty hash, and then for each element of self (a bare each works like self.each) we write an entry to the hash, using the current element e as the key, and yield(e) as the value. As you must have guessed by now, the keyword yield is a call to the passed block.

Finally we return the created hash h as the function result. You can check that the function works as expected.

Just one more example:

class Array
  def each_consequent(n)
    for i in (0..length-n)
      yield(*self[i,n])
    end
    self
  end
end

[2,3,5,7,11,13,17,19,23].each_consequent(3)\
{ |a,b,c|
  puts "#{a} #{b} #{c}"
}

# output:
2 3 5
3 5 7
5 7 11
7 11 13
11 13 17
13 17 19
17 19 23

I'll explain just the most suspicious part here: self[i,n] is an array (a subarray of self), and we call yield with * before the array to make the array splat into the three block arguments |a,b,c|. This splat operator is not always necessary, but it's nice to include it to make it clear that the arguments get splatted.
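The difference between yielding an array and yielding its splatted elements only shows up when the block does not take several parameters. A small sketch (the method names are made up; the results assume Ruby 1.9 or later):

```ruby
def yield_array
  yield([1, 2, 3])      # one argument: the array itself
end

def yield_splat
  yield(*[1, 2, 3])     # three separate arguments
end

# With three block parameters, both forms behave the same:
p yield_array { |a, b, c| [a, b, c] }   # [1, 2, 3]
p yield_splat { |a, b, c| [a, b, c] }   # [1, 2, 3]

# With a single parameter the difference is visible:
p yield_array { |a| a }                 # [1, 2, 3]  (the whole array)
p yield_splat { |a| a }                 # 1          (first argument only)
```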

If block is an object...
There are in general two ways of passing a block on to another function. Let's define two functions that behave exactly like each:

class Array

  def my_each1
    each{|a| yield(a)}
  end

  def my_each2(&b)
    each(&b)
  end

end

The first one makes a trivial block itself - the block is created just to call the original block coming to my_each1 with the argument. The second one uses the & operator to have the block assigned to the variable b. Inside my_each2 the variable b is a Proc object. You could call it by hand inside the function, using b.call(arg) or for short b[arg], but in our example it is instead passed to each, and the operator & turns it back into a block. Two other ways to do it (not very elegant, though): p=Proc::new{|a| yield(a)}; each(&p), or another ugly way: each{|a| b.call(a)}. I give these examples just to tickle your brain and make you understand!

If no block is passed to a function declared with a block parameter, like my_each2, the value of b is nil, and you don't have to call block_given? to check it.
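A tiny sketch of that nil check, with a made-up method name (block_info):

```ruby
def block_info(&b)
  if b.nil?
    "no block"
  else
    "got a #{b.class}, calling it gives #{b.call.inspect}"
  end
end

puts block_info            # no block
puts block_info { 42 }     # got a Proc, calling it gives 42
```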

...then we can store it
Now another useful trick. If we can receive a block as an object, or wrap it in a new Proc, then it's an object, and can be stored in a variable. Look:

class K

  def store_block(&b)
    @b=b
  end
  
  def call_block(*args)
    @b.call(*args)
  end

end

k=K::new
k.store_block{|a,b| puts "#{a}::#{b}"}   # no output to the console
k.call_block("Al2O3","Cr")               # output: Al2O3::Cr
k.call_block("Hi","there")               # output: Hi::there

So, we saved the passed block, and called it later. Note one very useful trick: if we receive the arguments as *args and pass them on as *args as well, then any set of arguments, no matter how many of them you pass to call_block, will get forwarded to the block call. (Of course, calling k.call_block(1,2,3) will now print just "1::2", because our block takes two arguments and ignores the third one; but the argument gets lost in the block, and not in call_block.)

This block saving is not useless. You can for example call a method that saves a block, and executes it later as a callback to an event that happens inside the object. This is a very useful behaviour.
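As a sketch of the callback idea, here is a hypothetical Counter class (the class, its methods and the limit of 3 are all made up for illustration) that stores a block and calls it when an event happens inside the object.

```ruby
class Counter
  def initialize(limit)
    @limit = limit
    @count = 0
    @on_limit = nil
  end

  # Store the block; nothing is executed yet.
  def on_limit(&b)
    @on_limit = b
    self
  end

  def increment
    @count += 1
    # The "event": the counter has just reached its limit.
    @on_limit.call(@count) if @on_limit && @count == @limit
    @count
  end
end

log = []
c = Counter.new(3)
c.on_limit { |n| log << "reached #{n}" }
5.times { c.increment }
p log     # ["reached 3"]
```

The caller decides what should happen; the object decides when.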

Passing more blocks
Unfortunately, Ruby doesn't support passing more than one block to a function. You can have only one parameter with &, and there is only one yield too. But Ruby does allow passing multiple regular arguments, so what's the problem? Let's write a function that sort of takes two blocks, and calls one of them with the result returned by calling the other with the argument 5, or the other way round:

def random_caller(b1,b2)
  raise ArgumentError,"Arguments must be Procs"\
        unless b1.is_a? Proc and b2.is_a? Proc
  if rand(2).zero?
    b1.call(b2.call(5))
  else
    b2.call(b1.call(5))
  end
end

q=lambda\
{
  random_caller(lambda{|x| x+2},lambda{|x| x**2})
}
q[]   #=> 27
q[]   #=> 27
q[]   #=> 49
q[]   #=> 27
q[]   #=> 27

First we check that what we got really are procs. Then we randomly call one of them with 5 and the other with the result of the first one, or the opposite, and return the result.

Now the call. The structure lambda{|arg| exp} is more or less the same as Proc::new{|arg| exp} and proc{|arg| exp}. So random_caller(lambda{|x| x+2},lambda{|x| x**2}) is a call to our function, and we can expect the result of the call to be either (5+2)**2, which is 49, or (5**2)+2, which is 27.

Now we must call our function multiple times. We could do it like this:

5.times{random_caller(lambda{|x| x+2},lambda{|x| x**2})}

But, as a part of this tutorial, I made the call to the function into another proc, and stored it in q. As you see, you don't even have to pass a block to a function to store it somewhere. You can create a proc just like that, store it in a local variable, and then call it using q[] or q.call.
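The "more or less the same" above hides one difference worth seeing once: in Ruby 1.9 and later, a lambda checks the number of arguments strictly, while a proc made with Proc::new is lenient. A short sketch (variable names made up):

```ruby
strict  = lambda    { |a, b| [a, b] }
lenient = Proc::new { |a, b| [a, b] }

p strict.lambda?          # true
p lenient.lambda?         # false

# The plain proc pads missing arguments with nil and drops extra ones:
p lenient.call(1)         # [1, nil]
p lenient.call(1, 2, 3)   # [1, 2]

# The lambda refuses a wrong argument count:
error = begin
  strict.call(1)
rescue ArgumentError => e
  e
end
p error.class             # ArgumentError
```

The other main difference, how return behaves inside the body, is covered by the links at the end of this post.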

Scope
The scope visible to a block is the scope in which it was declared. What is very interesting: even when that scope is no longer reachable, because control has left the function, it continues to exist as long as a lambda declared there can still use it. This example illustrates the complicated words I just said:

def create_blocks
  x=nil
  getter=lambda{x}
  setter=lambda{|v| x=v}
  [setter,getter]
end

s,g=*create_blocks
s[6]      # or s.call(6)
g[]       #=> 6
s[:R]
g[]       #=> :R

The scope from inside create_blocks is not lost, even though the control left the method and will never return. The variable x is still accessible by the lambdas declared in the scope.
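One more consequence is easy to miss: every call to such a method creates a fresh scope, so lambdas from different calls do not share variables. A sketch with a made-up make_counter method:

```ruby
def make_counter
  count = 0
  increment = lambda { count += 1 }
  current   = lambda { count }
  [increment, current]
end

inc_a, cur_a = make_counter
inc_b, cur_b = make_counter

inc_a[]; inc_a[]     # bump the first counter twice
inc_b[]              # bump the second counter once

p cur_a[]            # 2
p cur_b[]            # 1
```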

Other sources
Here are some links to learn more about gotchas in Ruby's blocks.
Ruby blocks gotchas
Proc vs lambda
Wikipedia - Closure (in many other languages the Ruby block thing is called a closure, or probably more like the closures are called blocks in Ruby)
Wikipedia - Smalltalk (these blocks are pretty modern and fresh programming things, aren't they? well, they're not; have a look at Smalltalk (1980))
