Piyush Ranjan 's den: February 2009

I have been using Rails for quite some time now. Rails is easy, but I feel stupid using it. It is bulky and slow for some of my use cases. I wanted something lean. I wanted something more geeky and powerful to handle lots of long-running requests (more than 100K/day with an average processing time of 30 seconds).
I have been playing with Mongrel handlers for about a year now and they work like a charm. In this post I will give a sneak peek into how to write a highly scalable back-end for doing real stuff. Mongrel is a very fast server and can take heavy load. So here we go, writing our own Mongrel server for heavy processing:


require 'rubygems'
require 'mongrel'
require 'mysql'

PORT=4444
class LogHandler < Mongrel::HttpHandler
  def initialize
    # One connection shared by the handler; the arguments are placeholders
    @@mysql = Mysql.connect("host", "username", "password", "database")
  end

  def process(request, response)
    response.start(200) do |head, out|
      logs = @@mysql.query("select * from huge")
      # Do some heavy processing on this data
      sleep 10
      # done
      logs.each { |row|
        # each row is an array of column values
        out.write(row.join("\t") + "\n")
      }
    end
  end
end

config = Mongrel::Configurator.new :host => "0.0.0.0", :port => PORT do
  daemonize(:cwd => Dir.pwd, :log_file => "server.log")
  listener(:num_processors => 150, :timeout => 300) do
    uri "/", :handler => LogHandler.new
  end
  trap("INT") { stop }
end

config.run.join

This piece of code registers the URL "/" on the given port (4444 in this case) and serves a huge log. Not very developer friendly, is it?
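As a rough sanity check on :num_processors => 150, the load figures from the introduction (100K requests/day, about 30 seconds each) can be turned into an average number of requests in flight using Little's law. This is only a back-of-the-envelope sketch: it assumes traffic is spread evenly over the day, which real traffic rarely is, and it treats num_processors as the cap on concurrent requests.

```ruby
# Little's law: average requests in flight = arrival rate * time per request.
# Load figures are the ones quoted in the introduction; even traffic is assumed.
REQUESTS_PER_DAY = 100_000
AVG_SECONDS      = 30.0
SECONDS_PER_DAY  = 24 * 60 * 60
NUM_PROCESSORS   = 150 # value passed to listener in the server code

def average_concurrency(requests_per_day, avg_seconds)
  arrival_rate = requests_per_day / SECONDS_PER_DAY.to_f # requests per second
  arrival_rate * avg_seconds                             # requests in flight
end

in_flight = average_concurrency(REQUESTS_PER_DAY, AVG_SECONDS)
headroom  = NUM_PROCESSORS / in_flight

puts format("average requests in flight: %.1f", in_flight)
puts format("headroom with %d processors: %.1fx", NUM_PROCESSORS, headroom)
```

On average about 35 requests are being processed at any moment, so 150 processors leave roughly a 4x margin for peaks.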

Well, we can use ERB along with it, and that will make things look a little easier.
Let's add some more code to class LogHandler for the ERB stuff.


require 'erb'

class LogHandler < Mongrel::HttpHandler
  def initialize
    # Keep the connection from the first version; redefining initialize would otherwise drop it
    @@mysql = Mysql.connect("host", "username", "password", "database")
    @mutex = Mutex.new
  end

  # This is for making instance variables of this class available in the template
  def get_binding
    binding
  end

  # We change the process function to render a rhtml file called view.rhtml
  def process(request, response)
    response.start(200) do |head, out|
      head["Content-Type"] = "text/html"
      logs = @@mysql.query("select * from huge order by id DESC limit 20")
      sleep 20 # Some heavy processing on logs
      rhtml = ERB.new(File.read("view.rhtml"))
      # @logs is shared by all request threads, so set it and render under the lock
      @mutex.synchronize {
        @logs = logs
        out.write rhtml.result(get_binding)
      }
    end
  end
end

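The mutex lock is important here: the same LogHandler instance serves every request, so its instance variables are shared between concurrent requests. Without the lock, two requests could overwrite each other's @logs while a template is being rendered, which is a race condition.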
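An alternative to locking, sketched here under the assumption that the template only needs the logs: give each request its own small view object and render against that object's binding, so nothing is shared between threads. RequestView and render_view are hypothetical names, and the template is inlined instead of being read from view.rhtml so the example runs on its own.

```ruby
require 'erb'

# Hypothetical per-request view object: every request gets its own instance,
# so @logs is never shared between threads and no mutex is needed.
class RequestView
  def initialize(logs)
    @logs = logs
  end

  def get_binding
    binding
  end
end

# Inline stand-in for view.rhtml to keep the sketch self-contained.
TEMPLATE = ERB.new("<% @logs.each do |row| %><%= row.join(',') %>\n<% end %>")

def render_view(logs)
  TEMPLATE.result(RequestView.new(logs).get_binding)
end

# Render from several threads at once; each thread sees only its own rows.
threads = (1..4).map do |i|
  Thread.new { render_view([["10.0.0.#{i}", "/page#{i}"]]) }
end
results = threads.map(&:value)
puts results
```

The trade-off is one extra object per request in exchange for not serializing the rendering step.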

view.rhtml looks something like this:


<html>
<head><title>Log console</title></head>
<body>
<h1>Last 20 hits on our page</h1>
<table>
<tr>
        <th>IP</th>
        <th>URL fetched</th>
        <th>Came from</th>
        <th>time taken</th>
</tr>
<% @logs.each { |row| %>
<tr>
        <td><%=row[0]%></td>
        <td><%=row[1]%></td>
        <td><%=row[2]%></td>
        <td><%=row[3]%></td>
</tr>
<% } %>
</table>
</body>
</html>

Not bad, right? I can now render rhtml as I'd do from a Rails application.

But hey, what about routing, MVC, ActiveRecord, logging, form helpers, JavaScript helpers, view helpers, callbacks, migrations and so on? Well, this is NOT a full-scale framework or a Rails substitute. If you want to do view-heavy things, use Rails or Merb. If you want to do processing-heavy jobs that produce simple HTML or no HTML at all, use this.

That said we can very easily sneak in a few of the Rails goodies.

1. ActiveRecord - That is easy. Just require 'active_record', call ActiveRecord::Base.establish_connection, and create models like this:


class User < ActiveRecord::Base
end

2. Logging
Rails does logging in two parts: request logging and response logging. You may add callbacks in the 'process' function to log the request and the response at the start and end of the function respectively.
Something like:


def process(request, response)
   requestLogger(request)   # log the incoming request
   # Do stuff
   responseLogger(response) # log what was sent back
end
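The two callbacks could be fleshed out along these lines with Ruby's standard Logger. This is only a sketch: FakeRequest, log_request, log_response and with_logging are hypothetical names, and the CGI-style keys read from request.params (REQUEST_METHOD, REQUEST_URI, REMOTE_ADDR) are assumed to be present, so check them against what your Mongrel version actually provides.

```ruby
require 'logger'
require 'stringio'

# Stand-in for a Mongrel request; only the params hash is modelled here.
FakeRequest = Struct.new(:params)

LOG_OUTPUT = StringIO.new # use a file name or $stdout in a real server
LOGGER     = Logger.new(LOG_OUTPUT)

def log_request(logger, request)
  p = request.params
  logger.info("Started #{p['REQUEST_METHOD']} #{p['REQUEST_URI']} for #{p['REMOTE_ADDR']}")
end

def log_response(logger, status, started_at)
  elapsed_ms = ((Time.now - started_at) * 1000).round
  logger.info("Completed #{status} in #{elapsed_ms}ms")
end

# Wraps the real work; the block returns the HTTP status it produced.
# The response line is written even if the block raises (status stays 500).
def with_logging(logger, request)
  status = 500
  started_at = Time.now
  log_request(logger, request)
  status = yield
  status
ensure
  log_response(logger, status, started_at)
end

request = FakeRequest.new({ "REQUEST_METHOD" => "GET",
                            "REQUEST_URI"    => "/",
                            "REMOTE_ADDR"    => "127.0.0.1" })
with_logging(LOGGER, request) { 200 }
puts LOG_OUTPUT.string
```

Inside a handler, the body of process would go in the block passed to with_logging.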

3. Routing
This one is the most difficult to implement. If you have only a few URLs to match and most of them are not dynamic, it is easier to hard-code them, but that is dirty. Implementing lightweight routing is not that difficult and not so dirty, so we take this path.

The first step in implementing routing is to get all the parameters, both POST and GET. You may use something like this:


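  require 'cgi'

  # Parses a url-encoded POST body ("a=1&b=2") into a hash with symbol keys
  def post_params(request)
    post_params = {}
    request.body.read.to_s.split('&').each { |x|
      k, v = x.split('=', 2)
      next if k.nil? || k.empty?
      # v is nil for a bare key such as "flag", hence to_s
      post_params[CGI.unescape(k).to_sym] = CGI.unescape(v.to_s)
    }
    return post_params
  end

  # Same for the query string; QUERY_STRING may be missing when the URL has no "?"
  def get_params(request)
    get_params = {}
    request.params["QUERY_STRING"].to_s.split("&").each { |x|
      k, v = x.split("=", 2)
      next if k.nil? || k.empty?
      get_params[CGI.unescape(k).to_sym] = CGI.unescape(v.to_s)
    }
    return get_params
  end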
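If hand-rolled splitting feels fragile, Ruby's standard library can do the decoding. The sketch below uses URI.decode_www_form; parse_form is a hypothetical helper name, and it takes the raw encoded string (a query string or a POST body) rather than a Mongrel request object.

```ruby
require 'uri'

# Parses an application/x-www-form-urlencoded string ("a=1&b=two+words")
# into a hash with symbol keys. nil or empty input yields an empty hash.
def parse_form(encoded)
  return {} if encoded.nil? || encoded.empty?
  URI.decode_www_form(encoded).each_with_object({}) do |(key, value), params|
    params[key.to_sym] = value
  end
end

params = parse_form("ip=10.0.0.1&url=%2Flogs%3Fpage%3D2&note=two+words")
puts params.inspect
```

As with the hand-written version, a repeated key keeps only its last value.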

The actual implementation of routing is a little complex and is not easy to cover in one blog post. Also, I may have done it wrong, so I do not want to put it out there yet. I will cover this in detail when I am sure about it.
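Purely to illustrate the idea, and not the implementation mentioned above, a lightweight router can be as small as a list of patterns mapped to blocks. TinyRouter and everything in it are hypothetical names. In a handler, the verb and path would presumably come from request.params (for example "REQUEST_METHOD" and "PATH_INFO"), which is an assumption to verify against Mongrel.

```ruby
# Hypothetical minimal router: maps an HTTP verb and a path pattern to a block.
class TinyRouter
  Route = Struct.new(:verb, :pattern, :keys, :action)

  def initialize
    @routes = []
  end

  # Registers a route, e.g. add("GET", "/logs/:id") { |params| ... }.
  # Segments starting with ":" become named parameters.
  def add(verb, path, &action)
    keys = []
    source = path.split("/").map do |segment|
      if segment.start_with?(":")
        keys << segment[1..-1].to_sym
        "([^/]+)"
      else
        Regexp.escape(segment)
      end
    end.join("/")
    source = "/" if source.empty? # "/".split("/") is empty
    @routes << Route.new(verb, /\A#{source}\z/, keys, action)
  end

  # Calls the first matching route's block with the captured parameters.
  # Returns the block's result, or nil when nothing matches.
  def dispatch(verb, path)
    @routes.each do |route|
      next unless route.verb == verb
      match = route.pattern.match(path)
      next unless match
      return route.action.call(route.keys.zip(match.captures).to_h)
    end
    nil
  end
end

router = TinyRouter.new
router.add("GET", "/") { |_params| "index" }
router.add("GET", "/logs/:id") { |params| "log #{params[:id]}" }

puts router.dispatch("GET", "/logs/42")
```

Routes are tried in registration order, so more specific patterns should be added first.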

The more stuff you add to this thing, the more it will start looking like Rails! IMHO it is not a bad idea to implement your own framework. I did implement a framework to run Rails code as it is, but I never used it on my production servers. Apparently that is the way it should be :)

Update 1: I have changed some code to highlight that the MySQL query is not the heavy call and that the processing is being done elsewhere (sleep in this case). The default MySQL libraries are not thread safe; however, one may use something like Neverblock.

Monday, February 23, 2009

Mongrel as a stand alone server