Friday 22 March 2013

Call by reference in Ruby

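Strictly speaking, Ruby passes every argument as a reference to an object, and that reference is itself passed by value: a method can mutate the object it receives, but reassigning the parameter does not affect the caller's variable. If we want to hand a method a handle to an object instead of the object itself, we can use the id of the object in Ruby's memory, i.e. the object space. The id can be obtained as follows.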

some_object = "some initialization"
some_object.object_id


Here is how pass by reference would work using object id values.

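def caller
    blah = "blahblueblah"
    # Pass only the numeric id, not the string itself.
    callee(blah.object_id)
end

def callee(object_id)
    # Look the object up again in the object space using the id we were given.
    puts ObjectSpace._id2ref(object_id)
end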
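
To see that the id really resolves to the same object and not a copy, here is a minimal sketch. The helper names `append_by_id` and `append_directly` are made up for illustration, and newer Ruby releases may warn that `_id2ref` is deprecated.

```ruby
# Hypothetical helper: resolve the id back to the live object and mutate it in place.
def append_by_id(id)
  target = ObjectSpace._id2ref(id)
  target << "!"
end

# Hypothetical helper: mutate the argument directly, for comparison.
def append_directly(str)
  str << "?"
end

blah = String.new("blahblueblah")  # String.new gives an unfrozen, mutable string
append_by_id(blah.object_id)       # mutation through the id lookup
append_directly(blah)              # mutation through an ordinary argument
puts blah
```

Both calls change the caller's string, which suggests that an ordinary argument already behaves like a reference as far as in-place mutation is concerned.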


Such use should be highly improbable and might even turn out to be bad practice; but sometimes we just want to do things for the heck of it.

Tuesday 19 March 2013

Bypassing ActiveRecord cache

ActiveRecord is the default object-relational mapping (ORM) layer of the Rails web framework and, as the name suggests, follows the active record architectural pattern. ActiveRecord maintains its own query cache, which is separate from the query cache of the underlying database server. This query cache is a rather simplistic one.

The issue that made us want to bypass the query cache involved the following sequence.

1. a call to first object of the model to check if any records existed (Model.first)
2. raw SQL query to truncate the table
3. a call again to first object of the model to check if any records existed (Model.first)

Now, #1 and #3 obviously generated the same SQL query. So, ActiveRecord served #3 from its cache, even though the table had been truncated in between.
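
The effect can be reproduced with a toy model in plain Ruby. This is only a sketch of a cache keyed by the SQL string, not ActiveRecord's actual implementation; the class `ToyConnection` and its methods are invented for illustration.

```ruby
# Toy stand-in for a table plus a query cache keyed by the SQL string.
class ToyConnection
  def initialize(rows)
    @rows = rows
    @cache = {}
  end

  # Cached read: an identical SQL string returns the remembered result.
  def select_first(sql)
    @cache.fetch(sql) { @cache[sql] = @rows.first }
  end

  # Write that goes around the cache without invalidating it.
  def truncate_behind_cache
    @rows.clear
  end

  def clear_cache
    @cache.clear
  end
end

conn = ToyConnection.new([{ id: 1 }])
sql = "select * from table_name limit 1"

before = conn.select_first(sql)  # step 1: reads the "table" and fills the cache
conn.truncate_behind_cache       # step 2: the table is now empty
stale = conn.select_first(sql)   # step 3: same SQL string, so the cached row comes back
conn.clear_cache
fresh = conn.select_first(sql)   # after clearing the cache, the empty table is visible

puts stale.nil?
puts fresh.nil?
```

In this toy, the stale read still returns the old row, and only the read after clearing the cache returns nil.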

We found multiple approaches to bypassing the ActiveRecord cache.

Approach 1
Clear the entire ActiveRecord cache. In Rails 2 this can be done using

ActiveRecord::Base.query_cache.clear_query_cache

In Rails 3, the same can be achieved using the following line.

ActiveRecord::Base.connection.clear_query_cache

This approach, however, clears the entire ActiveRecord cache, which in a production environment means increasing the load on the database server, which is already the bottleneck. Plus, this approach is like using a jackhammer where a finger tap would do.

Approach 2
This approach exploits the simplicity of the ActiveRecord cache. It forms the query in such a manner that the query string is very likely to be different from previous queries.

r = rand(1_000_000)            # any random number will do
where_clause = "#{r} = #{r}"   # always true, but the SQL text very likely differs between calls


Appending the above where clause to #1 and #3, we obtained queries that are very likely to differ from previous ones, at least within the lifetime of the cache. This approach is obviously not elegant.

Approach 3
In this approach, we went with raw SQL queries not only for #2 but also for #1 and #3. ActiveRecord does not seem to cache raw SQL queries. So we could replace the call to the 'first' method of the model with something similar to the following.


sql = "select * from table_name limit 1"
ActiveRecord::Base.connection.execute sql


Although this approach gets the job done, executing raw SQL is bad practice: it bypasses the model layer and ties the code to a specific table name.

Approach 4
Finally, we found a way of doing it through Rails. We need to explicitly tell Rails not to serve our queries from its cache for #1 and #3. This can be done as follows.

Model.uncached do
    Model.first
end


As this method does the job within the Rails framework, the abstraction provided by ActiveRecord remains unbroken.
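
For the existence check in #1 and #3, the uncached block can be wrapped in a small helper. This is a sketch: `table_empty?` is a made-up name, and it assumes that `uncached` returns the value of its block.

```ruby
# Hypothetical helper: true when the model has no records,
# with the lookup performed outside the query cache.
def table_empty?(model)
  model.uncached { model.first.nil? }
end

# Intended usage inside a Rails application (not run here):
#   table_empty?(Model)   # before the truncate
#   table_empty?(Model)   # after the truncate, not served from the cache
```

Passing the model in as an argument keeps the helper reusable for any model class that needs the same check.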

Thursday 14 March 2013

Memory snapshot in JetBrains RubyMine on Linux

I use RubyMine on Linux for Ruby on Rails development. Of late, it had been hanging frequently, so I reported this to JetBrains. They got back to me asking for more details so that the developer concerned could investigate the issue. On a side note, their customer support responded so quickly that I was amazed. Also, the person responding was pretty technical himself.

Getting back to the topic, they had asked me to provide a memory snapshot. The process of generating a memory snapshot is described here. However, that process requires users to download YourKit Java Profiler, which, apart from being a large download, comes with a 15-day license. That did not make sense to me, so I got back to them about it, and it turns out that on Linux you don't really need it. The Linux version of RubyMine comes with the profiler libraries bundled.

$RUBYMINE_HOME/bin/libyjpagent-linux.so
$RUBYMINE_HOME/bin/libyjpagent-linux64.so

To enable it, all you need to do is edit the following script and set IS_EAP to "true".

$RUBYMINE_HOME/bin/rubymine.sh

After that change, restarting RubyMine will show the memory and CPU snapshot icons. It is also advisable to provide thread dumps, as described here, along with the memory snapshot.

Thursday 7 March 2013

Extracting logs out of journalctl

Journalctl gives us nice consolidated logs. However, we often need to extract only parts of the logs, and there are multiple ways of doing it. To filter by process, you can use PID numbers as shown below.

journalctl _PID=<pid number>

To obtain the PID when you have the process name (or part of the process name), use the following:

ps aux | grep -i <process name>

The journalctl manual page mentions the _PID match only in its examples. A commonly used slicing option is to see the logs of the current boot only. This can be done as follows.

journalctl -b

Another option is to look at logs of a particular unit only. This can be done in the following way.

journalctl -u <unit name>

The unit name could be a daemon name such as 'mysqld'. Unfortunately, this does not work with 'kernel' as a unit. The -u option can be combined with the -b option, though. However, I often find myself dealing with messages from various units, so I scan through all the messages to find the ones I need. To extract them, I can use the time stamps in the messages in the following format.

journalctl --since='2013-03-06 22:58:34' --until='2013-03-06 23:00:34'

The beginning time stamp works fine, but the ending time stamp does not. I talked about it on the #systemd IRC channel; it has been fixed and the fix will be released soon.
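
When the time window is computed rather than typed by hand, the arguments can be built from a script. The following is a minimal Ruby sketch; the helper `journalctl_window_args` and the constant `TIMESTAMP_FORMAT` are made-up names, and the format string simply mirrors the time stamps used above.

```ruby
# Time stamp layout used with --since and --until above.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"

# Hypothetical helper: builds the journalctl argument list for a time window.
def journalctl_window_args(from, to)
  [
    "journalctl",
    "--since=#{from.strftime(TIMESTAMP_FORMAT)}",
    "--until=#{to.strftime(TIMESTAMP_FORMAT)}"
  ]
end

from = Time.new(2013, 3, 6, 22, 58, 34)
args = journalctl_window_args(from, from + 120)  # two-minute window
puts args.join(" ")
# To actually run it on a systemd machine: system(*args)
```

Passing the arguments as a list to `system` avoids having to quote the spaces inside the time stamps.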