Saturday, 21 December 2013

Spellcheck in Emacs

Recently I have been using Emacs to write a few essays. While writing, I started missing the spellcheck feature of Microsoft Word. Many browser-based tools, such as Gmail, also provide that feature, which has made me used to it. So, I did a quick lookup to find out how to trigger spellcheck on the current Emacs buffer. The answer was to simply use M-x ispell. Trying that, I got the following error in the Messages buffer below the main buffer.

No word lists can be found for the language "en_US"

The command mentioned above uses aspell, which I knew was installed on my system. So, looking up the aspell error, I figured I needed to install the package aspell-en to fix it. On Arch Linux, that can be done as follows.

pacman -S aspell-en

Sunday, 15 December 2013

Gem installation fails due to packaged gems

Rake 0.9.6 comes bundled with Ruby 2. However, I needed to build ruby-debug-ide 0.4.17.beta16 against Rake 0.8.7. I tried uninstalling Rake 0.9.6, but there is no way to do it: the following command does not list 0.9.6 as an option.

gem uninstall rake

So, now I had to find a way to tell the gem utility to use rake 0.8.7 instead of 0.9.6. A little reading showed that setting the RAKE environment variable could do this.

RAKE=/usr/lib/ruby/gems/2.0.0/gems/rake-0.8.7/bin/rake gem install ruby-debug-ide -v 0.4.17.beta16 --pre

The above did not work for some reason. Then I tried the following, and it worked.

RAKE=`bundle exec rake` gem install ruby-debug-ide -v 0.4.17.beta16 --pre

Saturday, 14 December 2013

Checking memory usage for a set of processes

Recently, I had to monitor the memory usage of a set of processes. I used pmap to help me out. The following script gives the memory usage of each PID I was interested in.

for foo in `ps aux | grep <my_process_identifier> | awk -F ' ' '{print $2}'`
do
    echo $foo
    pmap $foo | tail -n 1
done

Monday, 9 December 2013

Coherence break in user experience

Why give the option for special characters when they are not accepted?

Wednesday, 4 December 2013

Set MySQL output charset

The character set of the results returned by MySQL is controlled by the character_set_results variable. When a client connects to a MySQL server, this variable is set to the character set of the client's environment. So, if you connect from an ISO-8859-1 (Latin-1) console, the variable is set to the Latin-1 encoding. The current value can be checked with the following query.

show variables like 'character_set_results';

To set the variable, either start the client from a console whose encoding is set appropriately, or run the following query.

set character_set_results = <character_set>;

Monday, 21 October 2013

Remove X-Powered-By header in Nginx server

It is often desirable to remove the 'X-Powered-By' HTTP header. When using Nginx, this can be done with a simple setting. The setting has to be in the http section and takes the following form.

<module_name>_hide_header X-Powered-By;

For example, if it is a FastCGI server the configuration setting will be as follows:

fastcgi_hide_header X-Powered-By;

If you are using Nginx as a load balancer, the configuration setting will be as follows:

proxy_hide_header X-Powered-By;

Unfortunately, this setting is not yet available in Passenger, because Passenger is a third-party module and its developers have not implemented it.

Saturday, 19 October 2013

SVN export using capistrano

If you are using SVN for version control and Capistrano for deployments, Capistrano checks the code out on the application machines. It is often preferable to use svn export instead of svn checkout. Capistrano defaults to checkout, but it can be configured to use export by setting the :deploy_via parameter to :export in the deployment script.
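In practice, this is a one-line change in the deployment script. A minimal sketch of a config/deploy.rb fragment for Capistrano 2; the repository URL is a placeholder.

```ruby
# Hypothetical fragment of a Capistrano config/deploy.rb;
# the SVN URL below is a placeholder.
set :scm, :subversion
set :repository, "svn://svn.example.com/myapp/trunk"

# Deploy via `svn export` instead of the default `svn checkout`,
# so no .svn metadata lands on the application machines.
set :deploy_via, :export
```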

Friday, 11 October 2013

Retrieving crontab after accidental removal

Yesterday at my workplace, someone accidentally deleted all the entries in a crontab. We started looking for ways to restore it as fast as possible. The tedious route of identifying all the jobs running on that machine would be our last option, and it would also lose the frequency tuning done for the various cron jobs set up on it.

Based on the usual Unix practice of backing up a file while saving an edit, we wanted to find out whether the same happens with the crontab too. For that, we needed to know where the crontab file is stored. A look at the manual page using the following command was enough to tell us that it is stored at /var/spool/cron/.

man 8 cron

Looking in that folder, we could see the empty crontab file, but no backup file. At this point, we did not have much hope of retrieving it. Just to try our luck, we looked in the /tmp folder, and fortune did favour us: we found a leftover file there which had the previous crontab entries.


Wednesday, 25 September 2013

Declining BSNL

BSNL is India's public sector telecom company. A few years ago, it was doing good business because its mobile network was the most widespread. Also, its internet services were more decent and consistent than those of local competitors. However, of late, BSNL's services have degraded. Earlier, BSNL did not have an online payment system; that was fine with most of its customers, although online payment would have been easier. Recently, they introduced online payment, and their online bill check system fails.

There was no need to rush out an online payment site and make such a mess.

Another issue with BSNL is misinformation/outdated information.

They list toll-free phone numbers for users of all mobile carriers to report problems. However, in reality, you need to dial in from a BSNL mobile only.

BSNL would only need to maintain its service to stay in the market. However, I guess even that is too difficult, especially when employees do not want to come to office on time and want to leave early. Such thinking is very short-sighted and clearly shows how disinterested the management is. It is as if they are steering themselves into doom.

Saturday, 29 June 2013

Change repeat rate in awesome WM

I had noticed that when you keep a key pressed in KDE, it takes less time for the character to start repeating than when I do the same in Awesome. To get my desired behaviour, I had to set something called the auto-repeat rate. A quick look at the manual page [1] tells us that we can turn auto-repeat off either completely or for specific keys. We can also set the delay and the rate of auto-repeat, if the XFree86-Misc extension of the X server is implemented. Next, I had to find the delay and repetition rate values I felt comfortable with, so I looked up the values in my KDE environment [2]. When I tried those values in Awesome, I was back in my comfort zone with the keyboard. The last step was to set the values every time Awesome started [3], and my system was all set.

[1] To see the options, try man xset, search for the option r or the string "autorepeat".

[2] The values can be found using the following command.

xset q | grep rate

[3] In the rc.lua file, just add the following line. You might want to customize the values for yourself.

os.execute("xset r rate 220 30")

Friday, 28 June 2013

Cleaning up MySQL binary logs

While analyzing disk usage, I found that /var/lib/mysql was running into gigabytes. I use MySQL only for local testing, so there is no way I could have generated that much data. When I checked the folder, there were many log files eating up the space. Following the docs, clearing them was rather easy. All I had to do was run the following command on the MySQL console.

purge binary logs before '2013-06-28 00:00:00';

There is only one catch: this works only when the log_bin option in my.cnf is not commented out.

Thursday, 27 June 2013

Rails not detecting iconv

Whenever I started the Rails console or server, I was getting the following line about iconv.

no iconv, you might want to look into it.

I tried installing the iconv package followed by the gem. Yet, I was unable to get rid of the message. It did not bother me much until I started using the rubyzip gem for creating some archives, as it requires the iconv gem. The solution is actually quite simple: just add iconv to the Gemfile and rubyzip works fine. The error line disappears as well. Of course, this is just a workaround to get things going. I might do a proper analysis some time later; suggestions for that are welcome.

Friday, 31 May 2013

[Rails:] false.present? being false is not intuitive (to me at least)

In Rails, I often use the present? method to check whether a key-value pair is present in an input hash. The code would be something like the following.

def trial_method(params)
    o = Model.where(:id => params[:id].to_i).first
    o.field = params[:field_key] if params[:field_key].present?
end

This works fine in most cases. However, when field is a boolean field, there is a subtle problem with this code.

When params[:field_key] is true, the code works fine; but whenever it is false, it fails. The intent is to check whether the key-value pair is present and, if it is, assign the value to the field. The value can be either true or false, and both should be assigned to the field. However, when the value is false, present? returns false, so the field is not set at all. The field value therefore stays true and cannot be flipped to false with the above code. The correction is, of course, simple.

o.field = params[:field_key] unless params[:field_key].nil?

This behaviour did not seem intuitive to me. So, I looked up documentation and source for present?. The definition of present? is quite simple.

def present?
  !blank?
end

Clearly all it does is to negate the result of blank?. So, I looked up the definition for that too.

def blank?
  respond_to?(:empty?) ? empty? : !self
end

When I tested false.respond_to?(:empty?), it returned false. So, it is clearly executing the !self part, which translates to false.blank? being true and therefore false.present? being false.
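The whole chain can be reproduced outside Rails with a minimal re-implementation of the two methods, modelled on the definitions quoted above (a sketch, not the actual ActiveSupport source):

```ruby
# Minimal sketch of ActiveSupport's blank?/present?, modelled on the
# definitions quoted above; runs with plain Ruby, no Rails needed.
class Object
  def blank?
    respond_to?(:empty?) ? empty? : !self
  end

  def present?
    !blank?
  end
end

puts true.present?    # => true
puts false.present?   # => false: !false is true, so false counts as blank
puts nil.present?     # => false
puts "".present?      # => false: "" responds to empty? and is empty
```

This is exactly why the `unless params[:field_key].nil?` form is the safe check for boolean values.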

Thursday, 30 May 2013

LibreOffice is not showing CSV files properly

I get to deal with comma-separated (.csv) and tab-separated (.tsv) files from time to time, and I have seen LibreOffice show gibberish when I try to view them.
The fix is actually very simple: I just had to set the character encoding correctly in the import dialog.

Monday, 27 May 2013

Separating data processor implementation and data processing logic

After about a year, yesterday I was looking at some Mainframe source code, some JCL to be precise. Looking at the familiar IDCAMS and REPRO procedures, I could get a sense of what the code was doing. Then I saw a sorting routine which used the program ICETOOL. I could figure out the input and output streams defined, but I could not find the parameters for sorting. Finally, I figured they were being passed through the TOOLIN interface. Interestingly, the sorting parameters were kept in a separate file. This is nice programming: separating the rules from the implementation, so that when the input changes, only that rules file needs to change and the rest of the program keeps working untouched.

Thursday, 23 May 2013

Search in AWS console

When I managed a small number of servers, looking up a server in the AWS console was rather easy. However, as the number of servers grew, I started wishing for a filter/search option. Fortunately, the AWS console provides a nice search feature.

Initial view when my instances were showing and there was no filter text.

After I added incorrect filter text, no matching instances were found.

Directory size listing

A lot of times when freeing up disk space [well yes, in today's world people like me still need to do it], instead of opening my file manager and finding the size of each folder through the properties entry in the right-click menu, I wanted a list view that I could sort by folder size. KDE's Dolphin file manager shows the number of items within a folder but not its size, and I prefer the command line anyway, so I wanted a command line tool that lists the sizes of directories. The amount of space left on the partitions can easily be found with the following command:

df -h

I also found a command for directory size: du.

However, its default behaviour is to show the size of files recursively. So, I do not get the sizes of the folders in the directory I am interested in; instead, I get the sizes of the files within those folders, and so on recursively. I needed to limit how deep it descends into the folders, and I found a parameter with a similar name. The following command works just as I need.

du --max-depth=1 -h

Saturday, 11 May 2013

Thursday, 9 May 2013

Operating on each file in a directory in Windows Powershell

Recently, when I had to do some tasks on a Windows machine, I thought of trying out PowerShell. I had to process every file in a directory. On Linux, I could easily do it on the command line, so I tried to find a similar way on Windows. It turned out to be fairly simple.

$files = Get-ChildItem .
foreach ($file in $files) { echo $file.FullName }

Wednesday, 8 May 2013

Sad state of medical profession

Today, when a friend told me about an ad about medical seats being sold (for those unaware of the concept, it means paying to get into medical colleges irrespective of one's qualifications and intellectual merit), I was appalled. I knew that such things do happen. However, these deals were always done secretly. Openly selling seats in colleges is equivalent to committing theft in broad daylight and proudly announcing that it is the thief's right to do so. I believe it is blasphemy against humanity. It is a pity that people take religious blasphemy very seriously while blasphemy against humanity is so simply ignored.

There is a propagation of crime going on here. These colleges bribe their way through the accreditation process or make a temporary show of fulfilling its requirements. Very rarely are some caught. Obviously, the quality of education in such institutes is not up to the mark. The grave question, then, is whether we can trust diagnoses made by doctors passing out from institutes that offer seats for sale. This is actually a more serious concern than global warming, because incorrect diagnoses will cause undeserved deaths; yet sadly, no politician talks of it in his/her agenda.

Sunday, 14 April 2013

Uninstalling python packages

Continuing my system cleanup, I decided to get rid of all unnecessary python packages.

pip2 list | awk -F ' ' '{print $1}' | grep -vE "django-paypal|boto|mercurial|MySQL-python|nltk" | xargs pip2 uninstall -y

The -y flag skips the confirmation prompt for each uninstallation.

Wednesday, 10 April 2013

Uninstalling all gems

Some time back, I was trying out JRuby and had installed multiple gems for it. Today, while cleaning up my system, I wanted to get rid of it because it is not of much use to me. However, uninstalling the gems one by one is a pain. So, I wrote the following line to uninstall all the gems.

jruby -S gem list | awk -F ' ' '{print $1}' | xargs jruby -S gem uninstall

P.S.: To understand how to construct such one-liners, please look at my previous post.

Friday, 22 March 2013

Call by reference in Ruby

In Ruby, some objects are passed by value and some by reference. In case we want to pass by reference an object which by default gets passed by value, we can use the id of the object in Ruby's memory, i.e. the object space. The id can be obtained as follows.

some_object = "some initialization"
some_object.object_id

Here is how pass by reference would work using object id values.

def caller
    blah = "blahblueblah"
    callee(blah.object_id)
end

def callee(object_id)
    puts ObjectSpace._id2ref(object_id)
end

Such a use case should be rare and might even turn out to be bad practice; but sometimes we just want to do things for the heck of it.

Tuesday, 19 March 2013

Bypassing ActiveRecord cache

ActiveRecord is the default object-relational mapper of the Rails web framework; it obviously follows the active record architectural pattern. ActiveRecord maintains its own query cache, which is separate from the query cache of the underlying database server. This query cache is a rather simplistic one.

The issue that brought up the need to bypass the query cache was the following sequence.

1. a call for the first record of the model, to check whether any records existed (Model.first)
2. a raw SQL query to truncate the table
3. another call for the first record of the model, again checking whether any records existed (Model.first)

Now, #1 and #3 obviously generated the same SQL query. So, ActiveRecord served #3 from its cache.

We found multiple approaches of bypassing the ActiveRecord cache.

Approach 1
Clear the entire ActiveRecord cache. In Rails 2 this can be done using


In Rails 3, the same can be achieved using the following line.


This approach, however, clears the entire ActiveRecord cache, which in a production environment means more load on the database server, which is usually already the bottleneck. Plus, this approach is like using a jackhammer where a finger-tap would do.

Approach 2
This approach exploits the simplicity of the ActiveRecord cache. It forms the query in such a manner that the query string is very likely to be different from previous queries.

r = rand(1_000_000)            # a fresh random number per query
where_clause = "#{r} = #{r}"   # e.g. "123456 = 123456", always true

Appending the above where clause to the queries in #1 and #3, we obtained query strings that are very likely to differ from previous ones, at least within the lifetime of the cache. This approach is obviously not elegant.

Approach 3
In this approach, we used raw SQL queries not only for #2 but also for #1 and #3; ActiveRecord does not seem to cache raw SQL queries. So we could replace the call to the model's first method with something similar to the following.

sql = "select * from table_name limit 1"
ActiveRecord::Base.connection.execute sql

Although this approach gets the job done, executing raw SQL is bad practice.

Approach 4
Finally, we found a way of doing it through Rails. We need to explicitly tell Rails not to serve our queries from its cache for #1 and #3. This can be done as follows.

Model.uncached do
  Model.first
end

As this method does the job through the Rails framework, the abstraction provided by ActiveRecord remains unbroken.

Thursday, 14 March 2013

Memory snapshot in Jetbrains RubyMine on linux

I use RubyMine on Linux for Ruby on Rails development. Of late, it had been hanging frequently, so I reported this to JetBrains. They got back to me asking for more details so that the developer concerned could investigate the issue. On a side note, their customer support was amazingly fast in responding, and the person replying was quite technical himself.

Getting back to the topic: they asked me to provide a memory snapshot. The process of generating a memory snapshot is described here. However, that process requires users to download YourKit Java Profiler, which, apart from being a large download, comes with a 15-day license. That did not make sense to me. So, I got back to them about it, and it turns out that on Linux you do not really need it: the Linux version of RubyMine comes with the profiler libraries bundled.


To enable its usage all you need to do is, edit the following script and set IS_EAP to "true".


Restarting RubyMine after that change will show the memory and CPU snapshot icons. It is also advisable to provide thread dumps, as described here, along with the memory snapshot.

Thursday, 7 March 2013

Extracting logs out of journalctl

journalctl gives us nice consolidated logs. However, on a number of occasions, we need to extract parts of them. There are multiple ways of doing this. To filter by process, you can use a PID as shown below.

journalctl _PID=<pid number>

To obtain the PID when you have the process name [or part of the process name], use the following:

ps aux | grep -i <process name>

The manual page mentions this only in its examples. A commonly used slicing option is to see the logs of the current boot only. This can be done as follows.

journalctl -b

Another option is to look at logs of a particular unit only. This can be done in the following way.

journalctl -u <unit name>

The unit name could be a daemon name like 'mysqld'. Unfortunately, this does not work with 'kernel' as a unit, though it can be combined with the -b option. However, I often find myself dealing with messages from various units, so I scan through all the messages to find the ones I need. To filter them, I can use the timestamps in the messages in the following format.

journalctl --since='2013-03-06 22:58:34' --until='2013-03-06 23:00:34'

The beginning timestamp works fine, but the ending timestamp does not. I talked about it on the #systemd IRC channel; it has been fixed and will be released soon.

Saturday, 23 February 2013

Setting up an FTP server on AWS

Recently, for testing some code, I had to host an FTP server. I tried doing it on my local machine first. It was easy: I just had to follow the Arch wiki page for vsftpd. File transfers worked in both directions, so I thought I would try it on an Amazon instance too.

My local machine ran Arch Linux while the Amazon instance ran Fedora 8. After looking up the details of Fedora's package manager, and with some help from a friend, I installed vsftpd on it, applied the same config and started the FTP service. When we started testing it, we could operate successfully from the command line but not from the code. From the command line we were using the active mode of operation, while the code was using the passive mode, so we looked into the config for settings related to the passive mode. It turned out that passive mode is enabled by default. However, going through the various options, we found one called pasv_address. From prior experience, I knew that AWS machines have a private LAN IP and a separate public IP, and the OS on the cloud instance is not aware of the public IP it is serving. So, we suspected that in its responses the server was asking the client to connect to the private LAN IP, which would obviously fail. We set the pasv_address option to the public IP of the instance, and passive mode started working fine: we could connect to it and get file transfers done. So, we decided to use it for testing our code. However, when we tested, our application kept trying to post files and failing every time. The error we got each time was '500: Invalid Port command'.

The FTP protocol really goes funky with ports. It uses separate connections for control and data, and the behaviour of the data connection depends on the mode of operation. In active mode, the client chooses the data port and announces it, and the server initiates the data connection to it; in passive mode, the server chooses the data port, and the client initiates the data connection. We were using the passive mode, and the data port the server was advertising turned out to be blocked. To debug the situation, we connected to the FTP server from the command line utility 'ftp' using the following command.

ftp ip address of FTP server

To turn on the passive mode and the debug mode, we can use the commands 'passive' and 'debug' respectively. However, these only set options on the client, without sending any control data to the server. To test the FTP service, try a command that actually sends control data; we went with an 'ls'. The following FTP commands were executed in sequence.

passive
debug
ls
The PASV command [1] outputs a line, like the following, indicating the port on which the data transfer will happen.

Entering Passive Mode (1,2,3,4,224,186)

The port has to be calculated from the last two numbers using the following formula.

n1 x 256 + n2

In the above instance, it is 224 x 256 + 186 = 57530. Once we knew that the issue was a blocked data port on the FTP server's side, we decided to configure the server to pick ports within the open range. This can be done by setting the pasv_min_port and pasv_max_port options correctly in vsftpd.conf. Once the data connections were being made on open ports, the transfers worked fine.
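The arithmetic is easy to script. A small Ruby sketch (the helper name is mine) that converts the last two numbers of a PASV reply into the data port:

```ruby
# Compute the FTP data port from the last two numbers (p1, p2) of the
# server's "Entering Passive Mode (h1,h2,h3,h4,p1,p2)" reply.
def pasv_data_port(p1, p2)
  p1 * 256 + p2
end

puts pasv_data_port(224, 186)  # => 57530, matching the session above
```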

[1] A reference of FTP commands.

Friday, 22 February 2013

Getting to know the Syslog protocol

Recently, I was looking into FTP issues when I learnt some details about syslog. I was using vsftpd to host an FTP service. I had enabled logging, but I was not seeing anything in the journalctl output. The reason turned out to be a configuration flaw: I had not turned on the option for vsftpd to use syslog. Once I turned it on, proper logs were created.

I do not have syslog-ng or any other syslog package installed; I had uninstalled it when I switched to systemd, which is why I had not enabled that option in vsftpd. As it turns out, when the option is on, vsftpd uses the syslog protocol for logging, and systemd listens for messages sent over that protocol and creates the appropriate logs.

This is a great way of unifying all logging. Individual packages do not have to bother about logging. They just act as clients of the protocol and the listener will take care of maintaining the logs. There are instances of similar architecture being followed for logging in other domains too.

As it turns out, there are standardized versions of the syslog protocol: RFC 5424, the current standard, and RFC 3164, which describes the older BSD syslog. The commonly used version is the BSD one, even though the former is more advanced. Now syslog itself is being replaced by systemd's journal, which builds on it: the journal handles binary data efficiently, supports JSON export, and makes maintaining logs much easier.

It was interesting to know that even though syslog packages are becoming obsolete, the syslog protocol is still the logging standard. It is actually a nice example of robust architecture surviving over the years.

Tuesday, 8 January 2013

Kill all zombie processes of a process

Time and again, I have found phpMyAdmin not working because there are a lot of httpd 'zombies'. I do not know yet why so many instances of the daemon show up and why they turn unresponsive. (Strictly speaking, a true zombie process cannot be killed with a signal; it is already dead and disappears only when its parent reaps it. The processes here are hung but still killable.) Usually when this happens, I just kill all of them, spawn a new daemon and carry on with my work. I used to kill them one at a time. However, today I figured out that I can do it with a single line.

ps aux | grep http | awk -F " " '{print $2}' | xargs kill -9

[If the working is clear to you, do read on, and let me know if I can improve it or if I am interpreting anything incorrectly.]

The one-liner above is easy to understand once we understand its pieces, so I am listing them below.

  • ps aux lists all running processes
  • grep http finds the lines containing the string 'http'
  • awk -F " " '{print $2}' splits each input line by the delimiter given with the -F flag (a space in this case) and prints the second token, which is the PID
  • xargs turns the output of the previous command into arguments for the next one
  • kill -9 sends the SIGKILL signal to the processes whose ids are given as arguments
You should have a look at the man pages for more details on these commands.