Saturday 16 August 2014

Finding the file or device pointed to by a file descriptor

I had been noticing that Firefox starts off at around 400 MB of memory and reaches around 1.4 GB as I keep using it throughout the day. Also, CPU utilization sat at 25% on my i7. So, I decided to poke around in the hope of finding something interesting. I found the pid using the following.

ps aux | grep firefox

I ran an strace to try and see what was keeping it so busy.

strace -p <pid_of_firefox_process>

This is what I got.

poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=17, events=POLLIN}, {fd=19, events=POLLIN}], 4, 4294967295) = 1 ([{fd=17, revents=POLLIN}])
read(17, "\372", 1) = 1
recvmsg(4, 0x7fff427b9fb0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=17, events=POLLIN}, {fd=19, events=POLLIN}], 4, 0) = 1 ([{fd=17, revents=POLLIN}])
read(17, "\372", 1) = 1
recvmsg(4, 0x7fff427b9fb0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=17, events=POLLIN}, {fd=19, events=POLLIN}], 4, 0) = 0 (Timeout)
recvmsg(4, 0x7fff427b9fb0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=17, events=POLLIN}, {fd=19, events=POLLIN}], 4, 0) = 1 ([{fd=17, revents=POLLIN}])
read(17, "\372", 1) = 1
recvmsg(4, 0x7fff427b9fb0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=17, events=POLLIN}, {fd=19, events=POLLIN}], 4, 0) = 0 (Timeout)
recvmsg(4, 0x7fff427b9fb0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=17, events=POLLIN}, {fd=19, events=POLLIN}], 4, 0) = 0 (Timeout)
recvmsg(4, 0x7fff427b9fb0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=17, events=POLLIN}, {fd=19, events=POLLIN}], 4, 0) = 1 ([{fd=17, revents=POLLIN}])
read(17, "\372", 1) = 1
recvmsg(4, 0x7fff427b9fb0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=17, events=POLLIN}, {fd=19, events=POLLIN}], 4, 0) = 0 (Timeout)
recvmsg(4, 0x7fff427b9fb0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=17, events=POLLIN}, {fd=19, events=POLLIN}], 4, 0) = 1 ([{fd=17, revents=POLLIN}])
read(17, "\372", 1) = 1
recvmsg(4, 0x7fff427b9fb0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=17, events=POLLIN}, {fd=19, events=POLLIN}], 4, 0) = 0 (Timeout)
recvmsg(4, 0x7fff427b9fb0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=17, events=POLLIN}, {fd=19, events=POLLIN}], 4, 0) = 0 (Timeout)
recvmsg(4, 0x7fff427b9fb0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=17, events=POLLIN}, {fd=19, events=POLLIN}], 4, 0) = 0 (Timeout)
recvmsg(4, 0x7fff427b9fb0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=17, events=POLLIN}, {fd=19, events=POLLIN}], 4, 0) = 0 (Timeout)
recvmsg(4, 0x7fff427b9fb0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=17, events=POLLIN}, {fd=19, events=POLLIN}], 4, 0) = 0 (Timeout)
recvmsg(4, 0x7fff427b9fb0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=17, events=POLLIN}, {fd=19, events=POLLIN}], 4, 0) = 0 (Timeout)
recvmsg(4, 0x7fff427b9fb0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=17, events=POLLIN}, {fd=19, events=POLLIN}], 4, 0) = 0 (Timeout)
recvmsg(4, 0x7fff427b9fb0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=17, events=POLLIN}, {fd=19, events=POLLIN}], 4, 4294967295) = 1 ([{fd=17, revents=POLLIN}])
read(17, "\372", 1) = 1
recvmsg(4, 0x7fff427b9fb0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=17, events=POLLIN}, {fd=19, events=POLLIN}], 4, 0) = 0 (Timeout)
recvmsg(4, 0x7fff427b9fb0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=17, events=POLLIN}, {fd=19, events=POLLIN}], 4, 0) = 0 (Timeout)
recvmsg(4, 0x7fff427b9fb0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=17, events=POLLIN}, {fd=19, events=POLLIN}], 4, 0) = 0 (Timeout)
recvmsg(4, 0x7fff427b9fb0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=17, events=POLLIN}, {fd=19, events=POLLIN}], 4, 4294967295) = 1 ([{fd=17, revents=POLLIN}])
read(17, "\372", 1) = 1
getrusage(RUSAGE_SELF, {ru_utime={17387, 398261}, ru_stime={821, 999917}, ...}) = 0
getrusage(RUSAGE_SELF, {ru_utime={17387, 438261}, ru_stime={821, 999917}, ...}) = 0
recvmsg(4, 0x7fff427b9fb0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=17, events=POLLIN}, {fd=19, events=POLLIN}], 4, 0) = 1 ([{fd=17, revents=POLLIN}])
read(17, "\372", 1) = 1
recvmsg(4, 0x7fff427b9fb0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=17, events=POLLIN}, {fd=19, events=POLLIN}], 4, 0) = 0 (Timeout)
recvmsg(4, 0x7fff427b9fb0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=17, events=POLLIN}, {fd=19, events=POLLIN}], 4, 0) = 0 (Timeout)
write(18, "\372", 1) = 1
recvmsg(4, 0x7fff427b9fb0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=17, events=POLLIN}, {fd=19, events=POLLIN}], 4, 0) = 1 ([{fd=17, revents=POLLIN}])
read(17, "\372", 1) = 1
recvmsg(4, 0x7fff427b9fb0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=17, events=POLLIN}, {fd=19, events=POLLIN}], 4, 0) = 0 (Timeout)
recvmsg(4, 0x7fff427b9fb0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=17, events=POLLIN}, {fd=19, events=POLLIN}], 4, 0) = 0 (Timeout)
recvmsg(4, 0x7fff427b9fb0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=17, events=POLLIN}, {fd=19, events=POLLIN}], 4, 0) = 0 (Timeout)
recvmsg(4, 0x7fff427b9fb0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=17, events=POLLIN}, {fd=19, events=POLLIN}], 4, 0) = 0 (Timeout)
recvmsg(4, 0x7fff427b9fb0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=17, events=POLLIN}, {fd=19, events=POLLIN}], 4, 4294967295) = 1 ([{fd=17, revents=POLLIN}])
read(17, "\372", 1) = 1
recvmsg(4, 0x7fff427b9fb0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=17, events=POLLIN}, {fd=19, events=POLLIN}], 4, 0) = 0 (Timeout)
recvmsg(4, 0x7fff427b9fb0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=17, events=POLLIN}, {fd=19, events=POLLIN}], 4, 0) = 0 (Timeout)
recvmsg(4, 0x7fff427b9fb0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=17, events=POLLIN}, {fd=19, events=POLLIN}], 4, 0) = 0 (Timeout)
recvmsg(4, 0x7fff427b9fb0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=17, events=POLLIN}, {fd=19, events=POLLIN}], 4, 4294967295) = 1 ([{fd=17, revents=POLLIN}])
read(17, "\372", 1) = 1
recvmsg(4, 0x7fff427b9fb0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=17, events=POLLIN}, {fd=19, events=POLLIN}], 4, 0) = 0 (Timeout)
recvmsg(4, 0x7fff427b9fb0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=17, events=POLLIN}, {fd=19, events=POLLIN}], 4, 0) = 0 (Timeout)
recvmsg(4, 0x7fff427b9fb0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=17, events=POLLIN}, {fd=19, events=POLLIN}], 4, 0) = 0 (Timeout)
recvmsg(4, 0x7fff427b9fb0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=17, events=POLLIN}, {fd=19, events=POLLIN}], 4, 4294967295) = 1 ([{fd=17, revents=POLLIN}])

Now, I wanted to know which files were pointed to by file descriptors 4, 5, 17 and 19. I read that I could look those up using the /proc filesystem. So, I simply tried looking them up there.

ls -l /proc/<pid_of_firefox_process>/fd

The files turned out to be as follows:
  • 4 - a socket
  • 5 - anon_inode:[eventfd]
  • 17 - a pipe
  • 19 - another socket
Any suggestions are welcome. I will keep looking into the issue.
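
The same lookup can be scripted. The following is a minimal sketch in Ruby, assuming a Linux system with /proc mounted; fd_targets is a helper name made up for this example. It reads the symlinks under /proc/<pid>/fd, which is the same information ls -l shows there.

```ruby
# Map each open file descriptor of a process to whatever it points at,
# by reading the symlinks under /proc/<pid>/fd (Linux only).
# fd_targets is a hypothetical helper name, not part of any library.
def fd_targets(pid)
  Dir.children("/proc/#{pid}/fd").each_with_object({}) do |name, targets|
    begin
      targets[name.to_i] = File.readlink("/proc/#{pid}/fd/#{name}")
    rescue SystemCallError
      # The descriptor was closed between listing and reading; skip it.
    end
  end
end

# Demonstrated on the current Ruby process; pass another pid to inspect
# a different process (that needs permission to read its /proc entry).
fd_targets(Process.pid).sort.each do |fd, target|
  puts "#{fd} -> #{target}"
end
```

Sockets, pipes and eventfds show up as targets such as socket:[...], pipe:[...] and anon_inode:[eventfd], matching the list above.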

Wednesday 16 July 2014

Invoke a rake task multiple times from another

To invoke a rake task within another, one can do the following:

Rake::Task['namespace:task_name'].invoke

However, if you need to run a task multiple times, the following will not work.

n.times do
    Rake::Task['namespace:task_name'].invoke
end

It only executes the task once. This behaviour is useful when you are loading dependent tasks. For example, if there are two tasks that depend upon the :environment task, this behaviour ensures :environment is loaded only once. However, the above code is written with a different intention. To make it work correctly, the rake task has to be re-enabled before each subsequent invocation. The way to do it is as follows:

n.times do
    # Look up the task object (do not call invoke here)
    task = Rake::Task['namespace:task_name']

    # Clear the "already invoked" flag, then run the task
    task.reenable
    task.invoke
end
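
To see the difference in isolation, here is a small self-contained sketch. The task name demo:count, the RUN_LOG constant and the invoke_repeatedly helper are all made up for the illustration; only Rake::Task[], invoke and reenable come from Rake itself.

```ruby
require 'rake'

RUN_LOG = []  # records every execution of the demo task

# 'demo:count' is a hypothetical task used only for this sketch.
Rake::Task.define_task('demo:count') { RUN_LOG << :ran }

# Invoke the named task n times, re-enabling it before each call.
def invoke_repeatedly(task_name, n)
  task = Rake::Task[task_name]
  n.times do
    task.reenable
    task.invoke
  end
end

# Without reenable, repeated invokes run the task body only once.
task = Rake::Task['demo:count']
3.times { task.invoke }
puts RUN_LOG.size   # 1

# With reenable, every invoke runs the body again.
invoke_repeatedly('demo:count', 3)
puts RUN_LOG.size   # 4
```

Note that reenable only resets the task itself; prerequisites that have already run are not run again.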

Thursday 5 June 2014

Writing files in a specific encoding in Ruby

Recently, I was trying to create PDF files of barcodes. I was using the barby gem for generating the barcodes. When I tried to write the generated PDF, I found that the PDF string was in ASCII-8BIT encoding while my console was in UTF-8. This made Ruby attempt to write the file in UTF-8, which caused errors. So, to write to the file in ASCII-8BIT, I had to specify that encoding while opening the file. I did that as follows and could then write to the file correctly.

f = File.open("test.pdf", "w:ASCII-8BIT")
f.write the_pdf_string
f.close
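
A slightly more defensive variant of the same idea is sketched below. It uses the block form of File.open so the file is closed even if the write fails. The write_binary helper and the sample bytes are stand-ins invented for the example; they are not barby output.

```ruby
require 'tmpdir'

# Write a binary string to a file without any encoding conversion.
# write_binary is a hypothetical helper; it returns the number of bytes written.
def write_binary(path, data)
  File.open(path, 'w:ASCII-8BIT') { |f| f.write(data) }
end

# Stand-in for the generated PDF: an ASCII-8BIT string with non-ASCII bytes.
pdf_bytes = "%PDF-1.4\n\xE2\xE3\xCF\xD3\n".b

Dir.mktmpdir do |dir|
  path = File.join(dir, 'test.pdf')
  write_binary(path, pdf_bytes)
  # Reading the file back in binary mode should give the same bytes.
  puts File.binread(path) == pdf_bytes
end
```

Opening the file with mode "wb" should have the same effect, since binary mode also sets the external encoding to ASCII-8BIT.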

Sunday 25 May 2014

Compiling in Emacs

I use Emacs for development in C. I was trying out some algorithms and had used some functions from math.h. I had included the header, of course. However, I knew that to build the program I would also have to pass the -lm flag so that the math library gets linked in. I use M-x compile for compilation. It usually asks me to enter the name of the executable to be built. I added CFLAGS=-lm to the end of that same line and the compilation worked fine. So instead of

make -k application

my compilation command looked like the following.

make -k application CFLAGS=-lm

Wednesday 14 May 2014

Do not save VBS file in UTF8 encoding

Recently, I had to run a VB script as suggested here. I do not have much experience on Windows, but I usually keep my source code in utf8 and have my IDE set to it. So, when I saved the script, I opted for utf8. However, when I ran the script, I got a strange error. Since I had specifically chosen utf8 over the default, I tried saving in the default character encoding, i.e. ASCII, and the script ran fine. Reading about it, I can see that ASCII and utf-16 are supported. So, lesson learned: do not save VBS files as utf8 on Windows.

Sunday 16 March 2014

Starting with Lisp

Just for some fun and relaxation, I decided to start doing some Lisp. I had set up SLIME a few months back while I was starting on a book, but I had not done much with it. I decided to use SBCL because I had read that it is usually fast. The reason I was using SLIME is that the back arrow key does not work in the plain SBCL REPL. That is very annoying to me because I type the opening and closing parentheses together and then type the rest of the code in between. A lot of documentation on setting up SLIME is available online, so I won't reiterate it. Just for reference, the SLIME section of my .emacs file is as follows:

(setq inferior-lisp-program "/usr/bin/sbcl")
(add-to-list 'load-path "/usr/share/emacs/site-lisp/slime/")
(require 'slime)
(slime-setup)

Once I was in the SLIME REPL, I started writing a few s-expressions to get the hang of it.

(+ 1 2)

(print 'hello)

(format t "hello world")


After that, I started writing a simple function that returns the nth number in the Fibonacci sequence. I came up with the following.


(defun foo (n)
    (cond
     ((= n 1) 0)
     ((= n 2) (+ 1 (foo (- n 1))))
     (+ (foo (- n 1)) (foo (- n 2)))))


And I started testing it.


(foo 1)

(foo 2)
(foo 3)


The result for n = 3 was wrong. The reason lay in the last section of cond. I intended it to be default but I had not specified the condition for it. So, I corrected it as follows.

(defun foo (n)
    (cond
     ((= n 1) 0)
     ((= n 2) (+ 1 (foo (- n 1))))
     (t (+ (foo (- n 1)) (foo (- n 2))))))

Now, the function was returning correct values. Clearly, it is a bad implementation, since it recomputes the same values over and over. So, I thought the calls to foo should be memoized. I could have gone ahead and written my own memoization, but I wanted to see what Common Lisp had to offer. However, before doing that I wanted to know whether memoization would be beneficial. So, I wanted to benchmark the memoized and the plain versions. Our good friend Google helped me out. I was able to start profiling using SBCL's built-in package sb-sprof.


(in-package :cl-user)
(require :sb-sprof)
(declaim (optimize speed))
(sb-sprof:with-profiling (:max-samples 1000
                          :report :flat
                          :loop nil)
  (foo 100))


I had a feeling that I should not be trying for the 100th Fibonacci number, but I went ahead with it anyway. Even after some minutes, it was still running. So, I decided to kill the profiling. Ctrl+C Ctrl+C came in handy. I started low this time. From 10, 15 and 25, I went up to 40, at which value it took around 6 seconds.


(sb-sprof:with-profiling (:max-samples 1000
                          :report :flat
                          :loop nil)
  (foo 40))


Now, I was ready to test a memoized version of the function and see how much benefit I can get. I remembered reading some code by Peter Norvig in Python which used a decorator to achieve generic memoization. So, I thought Lisp ought to have something similar. I found a nice memoization API. However, I had to install the package and I did not want to delve into that because I did not find any packages for Arch. So, I decided to settle for a less robust implementation.

(defun Basic-Memo (Function)
  "Takes a normal function object and returns an `equivalent' memoized one"
  (let ((Hash-Table (make-hash-table)))
    #'(lambda (Arg)
    (multiple-value-bind (Value Foundp)
        (gethash Arg Hash-Table)
      (if
        Foundp
        Value
        (setf (gethash Arg Hash-Table) (funcall Function Arg))))) ))

(defun Basic-Memoize (Function-Name)
  "Memoize function associated with Function-Name. Simplified version"
  (setf (symbol-function Function-Name)
    (Basic-Memo (symbol-function Function-Name))))

(Basic-Memoize 'foo)

Now, the 1000th Fibonacci number was also easily calculated. When I tried to run the above profiling code for (foo 1000), I got an error saying the run was too short. Trying for the 10000th Fibonacci number, I got 0.01 seconds.
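
For comparison, the same wrapper idea can be sketched in Ruby, the language used in most other posts here. This is only an illustration of the technique, not a port of any library: basic_memo and FOO are names chosen to mirror the Lisp code above, and FOO uses the same indexing as foo (the 1st number is 0, the 2nd is 1).

```ruby
# Take a callable and return an equivalent one that caches results by argument.
# basic_memo mirrors Basic-Memo above; the name is made up for this sketch.
def basic_memo(function)
  cache = {}
  lambda do |arg|
    # fetch runs the block only when arg has not been cached yet
    cache.fetch(arg) { cache[arg] = function.call(arg) }
  end
end

# The recursive calls go through FOO, so they hit the cache too.
FOO = basic_memo(lambda do |n|
  case n
  when 1 then 0
  when 2 then 1
  else FOO.call(n - 1) + FOO.call(n - 2)
  end
end)

puts FOO.call(40)   # 63245986
```

As with the Lisp version, the important part is that the recursive calls reach the memoized function rather than the plain one. Very large arguments on a cold cache still recurse deeply, so this sketch may hit the stack limit for inputs such as 10000.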

Friday 14 March 2014

Speeding up RubyMine on linux

I use RubyMine for development on the Ruby on Rails platform. A lot of my friends and colleagues who use it are of the opinion that it is a resource hog and slows down the system. I have also observed a little slowness. Of late, the effect increased and crossed my tolerable limit. So, I decided to dig in and find out what was happening. Using top, I could see that for some reason a process named jbd2 was being triggered every 2 seconds. Reading about it, I found out that it was the process updating access times. I am using the ext4 filesystem and I had not specified the noatime option in fstab for the partition. So, access times for files were being updated in real time. I do not need that at all. Access times are updated even for page cache hits, which is of no use to me. So, I turned it off by changing the mount options for the partition.

Now, it was time to look into RubyMine in particular. I was using the 64-bit version, and it is well known that the 64-bit version performs poorly out of the box because the default heap size is set to the same value as in the 32-bit version. I had already taken care of that by changing the heap size from 512 MB to 1536 MB. To do this, edit bin/rubymine64.vmoptions so that it contains the following.

-Xmx1536m

Another important property that I had not set was ReservedCodeCacheSize. It also had the same value as in the 32-bit version, i.e. 64 MB. Changing it as follows improved performance.

-XX:ReservedCodeCacheSize=256m

This ensured a good deal of code could be cached in memory making RubyMine's operations faster.

Friday 21 February 2014

Private class methods in Ruby


In Ruby, instance methods can be marked as private by declaring them below the private keyword. However, the same does not work for class methods.

class SomeClass

  def method1
    # do something
  end

  private

  def method2
    # do something else
  end

end


The above works fine. Instances (objects) of the class cannot call method2 directly from outside the class.

class SomeOtherClass

  def self.method1
    # do something
  end

  private

  def self.method2
    # do something else
  end

end


In this case however, one can easily call method2 as follows.

SomeOtherClass.method2

To mark class methods as private, we need to use private_class_method. It has to be called after the methods are defined, and the method names are separated by commas, as shown below.

class SomeOtherClass

  def self.method1
    # do something
  end

  def self.method2
    # do something else
  end

  def self.method3
    # do some other things here
  end

  # Must come after the definitions above
  private_class_method :method2, :method3

end
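
The behaviour can be checked with a small runnable sketch. The Ledger class, its methods and the private_call? helper are invented for this example. It also shows a second common idiom, opening the singleton class with class << self, where the plain private keyword does apply to class methods.

```ruby
# Ledger and its methods are hypothetical names for this sketch.
class Ledger
  def self.total
    subtotal + 1          # private class methods are callable from inside the class
  end

  def self.subtotal
    41
  end
  private_class_method :subtotal   # called after the definition

  # Alternative idiom: inside the singleton class, `private` works as usual.
  class << self
    private

    def audit
      :audited
    end
  end
end

# Returns true if calling the method from outside raises NoMethodError.
def private_call?(receiver, name)
  receiver.public_send(name)
  false
rescue NoMethodError
  true
end

puts Ledger.total                       # 42
puts private_call?(Ledger, :subtotal)   # true
puts private_call?(Ledger, :audit)      # true
```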


Tuesday 18 February 2014

Check if a MySQL server is read-only

Sometimes you are not sure whether the MySQL server you are connected to is read-only, and trying a write query is too risky. In that case, the read_only setting can be checked using the following query.

SELECT @@global.read_only;

Sunday 19 January 2014

Combining multiple PDF documents into one

A number of times I have needed to combine multiple PDF files into a single file. Doing that is actually fairly simple, and a number of tools can do it. Two ways of doing it with tools that are very common on Linux systems are as follows.

Many systems have ImageMagick installed. Using it is so simple that it is my default tool for this purpose. The last argument is the output file.

convert file1.pdf /path/to/file2.pdf /destination/path/store.pdf

It takes a lot of options. For PDFs containing images, it is better to specify a quality parameter.

convert -quality 100 mine1.pdf mine2.pdf merged.pdf

Ghostscript is also a very commonly found package on Linux systems. Its usage is a little more obscure. However, it is very fast. I use it when I need to get PDF files of reduced size.

gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/screen -dNOPAUSE -dQUIET -dBATCH -sOutputFile=output.pdf input1.pdf input2.pdf

The above, however, reduces image quality badly. To get decent image quality at the cost of slightly larger PDFs, we can use either of the following.

gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dNOPAUSE -dQUIET -dBATCH -sOutputFile=output.pdf input1.pdf input2.pdf

gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite -dPDFSETTINGS=/prepress -sOutputFile=output.pdf input1.pdf input2.pdf
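
When the merge has to be run from a script, building the command as an argument list avoids quoting problems with file names that contain spaces. Below is a rough sketch in Ruby. gs_merge_command is a hypothetical helper, and it only uses the Ghostscript options already shown above.

```ruby
# Build the Ghostscript argument list for merging PDFs.
# gs_merge_command is a made-up helper; pdf_settings is optional
# (for example '/screen' or '/prepress', as used above).
def gs_merge_command(inputs, output, pdf_settings: nil)
  cmd = %w[gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite]
  cmd << "-dPDFSETTINGS=#{pdf_settings}" if pdf_settings
  cmd << "-sOutputFile=#{output}"
  cmd + inputs
end

cmd = gs_merge_command(%w[input1.pdf input2.pdf], 'output.pdf',
                       pdf_settings: '/prepress')
puts cmd.join(' ')

# To actually run it (requires Ghostscript to be installed):
# system(*cmd) or raise 'gs failed'
```

Passing the array to system with a splat runs gs directly without going through a shell.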

Tuesday 7 January 2014

Disable compression in Net::HTTP

Recently while upgrading to Ruby 2, I found a JSON API call was failing intermittently. The code was failing when I was trying to parse the response using JSON.parse. Investigating further, I found that the call was successful; but for some calls the response was different than the response for others. Looking at the response, I found it was not JSON at all. It looked like the following.

\x8B\x00\x00\x8C...

I had no clue as to what to make out of such a response. Reading up on it, I found out that the above might be a compressed response. It appeared that the server had the liberty of choosing whether to compress the response or not and whenever the response was uncompressed, my code was working fine. Looking into improvements introduced in Ruby 2, I found that Net::HTTP now automatically requests gzip and deflate compression by default. All I needed to do was to stop doing that and ask for uncompressed response. As my response was rather small, it would hardly matter. To request uncompressed response, I just needed to add the following header to my requests.

'Accept-Encoding' => 'identity'
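
In context, the header can be passed when building the request object. The following is a minimal sketch; the URL and the helper names identity_request and fetch_json are placeholders made up for the example, not the code from my application.

```ruby
require 'net/http'
require 'uri'
require 'json'

# Build a GET request that asks the server for an uncompressed body.
# identity_request is a hypothetical helper name.
def identity_request(url)
  Net::HTTP::Get.new(URI(url), 'Accept-Encoding' => 'identity')
end

# Perform the request and parse the body as JSON (needs network access).
def fetch_json(url)
  uri = URI(url)
  response = Net::HTTP.start(uri.host, uri.port,
                             use_ssl: uri.scheme == 'https') do |http|
    http.request(identity_request(url))
  end
  JSON.parse(response.body)
end

# Building the request does not touch the network, so the header can be inspected.
req = identity_request('https://api.example.com/items.json')
puts req['Accept-Encoding']   # identity
```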

Sunday 5 January 2014

Counting words on the command line

Recently, I have been writing a number of essays with various word limits. So, I have been finding myself in need of doing frequent word counts. Getting a word count for a file is easy using the wc utility.

wc -w /path/to/file

However, sometimes I just write a piece in a browser text box, which doesn't have a word count. In that case, I want to get a word count from the console rather than saving the text to a file and working with that. It turned out to be quite easy actually. I just had to feed the entire text to cat through a here-document and pipe that to wc.

cat << EOF | wc -w
Your text here

More here.
EOF
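
The same count can be done in Ruby, which is handy inside a script. This is a small sketch; word_count is a made-up helper that splits on whitespace, which is also how wc -w delimits words.

```ruby
# Count whitespace-separated words, like `wc -w`.
# word_count is a hypothetical helper name.
def word_count(text)
  text.split.size
end

text = <<~EOF
  Your text here

  More here.
EOF

puts word_count(text)   # 5

# To count words pasted on standard input instead:
#   puts word_count($stdin.read)
```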