Showing posts with label sed. Show all posts
Showing posts with label sed. Show all posts

Saturday, 6 October 2012

Formatting column entries for IN query

Extracting entries of a column in a spreadsheet and using them in an IN query is not a rare scenario [at least for me] at work. Now, copying and pasting the entries in a comma separated and sometimes quote-enclosed format is tedious and when there are 100 entries, doing that by hand is unacceptable for me. On top of that when each entry has to be modified a bit before passing in the IN query, then the whole process becomes drudgery. Today, when faced with such a situation, I used emacs and sed to save myself a lot of time and boredom.

Let us consider that the spreadsheet has a column that has the following entries:

12345-1
23456-2
34567-1
45678-1
56789-1

Now, lets consider that I have to strip the first suffix, enclose the values in quotes and pass them in the IN query. So, I copy the column and paste in a text file and run the following commands on the file.


sed -i 's/-1//' ./myfile.txt
sed -i 's/-2//' ./myfile.txt

Now the file had the following contents.

12345
23456
34567
45678
56789

Now, I opened the file in emacs and ran a keyboard macro to get the following string.

'12345', '23456', '34567', '45678', '56789',

Now, I could easily use it in my query. Now, the macro I used was simple. I started recording the macro using `C-x ('. The macro consisted of the following keys in sequence [try it yourself to be clear]:

' -> End -> ' -> , -> Delete -> Space

I ended recording the macro using `C-x )'. Now, I had one line edited so I let emacs run the macro 4 times using the following command:

C-u 4 C-x e

I got the inspiration for using macros in the following video.


Monday, 6 August 2012

Using sed to remove unwanted lines

I have written about sed's utility before as well and it has impressed me time and again by the amount of time it saves. Recently while testing E17, I generated backtraces using GNU debugger. However, the file to which I redirected the backtrace contained a large number of lines about reading and loading symbols. I wanted to quickly get rid of those lines so that the developers can directly view the backtrace instead of scrolling down pages before the actual backtrace. So I decided to use sed for the purpose. The approach I took to achieve my desired result was as follows:

  1. I wanted to deletes lines that mentioned about loading or reading symbols. So, the sed operation I required was delete matching lines.
  2. I wanted to edit the files. So, I turned the edit in-place option on.
With this approach in mind, I ran the following two lines and the file was formatted properly.

sed -i '/Reading symbols/d' my_file
sed -i '/Loaded symbols/d' my_file

N.B.: If you want to test the output, remove the -i option.

Wednesday, 7 October 2009

Sed as a handy developer tool

According to EFL standards, static functions started with '_'. So, I had renamed ertf_stylesheet_add to _ertf_stylesheet_add. Going through the source looking for style string stuff, I noticed the error messages did not reflect these changes. So, I decided to do the necessary changes. I wasn't in the mood of looking for all occurrences of ertf_stylesheet_add and change them. So, I decided to try sed.

Starting with the man page and experimenting a little, I wrote the following script:
sed -e s/ertf_stylesheet_add/_ertf_stylesheet_add/ ertf_stylesheet.c
Checking the output, I saw that it worked fine and I could redirect the output to a file to have it saved. Only problem was that it changed occurrences of _ertf_stylesheet_add to __ertf_stylesheet_add, which was completely uncalled for. Digging deeper, I found that I can get rid of this by specifying addresses, i.e. restricting sed usage by specifying line numbers in my case. I wasn't satisfied with this because then I would have to go back and check the line number to ignore. So, I kept looking and finally I found my solution in regular expressions. I typed the following at command line and it worked just fine.
sed -e s/[^_]ertf_stylesheet_add/_ertf_stylesheet_add/g ertf_stylesheet.c
Also, by now I had learnt that adding the 'g' option was safe. Again I had missed on an an essential subtelity. I had forgotten this shall delete the intial double quotes of the error messages which were like:
"ertf_stylesheet_add: ..."

So, I edited my command as follows.

sed -e 's/\([^_]\)\(ertf_stylesheet_add\)/\1_ertf_stylesheet_add/g' ertf_stylesheet.c

This removed the double quotes problem. Now, as things started rolling, I wanted to do with sed. Now, in the error messages _ertf_stylesheet_add was usually followed by a colon. However, a few typos were there where there were colons instead of semicolons. So, I edited the sed line as follows.

sed -e 's/\([^_]\)\(ertf_stylesheet_add\)/\1_ertf_stylesheet_add/g' -e 's/;\(.\)/:\1/g' ertf_stylesheet.c

This did the task with the added problem of replacing

case ';':
with

case ':':

which was undesirable. Now, the problem was I was interested in a conditional replacement. So, I further edited the line as follows.
sed -e 's/\([^_]\)\(ertf_stylesheet_add\)/\1_ertf_stylesheet_add/g' -e 's/stylesheet_add;/stylesheet:/g' ertf_stylesheet.c

Here I am using two s/// expressions as this is the only way out. Checking at ##sed on freenode, I found the semantics of s/// don't allow conditional replacement. I was talking about adding this functionality to sed; but that would change sed's regxp flavour. However, this could be achieved easily using perl.

perl -pe 's/(ertf_stylesheet_add)(;?)/"$1".($2?":":"")/eg'
Well doing this task using sed took longer than it would have taken if I had done it by hand. However, since these tasks shall be common (code refactoring), I thought it shall save much time and effort in future.

Actually, the method worked so well that the "future" came in seconds and I was writing multi-line sed scripts to format error messages the whole lib folder. I quickly wrote the following lines followed by some others.

sed -e 's/\([^_]\)\(ertf_font_add\)/\1_ertf_font_add/g' -e 's/font_add;/font:/g' ertf_font.c

sed -e 's/\([^_]\)\(ertf_color_add\)/\1_ertf_color_add/g' -e 's/colortbl:/ertf_color_table:/'  -e 's/color_add;/color_add:/g' ertf_color.c

sed -e 's/readloop:/ertf_document_parse:/g' ertf_document.c