Thursday, 24 May 2012

Putting data from a tab separated(.tsv) file into MySQL database

Many a times we need to populate data dumped by someone into a tab-separated file into our database. It can be done using the following ruby code:

      row = []"/path/to/file.tsv") do |f|
        f.each_line do |tsv|
          row << tsv.split(/\t/)
    rescue Exception => e
       puts "------exception------#{e.inspect}"

However, the file runs into hundreds of MBs, this will be way too slow. Instead we can use the following MySQL query.

LOAD DATA LOCAL INFILE '/path/to/file' REPLACE INTO TABLE table_name IGNORE 1 LINES (column1, column2, column3, column4);

The "IGNORE 1 LINES" part ensures that the first line containing the header is ignored. In case there are no headers, this part may be excluded. Also, if relative file paths are to be used, the LOCAL keyword may be dropped. This process is way faster than any other process; but validations are bypassed.

No comments: