Importing CSV file data into sqlite3

by Mandar Vaze on July 13, 2009
in Code, Hack, Linux, Open Source, tips

I was trying to import data from a CSV file into a sqlite3 database. Ideally this should be a very simple task: just follow the steps given in the sqlite tutorial. It is a matter of creating the table, then calling the sqlite3 command with the separator argument followed by an import operation, as listed below.

sqlite3 test.db  "create table t1 (t1key INTEGER PRIMARY KEY,data TEXT);"
sqlite3 -separator , test.db ".import some.csv t1"

Except that my CSV file could contain records with embedded commas. I was hoping that sqlite3 would be smart enough to detect that the fields were enclosed in double quotes and only then separate them by comma. But I soon realized that only code written specifically for CSV would know about this. As the example above shows, the import is generic code, and as the user I had simply listed comma as the separator.

My data looked something like this:

"1","data1"
"2","data2,data3"
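
To see why a plain comma separator trips over this data, here is a minimal Python sketch (not part of the original command-line workflow; the sample string simply repeats the two rows above). It compares a naive split on commas with Python's standard csv module, which understands quoted fields.

```python
import csv
import io

# The two sample rows shown above, as they would appear in the file.
sample = '"1","data1"\n"2","data2,data3"\n'

# Naive approach: split every line on commas, ignoring the quotes.
naive_rows = [line.split(",") for line in sample.splitlines()]

# CSV-aware approach: csv.reader honours the double quotes.
csv_rows = list(csv.reader(io.StringIO(sample)))

for naive, parsed in zip(naive_rows, csv_rows):
    print(len(naive), "fields (naive) vs", len(parsed), "fields (csv):", parsed)
```

The second row is the interesting one: the naive split cuts the quoted value in two, while the csv module keeps it as a single field.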

So, like any *nix geek would, I tried providing double quote and comma together as the separator. To my surprise it worked very well. I thought the separator would take only a single character, and I had provided two (the shell strips the backslash, so sqlite3 receives just the double quote and the comma). Anyway, the important thing to remember is to escape the double quote with a backslash. I didn’t try it without the backslash, but an unescaped double quote would normally be read by the shell as the start of a quoted string.

So here is the syntax that worked :

sqlite3 -separator \", test.db ".import mydata.csv mytbl"
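
If Python is at hand, a CSV-aware import is another option. The following is a minimal sketch under stated assumptions, not the command-line approach described above: it reuses the t1 table layout from the first example, works on an in-memory database with inline sample rows instead of mydata.csv, and the helper name import_csv is made up for illustration.

```python
import csv
import io
import sqlite3


def import_csv(conn, csv_file):
    """Insert (key, data) rows from a CSV file object into table t1."""
    reader = csv.reader(csv_file)  # handles quoted fields with embedded commas
    rows = [(int(key), data) for key, data in reader]
    conn.executemany("INSERT INTO t1 (t1key, data) VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)


# In-memory database and inline sample data, used here only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (t1key INTEGER PRIMARY KEY, data TEXT)")
sample = io.StringIO('"1","data1"\n"2","data2,data3"\n')

count = import_csv(conn, sample)
stored = conn.execute("SELECT t1key, data FROM t1 ORDER BY t1key").fetchall()
print(count, stored)
```

For a real file, the same helper could be given a file object opened with open("mydata.csv", newline="") and a connection made with sqlite3.connect("test.db").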

Update: It turns out SQLite Manager is a much better solution after all. It is an extension for Firefox and other apps for managing any sqlite database. Not only did it take care of the above situation, it also handled empty cells, where the command line failed with the following error message:

line 4: expected 3 columns of data but found 2

Data with missing cells (notice the two successive commas):

"1","data1","data2"
"2","data3,data4","data5"
"3",,"data6"
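
A plausible reading of the error, assuming the shell simply splits each line on the literal two-character separator (double quote followed by comma), which is an assumption and has not been checked against the sqlite3 source: the row with the empty cell contains that sequence only once, so it yields two fields instead of three. The sketch below imitates such a splitter with Python's str.split and contrasts it with the csv module, which returns the empty cell as an empty string.

```python
import csv
import io

# The three sample rows shown above.
sample = '"1","data1","data2"\n"2","data3,data4","data5"\n"3",,"data6"\n'

# Assumption: imitate a splitter that cuts on the literal two characters ",
separator = '",'
split_counts = [len(line.split(separator)) for line in sample.splitlines()]

# CSV-aware parsing keeps three fields per row; the empty cell becomes ''.
parsed = list(csv.reader(io.StringIO(sample)))

print(split_counts)
print(parsed[2])
```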

Why SharpDevelop is a better IDE?

In my first post about IronPython, I documented how painful it was to install IronPython Studio (it needed the Visual Studio shell, which was confusing in itself). When I started with IronPython I did not know about any other IDE, hence I went ahead with IronPython Studio. But later I came to know about SharpDevelop.

My initial problem with SharpDevelop was that it needed .NET 3.5 SP1 at the minimum. I had just gone through the painful exercise of downloading and installing the prerequisites for IronPython Studio, so I was in no mood to download another big chunk before I could start my IronPython development. But once I got past my initial development cycle, I wanted to give SharpDevelop a try.

After using both IDEs interchangeably, I finally settled on SharpDevelop as my choice for IronPython development.

Reading CSV files in IronPython

This continues my previous blog post:

To get IronPython to use standard Python modules, one needs to add the following two lines to C:\IronPython-2.0.1\Lib\site.py:

import sys
sys.path.append(r"C:\Python25\Lib")
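
A slightly more defensive variant of the same idea is sketched below. The helper name add_stdlib_path is hypothetical, and the path is the one from the two lines above; the check simply avoids adding a directory that is missing or already on sys.path.

```python
import os
import sys


def add_stdlib_path(path):
    """Append path to sys.path if it exists and is not already listed.

    Returns True when the path was added, False otherwise.
    """
    if os.path.isdir(path) and path not in sys.path:
        sys.path.append(path)
        return True
    return False


# Path taken from the example above; adjust it to the local CPython install.
added = add_stdlib_path(r"C:\Python25\Lib")
print("added" if added else "not added")
```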

While this works for the most part, it doesn’t help if you are using Python extensions written in C. More about my specific problems in another post. But there is an open source project, IronClad, that deals specifically with this issue. In the meantime, you can check the differences between IronPython and CPython.

Reading (and writing to) CSV files is a critical part of my program. While in standard Python it was as easy as “import csv”, the same thing took some effort to get working in IronPython. I got the following error for my import statement:

Error on line 7 in csv.py
from functools import reduce

I also tried using ActiveState Python 2.5.2.2 (which I already had from a few months ago; I didn’t feel like downloading the latest version until the problem was fixed). But that didn’t help either. With ActiveState, I got the same error on the same line, except this time it was for _csv.

To quote from the IronPython Cookbook:

For some reason the Python standard library csv module is written in C, which means that it isn’t available to IronPython.

The cookbook points to a third-party library called A Fast Csv Reader. The cookbook has a nice example of how to use that DLL with your IronPython program.

It wasn’t clear to me why I had to register at Code Project to download this binary, since it is provided under the MIT open source license. But who am I gonna complain to? Beggars can’t be choosers :(
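
For simple files, one way to sidestep the missing C-backed module is a small parser written in plain Python. The sketch below is not the cookbook's approach and not a replacement for the csv module: the function name parse_csv_line is made up for illustration, and it handles only one line at a time (no quoted fields spanning several lines). It avoids any imports, so it does not depend on _csv.

```python
def parse_csv_line(line, delimiter=",", quotechar='"'):
    """Split one CSV line into fields, honouring quoted fields.

    A doubled quote character inside a quoted field stands for one literal quote.
    """
    fields = []
    current = []
    in_quotes = False
    i = 0
    while i < len(line):
        ch = line[i]
        if in_quotes:
            if ch == quotechar:
                if i + 1 < len(line) and line[i + 1] == quotechar:
                    current.append(quotechar)  # escaped quote inside a field
                    i += 1
                else:
                    in_quotes = False  # closing quote
            else:
                current.append(ch)
        elif ch == quotechar:
            in_quotes = True  # opening quote
        elif ch == delimiter:
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    fields.append("".join(current))
    return fields


print(parse_csv_line('"2","data2,data3"'))
print(parse_csv_line('"3",,"data6"'))
```

Lines read from a file would need their trailing newline stripped (for example with rstrip) before being passed in.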
