Salmon Run: Python Database hacks

Sunday, November 25, 2007

Python Database hacks

I seem to have writer's block this weekend. Or if you want to be more pragmatic, I did not take the time to work on a mini-project that I can write about. In reality, I spent last week goofing off, decompressing after an almost complete project, and putting finishing touches to a bunch of other little projects that have been in the pipeline for a while. As a result, I don't have anything to write about this weekend. A little disappointing, given that this was Thanksgiving weekend, and I had four days of vacation to do something interesting. But as so often happens with long vacations, the time gets sucked up into other little non-consequential things.

So this week, I will write about three little features that I came across while writing Python scripts to work against an Oracle database. These should work with other databases as well. Maybe they don't even deserve to be called hacks since they are adequately documented if you look hard enough, but its probably new stuff for beginning Python programmers such as myself. Hopefully this will be useful to others who are in the same situation as me.

Exception Handling

I found this out when trying to update a bunch of linked database tables using a Python script. An error in my program or an unhandled problem in the data would result in a partial update to the tables, forcing me to manually rollback the changes that had been applied. To handle this, Python has a try:...except: clause that can be used to check for exceptions and commit or rollback. Here is the code snippet to do this:

  import traceback
  ...
  conn = cx_Oracle.connect(connection_string)
  cursor1 = conn.cursor()
  cursor2 = conn.cursor()
  ...
  try:
    ... do something with various cursors derived from conn
    conn.commit()
  except Exception, inst:
    print "Error:", inst
    print traceback.print_exc(sys.exc_info()[2])
    conn.rollback()

The first print in the exception block will print the database error message, and the second print will print the stack trace. You can find more information in the Python tutorial page for Errors and Exceptions.

Accessing Database Metadata

I had posted code for a Python CSV to Excel converter in my blog sometime back. Most of the time, I would use this to convert CSV files dumped from a Java program or Python script into an Excel spreadsheet for consumption by my flat-file challenged clients. I decided to extend it recently to take in a SQL SELECT statement, and then dump out the results into an Excel spreadsheet. In addition to the rows, I also needed to get at the column names for the header. Python provides a description method for the cursor which returns this information, like so:

  conn = cx_Oracle.connect(connection_string)
  cursor = conn.cursor()
  cursor.execute(query)
  rows = cursor.fetchall()
  colnames = []
  metadaten = cursor.description()
  for metadata in metadaten:
    colnames.append(metadata[0])
  print ",".join(colnames)
  ...

The description method actually returns a lot more information about the column, but I was only interested in the column name. More information about the description() method and the contents of the data structure returned by it can be found in the ASPN Python Cookbook page here.

Unicode Handling

If your database is set up to store only ASCII characters, and if your data has characters which fall outside the 0..127 range, then you will have problems with storing this data. To get around this, you can have your script replace the non-ascii characters with "?" using the following call before you insert or update.