My guess is that the problem is this call:
unicode(error_msg)
What is the type of error_msg? I'm fairly sure that, by default, the subprocess APIs return the raw bytes output by the child program. The call to unicode
then tries to convert those bytes into characters (code points) by assuming some encoding (in this case utf8).
My guess is that the bytes aren't valid utf8 but are valid latin1 (latin1 maps every byte to a character, so decoding with it never raises, even when it is the wrong guess). You can specify which codec to use when converting bytes to characters:
error_msg.decode('latin1')
Here's an example that hopefully demonstrates the problem and workaround:
>>> b'h\xcello'.decode('utf8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.2/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xce in position 1: invalid continuation byte
>>> b'h\xcello'.decode('latin1')
'hÎllo'
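Applied to subprocess output, the same idea could look roughly like the sketch below. It is written for Python 3 and is not taken from your code: decode_output, the list of candidate encodings, and the throwaway child command are all made up for illustration, and the fallback order is an assumption you should adjust to whatever your child program really emits.

```python
import subprocess
import sys


def decode_output(raw, encodings=("utf-8", "latin-1")):
    """Hypothetical helper: return raw decoded with the first codec that accepts it."""
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    # Only reached if the caller passes a list without latin-1 (which never fails);
    # undecodable bytes are then replaced with U+FFFD.
    return raw.decode("utf-8", errors="replace")


# Stand-in child process that writes the bytes h, 0xce, l, l, o to stderr.
child = [sys.executable, "-c", "import sys; sys.stderr.buffer.write(b'h\\xcello')"]
proc = subprocess.run(child, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

# proc.stderr is bytes; decode it explicitly instead of relying on a default codec.
error_msg = decode_output(proc.stderr)
print(repr(error_msg))
```

Trying utf8 first and latin1 second means genuine utf8 output is still decoded correctly, while anything else at least produces a string instead of an exception.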
A better solution might be to make your child process output utf8, although whether that is enough also depends on what data your database is capable of storing.
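If the child happens to be a Python program (an assumption, since I don't know what you are running), one way to do that is to set the PYTHONIOENCODING environment variable for it. The child command below is only a stand-in; other programs have their own mechanisms, often locale settings.

```python
import os
import subprocess
import sys

# Copy the current environment and ask the Python child to use utf-8 for its streams.
env = dict(os.environ, PYTHONIOENCODING="utf-8")

# Stand-in child that writes a non-ASCII character to stderr as text.
child = [sys.executable, "-c", "import sys; sys.stderr.write('h\\u00cello')"]
proc = subprocess.run(child, stdout=subprocess.PIPE, stderr=subprocess.PIPE, env=env)

# The bytes are now known to be utf-8, so a plain decode is safe.
error_msg = proc.stderr.decode("utf-8")
print(repr(error_msg))
```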