Question

I have a huge (1 GB+) database dump that I want to load into new databases on other servers. I tried parsing it line by line and executing each line against MySQL, but the dump doesn't split evenly into one command per line, so it fails on the incomplete ones.

filename = '/var/test.sql'
fp = open(filename)
while True:
    a = fp.readline()
    if not a:
        break
    cursor.execute(a)  # fails most of the time

It is also way too large to load the entire file into memory and execute it as a single call. Furthermore, the Python MySQLdb module does not support the `source` command.

EDITED

The file includes a bunch of INSERT and CREATE statements. Where it's failing is on the INSERTs for large tables that contain raw text. That raw text is full of semicolons and newlines, so it's hard to split commands on those.


Solution

Any reason you can't spawn out a process to do it for you?

import subprocess

fd = open(filename, 'r')
subprocess.Popen(['mysql', '-u', username, '-p{}'.format(password), '-h', hostname, database], stdin=fd).wait()

You may want to tailor that a little, as the password passed via `-p` will be visible in the process list (`ps`).
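One way to avoid that exposure is to pass the password through the `MYSQL_PWD` environment variable instead of the command line, so it never shows up in `ps`. A minimal sketch (the helper names here are my own, not from the answer):

```python
import os
import subprocess

def mysql_command(username, hostname, database):
    # No -p flag: the password never appears in the argument list.
    return ['mysql', '-u', username, '-h', hostname, database]

def mysql_env(password):
    # The mysql client reads MYSQL_PWD from the environment.
    return dict(os.environ, MYSQL_PWD=password)

def import_dump(filename, username, password, hostname, database):
    # Stream the dump file straight into the client's stdin.
    with open(filename, 'rb') as fd:
        return subprocess.Popen(mysql_command(username, hostname, database),
                                stdin=fd, env=mysql_env(password)).wait()
```

Note that `MYSQL_PWD` has its own caveats (other processes of the same user can read the environment on some systems); a `--defaults-extra-file` with restrictive permissions is the more thorough fix.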

Other tips

Assuming queries do end on line boundaries, you could just add lines together until they make a complete query.

Something like:

filename = '/var/test.sql'
fp = open(filename)
lines = ''
while True:
    a = fp.readline()
    if not a:
        break
    try:
        cursor.execute(lines + a)
        lines = ''
    except Exception:  # statement incomplete; keep accumulating
        lines += a

If it's only INSERT statements, you could look for lines ending with `;` where the next line starts with `INSERT`.

filename = '/var/test.sql'
fp = open(filename)
lines = ''
while True:
    a = fp.readline()
    if not a:
        break
    # dumps usually write INSERT in uppercase, so compare case-insensitively
    if lines.strip().endswith(';') and a.lower().startswith('insert'):
        cursor.execute(lines)
        lines = a
    else:
        lines += a
# Catch the last one
cursor.execute(lines)

edit: replaced trim() with strip() & realised we don't need to execute the line a in second code example.

Sometimes it's worth choosing a different tool for the job. For large dumps I prefer MySQLDumper: http://www.mysqldumper.net/

License: CC-BY-SA with attribution
Not affiliated with StackOverflow