Working with PHP websites, you'll regularly need to export and import copies of MySQL databases, whether for testing and debugging or simply for creating and restoring backups. As databases grow this poses risks, particularly with large InnoDB-based applications like Magento, where database sizes can easily run into the gigabytes.
The Basics
The process for creating and restoring a snapshot is trivial:
$ mysqldump -uuser -p mydatabase > mydump.sql # export
$ mysql -uuser -p mydatabase < mydump.sql # import
For the most part this works as you'd expect, and for small databases it's probably all you need. You may find the resulting .sql file is huge; it is, after all, uncompressed text, at one byte per character for a single-byte encoding such as latin1 and up to four bytes per character with UTF-8. Bzip2 will bring the file size down considerably, but it's also considerably slower than gzip.
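For example, you can compress an existing dump after the fact with either tool (the filename is illustrative):
$ gzip mydump.sql # fast; produces mydump.sql.gz
$ bzip2 mydump.sql # slower but smaller; produces mydump.sql.bz2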
It's sometimes tempting to gzip or bzip2 the datadump while performing the mysqldump, all in a single line:
$ mysqldump -uuser -p mydatabase | bzip2 -c > mydump.sql.bz2
While this seems an efficient way to take a backup, it should be avoided: mysqldump locks tables as it works (by default MyISAM tables are locked entirely), and piping through a slow compressor extends the time other clients are denied access to them. InnoDB's row-level locking is slightly less offensive, but slowing the dump down should still be avoided as much as possible.
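If disk space allows, one safer pattern is to dump at full speed first and compress afterwards, so locks are held for as short a time as possible (the filename is illustrative):
$ mysqldump -uuser -p mydatabase > mydump.sql && bzip2 mydump.sql # compress only once the dump has finished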
When importing a large database, the choice of compression format matters: you trade decompression speed against file size. The extra CPU time spent decompressing a bzip2 datadump may not be worth the few megabytes it saves over the faster gzip. Whatever your choice, importing a compressed datadump is very easy.
$ gunzip -c mydump.sql.gz | mysql -uuser -p mydatabase # importing a gzipped datadump
$ bunzip2 -c mydump.sql.bz2 | mysql -uuser -p mydatabase # importing a bzip2 datadump
Locking and Transactional Integrity
As explained briefly above, MySQL's storage engines come with some limitations. In the worst case, MyISAM, mysqldump locks entire tables, meaning other clients cannot write to a table while it is being dumped. If you have large MyISAM tables, this poses clear problems when backing up a running application.
InnoDB is slightly better because it uses row-level locking: it locks only the rows affected by a query, not the whole table. This makes a conflict far less likely while backing up a running application. As a transactional storage engine, however, InnoDB allows for the possibility that active transactions are underway while you're attempting your backup.
Two options we can pass to mysqldump to mitigate these issues are --single-transaction and --skip-lock-tables.
$ mysqldump --single-transaction --skip-lock-tables -uuser -p mydatabase
Using --single-transaction makes mysqldump issue a START TRANSACTION statement before dumping the contents of a table, giving a consistent view of the table without blocking other clients (note this only helps transactional engines such as InnoDB). Writes can occur while the backup is taking place without affecting it. The --skip-lock-tables option stops MyISAM tables being locked during the backup. This does mean a table's integrity can be lost as writes occur to it during the backup process; that risk has to be weighed against the cost of blocking access to the table during a lengthy backup.
Improving Import Performance
Choice of compression format will have a large bearing on import performance: gzip is appreciably faster than bzip2. Other options you can pass to mysqldump to improve import performance are --disable-keys and --no-autocommit.
Disabling keys significantly improves import performance, as MySQL then only rebuilds the table's indexes at the end of the import. With keys enabled, the indexes are updated after each row is inserted, which is suboptimal for a batch import.
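With --disable-keys the dump wraps each table's INSERT statements in conditional comments along these lines (table name and values are illustrative):
/*!40000 ALTER TABLE `mytable` DISABLE KEYS */;
INSERT INTO `mytable` VALUES (1,'foo'),(2,'bar');
/*!40000 ALTER TABLE `mytable` ENABLE KEYS */;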
By default, each statement against an InnoDB table is committed as its own transaction. This adds unnecessary overhead during a batch import, when you really only need to commit once each table has been fully imported.
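Similarly, --no-autocommit wraps each table's inserts in a single transaction, so the dump contains something like:
set autocommit=0;
INSERT INTO `mytable` VALUES (1,'foo'),(2,'bar');
commit;
Putting the options from both sections together, a backup command might look like this (credentials and database name are placeholders):
$ mysqldump --single-transaction --skip-lock-tables --disable-keys --no-autocommit -uuser -p mydatabase > mydump.sql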
Further Reading
This is only a brief look at using mysqldump for backups. It's a common enough development task that all developers should take the time to see how it can best be leveraged for their environment. There's plenty of documentation out there on using the tool, but the best place to start is with the official docs.