LVM snapshot of mounted filesystem

https://stackoverflow.com/questions/1940093

20-09-2019
|

Question

I'd like to programmatically make a snapshot of a live filesystem in Linux, preferably using LVM. I'd like not to unmount it because I've got lots of files opened (my most common scenario is that I've got a busy desktop with lots of programs).

I understand that because of kernel buffers and general filesystem activity, data on disk might be in some more-or-less undefined state.

Is there any way to "atomically" unmount an FS, make an LVM snapshot and mount it back? It will be ok if the OS will block all activity for few seconds to do this task. Or maybe some kind of atomic "sync+snapshot"? Kernel call?

I don't know if it is even possible...

Solution

You shouldn't have to do anything for most Linux filesystems. It should just work without any effort at all on your part. The snapshot command itself hunts down mounted filesystems using the volume being snapshotted and calls a special hook that checkpoints them in a consistent, mountable state and does the snapshot atomically.

Older versions of LVM came with a set of VFS lock patches that would patch various filesystems so that they could be checkpointed for a snapshot. But with new kernels that should already be built into most Linux filesystems.

This intro on snapshots claims as much.

And a little more research reveals that for kernels in the 2.6 series the ext series of filesystems should all support this. ReiserFS probably also. And if I know the btrfs people, that one probably does as well.

OTHER TIPS

I know that ext3 and ext4 in RedHat Enterprise, Fedora and CentOS automatically checkpoint when a LVM snapshot is created. That means there is never any problem mounting the snapshot because it is always clean.

I believe XFS has the same support. I am not sure about other filesystems.

It depends on the filesystem you are using. With XFS you can use xfs_freeze -f to sync and freeze the FS, and xfs_freeze -u to activate it again, so you can create your snapshot from the frozen volume, which should be a save state.

I'm not sure if this will do the trick for you, but you can remount a file system as read-only. mount -o remount,ro /lvm (or something similar) will do the trick. After you are done your snapshot, you can remount read-write using mount -o remount,rw /lvm.

Is there any way to "atomically" unmount an FS, make an LVM snapshot and mount it back?

It is possible to snapshot a mounted filesystem, even when the filesystem is not on an LVM volume. If the filesystem is on LVM, or it has built-in snapshot facilities (e.g. btrfs or ZFS), then use those instead.

The below instructions are fairly low-level, but they can be useful if you want the ability to snapshot a filesystem that is not on an LVM volume, and can't move it to a new LVM volume. Still, they're not for the faint-hearted: if you make a mistake, you may corrupt your filesystem. Make sure to consult the official documentation and dmsetup man page, triple-check the commands you're running, and have backups!

The Linux kernel has an awesome facility called the Device Mapper, which can do nice things such as create block devices that are "views" of other block devices, and of course snapshots. It is also what LVM uses under the hood to do the heavy lifting.

In the below examples I'll assume you want to snapshot /home, which is an ext4 filesystem located on /dev/sda2.

First, find the name of the device mapper device that the partition is mounted on:

# mount | grep home
/dev/mapper/home on /home type ext4 (rw,relatime,data=ordered)

Here, the device mapper device name is home. If the path to the block device does not start with /dev/mapper/, then you will need to create a device mapper device, and remount the filesystem to use that device instead of the HDD partition. You'll only need to do this once.

# dmsetup create home --table "0 $(blockdev --getsz /dev/sda2) linear /dev/sda2 0"
# umount /home
# mount -t ext4 /dev/mapper/home /home

Next, get the block device's device mapper table:

# dmsetup table home
home: 0 3864024960 linear 9:2 0

Your numbers will probably be different. The device target should be linear; if yours isn't, you may need to take special considerations. If the last number (start offset) is not 0, you will need to create an intermediate block device (with the same table as the current one) and use that as the base instead of /dev/sda2.

In the above example, home is using a single-entry table with the linear target. You will need to replace this table with a new one, which uses the snapshot target.

Device mapper provides three targets for snapshotting:

The snapshot target, which saves writes to the specified COW device. (Note that even though it's called a snapshot, the terminology is misleading, as the snapshot will be writable, but the underlying device will remain unchanged.)
The snapshot-origin target, which sends writes to the underlying device, but also sends the old data that the writes overwrote to the specified COW device.

Typically, you would make home a snapshot-origin target, then create some snapshot targets on top of it. This is what LVM does. However, a simpler method would be to simply create a snapshot target directly, which is what I'll show below.

Regardless of the method you choose, you must not write to the underlying device (/dev/sda2), or the snapshots will see a corrupted view of the filesystem. So, as a precaution, you should mark the underlying block device as read-only:

# blockdev --setro /dev/sda2

This won't affect device-mapper devices backed by it, so if you've already re-mounted /home on /dev/mapper/home, it should not have a noticeable effect.

Next, you will need to prepare the COW device, which will store changes since the snapshot was made. This has to be a block device, but can be backed by a sparse file. If you want to use a sparse file of e.g. 32GB:

# dd if=/dev/zero bs=1M count=0 seek=32768 of=/home_cow
# losetup --find --show /home_cow
/dev/loop0

Obviously, the sparse file shouldn't be on the filesystem you're snapshotting :)

Now you can reload the device's table and turn it into a snapshot device:

# dmsetup suspend home && \
  dmsetup reload home --table \
    "0 $(blockdev --getsz /dev/sda2) snapshot /dev/sda2 /dev/loop0 PO 8" && \
  dmsetup resume home

If that succeeds, new writes to /home should now be recorded in the /home_cow file, instead of being written to /dev/sda2. Make sure to monitor the size of the COW file, as well as the free space on the filesystem it's on, to avoid running out of COW space.

Once you no longer need the snapshot, you can merge it (to permanently commit the changes in the COW file to the underlying device), or discard it.

To merge it:

replace the table with a snapshot-merge target instead of a snapshot target:

# dmsetup suspend home && \
  dmsetup reload home --table \
    "0 $(blockdev --getsz /dev/sda2) snapshot-merge /dev/sda2 /dev/loop0 P 8" && \
  dmsetup resume home

Next, monitor the status of the merge until all non-metadata blocks are merged:
```
# watch dmsetup status home
...
0 3864024960 snapshot-merge 281688/2097152 1104
```
Note the 3 numbers at the end (X/Y Z). The merge is complete when X = Z.

Next, replace the table with a linear target again:

# dmsetup suspend home && \
  dmsetup reload home --table \
    "0 $(blockdev --getsz /dev/sda2) linear /dev/sda2 0" && \
  dmsetup resume home

Now you can dismantle the loop device:
```
# losetup -d /dev/loop0
```
Finally, you can delete the COW file.
```
# rm /home_cow
```

To discard the snapshot, unmount /home, follow steps 3-5 above, and remount /home. Although Device Mapper will allow you to do this without unmounting /home, it doesn't make sense (since the running programs' state in memory won't correspond to the filesystem state any more), and it will likely corrupt your filesystem.

FS corruption is "highly unlikely", as long as you never work in any kind of professional environment. otherwise you'll meet reality, and you might try blaming "bit rot" or "hardware" or whatever, but it all comes down to having been irresponsible. freeze/thaw (as mentioned a few times, and only if called properly) is sufficient outside of database environments. for databases, you still won't have a transaction-complete backup and if you think a backup that rolls back some transaction is fine when restored: see starting sentence. depending on the activity you might just added another 5-10 mins of downtime if ever you need that backup. Most of us can easily afford that, but it can not be general advice. Be honest about downsides, guys.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow