HDFS Snapshots are read-only point-in-time copies of the file system. Snapshots can be taken on a subtree of the file system or the entire file system. Some common use cases of snapshots are data backup, protection against user errors and disaster recovery.
- Creating a snapshot does not copy the actual data blocks. HDFS only records metadata (the block list and file size) and keeps track of later changes.
- First, we need to make the directory snapshottable using hdfs dfsadmin -allowSnapshot <path>. Only users in supergroup (that is, users with superuser privileges) can allow snapshots on a directory.
- Once snapshots are allowed, we can create a snapshot using hadoop fs -createSnapshot <path> [<snapshotName>]
- We can also delete or rename a snapshot using -deleteSnapshot or -renameSnapshot
- Users in supergroup can also disallow snapshots on a directory (using hdfs dfsadmin -disallowSnapshot). All existing snapshots of that directory must be deleted first.
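
The steps above can be strung together into one small script. This is only a sketch: the directory /user/demo/data and the snapshot names before-load and baseline are made-up placeholders, and the run helper with its DRY_RUN switch is our own addition, not part of Hadoop. By default the script just prints the commands, so we can review them before touching a cluster.

```shell
#!/bin/sh
# Sketch of the snapshot lifecycle described above.
# SNAP_DIR and the snapshot names are hypothetical placeholders.
SNAP_DIR="${SNAP_DIR:-/user/demo/data}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands; 0 = actually run them

# Helper (our own, not a Hadoop tool): print or execute a command.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

snapshot_lifecycle() {
  # Needs superuser privileges: mark the directory as snapshottable.
  run hdfs dfsadmin -allowSnapshot "$SNAP_DIR"
  # Create a named snapshot; it is exposed under $SNAP_DIR/.snapshot/before-load.
  run hdfs dfs -createSnapshot "$SNAP_DIR" before-load
  # Rename the snapshot, then delete it under its new name.
  run hdfs dfs -renameSnapshot "$SNAP_DIR" before-load baseline
  run hdfs dfs -deleteSnapshot "$SNAP_DIR" baseline
  # Disallowing only succeeds once every snapshot has been deleted.
  run hdfs dfsadmin -disallowSnapshot "$SNAP_DIR"
}

snapshot_lifecycle
```

With DRY_RUN left at 1 the script prints the five commands in order. Setting DRY_RUN=0 runs them against the cluster, which assumes the hdfs client is on the PATH and the user has the required privileges.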