diskdump.py
Privilege: sudo · Output: .img / .iso · ~110 lines
A minimal, zero-dependency disk imaging script that wraps dd and lsblk.
At its core, diskdump.py uses lsblk to find the disks and dd to copy the one you pick. The script asks you three questions: which disk, where to save it, and what format. It then shows you exactly what it is about to do and waits for explicit confirmation before a single byte is read. The filename is generated automatically from the disk's model name and the current timestamp, so the output is always traceable and will not overwrite a previous image unless the same disk is imaged twice within the same minute.
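The flow, step by step
1. Run lsblk -J -o NAME,SIZE,TYPE,MODEL and keep only the entries where type == "disk".
2. List the disks and ask you to choose one. The model name (for example, Samsung SSD 870 EVO 1TB) is lowercased and reduced to the characters [a-z0-9._-].
3. Ask for a destination directory and check it with os.path.isdir.
4. Ask for the image format: .img or .iso.
5. Build the filename as {model}_{DD-MM-YYYY_HHhMM}.{ext}.
6. Show the source and destination, then wait for yes or y.
7. Run sudo dd if=/dev/sdX of=/path/to/output bs=4M status=progress conv=fsync. status=progress prints a running byte count, and conv=fsync flushes the image to disk before dd exits.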
#!/usr/bin/env python3
import subprocess
import json
import os
from datetime import datetime
import sys
import re


def sanitize_name(name):
    """Convert disk model to a safe filename."""
    if not name:
        return "diskimage"
    name = name.strip().lower()
    name = re.sub(r'\s+', '_', name)
    name = re.sub(r'[^a-z0-9._-]', '', name)
    return name


def get_disks():
    """Return the lsblk entries whose type is "disk"."""
    result = subprocess.run(
        ["lsblk", "-J", "-o", "NAME,SIZE,TYPE,MODEL"],
        capture_output=True,
        text=True,
        check=True
    )
    data = json.loads(result.stdout)
    disks = []
    for device in data["blockdevices"]:
        if device["type"] == "disk":
            disks.append(device)
    return disks


def choose_disk(disks):
    """Print a numbered list of disks and return the one the user picks."""
    print("\nDetected Drives:\n")
    for i, d in enumerate(disks):
        model = (d.get("model") or "Unknown").strip()
        print(f"{i+1}) /dev/{d['name']} | {d['size']} | {model}")
    while True:
        try:
            choice = int(input("\nChoose a disk number: "))
            if 1 <= choice <= len(disks):
                return disks[choice-1]
        except ValueError:
            pass
        print("Invalid selection.")


def run_dd(source, output_file):
    """Copy source to output_file with dd; exit with status 1 on failure."""
    print("\nStarting backup...\n")
    cmd = [
        "sudo",
        "dd",
        f"if={source}",
        f"of={output_file}",
        "bs=4M",
        "status=progress",
        "conv=fsync"
    ]
    process = subprocess.run(cmd)
    if process.returncode != 0:
        print("\nBackup failed.")
        sys.exit(1)


def main():
    disks = get_disks()
    if not disks:
        # Without this guard choose_disk() would loop forever on an empty list.
        print("No disks detected.")
        return
    disk = choose_disk(disks)
    model = sanitize_name(disk.get("model"))
    timestamp = datetime.now().strftime("%d-%m-%Y_%Hh%M")
    directory = input("\nEnter destination directory: ").strip()
    if not os.path.isdir(directory):
        print("Directory does not exist.")
        return
    fmt = input("\nChoose image format (.img or .iso): ").strip().lower()
    if fmt not in [".img", ".iso"]:
        print("Invalid format.")
        return
    filename = f"{model}_{timestamp}{fmt}"
    output_file = os.path.join(directory, filename)
    source = f"/dev/{disk['name']}"
    print(f"\nSource: {source}")
    print(f"Destination: {output_file}")
    # Nothing is read from the disk until the user confirms.
    confirm = input("\nProceed with backup? (yes/no): ").lower()
    if confirm not in ["yes", "y"]:
        print("Cancelled.")
        return
    run_dd(source, output_file)
    print("\nBackup completed successfully!")
    print(f"Image saved to: {output_file}")


if __name__ == "__main__":
    main()
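For readers who want a feel for what the bs=4M and conv=fsync options passed to dd in run_dd amount to, the following is a rough pure-Python analogue. It is only a sketch for intuition: the script itself shells out to dd, and the names copy_in_blocks and BLOCK_SIZE are hypothetical, not part of diskdump.py.

```python
import os

BLOCK_SIZE = 4 * 1024 * 1024  # 4 MiB, mirroring dd's bs=4M (hypothetical constant)


def copy_in_blocks(source, destination, block_size=BLOCK_SIZE):
    """Copy source to destination one block at a time; return bytes copied."""
    copied = 0
    with open(source, "rb") as src, open(destination, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:  # an empty read means end of input
                break
            dst.write(block)
            copied += len(block)
        # Rough equivalent of conv=fsync: push Python's buffer to the OS,
        # then ask the OS to write the file to the storage device.
        dst.flush()
        os.fsync(dst.fileno())
    return copied
```

Calling copy_in_blocks("/dev/sdX", "/path/to/output") would need root for the same reason dd does, and it prints no progress, which is one reason to keep using dd for the real job.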
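Why dd?
Alternatives exist: ddrescue, clonezilla, and rsync. dd is the one that is already installed on virtually every Linux system, which keeps the script free of extra dependencies.
↳ bs=4M and conv=fsync
The default dd block size is 512 bytes, which is slow for whole-disk copies. bs=4M makes dd read and write in 4 MiB blocks instead. conv=fsync makes dd call fsync() on the output file before it exits, so the image has actually been written to disk when the command returns.
The output filename
An example: samsung_ssd_870_evo_1tb_23-04-2025_14h30.img
The sanitize_name function takes the model string reported by lsblk, lowercases it, turns whitespace into underscores, and drops every character outside [a-z0-9._-]. It falls back to "diskimage" when the model is null. The goal is a filename you can read six months later and immediately know what it contains, with no supporting documentation required.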
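The naming scheme can be tried in isolation. The sketch below repeats sanitize_name from the script and wraps the filename assembly in a helper called build_filename, which is a hypothetical name introduced here for illustration (the script builds the string inline in main).

```python
import re
from datetime import datetime


def sanitize_name(name):
    """Same logic as the script: lowercase, underscores, safe characters only."""
    if not name:
        return "diskimage"
    name = name.strip().lower()
    name = re.sub(r"\s+", "_", name)
    name = re.sub(r"[^a-z0-9._-]", "", name)
    return name


def build_filename(model, when, ext=".img"):
    """Hypothetical helper: assemble {model}_{DD-MM-YYYY_HHhMM}{ext}."""
    stamp = when.strftime("%d-%m-%Y_%Hh%M")
    return f"{sanitize_name(model)}_{stamp}{ext}"


fixed_time = datetime(2025, 4, 23, 14, 30)
print(build_filename("Samsung SSD 870 EVO 1TB", fixed_time))
# samsung_ssd_870_evo_1tb_23-04-2025_14h30.img
print(build_filename(None, fixed_time, ".iso"))
# diskimage_23-04-2025_14h30.iso
```

Because the timestamp stops at minutes, two images of the same disk started within the same minute would share a name.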
The script is intentionally minimal. There are several things it does not attempt, and being clear about them matters before using it in anger:
- No compression
  The output is a raw byte-for-byte copy, the same size as the source drive. A 1 TB disk produces a 1 TB image file regardless of how much data is actually on it. Pipe through gzip or zstd manually if storage space is a concern.
- No error recovery
  If dd hits a bad sector, it exits with a non-zero return code and the script halts. For imaging damaged or failing drives, use ddrescue, which can retry bad blocks, skip them, and resume interrupted sessions.
- No live drives
  Imaging a mounted, running filesystem with dd produces an inconsistent snapshot: writes happening during the copy may be partially captured. The script makes no attempt to detect or warn about this. Unmount the source before imaging, or use LVM snapshots for live backups.
- No verification
  The script does not hash the source and destination after imaging to confirm they match. For forensic or archival purposes, run sha256sum against both /dev/sdX and the output file manually after completion.
- sudo required
  Reading raw block devices requires root. The script calls sudo dd directly, which will prompt for a password if the session doesn't have an active sudo token. It does not check in advance whether the user has the necessary privileges.
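Two of the gaps above, live drives and verification, could be narrowed with small helpers along these lines. This is a sketch under stated assumptions, not something the script does: mounted_paths and sha256_of are hypothetical names, and the mount check assumes the lsblk call is extended with a MOUNTPOINT (or MOUNTPOINTS) column so that each JSON entry carries that field. The sample dictionary is illustrative data, not captured output.

```python
import hashlib


def mounted_paths(device):
    """Collect mount points from an lsblk JSON entry and all of its children."""
    # Assumption: lsblk was run with a MOUNTPOINT or MOUNTPOINTS column.
    # Newer util-linux can emit "mountpoints" (a list); otherwise "mountpoint".
    points = device.get("mountpoints") or [device.get("mountpoint")]
    found = [p for p in points if p]
    for child in device.get("children", []):
        found.extend(mounted_paths(child))
    return found


def sha256_of(path, chunk_size=4 * 1024 * 1024):
    """Hash a file or block device in chunks so it never has to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Illustrative entry shaped like lsblk -J output with a mountpoint column.
sample = {
    "name": "sda", "type": "disk", "mountpoint": None,
    "children": [
        {"name": "sda1", "type": "part", "mountpoint": "/boot"},
        {"name": "sda2", "type": "part", "mountpoint": None},
    ],
}
print(mounted_paths(sample))  # ['/boot']
```

A non-empty result from mounted_paths would be a reason to warn or refuse before imaging. After imaging, comparing sha256_of("/dev/sdX") with sha256_of on the output file gives the same answer as running sha256sum on both. Reading the device this way still requires root.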
✓ What it is good for
Creating a bootable image of a working system before a major upgrade. Archiving a decommissioned machine's full disk state. Cloning a configured install to identical hardware. Producing a forensic snapshot of a drive that needs to be handed to someone else. Any situation where you want an exact, offline, sector-level copy of a healthy block device and a filename you'll still understand next year.
# make executable
chmod +x diskdump.py

# run it
./diskdump.py

# example session
Detected Drives:

1) /dev/sda | 931.5G | Samsung SSD 870 EVO 1TB
2) /dev/sdb | 7.5G | SanDisk Ultra

Choose a disk number: 1

Enter destination directory: /mnt/backup

Choose image format (.img or .iso): .img

Source: /dev/sda
Destination: /mnt/backup/samsung_ssd_870_evo_1tb_23-04-2025_14h30.img

Proceed with backup? (yes/no): yes

Starting backup...

1000215216128 bytes (1.0 TB) copied, 1847 s, 541 MB/s
No installation required. No pip packages. The only dependencies are lsblk and dd. The script itself can live in /usr/local/bin.