Network Backups using rsync and systemd timers
Prerequisites
- A system on the network with sufficient free storage
- openssh-server and rsync installed on the remote system
- A user on the remote system, say remoteuser, that can do key-based passwordless login via SSH and has sudo access (see the example after this list)
- remoteuser configured in /etc/sudoers for passwordless sudo access for rsync (add remoteuser ALL=NOPASSWD:/usr/bin/rsync)
- Appropriate firewall ports open for SSH and rsync
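A minimal sketch of the key-based login setup, assuming a Debian/Ubuntu remote and the example hostname network-1.local used later in this guide:

# On the remote system: install the needed packages
sudo apt install openssh-server rsync

# On the local system: generate a key pair (if you don't have one),
# copy the public key over, then verify passwordless login works
ssh-keygen -t ed25519
ssh-copy-id remoteuser@network-1.local
ssh remoteuser@network-1.local true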
If you manage the remote system with Ansible, the sudoers entry can be added and validated with a lineinfile task:
- lineinfile:
    path: /etc/sudoers
    state: present
    regexp: '^remoteuser ALL=NOPASSWD:/usr/bin/rsync'
    line: 'remoteuser ALL=NOPASSWD:/usr/bin/rsync'
    validate: '/usr/sbin/visudo -cf %s'
  tags:
    - rsync
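A quick test, assuming the hostname above, to confirm passwordless sudo for rsync works from the client:

# Should print the rsync version without prompting for a password
ssh remoteuser@network-1.local sudo -n rsync --version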
Create a systemd service unit
Save the following as /etc/systemd/system/data-backup.service:
[Unit]
Description=Data backup
Wants=network-online.target
After=network-online.target
[Service]
Type=oneshot
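# Lowest CPU priority so the backup yields to interactive work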
Nice=19
StandardOutput=journal
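# Best-effort IO at reduced priority to limit disk contention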
IOSchedulingClass=best-effort
IOSchedulingPriority=5
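# Remote rsync runs under sudo so it can preserve ownership, ACLs and xattrs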
ExecStart=/usr/bin/rsync \
--rsync-path="sudo rsync" \
--archive \
--prune-empty-dirs \
--compress \
--update \
--quiet \
--acls \
--xattrs \
--progress \
--human-readable \
--exclude node_modules \
--exclude .DS_Store \
--exclude '*cache*' \
--exclude tmp \
/path/to/local/source/data \
remoteuser@network-1.local:/path/to/remote/archive/data
[Install]
WantedBy=multi-user.target
In the above unit, change:
- /path/to/local/source/data to point to the source data directory
- remoteuser@network-1.local to the system to access via SSH
- /path/to/remote/archive/data to the destination directory on the remote system
You can copy the final rsync command and run it in a shell with the --dry-run switch (and remove --quiet) to ensure it works as intended.
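A sketch of that check, using the placeholder paths from the unit above:

rsync --dry-run --rsync-path="sudo rsync" \
    --archive --prune-empty-dirs --compress --update \
    --acls --xattrs --progress --human-readable \
    --exclude node_modules --exclude .DS_Store --exclude '*cache*' --exclude tmp \
    /path/to/local/source/data \
    remoteuser@network-1.local:/path/to/remote/archive/data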
Create a systemd timer
Set up a systemd timer to run the backup task daily; save the following as /etc/systemd/system/data-backup.timer. There are many options for setting the frequency and nature of repetition (e.g. OnBootSec=15min and OnUnitActiveSec=15min under [Timer] run the job every 15 minutes in a non-overlapping fashion; see the variant after the unit below).
[Unit]
Description=Data backup timer
Requires=data-backup.service
[Timer]
OnCalendar=daily
Unit=data-backup.service
[Install]
WantedBy=timers.target
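The 15-minute variant mentioned above would use this [Timer] section instead:

[Timer]
OnBootSec=15min
OnUnitActiveSec=15min
Unit=data-backup.service

For calendar-based schedules like the daily one above, adding Persistent=true under [Timer] makes systemd run a missed job soon after the next boot if the machine was off at the scheduled time.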
As root, enable and run both systemd units:
systemctl daemon-reload
systemctl enable data-backup.service
systemctl enable data-backup.timer
systemctl start data-backup.service
systemctl start data-backup.timer
# Ensure all is well
journalctl -f -u data-backup.service
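To confirm the timer is scheduled, and to see when it last ran and will next run:

systemctl list-timers data-backup.timer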
Background
My system has a paltry 250 GB internal disk that keeps running out of space. This gets complicated when I build large, complex projects from source.
I don't like the idea of upgrading to a larger SSD because it makes full disk backups slower and harder. I (stubbornly) believe all programming-related data (code, not training data sets) that is useful and worth long-term storage should realistically fit in ~100 GB. That makes incremental backups easier. Everything else is transient stuff: node_modules, build objects and intermediate artifacts, Docker images/volumes, npm/pip/gradle/mvn package caches, etc.
SSDs also have a finite lifespan, which means the thought of a 1 TB SSD just not waking up one day is a scary one. A network backup is the least I can do.