Duplicity Onedrive Timeouts
Posted Wed, 22 Feb 2023 21:17:00 +0100 | Backups
I use OneDrive as my main cloud backup. Of course, I wouldn’t upload my files there unencrypted, so I decided to use duplicity to upload encrypted backups. It worked very well in test runs with a few files. But when I attempted a full backup of several hundred GB and left duplicity running overnight, I would find that it had frozen and nothing was happening. I restarted duplicity several times and was eventually able to back up all my files. But then I tried to verify the backup, and that wouldn’t work, even after a week of restarting duplicity.
I used pv -d $(pgrep -u $USER duplicity) to verify that duplicity was stuck. I then used pgrep -u $USER duplicity | xargs -n1 sudo strace -p (from this nice blog post about troubleshooting stuck processes) to attach to duplicity and discovered that duplicity was trying to read from a socket connected to 1drv.ms.
So the OneDrive server had stopped responding, and duplicity was blocked waiting for it.
I decided to look into the source code for the OneDrive backend, and that’s where I discovered that the calls to the Python requests library did not include timeouts.
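To illustrate why a missing timeout matters, here is a minimal sketch using only the Python standard library. It is not the duplicity or requests code: the helper names start_silent_server and read_with_timeout are made up for this example. A local server accepts the connection and then never answers, which is roughly what the stuck read looked like under strace. With a timeout set on the socket, the read gives up instead of blocking forever. The timeout= argument of the requests library plays the same role for HTTP calls.

```python
import socket
import threading


def start_silent_server():
    """Accept TCP connections on a free local port but never send a byte."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(5)
    held = []  # keep accepted sockets referenced so they stay open and silent

    def accept_loop():
        while True:
            try:
                conn, _ = server.accept()
            except OSError:
                return  # the listening socket was closed
            held.append(conn)

    threading.Thread(target=accept_loop, daemon=True).start()
    return server, server.getsockname()[1]


def read_with_timeout(port, timeout):
    """Try to read from the silent server, giving up after `timeout` seconds."""
    with socket.create_connection(("127.0.0.1", port), timeout=timeout) as sock:
        try:
            return sock.recv(1024)
        except socket.timeout:
            # The peer stayed silent; we regain control instead of hanging.
            return None


server, port = start_silent_server()
result = read_with_timeout(port, timeout=0.5)
print("timed out" if result is None else result)
server.close()
```

Without the timeout argument, the recv call in this sketch would block indefinitely, just like duplicity did.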
It turned out that adding timeouts, set to the value of the --timeout argument passed to duplicity, was enough to fix most of the problems. Duplicity retries up to --num-retries times when a backend throws an exception. However, exceptions raised inside the backend's _retry_cleanup function, a wrapper for the initialize_oauth2_session function, are not always caught by duplicity, which can lead to a crash. So I manually implemented the same retry logic in initialize_oauth2_session as in the common backend interface.
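The general shape of such retry logic can be sketched as follows. This is not the code from duplicity or from my patch: call_with_retries and flaky_session_setup are hypothetical names, and the real backend also handles logging and cleanup that are left out here. The idea is simply to catch the exception, wait a little, and try again until the attempt budget is used up.

```python
import time


def call_with_retries(operation, num_retries, delay=0.0):
    """Call `operation` until it succeeds or `num_retries` attempts have failed."""
    last_error = None
    for attempt in range(1, num_retries + 1):
        try:
            return operation()
        except Exception as exc:
            last_error = exc
            if attempt < num_retries:
                time.sleep(delay)  # short pause before the next attempt
    raise RuntimeError(f"giving up after {num_retries} attempts") from last_error


# Hypothetical stand-in for a session setup that fails twice, then works.
attempts = {"count": 0}


def flaky_session_setup():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TimeoutError("server did not answer in time")
    return "session ready"


print(call_with_retries(flaky_session_setup, num_retries=5))
```

Combined with a timeout on every request, a stalled server then costs at most the timeout multiplied by the number of attempts, instead of hanging the whole backup.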
I created a fork of duplicity, committed my fixes, and opened an issue. The fix was merged into the main branch shortly afterward.