They say necessity is the mother of invention, and in this article we’ll explain why we wrote s3-pit-restore.
Our infrastructure relies heavily on S3 object storage to store the several million files our users produce every day. S3 works reliably, and everything went well until we needed an older copy of a bunch of files for one of our customers.
Fortunately we had versioning enabled on our buckets, but restoring a subfolder to a given point in time proved harder than expected.
Armed with patience, we explored all the options offered by the S3 web GUI and all the available documentation, only to learn what we did not want to hear: there is no official way to restore a bucket to a given point in time.
Initially we were shocked but, hey! Writing good software is our job, isn’t it?
So we started writing s3-pit-restore. “S3 Point in Time Restore” is a tool built to do exactly that: restore a bucket, or a subset of a bucket, to a given point in time, like this:
s3-pit-restore --bucket my-bucket --dest my-restored-bucket --timestamp "06-17-2016 23:59:50 +2"
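To make the timestamp in the command above concrete, here is a minimal Python sketch of how such a string could be turned into a timezone-aware instant. It assumes a month-day-year order and a trailing whole-hour UTC offset, which is only our reading of the example; `parse_cutoff` is a hypothetical helper, not a function of the tool.

```python
from datetime import datetime, timedelta, timezone


def parse_cutoff(text):
    """Parse e.g. "06-17-2016 23:59:50 +2" (assumed layout) into an aware datetime."""
    # Separate the date/time part from the trailing hour offset.
    stamp, offset = text.rsplit(" ", 1)
    naive = datetime.strptime(stamp, "%m-%d-%Y %H:%M:%S")
    # Assumption: the offset is a signed whole number of hours from UTC.
    return naive.replace(tzinfo=timezone(timedelta(hours=int(offset))))


cutoff = parse_cutoff("06-17-2016 23:59:50 +2")
# S3 reports modification times in UTC, so compare in UTC.
cutoff_utc = cutoff.astimezone(timezone.utc)
```

Under these assumptions, the cutoff in the example corresponds to 21:59:50 UTC on June 17, 2016.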
What s3-pit-restore actually offers:
- Restore of all files with a timestamp earlier than the given one
- Restore of a whole bucket or a bucket prefix
- Parallel download of multiple files for high overall throughput
- Configurable number of parallel workers to optimize bandwidth usage
- Restore from S3 bucket versions, or from Glacier if enabled
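
The idea behind the first and third points can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the actual s3-pit-restore code: the records are plain dictionaries shaped like the entries S3 returns when listing object versions (`Key`, `VersionId`, `LastModified`), a key whose newest entry before the cutoff is a delete marker is assumed not to exist at that time, and `select_versions`, `restore_parallel` and `fetch` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime, timezone


def select_versions(versions, delete_markers, cutoff):
    """Map each key to the version ID it had just before `cutoff`."""
    newest = {}  # key -> (last_modified, version_id, or None for a delete marker)
    tagged = [(v, v["VersionId"]) for v in versions]
    tagged += [(m, None) for m in delete_markers]
    for entry, version_id in tagged:
        modified = entry["LastModified"]
        if modified >= cutoff:
            continue  # written at or after the cutoff: not part of the snapshot
        current = newest.get(entry["Key"])
        if current is None or modified > current[0]:
            newest[entry["Key"]] = (modified, version_id)
    # Assumption: a key whose newest entry is a delete marker did not exist then.
    return {key: vid for key, (_, vid) in newest.items() if vid is not None}


def restore_parallel(selected, fetch, max_workers=4):
    """Call fetch(key, version_id) for every selected object on a thread pool."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(fetch, key, vid) for key, vid in selected.items()]
        return [future.result() for future in futures]


def utc(day, hour):
    # Helper for the sample data below: a UTC instant in June 2016.
    return datetime(2016, 6, day, hour, tzinfo=timezone.utc)


versions = [
    {"Key": "docs/a.txt", "VersionId": "v1", "LastModified": utc(10, 9)},
    {"Key": "docs/a.txt", "VersionId": "v2", "LastModified": utc(18, 9)},
    {"Key": "docs/b.txt", "VersionId": "v3", "LastModified": utc(12, 9)},
]
delete_markers = [
    {"Key": "docs/b.txt", "VersionId": "d1", "LastModified": utc(15, 9)},
]

selected = select_versions(versions, delete_markers, utc(17, 22))
# A stand-in fetch that only echoes its arguments instead of downloading.
restored = restore_parallel(selected, lambda key, vid: (key, vid), max_workers=2)
```

With this sample data only `docs/a.txt` is selected, at version `v1`: its `v2` was written after the cutoff, and `docs/b.txt` had already been deleted. In a real restore, `fetch` would wrap an S3 download of that key and version ID, and `max_workers` plays the role of the configurable worker count.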
We released s3-pit-restore as FLOSS under the terms of the MIT licence. You can find it, along with more documentation and examples, at https://github.com/madisoft/s3-pit-restore
We hope this software serves you as well as it has served us!