====== Backup to S3 ======
==== Ruby ====
This is a small Ruby script that you should symbolically link to from your ''cron.daily'' directory. It does the following:
* Creates tarball of several data directories, including meta and attic
* Uploads the tar file to your Amazon S3 account
The script keeps a rotating backup for each day of the past week and also archives a permanent backup on the first of each month.
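The naming scheme is what provides the rotation: a weekday number (0–6) in the file name means each day's upload overwrites the file from the same weekday of the previous week, so at most seven rotating files ever exist, while the date-stamped monthly name is never reused. A small Python 3 sketch of the scheme (the helper function is illustrative, not part of the script below):

```python
import datetime

def backup_names(today):
    """Return (rotating_name, permanent_name_or_None) for a given date."""
    # strftime('%w') gives 0 (Sunday) through 6, matching Ruby's Time#wday,
    # so the rotating name cycles weekly.
    rotating = 'wiki-{}.tar.gz'.format(today.strftime('%w'))
    # On the first of the month, also emit a date-stamped permanent name.
    permanent = None
    if today.day == 1:
        permanent = 'wiki-{}.tar.gz'.format(today.strftime('%Y-%m-%d'))
    return rotating, permanent

print(backup_names(datetime.date(2023, 5, 1)))  # a Monday, first of month
print(backup_names(datetime.date(2023, 5, 8)))  # same weekday, next week
```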
To use this script, you must:
* Open an [[http://aws.amazon.com/s3|Amazon S3]] account
* Get the access keys and put them into the script at appropriate place
* Have ruby and the [[http://amazon.rubyforge.org/|AWS::S3]] module
-- [[http://www.billkatz.com|bk]]
<code ruby>
#!/usr/bin/env ruby
require 'rubygems'
require 'aws/s3'

# Create a tar file with the wiki data files.
wiki_data   = '/path/to/wiki/data'
target_dirs = ['attic', 'media', 'meta', 'pages']
tar_dirs    = target_dirs.map { |dir| "#{wiki_data}/#{dir}" }.join(' ')

weekday = Time.now.wday
backup_filename = "/path/to/backup/wiki-#{weekday}.tar"
`tar -cvf #{backup_filename} #{tar_dirs}`
`gzip #{backup_filename}`
backup_filename += '.gz'

# On the first of the month, also archive a permanent backup.
permanent_backup = nil
if Time.now.day == 1  # Hardwired but what the hey...
  timestamp = Time.now.strftime("%Y-%m-%d")
  permanent_backup = "wiki-#{timestamp}.tar.gz"
end

# Put the backup file in the S3 bucket.
AWS::S3::DEFAULT_HOST.replace('...put your bucket region here...')
AWS::S3::Base.establish_connection!(
  :access_key_id     => '...put your access key here...',
  :secret_access_key => '...put your secret access key here...'
)
bucket_name = '...put your bucket name for wiki backups here...'
begin
  AWS::S3::Bucket.find(bucket_name)
  AWS::S3::S3Object.store(
    File.basename(backup_filename),
    open(backup_filename),
    bucket_name
  )
  puts "#{backup_filename} was successfully backed up to Amazon S3"
  if permanent_backup
    AWS::S3::S3Object.store(
      permanent_backup,
      open(backup_filename),
      bucket_name
    )
    puts "#{permanent_backup} (monthly archive) was successfully backed up to Amazon S3"
  end
rescue => e
  puts "Unable to back up file to S3: #{e.message}"
end
</code>
==== Python ====
Here is a similar script for Python 2.7. In addition to the relevant ''data'' directories, it also backs up the ''conf'' directory. Unlike the Ruby script, it doesn't make monthly permanent backups, though you may not need them since DokuWiki keeps an unlimited revision history.
<code python>
#!/usr/bin/python
import boto
import subprocess
import datetime
import os

WIKI_PATH = '/path/to/wiki'
BACKUP_PATH = '/path/to/backup/to'
AWS_ACCESS_KEY = 'access key'
AWS_SECRET_KEY = 'secret key'
BUCKET_NAME = 'bucket name'
BUCKET_KEY_PREFIX = 'dokuwiki/'
TARGET_DIRS = ['conf', 'data/attic', 'data/media', 'data/meta', 'data/pages']

# Tar and gzip the target directories into a weekday-stamped file.
dirs = [WIKI_PATH + '/' + d for d in TARGET_DIRS]
weekday = datetime.datetime.now().strftime('%a')
filename = '{}/wiki-{}.tar'.format(BACKUP_PATH, weekday)
subprocess.call(['tar', '-cvf', filename] + dirs)
subprocess.call(['gzip', '-f', filename])
filename += '.gz'

# Upload the compressed archive to S3.
s3 = boto.connect_s3(AWS_ACCESS_KEY, AWS_SECRET_KEY)
bucket = s3.get_bucket(BUCKET_NAME)
k = bucket.new_key(BUCKET_KEY_PREFIX + os.path.basename(filename))
k.set_contents_from_filename(filename)
</code>
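Both scripts shell out to ''tar'' and ''gzip''; the same archive can be built with Python's standard ''tarfile'' module, which avoids quoting problems with unusual path names and removes the dependency on external binaries. A sketch in Python 3 (the function name is ours, not part of the scripts):

```python
import os
import tarfile

def make_backup(src_dirs, out_path):
    """Create a gzip-compressed tarball of src_dirs at out_path."""
    with tarfile.open(out_path, 'w:gz') as tar:
        for d in src_dirs:
            # arcname stores relative paths in the archive instead of
            # embedding the absolute /path/to/wiki prefix.
            tar.add(d, arcname=os.path.basename(d))
    return out_path
```

A single call such as ''make_backup(dirs, filename + '.gz')'' would replace both ''subprocess.call'' lines above.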
The same script, updated to use Boto3:
<code python>
#!/usr/bin/python
import boto3
import botocore
import subprocess
import datetime
import os

WIKI_PATH = '/path/to/wiki'
BACKUP_PATH = '/path/to/backup/to'
AWS_ACCESS_KEY = 'access key'
AWS_SECRET_KEY = 'secret key'
BUCKET_NAME = 'bucket name'
BUCKET_KEY_PREFIX = 'dokuwiki/'
TARGET_DIRS = ['conf', 'data/attic', 'data/media', 'data/meta', 'data/pages']

# Tar and gzip the target directories into a weekday-stamped file.
dirs = [WIKI_PATH + '/' + d for d in TARGET_DIRS]
weekday = datetime.datetime.now().strftime('%a')
filename = '{}/wiki-{}.tar'.format(BACKUP_PATH, weekday)
subprocess.call(['tar', '-cvf', filename] + dirs)
subprocess.call(['gzip', '-f', filename])
filename += '.gz'

# Upload the compressed archive to S3.
s3 = boto3.resource(
    's3',
    aws_access_key_id=AWS_ACCESS_KEY,
    aws_secret_access_key=AWS_SECRET_KEY,
)
print filename
print os.path.basename(filename)
try:
    s3.Object(BUCKET_NAME, BUCKET_KEY_PREFIX + os.path.basename(filename)).put(
        Body=open(filename, 'rb'))
except botocore.exceptions.ClientError as e:
    # A 404 error means the bucket does not exist.
    error_code = int(e.response['Error']['Code'])
    if error_code == 404:
        print 'Bucket {} does not exist'.format(BUCKET_NAME)
    else:
        raise
</code>
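To restore, download the appropriate tarball from S3 (for example with ''aws s3 cp'') and unpack it over the wiki directory. The unpack step could look like this in Python 3 (a sketch; the function name and paths are illustrative):

```python
import tarfile

def restore_backup(archive_path, dest_dir):
    """Unpack a wiki backup tarball into dest_dir."""
    with tarfile.open(archive_path, 'r:gz') as tar:
        # Note: extractall trusts the member paths in the archive,
        # so only use it on backups you created yourself.
        tar.extractall(dest_dir)
```

Since ''tar'' strips the leading ''/'' when archiving absolute paths, the members unpack as relative paths; extract into the right parent directory (or move the files afterwards) to put them back under the wiki root.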