====== Backup to S3 ======

==== Ruby ====

This is a small Ruby script that you should symbolically link to from your cron.daily directory. It does the following:

  * Creates a tarball of several data directories, including meta and attic
  * Uploads the tar file to your Amazon S3 account

The script keeps a backup for each day of the past week and also archives permanent monthly backups. To use this script, you must:

  * Open an [[http://aws.amazon.com/s3|Amazon S3]] account
  * Get the access keys and put them into the script at the appropriate place
  * Have Ruby and the [[http://amazon.rubyforge.org/|AWS::S3]] gem installed

 -- [[http://www.billkatz.com|bk]]

<code ruby>
#!/usr/bin/env ruby
require 'rubygems'
require 'aws/s3'

# Create a tar file with the wiki data files.
wiki_data = '/path/to/wiki/data'
target_dirs = ['attic', 'media', 'meta', 'pages']
tar_dirs = ''
target_dirs.each do |dir|
  tar_dirs += wiki_data + '/' + dir + ' '
end

# Keep one rotating backup per weekday (0-6).
weekday = Time.now.wday
backup_filename = "/path/to/backup/wiki-#{weekday}.tar"
`tar -cvf #{backup_filename} #{tar_dirs}`
`gzip #{backup_filename}`
backup_filename += '.gz'

# If we are on a monthly anniversary, archive a permanent backup.
permanent_backup = nil
if Time.now.day == 1
  # Hardwired but what the hey...
  timestamp = Time.now.strftime("%Y-%m-%d")
  permanent_backup = "wiki-#{timestamp}.tar.gz"
end

# Put the backup file in the S3 bucket under backups.
AWS::S3::DEFAULT_HOST.replace('...put your bucket region here...')
AWS::S3::Base.establish_connection!(
  :access_key_id     => '...put your access key here...',
  :secret_access_key => '...put your secret access key here...'
)
bucket_name = '...put your bucket name for wiki backups here...'

begin
  AWS::S3::Bucket.find(bucket_name)
  AWS::S3::S3Object.store(File.basename(backup_filename), open(backup_filename), bucket_name)
  puts "#{backup_filename} was successfully backed up to Amazon S3"
  if permanent_backup
    AWS::S3::S3Object.store(permanent_backup, open(backup_filename), bucket_name)
    puts "#{permanent_backup} (monthly archive) was successfully backed up to Amazon S3"
  end
rescue
  puts "Unable to backup file to S3"
end
</code>

==== Python ====

Here is a similar script for Python 2.7. In addition to the relevant ''data'' directories, it backs up the ''conf'' directory. Unlike the Ruby script, it doesn't archive monthly permanent backups (a sketch for adding them follows the Boto3 version below), although you may find you don't need them since DokuWiki keeps an unlimited revision history.
<code python>
#!/usr/bin/python
import boto
import subprocess
import datetime
import os

WIKI_PATH = '/path/to/wiki'
BACKUP_PATH = '/path/to/backup/to'
AWS_ACCESS_KEY = 'access key'
AWS_SECRET_KEY = 'secret key'
BUCKET_NAME = 'bucket name'
BUCKET_KEY_PREFIX = 'dokuwiki/'
TARGET_DIRS = ['conf', 'data/attic', 'data/media', 'data/meta', 'data/pages']

# Build the tarball; one rotating backup per weekday (Mon, Tue, ...).
dirs = [WIKI_PATH + '/' + d for d in TARGET_DIRS]
weekday = datetime.datetime.now().strftime('%a')
filename = '{}/wiki-{}.tar'.format(BACKUP_PATH, weekday)
subprocess.call(['tar', '-cvf', filename] + dirs)
subprocess.call(['gzip', '-f', filename])
filename += '.gz'

# Upload the compressed tarball to S3.
s3 = boto.connect_s3(AWS_ACCESS_KEY, AWS_SECRET_KEY)
bucket = s3.get_bucket(BUCKET_NAME)
k = bucket.new_key(BUCKET_KEY_PREFIX + os.path.basename(filename))
k.set_contents_from_filename(filename)
</code>

The same script, updated to use Boto3:

<code python>
#!/usr/bin/python
import boto3
import botocore
import subprocess
import datetime
import os

WIKI_PATH = '/path/to/wiki'
BACKUP_PATH = '/path/to/backup/to'
AWS_ACCESS_KEY = 'access key'
AWS_SECRET_KEY = 'secret key'
BUCKET_NAME = 'bucket name'
BUCKET_KEY_PREFIX = 'dokuwiki/'
TARGET_DIRS = ['conf', 'data/attic', 'data/media', 'data/meta', 'data/pages']

# Build the tarball; one rotating backup per weekday (Mon, Tue, ...).
dirs = [WIKI_PATH + '/' + d for d in TARGET_DIRS]
weekday = datetime.datetime.now().strftime('%a')
filename = '{}/wiki-{}.tar'.format(BACKUP_PATH, weekday)
subprocess.call(['tar', '-cvf', filename] + dirs)
subprocess.call(['gzip', '-f', filename])
filename += '.gz'

# Upload the compressed tarball to S3.
s3 = boto3.resource('s3',
                    aws_access_key_id=AWS_ACCESS_KEY,
                    aws_secret_access_key=AWS_SECRET_KEY)
key = BUCKET_KEY_PREFIX + os.path.basename(filename)
try:
    s3.Object(BUCKET_NAME, key).put(Body=open(filename, 'rb'))
    print '{} was successfully backed up to Amazon S3'.format(filename)
except botocore.exceptions.ClientError as e:
    # If a client error is thrown, check whether it was a 404 error.
    # A 404 means the bucket does not exist.
    error_code = int(e.response['Error']['Code'])
    if error_code == 404:
        print 'Bucket {} does not exist'.format(BUCKET_NAME)
    else:
        raise
</code>
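The Ruby script above also stores a permanent, date-stamped copy of the backup on the first day of each month. If you want the same behaviour from the Boto3 script, something along the lines of the following untested sketch can be appended after the upload step; it reuses the ''filename'', ''s3'', ''BUCKET_NAME'' and ''BUCKET_KEY_PREFIX'' variables already defined there.

<code python>
# Sketch only: archive a permanent monthly copy, mirroring the Ruby script.
# Append after the weekly upload; reuses filename, s3, BUCKET_NAME and
# BUCKET_KEY_PREFIX from the Boto3 script above.
if datetime.datetime.now().day == 1:
    timestamp = datetime.datetime.now().strftime('%Y-%m-%d')
    monthly_key = '{}wiki-{}.tar.gz'.format(BUCKET_KEY_PREFIX, timestamp)
    s3.Object(BUCKET_NAME, monthly_key).put(Body=open(filename, 'rb'))
    print '{} (monthly archive) was successfully backed up to Amazon S3'.format(monthly_key)
</code>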