Backup to S3

Ruby

This is a small Ruby script, intended to be symlinked from your cron.daily directory. It does the following:

Creates a tarball of the wiki data directories (attic, media, meta, and pages)
Uploads the compressed tarball to your Amazon S3 account

The script keeps one backup for each day of the week, overwriting the file from seven days earlier, and also stores a permanent backup on the first day of each month.

To use this script, you must:

Open an Amazon S3 account
Get the access keys and put them into the script at the marked placeholders
Have Ruby and the AWS::S3 module (the aws-s3 gem) installed

– bk

#!/usr/bin/env ruby
 
require 'rubygems'
require 'aws/s3'
 
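# Create a tar file with the wiki data files.
wiki_data = '/path/to/wiki/data'
target_dirs = ['attic', 'media', 'meta', 'pages']
tar_dirs = target_dirs.map { |dir| File.join(wiki_data, dir) }
# Name the file after the day of the week (0 = Sunday), so each
# backup overwrites the one made seven days earlier.
weekday = Time.now.wday
backup_filename = "/path/to/backup/wiki-#{weekday}.tar"
system('tar', '-cvf', backup_filename, *tar_dirs) or abort 'tar failed'
# -f lets gzip overwrite last week's .gz instead of refusing to replace it.
system('gzip', '-f', backup_filename) or abort 'gzip failed'
backup_filename += '.gz'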
 
# On the first day of each month, also archive a permanent backup.
permanent_backup = nil
if Time.now.day == 1   # Hardwired but what the hey...
  timestamp = Time.now.strftime("%Y-%m-%d")
  permanent_backup = "wiki-#{timestamp}.tar.gz"
end
 
# Put the backup file in the S3 bucket under backups.
AWS::S3::DEFAULT_HOST.replace('...put your bucket region here...')
AWS::S3::Base.establish_connection!(
  :access_key_id         => '...put your access key here...',
  :secret_access_key     => '...put your secret access key here...'
)
bucket_name = '...put your bucket name for wiki backups here...'
begin
  AWS::S3::Bucket.find( bucket_name )
  AWS::S3::S3Object.store(
    File.basename(backup_filename),
    open(backup_filename),
    bucket_name
  )
  puts "#{backup_filename} was successfully backed up to Amazon S3"
  if permanent_backup
    AWS::S3::S3Object.store(
      permanent_backup,
      open(backup_filename),
      bucket_name
    )
    puts "#{permanent_backup} (monthly archive) was successfully backed up to Amazon S3"
  end
 
rescue => e
  puts "Unable to back up file to S3: #{e.message}"
end
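
The rotation the script implements (seven weekday slots plus a dated copy on the first of the month) can be looked at separately from the S3 calls. The following is a minimal sketch in Python, the language of the scripts further down; `backup_keys` is a hypothetical helper name and is not part of either script.

```python
import datetime


def backup_keys(today):
    """Return the object names a backup made on `today` would be stored under.

    Mirrors the Ruby script: a weekday slot (0 = Sunday) that is reused
    every seven days, plus a dated name on the first day of the month.
    """
    # strftime('%w') numbers weekdays 0-6 starting at Sunday, like Ruby's Time#wday.
    keys = ['wiki-{}.tar.gz'.format(today.strftime('%w'))]
    if today.day == 1:
        keys.append('wiki-{}.tar.gz'.format(today.strftime('%Y-%m-%d')))
    return keys
```

Calling `backup_keys(datetime.date.today())` gives one name on most days and two on the first of the month, which is when the same archive is uploaded twice.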

Python

Here is a similar script for Python 2.7. In addition to the relevant data directories, it backs up the conf directory. Unlike the Ruby script, it doesn't support monthly permanent backups, although you may find that you don't need them, since DokuWiki keeps an unlimited revision history.

#!/usr/bin/python
import boto
import subprocess
import datetime
import os
 
WIKI_PATH = '/path/to/wiki'
BACKUP_PATH = '/path/to/backup/to'
AWS_ACCESS_KEY = 'access key'
AWS_SECRET_KEY = 'secret key'
BUCKET_NAME = 'bucket name'
BUCKET_KEY_PREFIX = 'dokuwiki/'
 
TARGET_DIRS = ['conf', 'data/attic', 'data/media', 'data/meta', 'data/pages']
 
dirs = [WIKI_PATH + '/' + d for d in TARGET_DIRS]
weekday = datetime.datetime.now().strftime('%a')
filename = '{}/wiki-{}.tar'.format(BACKUP_PATH, weekday)
# check_call raises if tar or gzip fails, so a broken archive is never uploaded.
subprocess.check_call(['tar', '-cvf', filename] + dirs)
subprocess.check_call(['gzip', '-f', filename])
filename += '.gz'
 
s3 = boto.connect_s3(AWS_ACCESS_KEY, AWS_SECRET_KEY)
bucket = s3.get_bucket(BUCKET_NAME)
k = bucket.new_key(BUCKET_KEY_PREFIX + os.path.basename(filename))
k.set_contents_from_filename(filename)
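
The script shells out to tar and gzip. As an alternative, the same kind of archive can probably be produced with the standard library's tarfile module alone. This is a sketch under assumptions, not a drop-in replacement: `make_backup` is a hypothetical name, and unlike the tar call above it stores paths relative to the wiki root rather than the full path.

```python
import os
import tarfile


def make_backup(wiki_path, target_dirs, archive_path):
    """Write the given subdirectories of wiki_path into a gzipped tarball."""
    # Mode 'w:gz' writes a gzip-compressed tar and replaces any existing file.
    with tarfile.open(archive_path, 'w:gz') as tar:
        for rel in target_dirs:
            # arcname keeps the paths inside the archive relative to the wiki root.
            tar.add(os.path.join(wiki_path, rel), arcname=rel)
    return archive_path
```

With the constants from the script, this would be called as `make_backup(WIKI_PATH, TARGET_DIRS, filename + '.gz')` in place of the two subprocess calls.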

Python script updated to use Boto3

#!/usr/bin/python
import boto3
import botocore
import subprocess
import datetime
import os
 
WIKI_PATH = '/path/to/wiki'
BACKUP_PATH = '/path/to/backup/to'
AWS_ACCESS_KEY = 'access key'
AWS_SECRET_KEY = 'secret key'
BUCKET_NAME = 'bucket name'
BUCKET_KEY_PREFIX = 'dokuwiki/'
 
TARGET_DIRS = ['conf', 'data/attic', 'data/media', 'data/meta', 'data/pages']
 
dirs = [WIKI_PATH + '/' + d for d in TARGET_DIRS]
weekday = datetime.datetime.now().strftime('%a')
filename = '{}/wiki-{}.tar'.format(BACKUP_PATH, weekday)
# check_call raises if tar or gzip fails, so a broken archive is never uploaded.
subprocess.check_call(['tar', '-cvf', filename] + dirs)
subprocess.check_call(['gzip', '-f', filename])
filename += '.gz'
 
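# Pass the keys defined above explicitly; without them boto3 falls back to
# its default credential chain (environment, ~/.aws/credentials, ...).
s3 = boto3.resource(
    's3',
    aws_access_key_id=AWS_ACCESS_KEY,
    aws_secret_access_key=AWS_SECRET_KEY,
)
key = BUCKET_KEY_PREFIX + os.path.basename(filename)

print(filename)
print(os.path.basename(filename))

try:
    with open(filename, 'rb') as body:
        s3.Object(BUCKET_NAME, key).put(Body=body)
except botocore.exceptions.ClientError as e:
    # S3 error codes are often strings such as 'NoSuchBucket' or
    # 'AccessDenied', so compare them as text instead of converting to int.
    error_code = e.response['Error']['Code']
    if error_code in ('404', 'NoSuchBucket'):
        print('Bucket {} does not exist'.format(BUCKET_NAME))
    else:
        raise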
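
Both Python scripts keep the access keys in the source file. One possible alternative, sketched here as an assumption rather than a recommendation from the original author, is to read them from the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, which are also the names boto3 looks for when no keys are passed. `load_aws_keys` is a hypothetical helper name.

```python
import os


def load_aws_keys(environ=os.environ):
    """Return (access_key, secret_key) read from the environment.

    Raises RuntimeError naming the first variable that is missing.
    """
    try:
        return environ['AWS_ACCESS_KEY_ID'], environ['AWS_SECRET_ACCESS_KEY']
    except KeyError as missing:
        raise RuntimeError('missing environment variable {}'.format(missing))
```

Note that cron runs jobs with a minimal environment, so the variables would need to be set wherever the job is launched.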
