Maintenance
Here are some tips to automate some of the day-to-day maintenance needed or recommended for DokuWiki.
See also the plugins: cleanup and clearhistory
Keep Blacklist up to date
See blacklist on how to set up a cronjob to keep the Anti-Spam Blacklist current.
Automatic cleanup script
It is recommended to set up a cleanup process for busy DokuWikis. The following Bash script serves as an example. It deletes old revisions from the attic, removes stale lock files and empty directories, and cleans up the cache.
- cleanup.sh
#!/bin/bash

cleanup() {
    local data_path="$1"        # full path to data directory of wiki
    local retention_days="$2"   # number of days after which old files are to be removed

    # purge files older than ${retention_days} days from attic and media_attic (old revisions)
    find "${data_path}"/{media_,}attic/ -type f -not -name _dummy -mtime +"${retention_days}" -delete

    # remove stale lock files (files which are 1-2 days old)
    find "${data_path}"/locks/ -name '*.lock' -type f -mtime +1 -delete

    # remove empty directories
    find "${data_path}"/{attic,cache,index,locks,media,media_attic,media_meta,meta,pages,tmp}/ \
        -mindepth 1 -type d -empty -delete

    # remove files older than ${retention_days} days from the cache
    # (only stderr is discarded here; discarding stdout as well would leave
    # the test string empty and the cache would never be cleaned)
    if test -n "$(find "${data_path}"/cache/?/ -maxdepth 1 -print -quit 2> /dev/null)"
    then
        find "${data_path}"/cache/?/ -type f -not -name _dummy -mtime +"${retention_days}" -delete
    fi
}

# cleanup DokuWiki installations (path to datadir, number of days)
# some examples:

cleanup /home/user1/htdocs/doku/data    256
cleanup /home/user2/htdocs/mywiki/data  180
cleanup /var/www/superwiki/data         180
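To run it automatically, set up a cronjob. The following example calls the script every day at 7 minutes past midnight. The user field (root) belongs to the system crontab format (/etc/crontab or /etc/cron.d); to run the script from a non-root user's own crontab, remove root.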
7 0 * * * root /full/path/to/cleanup.sh
Be sure to set everything up correctly - you don't want to delete the wrong things, do you?
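One way to check the setup before anything is deleted is a dry run that prints the candidates instead of removing them. The following is a minimal sketch, assuming the same directory layout as cleanup.sh above; the function name cleanup_preview is made up for this example and is not part of DokuWiki.

```shell
#!/bin/bash
# Dry-run sketch: print what a cleanup would delete, delete nothing.
# cleanup_preview is a hypothetical name used only for illustration.
cleanup_preview() {
    local data_path="$1"        # full path to the wiki's data directory
    local retention_days="$2"   # age threshold in days

    # old revisions in media_attic and attic
    find "${data_path}"/{media_,}attic/ -type f -not -name _dummy \
        -mtime +"${retention_days}" -print 2> /dev/null

    # stale lock files
    find "${data_path}"/locks/ -name '*.lock' -type f -mtime +1 -print 2> /dev/null

    # old cache files
    find "${data_path}"/cache/ -type f -not -name _dummy \
        -mtime +"${retention_days}" -print 2> /dev/null

    return 0
}
```

Calling cleanup_preview /var/www/superwiki/data 180 lists the files that a real run with the same arguments would be expected to remove.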
Windows -- waRmZip
A script for cleaning out old files on Windows systems is waRmZip, available on SourceForge. Write a batch file to call it, and schedule it to run every day. And as the man says: 'Be sure to set everything up correctly'
I took the above suggestion to use waRmZip and wrote this batch file - maybe it will help out.
My favorite way to run cron jobs on Windows is PyCron.
- dw-cleanup.bat
@echo off
set waRmZip="c:\Program Files\waRmZip\waRmZip.wsf"
set wikiHome="c:\path\to\htdocs\wiki\data"

rem Move attic files older than 30 days to an archive location
%waRmZip% %wikiHome%\attic /ma:30 /md:%wikiHome%_archive\attic /r /q

rem Option: delete attic files older than 30 days
rem %waRmZip% %wikiHome%\attic /da:30 /dc /r /q

rem Delete empty attic directories; waRmZip requires the /da flag when using
rem /df, so add filter for *.zzz so /da doesn't remove any files
%waRmZip% %wikiHome%\attic /r /da:31 /df /fo:*.zzz /q

rem Remove stale lock files
%waRmZip% %wikiHome%\locks /da:1 /fo:*.lock /r /q

rem Remove empty directories
%waRmZip% %wikiHome%\pages /da:365 /df /fo:*.zzz /r /q
Windows -- batch script
This is another Windows command shell script for maintaining a DokuWiki installation in a Windows environment. The script uses the free and open source GNU utility find, which can be obtained via http://gnuwin32.sourceforge.net/
All paths are read from the DokuWiki config file. Files to be deleted can be shown before deletion, to prevent accidental deletion of files.
- maintain_dokuwiki.cmd
@echo off
setlocal

REM This script performs some basic DokuWiki maintenance
REM Copyright (C) 2012 Peter Mosmans

REM This program is free software: you can redistribute it and/or modify
REM it under the terms of the GNU General Public License as published by
REM the Free Software Foundation, either version 3 of the License, or
REM (at your option) any later version.

REM This program is distributed in the hope that it will be useful,
REM but WITHOUT ANY WARRANTY; without even the implied warranty of
REM MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
REM GNU General Public License for more details.

REM You should have received a copy of the GNU General Public License
REM along with this program. If not, see <http://www.gnu.org/licenses/>.

REM Please contact support AT go-forward.net for questions and/or feedback

REM Last modification: 02-05-2012 (Peter Mosmans)

set NAME=maintain_dokuwiki
set VERSION=0.13

REM path to the dokuwiki configuration file enclosed in double quotes
set DOKUWIKICONFIG="\full\filename\of\your\dokuwiki\conf\local.php"

REM preserve all files that are younger than DAYSTOKEEP days
set DAYSTOKEEP=31

REM set to true if you want to show results and pause before deleting any files
set SHOWRESULTSFIRST=true

set FIND=c:\tools\find.exe
set TEMPFILE=%TMP%\%NAME%.tmp

REM see if all tools are present
for %%i in (%FIND%) do (
  if not exist %%i (
    echo sorry, could not find %%i - exiting
    echo you can obtain the free GNU tools from gnuwin32.sourceforge.net
    exit /b
  )
)

REM see if the dokuwiki configuration file can be read
if not exist %DOKUWIKICONFIG% (
  echo sorry, could not find DokuWiki config at %DOKUWIKICONFIG% - exiting
  exit /b
)

REM grab the correct paths from the configuration file
for /f "usebackq delims=' tokens=2,4" %%i in (%DOKUWIKICONFIG%) do (
  if /i "%%i"=="datadir" set DOCUMENTROOT=%%j
  if /i "%%i"=="olddir" set ATTICDIR=%%j
  if /i "%%i"=="cachedir" set CACHEDIR=%%j
  if /i "%%i"=="lockdir" set LOCKDIR=%%j
)

if "%DOCUMENTROOT%" == "" (
  echo sorry, could not find datadir variable in %DOKUWIKICONFIG%, exiting...
  exit /b
)

REM use defaults if the paths are not specified
if /i "%ATTICDIR%" == "" set ATTICDIR=%DOCUMENTROOT%/attic
if /i "%LOCKDIR%" == "" set LOCKDIR=%DOCUMENTROOT%/lock
if /i "%CACHEDIR%" == "" set CACHEDIR=%DOCUMENTROOT%/cache

REM purge files older than DAYSTOKEEP days from the attic
%FIND% "%ATTICDIR%" -type f -mtime +%DAYSTOKEEP% -print > %TEMPFILE%

REM remove locks older than one day
%FIND% "%LOCKDIR%" -name "*.lock" -type f -mtime +1 -print >> %TEMPFILE%

REM remove cache files older than DAYSTOKEEP
%FIND% "%CACHEDIR%" -type f -mtime +%DAYSTOKEEP% -print >> %TEMPFILE%

REM show results, if any
for /f "usebackq" %%i in (`%FIND% "%TMP%" -size +1 -name %NAME%.tmp`) do (
  if /i "%SHOWRESULTSFIRST%"=="TRUE" (
    echo files to be deleted:
    type %TEMPFILE%
    pause
  )
  for /f "delims=#" %%i in (%TEMPFILE%) do del "%%i"
)

REM clean up
del /f /q %TEMPFILE%

endlocal
Keeping Playground Clean
To keep the wiki's Playground and other pages clean, use a cron job that runs, for example, every 30 minutes and restores the Playground and other pages to their original content.
Example: Restore Playground every 30 min:
0,30 * * * * cp -f /path/to/savedwiki/data/pages/playground/playground.txt /path/to/dokuwiki/data/pages/playground/
Example: Restore all pages in namespace “wiki” every 30 min:
0,30 * * * * cp -rf /path/to/savedwiki/data/pages/wiki/. /path/to/dokuwiki/data/pages/wiki/
Problems with CAPTCHA plugin
Combining the CAPTCHA plugin with the recommended method of keeping the playground clean can leave the playground uneditable.
When this occurs, the problem can be resolved by removing the related playground files in the meta folder with the following cronjob.
Example: Delete Playground metafiles every 30 min:
0,30 * * * * rm -f /path/to/dokuwiki/data/meta/playground/playground.*
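The restore and the metadata removal can also be combined into one script that cron calls. This is a minimal sketch, assuming the saved copy and the live wiki both use the standard pages/ and meta/ layout shown in the examples above; restore_page is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: restore one page from a saved copy and drop its metadata.
# restore_page is a hypothetical name; both paths are data directories.
restore_page() {
    local saved_data="$1"   # e.g. /path/to/savedwiki/data
    local live_data="$2"    # e.g. /path/to/dokuwiki/data
    local page="$3"         # path relative to pages/, e.g. playground/playground

    # copy the pristine page text over the live one
    cp -f "${saved_data}/pages/${page}.txt" "${live_data}/pages/${page}.txt" || return 1

    # remove the page's meta files (page.meta, page.changes, ...)
    rm -f "${live_data}/meta/${page}".*
}
```

A call such as restore_page /path/to/savedwiki/data /path/to/dokuwiki/data playground/playground then does the work of both cron lines in one step.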
When cronjob is not available
If your hosting doesn't allow cronjobs, consider using the cronojob plugin instead.
Discussion
Could you please provide PHP versions of these scripts to use with the cronojob plugin?
Regarding the above cleanup script which uses file modification time (mtime), wouldn't it be safer to use the timestamp in the filename to determine if a file in the attic should be deleted or not?
On the one hand, I'd say it could be done but it's of course trickier to set up. For many installations it will be fine to use mtime. On the other hand, some might want to make sure they clean up old files no matter what (e.g. files left after a crash or critical PHP error).
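For illustration, a filename-based check could look like the sketch below. It assumes page revisions in the attic are named <page>.<unix timestamp>.txt, optionally followed by .gz or .bz2; media_attic files follow a different pattern and are not handled. The function name attic_older_than is made up for this example, and it only prints candidates rather than deleting them.

```shell
#!/bin/bash
# Sketch: list attic revisions whose *filename* timestamp is older than N days.
# Assumes names like <page>.<unix timestamp>.txt[.gz|.bz2].
# attic_older_than is a hypothetical helper name.
attic_older_than() {
    local attic_path="$1"       # e.g. /path/to/dokuwiki/data/attic
    local retention_days="$2"   # age threshold in days
    local cutoff=$(( $(date +%s) - retention_days * 86400 ))
    local file name stamp

    find "${attic_path}" -type f -not -name _dummy -print0 |
    while IFS= read -r -d '' file; do
        name="${file##*/}"
        # pull the numeric revision out of "<page>.<timestamp>.txt..."
        if [[ "${name}" =~ \.([0-9]+)\.txt(\.gz|\.bz2)?$ ]]; then
            stamp="${BASH_REMATCH[1]}"
            # 10# forces base 10 in case of leading zeros
            if (( 10#${stamp} < cutoff )); then
                printf '%s\n' "${file}"
            fi
        fi
    done
}
```

Files that do not match the naming pattern are skipped, which is the trade-off mentioned above: leftovers from a crash would need the mtime-based cleanup to be caught.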
Could someone add the appropriate line for cache maintenance to the Windows waRmZip script?
Does the cleanup Plugin handle all the above tasks? Would it be recommended over running these scripts?
This is an example of a PHP script to clean old cache files. It is useful when a .sh script cannot be run.
- cleanup.php
<?php
/*
 * mrlemonade ~
 */

function getFilesFromDir($dir) {
    $files = array();
    if ($handle = opendir($dir)) {
        while (false !== ($file = readdir($handle))) {
            if ($file != "." && $file != "..") {
                if (is_dir($dir.'/'.$file)) {
                    $dir2 = $dir.'/'.$file;
                    $files[] = getFilesFromDir($dir2);
                } else {
                    $files[] = $dir.'/'.$file;
                }
            }
        }
        closedir($handle);
    }
    return array_flat($files);
}

function array_flat($array) {
    // initialise the result so array_merge() and the return value are always arrays
    $tmp = array();
    foreach ($array as $a) {
        if (is_array($a)) {
            $tmp = array_merge($tmp, array_flat($a));
        } else {
            $tmp[] = $a;
        }
    }
    return $tmp;
}

// Define the folder to clean
$captchaFolder = 'data/cache';

// Here you can define after how many
// days the files should get deleted
$expire_time = 5;

// Walk all files below the folder
foreach (getFilesFromDir($captchaFolder) as $Filename) {

    // Read the file's change time (on Unix, filectime() returns the
    // inode change time, not the creation time)
    $FileCreationTime = filectime($Filename);

    // Calculate file age in seconds
    $FileAge = time() - $FileCreationTime;

    // Is the file older than the given time span?
    if ($FileAge > ($expire_time*60*60*24)) {

        // Now do something with the older files...
        print "The file $Filename is older than $expire_time days \n";

        // For example deleting files:
        // unlink($Filename);
    }
}
echo 'ran';
?>
use this at your own risk. — S.C. Yoo 2012/02/10 12:49
Cheers, I'd like to add that it is a good idea to clean up orphaned meta data, don't you think? I do the following (in an R script):
- list all files in the pages directory recursively
- add a column 'pagename' to this list that contains the file name again but without the base directory
- in pagename exchange '/' (or '\') with ':' and remove the file extension
- do the same for the meta directory + exclude some additional files
- remove all entries from the meta-list for which the page name is in the pages-list
- delete all files left in the meta list
Of course one could add a time constraint so that metadata is not deleted immediately.
Clemo 2016/09/23 sometime
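The steps above can be sketched in Bash as well. This is only an illustration, assuming pages live in <data>/pages/<ns>/<page>.txt and metadata in <data>/meta/<ns>/<page>.<ext>; files whose names start with an underscore (such as _dokuwiki.changes) are treated as the "additional files" to exclude. The name list_orphaned_meta is hypothetical, and the function prints candidates instead of deleting them.

```shell
#!/bin/bash
# Sketch: list meta files that have no matching page file.
# list_orphaned_meta is a hypothetical name used only for illustration.
list_orphaned_meta() {
    local data_path="$1"            # full path to the wiki's data directory
    local min_age_days="${2:-0}"    # optional: only report older meta files
    local age_filter=()
    local file rel page

    # apply the time constraint only when one was requested
    if (( min_age_days > 0 )); then
        age_filter=(-mtime +"${min_age_days}")
    fi

    # skip wiki-wide files such as _dokuwiki.changes
    find "${data_path}/meta" -type f -not -name '_*' "${age_filter[@]}" -print0 |
    while IFS= read -r -d '' file; do
        rel="${file#"${data_path}/meta/"}"   # ns/page.meta
        page="${rel%.*}"                     # ns/page
        if [[ ! -f "${data_path}/pages/${page}.txt" ]]; then
            printf '%s\n' "${file}"
        fi
    done
}
```

Reviewing the printed list before deleting anything is advisable, since plugins may store meta files that do not follow the one-extension-per-page pattern assumed here.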