Romanize filenames
Keywords: UTF-8, romanize, cyrillic, latin, convert, filename
When upgrading from a previous version that did not yet have the “romanize” function, you will find a completely unreadable directory structure.
For example: %D0%BA%D1%8B%D1%80%D0%B3%D1%8B%D0%B7%D1%81%D1%82%D0%B0%D0%BD.txt is the same as кыргызстан.txt
This is because UTF-8 filenames have been urlencoded.
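The correspondence between the two names can be checked with PHP's built-in urlencode() and urldecode(). This is only a minimal sketch using the example filename above; the variable names are made up for illustration.

```php
<?php
// The readable UTF-8 name and its urlencoded form, as in the example above.
// This file must itself be saved as UTF-8.
$readable = 'кыргызстан.txt';
$encoded  = '%D0%BA%D1%8B%D1%80%D0%B3%D1%8B%D0%B7%D1%81%D1%82%D0%B0%D0%BD.txt';

// urldecode() turns each %XX escape back into the byte it stands for.
$decoded = urldecode($encoded);

// urlencode() goes the other way: every byte outside A-Z, a-z, 0-9, "-", "_"
// and "." becomes %XX (a space becomes "+").
$reencoded = urlencode($readable);

echo $decoded, "\n";
echo $reencoded, "\n";
```

Each Cyrillic letter takes two bytes in UTF-8, which is why every letter shows up as two %XX groups.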
Later versions added the “romanization” option to circumvent this problem.
The script below will convert this unreadable directory structure to “romanized” filenames.
You will have to include the utf8.php file, which is part of the dokuwiki installation (in the inc directory).
Note: this script is not error-free. For example, some Cyrillic characters can end your filename with “'”, because utf8.php transliterates 'ъ' as “'”. The cleanID() function below replaces each run of “'” with “_”.
Please check your page structure after conversion for invalid filenames.
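One way to do that check is to walk the converted directory and list every name that contains unexpected characters. The sketch below is not part of dokuwiki: isSuspectName() and findSuspectNames() are hypothetical helpers, and the rule that a clean name consists only of a-z, 0-9, “_”, “-” and “.” is an assumption you may want to adjust.

```php
<?php
// Hypothetical helper, not part of dokuwiki: returns true when a name contains
// anything outside a-z, 0-9, "_", "-" and "." (an assumed definition of "clean").
function isSuspectName($name)
{
    return preg_match('/\A[a-z0-9_.\-]+\z/', $name) !== 1;
}

// Hypothetical helper: walk $root recursively and collect the paths of all
// files and directories whose own name looks suspect.
function findSuspectNames($root)
{
    $suspect = array();
    $items = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS),
        RecursiveIteratorIterator::SELF_FIRST
    );
    foreach ($items as $item) {
        if (isSuspectName($item->getFilename())) {
            $suspect[] = $item->getPathname();
        }
    }
    return $suspect;
}
```

Calling print_r(findSuspectNames("/dokuwiki/data/pagesnew/")) after the conversion would then print the paths that still need a manual rename, for example names that still contain “%” or “'”.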
I hope this will help someone. Any improvements welcome.
Update: utf8.php has been rewritten; the code below has only been tested with the rewritten version of utf8.php.
<?php
include("utf8.php"); // to be found in the \inc directory of the default dokuwiki install

/**
 * Copy a file, or recursively copy a folder and its contents,
 * and clean up the filenames using the dokuwiki UTF-8 functions
 *
 * @original_author Aidan Lister <aidan@php.net>
 * @link       http://aidanlister.com/repos/v/function.copyr.php
 * @param      string $source Source path
 * @param      string $dest   Destination path
 * @return     bool   Returns TRUE on success, FALSE on failure
 */
function copyr($source, $dest)
{
    // Romanized form of the destination path; this is the path actually written
    $dest2 = cleanID($dest);
    echo $source."->".$dest." ->$dest2<br/>\n";

    // Simple copy for a file
    if (is_file($source)) {
        return copy($source, $dest2);
    }

    // Make destination directory (test the cleaned path, since that is the one created)
    if (!is_dir($dest2)) {
        mkdir($dest2);
    }

    // Loop through the folder
    $dir = dir($source);
    while (false !== $entry = $dir->read()) {
        // Skip pointers
        if ($entry == '.' || $entry == '..') {
            continue;
        }

        // Deep copy directories
        if ($dest !== "$source/$entry") {
            copyr("$source/$entry", "$dest/$entry");
        }
    }

    // Clean up
    $dir->close();
    return true;
}

/**
 * Decode, lowercase and romanize a path, then replace each run of "'" with "_"
 */
function cleanID($id, $ascii = false)
{
    $id = trim(urldecode($id));
    $id = utf8_strtolower($id);
    $id = utf8_romanize($id);
    $id = utf8_deaccent($id, -1); // the result is returned, so it has to be assigned
    $id = preg_replace('#\'+#', '_', $id);
    return $id;
}

copyr("/dokuwiki/data/pages/", "/dokuwiki/data/pagesnew/");
?>