BOM: Byte Order Mark
BOMfix Syntax PlugIn
Compatible with DokuWiki
2005-07-13+
This extension has not been updated in over 2 years. It may no longer be maintained or supported and may have compatibility issues.
If you always edit your wiki pages with DokuWiki's built-in editor (i.e. the HTML form based edit option) you won't need this plugin at all.
External editors (i.e. separate standalone programs like word processing software) usually mark a file as UTF-8 by prepending a “magic” byte sequence, the byte order mark (bytes EF BB BF), at the very start of the file. While this does no harm as far as DokuWiki is concerned, those “magic” bytes do appear in the page presented to the user.
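To check whether a given page file carries those bytes, a minimal sketch along the following lines may help. The helper name `has_utf8_bom` and the sample page text are made up for illustration and are not part of the plugin; the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper (not part of the plugin): report whether a string
// starts with the three-byte UTF-8 byte order mark EF BB BF.
function has_utf8_bom(string $text): bool
{
    // Compare only the first three bytes against the BOM.
    return strncmp($text, "\xEF\xBB\xBF", 3) === 0;
}

// Illustrative page content, as an external editor might save it.
$page = "\xEF\xBB\xBF====== Start ======";

var_dump(has_utf8_bom($page));                  // bool(true)
var_dump(has_utf8_bom("====== Start ======"));  // bool(false)
```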
Depending on a page's actual content and the respective CSS rules in effect, this may lead to undesired results. One way to get rid of this problem would be to open the affected page(s) with DokuWiki's built-in edit feature and simply remove those bytes. However, such an approach would cause the word processor to open the file as plain text, assuming it's in ASCII or, say, ISO-8859-1 format – whatever may be configured as its default text format. That, in consequence, would invalidate (or at least render strangely) all non-ASCII UTF-8 character sequences.
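The "render strangely" effect can be reproduced outside any editor. The following sketch assumes PHP's iconv extension is available; the helper name `misread_as_latin1` is hypothetical. It treats UTF-8 bytes as if they were ISO-8859-1, which is roughly what an editor does when it guesses the wrong encoding for a file without a BOM.

```php
<?php
// Hypothetical helper: reinterpret raw bytes as ISO-8859-1 and convert them
// to UTF-8, mimicking an editor that guesses the wrong input encoding.
function misread_as_latin1(string $bytes): string
{
    return iconv('ISO-8859-1', 'UTF-8', $bytes);
}

$utf8 = "\xC3\xA4"; // the single character "ä" encoded as UTF-8

// Each of the two bytes is now treated as a character of its own.
echo bin2hex(misread_as_latin1($utf8)), "\n"; // c383c2a4, i.e. "Ã¤"
```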
Actually, that is the recommended approach if (and only if) you never intend to edit the wiki pages with an external editor.
As it happens, I personally prefer to edit the pages (of a local DokuWiki installation) with editors like Kate or OpenOffice.org for various reasons.
Therefore I need those “magic” bytes, but I don't want them to show up in the pages presented to the end user (reader).
Enter syntax_plugin_bomfix.
Usage
The whole purpose of this plugin is to suppress the output of that “magic” byte sequence. And nothing more.
There are no new wiki language features introduced by this plugin.
Nor is there anything special you have to remember when editing one of your already existing or newly created pages.
Hence, besides installing this plugin, there's nothing to do or observe.
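For readers who want to see the effect in isolation: the plugin source registers the pattern `^\xEF\xBB\xBF` with DokuWiki's lexer and renders nothing for the match. The standalone sketch below approximates that behaviour with a plain regular expression. It is not how the plugin is wired into DokuWiki, and the helper name `strip_leading_bom` is hypothetical.

```php
<?php
// Hypothetical standalone helper; the plugin itself relies on DokuWiki's
// lexer rather than on a direct preg_replace() call.
function strip_leading_bom(string $text): string
{
    // '^' anchors the match to the start, so only a leading BOM is removed.
    // Without the /u modifier the pattern is matched byte by byte.
    return preg_replace('/^\xEF\xBB\xBF/', '', $text) ?? $text;
}

var_dump(strip_leading_bom("\xEF\xBB\xBFHello") === "Hello"); // bool(true)
var_dump(strip_leading_bom("Hello") === "Hello");             // bool(true)
```

Note that, unlike this helper, the plugin leaves the stored page file untouched: the BOM stays in the file for the external editor and is only dropped from the rendered output.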
Installation
Search and install the plugin using the Extension Manager.
Alternatively, refer to Plugins on how to install plugins manually. It's quite easy to integrate this plugin with your DokuWiki:
- Download the source archive (~3KB) and unpack it in your DokuWiki plugin directory {dokuwiki}/lib/plugins (make sure included subdirectories are unpacked correctly); this will create the directory {dokuwiki}/lib/plugins/bomfix.
- Make sure both the new directory and the files therein are readable by the web server, e.g.
  chown apache:apache dokuwiki/lib/plugins/* -Rc
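To verify the second step, one option is a small script that lists anything under the plugin directory that cannot be read. This is only a sketch: the helper name `unreadable_entries` is hypothetical, and the check applies to the user running the script, so it says something about the web server only when run as that user.

```php
<?php
// Hypothetical helper: return the paths below $dir that the current process
// cannot read. An empty array means everything was readable.
function unreadable_entries(string $dir): array
{
    if (!is_dir($dir) || !is_readable($dir)) {
        return [$dir];
    }

    $problems = [];
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS),
        RecursiveIteratorIterator::SELF_FIRST,      // report directories too
        RecursiveIteratorIterator::CATCH_GET_CHILD  // skip unreadable subdirectories
    );

    foreach ($iterator as $entry) {
        if (!$entry->isReadable()) {
            $problems[] = $entry->getPathname();
        }
    }

    return $problems;
}
```

Calling `unreadable_entries()` with the path of your own `lib/plugins/bomfix` directory should return an empty array once ownership and permissions are set correctly.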
Plugin Source
Here is the GPLed PHP source for those who'd like to review it before actually installing it:
<?php
if (! class_exists('syntax_plugin_bomfix')) {
  if (! defined('DOKU_PLUGIN')) {
    if (! defined('DOKU_INC')) {
      define('DOKU_INC', realpath(dirname(__FILE__) . '/../../') . '/');
    } // if
    define('DOKU_PLUGIN', DOKU_INC . 'lib/plugins/');
  } // if
  // Include parent class:
  require_once(DOKU_PLUGIN . 'syntax.php');

/**
 * <tt>syntax_plugin_bomfix.php </tt>- A PHP4 class that implements
 * a <tt>DokuWiki</tt> plugin for <tt>UTF8 "magic" bytes</tt>.
 *
 * <p>
 * External editors (i.e. separate standalone programs like wordprocessing
 * software) usually mark a file in UTF8 format by prepending its content
 * with a "magic" byte sequence at the very start of file. While there is
 * no harm in it as far as DokuWiki is concerned those "magic" bytes
 * (Byte Order Mark) <em>do</em> appear in the page presented to the user.
 * </p><p>
 * Depending on a page's actual content and the respective CSS rules in
 * effect this may lead to undesired results. One way to get rid of this
 * problem would be to open the affected page(s) with DokuWiki's builtin
 * edit feature and simply remove those bytes. However, such an approach
 * would cause the wordprocessor to open the file as plain text assuming
 * it's in ASCII or, say, ISO-8859-1 format - whatever may be configured
 * as the default text format. That, in consequence, would invalidate (or
 * at least render strangely) all UTF8 character sequences.
 * </p><p>
 * Actually that is the recommended approach <em>if</em> (i.e. <tt>if</tt>)
 * you never intend to edit the wiki pages by an external editor.
 * </p><p>
 * As it happens, personally I prefer to edit the pages (of a local DokuWiki
 * installation) by OpenOffice.org for various reasons. (And, yes, I know
 * that I bypass DokuWiki's changes-system this way.) Therefore I need those
 * "magic" bytes <em>but</em> I don't want them to show up in the pages
 * presented to the end user (reader). Enter <tt>syntax_plugin_bomfix</tt>.
 * The whole purpose of this plugin is to suppress the output of that
 * "magic" byte sequence. And nothing more. There are no new wiki language
 * features introduced by this plugin. Nor is there anything special you
 * have to remember when editing one of your already existing or newly
 * created pages.
 * </p><p>
 * To use it just install the plugin in your DokuWiki's plugin folder.
 * That's all.
 * </p><pre>
 * Copyright (C) 2006, 2008 M.Watermann, D-10247 Berlin, FRG
 * All rights reserved
 * EMail : <support@mwat.de>
 * </pre>
 * <div class="disclaimer">
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either
 * <a href="http://www.gnu.org/licenses/gpl.html">version 3</a> of the
 * License, or (at your option) any later version.<br>
 * This software is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
 * General Public License for more details.
 * </div>
 * @author <a href="mailto:support@mwat.de">Matthias Watermann</a>
 * @version <tt>$Id: syntax_plugin_bomfix.php,v 1.5 2008/11/16 13:21:55 matthias Exp $</tt>
 * @since created 24-Dec-2006
 */
class syntax_plugin_bomfix extends DokuWiki_Syntax_Plugin {

  /**
   * @publicsection
   */
  //@{

  /**
   * Tell the parser whether the plugin accepts syntax mode
   * <tt>$aMode</tt> within its own markup.
   *
   * @param $aMode String The requested syntaxmode.
   * @return Boolean <tt>FALSE</tt> always since no nested markup
   * is possible with this plugin.
   * @public
   */
  function accepts($aMode) {
    return FALSE;
  } // accepts()

  /**
   * Connect lookup pattern to lexer.
   *
   * @param $aMode String The desired rendermode.
   * @public
   * @see render()
   */
  function connectTo($aMode) {
    $this->Lexer->addSpecialPattern('^\xEF\xBB\xBF',
      $aMode, 'plugin_bomfix');
  } // connectTo()

  /**
   * Get an associative array with plugin info.
   *
   * <p>
   * The returned array holds the following fields:
   * <dl>
   * <dt>author</dt><dd>Author of the plugin</dd>
   * <dt>email</dt><dd>Email address to contact the author</dd>
   * <dt>date</dt><dd>Last modified date of the plugin in
   * <tt>YYYY-MM-DD</tt> format</dd>
   * <dt>name</dt><dd>Name of the plugin</dd>
   * <dt>desc</dt><dd>Short description of the plugin (Text only)</dd>
   * <dt>url</dt><dd>Website with more information on the plugin
   * (eg. syntax description)</dd>
   * </dl>
   * @return Array Information about this plugin class.
   * @public
   * @static
   */
  function getInfo() {
    return array(
      'author' => 'Matthias Watermann',
      'email' => 'support@mwat.de',
      'date' => '2008-11-16',
      'name' => 'BOMfix Syntax Plugin',
      'desc' => 'Ignore UTF8 "magic" bytes at start of page',
      'url' => 'http://www.dokuwiki.org/plugin:bomfix');
  } // getInfo()

  /**
   * Where to sort in?
   *
   * @return Integer <tt>380</tt> (doesn't really matter).
   * @static
   * @public
   */
  function getSort() {
    return 380;
  } // getSort()

  /**
   * Get the type of syntax this plugin defines.
   *
   * @return String <tt>'substition'</tt> (i.e. 'substitution').
   * @static
   * @public
   */
  function getType() {
    return 'substition'; // sic! should be __substitution__
  } // getType()

  /**
   * Handler to prepare matched data for the rendering process.
   *
   * <p>
   * The <tt>$aState</tt> parameter gives the type of pattern
   * which triggered the call to this method:
   * </p>
   * <dl>
   * <dt>DOKU_LEXER_ENTER</dt>
   * <dd>a pattern set by <tt>addEntryPattern()</tt></dd>
   * <dt>DOKU_LEXER_MATCHED</dt>
   * <dd>a pattern set by <tt>addPattern()</tt></dd>
   * <dt>DOKU_LEXER_EXIT</dt>
   * <dd> a pattern set by <tt>addExitPattern()</tt></dd>
   * <dt>DOKU_LEXER_SPECIAL</dt>
   * <dd>a pattern set by <tt>addSpecialPattern()</tt></dd>
   * <dt>DOKU_LEXER_UNMATCHED</dt>
   * <dd>ordinary text encountered within the plugin's syntax mode
   * which doesn't match any pattern.</dd>
   * </dl><p>
   * This implementation does nothing (ignoring the passed arguments)
   * and just returns the given <tt>$aState</tt>.
   * </p>
   * @param $aMatch String The text matched by the patterns.
   * @param $aState Integer The lexer state for the match.
   * @param $aPos Integer The character position of the matched text.
   * @param $aHandler Object Reference to the Doku_Handler object.
   * @return Integer The current lexer state.
   * @public
   * @see render()
   * @static
   */
  function handle($aMatch, $aState, $aPos, &$aHandler) {
    return $aState; // doesn't really matter as it's ignored anyway ...
  } // handle()

  /**
   * Handle the actual output creation.
   *
   * <p>
   * The method checks for the given <tt>$aFormat</tt> and returns
   * <tt>FALSE</tt> when a format isn't supported.
   * <tt>$aRenderer</tt> contains a reference to the renderer object
   * which is currently handling the rendering.
   * The contents of <tt>$aData</tt> is the return value of the
   * <tt>handle()</tt> method.
   * </p><p>
   * Besides "eating" the BOM implicitly this implementation does
   * nothing (ignoring all passed arguments) and always returns
   * <tt>TRUE</tt>.
   * </p>
   * @param $aFormat String The output format to generate.
   * @param $aRenderer Object A reference to the renderer object.
   * @param $aData Integer The data created/returned by the
   * <tt>handle()</tt> method.
   * @return Boolean <tt>TRUE</tt> always since there's no actual
   * rendering done and hence can't ever fail.
   * @public
   * @see handle()
   * @static
   */
  function render($aFormat, &$aRenderer, $aData) {
    // nothing to do here - just 'eat' the BOM
    return TRUE;
  } // render()

  //@}
} // class syntax_plugin_bomfix
} // if
//Setup VIM: ex: et ts=2 enc=utf-8 :
?>
Changes
2008-11-16:
2008-10-29:
* minor doc corrections;
2007-08-15:
* added GPL link and fixed some doc problems;
2006-12-26:
+ initial release;
Matthias Watermann 2008-11-16
See also
Plugins by the same author
- BOMfix Plugin – ignore Byte-Order-Mark characters in your pages
- Code Syntax Plugin – use syntax highlighting of code fragments in your pages
- Definition List Syntax Plugin – use the only complete definition lists in your pages
- Diff Syntax Plugin – use highlighting of diff files (aka “patches”) in your pages
- HR Syntax Plugin – use horizontal rules in nested block elements of your pages
- LANGuage Syntax Plugin – markup different languages in your pages
- Lists Syntax Plugin – use the only complete un-/ordered lists in your pages
- NBSP Syntax Plugin – use Non-Breakable-Spaces in your pages
- NsToC Syntax Plugin – use automatically generated namespace indices
- Shy Syntax Plugin – use soft hyphens in your pages
- Tip Syntax Plugin – add hint areas to your pages
Discussion
Hints, comments, suggestions …