====== Language Syntax PlugIn ======

---- plugin ----
description: This plugin allows for adding markup to indicate other languages.
author     : Matthias Watermann
email      : support@mwat.de
type       : syntax
lastupdate : 2007-08-15
compatible : 2005-07-13+
depends    :
conflicts  :
similar    :
tags       : language

downloadurl: http://dev.mwat.de/dw/syntax_plugin_lang.zip
----

Sometimes the need arises to use words, phrases or even whole sentences or paragraphs in a language different from the document's main language((for instance consider writing quotes in their respective native language)). To support the readers((i.e. their device/software accessing such a document and possibly providing some accessibility aids like switching fonts or quote characters or using another voice for reading or ...)) of a document that uses several languages, it is advisable to explicitly mark up all language changes in the document.

This plugin allows for adding markup to indicate such language changes. Technically speaking, it is implemented by adding appropriate ''span'' tags around the text in question.

===== Usage =====

To make use of this [[#Plugin Source|plugin]], embed the text that uses a language other than the rest of the document in ''lang'' tags:

...

The language-''__code__'' part is usually the two-letter language code as defined by ISO standard 639, //Code for the representation of names of languages//; the details of its use are explained in [[http://www.ietf.org/rfc/rfc3066.txt|RFC 3066]], //Tags for the Identification of Languages//. See the [[http://www.w3.org/TR/html401/struct/dirlang.html#h-8.1.1|HTML specs]] as well for further details.

Please note that this is so-called //inline// markup, meaning it is to be used inside block elements((such as paragraphs, list items, table cells etc.)). The ''lang'' tag (as well as its HTML equivalent ''span'') does //not// constitute a text block but is part of one.
In consequence, you have to open a new block (by inserting an empty line) if you want to mark up a whole paragraph, as can be seen in the following examples.

==== Examples ====

Suppose a document is written in plain English. Some sentences, however, are to be given in another language. Those "foreign" parts are therefore marked up as in the following example:

**1**

This is an __English__ sentence. Dies ist ein //deutscher// Satz. This is a second __English__ sentence.

**2**

This is an __English__ sentence. Dies ist ein //deutscher// Satz. This is a second __English__ sentence.

**3**

This is an __English__ sentence. Dies ist ein //deutscher// Satz. This is a second __English__ sentence.

**4**

This is an __English__ paragraph.

Dies ist ein //deutscher// Absatz.

This is a second __English__ paragraph.

**5**

This is an __English__ paragraph.

Well, I, er ... dunno how to, hmmm... write Klingon.

This is a second __English__ paragraph.

As can be seen, the formatting((i.e. the placement of the ''%%<lang>%%'' markup in regard to the surrounding text and newlines)) follows the usual rules for //inline// markup. In sections one to three the text portion in a different language((I've used German here since I'm a German ;-))) is just a part (here: a //sentence//) between other parts. In sections four and five, however, there are empty lines before and after the ''lang'' markup, which turns that part into a //paragraph// between other paragraphs. The resulting HTML, btw, looks as follows:

1

This is an English sentence. Dies ist ein deutscher Satz. This is a second English sentence.

2

This is an English sentence. Dies ist ein deutscher Satz. This is a second English sentence.

3

This is an English sentence. Dies ist ein deutscher Satz. This is a second English sentence.

4

This is an English paragraph.

Dies ist ein deutscher Absatz.

This is a second English paragraph.

5

This is an English paragraph.

Well, I, er ... dunno how to, hmmm... write Klingon.

This is a second English paragraph.
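
The effect of the markup on the examples above can be sketched in a few lines of PHP. This is only an illustration, not the plugin's own code: the function name ''lang_to_span()'' is hypothetical, the tag form ''%%<lang de>...</lang>%%'' is assumed from the usage description, and it is assumed that the generated element is a ''span'' carrying a ''lang'' attribute.

```php
<?php
/**
 * Hypothetical helper (NOT part of the plugin): replace every
 * <lang code>...</lang> pair in $text by a span with a lang attribute.
 * Assumption: the code is a 2-3 letter primary tag, optionally
 * followed by "-subtag".
 */
function lang_to_span($text) {
  return preg_replace_callback(
    '|<lang\s+([a-zA-Z]{2,3}(?:-[a-zA-Z0-9]{2,})?)\s*>(.*?)</lang>|s',
    function ($hits) {
      // escape the enclosed text so it cannot inject further markup
      $body = htmlspecialchars($hits[2], ENT_NOQUOTES, 'UTF-8');
      return '<span lang="' . $hits[1] . '">' . $body . '</span>';
    },
    $text);
}

// The German sentence ends up inside a span, the rest is left alone.
echo lang_to_span('This is an English sentence. '
  . '<lang de>Dies ist ein deutscher Satz.</lang> '
  . 'This is a second English sentence.'), "\n";
```

Because the result is an inline element, the surrounding paragraph structure is untouched; whether the German text becomes a sentence within a paragraph or a paragraph of its own depends only on the empty lines around it.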

===== Installation =====

Search and install the plugin using the [[plugin:extension|Extension Manager]]. Alternatively, refer to [[:Plugins]] on how to install plugins manually.

  * http://dev.mwat.de/dw/syntax_plugin_lang.zip

===== Plugin Source =====

Here comes the [[http://www.gnu.org/licenses/gpl.html|GPLed]] PHP source((The comments within the [[#Plugin Source|source]] file are suitable for the OSS [[http://www.stack.nl/~dimitri/doxygen/index.html|doxygen]] tool, a documentation system for C++, C, Java, Objective-C, Python, IDL and to some extent PHP, C#, and D. --- Since I'm working with different programming languages it's a great ease to have one tool that handles the docs for all of them.)) for those who'd like to scan it before actually installing it:

<code php>
<?php
if (! class_exists('syntax_plugin_lang')) {
  if (! defined('DOKU_PLUGIN')) {
    if (! defined('DOKU_INC')) {
      define('DOKU_INC', realpath(dirname(__FILE__) . '/../../') . '/');
    } // if
    define('DOKU_PLUGIN', DOKU_INC . 'lib/plugins/');
  } // if
  // Include parent class:
  require_once(DOKU_PLUGIN . 'syntax.php');

/**
 * syntax_plugin_lang.php - A PHP4 class that implements
 * a DokuWiki plugin to specify an area using a different
 * language than the remaining document.
 *
 * Markup a section of text to be using a different language,
 * lang 2-letter-lang-code
 *
 *  Copyright (C) 2005, 2007 DFG/M.Watermann, D-10247 Berlin, FRG
 *      All rights reserved
 *    EMail : <support@mwat.de>
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 3 of the
 * License, or (at your option) any later version.
 *
 * This software is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
 * General Public License for more details.
 *
 * @author Matthias Watermann
 * @version $Id: syntax_plugin_lang.php,v 1.4 2007/08/15 12:36:19 matthias Exp $
 * @since created 1-Sep-2005
 */
class syntax_plugin_lang extends DokuWiki_Syntax_Plugin {

  /**
   * @publicsection
   */
  //@{

  /**
   * Tell the parser whether the plugin accepts syntax mode
   * $aMode within its own markup.
   *
   * @param $aMode String The requested syntaxmode.
   * @return Boolean TRUE unless $aMode is
   * plugin_lang (which would result in a
   * FALSE method result).
   * @public
   * @see getAllowedTypes()
   * @static
   */
  function accepts($aMode) {
    return ('plugin_lang' != $aMode);
  } // accepts()

  /**
   * Connect lookup pattern to lexer.
   *
   * @param $aMode String The desired rendermode.
   * @public
   * @see render()
   */
  function connectTo($aMode) {
    // See http://www.w3.org/TR/html401/struct/dirlang.html#h-8.1.1;
    // better (specialized) REs are used in 'handle()' method.
    $this->Lexer->addEntryPattern(
      '\x3Clang\s+[a-z\-A-Z0-9]{2,}\s*\x3E\s*(?=(?s).*?\x3C\x2Flang\x3E)',
      $aMode, 'plugin_lang');
  } // connectTo()

  /**
   * Get an associative array with plugin info.
   *
   * The returned array holds the following fields:
   *
   *   author - Author of the plugin
   *   email  - Email address to contact the author
   *   date   - Last modified date of the plugin in
   *            YYYY-MM-DD format
   *   name   - Name of the plugin
   *   desc   - Short description of the plugin (Text only)
   *   url    - Website with more information on the plugin
   *            (eg. syntax description)
   *
   * @return Array Information about this plugin class.
   * @public
   * @static
   */
  function getInfo() {
    return array(
      'author' => 'Matthias Watermann',
      'email' => 'support@mwat.de',
      'date' => '2007-08-15',
      'name' => 'LANGuage Syntax Plugin',
      'desc' => 'Markup a text area using another language',
      'url' => 'http://www.dokuwiki.org/plugin:lang');
  } // getInfo()

  /**
   * Where to sort in?
   *
   * @return Integer 498 (doesn't really matter).
   * @public
   * @static
   */
  function getSort() {
    return 498;
  } // getSort()

  /**
   * Get the type of syntax this plugin defines.
   *
   * @return String 'formatting'.
   * @public
   * @static
   */
  function getType() {
    return 'formatting';
  } // getType()

  /**
   * Handler to prepare matched data for the rendering process.
   *
   * The $aState parameter gives the type of pattern
   * which triggered the call to this method:
   *
   *   DOKU_LEXER_ENTER     - a pattern set by addEntryPattern()
   *   DOKU_LEXER_MATCHED   - a pattern set by addPattern()
   *   DOKU_LEXER_EXIT      - a pattern set by addExitPattern()
   *   DOKU_LEXER_SPECIAL   - a pattern set by addSpecialPattern()
   *   DOKU_LEXER_UNMATCHED - ordinary text encountered within the
   *                          plugin's syntax mode which doesn't
   *                          match any pattern.
   *
   * @param $aMatch String The text matched by the patterns.
   * @param $aState Integer The lexer state for the match.
   * @param $aPos Integer The character position of the matched text.
   * @param $aHandler Object Reference to the Doku_Handler object.
   * @return Array Index [0] holds the current
   * $aState, index [1] the match prepared for
   * the render() method.
   * @public
   * @see render()
   * @static
   */
  function handle($aMatch, $aState, $aPos, &$aHandler) {
    if (DOKU_LEXER_ENTER == $aState) {
      $hits = array();
      // RFC 3066, "2. The Language tag", p. 2f.
      // Language-Tag = Primary-subtag *( "-" Subtag )
      if (preg_match('|\s+([a-z]{2,3})\s*>|i', $aMatch, $hits)) {
        // primary _only_ (most likely to be used)
        return array($aState, $hits[1]);
      } // if
      if (preg_match('|\s+([a-z]{2,3}\-[a-z0-9]{2,})\s*>|i',
      $aMatch, $hits)) {
        // primary _and_ subtag
        return array($aState, $hits[1]);
      } // if
      if (preg_match('|\s+([ix]\-[a-z0-9]{2,})\s*>|i',
      $aMatch, $hits)) {
        // 1-letter primary _and_ subtag
        return array($aState, $hits[1]);
      } // if
      if (preg_match('|\s+([a-z]{2,3})\-.*\s*>|i', $aMatch, $hits)) {
        // convenience: accept primary with empty subtag
        return array($aState, $hits[1]);
      } // if
      // invalid language specification
      return array($aState, FALSE);
    } // if
    return array($aState, $aMatch);
  } // handle()

  /**
   * Add exit pattern to lexer.
   *
   * @public
   */
  function postConnect() {
    $this->Lexer->addExitPattern('\x3C\x2Flang\x3E', 'plugin_lang');
  } // postConnect()

  /**
   * Handle the actual output creation.
   *
   * The method checks for the given $aFormat and returns
   * FALSE when a format isn't supported. $aRenderer
   * contains a reference to the renderer object which is currently
   * handling the rendering. The contents of $aData is the
   * return value of the handle() method.
   *
   * @param $aFormat String The output format to generate.
   * @param $aRenderer Object A reference to the renderer object.
   * @param $aData Array The data created by the handle()
   * method.
   * @return Boolean TRUE if rendered successfully, or
   * FALSE otherwise.
   * @public
   * @see handle()
   */
  function render($aFormat, &$aRenderer, &$aData) {
    if ('xhtml' != $aFormat) {
      return FALSE;
    } // if
    static $VALID = TRUE; // flag to notice invalid markup
    switch ($aData[0]) {
      case DOKU_LEXER_ENTER:
        if ($aData[1]) {
          // open the span carrying the language code
          $aRenderer->doc .= '<span lang="' . $aData[1] . '">';
        } else {
          $VALID = FALSE;
        } // if
        return TRUE;
      case DOKU_LEXER_UNMATCHED:
        $aRenderer->doc .= str_replace(array('&', '<', '>'),
          array('&amp;', '&lt;', '&gt;'), $aData[1]);
        return TRUE;
      case DOKU_LEXER_EXIT:
        if ($VALID) {
          $aRenderer->doc .= '</span>';
        } else {
          $VALID = TRUE;
        } // if
      default:
        return TRUE;
    } // switch
  } // render()

  //@}
} // class syntax_plugin_lang
} // if
//Setup VIM: ex: et ts=2 enc=utf-8 :
?>
</code>

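
The language matching done in ''handle()'' can be tried outside DokuWiki. The sketch below copies the four regular expressions from the source into a stand-alone function. The function name ''parse_lang_code()'' and the sample tags are assumptions made for illustration only; the real method additionally receives the lexer state and returns an array.

```php
<?php
/**
 * Hypothetical stand-alone version of the checks in handle():
 * return the language code found in an opening tag such as
 * "<lang de>", or FALSE if the specification is invalid.
 */
function parse_lang_code($aMatch) {
  // same patterns, tried in the same order as in handle()
  $patterns = array(
    '|\s+([a-z]{2,3})\s*>|i',                // primary only
    '|\s+([a-z]{2,3}\-[a-z0-9]{2,})\s*>|i',  // primary and subtag
    '|\s+([ix]\-[a-z0-9]{2,})\s*>|i',        // 1-letter primary and subtag
    '|\s+([a-z]{2,3})\-.*\s*>|i'             // primary with empty subtag
  );
  foreach ($patterns as $re) {
    $hits = array();
    if (preg_match($re, $aMatch, $hits)) {
      return $hits[1];
    } // if
  } // foreach
  return FALSE;  // invalid language specification
}

// Sample tags (made up for this sketch):
foreach (array('<lang de>', '<lang en-US>', '<lang x-klingon>',
    '<lang de->', '<lang 12>') as $tag) {
  var_dump(parse_lang_code($tag));
}
```

The order of the patterns matters: the most common case (a plain two- or three-letter code) is checked first, and the last pattern is a convenience that falls back to the primary tag when the subtag is empty or unusable. A tag that matches none of them yields ''FALSE'', which is what makes ''render()'' skip the ''span'' for that section.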
==== Changes ====

__2007-08-15__:\\
* added GPL link and fixed some doc problems;

__2007-01-05__:\\
* minor internal changes (added comments, date updated);

__2005-09-04__:\\
+ initial release;

//[[support@mwat.de|Matthias Watermann]] 2007-08-15//

===== See also =====

==== Plugins by the same author ====

  * [[bomfix|BOMfix Plugin]] -- ignore Byte-Order-Mark characters in your pages
  * [[code2|Code Syntax Plugin]] -- use syntax highlighting of code fragments in your pages
  * [[deflist|Definition List Syntax Plugin]] -- use the only complete definition lists in your pages
  * [[diff|Diff Syntax Plugin]] -- use highlighting of diff files (aka "patches") in your pages((obsoleted by incorporating its ability into the [[code2|Code]] plugin))
  * [[hr|HR Syntax Plugin]] -- use horizontal rules in nested block elements of your pages
  * [[lang|LANGuage Syntax Plugin]] -- markup different languages in your pages
  * [[lists|Lists Syntax Plugin]] -- use the only complete un-/ordered lists in your pages
  * [[nbsp|NBSP Syntax Plugin]] -- use Non-Breakable-Spaces in your pages
  * [[nstoc|NsToC Syntax Plugin]] -- use automatically generated namespace indices
  * [[shy|Shy Syntax Plugin]] -- use soft hyphens in your pages
  * [[tip|Tip Syntax Plugin]] -- add hint areas to your pages

===== Discussion =====

Hints, comments, suggestions ...

Doesn't seem to work too well in Internet Explorer.

> Don't worry: M$IE is well known for not caring about standards((at least those they don't own)). Trying to work around the various bugs of that awful program((for need of a more adequate designation)) is an endless business.

----

Word 2003 has an option to manually insert **phonetics** above specified words... I was wondering if it was possible to create a module or plugin for DokuWiki that does the following for **Koine-Greek**:

a) allow the user to upload a **two column wordlist**; first column **source text**, second column **phonetic text**.

b) specify the fonts for the source and phonetic text.

c) have DokuWiki automatically recognize the words from the source text in any text [as one types] and auto-insert and center the phonetic text **ABOVE** each (tagged) occurrence...

An optional button to insert tags on selected text would be great also, not to mention Unicode capability for the source text column, and the option to configure both language and fonts as per source text and phonetic output, if necessary.

Thanx a million... Please contact keith (at) pm-intl (.) org

Keith

> See [[:bounties]] for such requests.

Suggestion: Add dir="rtl" to the span tag for RTL languages. It can possibly be determined by $lang['direction'] in lang.php of that language.

----

Unfortunately, headline code, e.g. "== headline ==", is not interpreted as headline code but printed as raw code, i.e. the "==" are printed and no headline is generated. The same is true for the language tag of the [[plugin:wrap]] tool.\\
[[hemmerling@gmx.net|Rolf Hemmerling]] //2009-12-23 10:00// \\