wordimport Plugin

Compatible with DokuWiki

  • 2024-02-06 "Kaos" yes
  • 2023-04-04 "Jack Jackrum" unknown
  • 2022-07-31 "Igor" unknown
  • 2020-07-29 "Hogfather" unknown

plugin Import Microsoft Word Documents into DokuWiki

Last updated on: 2024-07-25
Provides: CLI, Action
Repository: Source

Tagged with docx, import, word

A CosmoCode Plugin

This plugin allows you to import Microsoft Word documents as pages into the Wiki. The contents are converted into DokuWiki Syntax.

The current release implements basic support for the following elements.

  • Headers
  • Basic text formatting
  • Tables
  • Lists
  • Images (imported as media files)
  • Code Blocks (detected by monospace font)

Installation

Search and install the plugin using the Extension Manager. Refer to Plugins on how to install plugins manually.

Usage

Once installed, a new icon is displayed in the page menu. Clicking it opens a dialog to upload a Word document. The document is imported as a new revision of the current page.

The import button is only shown for users with DELETE permissions to the current namespace.

Configuration and Settings

Word has no predefined style for code blocks, so they are detected by a specific font applied to a whole paragraph. Which fonts are recognized can be set in the configuration settings.

Limitations

The Word parser was written from scratch for this plugin and has been tested against only a limited set of documents. Unknown elements are ignored, which results in missing content in your wiki pages. Always check the results after importing. The simpler your Word documents are, the better the chance of a successful import.

Please note that this importer only handles the XML-based .docx format, not the older, proprietary .doc format.
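Files with a wrong or misleading extension can be spotted before uploading. The following is a minimal sketch, not part of the plugin: it relies on the fact that a .docx file is a ZIP container (leading bytes "PK\003\004"), while legacy .doc files normally use the OLE compound format (leading bytes D0 CF 11 E0). The function name word_container_type is made up for this illustration, and a "zip" result only means the file could be a .docx, not that it is a valid one.

```shell
# Hypothetical helper: classify a file by its first four bytes.
# Prints "zip" (possibly .docx), "ole" (possibly legacy .doc) or "unknown".
word_container_type() {
    # Dump the first 4 bytes as hex and strip spaces/newlines, e.g. "504b0304"
    magic=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
    case "$magic" in
        504b0304) echo "zip" ;;      # ZIP local file header, used by .docx
        d0cf11e0) echo "ole" ;;      # OLE compound file, used by old .doc
        *)        echo "unknown" ;;
    esac
}
```

Calling word_container_type /path/to/my/word.docx before an import gives a quick hint whether the file needs to be re-saved as .docx in Word first.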

To ensure the importer can work out how to transform your Word documents into wiki syntax, you need to use proper semantic styling. This is especially true for header detection. For example, set the paragraph style of a heading to “Heading 1” instead of just formatting the text as big and bold.

Formatting that is not supported in wiki syntax (such as text sizes or colors) will be ignored. Other Word features that are supported in DokuWiki might not yet be implemented in the importer.

Text imported from Word is imported as is. This means some text might accidentally be interpreted as wiki syntax.
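One way to review an imported page for such accidents is to search its raw text for character sequences that DokuWiki treats as markup. The sketch below is only a heuristic and not part of the plugin: the function name find_wiki_syntax_candidates is hypothetical, the pattern list (bold, italic, underline, link and media markers) is not exhaustive, and it will also flag markup that the importer generated on purpose. It assumes you have the page text available as a file, for example a copy of the page source.

```shell
# Hypothetical helper: list lines containing sequences DokuWiki reads as markup.
# Matches: ** (bold), // (italic, also URLs), __ (underline), [[ (link), {{ (media)
find_wiki_syntax_candidates() {
    grep -n -E '\*\*|//|__|\[\[|\{\{' "$1"
}
```

Lines that were meant literally can then be protected in the page with DokuWiki's nowiki markup.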

If you'd like us to work on any of these limitations on your behalf, please contact us.

Early Access Features

Additional features are available as early access through our DokuWiki Business Plugin Partner Program.

Parsing Improvements

The parser has been improved to handle certain elements a bit better:

  • better list detection
  • better image handling (resizing, alignment, use in lists, inline in paragraphs)
  • support for external links
  • better code block detection

Formula Support

Math formulas are converted to MathML and can be displayed using the Ad-Hoc MathML Plugin.

Command Line Import

The plugin contains a command line component, supporting two commands: single and bulk.

The single command imports a single Word document into a given page.

 php bin/plugin.php wordimport single /path/to/my/word.docx namespace:page

The bulk command searches a given directory recursively and imports all .docx documents it finds into a given namespace. Pages are named after the Word documents, and the folder structure is reused as the namespace structure.

 php bin/plugin.php wordimport bulk /path/to/my/word-manual/ manual
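To get a rough idea of the resulting page layout before running a bulk import, the directory-to-namespace mapping described above can be approximated in a few lines of shell. This is a sketch under stated assumptions, not the plugin's own logic: the function name preview_bulk_targets is hypothetical, and the normalization shown (lowercasing, spaces to underscores) only approximates how DokuWiki cleans page IDs, so the real page names may differ.

```shell
# Hypothetical helper: preview page IDs a bulk import might create.
# $1 = directory containing .docx files, $2 = target namespace
preview_bulk_targets() {
    dir=${1%/}                     # drop a trailing slash, if any
    ns=$2
    find "$dir" -type f -name '*.docx' | LC_ALL=C sort | while IFS= read -r file; do
        rel=${file#"$dir"/}        # path relative to the import directory
        rel=${rel%.docx}           # drop the file extension
        # Assumption: folders become namespaces, names are lowercased,
        # spaces become underscores
        id=$(printf '%s' "$rel" | tr '/' ':' | tr 'A-Z ' 'a-z_')
        printf '%s -> %s:%s\n' "$file" "$ns" "$id"
    done
}
```

For example, preview_bulk_targets /path/to/my/word-manual/ manual prints one line per .docx file with its approximate target page, without touching the wiki.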

By default, the command line import will not overwrite existing pages. The --overwrite flag enables overwriting.

 php bin/plugin.php wordimport --overwrite bulk /path/to/my/word-manual/ manual
plugin/wordimport.txt · Last modified: by andi

Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Share Alike 4.0 International