DocSearch Plugin

Compatible with DokuWiki

  • 2024-02-06 "Kaos" unknown
  • 2023-04-04 "Jack Jackrum" unknown
  • 2022-07-31 "Igor" unknown
  • 2020-07-29 "Hogfather" yes

Search through your uploaded documents

Last updated on: 2016-07-18
Provides: Action
Repository: Source
Conflicts with: searchstats
Similar to: elasticsearch, searchtext
Tagged with: search
Needed for: docsearchsitemap

This plugin allows you to search through your uploaded documents. It is integrated into the default DokuWiki search. Just fill in a search string and start to search.

Note: The elasticsearch Plugin, which can also index documents, is probably a better alternative to this plugin.

A CosmoCode Plugin

Download and Installation

Search and install the plugin using the Extension Manager. Refer to Plugins on how to install plugins manually.

Index Updating and Cronjob

This plugin creates its own index of documents, similar to, but separate from, DokuWiki's own index. Only documents that have been indexed will be found by the plugin, so the index has to be updated periodically.

The index is built by the dokuwiki/lib/plugins/docsearch/cron.php script. You need to set up a scheduled job that runs this script periodically, e.g. once a day. This can be done using cron on Linux, Scheduled Tasks on Windows, or an online cron job service such as easycron.
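
As a sketch of the daily job described above, a small wrapper script for cron could look like the following. The install path /var/www/dokuwiki, the wrapper location /usr/local/bin/docsearch-index.sh and the 03:15 schedule are assumptions for illustration; only the location of cron.php inside the DokuWiki directory comes from this page.

```shell
#!/bin/sh
# Hypothetical wrapper for the docsearch indexer; adjust paths to your setup.
#
# Example crontab entry (runs once a day at 03:15):
#   15 3 * * * /usr/local/bin/docsearch-index.sh

# Assumed install location of DokuWiki; override via the environment if needed.
DOKUWIKI_DIR="${DOKUWIKI_DIR:-/var/www/dokuwiki}"
# PHP command line binary; "php" is assumed to be on the PATH.
PHP_BIN="${PHP_BIN:-php}"

# Run the indexer script shipped with the plugin, passing any arguments through.
docsearch_index() {
    "$PHP_BIN" "$DOKUWIKI_DIR/lib/plugins/docsearch/cron.php" "$@"
}
```

Ending the script with a call to `docsearch_index "$@"` makes it directly usable from cron.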

When using your operating system's scheduler, be aware that PHP settings may differ between command line and web server execution. This is important if you want to increase the memory_limit of your PHP configuration (see ini.core).
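
To illustrate the point about differing settings: the PHP command line binary can report the memory_limit it actually uses, and its -d option overrides an ini setting for a single run. The 512M value and the /var/www/dokuwiki path below are assumptions, not recommendations from the plugin.

```shell
#!/bin/sh
# Sketch: inspect and override memory_limit for command line runs.
# Paths and the limit value are illustrative assumptions.

PHP_BIN="${PHP_BIN:-php}"
DOKUWIKI_DIR="${DOKUWIKI_DIR:-/var/www/dokuwiki}"
MEMORY_LIMIT="${MEMORY_LIMIT:-512M}"

# Print the memory_limit seen by the command line PHP
# (this may differ from the value the web server uses).
show_cli_memory_limit() {
    "$PHP_BIN" -r 'echo ini_get("memory_limit"), PHP_EOL;'
}

# Run the indexer with memory_limit raised for this invocation only.
docsearch_index_with_limit() {
    "$PHP_BIN" -d "memory_limit=$MEMORY_LIMIT" \
        "$DOKUWIKI_DIR/lib/plugins/docsearch/cron.php" "$@"
}
```

Using -d keeps the change local to the indexing job instead of editing the php.ini used by other command line scripts.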

Note: If you run a DokuWiki farm, you need to run the cron job for each animal separately, passing the animal's name as the first parameter to the script.
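
For a farm, the per-animal runs could be scripted as a simple loop. The animal names and the install path below are made-up placeholders; substitute the names of your own animals.

```shell
#!/bin/sh
# Sketch: run the docsearch indexer once per farm animal.
# ANIMALS and DOKUWIKI_DIR are placeholders; replace them with your own values.

PHP_BIN="${PHP_BIN:-php}"
DOKUWIKI_DIR="${DOKUWIKI_DIR:-/var/www/dokuwiki}"
ANIMALS="${ANIMALS:-animal1 animal2}"

docsearch_index_farm() {
    for animal in $ANIMALS; do
        # The animal's name is passed as the first parameter to cron.php.
        "$PHP_BIN" "$DOKUWIKI_DIR/lib/plugins/docsearch/cron.php" "$animal" \
            || echo "indexing failed for $animal" >&2
    done
}
```

A failure for one animal is reported on standard error and does not stop the loop, so the remaining animals are still indexed.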

Converters

The plugin works by first converting each document to be indexed into a text file. To do this, it relies on an external converter for each file type. These converters have to be set up in the dokuwiki/lib/plugins/docsearch/conf/converter.php file.

Either edit it in your text editor of choice or use the ConfManager Plugin.

Each line in that file configures one file extension and the converter call to use. The general syntax is:

fileextension /path/to/converter -with_calls_to_convert --from %in% --to %out%

%in% refers to the input document, %out% is the resulting text file. For converters writing their results to STDOUT, be sure to redirect it to the %out% file.
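
To make the placeholders concrete, the following sketch expands a converter template by hand so you can preview the command that would result. It only illustrates the substitution and is not the plugin's actual code; the function name expand_converter and the sample paths are invented.

```shell
#!/bin/sh
# Illustration only: expand %in% and %out% in a converter template.
# expand_converter and the sample paths are hypothetical.

expand_converter() {
    template=$1   # converter call as written in converter.php, without the extension
    infile=$2     # document to convert
    outfile=$3    # text file to produce
    # "|" is the sed delimiter so that slashes in paths need no escaping;
    # paths containing "|" or "&" would need extra escaping.
    printf '%s\n' "$template" \
        | sed -e "s|%in%|$infile|g" -e "s|%out%|$outfile|g"
}

expand_converter '/usr/bin/catdoc -d UTF-8 %in% > %out%' /tmp/report.doc /tmp/report.txt
```

The final line prints `/usr/bin/catdoc -d UTF-8 /tmp/report.doc > /tmp/report.txt`, which shows the STDOUT redirect into the output file.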

Below is a typical configuration. The comments show how to install the tools on a Debian Linux system.

#<?php die(); ?>
pdf   /usr/bin/pdftotext -enc UTF-8   %in% %out%      # apt install poppler-utils
doc   /usr/bin/catdoc -d UTF-8        %in% > %out%    # apt install catdoc
ppt   /usr/bin/catppt -d UTF-8        %in% > %out%    # apt install catdoc
xls   /usr/bin/xls2csv -d UTF-8       %in% > %out%    # apt install catdoc
docx  /usr/bin/docx2txt               %in% %out%      # apt install docx2txt
xlsx  /usr/bin/in2csv                 %in% > %out%    # apt install csvkit
pptx  /pathto/pptx2txt.sh             %in% > %out%    # curl https://raw.githubusercontent.com/welcheb/pptx2txt.sh/refs/heads/master/pptx2txt.sh -o /pathto/pptx2txt.sh
odt   /usr/bin/odt2txt         %in% --output=%out%    # apt install odt2txt
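
Before relying on a configuration like the one above, it can help to check that every configured converter is actually installed. The sketch below assumes the file layout shown here (extension, converter path, arguments, optional comment); the function name check_converters is hypothetical.

```shell
#!/bin/sh
# Sketch: report converter binaries from converter.php that are missing
# or not executable. Assumes lines of the form:
#   extension /path/to/converter args... # comment

check_converters() {
    conf=$1
    # Skip blank lines and lines starting with "#"
    # (this includes the "#<?php die(); ?>" guard line).
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$conf" |
    while read -r ext binary _rest; do
        if [ -x "$binary" ]; then
            echo "ok       $ext  $binary"
        else
            echo "missing  $ext  $binary"
        fi
    done
}
```

It could be called as `check_converters /path/to/dokuwiki/lib/plugins/docsearch/conf/converter.php`, with the path adjusted to your installation.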

Below is a list of user-contributed setups:

plugin/docsearch.txt · Last modified: by andi

Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Share Alike 4.0 International