DocSearch Plugin
Compatible with DokuWiki
- 2024-02-06 "Kaos" unknown
- 2023-04-04 "Jack Jackrum" unknown
- 2022-07-31 "Igor" unknown
- 2020-07-29 "Hogfather" yes
Similar to elasticsearch, searchtext
Needed for docsearchsitemap
This plugin lets you search the contents of your uploaded documents. It is integrated into the default DokuWiki search: enter a search string and search as usual.
A probably better alternative to this plugin is the elasticsearch Plugin, which can also index documents.
Download and Installation
Search and install the plugin using the Extension Manager. Refer to Plugins on how to install plugins manually.
Changes
- Merge pull request #33 from dokuwiki-translate/lang_update_581_167770… by splitbrain (2023-03-01 21:58)
- translation update by Klap-in (2023-03-01 21:20)
- Merge pull request #30 from dokuwiki-translate/lang_update_680_151403… by splitbrain (2018-01-03 10:30)
- translation update by tor@harnqvist.se (2017-12-23 13:35)
- Merge pull request #29 from dokuwiki-translate/lang_update_386 by splitbrain (2017-05-24 07:12)
- translation update by services.m@benard.info (2017-05-23 22:30)
- Merge pull request #28 from dokuwiki-translate/lang_update_213 by splitbrain (2016-12-22 12:41)
- translation update by sawachan (2016-12-22 05:50)
Index Updating and Cronjob
This plugin creates its own index of documents, similar to but separate from DokuWiki's own index. Only documents that have been indexed will be found by the plugin, so the index has to be updated periodically.
The index is built by the dokuwiki/lib/plugins/docsearch/cron.php script. You need to set up a scheduled job that calls this script periodically, e.g. once a day. This can be done using cron on Linux, scheduled tasks on Windows, or an online cron job service such as easycron.
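A rough sketch of a daily job on Linux, assuming DokuWiki lives in /var/www/dokuwiki and the PHP command line binary is /usr/bin/php (both paths and the 03:00 schedule are assumptions, adjust them to your system). The snippet only prints the crontab line so it can be reviewed and then pasted into crontab -e for the web server user:

```shell
# Assumed locations; adjust to your installation.
DOKUWIKI_DIR="/var/www/dokuwiki"
PHP_BIN="/usr/bin/php"

# Crontab fields: minute hour day-of-month month day-of-week command.
# "0 3 * * *" means every day at 03:00.
CRON_LINE="0 3 * * * $PHP_BIN $DOKUWIKI_DIR/lib/plugins/docsearch/cron.php"

# Print the line for review (quoted, so the shell does not expand the asterisks).
echo "$CRON_LINE"
```

Running the job as the same user as the web server should avoid permission problems on the index files.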
When using your operating system's scheduler, be aware that PHP settings may differ between command line and web server execution. This is important if you want to increase the memory_limit of your PHP configuration (see ini.core).
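To see which limit applies on the command line, and to raise it for the indexer run only, something like the following may help. The 512M value and the script path are assumptions; php -r and php -d are standard options of the PHP command line interface.

```shell
MEMORY_LIMIT="512M"                                              # assumed value
CRON_SCRIPT="/var/www/dokuwiki/lib/plugins/docsearch/cron.php"  # assumed path

# Show the memory_limit the command line PHP currently uses
# (skipped when php is not installed on this machine).
if command -v php >/dev/null 2>&1; then
    php -r 'echo ini_get("memory_limit"), PHP_EOL;'
fi

# -d overrides an ini setting for this single invocation only.
INDEX_CMD="php -d memory_limit=$MEMORY_LIMIT $CRON_SCRIPT"
echo "$INDEX_CMD"
```

Passing the setting with -d leaves the web server's PHP configuration untouched.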
Note: if you run a DokuWiki farm, you need to run the cronjob for each animal separately, passing the animal's name as the first parameter to the script.
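For a farm, the per-animal calls could be generated along these lines. The animal names wiki-a and wiki-b and the script path are hypothetical placeholders; the snippet prints the commands rather than running them.

```shell
CRON_SCRIPT="/var/www/dokuwiki/lib/plugins/docsearch/cron.php"  # assumed path
ANIMALS="wiki-a wiki-b"                                          # hypothetical animal names

# Print one indexer call per animal; the animal name is the first parameter.
farm_commands() {
    for animal in $ANIMALS; do
        echo "php $CRON_SCRIPT $animal"
    done
}

farm_commands
```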
Converters
The plugin works by first converting each document to be indexed into a text file. To do this, it relies on an external converter for each file type. These have to be set up in the dokuwiki/lib/plugins/docsearch/conf/converter.php file.
Either edit it in your text editor of choice or use the ConfManager Plugin.
Each line in that file configures one file extension and the converter call to use. The general syntax is:
fileextension /path/to/converter -with_calls_to_convert --from %in% --to %out%
%in% refers to the input document, %out% is the resulting text file. For converters writing their results to STDOUT, be sure to redirect it to the %out% file.
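To illustrate how the placeholders turn into a concrete command, here is a small sketch. The build_command helper and the /tmp file names are made up for this example; it is not the plugin's own code, only a demonstration of the substitution.

```shell
# Hypothetical helper: fill %in% and %out% in a converter template.
# Demo only: values containing "|", "&" or backslashes would confuse sed.
build_command() {
    template=$1
    in_file=$2
    out_file=$3
    printf '%s\n' "$template" | sed -e "s|%in%|$in_file|g" -e "s|%out%|$out_file|g"
}

# A converter that writes to STDOUT needs the "> %out%" redirect in its template.
build_command '/usr/bin/catdoc -d UTF-8 %in% > %out%' '/tmp/report.doc' '/tmp/report.txt'
```

A converter that takes the output file as an argument, such as pdftotext, needs no redirect because it writes the %out% file itself.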
Below is a typical configuration. The comments show how to install the tools on a Debian Linux system.
#<?php die(); ?>
pdf  /usr/bin/pdftotext -enc UTF-8 %in% %out%  # apt install poppler-utils
doc  /usr/bin/catdoc -d UTF-8 %in% > %out%     # apt install catdoc
ppt  /usr/bin/catppt -d UTF-8 %in% > %out%     # apt install catdoc
xls  /usr/bin/xls2csv -d UTF-8 %in% > %out%    # apt install catdoc
docx /usr/bin/docx2txt %in% %out%              # apt install docx2txt
xlsx /usr/bin/in2csv %in% > %out%              # apt install csvkit
pptx /pathto/pptx2txt.sh %in% > %out%          # curl https://raw.githubusercontent.com/welcheb/pptx2txt.sh/refs/heads/master/pptx2txt.sh -o /pathto/pptx2txt.sh
odt  /usr/bin/odt2txt %in% --output=%out%      # apt install odt2txt
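Before running the indexer, it can be worth checking that every configured converter is actually installed. The check_converters function below is a hypothetical helper, not part of the plugin; it assumes the one-entry-per-line layout described above, with the converter binary as the second field.

```shell
# Hypothetical helper: report whether each configured converter binary exists.
check_converters() {
    conf=$1
    # Drop comment lines (including the "#<?php die(); ?>" guard), then read
    # the extension and the converter binary from each remaining line.
    grep -v '^#' "$conf" | while read -r ext bin rest; do
        [ -n "$ext" ] || continue          # skip blank lines
        if [ -x "$bin" ]; then
            echo "$ext ok $bin"
        else
            echo "$ext missing $bin"
        fi
    done
}

# Usage (path is an assumption):
# check_converters /var/www/dokuwiki/lib/plugins/docsearch/conf/converter.php
```

Any line reported as missing points to a tool that still has to be installed or to a wrong path in the configuration.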
Below is a list of user-contributed setups: