====== AIchat Plugin ======

---- plugin ----
description: Chat with a LLM about your DokuWiki contents
author     : Andreas Gohr
email      : dokuwiki@cosmocode.de
type       : action, helper, syntax, CLI
lastupdate : 2024-07-29
compatible : Jack Jackrum, Kaos
depends    : sqlite
conflicts  : 
similar    : 
tags       : !experimental, openai, chatgpt, llm, ai, mistral, anthropic

downloadurl: https://github.com/cosmocode/dokuwiki-plugin-aichat/zipball/master
bugtracker : https://github.com/cosmocode/dokuwiki-plugin-aichat/issues
sourcerepo : https://github.com/cosmocode/dokuwiki-plugin-aichat/
donationurl: 

updatemessage: 2024-06-25 - New renderer added, recreating the embeddings index is recommended

screenshot_img : plugin:aichat.png
----

[[https://www.cosmocode.de/en/open-source/dokuwiki-plugins/|{{ https://www.cosmocode.de/static/img/dokuwiki/dwplugins.png?recache|A CosmoCode Plugin}}]]

This plugin adds the ability to chat with a bot in your wiki using a Large Language Model (LLM), often called "artificial intelligence" (AI). It uses OpenAI's ChatGPT or other APIs to answer the user's natural language questions. The plugin provides the bot with context extracted from your wiki pages, so the bot can answer questions directly related to your content. This mechanism is also known as Retrieval-Augmented Generation (RAG).

You can read an [[https://forum.dokuwiki.org/d/21178-experimental-plugin-aichat-chat-with-your-wiki-pages-using-chatgpt|introductory post]] about how this plugin works in the forum. There's also a blog post with more background information on how the [[https://www.splitbrain.org/blog/2023-08/15-using_sqlite_as_vector_store_in_php|clustering in SQLite]] works.

===== Installation =====

:!: This plugin is experimental. Implementation details may change and require manual intervention on future updates.

:!: Using this plugin incurs API costs. Make sure you are aware of the pricing details of your selected API providers and of your usage patterns.

:!: This plugin requires PHP 8.1 or higher. Installing it on systems with lower versions will "crash" your wiki.

:!: You need command line access and a scheduler (like cron) to use this plugin.


Install the plugin using the [[plugin:plugin|Plugin Manager]] and the download URL above, which points to the latest version of the plugin. Refer to [[:Plugins]] on how to install plugins manually.

Once installed, continue with Configuration and Initial Setup.


===== Configuration =====

If you need help with installing, configuring and fine tuning this plugin, feel free to [[https://www.cosmocode.de/en/services/wiki/dokuwiki-ai/|contact us]] for a quote.

Use the [[config|Configuration Manager]] to adjust the settings described below.


==== 🧠 Models ====

Models are used to create embeddings (vectors describing your content), to rephrase questions and to actually answer them. The plugin currently supports models from several providers. You need to configure their respective API keys and other credentials to use them. You only need API keys for those providers whose models you want to use.


  * [[https://platform.openai.com/account/api-keys|OpenAI API keys]]
  * [[https://console.anthropic.com/settings/keys|Anthropic API keys]]
  * [[https://console.mistral.ai/api-keys/|Mistral API keys]]
  * [[https://dash.voyageai.com/api-keys|Voyage API keys]]
  * [[https://platform.reka.ai/apikeys|Reka API keys]]
  * [[https://console.groq.com/keys|Groq API keys]]


You can use different models for embeddings, rephrasing and chat. When changing the embedding model you need to rebuild the vector storage (see below). The rephrasing model is used to interpret a given question when selecting possible source documents from your wiki.

An overview of prices, properties and a short description of each model is shown when running the ''models'' CLI command. The prices shown there may not be correct! Always check the model provider's pricing list!

Which model is best for your use case depends on various factors and requires some experimentation. Contact CosmoCode for help with picking and configuring the right solution for you.

==== 📥 Vector Storage ====

The plugin needs to store embedding vectors to do semantic similarity searches. There are multiple options available, but you only need to configure one.

The default uses a local SQLite database. This does not need any additional configuration. However, the more wiki pages you have, the longer it takes to search through all available embeddings. To mitigate this, this storage mechanism uses a clustering approach: it separates similar page chunks into clusters and then searches only the nearest cluster when looking for similar chunks.

Clusters are automatically created when the embedding index is built the first time. However, when your wiki grows a lot, topics change drastically or you add a new language, the initial clusters may no longer be a good fit. In that case you should run the ''maintenance'' CLI command to rebuild the clusters.
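
The nearest-cluster idea described above can be sketched roughly as follows. This is an illustrative Python sketch only, not the plugin's actual PHP/SQLite implementation: the function names, the data layout and the toy two-dimensional vectors are all assumptions made for this example.

<code python>
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equally long vectors (0.0 for zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_cluster(query, centroids):
    """Return the ID of the cluster whose centroid is most similar to the query."""
    return max(centroids, key=lambda cid: cosine_similarity(query, centroids[cid]))


def search_nearest_cluster(query, centroids, chunks, limit=5):
    """Score only the chunks of the nearest cluster, best match first."""
    cid = nearest_cluster(query, centroids)
    candidates = [c for c in chunks if c["cluster"] == cid]
    scored = [(cosine_similarity(query, c["vector"]), c["id"]) for c in candidates]
    scored.sort(reverse=True)
    return scored[:limit]


# Hypothetical toy data: two clusters, three chunks
centroids = {1: [1.0, 0.0], 2: [0.0, 1.0]}
chunks = [
    {"id": "wiki:syntax#0", "cluster": 1, "vector": [0.9, 0.1]},
    {"id": "wiki:welcome#0", "cluster": 1, "vector": [0.7, 0.3]},
    {"id": "recipes:soup#0", "cluster": 2, "vector": [0.1, 0.9]},
]

for score, chunk_id in search_nearest_cluster([1.0, 0.2], centroids, chunks):
    print(f"{score:.3f} {chunk_id}")
</code>

The trade-off is visible in the sketch: chunks in other clusters are never scored, which keeps the search fast but can miss matches when the clusters no longer fit the content.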

The other storage options are dedicated vector storage solutions that each need their own configuration:

  * [[https://www.pinecone.io/|Pinecone]] is a SaaS vector database with a free tier. After signing up, you need to create an index and give it a dimension matching your embedding model (see the ''models'' command), e.g. ''1536''. Select ''cosine'' as the metric. Your ''pinecone_baseurl'' can be found in the list of indexes. It should look like ''https://myname-something.svc.gcp-starter.pinecone.io'' or similar.
  * [[https://www.trychroma.com/|Chroma]] is a vector storage you can host yourself. Chroma has to be run in [[https://docs.trychroma.com/deployment|Server-Mode]].
  * [[https://qdrant.tech/|Qdrant]] can either be self-hosted or used as SaaS with a free tier. After "creating an API", the shown curl snippet contains your baseurl and apikey.


==== Finetuning ====

The plugin offers a few options to fine-tune how vector data is stored and how the plugin talks to the LLM.

  * chunkSize - The maximum size (in tokens) of the chunks your wiki pages will be split into.
  * similarityThreshold - The minimum percentage by which documents need to match a given question. Different embedding models return different similarity distances. Increase this value if the bot uses too many irrelevant sources, lower it if the bot finds too few relevant sources.
  * contextChunks - The maximum number of chunks to use as context when answering a question. More chunks also mean larger context and higher cost!
  * rephraseHistory - How many question/answer pairs of the previous conversation should be taken into account when interpreting the given question for document retrieval. More history means larger context and higher cost!
  * chatHistory - How many question/answer pairs of the previous conversation should be sent as additional context when answering questions. More history means larger context and higher cost!
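
How ''similarityThreshold'' and ''contextChunks'' interact can be illustrated with a small Python sketch. It is not taken from the plugin's code: the function name, the tuple layout and the assumption that the threshold is a percentage between 0 and 100 compared against scores between 0 and 1 are made up for this example.

<code python>
def select_context(scored_chunks, similarity_threshold, context_chunks):
    """Pick the chunks that would be handed to the model as context.

    scored_chunks        -- list of (similarity, chunk_id), similarity in 0..1
    similarity_threshold -- minimum match in percent (assumed range 0..100)
    context_chunks       -- maximum number of chunks to return
    """
    min_score = similarity_threshold / 100
    # Drop everything below the threshold first ...
    relevant = [pair for pair in scored_chunks if pair[0] >= min_score]
    # ... then keep only the best matches, up to the configured maximum
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return relevant[:context_chunks]


# Hypothetical similarity scores for four chunks
scored = [(0.91, "a"), (0.74, "b"), (0.80, "c"), (0.60, "d")]

print(select_context(scored, 75, 2))  # stricter threshold, fewer sources
print(select_context(scored, 50, 3))  # lower threshold, more sources
</code>

With a threshold that is too high the list ends up empty, which matches the "bot never finds any sources" situation mentioned in the FAQ.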


==== Multi-Language Options ====

By default, the model guesses which language the user asked the question in and answers in the same language. Sometimes it guesses wrong, or it is not sure about the language and falls back to English. It also always uses all pages in your wiki as potential context to answer the question, regardless of their language.

Using the ''preferUIlanguage'' option you can tune how the plugin should work in multilingual wikis. 

  - Guess language, use all sources
    * The default. The bot will answer in the language the user used and consider any page as possible context.
  - Prefer UI language, use all sources 
    * This will prompt the model to always answer in the language configured in the [[config:lang|global lang configuration setting]]. When using the [[translation]] plugin with the ''translateUI'' option, the chat bot will answer in the currently used UI language.
  - Prefer UI language, same language sources only
    * Just like in the previous setting, the model is prompted to use the current language to answer. It will also limit the considered sources to those matching the language. Please note that when switching to or from this option with the SQLite storage, the ''maintenance'' command needs to be run to recluster the embeddings according to language.

==== Namespace/Page Restrictions ====

Which parts of the wiki will be available as context for the chat bot can be fine-tuned via the ''matchRegex'' and ''skipRegex'' options. Both options expect a regular expression (without ''/'' delimiters) to match against [[:pagename| page IDs]]. Page IDs will always start with a colon '':'' when matched.

The default configuration will skip any pages named or located in namespaces named ''playground'' or ''sandbox''.

Both regular expressions (when set) need to apply at the same time. That is, a page **must match** the ''matchRegex'' and **must not match** the ''skipRegex'' to be indexed in the vector store.
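
The combined rule can be tried out with a small Python sketch. It only illustrates the logic described here: the function name and both example patterns are hypothetical (the skip pattern merely approximates the described default), and the plugin itself uses PHP regular expressions, which differ from Python's in some details.

<code python>
import re


def is_indexed(page_id, match_regex="", skip_regex=""):
    """Return True if a page would be indexed under the two regex options."""
    # Page IDs always start with a colon when matched
    subject = ":" + page_id.lstrip(":")
    # An empty (unset) option imposes no restriction
    if match_regex and not re.search(match_regex, subject):
        return False
    if skip_regex and re.search(skip_regex, subject):
        return False
    return True


# Hypothetical settings: only the "docs" namespace, minus playground/sandbox
MATCH = r"^:docs:"
SKIP = r":(playground|sandbox)(:|$)"

for page in ["docs:install", "docs:sandbox:test", "wiki:syntax"]:
    print(page, is_indexed(page, MATCH, SKIP))
</code>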

The regular expressions are applied when running the ''embed'' CLI command. Pages that no longer satisfy a changed regex setup will be removed from the vector store during this command's run.

For the SQLite storage it is recommended to re-cluster the index by running the ''maintenance'' command whenever the regexes are changed.


==== General Options ====

All chat conversations can optionally be logged to DokuWiki's logging facilities by enabling the ''logging'' option. They will show up as a new tab in the [[plugin:logviewer|LogViewer]].

Access to the chat functionality can be restricted to users and groups with the ''restrict'' option. Simply list them comma separated. Prefix groups with an ''@'' as usual. By default, everyone may access.
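
As an illustration of the ''restrict'' format, the following Python sketch checks a user against such a comma separated list. It is a simplified assumption of the semantics described above (exact name matches, ''@'' marking groups, an empty list allowing everyone), not the plugin's or DokuWiki's actual access check, and the user and group names are invented.

<code python>
def may_chat(restrict, user, groups):
    """Return True if the user may access the chat under the given restriction."""
    entries = [entry.strip() for entry in restrict.split(",") if entry.strip()]
    if not entries:
        # Nothing configured: by default, everyone may access
        return True
    for entry in entries:
        if entry.startswith("@"):
            # Entries prefixed with @ name a group
            if entry[1:] in groups:
                return True
        elif entry == user:
            return True
    return False


# Hypothetical setting: the group "staff" plus the single user "alice"
RESTRICT = "@staff, alice"

print(may_chat(RESTRICT, "alice", ["user"]))
print(may_chat(RESTRICT, "bob", ["user", "staff"]))
print(may_chat(RESTRICT, "carol", ["user"]))
</code>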

===== Initial Setup =====

Once the plugin has been configured, the vector store has to be initialized. This is done via the command line interface.


   php bin/plugin.php aichat embed


Updates to the storage can be done by rerunning the command.

Depending on the size of your wiki this can take a while. Whenever you change certain settings you will need to completely clear and rebuild the storage. Settings that require a reindex are marked with 🔄 in the config manager. Use the ''--clear'' option for that:

   php bin/plugin.php aichat embed --clear


Finally, you need to set up a regular task to update the storage. How often you want to do that depends on how much your wiki changes, but generally running the command once a day should be fine. On Unix systems you can use a cronjob. The cronjob should run under the same user as your web-facing PHP process.

   sudo crontab -e -u www-data


Here's an example of a crontab entry to run the update every morning at 2am. Be sure to adjust paths as needed.

<file>
0 2 * * * php /var/www/dokuwiki/bin/plugin.php aichat embed
</file>

That should be all for the setup. For more command line features, see the CLI section further down.

===== Syntax/Usage =====

==== Chat ====

The plugin comes with a simple syntax component to add the chat to your wiki.

The basic syntax looks like this:

  <aichat>Welcome to the Chat. Please be polite to the bot</aichat>

The text inside the ''<aichat>'' tags will be the message the chatbot will use to greet the user. Please note that this message is local only. It will not be part of the chat history used by the LLM. In other words it is not a prompt.

The syntax above embeds the chat directly into the page. If you want the chat to be a bit out of the way, you can use the ''button'' parameter:

  <aichat button>Welcome to the Chat. Please be polite to the bot</aichat>

Now a simple chat icon will be shown. The chat opens in an overlay only when the icon is clicked. This syntax is best suited for use in a sidebar.

Finally, there is an additional option to let the chat button float in the lower right corner:

  <aichat button float>Welcome to the Chat. Please be polite to the bot</aichat>

This is probably where most people expect a "chat" on a website. Again this is best placed in the sidebar.

==== Similar Pages ====

The embeddings created to find pages similar to the user's question can also be used to find similar pages to the current one.

A simple syntax component will show a list of the top 5 similar pages in the index. The syntax can also be used in a sidebar.

<code>
~~similar~~
</code>

Note that this does not query the model provider's API, so no costs are incurred. Accordingly, this feature ignores the access restrictions that apply to the chat itself.

===== CLI =====

The command line interface introduced in the Initial Setup section above has a few more features that might be useful for debugging or testing the chat before making it available to your users.

:!: Prices shown in the output are estimates and may not be correct!


  php bin/plugin.php aichat --help

This will print a help screen with all available commands and options. Be sure to check it for all available features, as only the important ones are shown below.

  php bin/plugin.php aichat similar "What is DokuWiki?"

This prints the list of chunks that the similarity function thinks are similar to the given question along with chunkIDs and similarity scores.

  php bin/plugin.php aichat ask "What is DokuWiki?"

Answers the given question.

  php bin/plugin.php aichat chat

Starts an interactive chat session.

  php bin/plugin.php aichat info

Shows a few statistics on the currently configured vector store and models.

  php bin/plugin.php aichat maintenance

This runs maintenance on the embeddings store. This is currently only used in the SQLiteStorage as explained above.

===== FAQ =====

Here are a couple of things you might wonder.


  * Does this work on multilingual content?
    * It seems to work just fine. Most LLMs can work with many languages and will automatically match the user's question language to content in the same language.
  * Can I adjust the prompts used?
    * Yes, you'll find them in the ''lang/en/*.prompt'' files. Refer to [[https://www.dokuwiki.org/localization#changing_some_localized_texts_and_strings_in_your_installation|Localizing Texts]] on how to override them.
  * Will the chat bot hallucinate?
    * Yes. Even though the prompt asks the model not to, it has a tendency to state things with confidence that are not entirely true. Remember this is not real intelligence or understanding. It is just clever text prediction. However as long as the question covers a topic that does have an answer in your wiki, the results are generally pretty good.
  * Will the bot answer questions that are unrelated to the wiki contents?
    * That might happen. No answers will be given if no fitting sources can be found at all -- check the ''similarityThreshold'' option. But often sources are found that are just slightly above the cut-off but do not answer the question. The bot will then fall back to its LLM roots and happily answer completely off-topic questions. So far attempts to avoid this with prompting have failed.
  * Can I use a different model? Or even run my own?
    * Models can be added relatively easily. Again, contact us and we can work something out.
  * The bot seems to never find any sources?
    * Check the ''similarityThreshold'' setting. You might need to lower it for some models.


===== Development =====

Developers can use this plugin in their own plugins. The helper component gives you access to an implementation of ''AbstractModel'', which provides a Client to interact with the current model provider. You also get access to the ''Embedding'' object, which lets you run similarity searches on the wiki content.

===== See also =====

Forum thread: [[https://forum.dokuwiki.org/d/21178-experimental-plugin-aichat-chat-with-your-wiki-pages-using-chatgpt|Experimental Plugin: AIChat - chat with your wiki pages using ChatGPT]]