plugin:aichat
email : dokuwiki@cosmocode.de
type : action, helper, syntax, CLI
lastupdate : 2025-04-01
compatible : Jack Jackrum, Kaos
depends
conflicts
similar
tags : !experimental,
downloadurl:
sourcerepo : https://
donationurl:

screenshot_img : plugin:
[[https://

This plugin adds the ability to use a Large Language Model (LLM) often called

In the forum, you can read an [[https://
===== Installation =====

:!: This plugin is experimental. Implementation details may change and require manual intervention on future updates.

:!: The use of this plugin creates API costs. Be sure to be aware of the pricing details

:!: This plugin requires PHP 8.1 or higher. Installing it on systems with lower versions will "
Install the plugin using the [[plugin:

Once installed, continue with Configuration and Initial Setup.
===== Configuration =====

If you need help with installing, configuring and fine-tuning this plugin, feel free to [[https://

Use the [[config|Configuration Manager]] to adjust the settings described below.

==== 🧠 Models ====

Models are used to create embeddings (vectors describing your content), rephrasing
* [[https://
* [[https://
* [[https://
* [[https://
* [[https://
* [[https://
* [[https://

Alternatively, you can use [[#local models]] via Ollama.

You can use different models for embeddings, rephrasing and chat. When changing the embedding model, you need to rebuild the vector storage (see below). The rephrasing model is used to interpret a given question

An overview of prices, properties and a short description of each model can be seen when running

Which model is best for your use case depends on various factors and requires some experimentation. Contact CosmoCode
=== Local Models ===

The plugin is able to use [[https://ollama.com/|Ollama]] to run LLMs locally. You need to have the appropriate hardware to do so performantly. Please refer to their website to learn how to [[https://

Install the required models
==== 📥 Vector Storage ====

The plugin needs to store embedding vectors to do semantic similarity searches. There are multiple options available; you only need to configure one.

The default uses a local SQLite database. This does not need any additional configuration. However, the more wiki pages you have, the longer it takes to search through all available embeddings. To mitigate this, this storage mechanism uses a clustering approach, separating similar page chunks into clusters and then searching only the nearest cluster when looking for similar chunks.

Clusters are automatically created when the embedding index is built the first time. However, when your wiki grows a lot, topics change drastically or you add a new language, the initial clusters may no longer be a good fit. In that case you should run the ''
The other storage

* [[https://
* [[https://
* [[https://
==== Finetuning ====

The plugin offers

* ''chunkSize'' - The maximum size (in tokens) of chunks your wiki pages will be split into.
* ''similarityThreshold'' - A minimum percentage that documents need to match a given question. Different embedding models return different similarity distances. Increase this value if the bot uses too many irrelevant sources, lower the value if the bot finds too few relevant sources.
* ''contextChunks'' - the maximum number of chunks to use as context when answering questions. More chunks also mean a larger context and higher cost!
* ''rephraseHistory'' - how many question/
* ''chatHistory'' - how many question/
* ''customprompt'' - this is a custom prompt that is added to the built-in prompt. See the FAQ below to completely change the prompts.
==== Multi-Language Options ====

By default,

Using the ''

For the sqlite storage it is recommended to re-cluster the index when the regexes are changed by running the ''
==== Context Restrictions ====

The plugin renders pages into a simple text format to be used as context when answering questions. You can preview the rendered output that the plugin uses by using DokuWiki'

Sometimes pages may contain content that should not be used for AI Chat context, for example because it may be irrelevant to the page's topic. There are two ways to exclude this type of content from being used as context:

- You can configure a regular expression in the ''
- You can use the ''<

You can verify your exclusions using the export mechanism explained above. Restrictions will only apply on the next re-generation of embeddings.
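Conceptually, the exclusion works like a pre-processing pass over the rendered page text before it is chunked and embedded. In this sketch the tag name ''nochat'' is an assumed placeholder (the actual tag name is truncated on this page), and the pattern is only an example of the kind of regex you might configure:

```python
import re

# assumed placeholder tag -- substitute the tag/regex your wiki actually uses
EXCLUDE_PATTERN = r"<nochat>.*?</nochat>"

def strip_excluded(text, pattern=EXCLUDE_PATTERN):
    # remove every matched section before the page text is chunked
    # and embedded; re.DOTALL lets a match span multiple lines
    return re.sub(pattern, "", text, flags=re.DOTALL)
```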
==== General Options ====

All chat conversations can optionally be logged to DokuWiki'

Access to the chat functionality can be restricted to users and groups
===== Initial Setup =====

Once the plugin has been configured, the vector

Updates to the storage can be done by rerunning the command.

Depending on the size of your wiki this can take a while. Whenever you change

php bin/
php bin/

Shows a few statistics on the currently configured

php bin/

This runs maintenance on the embeddings store. This is currently only used in the SQLiteStorage as explained above.

===== Remote API =====

Some functionality of this plugin can also be accessed via the [[devel:
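As a sketch of what a remote call could look like from a script: DokuWiki exposes a JSON-RPC endpoint at ''lib/exe/jsonrpc.php''. The method name ''plugin.aichat.ask'' and the wiki URL below are assumptions for illustration only; check the Remote API reference of your installation for the actual method names and authentication requirements:

```python
import json
import urllib.request

# assumed values for illustration only
WIKI_URL = "https://wiki.example.com/lib/exe/jsonrpc.php"
METHOD = "plugin.aichat.ask"  # hypothetical method name

def build_request(question):
    # standard JSON-RPC 2.0 envelope
    payload = {"jsonrpc": "2.0", "method": METHOD, "params": [question], "id": 1}
    return urllib.request.Request(
        WIKI_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

# send with urllib.request.urlopen(build_request("...")) against a real
# wiki, adding whatever authentication your installation requires
```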
===== FAQ =====

* Does this work on multilingual content?
  * It seems to work just fine. Most LLMs can work with many languages and will automatically match the user's question language to content in the same language.
* Can I adjust the prompts used?
  * Yes, you'll find them in the ''
* Will the chat bot hallucinate?
  * Yes. Even though the prompt asks the model not to, it has a tendency to state things with confidence that are not entirely true. Remember this is not real intelligence or understanding. It is just clever text prediction. However, as long as the question covers a topic that does have an answer in your wiki, the results are generally pretty good.
* Will the bot answer questions that are unrelated to the wiki contents?
  * That might happen. No answers will be given if no fitting sources can be found at all -- check the ''
* Can I use a different model? Or even run my own?
  * Models can be added relatively easily. Again, contact us and we can work something out.
* The bot never seems to find any sources?
  * Check the ''
===== Development =====

Developers can use this plugin in their own plugins. Via the helper component you get access to an implementation of AbstractModel, which will give you a Client to interact with the current model provider. You also get access to the Embedding object, which lets you do similarity searches on the wiki content.

===== Early Access Features =====

Additional features are available as early access through our [[https://

==== Current Page Context ====

Sometimes you have already found the right page in the wiki, but need help with understanding it. Or you may want to have a generic example adjusted to your concrete use case. The chat bot can help with that. To do so, enable the "

==== Generic Model Provider ====

Many model providers use an API that is at least partly compatible with the OpenAI API. The new Generic provider allows you to configure such a provider. Since available models and their properties are unknown for such a provider, the plugin has been adjusted to work even without prior knowledge of context sizes.

==== Full Page Context ====

Some modern LLMs provide huge context windows. For those models it can make sense to send the whole page content of matching chunks as context, instead of only the chunk contents. The new option ''fullpagecontext'' allows you to enable this.

===== See also =====

Forum thread: [[https://
plugin/aichat.1710370208.txt.gz · Last modified: by andi