plugin:aichat
email : dokuwiki@cosmocode.de
type : action, helper, syntax, CLI
lastupdate : 2025-04-01
compatible : Jack Jackrum, Kaos
depends
conflicts
similar
tags : !experimental,
downloadurl:
sourcerepo : https://
donationurl:

screenshot_img : plugin:
[[https://

This plugin adds the ability to use a Large Language Model (LLM) often called

In the forum, you can read an [[https://
===== Installation =====

:!: This plugin is experimental. Implementation details may change and require manual intervention on future updates.

:!: The use of this plugin creates API costs. Be sure to be aware of the pricing details

:!: This plugin requires PHP 8.1 or higher. Installing it on systems with lower versions will "
Install the plugin using the [[plugin:

Once installed, continue with Configuration and Initial Setup.
===== Configuration =====

If you need help with installing, configuring and fine-tuning this plugin, feel free to [[https://

Use the [[config|Configuration Manager]] to adjust the settings described below.

==== 🧠 Models ====

Models are used to create embeddings (vectors describing your content), rephrasing
* [[https://
* [[https://
* [[https://
* [[https://
* [[https://
* [[https://
* [[https://

Alternatively, you can use [[#local models]] via Ollama.

You can use different models for embeddings, rephrasing and chat. When changing the embedding model, you need to rebuild the vector storage (see below). The rephrasing model is used to interpret a given question

An overview of prices, properties and a short description of each model can be seen when running

Which model is best for your use case depends on various factors and requires some experimentation. Contact CosmoCode
=== Local Models ===

The plugin is able to use [[https://ollama.com/|Ollama]] to run LLMs locally. You need to have the appropriate hardware to do so performantly. Please refer to their website to learn how to [[https://

Install the required models
==== 📥 Vector Storage ====

The plugin needs to store embedding vectors to do semantic similarity searches. There are multiple options available; you only need to configure one.

The default uses a local SQLite database. This does not need any additional configuration. However, the more wiki pages you have, the longer it takes to search through all available embeddings. To mitigate this, this storage mechanism uses a clustering approach, separating similar page chunks into clusters and then searching only the nearest cluster when looking for similar chunks.

Clusters are automatically created when the embedding index is built the first time. However, when your wiki grows a lot, topics change drastically or you add a new language, the initial clusters may no longer be a good fit. In that case you should run the ''
The other storage

* [[https://
* [[https://
* [[https://
==== Finetuning ====

The plugin offers

* ''chunkSize'' - The maximum size (in tokens) of chunks your wiki pages will be split into.
* ''similarityThreshold'' - A minimum percentage that documents need to match a given question. Different embedding models return different similarity distances. Increase this value if the bot uses too many irrelevant sources, lower the value if the bot finds too few relevant sources.
* ''contextChunks'' - the maximum number of chunks to use as context when answering questions. More chunks also mean a larger context and higher cost!
* ''rephraseHistory'' - how many question/
* ''chatHistory'' - how many question/
* ''customprompt'' - this is a custom prompt that is added to the built-in prompt. See the FAQ below to completely change the prompts.
==== Multi-Language Options ====

By default,

Using the ''

For the sqlite storage it is recommended to re-cluster the index when the regexes are changed by running the ''
==== Context Restrictions ====

The plugin renders pages into a simple text format to be used as context when answering questions. You can preview the rendered output that the plugin uses by using DokuWiki'

Sometimes pages may contain content that should not be used for AI Chat context, for example because it may be irrelevant to the page's topic. There are two ways to exclude this type of content from being used as context:

- You can configure a regular expression in the ''
- You can use the ''<

You can verify your exclusions using the export mechanism explained above. Restrictions will only apply on the next re-generation of embeddings.
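Conceptually, the exclusion works like a pre-processing pass over the rendered page text before it is chunked and embedded. In this sketch the tag name ''nochat'' is an assumed placeholder (the actual tag name is truncated on this page), and the pattern is only an example of the kind of regex you might configure:

```python
import re

# assumed placeholder tag -- substitute the tag/regex your wiki actually uses
EXCLUDE_PATTERN = r"<nochat>.*?</nochat>"

def strip_excluded(text, pattern=EXCLUDE_PATTERN):
    # remove every matched section before the page text is chunked
    # and embedded; re.DOTALL lets a match span multiple lines
    return re.sub(pattern, "", text, flags=re.DOTALL)
```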
==== General Options ====

All chat conversations can optionally be logged to DokuWiki'

Access to the chat functionality can be restricted to users and groups
===== Initial Setup =====

Once the plugin has been configured, the vector

Updates to the storage can be done by rerunning the command.

Depending on the size of your wiki this can take a while. Whenever you change

php bin/
php bin/

Shows a few statistics on the currently configured

php bin/

This runs maintenance on the embeddings store. This is currently only used in the SQLiteStorage as explained above.

===== Remote API =====

Some functionality of this plugin can also be accessed via the [[devel:
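As a sketch of what a remote call could look like from a script: DokuWiki exposes a JSON-RPC endpoint at ''lib/exe/jsonrpc.php''. The method name ''plugin.aichat.ask'' and the wiki URL below are assumptions for illustration only; check the Remote API reference of your installation for the actual method names and authentication requirements:

```python
import json
import urllib.request

# assumed values for illustration only
WIKI_URL = "https://wiki.example.com/lib/exe/jsonrpc.php"
METHOD = "plugin.aichat.ask"  # hypothetical method name

def build_request(question):
    # standard JSON-RPC 2.0 envelope
    payload = {"jsonrpc": "2.0", "method": METHOD, "params": [question], "id": 1}
    return urllib.request.Request(
        WIKI_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

# send with urllib.request.urlopen(build_request("...")) against a real
# wiki, adding whatever authentication your installation requires
```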
===== FAQ =====

* Does this work on multilingual content?
  * It seems to work just fine. Most LLMs can work with many languages and will automatically match the user's question language to content in the same language.
* Can I adjust the prompts used?
  * Yes, you'll find them in the ''
* Will the chat bot hallucinate?
  * Yes. Even though the prompt asks the model not to, it has a tendency to state things with confidence that are not entirely true. Remember this is not real intelligence or understanding. It is just clever text prediction. However, as long as the question covers a topic that does have an answer in your wiki, the results are generally pretty good.
* Will the bot answer questions that are unrelated to the wiki contents?
  * That might happen. No answers will be given if no fitting sources can be found at all -- check the ''
* Can I use a different model? Or even run my own?
  * Models can be added relatively easily. Again, contact us and we can work something out.
* The bot never seems to find any sources?
  * Check the ''
===== Development =====

Developers can use this plugin in their own plugins. Via the helper component you get access to an implementation of AbstractModel, which will give you a Client to interact with the current model provider. You also get access to the Embedding object, which lets you do similarity searches on the wiki content.

===== Early Access Features =====

Additional features are available as early access through our [[https://

==== Current Page Context ====

Sometimes you have already found the right page in the wiki, but need help with understanding it. Or you may want to have a generic example adjusted to your concrete use case. The chat bot can help with that. To do so, enable the "

==== Generic Model Provider ====

Many model providers use an API that is at least partly compatible with the OpenAI API. The new Generic provider allows you to configure such a provider. Since available models and their properties are unknown for such a provider, the plugin has been adjusted to work even without prior knowledge of context sizes.

==== Full Page Context ====

Some modern LLMs provide huge context windows. For those models it can make sense to send the whole page content of matching chunks as context, instead of only the chunk contents. The new option ''fullpagecontext'' allows you to enable this.

===== See also =====

Forum thread: [[https://
plugin/aichat.1710370208.txt.gz · Last modified: by andi