Use Chatbots to add AI chatbots to WordPress. Each chatbot has its own model, instructions, knowledge sources, tools, interface, popup settings, limits, connected apps, and rules.
Live chatbot demo: open the frontend chatbot example.
Deploy: Use popup, on-page, or external embed.
Model: Choose the provider, model, instructions, and memory.
Knowledge: Use vectors, page context, and training data.
Tools: Add file upload, web search, images, and voice.
Interface: Customize theme, popup, starters, consent, and labels.
Limits: Set usage limits and credit behavior.
Connected Apps: Send chatbot events to Slack, HubSpot, Notion, and more.
Rules: Trigger actions from chatbot events and conditions.
Open the General section to set the chatbot identity and model behavior.
Name: Internal chatbot name shown in the Chatbots screen.
Engine: AI provider for this chatbot.
Model: Model used by this chatbot.
Instructions: System instructions for the chatbot. Use [date] when the bot needs today’s date in its instructions.
Temperature: Response variation. Lower values are more predictable.
Context: Maximum completion tokens for the model response.
Messages: Number of previous conversation messages included as history.
Session memory: Only applies when the chatbot uses OpenAI. AI Puffer stores the OpenAI response ID and sends it with the next message so OpenAI can continue the same conversation state.
Reasoning: Reasoning effort for supported models. Keep it set to None for faster responses; higher values can make replies slower.
Session memory applies only to OpenAI chatbots. When enabled, conversation continuity depends on OpenAI response IDs, so review this setting before using it for privacy-sensitive chatbots.
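As a rough sketch of this mechanism (hypothetical names, not AI Puffer’s actual code): the plugin keeps the last OpenAI response ID per session and sends it with the next request, which is how the Responses API continues server-side conversation state.

```python
# Illustrative sketch of response-ID session memory. The class, store,
# and parameter shapes are assumptions for explanation only; the
# previous_response_id field itself is part of OpenAI's Responses API.

class SessionMemory:
    """Keeps the last OpenAI response ID per chat session."""

    def __init__(self):
        self._last_response_id = {}  # session_id -> response ID

    def request_params(self, session_id, message):
        """Build request parameters, continuing any stored conversation state."""
        params = {"input": message}
        previous = self._last_response_id.get(session_id)
        if previous:
            # Sending the previous response ID lets OpenAI continue
            # the same conversation without resending the history.
            params["previous_response_id"] = previous
        return params

    def record(self, session_id, response_id):
        """Store the response ID returned by OpenAI for the next turn."""
        self._last_response_id[session_id] = response_id
```

The first message of a session carries no `previous_response_id`; every later message does.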
Knowledge controls the information a chatbot can use before it answers. It can read the current page content, search trained vector data, or use both, depending on the chatbot setup.
Vector is the better option when you want to train the chatbot with your own content. AI Puffer converts your content into searchable chunks. When a visitor asks a question, the chatbot retrieves the closest matching chunks and uses them as context for the answer.
Pinecone stores vectors in an index. AI Puffer creates those vectors with the embedding model you choose. The Pinecone index dimension must match the embedding model. For example, if your index is 3072 dimensions, use a 3072-dimension embedding model. Use the same embedding model when adding data and when enabling Pinecone in the chatbot.
If the Pinecone index dimension does not match the embedding model, search can fail or return poor context.
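The dimension rule can be expressed as a simple check. The expected values come from the embedding model table later on this page; the function name is illustrative, not part of AI Puffer.

```python
# Sketch: guard against an index whose dimension does not match the
# embedding model. Dimensions below are taken from this page's table.
EMBEDDING_DIMENSIONS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "text-embedding-ada-002": 1536,
}

def check_index_compatibility(index_dimension: int, embedding_model: str) -> bool:
    """Return True when the index dimension matches the embedding model."""
    expected = EMBEDDING_DIMENSIONS.get(embedding_model)
    if expected is None:
        raise ValueError(f"Unknown embedding model: {embedding_model}")
    return index_dimension == expected
```

The same check applies to Qdrant collection sizes.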
To create an index:
Add your Pinecone credentials in AI Puffer > Settings > Integrations.
Go to AI Puffer > Knowledge Base.
Select Pinecone as the provider.
Select the embedding model you want to use.
Click Create new index.
Enter an index name and use the dimension for the selected embedding model.
Create the index, then add training data with the same embedding model.
To use it in a chatbot:
Go to AI Puffer > Chatbots and select the chatbot.
In Knowledge, select Vector as a data source.
Set Vector provider to Pinecone.
Select the Pinecone index.
Select the same embedding provider and model used when you added the data.
Qdrant stores vectors in collections. AI Puffer creates those vectors with the embedding model you choose. The Qdrant collection size must match the embedding model. For example, if your collection is 3072 dimensions, use a 3072-dimension embedding model. Use the same embedding model when adding data and when enabling Qdrant in the chatbot.
If the Qdrant collection size does not match the embedding model, search can fail or return poor context.
To create a collection:
Add your Qdrant URL and API key in AI Puffer > Settings > Integrations.
Go to AI Puffer > Knowledge Base.
Select Qdrant as the provider.
Select the embedding model you want to use.
Click Create new collection.
Enter a collection name and use the dimension for the selected embedding model.
Create the collection, then add training data with the same embedding model.
To use it in a chatbot:
Go to AI Puffer > Chatbots and select the chatbot.
In Knowledge, select Vector as a data source.
Set Vector provider to Qdrant.
Select one or more collections.
Select the same embedding provider and model used when you added the data.
If answers miss useful context, increase Limit or lower Threshold. If answers include unrelated context, lower Limit or raise Threshold. After testing, open Usage > Logs and check the Score badge in the conversation details.
Start with a small limit and a moderate threshold. Then use the Score badge in logs to tune the chatbot with real questions.
Scores show which vector results matched the visitor question and help you tune the threshold.
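Conceptually, Limit and Threshold behave like a top-k cosine-similarity filter: every chunk is scored against the question, scores below the threshold are dropped, and the best `limit` chunks become context. This is an illustrative sketch, not AI Puffer’s internal retrieval code:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(question_vec, chunks, limit=3, threshold=0.5):
    """Score every (text, vector) chunk, drop scores below the
    threshold, and keep the `limit` highest-scoring chunks."""
    scored = [(cosine_similarity(question_vec, vec), text) for text, vec in chunks]
    scored = [(score, text) for score, text in scored if score >= threshold]
    scored.sort(reverse=True)
    return scored[:limit]
```

Raising `threshold` removes weak matches; raising `limit` admits more of the remaining matches, which maps directly to the tuning advice above.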
Pinecone and Qdrant need an embedding model because AI Puffer must turn your content and the visitor question into vectors before it can search. OpenAI Vector Stores do not need this setting in AI Puffer; OpenAI handles the vector store search on its side.
For Pinecone and Qdrant, use the same embedding model when adding training data and when enabling the chatbot vector provider.
OpenAI
text-embedding-3-small: 1536
text-embedding-3-large: 3072
text-embedding-ada-002: 1536
Google
gemini-embedding-2-preview: 3072
gemini-embedding-001: 3072
models/text-embedding-004: 768
OpenRouter
baai/bge-base-en-v1.5: 768
baai/bge-large-en-v1.5: 1024
baai/bge-m3: 1024
google/gemini-embedding-001: 3072
google/gemini-embedding-2-preview: 3072 by default. Supports 128-3072.
intfloat/e5-base-v2: 768
intfloat/e5-large-v2: 1024
intfloat/multilingual-e5-large: 1024
mistralai/mistral-embed-2312: 1024
nvidia/llama-nemotron-embed-vl-1b-v2:free: 2048
openai/text-embedding-3-large: 3072
openai/text-embedding-3-small: 1536
openai/text-embedding-ada-002: 1536
perplexity/pplx-embed-v1-0.6b: 1024
perplexity/pplx-embed-v1-4b: 2560
qwen/qwen3-embedding-4b: 2560
qwen/qwen3-embedding-8b: 4096
sentence-transformers/all-minilm-l12-v2: 384
sentence-transformers/all-minilm-l6-v2: 384
sentence-transformers/all-mpnet-base-v2: 768
sentence-transformers/multi-qa-mpnet-base-dot-v1: 768
sentence-transformers/paraphrase-minilm-l6-v2: 384
thenlper/gte-base: 768
thenlper/gte-large: 1024
Ollama
nomic-embed-text-v2-moe: 768
qwen3-embedding / qwen3-embedding:8b: 4096
qwen3-embedding:4b: 2560
qwen3-embedding:0.6b: 1024
embeddinggemma: 768
nomic-embed-text: 768
mxbai-embed-large: 1024
bge-m3: 1024
snowflake-arctic-embed / snowflake-arctic-embed:l: 1024
snowflake-arctic-embed:m: 768
snowflake-arctic-embed:m-long: 768
snowflake-arctic-embed:s: 384
snowflake-arctic-embed:xs: 384
all-minilm / all-minilm:l6: 384
all-minilm:l12: 384
paraphrase-multilingual: 768
snowflake-arctic-embed2: 1024
granite-embedding:30m: 384
granite-embedding / granite-embedding:278m: 768
bge-large: 1024
Azure
Azure OpenAI embedding deployments are synced from your Azure resource. Use the dimension of the model behind the deployment.
Adds the current page or post content to the chatbot context. When this feature is enabled, the chatbot will use the current page’s content as part of its contextual understanding. If the page has an excerpt, that excerpt will be used directly as the bot’s context. If no excerpt is available, the plugin will automatically generate a short summary of the page content and feed that to the bot instead. This is ideal for creating page-specific chatbots.
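The excerpt-or-summary fallback can be sketched as follows. AI Puffer generates the summary automatically; this sketch substitutes a plain word-count truncation so the example stays self-contained, and the function name is illustrative.

```python
def page_context(excerpt: str, content: str, summary_words: int = 60) -> str:
    """Use the page excerpt when one exists; otherwise fall back to a
    short truncation of the content as a stand-in for the generated summary."""
    if excerpt.strip():
        return excerpt.strip()
    words = content.split()
    return " ".join(words[:summary_words])
```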
Tools are optional capabilities the chatbot can use during a conversation. They let visitors upload files, search the web, attach images, generate images, or use voice features when those options are enabled. Enable only the tools you want your chatbot to offer.
File upload lets visitors attach a document to the current chat. AI Puffer reads the file, prepares it for the selected vector provider, and uses the matching file content as context for the visitor’s next messages. File upload uses the chatbot’s Vector provider setting: OpenAI, Pinecone, Qdrant, or Claude Files when the chatbot provider is Claude. Visitors can upload .txt and .pdf files. The frontend limit is 20 MB, but your WordPress or server upload limit can be lower.
Text-based PDFs work best. Scanned PDFs may not provide usable text unless they contain OCR text.
To enable file upload:
Go to AI Puffer > Chatbots and select the chatbot.
In Knowledge, select Vector as a data source.
Select the Vector provider.
Configure vector provider.
Open Tools.
Add File upload to Enabled tools.
Save the chatbot.
Test the chatbot on the frontend and upload a .txt or .pdf file.
Web Search lets a chatbot use online sources while answering. When enabled, the chatbot input shows a web/search toggle on the frontend. If the frontend toggle is off, the chatbot answers without web search even when the tool is enabled in the admin. Web Search is available for OpenAI, Google, Claude, and OpenRouter models that support web search.
Web search has two controls: the admin tool setting enables the feature, and the frontend toggle decides whether a specific visitor message uses it.
To enable web search:
Go to AI Puffer > Chatbots and select the chatbot.
In General, select OpenAI, Google, Claude, or OpenRouter as the engine.
In Tools, add Web search to Enabled tools.
Set Web search to Yes.
Click Options.
Configure the provider settings.
Save the chatbot.
Test the chatbot on the frontend.
These options appear for every web search provider:
Web toggle default on: Starts the frontend web/search toggle enabled.
Show sources: Shows source links under replies when the provider returns them.
Sources label: Changes the label shown above source links.
Searching web text: Changes the temporary status text shown while a web search is running.
Provider-specific options are shown based on the chatbot engine.
OpenAI:
Search context size: Controls how much web-search context OpenAI can use.
User location: Sends approximate location only when local results matter.
Google:
Mode: Controls Gemini grounding mode.
Retrieval threshold: Applies only to Dynamic mode. Lower values search more often.
Claude:
Max uses: Limits web searches per answer.
User location: Sends approximate country, city, region, or timezone.
Allowed domains: Restricts web search to specific domains.
Blocked domains: Excludes specific domains.
Cache TTL: Sets the optional Claude cache time: None, 5 minutes, or 1 hour.
OpenRouter:
Engine: Chooses Auto, Native, or Exa search when supported.
Image analysis lets visitors attach an image to a chat message. The image is sent with the next message to the chatbot’s selected model. Image analysis uses the chatbot’s selected provider and model.
Image analysis is model-dependent. If the option is missing or replies fail, choose a vision-capable model and sync models again.
OpenAI: Yes. Uses the selected OpenAI chat model.
Claude: Yes. Uses the selected Claude chat model.
OpenRouter: Model-dependent. The selected model must support image input. Sync models if the option does not appear.
Ollama: Model-dependent. The selected local model must be vision-capable. Sync Ollama models first.
Google: No. Chatbot image analysis is not exposed for Google.
Azure: No. Chatbot image analysis is not exposed for Azure.
DeepSeek: No. Chatbot image analysis is not exposed for DeepSeek.
Visitors can upload .jpg, .jpeg, .png, and .webp files. AI Puffer accepts one image per message, with a 20 MB frontend limit. To enable image analysis:
Go to AI Puffer > Chatbots and select the chatbot.
In General, select a supported provider and model.
Open Tools.
Add Image analysis to Enabled tools.
Set Image analysis to Yes.
Save the chatbot.
Test the chatbot on the frontend.
On the frontend, visitors use the image upload button, select an image, type a question, and send the message. If File Upload and Image Analysis are both enabled, the attachment button opens a small menu.
Image generation lets visitors create a new image from a chat command. The visitor types a trigger followed by a prompt, for example:
/image a clean product photo on a white background
Each command returns one image. Image generation uses a separate image model. It does not use the chatbot’s answer model.
Image generation uses the image model selected in Tools, not the chatbot model selected in General.
OpenAI: Yes. Built-in GPT Image models.
Google: Yes. Synced Google image models.
Azure: Yes. Synced Azure image deployments.
OpenRouter: Model-dependent. Synced OpenRouter models with image output.
Replicate: Yes. Add the API key under Settings > Integrations > Replicate, then sync models.
The chatbot image generation model list includes those providers only. If a model is missing, configure the provider and sync models in AI Providers. To enable image generation:
Go to AI Puffer > Chatbots and select the chatbot.
Open Tools.
Add Image generation to Enabled tools.
Select the image model.
Set the image triggers.
Save the chatbot.
Test the command on the frontend.
Triggers are comma-separated and must start with /. Examples:
/image, /generate, /draw
When a visitor uses an image trigger, AI Puffer extracts the prompt after the trigger and sends it to the selected image model. The original command is saved in the conversation log, and the generated image reply is shown in the chat.
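Trigger handling can be sketched like this (illustrative only; AI Puffer’s actual parsing may differ):

```python
TRIGGERS = ("/image", "/generate", "/draw")  # comma-separated in the admin UI

def extract_image_prompt(message: str, triggers=TRIGGERS):
    """Return the prompt after a matching trigger, or None for a normal
    chat message that should go to the answer model instead."""
    text = message.strip()
    for trigger in triggers:
        if text.lower().startswith(trigger + " "):
            # Everything after the trigger becomes the image prompt.
            return text[len(trigger):].strip()
    return None
```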
Speech to Text adds a microphone button to the chatbot input. When a visitor clicks the microphone, AI Puffer records the audio in the browser, uploads it to WordPress, sends it to the speech-to-text provider, then submits the transcript as the user message. Speech to Text is currently available for OpenAI.
Go to AI Puffer > Chatbots and select the chatbot.
Open Tools.
Add Speech to Text to Enabled tools.
Set Speech to Text to Yes.
Select the model if the model selector is shown.
Save the chatbot and test the microphone on the frontend.
Speech to Text needs browser microphone permission and HTTPS. Localhost can be used for testing without HTTPS. Recorded audio uploads are limited to 4 MB by default.
Text to Speech adds playback controls for assistant replies. When a visitor clicks the play button, AI Puffer sends the assistant reply text to the selected text-to-speech provider and plays the returned audio in the browser. Text to Speech is available for Google, OpenAI, and ElevenLabs.
Realtime Voice is separate from Speech to Text and Text to Speech. It creates a live session with OpenAI Realtime, streams microphone audio, receives spoken replies, and logs completed turns. Realtime Voice is currently available for OpenAI.
Go to AI Puffer > Chatbots and select the chatbot.
Open Tools.
Add Realtime Voice to Enabled tools.
Set Realtime voice agent to Yes.
Select the realtime model.
Select the voice.
Choose turn detection.
Save the chatbot and test voice mode on the frontend.
Realtime options:
Model: OpenAI Realtime model used for the voice session.
Voice: Voice used for spoken replies.
Direct voice mode: Popup-only. The popup launcher starts and stops the voice session directly, and the in-chat realtime button is hidden.
Noise reduction: Applies input audio noise reduction before the model receives the microphone stream.
Audio format: Sets the input and output audio format. Available values are pcm16, g711_ulaw, and g711_alaw.
Response speed: Controls spoken reply speed from 0.25 to 1.5.
Turn detection options:
None: No server voice activity detection. Use this for push-to-talk style sessions.
Automatic: Uses server voice activity detection. This is the default.
Smart: Uses semantic voice activity detection, so the model can wait for a more complete thought before replying.
Realtime sessions create chat log entries for the user transcript and assistant transcript when a turn completes. If OpenAI returns usage data, AI Puffer records the token usage against the chatbot.
Realtime Voice uses microphone access and an OpenAI Realtime model. Test it on HTTPS and review costs in your OpenAI account.
Choose Light, Dark, ChatGPT, or one of the custom color presets from the Theme menu. To build your own theme, select Custom and click Edit. The custom editor lets you change the main colors, bubble radius, font, inline width, popup width, chat height, and advanced colors for messages, header, footer, input area, buttons, and sidebar. Use Reset to return the custom theme fields to their defaults.
Conversation starters are quick prompts shown inside the chatbot before the first message. Use them to help visitors begin with common questions. To enable and customize starters:
Go to AI Puffer > Chatbots and select the chatbot.
Consent shows a notice before the conversation starts. The chatbot input stays disabled until the visitor clicks the consent button. To enable and customize the consent notice:
Go to AI Puffer > Chatbots and select the chatbot.
Chatbots use both global security settings and chatbot-level controls. Open AI Puffer > Settings > Security.
IP Anonymization: Stores anonymized IP addresses in logs.
Banned Words: Blocks messages that contain configured words or phrases.
Banned IPs: Blocks messages from configured IP addresses.
OpenAI Moderation: Uses OpenAI moderation for OpenAI chatbots.
Set custom block messages for banned words, banned IPs, and OpenAI moderation if you want visitors to see specific text.
Chatbot logs include conversation messages, usage data, feedback, and metadata needed for history and reporting. Enable IP anonymization if you do not want full IP addresses stored in logs.
Use IP anonymization when you need usage logs without storing full visitor IP addresses.
Limits control how much a visitor can use this chatbot before AI Puffer stops new messages. AI Puffer checks the visitor’s quota before a chat request starts and records usage after the response. Guests are tracked by session. Logged-in users are tracked on their WordPress account. Leave a quota empty for unlimited usage, or set it to 0 to block that group.
Quota mode: Use the same quota for all logged-in users or define role-based quotas.
Guest quota: Usage quota for visitors who are not logged in. Empty means unlimited. 0 blocks guests.
User quota: Usage quota for logged-in users when using general quota mode. Empty means unlimited. 0 blocks logged-in users.
Role-based quotas: Usage quota per WordPress role. Empty means unlimited for that role.
Reset period: Never, daily, weekly, or monthly.
Quota reached message: Message shown when the visitor reaches the quota.
Primary button: Optional button shown after the quota message.
Secondary button: Optional second button shown after the quota message.
Quota buttons can link to customer dashboard usage, credits, purchases, the buy credits page, a custom URL, or no button.
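The quota semantics above (an empty field means unlimited, 0 blocks the group, any other value caps usage) reduce to a small check, sketched here with hypothetical names:

```python
def quota_allows(quota, used: int) -> bool:
    """Decide whether a visitor may send another message.

    quota is None when the admin left the field empty (unlimited),
    0 when the group is blocked, and a positive cap otherwise.
    """
    if quota is None:
        return True            # empty field: unlimited usage
    if quota == 0:
        return False           # explicit zero: group is blocked
    return used < quota        # otherwise allow while under the cap
```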
For credit-based chatbot access, define pricing rules in Usage. To sell prepaid credits, create WooCommerce credit packages in Usage.
Bot token: open OAuth & Permissions, scroll to Bot Token Scopes, click Add an OAuth Scope, add chat:write, click Install to Workspace, then copy the Bot User OAuth Token that starts with xoxb-.
Incoming webhook: open Incoming Webhooks, turn on Activate Incoming Webhooks, click Add New Webhook to Workspace, select a channel, then copy the webhook URL.
If you use a bot token, invite the Slack app to the target channel.
In WordPress, open AI Puffer > Settings > Apps.
Create a Slack connection.
Select Token or Webhook.
For Token, enter the Bot Token and Default Channel.
For Webhook, enter the Webhook URL.
Save the connection.
For token connections, click Test Connection. AI Puffer posts a temporary Slack message and removes it after the channel is verified.
Create a recipe, choose a chatbot event, choose Slack Message, map the fields, and enable it.
Scope the recipe to all chatbots or one chatbot.
Test from the frontend chatbot.
Webhook connections are tested at delivery time. If a token test fails, check that the bot is in the default channel.
Adds content to the system instruction or conversation history.
Block message: Stops the user message from reaching the AI.
Call webhook: Sends an HTTP request to an external URL.
Set variable: Stores a variable in user meta or bot context.
Display form: Shows a form inside the chatbot.
Store form submission: Saves submitted form data to the chatbot log.
Webhook and message fields support placeholders. Form placeholders include submitted data, display values, labels, and individual submitted fields.
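Placeholder substitution of this kind can be sketched as below. The `[message]` token and the bracket-style regex are illustrative assumptions, not AI Puffer’s documented placeholder list:

```python
import re

def fill_placeholders(template: str, values: dict) -> str:
    """Replace [placeholder] tokens with values from the dict.
    Unknown placeholders are left untouched rather than dropped."""
    def repl(match):
        key = match.group(1)
        return str(values.get(key, match.group(0)))
    return re.sub(r"\[([a-z_]+)\]", repl, template)
```

Leaving unknown tokens intact makes misconfigured recipes easy to spot in the webhook payload or message output.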
Webhook actions send data outside WordPress. Use trusted webhook URLs and avoid sending sensitive conversation data unless the destination is meant to receive it.