Generic answers, polished delivery
The model paraphrases something close enough. Visitors who already know the topic spot the gap immediately. Trust leaks out before the conversation ends.
WordPress RAG · External Knowledge NEW
PressBot now bundles retrieval-augmented generation straight into the WordPress chatbot. Visitors ask plain questions, the bot calls your retriever, and answers come back grounded in transcripts, manuals, courses, policies — whatever corpus you already own. The plugin stays lean. Your retriever stays in charge. Source URLs can now flow through as clickable markdown links.
External Knowledge/RAG uses PressBot’s BYOK tool workflow today. The WordPress AI Client route in 1.7.0 is text-only while Core tool support matures.
Why WordPress needs RAG
Most WordPress chatbots reach for whatever the underlying model happened to memorise, then sound confident about it. Retrieval-augmented generation flips that — the bot looks up your corpus first, then answers. For teams whose best source material lives in transcripts, manuals, support docs, or research archives, RAG closes the gap.
Transcripts in a vault. Manuals in a doc site. Course archives behind a login. WordPress only ever saw a fraction of what visitors actually need answered.
When the bot hedges, the visitor opens a ticket. Or they leave. Either way, the answer was right there in your corpus — just not reachable from the chat.
Who needs WordPress RAG
If your real knowledge lives outside the WordPress database, RAG is the right answer. PressBot plugs into whichever retriever already owns your corpus — same chat widget your visitors already trust, now grounded in the material that actually answers their questions.
Connect manuals, troubleshooting libraries, policy decks, and how-to docs. Visitors get the actual answer instead of a deflection to the ticket form.
e.g. shipping rules · SLAs · product specs
Let learners ask plain-language questions against course transcripts, lessons, and reference material, without exposing the full archive publicly.
e.g. cohort lessons · module notes
Ground answers in research libraries, interview archives, document collections, and curated knowledge bases: the work you spent years assembling.
e.g. interviews · whitepapers · case files
How WordPress RAG works
PressBot does not ship a vector database, an embedding pipeline, or an ingestion queue. That is deliberate — RAG works cleanly when your service owns the corpus and PressBot owns the WordPress conversation. Four steps, no SDK.
Setup
Add an HTTPS retrieval URL, optional encrypted bearer token, default scope, result limit, label, and any custom instructions for the model.
POST https://retriever.example/search
Runtime
The public chatbot decides when to call the visitor-safe search_knowledge_corpus tool — or to keep talking from on-site context.
search_knowledge_corpus({ query, scope })
Retrieval
Your service returns ranked matches with source titles, URLs, timestamps, paths, or whatever metadata you publish in the response schema.
{ matches: [{ text, score, source }] }
Grounding
Snippets are normalised and handed to the model with instructions to cite the matches and avoid guessing when nothing scores high enough.
cited · grounded · honest
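The grounding step can be sketched in a few lines. This is an illustration under assumptions, not PressBot's internal code: the helper name build_grounding_context, the min_score default, and the instruction wording are all hypothetical. Only the matches, text, score, and source fields come from the response shape shown in the steps.

```python
from typing import Any


def build_grounding_context(response: dict[str, Any], min_score: float = 0.5) -> str:
    """Turn a retriever response into a grounding note for the model.

    Hypothetical helper: min_score and the wording are illustrative assumptions.
    """
    matches = response.get("matches") or []
    # Keep only matches that clear the threshold, best first.
    kept = sorted(
        (m for m in matches if float(m.get("score", 0.0)) >= min_score),
        key=lambda m: float(m.get("score", 0.0)),
        reverse=True,
    )
    if not kept:
        return "No corpus match scored high enough. Say so instead of guessing."

    lines = ["Answer only from the numbered sources below and cite them."]
    for i, m in enumerate(kept, start=1):
        source = m.get("source") or {}
        title = source.get("title", "Untitled source")
        # Collapse runs of whitespace so snippets arrive in a uniform shape.
        text = " ".join(str(m.get("text", "")).split())
        lines.append(f"[{i}] {title}: {text}")
    return "\n".join(lines)
```

When nothing clears the threshold, the model receives an explicit instruction to admit the miss, which is the behaviour the Grounding step describes.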
A real RAG exchange
A visitor asks a plain question. PressBot decides to call your retriever, surfaces the top matches, and answers with markdown links the visitor can click. Nothing is invented — this is what RAG looks like in production.
RAG safety & limits
External Knowledge inherits the same posture as the rest of PressBot — cap everything, log what matters, default to the safer fallback when something looks off. RAG that respects its own perimeter.
Optional bearer-token auth is stored encrypted in WordPress. Endpoint URLs are restricted to HTTP/HTTPS. No tokens leak into request logs.
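On the retriever side, the token check might look like the sketch below. It assumes the token arrives in a standard "Authorization: Bearer ..." header, which this page does not spell out, and the function name is_authorized is hypothetical.

```python
import hmac
from typing import Optional


def is_authorized(authorization_header: Optional[str], expected_token: str) -> bool:
    """Check an 'Authorization: Bearer <token>' header value.

    Assumption: the bearer token is sent in the standard Authorization header.
    """
    if not authorization_header:
        return False
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        return False
    # compare_digest runs in constant time, so timing does not reveal
    # how much of the token matched.
    return hmac.compare_digest(
        token.strip().encode("utf-8"), expected_token.encode("utf-8")
    )
```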
Queries, scopes, result count, response size, match text, source metadata, and URLs are all capped before the model ever sees them.
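As a rough picture of what capping means in practice, here is a minimal sketch. The limits and the name cap_matches are hypothetical; PressBot's actual caps are not published on this page.

```python
from typing import Any

# Hypothetical limits, chosen for illustration only.
MAX_MATCHES = 6
MAX_TEXT_CHARS = 1200
MAX_URL_CHARS = 2048


def cap_matches(matches: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Clamp result count, match text, and URLs before anything reaches the model."""
    capped = []
    for m in matches[:MAX_MATCHES]:
        source = dict(m.get("source") or {})
        url = str(source.get("url", ""))
        # Drop over-long or non-http(s) URLs instead of truncating them
        # into a different address.
        if len(url) > MAX_URL_CHARS or not url.startswith(("http://", "https://")):
            source.pop("url", None)
        capped.append(
            {
                "text": str(m.get("text", ""))[:MAX_TEXT_CHARS],
                "score": float(m.get("score", 0.0)),
                "source": source,
            }
        )
    return capped
```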
Public chat streams answers chunk-by-chunk while preserving accents, ideographs, and other non-English characters that retrievers often return.
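Chunked streaming can split a multi-byte character across two chunks, which is where accents and ideographs usually get mangled. One common way to avoid that is an incremental decoder, sketched here with Python's standard library. This illustrates the general technique and is not a description of PressBot's pipeline; decode_stream is a hypothetical name.

```python
import codecs
from typing import Iterable, Iterator


def decode_stream(chunks: Iterable[bytes]) -> Iterator[str]:
    """Decode UTF-8 byte chunks without breaking characters at chunk edges."""
    # The incremental decoder buffers a partial character until the
    # remaining bytes arrive in a later chunk.
    decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")
    for chunk in chunks:
        text = decoder.decode(chunk)
        if text:
            yield text
    # Flush whatever is left; a truncated trailing character becomes U+FFFD.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail
```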
Developer contract
PressBot does not prescribe how you index, embed, or rank. We send a query and a scope. You return ranked matches with whatever metadata you want surfaced.
{
  "query": "What does our refund policy say about digital downloads?",
  "limit": 6,
  "language": "en-US",
  "collection": "policies"
}
The collection field is the optional scope.
{
  "matches": [
    {
      "text": "Digital downloads are refundable within 14 days...",
      "score": 0.94,
      "source": {
        "title": "Refund Policy",
        "url": "https://example.com/legal/refunds",
        "path": "docs/refund-policy.md",
        "timestamp_start": 125,
        "timestamp_end": 210,
        "collection": "policies"
      }
    }
  ]
}
Your service
PressBot
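To make the contract concrete, here is a toy retriever handler that accepts the request body above and returns the response shape above. Everything behind the contract is an assumption for illustration: the in-memory CORPUS, the second corpus entry, the word-overlap scoring, and the name handle_search. A real service would put its own index and ranker in their place.

```python
import re
from typing import Any

# Toy in-memory corpus; a real service would query its own index.
CORPUS = [
    {
        "text": "Digital downloads are refundable within 14 days...",
        "source": {
            "title": "Refund Policy",
            "url": "https://example.com/legal/refunds",
            "collection": "policies",
        },
    },
    {
        "text": "Standard shipping takes 3 to 5 business days.",
        "source": {
            "title": "Shipping",
            "url": "https://example.com/shipping",
            "collection": "support",
        },
    },
]


def _words(text: str) -> set[str]:
    return set(re.findall(r"\w+", text.lower()))


def handle_search(payload: dict[str, Any]) -> dict[str, Any]:
    """Answer one POST /search body with ranked matches (naive word overlap)."""
    query_words = _words(str(payload.get("query", "")))
    limit = max(int(payload.get("limit", 6)), 0)
    collection = payload.get("collection")  # optional scope

    matches = []
    for doc in CORPUS:
        if collection and doc["source"].get("collection") != collection:
            continue
        overlap = len(query_words & _words(doc["text"]))
        if overlap == 0:
            continue
        # Score is the share of query words found in the document.
        score = round(overlap / len(query_words), 2)
        matches.append({"text": doc["text"], "score": score, "source": doc["source"]})

    matches.sort(key=lambda m: m["score"], reverse=True)
    return {"matches": matches[:limit]}
```

Wrap handle_search in whichever HTTP framework you already run; the contract only cares about the JSON going in and coming out.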
Common WordPress RAG questions
Short answers to what most teams ask before they connect a retriever to their WordPress chatbot.
A WordPress RAG chatbot uses retrieval-augmented generation — instead of answering from whatever the language model happened to memorise, the bot first looks up your own corpus (transcripts, manuals, docs, policies, anything) and grounds the reply in those matches.
PressBot is the WordPress chatbot layer. Your retrieval service is the corpus. The two talk over a simple POST /search contract, and the visitor sees citations they can click. That is RAG, applied to the WordPress chat widget you already trust.
Yes — any of them. PressBot only speaks the request/response shape shown above. Whatever sits behind your endpoint is your decision. Pinecone, Weaviate, Qdrant, Milvus, pgvector, Elastic, Algolia, your own homegrown ranker — all fine.
The contract is HTTPS + JSON. No SDK, no vendor lock-in.
PressBot tells the visitor it could not find a match in your corpus, and offers to keep the conversation going from on-site context. It will not invent a passage to fill the gap.
You can also tune a minimum score threshold in the endpoint settings if you want stricter behaviour.
Yes, whenever your response includes a source.url. PressBot renders the title as a link inline with the answer. If only a path or timestamp is provided, the citation still shows — just as text rather than a link.
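The link-or-text rule is easy to picture in code. This is a sketch of the described behaviour, not PressBot's renderer: format_citation is a hypothetical name, and the exact text layout of the non-link fallback is an assumption.

```python
from typing import Any


def format_citation(source: dict[str, Any]) -> str:
    """Render a source as a markdown link when a URL exists, else as plain text."""
    title = str(source.get("title") or source.get("path") or "Source")
    url = source.get("url")
    if url:
        # Escape brackets so the title cannot break the markdown link.
        safe_title = title.replace("[", "\\[").replace("]", "\\]")
        return f"[{safe_title}]({url})"

    # No URL: fall back to text built from whatever metadata is present.
    parts = [title]
    path = source.get("path")
    if path and path != title:
        parts.append(str(path))
    start = source.get("timestamp_start")
    if start is not None:
        end = source.get("timestamp_end")
        parts.append(str(start) if end is None else f"{start}-{end}")
    return ", ".join(parts)
```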
External Knowledge passes the visitor’s language preference along to your retriever (BCP-47 codes like es-ES or ja-JP), so you can filter or re-rank accordingly. The chat streaming pipeline is UTF-8 safe end-to-end — accents, ideographs, and emoji all survive.
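If you choose to re-rank by language inside your retriever, one simple approach is to prefer exact tag matches, then matches on the primary language, then everything else. The sketch below assumes each match carries a source.language field, which is hypothetical and not part of the contract shown on this page, and the function names are illustrative.

```python
from typing import Any, Optional


def language_rank(preferred: str, candidate: Optional[str]) -> int:
    """0 = exact tag match, 1 = same primary language, 2 = anything else."""
    if not candidate:
        return 2
    pref, cand = preferred.lower(), candidate.lower()
    if pref == cand:
        return 0
    # Compare the primary subtag, e.g. "es" in "es-ES".
    if pref.split("-")[0] == cand.split("-")[0]:
        return 1
    return 2


def rerank_by_language(matches: list[dict[str, Any]], preferred: str) -> list[dict[str, Any]]:
    """Sort by language closeness first, then by descending score."""
    return sorted(
        matches,
        key=lambda m: (
            # Assumption: the document language lives at source.language.
            language_rank(preferred, (m.get("source") or {}).get("language")),
            -float(m.get("score", 0.0)),
        ),
    )
```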
The RAG bridge (External Knowledge) ships with PressBot Pro. The free chatbot still answers from your WordPress content; Pro adds the retriever connector, the agent surface, and everything else listed on the Pro page.
Only inside the conversation transcript that the visitor already sees. We do not mirror your corpus, build a shadow index, or send the matches anywhere outside the model call required to ground the answer.
Related chatbot guides
Same plugin, different lens. Each page goes deeper on a specific reason teams are reaching for a WordPress chatbot in 2026.
How the visitor chat, BYOK pricing, and the admin agent fit together in one plugin.
02 Free WordPress Chatbot
What the free tier actually covers, where the upgrade lives, and why there are no per-message fees.
03 WordPress AI Chatbot
Grounded answers for visitors, typed tools for admins, one chat layer for both.
WordPress RAG · External Knowledge
Bring transcripts, manuals, support docs, or research libraries. Keep the WordPress plugin lean. Hand visitors retrieval-grounded answers, not confident guesses.