Rapidly build RAG-powered internal tools

Embed documents or text and ask questions with a single API.

In minutes, you can build an internal tool for searching documents, or a customer-facing app for answering questions.

What is RAG? When you ask an LLM a question, you're hoping that it has seen the answer somewhere in its training data, or has enough context to generate a reasonable answer. But what if the data it needs is private, like some internal company documentation? Do you train a model from scratch on your data? Do you create a whole training set to fine-tune your model?

There's a simpler way to feed the data into your model. You can add your documents to a database, and for each query, retrieve the most relevant chunks of text from your database and give them to your model, along with your question. This is called retrieval-augmented generation. You're just giving the LLM the context it needs.
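The retrieval step can be sketched in a few lines. This is a toy illustration (not RagHost's internals): it scores stored chunks against a query by word overlap, where a real system would use vector embeddings, then builds a prompt from the best matches.

```python
def score(chunk: str, query: str) -> int:
    """Count how many query words appear in the chunk (toy similarity)."""
    chunk_words = set(chunk.lower().split())
    return sum(1 for w in query.lower().split() if w in chunk_words)

def retrieve(chunks: list[str], query: str, k: int = 2) -> list[str]:
    """Return the k most relevant chunks for the query."""
    return sorted(chunks, key=lambda c: score(c, query), reverse=True)[:k]

def build_prompt(chunks: list[str], query: str) -> str:
    """Combine the retrieved context and the user's question into one prompt."""
    context = "\n---\n".join(retrieve(chunks, query))
    return f"Answer using this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Revenue grew 51% year over year in 2022.",
    "The cafeteria menu rotates weekly.",
    "Gross margin declined slightly in Q4.",
]
print(build_prompt(docs, "What was revenue growth in 2022?"))
```

The LLM never needs to have seen your documents at training time; it only needs the right chunks in its context window at question time.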

But what if I'm just building an internal tool and don't want to spin up a whole database and deal with document parsing, chunking, and vector embeddings?

This is where RagHost can help! We provide a simple API that allows you to upload your documents and query them. We do all the work for you, so you can focus on building your app.

Embed a File

$ curl -F 'file=@./tesla-q10.pdf' https://raghost.ai/api/embed

Ask a Question

$ curl -G https://raghost.ai/api/ask --data-urlencode 'q=What was revenue growth for 2022?'

Configurable Embedding

In retrieval-augmented generation, how you chunk your documents for embedding can make a big difference to the quality of answers.

When embedding a PDF or text with the RagHost API's /embed endpoint, you can choose your chunk size and chunk overlap.
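A minimal sketch of what fixed-size chunking with overlap does (the parameter names `chunk_size` and `overlap` are illustrative, not the API's field names). Overlap keeps sentences that straddle a chunk boundary retrievable from both sides; production chunkers often split on tokens or sentences rather than characters.

```python
def chunk_text(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Split text into chunks of up to chunk_size characters, where each
    chunk repeats the last `overlap` characters of the previous one."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

print(chunk_text("abcdefghij", chunk_size=4, overlap=2))
# → ['abcd', 'cdef', 'efgh', 'ghij', 'ij']
```

Smaller chunks give more precise matches; larger chunks preserve more surrounding context. The right trade-off depends on your documents, which is why both knobs are exposed.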

Built-In Streaming

Don't leave your users waiting for an answer. Our /ask endpoint streams LLM responses.
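Streaming means the client can render tokens as they arrive instead of blocking on the full answer. The generator below is a stand-in for the streamed response body; a real client would iterate over HTTP chunks from the /ask endpoint instead.

```python
from typing import Iterator

def fake_stream(answer: str) -> Iterator[str]:
    """Yield an answer one token at a time, like a streaming endpoint."""
    for token in answer.split():
        yield token + " "

received = []
for token in fake_stream("Revenue grew 51% in 2022."):
    received.append(token)  # render each token as soon as it arrives

print("".join(received).strip())
```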

Okay... why should I use RagHost and not the OpenAI Assistants API?

Use Non-OpenAI Models

Our current default model is OpenAI's GPT-3.5 Turbo, but we will be adding more models very soon, starting with Anthropic's Claude 2. Stay tuned!

Configure Chunking

The right chunking strategy can make or break your retrieval quality. We let you specify both chunk size and chunk overlap, and we're adding more settings in the future.

Spend Less Money

OpenAI charges for storage by the GB, per assistant, per day. It adds up quickly. We're working on fair pricing for our beta, so you won't sweat when you open your Billing page.