OrioSearch is an open-source web search and content extraction API you actually own. Tavily-compatible. Deploy with Docker in 30 seconds. Zero API bills, ever.
Tavily charges per search. Scale your agents and watch costs explode. OrioSearch costs nothing per query.
Every search query you send goes through someone else's infrastructure. Self-host and keep your data where it belongs.
Hit a rate limit mid-task and your AI agent grinds to a halt. Your server, your rules, no limits.
OrioSearch is a drop-in Tavily replacement. Same API shape, same response format. Just swap the URL.
# Tavily — $100+/month
base_url = "https://api.tavily.com"
api_key = "tvly-xxxxxxxxxxxxxxxx"
# OrioSearch — Free forever
base_url = "http://localhost:8000"
api_key = "" # optional
That's it. Your existing code, your existing agents — they all just work.
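For instance, if your agent talks to the API over plain HTTP, the swap really is one string. A minimal sketch using only the Python standard library (the payload shape follows the `/search` examples below; the helper name is ours):

```python
import json
import urllib.request

def build_search_request(base_url: str, query: str, max_results: int = 5):
    """Build a POST /search request; point base_url at Tavily or OrioSearch."""
    payload = {"query": query, "max_results": max_results}
    return urllib.request.Request(
        f"{base_url}/search",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# The only line that changes when you migrate:
req = build_search_request("http://localhost:8000", "latest AI news")
```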
git clone https://github.com/vkfolio/orio-search
cd orio-search
docker compose up --build
Three services start automatically: API, SearXNG, and Redis.
curl -X POST http://localhost:8000/search \
-H "Content-Type: application/json" \
-d '{"query": "latest AI news", "max_results": 5}'
Get structured search results with titles, URLs, snippets, and relevance scores.
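Consuming that response is straightforward. A sketch with an illustrative payload (not real output) that filters by the relevance score and returns the best URLs first:

```python
# Illustrative response in the shape described above.
sample = {
    "query": "latest AI news",
    "results": [
        {"title": "Example story", "url": "https://example.com/a",
         "content": "snippet...", "score": 0.91},
        {"title": "Another story", "url": "https://example.com/b",
         "content": "snippet...", "score": 0.74},
    ],
    "response_time": 1.2,
}

def top_urls(response: dict, min_score: float = 0.5) -> list[str]:
    """Keep results above a relevance threshold, best first."""
    hits = [r for r in response["results"] if r["score"] >= min_score]
    return [r["url"] for r in sorted(hits, key=lambda r: r["score"], reverse=True)]

print(top_urls(sample))  # ['https://example.com/a', 'https://example.com/b']
```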
curl -X POST http://localhost:8000/search \
-H "Content-Type: application/json" \
-d '{"query": "what is docker",
"include_answer": true,
"search_depth": "advanced"}'
Get full page content extraction and AI-generated answers with citations.
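The curl payload above maps directly to a plain dict if you build requests in code:

```python
import json

def advanced_payload(query: str) -> dict:
    """Payload for an advanced search, mirroring the curl example above."""
    return {
        "query": query,
        "include_answer": True,       # request an LLM-synthesized answer
        "search_depth": "advanced",   # full page content extraction
    }

body = json.dumps(advanced_payload("what is docker"))
```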
SearXNG aggregates results from Google, Bing, DuckDuckGo, and 70+ more. Automatic DuckDuckGo fallback if SearXNG is down.
Multi-tier pipeline with trafilatura (F1: 0.958) and readability-lxml fallback. Get clean markdown or text from any URL.
Set include_answer: true and get LLM-synthesized answers with source citations. Works with
Ollama, OpenAI, Groq, or any OpenAI-compatible API.
Real-time results via Server-Sent Events. Results stream in as they arrive — no waiting for the full response.
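Consuming an SSE stream is just reading `data:` frames as they arrive. A minimal parsing sketch over a simulated stream (the exact event payloads OrioSearch emits are an assumption here; this only shows the `text/event-stream` mechanics):

```python
import json

def parse_sse(stream_text: str):
    """Yield one decoded JSON object per 'data:' line."""
    for line in stream_text.splitlines():
        if line.startswith("data: "):
            yield json.loads(line[len("data: "):])

# Simulated stream: two results arriving one at a time.
raw = 'data: {"title": "First result"}\ndata: {"title": "Second result"}\n'
events = list(parse_sse(raw))
```

In a real client you would iterate over the HTTP response line by line, handling each result the moment its frame lands.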
Pipeline-batched lookups, configurable TTLs, stale-cache graceful degradation. Fast responses for repeated queries.
Circuit breakers, exponential backoff retries, per-domain rate limiting, rotating user-agents, and Gunicorn with 4 workers.
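To make "exponential backoff" concrete: the delay doubles per retry up to a cap. A sketch of that schedule (the base and cap OrioSearch actually uses are internal; these numbers are illustrative):

```python
def backoff_delays(retries: int, base: float = 0.5, cap: float = 8.0) -> list[float]:
    """Delay doubles each attempt, capped; jitter omitted for clarity."""
    return [min(cap, base * (2 ** i)) for i in range(retries)]

print(backoff_delays(5))  # [0.5, 1.0, 2.0, 4.0, 8.0]
```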
FlashRank ONNX model (~4MB, CPU-only, no PyTorch). Reranks results by semantic relevance to your query.
GET /tool-schema returns OpenAI function-calling definitions. Register OrioSearch as a tool
with any LLM in one call.
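A sketch of wiring the schema into an OpenAI-compatible chat call. The function definition below is an assumption about what `/tool-schema` returns, shown only to illustrate the wrapping step:

```python
# Assumed shape of one entry returned by GET /tool-schema.
search_function = {
    "name": "web_search",
    "description": "Search the web and return ranked results.",
    "parameters": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
}

def as_openai_tools(functions: list[dict]) -> list[dict]:
    """OpenAI-compatible chat APIs expect {'type': 'function', 'function': ...}."""
    return [{"type": "function", "function": f} for f in functions]

tools = as_openai_tools([search_function])
# Pass `tools=tools` to your chat.completions call and the model can search.
```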
Side-by-side with the tools you're probably paying for.
| Feature | OrioSearch | Tavily | Serper | Google CSE |
|---|---|---|---|---|
| Self-hosted | ✓ | ✕ | ✕ | ✕ |
| Open source | ✓ | ✕ | ✕ | ✕ |
| Content extraction | ✓ | ✓ | ✕ | ✕ |
| AI answer generation | ✓ | ✓ | ✕ | ✕ |
| SSE streaming | ✓ | ✕ | ✕ | ✕ |
| Tavily-compatible API | ✓ | ✓ | ✕ | ✕ |
| LLM tool schema | ✓ | ✕ | ✕ | ✕ |
| Result reranking | ✓ | ✕ | ✕ | ✕ |
| Price | Free | From $100/mo | From $50/mo | $5/1K queries |
Plug in Ollama for free local inference, or use OpenAI, Groq, Together AI — any OpenAI-compatible endpoint. You provide the LLM, OrioSearch does the rest.
{
"query": "what is docker",
"answer": "Docker is an open-source platform that automates
the deployment of applications inside lightweight,
portable containers [1]. It packages code and
dependencies together so applications run reliably
across environments [3].",
"results": [
{
"title": "What is Docker? | Docker Docs",
"url": "https://docs.docker.com/get-started/",
"content": "Docker is an open platform for...",
"score": 0.95
}
],
"response_time": 2.14
}
llm:
enabled: true
provider: "ollama" # or "openai", "groq"
base_url: "http://ollama:11434/v1"
model: "llama3.1"
api_key: "ollama" # use a real API key for cloud providers
AI answers are optional. When disabled or unavailable, search results still return normally
with answer: null. Graceful degradation, always.
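A sketch of handling that degraded case on the client side (field names follow the response format shown above; the fallback strategy is ours):

```python
def summarize(response: dict) -> str:
    """Prefer the AI answer; fall back to the top result's snippet."""
    if response.get("answer"):
        return response["answer"]
    results = response.get("results") or []
    return results[0]["content"] if results else "No results."

# LLM disabled or unreachable: answer is null, results still flow.
degraded = {
    "answer": None,
    "results": [{"title": "t", "url": "u", "content": "Docker is...", "score": 0.9}],
}
print(summarize(degraded))  # Docker is...
```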
Seriously. One command.