AI online · grounded

Talk to your files like they finally understand you.

SeekFiles AI is a private AI that reads, embeds, and reasons over your documents — then answers in plain language with the exact source chunks attached. No hallucinations. No guesswork.

50 welcome credits · no card · 3 assistants free

streaming
You
What did the Q4 report say about churn drivers?
retrieving · hybrid (vector + bm25 + rrf)
Q4-Report.pdf · p.12 · 0.91
Q4-Report.pdf · p.18 · 0.87
Customer-Calls.csv · 0.82

The Q4 report identifies three primary churn drivers: onboarding friction in week one, missing Slack/Notion integrations, and pricing perception in the mid-market segment.

Q4-Report.pdf · p.12, Q4-Report.pdf · p.18, Customer-Calls.csv
  • HTTPS encrypted
    All traffic over TLS
  • Secure payments
    Powered by Lemon Squeezy
  • Your files, your AI
    Never used to train models
  • Daily backups
    14-day retention
  • Cited answers
    Every reply has receipts
  • ~340ms first-token latency
    Streaming SSE end-to-end
  • 100 MB max file size
    PDF · DOCX · XLSX · PPTX · EPUB · ZIP
  • Hybrid retrieval engine
    Vector + BM25 + RRF + rerank
  • EN · TL tuned cross-lingual
    Ask in one language, retrieve from the other

The pipeline

Five stages between your file and a grounded answer

  1. Ingest
    Extract · Chunk
  2. Embed
    OpenAI · pgvector
  3. Retrieve
    Vector + BM25 + RRF
  4. Ground
    Hallucination guard
  5. Answer
    With citations

What it does

An AI that actually reads your documents.

Not just summarizes, not just searches — extracts, indexes, and reasons across your private corpus.

Hybrid retrieval, not just embeddings

Vector similarity catches meaning. BM25 keyword search catches exact terms. Reciprocal rank fusion merges them — so the right chunk surfaces even when the question and the doc don't share a single word.

  • Vector
    pgvector · HNSW
  • Keyword
    BM25 · pg full-text
  • Fusion
    RRF · rerank
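
For readers who want to see the fusion step concretely, here is a minimal sketch of reciprocal rank fusion. It is illustrative only, not SeekFiles' actual code: the function name, the chunk IDs, and the constant k=60 (a common default in RRF write-ups) are all assumptions.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of chunk IDs into a single ranking.

    Each list contributes 1 / (k + rank) for every chunk it contains,
    so a chunk ranked well by both retrievers rises to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by chunk ID for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))

# Hypothetical results from the two retrievers (best hit first).
vector_hits = ["c7", "c2", "c9"]
keyword_hits = ["c2", "c4", "c7"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

Because RRF only looks at ranks, it never has to reconcile cosine similarities with BM25 scores, which live on different scales.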

Receipts on every answer

Every reply ships with the exact chunks it used — file, page, and similarity score. Click through to verify.

Sees images and scans

Vision pipeline OCRs scans, captions charts, and reads handwriting — image-only PDFs become searchable.

Cross-lingual by design

Ask in one language, retrieve from source files in another. Tuned multilingual embeddings — not bolted on after.

Almost any file format

PDFs, Word, Excel, slides, EPUB, ZIP, plain text. Up to 100 MB each. We handle the extraction.

Mobile + web, same brain

Native iOS and Android for capture-and-ask on the go. Web dashboard for heavy lifting.

From upload to answer

Three steps. Two minutes. Zero hallucinations.

01 Ingest

Drop your files in.

PDFs, slides, spreadsheets, scans, EPUBs, ZIPs — up to 100 MB each. We extract the text, OCR the images, and chunk everything with overlap so context stays intact.

  • All major file formats
  • Vision pipeline for scans & charts
  • Overlapping chunks preserve context
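
As a rough illustration of chunking with overlap, the sketch below splits a list of words into fixed-size windows that share a few words with their neighbours. The window size, the overlap, and the word-level splitting are assumptions made for the example; the page does not say how SeekFiles sizes its chunks.

```python
def chunk_with_overlap(words, size=200, overlap=40):
    """Split a word list into windows of `size` words.

    Consecutive windows share `overlap` words, so a sentence that
    straddles a boundary still appears whole in at least one chunk.
    """
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + size])
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks

# Tiny example: 10 words, windows of 4 with 1 word of overlap.
words = [f"w{i}" for i in range(10)]
chunks = chunk_with_overlap(words, size=4, overlap=1)
for chunk in chunks:
    print(chunk)
```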
Files · Q4 research
contract.pdf · 12 MB
deck.pptx · 4 MB
scan.jpg · 890 KB · 72%
Extracted · Chunked · 87%
Chunk 042 · contract.pdf

“...termination on 30-day notice can be exercised by either party without cause...”

vector: [0.23, −0.84, 0.51, ...]
Vector · pgvector · HNSW
Keyword · BM25 · pg_fts
Indexed · Searchable
02 Embed

We embed & index — twice.

Each chunk gets an OpenAI embedding, stored in Postgres with pgvector and indexed with HNSW. In parallel, a BM25 keyword index is built. Hybrid retrieval beats either alone.

  • Vector + BM25 fused with RRF
  • Optional cross-encoder reranking
  • Cross-lingual: ask EN, retrieve TL
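
To show what the keyword side contributes, here is a small pure-Python sketch of Okapi BM25 scoring. It is a teaching example under stated assumptions (whitespace tokenization, k1=1.2, b=0.75, made-up second document) and does not describe how the Postgres index is actually built.

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    if not docs:
        return []
    n_docs = len(docs)
    avg_len = (sum(len(d) for d in docs) / n_docs) or 1.0
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue  # terms absent from the document add nothing
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

docs = [
    "termination on 30-day notice by either party".split(),
    "pricing perception in the mid-market segment".split(),
]
scores = bm25_scores(["termination", "notice"], docs)
print(scores)
```

The first document matches both query terms exactly and gets a positive score; the second shares no terms and scores zero, which is the gap vector similarity is meant to cover.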
03 Answer

Ask. Get the answer + receipts.

Create an assistant scoped to specific files or folders. Every reply cites the exact chunks it used — file, page, similarity score. Never guessed, never hallucinated.

  • Streaming SSE responses
  • Citation chips link to source
  • Refuses when grounding is weak
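
A "refuse when grounding is weak" rule can be sketched as a similarity floor applied before any answer is generated. Everything named here is hypothetical: the `Chunk` type, the 0.75 floor, the refusal wording, and `select_grounded` itself. The file names and scores are borrowed from the demo at the top of this page.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Chunk:
    file: str
    page: Optional[int]  # None for files without pages, such as CSV
    score: float         # similarity score, higher means closer

REFUSAL = "Not enough grounding in the selected files to answer."

def select_grounded(chunks, floor=0.75, top_k=3):
    """Return (chunks to cite, refusal message or None)."""
    strong = [c for c in chunks if c.score >= floor]
    strong.sort(key=lambda c: c.score, reverse=True)
    if not strong:
        # Nothing clears the floor: refuse instead of guessing.
        return [], REFUSAL
    return strong[:top_k], None

hits = [
    Chunk("Q4-Report.pdf", 12, 0.91),
    Chunk("Q4-Report.pdf", 18, 0.87),
    Chunk("Customer-Calls.csv", None, 0.82),
]
cited, refusal = select_grounded(hits)
print([(c.file, c.page, c.score) for c in cited], refusal)
```

The same `Chunk` fields (file, page, score) are what a citation chip would need to link back to the source.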
Chat · Q4 research
You
What does clause 4.2 say about termination?

Clause 4.2 covers termination rights — either party may terminate on 30-day written notice...

contract.pdf · p.12, contract.pdf · p.18

In production

People use SeekFiles for...

Anywhere you have a pile of documents and a question. From students reading textbooks to enterprises wrangling SOPs.

Knowledge bases

Turn internal SOPs, runbooks, and policies into a searchable assistant your whole team can ask in plain language.

Customer support

Train an assistant on your product docs and FAQs. Embed it on your site or share a public chat link with customers.

Research and study

Drop in textbooks, papers, lecture slides — ask anything and get cited answers tied to page-level sources.

Personal second brain

Your notes, contracts, manuals, receipts. Stop digging through folders — just ask.

Under the hood

A real RAG stack — not a wrapper around a prompt.

We invest in retrieval quality — because a great-sounding answer pulled from the wrong chunk is worse than no answer at all.

~340ms first-token · streaming SSE
  • Hybrid retrieval
    pgvector + BM25 + RRF
  • Hallucination guard
    Similarity floor · refuses on weak grounding
  • Query rewriter
    History-aware · multi-turn
  • Optional reranking
    Cross-encoder · Cohere
  • Streaming chat
    Server-Sent Events · ~340ms first token
  • Citations
    File · page · similarity score
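
For the streaming piece, this is roughly what Server-Sent Events framing looks like on the wire. The `data:` and `event:` fields and the blank-line terminator come from the SSE format itself; the JSON payload shape, the `done` event name, and both function names are assumptions made for illustration.

```python
import json

def sse_event(data, event=None):
    """Format one Server-Sent Events message as text."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # Every line of the payload needs its own "data:" field.
    for line in data.split("\n"):
        lines.append(f"data: {line}")
    # A blank line terminates the event.
    return "\n".join(lines) + "\n\n"

def stream_answer(tokens, citations):
    """Yield one SSE message per token, then a final event with citations."""
    for token in tokens:
        yield sse_event(json.dumps({"token": token}))
    yield sse_event(json.dumps({"citations": citations}), event="done")

for message in stream_answer(["Clause", " 4.2"], ["contract.pdf p.12"]):
    print(message, end="")
```

Sending tokens as they arrive is what makes a first-token latency figure meaningful: the reader sees text before the full answer has been generated.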
Questions, pre-empted

Most-asked, up front.

Six common ones. Need more? The full FAQ has everything else.

Newsletter

Get smarter about AI file search.

Once a week: how teams use SeekFiles, retrieval & RAG deep-dives, and product updates. No spam, unsubscribe anytime.


Ready when you are

Stop searching your files. Start asking them.

50 welcome credits. 3 assistants. Free to start, no credit card. Upgrade only when you actually need more.
