Organize your files so AI can find anything
The same folder structure that makes AI retrieval great makes your own search great. A short, opinionated guide.
Most people's file organisation is a set of strata ("Downloads," "Documents," and "Desktop") with occasional folders called "Important" or "Misc." That structure is bad for humans and worse for AI retrieval.
A small amount of upfront discipline makes both you and the AI dramatically more effective. Here's the system.
Principle 1 — Topic, not date
"Q1 2026" tells you when. It doesn't tell you what. Folder by topic instead, with date as a file attribute:
- ✅ Vendor Contracts/acme-msa-2025-03.pdf
- ❌ 2025/March/contracts/contract.pdf
Topic folders match how you'll ask for a file later ("show me Acme's MSA"), not how you happened to archive it.
Principle 2 — Descriptive filenames
The filename is metadata. AI retrieval often picks up on filenames as a signal. Use:
- ✅ acme-msa-signed-2025-03-14.pdf
- ❌ contract.pdf
- ❌ IMG_4521.pdf
- ❌ Final_FINAL_v3_signed.pdf
A good filename template: {counterparty}-{doc-type}-{date}.pdf.
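As a rough illustration, the template can be filled in programmatically when renaming files in bulk. The helper names `slugify` and `build_filename` below are hypothetical, and the lowercase-with-hyphens convention is an assumption taken from the examples above, not a SeekFiles requirement.

```python
import re
from datetime import date


def slugify(text: str) -> str:
    """Lowercase the text and collapse runs of non-alphanumerics into single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_filename(counterparty: str, doc_type: str, doc_date: date, ext: str = "pdf") -> str:
    """Fill the {counterparty}-{doc-type}-{date} template using an ISO date (YYYY-MM-DD)."""
    return f"{slugify(counterparty)}-{slugify(doc_type)}-{doc_date.isoformat()}.{ext}"


print(build_filename("Acme", "MSA signed", date(2025, 3, 14)))
# prints "acme-msa-signed-2025-03-14.pdf"
```

ISO dates sort correctly as plain text, so files for the same counterparty and document type line up chronologically in any file browser.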
Principle 3 — Flat-ish hierarchies
Don't nest 5 levels deep. 2–3 levels max. Deep nesting makes scope selection annoying and adds zero retrieval value.
- ✅ Contracts / Vendors / acme-msa.pdf
- ❌ Business / Legal / Contracts / 2025 / Q1 / Vendors / IT / acme-msa.pdf
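If you want to audit an existing library against this rule, a small script can flag the offenders. This is a minimal sketch: `folder_depth` and `too_deep` are hypothetical helpers, paths are assumed to be relative to your library root with `/` separators, and the limit of 3 comes from the guideline above.

```python
from pathlib import PurePosixPath


def folder_depth(relative_path: str) -> int:
    """Count the folders above a file, given its path relative to the library root."""
    return len(PurePosixPath(relative_path).parts) - 1


def too_deep(relative_paths: list[str], max_levels: int = 3) -> list[str]:
    """Return the paths nested under more than max_levels folders."""
    return [p for p in relative_paths if folder_depth(p) > max_levels]


paths = [
    "Contracts/Vendors/acme-msa.pdf",
    "Business/Legal/Contracts/2025/Q1/Vendors/IT/acme-msa.pdf",
]
print(too_deep(paths))
# prints ['Business/Legal/Contracts/2025/Q1/Vendors/IT/acme-msa.pdf']
```

To run it against a real folder, you could build the list with `pathlib.Path(root).rglob("*")` and convert each file to a path relative to `root`.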
Principle 4 — One folder per assistant
When you build a SeekFiles Assistant, you scope it to a folder (or specific files). A 1:1 relationship between folders and assistants keeps everything clean.
- "Bar Reviewer" assistant → "Bar Reviewer" folder.
- "Tax 2025" assistant → "Tax 2025" folder.
- "Client X Case" assistant → "Client X" folder.
Principle 5 — Separate raw from processed
If you OCR a scan into a clean PDF, keep the original and the cleaned version, but in different subfolders:
Receipts / raw-scans/
Receipts / cleaned-pdfs/
Point your assistant at the cleaned folder. The raw scans stay for audit if needed but don't pollute retrieval.
Principle 6 — Tag the unusual
Most files are routine. The unusual ones (lapsed contracts, exceptional cases, legal flags) get a tag in the filename:
acme-msa-2025-03-EXPIRED.pdf
case-doe-v-roe-FLAG-conflict.pdf
You can then filter chats with "find all EXPIRED contracts" and get clean retrieval.
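The same tags are easy to pick out with a script, which is handy for a periodic clean-up outside of chat. The sketch below assumes a tag is an all-caps word of three or more letters set off by hyphens, as in the two examples above. That convention, and the helper names `tags_in` and `with_tag`, are illustrative assumptions rather than anything SeekFiles enforces.

```python
import re

# Assumed convention: an all-caps word (3+ letters) preceded by a hyphen
# and followed by a hyphen or the extension dot, e.g. -EXPIRED. or -FLAG-
TAG_PATTERN = re.compile(r"(?<=-)[A-Z]{3,}(?=[-.])")


def tags_in(filename: str) -> set[str]:
    """Return the set of all-caps tags found in a filename."""
    return set(TAG_PATTERN.findall(filename))


def with_tag(filenames: list[str], tag: str) -> list[str]:
    """Return only the filenames that carry the given tag."""
    return [name for name in filenames if tag in tags_in(name)]


files = [
    "acme-msa-2025-03-EXPIRED.pdf",
    "case-doe-v-roe-FLAG-conflict.pdf",
    "acme-msa-signed-2025-03-14.pdf",
]
print(with_tag(files, "EXPIRED"))
# prints ['acme-msa-2025-03-EXPIRED.pdf']
```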
Cost of NOT organising
A folder with 500 mixed-topic files retrieves worse than five folders of 100 topic-scoped files each. The math is just retrieval noise: more irrelevant chunks crowd out the relevant ones. Organisation isn't aesthetic — it's accuracy.
Bonus — AI-assisted reorganisation
If your existing files are messy, build a one-time "Organiser" assistant scoped to everything. Ask:
"Group these files into topic-based folders. Suggest a folder structure and assign each file."
You won't follow the suggestion verbatim, but it's a strong starting outline. Reorganise once and you keep the benefits from then on.