Chat With My Work
What you’re looking at
This is an in-site retrieval-augmented chatbot for my portfolio website.
It’s grounded in a curated, structured knowledge base of my work history and projects. Responses include citations to the exact source chunks used to generate the answer.
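As a rough sketch of that grounding contract (the names here are illustrative, not the actual types in my codebase), each response carries references back to the chunks it was built from:

```python
from dataclasses import dataclass, field

@dataclass
class Citation:
    # Points back to the exact knowledge-base chunk used in the answer.
    source: str   # e.g. a document or section title
    chunk_id: int

@dataclass
class ChatResponse:
    answer: str
    citations: list[Citation] = field(default_factory=list)

# Hypothetical example: an answer grounded in two retrieved chunks.
resp = ChatResponse(
    answer="I built a RAG pipeline with configurable chunking.",
    citations=[Citation("projects.md", 12), Citation("resume.md", 3)],
)
print([c.source for c in resp.citations])  # → ['projects.md', 'resume.md']
```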
I started this project in the fall of 2025 by building a core library of AI utilities I could reuse across a variety of applications. From there I wired up a FastAPI backend and a Next.js frontend and explored the different ways to work with the OpenAI API; I had a very basic chatbot working within a couple of hours.


Starting small and building up
What started as a simple chat interface quickly evolved into something far more ambitious. I built a RAG template from the ground up, giving users the ability to upload documents, kick off a full parsing pipeline, and configure everything from embedding models to chunk sizes and chunking strategies. This became the foundation of my portfolio chatbot.
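A minimal sketch of the chunk-size knob, assuming the simplest strategy on offer, fixed-size chunks with overlap (the real pipeline supports multiple strategies):

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Fixed-size character chunking with overlap -- one of the simplest
    chunking strategies; chunk_size and overlap are the tunable knobs."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

doc = "a" * 450
chunks = chunk_text(doc, chunk_size=200, overlap=50)
print(len(chunks))  # → 3
```

The overlap means each chunk repeats the tail of the previous one, so a sentence split across a boundary still appears whole in at least one chunk.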
To stress-test it, I loaded over 100MB of veterinary textbooks and queried it with a locally run open-source LLM. The cracks showed fast: tables broke the parsing pipeline, and as I layered improvements, the pipeline grew brittle and unpredictable. Cursor kept generating hacky patches instead of reusing existing code, pulling me into a loop of upload → error → patch → new error. I made the deliberate call to revert and rebuild from scratch.
That reset changed everything. I defined first-principles rules for working with Cursor, including incremental testing, clean interfaces, and scalable architecture, and stopped asking for massive systems in a single prompt. What emerged was a disciplined, modular approach that gave me real control over the pipeline and a much stronger foundation to build on.
Getting to production
With a stable pipeline in place, the next step was turning it into a live, production-grade portfolio chatbot. I used AI to synthesize context from my resume, interview notes, projects, and GitHub activity into a structured knowledge base, then ran it through my pipeline to parse, chunk, embed, and store it.
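To illustrate the parse → chunk → embed → store flow, here is a toy, self-contained sketch; the hash-based embedder and in-memory store are stand-ins for the real embedding model and vector database:

```python
import hashlib
import math

def embed(text: str, dim: int = 64) -> list[float]:
    """Toy deterministic embedding (a stand-in for a real embedding model):
    hashes words into a fixed-size bag-of-words vector, then normalizes."""
    vec = [0.0] * dim
    for word in text.lower().split():
        idx = int(hashlib.md5(word.encode()).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

class VectorStore:
    """Minimal in-memory store: add() embeds and stores a chunk,
    search() ranks stored chunks by cosine similarity to the query."""
    def __init__(self):
        self.chunks: list[tuple[str, list[float]]] = []

    def add(self, chunk: str) -> None:
        self.chunks.append((chunk, embed(chunk)))

    def search(self, query: str, top_k: int = 2) -> list[str]:
        q = embed(query)
        scored = sorted(
            self.chunks,
            key=lambda c: -sum(a * b for a, b in zip(q, c[1])),
        )
        return [text for text, _ in scored[:top_k]]

store = VectorStore()
for chunk in ["FastAPI backend for the chat API",
              "Next.js frontend deployed on Vercel",
              "Structured knowledge base of projects"]:
    store.add(chunk)
print(store.search("frontend on Vercel", top_k=1))
```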
For deployment, I split the architecture across two platforms: the RAG API service on Render and the frontend on Vercel, using path-based rules to manage and trigger deployments independently. Once live, I used Claude and Playwright to automate an evaluation loop by generating sample questions, assessing response quality, and iteratively updating the system prompt to improve accuracy and relevance.
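The evaluation loop boils down to something like this sketch, where the chatbot call and the grader are stand-in functions for the real Claude-and-Playwright automation:

```python
SAMPLE_QUESTIONS = ["What projects has he built?", "What stack does he use?"]

def ask_bot(system_prompt: str, question: str) -> str:
    # Stand-in for driving the deployed chatbot through the browser.
    return f"[{system_prompt}] answer to: {question}"

def grade(answer: str) -> float:
    # Stand-in for the LLM judge: here it simply rewards citing sources.
    return 1.0 if "cite" in answer.lower() else 0.0

def evaluate(system_prompt: str) -> float:
    # Score a candidate prompt across all sample questions.
    scores = [grade(ask_bot(system_prompt, q)) for q in SAMPLE_QUESTIONS]
    return sum(scores) / len(scores)

candidates = ["Answer briefly.", "Answer briefly and cite your sources."]
best = max(candidates, key=evaluate)
print(best)  # the citing prompt scores higher
```

The real loop replaces the stubs with live browser sessions and model-graded rubrics, but the shape is the same: generate questions, score answers, keep the prompt that scores best.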
Finally, I built out an admin module to give myself runtime control over top-k retrieval, model selection, and chat history review without touching the codebase. The result is a fully deployed, self-improving chatbot backed by my own data and infrastructure.
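A hedged sketch of what that runtime control looks like, with illustrative field names and a hypothetical default model:

```python
from dataclasses import dataclass

@dataclass
class RuntimeSettings:
    """Admin-tunable knobs, adjustable at runtime without a redeploy
    (field names and the default model are illustrative)."""
    top_k: int = 4
    model: str = "gpt-4o-mini"  # hypothetical default

settings = RuntimeSettings()

def update_settings(**overrides) -> RuntimeSettings:
    # An admin endpoint would validate and apply overrides like this.
    for key, value in overrides.items():
        if not hasattr(settings, key):
            raise KeyError(f"unknown setting: {key}")
        setattr(settings, key, value)
    return settings

update_settings(top_k=8)
print(settings.top_k)  # → 8
```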

Lessons Learned
AI accelerates the work when I build in small, testable slices, enforce clean interfaces, and keep humans in the loop so automation scales quality instead of chaos.