How I Built My AI Digital Twin
A technical deep-dive into combining RAG, embeddings, and LLMs to create an AI that represents me to recruiters and potential employers.
The Problem
What challenge was I trying to solve?
Recruiters and CEOs are busy. They're evaluating dozens of candidates and need to quickly assess fit. My resume lists achievements, but it doesn't convey how I think, communicate, or approach problems.
I've written three books on product management, but who has time to read them during a hiring process? The knowledge is there but inaccessible.
Initial screening calls are inefficient. Both parties spend 30 minutes on basic questions that could be answered asynchronously.
The Solution
An AI that knows me as well as I know myself
Build an AI digital twin that can answer questions about my experience, discuss product management concepts from my books, and help recruiters assess fit—24/7, instantly, in my voice.
The Job Fit Analyzer takes this further: paste a job description and get an immediate analysis of how my skills align with the role, with specific talking points for follow-up conversations.
The result: Recruiters get answers in seconds instead of days. I get higher-quality conversations with people who already understand my background.
System Architecture
User → Next.js frontend → API route → RAG pipeline (pgvector embeddings) → Claude LLM, with the response streamed back along the same path.
Key AI Concepts
Each concept plays a critical role in making the AI useful, accurate, and engaging.
Retrieval Augmented Generation (RAG)
RAG solves a fundamental LLM limitation: they only know what they were trained on. By retrieving relevant context at query time, the AI can answer questions about my specific experience, books, and background.
Implementation Details:
- Chunks my books and resume into ~1000 token segments with overlap
- Stores chunks with vector embeddings in a PostgreSQL database
- At query time, finds the most semantically similar chunks
- Injects relevant context into the prompt before the LLM responds
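The chunking step above can be sketched as a small helper. This is an illustrative version (the function name and the word-based token approximation are my simplification, not the production code); the real pipeline targets ~1,000 tokens per chunk, while this sketch uses word counts as a stand-in:

```typescript
// Hypothetical chunker: splits text into fixed-size segments with overlap,
// so sentences near a boundary appear in two adjacent chunks.
// Word count approximates token count here for simplicity.
function chunkText(text: string, chunkSize = 200, overlap = 40): string[] {
  const words = text.split(/\s+/).filter(Boolean);
  const chunks: string[] = [];
  for (let start = 0; start < words.length; start += chunkSize - overlap) {
    chunks.push(words.slice(start, start + chunkSize).join(" "));
    if (start + chunkSize >= words.length) break; // last chunk reached
  }
  return chunks;
}
```

Each chunk is then embedded and stored; the overlap means a question whose answer straddles a boundary still retrieves a chunk containing the full passage.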
Why It Matters:
The AI can accurately discuss my three books and career history without hallucinating.
Vector Embeddings
Embeddings convert text into numerical vectors that capture semantic meaning. Similar concepts cluster together in vector space, enabling semantic search rather than just keyword matching.
Implementation Details:
- Using Voyage AI's voyage-3-lite model (512 dimensions)
- Each text chunk becomes a point in 512-dimensional space
- Cosine similarity finds the closest matches to user queries
- pgvector extension enables efficient vector search in PostgreSQL
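The similarity ranking boils down to a short function. This sketch computes cosine similarity in TypeScript for illustration; in production the database does the ranking via pgvector's cosine-distance operator:

```typescript
// Cosine similarity between two equal-length vectors: 1 = identical
// direction, 0 = unrelated. Used to rank stored chunks against a query.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// With pgvector, the `<=>` operator computes cosine distance in SQL,
// so the retrieval query looks roughly like (table name hypothetical):
//   SELECT content FROM chunks ORDER BY embedding <=> $1 LIMIT 5;
```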
Why It Matters:
A question about "team leadership" finds content about "building high-performing teams" even without exact word matches.
Large Language Model (Claude)
Anthropic's Claude serves as the reasoning engine. It takes the retrieved context, system instructions, and user query to generate natural, helpful responses in my voice.
Implementation Details:
- Claude Sonnet 4 (claude-sonnet-4-20250514) for the optimal balance of capability and speed
- Streaming responses for real-time typing effect
- Carefully crafted system prompt defines persona and boundaries
- Temperature and token limits tuned for consistency
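Putting those pieces together, the request to Claude looks roughly like the sketch below. The model ID comes from this project; the function name, token cap, temperature, and context formatting are illustrative guesses, not the actual values:

```typescript
// Hypothetical request builder for the Anthropic Messages API.
interface ClaudeParams {
  model: string;
  max_tokens: number;
  temperature: number;
  system: string;
  messages: { role: "user"; content: string }[];
}

function buildClaudeRequest(
  systemPrompt: string,
  context: string[],      // chunks retrieved by the RAG pipeline
  query: string,
): ClaudeParams {
  return {
    model: "claude-sonnet-4-20250514",
    max_tokens: 1024,     // illustrative cap
    temperature: 0.3,     // illustrative: tuned low for consistency
    system: systemPrompt,
    messages: [
      {
        role: "user",
        content: `Context:\n${context.join("\n---\n")}\n\nQuestion: ${query}`,
      },
    ],
  };
}

// With the Anthropic SDK, these params would be passed to
// `client.messages.stream(...)` and the text events forwarded
// to the browser as they arrive.
```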
Why It Matters:
Responses feel like talking to me—authoritative yet approachable, with specific examples from my experience.
Persona Engineering
The system prompt is the secret sauce. It's not just "pretend to be Simon"—it's a detailed specification of communication style, topic boundaries, and response patterns.
Implementation Details:
- First-person voice with executive-level vocabulary
- Specific metrics and achievements to reference
- Clear boundaries on personal topics and disclaimers
- Job Fit Analysis mode with optimistic framing
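To make the idea concrete, here is a heavily condensed, hypothetical version of such a persona specification; every line below is illustrative, and the real prompt is far more detailed:

```typescript
// Hypothetical condensed persona prompt. The real system prompt covers
// each area in much greater depth; this shows the shape, not the content.
const PERSONA_PROMPT = [
  "You are Simon's AI digital twin. Always speak in the first person, as Simon.",
  "Voice: executive-level vocabulary, authoritative yet approachable.",
  "Ground claims in the retrieved context; cite specific metrics when available.",
  "Politely decline personal topics and note that you are an AI when asked.",
  "In Job Fit Analysis mode, frame gaps optimistically as growth opportunities.",
].join("\n");
```

The value of spelling rules out this explicitly is that the persona survives model and prompt iterations: the spec is versioned alongside the code and exercised by the eval suite.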
Why It Matters:
The AI maintains consistent personality across thousands of interactions without going off-script.
Job Fit Analysis
A specialized prompt mode that analyzes job descriptions against my experience. The AI is instructed to be optimistic, highlight transferable skills, and encourage connection.
Implementation Details:
- Detects job descriptions via [JOB FIT ANALYSIS] prefix
- Structured response format with visual indicators (✅🔄❌)
- Reframes gaps as growth opportunities
- Always ends with call-to-action to connect
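The mode switch described above can be as simple as a prefix check. The `[JOB FIT ANALYSIS]` prefix is from this project; the function and type names are illustrative:

```typescript
// Hypothetical mode routing: messages carrying the prefix get the
// structured Job Fit prompt, everything else gets the normal chat prompt.
const JOB_FIT_PREFIX = "[JOB FIT ANALYSIS]";

type ChatMode = "jobFit" | "chat";

function detectMode(message: string): ChatMode {
  return message.trimStart().startsWith(JOB_FIT_PREFIX) ? "jobFit" : "chat";
}
```

The `jobFit` branch then swaps in the structured-response instructions (✅🔄❌ indicators, gap reframing, closing call-to-action) before the request goes to Claude.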
Why It Matters:
Recruiters get instant, helpful feedback that positions me positively while remaining honest.
Streaming Architecture
Server-Sent Events (SSE) enable real-time response streaming. Users see words appear as they're generated, creating a natural conversation feel.
Implementation Details:
- Next.js API route with ReadableStream response
- Chunk-by-chunk transmission as Claude generates
- Client-side state management with React hooks
- Graceful error handling and retry logic
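A minimal sketch of that route, assuming the standard Next.js App Router shape (the SSE framing helper and the hard-coded chunks are illustrative; in the real route the chunks come from the Claude stream):

```typescript
// Format one text chunk as a Server-Sent Events frame.
function toSSE(chunk: string): string {
  return `data: ${JSON.stringify({ text: chunk })}\n\n`;
}

// Hypothetical App Router handler sketch: app/api/chat/route.ts
export async function POST(_req: Request): Promise<Response> {
  const encoder = new TextEncoder();
  const stream = new ReadableStream({
    async start(controller) {
      // Stand-in for the Claude text stream.
      for (const chunk of ["Hello", " from", " the", " twin."]) {
        controller.enqueue(encoder.encode(toSSE(chunk)));
      }
      controller.close();
    },
  });
  return new Response(stream, {
    headers: {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache",
    },
  });
}
```

On the client, a React hook reads the response body chunk by chunk and appends each `data:` payload to the message state, producing the typing effect.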
Why It Matters:
No waiting for complete responses—the conversation feels immediate and engaging.
AI Evaluations
Evaluations are automated tests that assess AI quality. Unlike traditional unit tests, evals use LLM-as-judge patterns to evaluate subjective criteria like tone, accuracy, and persona consistency.
Implementation Details:
- Test suite covering persona, accuracy, and boundary behaviors
- LLM-as-judge scoring with Claude evaluating responses
- Regression detection when prompts or RAG pipeline changes
- Metrics for response quality, factual accuracy, and tone
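The LLM-as-judge pattern reduces to two small pieces: a grading prompt and a score parser. Both names below are illustrative (the real suite runs under Vitest, and the Claude call is omitted here):

```typescript
// Hypothetical judge prompt: asks the grading model for a 1-5 score
// on one criterion (tone, accuracy, persona consistency, ...).
function buildJudgePrompt(criterion: string, question: string, answer: string): string {
  return [
    `You are grading an AI assistant's answer for: ${criterion}.`,
    `Question: ${question}`,
    `Answer: ${answer}`,
    "Reply with a single integer score from 1 to 5, where 5 is best.",
  ].join("\n");
}

// Extract the first integer 1-5 from the judge's reply.
function parseScore(judgeOutput: string): number {
  const match = judgeOutput.match(/\b([1-5])\b/);
  if (!match) throw new Error(`No score in judge output: ${judgeOutput}`);
  return Number(match[1]);
}
```

In a Vitest eval, the judge prompt goes to Claude and the test asserts something like `parseScore(reply) >= 4`, so a prompt or RAG change that degrades tone or accuracy fails the suite before it ships.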
Why It Matters:
Confidence that updates improve the AI without breaking existing behaviors—catching issues before users do.
Tech Stack
Next.js 16
Full-stack React framework
Anthropic Claude
LLM for reasoning and generation
Voyage AI
Text embeddings
PostgreSQL + pgvector
Vector database
Vitest
Testing and AI evaluations
Tailwind CSS
Styling
Radix UI
Accessible components
Vercel
Deployment
Want to See It In Action?
Try the AI yourself. Ask about my experience, discuss product strategy, or use the Job Fit Analyzer to see how I'd match your open role.