Portfolio Project
AI Lead Qualification System

Multi-agent pipeline · Python · FastAPI · Streamlit

A fully automated system that takes inbound sales leads and decides, without human review, whether they're worth a sales rep's time.

The Problem

Sales reps are wasting half their day on bad leads

  • 30–50% of demo requests are unqualified and go nowhere
  • Reps manually review every single submission
  • Response times stretch to 6–24 hours, killing interest
  • No consistent process — every rep scores leads differently
  • CRM data is incomplete or wrong because enrichment is skipped
~120h
Wasted per month
at 300 leads/mo, 40% unqualified, 1 hr of review each
$8,400
Monthly labor wasted
at $70/hr average rep cost
The Solution

A five-agent AI pipeline that qualifies leads in under 15 minutes

When a sales lead fills out a demo form, the system automatically enriches their data, scores them, asks follow-up questions, and either routes them to a rep or filters them out — with zero human involvement.

Agent 1
Intake
Validates email, cleans up name/company data
Validates
Agent 2
Enrichment
Adds industry, company size, revenue, location
Enriches
Agent 3
Scoring
Scores lead 0–100 against ideal customer profile
Scores
Agent 4
Qualification
Sends follow-up, extracts budget/timeline answers via LLM
Qualifies
Agent 5
Routing
Assigns rep, creates booking link, alerts Slack, logs CRM
Routes
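The five-stage flow above can be sketched as a list of functions run in order, stopping as soon as one stage filters the lead out. This is a minimal illustration, not the project's actual orchestration code: the `Lead` fields, the status strings, and the stub logic inside each stage are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Lead:
    email: str
    score: int = 0
    status: str = "new"                      # hypothetical status values
    trail: List[str] = field(default_factory=list)


# Stub stages: each stands in for one agent and does the bare minimum.
def intake(lead: Lead) -> Lead:
    if "@" not in lead.email:
        lead.status = "rejected"
    return lead


def enrichment(lead: Lead) -> Lead:
    return lead


def scoring(lead: Lead) -> Lead:
    # Placeholder rule so the example has one passing and one failing lead.
    lead.score = 85 if lead.email.endswith("@acme.com") else 40
    if lead.score < 60:
        lead.status = "disqualified"
    return lead


def qualification(lead: Lead) -> Lead:
    return lead


def routing(lead: Lead) -> Lead:
    lead.status = "routed"
    return lead


STAGES: List[Callable[[Lead], Lead]] = [
    intake, enrichment, scoring, qualification, routing,
]


def run_pipeline(lead: Lead) -> Lead:
    """Run the stages in order; stop at the first one that filters the lead out."""
    for stage in STAGES:
        lead = stage(lead)
        lead.trail.append(stage.__name__)
        if lead.status in ("rejected", "disqualified"):
            break
    return lead
```

Keeping each agent as a plain function with the same signature makes it easy to test a stage in isolation or swap a simulated integration for a real one.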
Deep Dive

How each agent actually works

Agent 1

Intake Agent

The first gate. It checks that the email address is real and not from a disposable domain, normalizes capitalization on names and company names, and saves a clean record to the database. If the email has already been processed, it catches that too — no duplicates.

  • Blocks 20+ known disposable email providers
  • Normalizes: "john SMITH" → "John Smith"
  • Deduplication prevents double-processing
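A rough sketch of the three intake checks, assuming an in-memory `seen` set in place of the database and a three-entry blocklist in place of the 20+ providers. The function names and the regex are illustrative, and the regex only checks the address syntax, not whether the mailbox exists.

```python
import re
from typing import Optional, Set

# Hypothetical blocklist; the real list is longer.
DISPOSABLE_DOMAINS = {"mailinator.com", "guerrillamail.com", "10minutemail.com"}

# Deliberately loose syntax check: something@something.tld, no spaces.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def normalize_name(raw: str) -> str:
    """Collapse whitespace and capitalize each word: 'john SMITH' -> 'John Smith'."""
    return " ".join(word.capitalize() for word in raw.split())


def intake(email: str, name: str, seen: Set[str]) -> Optional[dict]:
    """Return a cleaned record, or None if the lead should be rejected."""
    email = email.strip().lower()
    if not EMAIL_RE.match(email):
        return None                      # malformed address
    domain = email.rsplit("@", 1)[1]
    if domain in DISPOSABLE_DOMAINS:
        return None                      # throwaway inbox
    if email in seen:
        return None                      # duplicate submission
    seen.add(email)
    return {"email": email, "name": normalize_name(name), "domain": domain}
```

Word-by-word capitalization is simple but lossy: a name like "McDonald" would come out as "Mcdonald", so a production version would need exceptions.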
Agent 2

Enrichment Agent

Fills in the blanks. Takes the email domain and looks up the company — industry, employee count, revenue range, and location. Known companies are matched from a database. Unknown companies get synthetic but realistic data generated automatically.

  • Looks up by email domain first
  • Falls back to keyword-based industry detection
  • Stands in for Clearbit / Apollo.io, which a production deployment would call
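The lookup-then-fallback order can be sketched like this. The company record, the keyword table, and the `source` field are made up for illustration, and this version leaves unknown fields empty rather than generating synthetic values.

```python
from typing import Dict

# Hypothetical mock database keyed by email domain.
COMPANY_DB: Dict[str, dict] = {
    "acme.com": {
        "industry": "SaaS",
        "employees": 220,
        "revenue": "$10M-$50M",
        "location": "US",
    },
}

# Hypothetical keyword -> industry hints for the fallback path.
INDUSTRY_KEYWORDS = {
    "pay": "FinTech",
    "bank": "FinTech",
    "health": "HealthTech",
    "cloud": "SaaS",
}


def guess_industry(company: str, domain: str) -> str:
    """Return the first industry whose keyword appears in the name or domain."""
    text = f"{company} {domain}".lower()
    for keyword, industry in INDUSTRY_KEYWORDS.items():
        if keyword in text:
            return industry
    return "Unknown"


def enrich(email: str, company: str) -> dict:
    """Look up the email domain first; fall back to keyword detection."""
    domain = email.rsplit("@", 1)[-1].lower()
    profile = COMPANY_DB.get(domain)
    if profile is not None:
        return {**profile, "source": "database"}
    return {
        "industry": guess_industry(company, domain),
        "employees": None,
        "revenue": None,
        "location": None,
        "source": "keyword_fallback",
    }
```

Recording where each profile came from (`source`) makes it possible to tell matched companies from guessed ones later in the pipeline.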
Agent 3 — Scoring

Every lead gets a score from 0 to 100

Points are awarded based on how closely the lead matches the ideal customer profile. Clear cutoffs determine what happens next.

Criteria            Points   Why it matters
50–500 employees    +25      Sweet spot: not too small, not enterprise
Target industry     +20      SaaS, FinTech, HealthTech, etc.
Director+ title     +20      Has decision-making authority
US-based            +10      Primary target market
Budget indicated    +25      Strongest buying signal on the form
80–100
SQL — Sales Ready
Routed to a rep immediately
60–79
MQL — Promising
Qualifies further via email
< 60
Disqualified
Filtered out, no rep time wasted
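The rubric and cutoffs above translate almost directly into code. In this sketch the point values and tier boundaries come from the rubric; the `Lead` fields, the list of senior title words, and the three-industry set (the rubric's "etc." is left open) are assumptions.

```python
import re
from dataclasses import dataclass

TARGET_INDUSTRIES = {"SaaS", "FinTech", "HealthTech"}      # partial, assumed
SENIOR_WORDS = {"director", "vp", "head", "chief", "president",
                "ceo", "cto", "cfo", "coo", "founder"}     # assumed


@dataclass
class Lead:
    employees: int
    industry: str
    title: str
    country: str
    budget_indicated: bool


def is_director_plus(title: str) -> bool:
    """Match whole words so 'Contractor' does not count as 'CTO'."""
    words = set(re.findall(r"[a-z]+", title.lower()))
    return bool(words & SENIOR_WORDS)


def score_lead(lead: Lead) -> int:
    score = 0
    if 50 <= lead.employees <= 500:
        score += 25
    if lead.industry in TARGET_INDUSTRIES:
        score += 20
    if is_director_plus(lead.title):
        score += 20
    if lead.country == "US":
        score += 10
    if lead.budget_indicated:
        score += 25
    return score


def tier(score: int) -> str:
    if score >= 80:
        return "SQL"
    if score >= 60:
        return "MQL"
    return "Disqualified"
```

The five criteria sum to exactly 100, so a lead that matches every one lands at the top of the SQL band, and a lead missing only the budget signal (75) falls into MQL.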
Agent 4

Qualification Agent

For leads that scored 60 or higher, this agent sends a follow-up email asking four BANT questions — Budget, Authority, Need, and Timeline. The lead's reply is then parsed by an LLM (or rule-based fallback) to extract structured answers and adjust the final score.

  • Uses GPT-4o-mini when an API key is present
  • Falls back to regex pattern matching automatically
  • Strong answers can raise the score by up to 35 points
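A sketch of what the rule-based fallback might look like. Only the 35-point cap comes from the description above; the regex patterns, the per-answer weights, and the function names are invented for the example, and real replies would need far more patterns than this.

```python
import re
from typing import Dict, Optional

# Hypothetical patterns and weights (they sum to 40, so the cap matters).
PATTERNS = {
    "budget": (re.compile(r"\$\s?\d[\d,]*[kKmM]?"), 15),
    "authority": (
        re.compile(r"\b(i am|i'm) the (decision[- ]maker|ceo|cto|vp|director)\b", re.I),
        10,
    ),
    "need": (re.compile(r"\b(need|looking for|struggling with|problem)\b", re.I), 5),
    "timeline": (
        re.compile(
            r"\b(immediately|asap|this (month|quarter)|next (month|quarter)"
            r"|within \d+ (days|weeks|months))\b",
            re.I,
        ),
        10,
    ),
}
MAX_BONUS = 35  # cap on the score increase


def extract_bant(reply: str) -> Dict[str, Optional[str]]:
    """Return the first matching snippet for each BANT field, or None."""
    answers: Dict[str, Optional[str]] = {}
    for key, (pattern, _weight) in PATTERNS.items():
        match = pattern.search(reply)
        answers[key] = match.group(0) if match else None
    return answers


def adjusted_score(base: int, answers: Dict[str, Optional[str]]) -> int:
    """Add the weights of answered fields, capped at MAX_BONUS and at 100 overall."""
    bonus = sum(weight for key, (_p, weight) in PATTERNS.items() if answers.get(key))
    return min(100, base + min(bonus, MAX_BONUS))
```

For example, the reply "I'm the CEO and we have $100K set aside. We need this immediately." matches all four fields, so a lead at 62 would move to 97 under these assumed weights.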
Agent 5

Routing Agent

Only SQL-tier leads reach this agent. It round-robins assignment across the sales team, generates a unique booking link with a Calendly-style URL, fires a Slack alert with the full lead summary, and logs everything to HubSpot or Salesforce. All integrations are simulated when keys aren't configured.

  • Round-robin rep assignment — no manual coordination
  • Booking link generated per lead
  • HubSpot, Salesforce, Slack all supported
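Round-robin assignment and per-lead booking links are simple to sketch. The rep names, the `example.com` base URL, and the link format below are placeholders; the Slack and CRM calls are left out entirely.

```python
import itertools
import uuid
from typing import Iterable

BOOKING_BASE = "https://example.com/book"   # placeholder, not a real scheduler URL


class Router:
    """Hands out reps in a fixed rotation and builds one booking link per lead."""

    def __init__(self, reps: Iterable[str]):
        self._cycle = itertools.cycle(list(reps))

    def route(self, lead_email: str) -> dict:
        rep = next(self._cycle)
        token = uuid.uuid4().hex[:8]        # random suffix makes each link unique
        return {
            "lead": lead_email,
            "rep": rep,
            "booking_url": f"{BOOKING_BASE}/{rep}/demo-{token}",
        }


router = Router(["alice", "bob", "carol"])  # hypothetical sales team
assignments = [router.route(f"lead{i}@acme.com") for i in range(4)]
# Reps are assigned alice, bob, carol, then alice again.
```

An in-memory cycle resets on restart, so a persistent version would store the last assigned rep alongside the leads.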
Testing & Development

No real company needed: here's how testing worked

01
Synthetic lead generation with Faker
A Python script generates realistic B2B demo form submissions — names, company email addresses, job titles, budgets, and timelines. Each run produces hundreds of unique, plausible leads. Seed is fixed so results are reproducible.
02
Mock company enrichment database
Instead of paying for Clearbit or Apollo, a JSON file contains realistic company profiles keyed by email domain. Unknown companies get synthetic data generated on the fly — weighted by industry to match real-world distributions.
03
Pre-written qualification responses
10 realistic "email reply" scenarios cover the full range — from a CEO with $100K+ budget who needs this immediately, to a pre-Series A startup with no budget and board approval required. The LLM or rule engine processes each one.
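The fixed-seed idea from the first step can be shown with the standard library alone. This sketch swaps Faker for `random.Random` and a few tiny hand-written pools, so it illustrates reproducibility rather than realism; every name and value in the pools is made up.

```python
import random
from typing import List

# Tiny hypothetical pools; Faker would supply far more variety.
FIRST = ["Ava", "Liam", "Noah", "Mia"]
LAST = ["Chen", "Patel", "Garcia", "Smith"]
COMPANIES = ["Acme", "Brightpay", "Northwind", "Medloop"]
TITLES = ["CEO", "VP Sales", "Director of Ops", "Marketing Manager", "Analyst"]
BUDGETS = [None, "<$10K", "$10K-$50K", "$100K+"]


def generate_leads(n: int, seed: int = 42) -> List[dict]:
    """Generate n fake demo-form submissions; the same seed gives the same leads."""
    rng = random.Random(seed)   # private generator, unaffected by global random state
    leads = []
    for _ in range(n):
        first, last = rng.choice(FIRST), rng.choice(LAST)
        company = rng.choice(COMPANIES)
        leads.append({
            "name": f"{first} {last}",
            "email": f"{first}.{last}@{company}.com".lower(),
            "title": rng.choice(TITLES),
            "budget": rng.choice(BUDGETS),
        })
    return leads
```

Because the generator is seeded, a scoring change can be compared against the exact same batch of leads before and after.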
Business Impact

The rep hours this system saves add up fast

Based on a realistic B2B SaaS company receiving 300 demo requests per month, with a $70/hr average rep cost.

120h
Rep hours saved monthly
300 leads × 40% unqualified × 1 hr each = 120 hours of manual review that reps no longer do
$8,400
Monthly labor value recovered
120 hours × $70/hr — that's labor that can now go toward closing, not sorting
<15 min
Speed-to-lead
Down from 6–24 hours of manual review. A faster first response reaches leads while they are still interested.
100%
Consistent scoring
Every lead is evaluated against the same rubric, every time. No gut-feel variation between reps.
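The hours and dollar figures above follow from one short calculation. The function name is illustrative; the inputs are the ones stated in this section.

```python
def monthly_savings(leads_per_month: int, unqualified_rate: float,
                    hours_per_lead: float, hourly_cost: float):
    """Return (hours saved, labor value) for leads that no longer need manual review."""
    hours = leads_per_month * unqualified_rate * hours_per_lead
    return hours, hours * hourly_cost


hours, dollars = monthly_savings(300, 0.40, 1.0, 70.0)
print(f"{hours:.0f} h, ${dollars:,.0f}")   # 120 h, $8,400
```

The result scales linearly with the unqualified share: at 30% the same inputs give 90 hours and $6,300, and at 50% they give 150 hours and $10,500.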
Skills Demonstrated

What went into building this

Backend
Python · FastAPI · SQLAlchemy · SQLite · Pydantic
AI / LLM
OpenAI API · GPT-4o-mini · Prompt Engineering · Structured Extraction
Multi-Agent Design
Agent Orchestration · Pipeline Architecture · Fallback Logic · State Management
Data & Viz
Streamlit · Plotly · Pandas · Faker
Integrations
HubSpot · Salesforce · Slack Webhooks · Calendly
Engineering
REST API Design · OOP · Regex Parsing · Logging · CLI Tools
View the Source
See it in action

The full codebase — all five agents, the Streamlit dashboard, data generators, mock enrichment database, and FastAPI backend — is on GitHub.
