Have you ever looked at a massive pile of resumes and wished there was a faster way to find the perfect candidate? What if you could chat with all those CVs at once and get match scores based on your specific job description?
In this post, we’re going to break down the technical magic behind our CV Intelligence Lite app. We’ll walk through how it’s built using Streamlit (for the user interface), LangChain (for document processing and AI chaining), and Google’s Gemini 2.5 Flash (the brain that reads and analyzes the CVs).
And don’t worry—we’ll keep the tech jargon to a minimum!
The Big Picture: How Does It Work?
Our app does two main things:
- Analyze CVs against a Job Description: You upload resumes (PDF or Word docs) and paste a job description. The app reads everything and gives you a match score (0-100) along with a summary of each candidate.
- Chat with the CVs: You can ask the app questions like “Who has the most Python experience?” or “Does anyone have a background in finance?”
Here is the tech stack we used to make this happen:
- Python: The core programming language.
- Streamlit: A fantastic Python library that turns scripts into shareable web apps in minutes.
- LangChain: A framework for developing applications powered by language models. It handles the messy parts of reading different file formats and organizing our chat history.
- Google Gemini 2.5 Flash: A lightning-fast, highly capable Artificial Intelligence model that actually reads and understands the text.
Step-by-Step Implementation Breakdown
Let’s look under the hood at the Python code that makes this possible (app.py).
1. Setting Up the Environment and UI
First, we need to load our required tools and set up the look and feel of our app.
```python
import streamlit as st
import tempfile
import os
from dotenv import load_dotenv
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_community.document_loaders import PyPDFLoader, Docx2txtLoader
from langchain_core.messages import HumanMessage, SystemMessage, AIMessage

# Load environment variables (like our API keys)
load_dotenv()

# Page Config
st.set_page_config(page_title="CV Analyzer Lite", page_icon="📝", layout="wide")
```
To make our app look professional and modern, we use some custom CSS styling right inside Streamlit to give it dark mode vibes and a colorful header.
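The full stylesheet isn't reproduced here, but a minimal sketch of the technique looks like this (the selectors and colors below are illustrative, not the app's actual CSS):

```python
# Illustrative dark-theme CSS; the real app.py uses its own selectors and colors.
CUSTOM_CSS = """
<style>
  .stApp { background-color: #0e1117; color: #fafafa; }
  h1 { background: linear-gradient(90deg, #ff4b4b, #ffa421);
       -webkit-background-clip: text;
       -webkit-text-fill-color: transparent; }
</style>
"""

# Injected once near the top of the script:
# st.markdown(CUSTOM_CSS, unsafe_allow_html=True)
```

Passing `unsafe_allow_html=True` tells Streamlit to render the raw `<style>` tag instead of escaping it, which is what lets the custom styling take effect.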
2. Remembering the State (Session State)
Web apps are naturally forgetful. Every time you click a button or type something, Streamlit reruns the code from top to bottom. To prevent it from forgetting our uploaded CVs or our chat history, we use Streamlit’s session_state. Think of session_state as the app’s short-term memory.
```python
def initialize_session_state():
    if "cv_text" not in st.session_state:
        st.session_state.cv_text = ""
    if "chat_history" not in st.session_state:
        st.session_state.chat_history = []
    if "analysis" not in st.session_state:
        st.session_state.analysis = ""
```
We are remembering three things: the raw text from the CVs, the previous chat messages, and the initial analysis summary.
3. Extracting Text from Files
When a user uploads a PDF or a Word document, our AI can’t just “look” at the file. It needs plain text. LangChain provides excellent tools (called Loaders) to extract text from files.
```python
def extract_text(uploaded_files):
    combined_text = ""
    for uploaded_file in uploaded_files:
        # 1. Save the uploaded file temporarily so our tools can read it
        with tempfile.NamedTemporaryFile(delete=False, suffix=os.path.splitext(uploaded_file.name)[1]) as tmp:
            tmp.write(uploaded_file.getvalue())
            tmp_path = tmp.name
        try:
            # 2. Pick the right tool based on the file type
            loader = PyPDFLoader(tmp_path) if uploaded_file.name.endswith(".pdf") else Docx2txtLoader(tmp_path)
            docs = loader.load()
            # 3. Combine all the text into one big string
            combined_text += f"\n--- Start of {uploaded_file.name} ---\n"
            combined_text += "\n".join([doc.page_content for doc in docs])
            combined_text += f"\n--- End of {uploaded_file.name} ---\n"
        finally:
            os.remove(tmp_path)  # Clean up the temporary file
    return combined_text
```
This function neatly packages all the resumes into one giant, labeled text document that our AI can read.
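To see the labeling scheme in isolation (without the temp-file handling), here is a stripped-down sketch using plain strings in place of uploaded files. The filenames and CV text are made up for illustration:

```python
# Mimics the delimiter format from extract_text(), using fake CV text.
def combine_labeled(cv_texts: dict) -> str:
    combined = ""
    for name, text in cv_texts.items():
        combined += f"\n--- Start of {name} ---\n"
        combined += text
        combined += f"\n--- End of {name} ---\n"
    return combined

result = combine_labeled({
    "alice_cv.pdf": "Alice. 5 years of Python.",
    "bob_cv.docx": "Bob. Finance background.",
})
print(result)
```

The `--- Start of ... ---` markers matter: they tell the model where one candidate ends and the next begins, so it can attribute skills to the right person.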
4. The Core Logic: Processing and Analyzing
Now we get to the fun part. The user interface uses st.columns to split the screen into two halves: the upload section (left) and the chat section (right).
When the user clicks “🚀 Process CVs”, here is what happens:
- Extract Text: We run our `extract_text()` function.
- Call the AI: We initialize `ChatGoogleGenerativeAI` with `temperature=0`. Temperature controls how "creative" the AI is. For precise CV analysis, we want it to be focused and factual, hence zero.
- The Prompt: We give the AI a clear set of instructions.
```python
llm = ChatGoogleGenerativeAI(model="gemini-2.5-flash", temperature=0)

prompt = f"""
Analyze these CVs against the following Job Description:

JOB DESCRIPTION:
{job_desc}

CV DATA:
{st.session_state.cv_text}

Provide a brief summary for each candidate, their match score (0-100), and a final recommendation.
"""

response = llm.invoke(prompt)
st.session_state.analysis = response.content
```
The AI reads the prompt, compares the combined CV text with the job description, and returns a nicely formatted summary that we display on the screen.
5. Chatting With the CVs
The second column of our app allows for conversational interaction.
To enable chatting, we send a continuous stream of messages back and forth to the AI. We start with a SystemMessage giving the AI its persona (an HR assistant) and the context (the extracted CV text).
```python
llm = ChatGoogleGenerativeAI(model="gemini-2.5-flash", temperature=0.3)

messages = [
    # Tell the AI who it is and what information it knows
    SystemMessage(content=f"You are an HR assistant. Use the following CV data as context:\n{st.session_state.cv_text}"),
]

# Add the most recent chat history so the AI remembers the conversation
for msg in st.session_state.chat_history[-6:]:
    if msg["role"] == "user":
        messages.append(HumanMessage(content=msg["content"]))
    else:
        messages.append(AIMessage(content=msg["content"]))

# Send the whole conversation to the AI and get a response
response = llm.invoke(messages)
```
Note: We set the temperature slightly higher (0.3) here to make the chat feel a bit more natural and conversational.
By passing the recent chat history (st.session_state.chat_history[-6:]) along with every new question, the AI “remembers” what you were just talking about, allowing for follow-up questions!
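The sliding window is plain list slicing, and it never raises even when the conversation is shorter than six messages. Here is the same idea isolated, with tuples standing in for the LangChain message classes (the roles and contents below are made up):

```python
# Build a fake 10-message history alternating user/assistant turns.
chat_history = [
    {"role": "user", "content": f"question {i}"} if i % 2 == 0
    else {"role": "assistant", "content": f"answer {i}"}
    for i in range(10)
]

# Keep only the last 6 turns, then prepend the system persona.
window = chat_history[-6:]
messages = [("system", "You are an HR assistant.")]
for msg in window:
    kind = "human" if msg["role"] == "user" else "ai"
    messages.append((kind, msg["content"]))
```

Capping the window keeps the prompt small and the cost predictable; only the system message (with the CV text) and the last few turns are re-sent each time.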
The Result: A Production-Ready Architecture (Without the Complexity)
And there you have it! By combining Streamlit for a rapid UI, LangChain for handling documents and conversation history, and the sheer processing power of Gemini Flash, we built a tool that takes the headache out of CV screening.
The beauty of this architecture is its simplicity:
- No complex databases or Vector Stores: We just inject the text directly into the prompt (since Gemini has a massive context window capable of reading hundreds of pages at once).
- Easy Deployment: A `requirements.txt` file and a Python script are all you need to host this on Streamlit Community Cloud or any other PaaS.
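A plausible `requirements.txt` for this stack might look like the following (these are the standard PyPI package names for the imports used above; pin versions as needed). Note that `pypdf` and `docx2txt` are the backends LangChain's `PyPDFLoader` and `Docx2txtLoader` rely on:

```text
streamlit
python-dotenv
langchain-google-genai
langchain-community
pypdf
docx2txt
```

With that in place, `streamlit run app.py` starts the app locally.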
Next time you have a stack of fifty resumes, let the AI read them first!
Github – https://github.com/sethlahaul/streamlit-langchain-cv-analyzer
