Create Useful Recruiting Application powered by LLM
Software Development Journey Overview
Follow this step-by-step software development journey to see real progress updates, challenges overcome, and practical experience.
Progress Updates (6 total)
Update #1: Create Useful Recruiting Application powered by LLM
I want to make an AI/LLM-powered search tool for recruiting so I don't have to build endless keyword strings. Traditional searches pull in resumes stuffed with buzzwords, even when candidates don't actually have the right experience. I need a tool that understands context and scopes out the best talent.
Challenges Overcome: I used Cursor to start building a tool that can reliably turn resumes (PDFs) into structured JSON, which then gets fed to ChatGPT via the OpenAI API for natural language search.
Obstacles Faced: If I ever want to commercialize this, there are PII (Personally Identifiable Information) considerations to be made when sending resumes to ChatGPT for parsing. I'll need to figure out a way to remove PII before using an AI model to parse through them. I will also need to integrate this with many different ATS (Applicant Tracking Systems), which are all different.
Looking Back: First, turn the inconsistently formatted resumes into consistently structured JSON files so the LLM has an easier time parsing the information. It's faster and cheaper that way.
Update #2: Turn inconsistently formatted PDFs into structured JSON
10% complete
Used Cursor to build a tool that takes inconsistently formatted resumes (PDFs), sends them to GPT-4o mini via the OpenAI API (which reliably formats the text), then saves the model's output as a consistently structured JSON file.
Challenges Overcome: Used Cursor to make a page with an HTML frontend and Python running in the backend to grab the text from resumes I upload, send it to ChatGPT via the OpenAI API, and have GPT-4o mini structure that text into a JSON file. Eventually that JSON file will be fed back into ChatGPT when I build the natural language search and filter tool.
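A rough sketch of that parsing pipeline is below, assuming the pypdf and openai Python packages; the JSON schema fields and function names are illustrative, not necessarily the ones the project actually uses.

```python
# Hypothetical sketch: extract resume text from a PDF, ask GPT-4o mini to
# normalize it into a fixed JSON schema, and save the result to disk.
import json
from pypdf import PdfReader   # assumed PDF library
from openai import OpenAI     # OpenAI Python SDK

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SCHEMA_HINT = (
    'Return JSON with keys: "name", "email", "skills" (list), '
    '"work_history" (list of {"title", "company", "is_current", "years"}).'
)

def resume_pdf_to_json(pdf_path: str, out_path: str) -> dict:
    # 1. Pull raw text out of the inconsistently formatted PDF.
    reader = PdfReader(pdf_path)
    raw_text = "\n".join(page.extract_text() or "" for page in reader.pages)

    # 2. Have the model restructure the text into consistent JSON.
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": "You convert resume text into structured JSON. " + SCHEMA_HINT},
            {"role": "user", "content": raw_text},
        ],
    )
    structured = json.loads(response.choices[0].message.content)

    # 3. Save the consistently structured JSON for later search and filtering.
    with open(out_path, "w") as f:
        json.dump(structured, f, indent=2)
    return structured
```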
Update #3: Developed UI for the Application
15% complete
Needed a UI that imports candidate information from an ATS into my app, where it can be filtered and searched by my LLM-powered tool. Modeled the UI off of Workable (an Applicant Tracking System) for now.
Update #4: Make the LLM-powered Search Feature Actually Work
50% complete
Used the OpenAI API again to return candidates that match my natural language search criteria. The application sends every resume, in the form of a structured JSON file, to ChatGPT and asks whatever question I've typed in. Then the system only displays the candidates who actually meet those criteria. Amazingly, it's highly accurate.
Challenges Overcome: Every resume is sent to ChatGPT with this prompt:
You are a hyper-critical hiring assistant. Your job is to determine if a candidate's resume (in JSON format) strictly meets all criteria in a user's query. Follow these steps:
1. **Analyze the Query:** Identify every single constraint in the user's query (e.g., skill, years of experience, current role).
2. **Strictly Evaluate the Resume:** Scrutinize the provided resume JSON. Look for explicit, undeniable evidence that the candidate meets EVERY constraint. Pay special attention to work history. If the query specifies 'current position', you must only look at the work entry where `is_current` is `true`. A skill found elsewhere is an automatic disqualification for this query.
3. **Formulate a Decision:** Based on your analysis, decide if the candidate is a perfect match. Ambiguity or a partial match is a "NO".
4. **Generate an Explanation:** Write a one-sentence explanation for your decision. If the answer is "NO", state which specific constraint was not met. If the answer is "YES", state how the primary constraint was met.
5. **Format the Output:** Return a JSON object with two keys:
- "match": Your final decision, either "YES" or "NO".
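As a rough illustration, the per-resume check could look something like the sketch below. It assumes the openai Python package and the structured JSON files from Update #2; the function and variable names are mine, not necessarily the project's.

```python
# Hypothetical sketch: ask GPT-4o mini whether one structured resume strictly
# matches the recruiter's natural language query, then keep only "YES" results.
import json
from openai import OpenAI

client = OpenAI()

HIRING_PROMPT = "You are a hyper-critical hiring assistant. ..."  # the full prompt shown above

def resume_matches(resume_json: dict, query: str) -> dict:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": HIRING_PROMPT},
            {"role": "user", "content": f"Query: {query}\n\nResume JSON:\n{json.dumps(resume_json)}"},
        ],
    )
    return json.loads(response.choices[0].message.content)  # e.g. {"match": "YES", ...}

def search(resumes: list[dict], query: str) -> list[dict]:
    # Display only the candidates the model judges to be a strict match.
    return [r for r in resumes if resume_matches(r, query).get("match") == "YES"]
```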
Update #5: Remove PII (Personally Identifiable Information) before sending resume data to OpenAI
70% complete
Running a local Mistral 7B Instruct model on my computer for initial parsing. Then any PII in the JSON output (candidate name, email, GitHub URL, etc.) is tokenized, swapped for placeholder tokens, before being sent to OpenAI for the actual search feature.
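A minimal sketch of what that tokenization step could look like, assuming the PII lives in known JSON fields; the field names and token format here are assumptions, not the app's actual ones.

```python
# Hypothetical sketch: swap known PII fields for opaque tokens before the JSON
# leaves the machine, keeping a local map so results can be re-identified later.
import copy

PII_FIELDS = ["name", "email", "phone", "github_url", "linkedin_url"]  # assumed field names

def tokenize_pii(resume: dict, candidate_id: str) -> tuple[dict, dict]:
    redacted = copy.deepcopy(resume)
    token_map = {}  # token -> original value, kept locally and never sent to OpenAI
    for field in PII_FIELDS:
        if redacted.get(field):
            token = f"[{field.upper()}_{candidate_id}]"
            token_map[token] = redacted[field]
            redacted[field] = token
    return redacted, token_map

def detokenize(text: str, token_map: dict) -> str:
    # Restore the original values in anything shown back to the recruiter.
    for token, original in token_map.items():
        text = text.replace(token, original)
    return text
```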
Obstacles Faced: Running a local LLM is not scalable or commercializable. I'm considering using an AWS EC2 GPU instance to host it.
Update #6: Set up a private Mistral LLM on an AWS EC2 GPU and wired the Skill Scope backend to it for PII-safe resume parsing
80% complete
Stood up a private LLM on AWS using a GPU EC2 instance (g4dn.xlarge) to keep all resume PII private. On that instance, installed Docker + NVIDIA tooling and ran Ollama with a Mistral 7B Instruct model, binding the API strictly to localhost so it was not accessible on any external network interface. Connected the Skill Scope backend to that private LLM via AWS Systems Manager (SSM) port forwarding, which used IAM-authenticated, encrypted sessions and required no public ports, SSH keys, or inbound rules. The backend performed PDF/DOCX text extraction locally and sent prompts through the SSM tunnel; the LLM's responses returned over the same tunnel, and structured JSON was produced and stored locally.
To pull the container and model once, outbound egress was briefly enabled from the private subnet through a temporary NAT gateway and the instance's egress was widened; no inbound exposure was added. The public GPU instance was stopped to reduce cost. Net result: end-to-end parsing ran against the EC2-hosted Mistral with no PII sent to external LLMs. (This was all done using Cursor running commands on the AWS CLI. Jad Hanna helped as well.)
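For illustration, once an SSM port-forwarding session maps a local port to the instance's localhost-only Ollama API, the backend call could look roughly like the sketch below; the local port, model tag, and helper name are assumptions, not the exact project setup.

```python
# Hypothetical sketch: send locally extracted resume text to the private Mistral
# model through the SSM port-forwarding tunnel (local port -> the instance's
# localhost-only Ollama API). No resume data goes to an external LLM.
import json
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumed forwarded local port

def parse_resume_privately(resume_text: str) -> dict:
    prompt = (
        "Convert the following resume text into structured JSON with keys "
        '"name", "email", "skills", and "work_history":\n\n' + resume_text
    )
    response = requests.post(
        OLLAMA_URL,
        json={"model": "mistral:7b-instruct", "prompt": prompt, "format": "json", "stream": False},
        timeout=300,
    )
    response.raise_for_status()
    # Ollama returns the model's output in the "response" field.
    return json.loads(response.json()["response"])
```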
Looking Back: Be careful about what Cursor creates on AWS. I accidentally had two unnecessary EC2 instances running for a while and had to pay $130.