Optimizing HR Processes with LangChain's Tagging System

At CodeDuo, we faced a significant challenge: our small team was overwhelmed with the task of reviewing nearly 400 CVs for a single role. This manual process was time-consuming, prone to inconsistencies, and inefficient. Recognizing the need for a more streamlined approach, we decided to leverage our technical expertise to create a tool that would automate and optimize the candidate evaluation process.

To address this challenge, we developed an HR ranking application using LangChain’s Tagging System, an LLM (ChatGPT), and Chainlit. The application extracts structured data from CVs, ranks candidates against specified criteria, and presents the results in an easily digestible format. Here’s a technical breakdown of how we achieved this.

Overview

We created an HR ranking application with a user-friendly web-based interface designed to streamline the daunting task of reviewing CVs. Combining LangChain with ChatGPT and Chainlit, the application gives HR team members a single guided workflow.

When users access the application, they are asked to enter the job description for the open position. The system then identifies the relevant technical skills required for the job and presents this information to the user. Following this, the system asks users to assign weights to each skill, allowing them to indicate the importance of each skill in relation to the role.
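The weighting step can be sketched in a few lines. This is a minimal illustration, not the application's actual code: the `parse_weights` helper and the `Python: 3, SQL: 1` input format are assumptions made for the example. It turns the user's reply into weights that sum to 1, so scores stay comparable regardless of the scale the user picked.

```python
from typing import Dict, List


def parse_weights(raw: str, skills: List[str]) -> Dict[str, float]:
    """Parse 'skill: weight' pairs and normalize the weights to sum to 1.

    The comma-separated 'skill: weight' format is an assumption for
    illustration; the real application may collect weights differently.
    """
    weights: Dict[str, float] = {}
    for part in raw.split(","):
        if not part.strip():
            continue  # tolerate trailing commas
        name, _, value = part.partition(":")
        name = name.strip()
        if name not in skills:
            raise ValueError(f"Unknown skill: {name!r}")
        weight = float(value)  # raises ValueError on a missing or bad number
        if weight < 0:
            raise ValueError(f"Weight for {name!r} must not be negative")
        weights[name] = weight

    # Skills the user did not mention count as weight 0.
    for skill in skills:
        weights.setdefault(skill, 0.0)

    total = sum(weights.values())
    if total == 0:
        raise ValueError("At least one skill needs a positive weight")
    return {skill: weight / total for skill, weight in weights.items()}
```

For example, `parse_weights("Python: 3, SQL: 1", ["Python", "SQL", "Docker"])` gives Python three quarters of the total weight, SQL one quarter, and Docker none.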

Figure: Job description prompt and skill extraction

Once the skills and weights are specified, users can effortlessly drag and drop CVs onto the interface. The system then proceeds to process the CVs, extracting pertinent information and providing real-time updates on the progress.

Figure: File uploads

Upon completion, the application presents users with a comprehensive summary, including candidate scores and a visually intuitive bar chart depicting rankings.

Figure: Results

Decoding the Application Flow

The application’s workflow can be visualized using Business Process Model and Notation (BPMN). It encompasses three key stages:

User interaction: Gathering job description and weights.
LLM interaction through LangChain: Extracting personal and skill-related data from CVs.
Summary generation: Scoring candidates and generating rankings.

Figure: BPMN diagram
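To make the third stage concrete, here is one way the scoring and ranking could work. The article does not spell out the formula, so everything below is an assumption for illustration: the `SkillResult` record, the ten-year cap, and the weighted-sum score are hypothetical, not the application's confirmed method.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class SkillResult:
    """One extracted skill for one candidate (hypothetical record)."""
    skill: str
    has_skill: bool
    years: int


def score_candidate(results: List[SkillResult],
                    weights: Dict[str, float],
                    max_years: int = 10) -> float:
    """Weighted sum of capped years of experience, scaled to 0..1 per skill."""
    score = 0.0
    for result in results:
        if not result.has_skill:
            continue
        # Cap the years so one very long stint cannot dominate the score.
        capped = min(max(result.years, 0), max_years)
        score += weights.get(result.skill, 0.0) * capped / max_years
    return score


def rank_candidates(candidates: Dict[str, List[SkillResult]],
                    weights: Dict[str, float]) -> List[Tuple[str, float]]:
    """Return (name, score) pairs sorted from best to worst."""
    scored = [(name, score_candidate(results, weights))
              for name, results in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

With weights that sum to 1, every score falls between 0 and 1, which makes the ranked list easy to plot as a bar chart.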

Understanding the Architecture

Our HR ranking application is built upon a robust architecture, comprising two main components:

LangChain-based Engine: Responsible for processing skills, weights, and CVs.
Chainlit-based UI: Facilitates seamless interaction with the application.

Let’s dissect each component to understand its role and implementation in detail.

LangChain-based Engine: Extracting Insights from CVs

The core functionality of our application lies in its ability to extract structured data from CVs. This is achieved through the utilization of LangChain’s Tagging System. Here’s how it works:

Dynamic Schema Creation: We dynamically generate schemas tailored to extract both personal and skill-related information from CVs. This ensures flexibility and accuracy in data extraction.

```
from typing import Optional

# Depending on the installed LangChain version, BaseModel and Field may need
# to come from its pydantic v1 compatibility module instead.
from pydantic import BaseModel, Field


class CandidateInfoResponse(BaseModel):
    """Personal details to extract from a CV."""
    name: Optional[str] = Field(..., description="Candidate's name")
    email: str = Field(..., description="Candidate's email address")
    phone: str = Field(..., description="Candidate's phone number")
    portfolio: str = Field(..., description="Candidate's portfolio website")
    github: str = Field(..., description="Candidate's GitHub URL")
    linkedin: str = Field(..., description="Candidate's LinkedIn URL")
    present_job: str = Field(..., description="Current position and company")
    education: str = Field(..., description="Highest level of education")
    years_of_experience: Optional[int] = Field(..., description="Total years of experience")


class SkillNumberOfYearsResponse(BaseModel):
    """Experience with a single skill."""
    has_skill: bool = Field(..., description="describes whether the candidate has experience in the current skill")
    number_of_years_with_skill: int = Field(..., description="describes how many years of experience the candidate has in the current skill")
    skill: str
```

Personal Data Extraction: Using LangChain’s Pydantic-based tagging chain, we extract personal information such as candidate names, email addresses, phone numbers, and current positions. This data is crucial for building a comprehensive candidate profile.
```
from langchain.chains import create_tagging_chain_pydantic

def extract_personal_info(doc):
    chain = create_tagging_chain_pydantic(CandidateInfoResponse, cfg.llm)
    return chain.run(doc)
```

Skill-related Data Extraction: Leveraging JSON-based schemas, we extract skill-related data, including years of experience with specific technologies. This involves meticulous schema generation and tagging to ensure precise extraction of relevant information.

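```
from langchain.chains import create_tagging_chain


def create_skill_schema(skill):
    # Build a per-skill JSON schema; spaces become underscores in property names.
    skill_key = skill.replace(" ", "_")
    schema = {
        "properties": {
            f"has_{skill_key}_experience": {"type": "boolean"},
            f"years_with_{skill_key}": {"type": "integer"},
            "skill": {"type": "string"}
        }
    }
    return schema


def extract_skill_info(doc, skill):
    # cfg.llm is the application's configured LLM, as in extract_personal_info.
    schema = create_skill_schema(skill)
    chain = create_tagging_chain(schema, cfg.llm)
    return chain.run(doc)
```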
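Because each schema uses skill-specific property names, the results for different skills come back under different keys. A small normalization step, sketched below, maps them onto one common shape. The `normalize_skill_result` helper is hypothetical, and it assumes the chain returns a plain dictionary keyed by the schema's property names; the output field names follow `SkillNumberOfYearsResponse`.

```python
from typing import Any, Dict


def skill_key(skill: str) -> str:
    """Mirror the key convention used by create_skill_schema."""
    return skill.replace(" ", "_")


def normalize_skill_result(raw: Dict[str, Any], skill: str) -> Dict[str, Any]:
    """Map skill-specific keys onto one common record shape.

    Assumes `raw` is a dict keyed by the schema's property names. Missing or
    malformed values fall back to "no experience" instead of raising.
    """
    key = skill_key(skill)
    has_skill = bool(raw.get(f"has_{key}_experience", False))
    try:
        years = max(int(raw.get(f"years_with_{key}") or 0), 0)
    except (TypeError, ValueError):
        years = 0
    return {
        "skill": skill,
        "has_skill": has_skill,
        # Ignore reported years when the model says the skill is absent.
        "number_of_years_with_skill": years if has_skill else 0,
    }
```

Falling back to zero rather than raising keeps one odd model response from aborting a batch of several hundred CVs, at the cost of silently under-scoring that candidate for the affected skill.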

Chainlit-based UI: Streamlining User Interaction

The user interface plays a pivotal role in enhancing the user experience and streamlining the candidate evaluation process. Here’s how we’ve leveraged Chainlit to achieve this:

Intuitive Interaction Model: The UI guides users through the process of inputting skills and weights, ensuring clarity and ease of use.
Real-time Processing Updates: Users receive real-time updates on the progress of CV processing, enhancing transparency and efficiency.
File Upload Functionality: The UI allows users to effortlessly upload CVs, simplifying the data input process.
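The progress reporting described above can be sketched independently of the UI framework. In this illustration, `process_cvs`, `extract`, and `on_progress` are hypothetical names: `extract` stands in for the LangChain extraction step, and `on_progress` is any callable that shows a status line, which in a Chainlit app would presumably send a chat message.

```python
from typing import Any, Callable, Dict, Iterable, List


def process_cvs(paths: Iterable[str],
                extract: Callable[[str], Dict[str, Any]],
                on_progress: Callable[[str], None]) -> List[Dict[str, Any]]:
    """Run `extract` on each CV and report progress after every file."""
    files = list(paths)
    results: List[Dict[str, Any]] = []
    for index, path in enumerate(files, start=1):
        try:
            results.append(extract(path))
            on_progress(f"Processed {index}/{len(files)}: {path}")
        except Exception as exc:
            # One unreadable CV should not stop the rest of the batch.
            on_progress(f"Failed {index}/{len(files)}: {path} ({exc})")
    return results
```

Passing `print` as `on_progress` is enough to try this from a terminal before wiring it to a UI.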

Challenges and Future Directions

Despite the promising results, we faced challenges such as accurately interpreting unstructured CV data and optimizing model selection. Our experiments with different language models, including gpt-4o, highlighted the need for further refinement.

Moving forward, we aim to address several areas to enhance the application:

Expanding File Format Support: Currently, the application only accepts PDF files. We plan to extend support to other file formats such as DOCX, TXT, and even online profiles like LinkedIn. This will broaden the range of input sources and make the tool more versatile.
Optimizing Model Selection: Further research and experimentation with different language models will help us balance accuracy and computational efficiency. We aim to refine the model selection process to ensure optimal performance.
Enhanced User Feedback: Implementing more detailed progress indicators and feedback mechanisms in the UI will provide users with better insights into the processing stages and outcomes.
Integration with HR Systems: To streamline HR workflows, we plan to integrate the application with existing HR management systems. This will allow seamless data transfer and better coordination within the HR team.

Conclusion

By leveraging LangChain’s Tagging System, ChatGPT, and Chainlit, we’ve transformed a tedious manual process into a streamlined, efficient, and scalable solution. This technical innovation not only saves time but also ensures a more consistent and objective candidate evaluation, paving the way for better hiring decisions. With ongoing improvements and expansions, we are committed to pushing the boundaries of HR technology and delivering even greater value to our users.

Explore the Code

For those interested in the technical implementation, the complete codebase for this application is available on GitHub in the CodeDuoLabs/hr-ranking-langchain repository.

About CodeDuo

At CodeDuo, we specialize in creating cutting-edge software solutions to address complex business challenges. As your technical partner, we leverage our engineering know-how to turn your ideas into successful business solutions. We provide tailored software services and expert consulting to shape your digital future, combining the nimbleness of a startup with the dependability of an enterprise.
