Design and implement a Retrieval-Augmented Generation (RAG) system for efficient, context-aware resolution of HR policy queries, combining document retrieval with LLM-based generation to extract and synthesize relevant information from policy documents.
This project addresses the need for an automated solution capable of answering HR policy-related queries with high precision. The system is designed to:
- Process and organize policy documents.
- Retrieve contextually relevant content based on user input.
- Generate accurate, informative responses that reflect the content of the documents.
Given a query regarding HR policies, the system should retrieve the most relevant document segments and generate a contextually relevant response based on them.
- HR Policy Document PDFs – The primary source of policy-related data.
- Pre-trained Embedding Model ("all-mpnet-base-v2") – Used for converting document chunks into embeddings (a short sketch follows this list).
- Vector Storage (e.g., ChromaDB or FAISS) – For storing document chunks and their embeddings.
- Mistral-7B-Instruct-v0.2 Large Language Model (LLM) – For generating contextually relevant responses.
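As a quick illustration of the embedding component, here is a minimal sketch assuming the `sentence-transformers` package is installed; the chunk texts are hypothetical examples, not excerpts from the policy document:

```python
# Minimal embedding sketch with sentence-transformers
# (pip install sentence-transformers); the chunks below are hypothetical.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-mpnet-base-v2")
chunks = [
    "Employees must disclose potential conflicts of interest.",       # hypothetical chunk
    "Gifts above a nominal value may not be accepted from vendors.",  # hypothetical chunk
]
embeddings = model.encode(chunks)
print(embeddings.shape)  # (2, 768): all-mpnet-base-v2 produces 768-dim vectors
```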
- A robust system capable of processing and indexing HR policy documents.
- A retrieval component that surfaces the most relevant document segments for a given query.
- Accurate and contextually relevant responses generated by the system.
- Input: HR policy query
- Output: Generated response
The system follows a series of steps to process documents and generate responses to queries:
- Document Loading: Policy documents are loaded from the repository.
- Document Splitting: Documents are divided into chunks for easier retrieval.
- Embedding Generation: Each chunk is converted into embeddings using the "all-mpnet-base-v2" model.
- Vector Storage: Chunks and embeddings are stored in ChromaDB for efficient retrieval.
- Query Resolution: The system retrieves relevant document chunks based on the query and combines them with the input to generate a final response using the Mistral-7B-Instruct-v0.2 LLM.
- Response Output: A contextually accurate response is generated and returned to the user (an end-to-end sketch of these steps follows below).
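The sketch below strings these steps together. It is a minimal illustration, assuming LangChain's partner packages (`langchain-community`, `langchain-text-splitters`, `langchain-huggingface`, `langchain-chroma`), `pypdf`, and Hugging Face `transformers`; the chunk size, top-k value, sample query, and persist directory are illustrative assumptions, not the notebook's exact configuration:

```python
# Minimal end-to-end sketch of the workflow above; chunk_size, k, the sample
# query, and the persist directory are illustrative assumptions.
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_chroma import Chroma
from transformers import pipeline

# 1. Document loading: read the policy PDF into page-level documents.
docs = PyPDFLoader("jp-morgan-chase-code-of-conduct-policy.pdf").load()

# 2. Document splitting: divide pages into overlapping chunks.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=150)
chunks = splitter.split_documents(docs)

# 3-4. Embedding generation and vector storage in ChromaDB.
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")
vectordb = Chroma.from_documents(chunks, embeddings, persist_directory="chroma_db")

# 5. Query resolution: retrieve the most relevant chunks and build a prompt.
query = "What does the policy say about accepting gifts from clients?"  # hypothetical
retrieved = vectordb.similarity_search(query, k=4)
context = "\n\n".join(doc.page_content for doc in retrieved)
prompt = (
    "Answer the question using only the context below.\n\n"
    f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
)

# 6. Response output: generate the answer with Mistral-7B-Instruct-v0.2
#    (the 4-bit quantized loading sketched later keeps memory use manageable).
llm = pipeline("text-generation", model="mistralai/Mistral-7B-Instruct-v0.2")
answer = llm(prompt, max_new_tokens=256, return_full_text=False)[0]["generated_text"]
print(answer)
```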
- Code File: rag_llm_pipeline_query_jpmc_code_of_conduct_policy.ipynb – Jupyter notebook with the implementation of the RAG pipeline and query-resolution logic.
- HR Policy PDF: jp-morgan-chase-code-of-conduct-policy.pdf – Example of a JP Morgan HR policy document for querying.
- Model Saving & Quantization: Script_For_Saving_LLM_Quantized_4bit_Model+Tokenizer.ipynb – Code to download the Mistral-7B-Instruct-v0.2 LLM and its tokenizer and save the quantized 4-bit version for efficient reloading (a rough sketch follows this list).
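A rough sketch of what the quantization script does, assuming `transformers` with `bitsandbytes` support; the local save path is a hypothetical placeholder:

```python
# Download Mistral-7B-Instruct-v0.2, quantize to 4-bit with bitsandbytes,
# and save model + tokenizer locally; the save path is hypothetical.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "mistralai/Mistral-7B-Instruct-v0.2"
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # normalized-float 4-bit quantization
    bnb_4bit_compute_dtype=torch.float16,  # compute in fp16 for speed
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)

# Persist both artifacts so the quantized model can be reloaded directly
# (4-bit serialization requires a recent transformers/bitsandbytes).
save_dir = "mistral-7b-instruct-v0.2-4bit"  # hypothetical local path
tokenizer.save_pretrained(save_dir)
model.save_pretrained(save_dir)
```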
Example outputs cover two cases: in-context query responses (Case 1) and out-of-context query responses (Case 2).
Feel free to contribute by improving the workflow or suggesting optimizations!
The dataset used in this project was obtained from public sources and is used solely for educational and research purposes. All efforts have been made to ensure that no proprietary or sensitive information is included. If you have any concerns or identify any conflicts regarding the use of this dataset, or have any other inquiries, please feel free to get in touch.
Thank you for taking the time to visit this repository!