Advancing MLOps and DevOps Efficiency: A Systematic Approach to Issue Management using Large Language Models
In this package, we provide various tools to get started on SWE-bench inference. In particular, we provide the following important scripts and sub-packages:
- `make_datasets`, this sub-package contains scripts to generate new datasets for SWE-bench inference with your own prompts and issues. For more information on how to use this sub-package, please refer to the README in the sub-package.
- `run_api.py`, this script is used to generate API model generations for a given dataset.
This Python script runs inference on a dataset using either the OpenAI or Anthropic API, depending on the model specified. It sorts instances by length and continually writes the outputs to a specified file, so that the script can be stopped and restarted without losing progress.
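To make the stop/restart behavior concrete, here is a minimal sketch of that pattern, assuming JSONL output and an `instance_id` field; the helper names and file format are illustrative, not the script's actual internals.

```python
# Illustrative sketch of the resumable-inference pattern (JSONL output assumed).
import json
import os

def load_completed_ids(output_file):
    """Collect the instance IDs already written, so a rerun can skip them."""
    if not os.path.exists(output_file):
        return set()
    with open(output_file) as f:
        return {json.loads(line)["instance_id"] for line in f}

def run_inference(instances, output_file, generate):
    done = load_completed_ids(output_file)
    # Sort by input length so shorter instances finish (and are saved) first.
    instances = sorted(instances, key=lambda inst: len(inst["text"]))
    with open(output_file, "a") as f:
        for inst in instances:
            if inst["instance_id"] in done:
                continue
            output = generate(inst)  # call out to the API model
            f.write(json.dumps({"instance_id": inst["instance_id"],
                                "model_output": output}) + "\n")
            f.flush()  # persist each result so an interruption loses nothing
```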
For instance, to run this script on SWE-bench with the oracle retrieval context and Anthropic's Claude 3 model, you can run the following command:
export ANTHROPIC_API_KEY=<your key>
python run_api.py --dataset_name_or_path princeton-nlp/SWE-bench_oracle --model_name_or_path claude-3 --output_dir ./outputs
To perform BM25 retrieval, follow the instructions here to install Pyserini.
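For context, a BM25 query with Pyserini generally looks like the sketch below; the index path and query string are placeholders, and this only illustrates the shape of the retrieval step rather than the exact code used here.

```python
# Illustrative BM25 retrieval with Pyserini (index path is a placeholder).
from pyserini.search.lucene import LuceneSearcher

searcher = LuceneSearcher("indexes/my_codebase_index")  # a prebuilt Lucene index
hits = searcher.search("tokenizer crashes on empty input", k=10)
for hit in hits:
    # Each hit exposes the matched document ID and its BM25 score.
    print(f"{hit.docid}\t{hit.score:.3f}")
```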
Then run `run_live.py` to try solving a new issue. For example, you can try solving this issue by running the following command:
export OPENAI_API_KEY=<your key>
python run_live.py --model_name gpt-3.5-turbo-1106 \
--issue_url https://github.com/huggingface/transformers/issues/26706
The evaluation steps for all the datasets are described below.
Run the following files to evaluate the MulDIC datasets:
- `lvlm_gemini_pro.py`, this Python script uses the google.generativeai library to interact with Google's generative AI models, in particular the gemini-pro-vision model (a usage sketch follows below).
- `lvlm_gpt_vision.py`, this Python script uses the OpenAI API to generate responses to issue titles and code snippets extracted from a dataset.
For more information on this dataset, please refer to the README in the sub-package.
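As a rough illustration of how a script might call the gemini-pro-vision model through the google.generativeai library, consider the sketch below; the prompt text, image path, and environment-variable handling are assumptions for illustration only.

```python
# Illustrative sketch: multimodal prompt to gemini-pro-vision.
import os
import google.generativeai as genai
from PIL import Image

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
model = genai.GenerativeModel("gemini-pro-vision")

# Pair an issue title with an associated image from the dataset (placeholder path).
image = Image.open("issue_screenshot.png")
response = model.generate_content(
    ["Describe the defect reported in this issue title and screenshot: "
     "App crashes when uploading a file", image]
)
print(response.text)
```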
Run the following files to evaluate the Issue Ticket Tagger datasets:
- `llm_gemini_pro.py`, this Python script uses the Gemini Pro model to generate labels for a list of issue texts extracted from a file. The generated labels are then saved for further analysis or use.
- `llm_gpt3.py`, this Python script uses the GPT-3.5 Turbo model from OpenAI to generate labels for a list of issue texts extracted from a file. The generated responses are printed to the console and saved in the ticket_tagger_gpt3.json file (a sketch of this flow follows below).
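A minimal sketch of that labeling flow, assuming the current openai Python client and a simple bug/feature/question label set (the label set, prompt wording, and example issues are illustrative assumptions):

```python
# Illustrative sketch: label issue texts with GPT-3.5 Turbo, save to JSON.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

issues = ["App crashes on startup", "Please add dark mode support"]
results = []
for text in issues:
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user",
                   "content": f"Label this GitHub issue as bug, feature, "
                              f"or question: {text}"}],
    )
    label = response.choices[0].message.content.strip()
    results.append({"issue": text, "label": label})
    print(text, "->", label)

# Persist the generated labels for further analysis.
with open("ticket_tagger_gpt3.json", "w") as f:
    json.dump(results, f, indent=2)
```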
Run the following files to evaluate the NLBSE'24 datasets:
- `issueclassificationgpt.ipynb`, this notebook installs the required libraries, imports them, and loads the data for issue classification.
- `nlbse_eval.py`, this script performs data cleaning and interacts with the OpenAI GPT-3 API to generate responses; the generated responses are printed to the console and saved in the ticket_tagger_gpt3.json file (a cleaning sketch follows below).
- `requirements.txt`, this file lists the specific versions of the Python packages required as dependencies.
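To illustrate the kind of data cleaning such an evaluation script typically performs before querying the API, here is a small sketch; the specific rules (stripping fenced code blocks, URLs, and extra whitespace) are assumptions for illustration, not necessarily what nlbse_eval.py does.

```python
# Illustrative sketch: normalize a raw issue body before prompting a model.
import re

def clean_issue_text(text: str) -> str:
    text = re.sub(r"`{3}.*?`{3}", "", text, flags=re.DOTALL)  # drop fenced code blocks
    text = re.sub(r"https?://\S+", "", text)                  # drop URLs
    text = re.sub(r"\s+", " ", text)                          # collapse whitespace
    return text.strip()

raw = "Crash on login.   See https://example.com/issues/1  \n\nSteps: open app"
print(clean_issue_text(raw))  # -> "Crash on login. See Steps: open app"
```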