binary2source-matching-under-function-inlining

This repository shows how we label inlined call sites (ICSs), train the classifier for ICS prediction, and generate SFSs for binary2source matching.

Dataset

The dataset can be downloaded from https://drive.google.com/file/d/1K9ef-OoRBr0X5u8g2mlnYqh9o1i6zFij/view and https://drive.google.com/file/d/1wt7GY-DDp8J_2zeBBVUrcfWIyerg_xLO/view. It was constructed using BinKit (https://github.com/SoftSec-KAIST/BinKit).

Instructions

To replicate the work, run the following steps in order:

Processing and Labeling

Run 0.preprocessing/Binary_FCG_extraction/IDA_fcg_extractor/run_IDA_on_all_binaries.py to extract the function call graph (FCG) of each binary. Change the paths in this script to match your local setup.
Run 0.preprocessing/Source_FCG_extraction/run_understand_to_extract_fcgs.py to extract the FCGs of the source projects. The paths to Understand and to the source projects must also be changed. For an example of a source FCG, see 0.preprocessing/Source_FCG_extraction/FCG/a2ps-4.14_fcg.json.
Run 1.inlining_ground_truth_labelining/build_binary_to_binary_matching_ground_truth.py to summarize the binary2source function-level mapping of the dataset. Before running this script, follow https://github.com/island255/TOSEM2022 to construct that mapping.
Run 1.inlining_ground_truth_labelining/extract_mapped_call_site.py to identify the inlined call sites and the normal call sites.
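
The labeling idea in the last step can be illustrated with a minimal sketch. It is not the code in extract_mapped_call_site.py. It assumes a source FCG given as (caller, callee) pairs and a mapping from each source function to the binary functions that contain its code, and it treats a call site as inlined when caller and callee share a binary function. The function name, argument names, and data layout are all hypothetical.

```python
def label_call_sites(source_call_edges, source_to_binary):
    """Split source call edges into inlined and normal call sites.

    source_call_edges: iterable of (caller, callee) source function names
        (hypothetical layout, not the repository's JSON schema).
    source_to_binary: dict mapping a source function name to the set of
        binary functions whose code contains it (hypothetical layout).
    """
    inlined, normal = [], []
    for caller, callee in source_call_edges:
        if caller == callee:
            continue  # assumption: recursive self-calls are not labeled
        caller_bins = source_to_binary.get(caller, set())
        callee_bins = source_to_binary.get(callee, set())
        if not caller_bins:
            continue  # caller has no binary counterpart, so no label
        if caller_bins & callee_bins:
            # Callee code sits in the same binary function as the caller.
            inlined.append((caller, callee))
        else:
            normal.append((caller, callee))
    return inlined, normal


edges = [("main", "parse"), ("main", "log"), ("parse", "log")]
mapping = {"main": {"main"}, "parse": {"main"}, "log": {"log"}}
inlined_sites, normal_sites = label_call_sites(edges, mapping)
```

In this toy input, parse is compiled into the binary function main, so the call from main to parse is labeled as inlined, while both calls to log stay normal.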

Feature Extraction

Run 2.feature_extraction/function_feature_extraction/processing_projects.py to extract the function contents of the source projects.
Run 2.feature_extraction/call_site_feature_extraction/call_site_feature_extraction.py to extract the call site features.

Classifier Training

Run 3.classifier/multi-label_classifier/find_best_para/find_best_parameters_for_models.py to find the best number of estimators for each multi-label classification (MLC) model.

SFS Generation

Run 4.apply_classifier_to_test_projects/multi-label_classifiers/using_multi-label_classifiers.py to generate SFSs for the source projects.
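
As a rough illustration of this step, the sketch below builds one SFS from classifier output. It is not the logic of using_multi-label_classifiers.py. It assumes that an SFS groups a source function with every callee reachable through call sites predicted as inlined, and that predictions arrive as a set of (caller, callee) pairs. The function name and data layout are hypothetical.

```python
def generate_sfs(root, call_edges, predicted_inlined):
    """Collect root plus all functions reachable via predicted-inlined call sites.

    root: name of the source function the set is built for.
    call_edges: iterable of (caller, callee) pairs from the source FCG.
    predicted_inlined: set of (caller, callee) pairs the classifier marks
        as inlined (hypothetical representation of the predictions).
    """
    # Keep only the edges predicted to be inlined, grouped by caller.
    inlined_callees = {}
    for caller, callee in call_edges:
        if (caller, callee) in predicted_inlined:
            inlined_callees.setdefault(caller, []).append(callee)

    # Depth-first traversal; the visited set also guards against cycles.
    sfs, stack = {root}, [root]
    while stack:
        current = stack.pop()
        for callee in inlined_callees.get(current, []):
            if callee not in sfs:
                sfs.add(callee)
                stack.append(callee)
    return sfs


edges = [("main", "parse"), ("parse", "trim"), ("main", "log")]
predicted = {("main", "parse"), ("parse", "trim")}
sfs_main = generate_sfs("main", edges, predicted)
sfs_log = generate_sfs("log", edges, predicted)
```

Here the set for main also contains parse and trim, because inlining is followed transitively, while log forms a set on its own.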
