Dependencies: GitPython~=3.1.31, pylint==2.17.4, pydantic==2.3.0, matplotlib>=3.5, and scipy.
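As a quick sanity check before running anything, the sketch below reports which of the listed dependencies are installed and in which version. It only uses the standard library; the helper name `installed_versions` is made up for this illustration and is not part of the repository, and the sketch does not verify the version constraints themselves.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names taken from the dependency list above.
REQUIRED = ["GitPython", "pylint", "pydantic", "matplotlib", "scipy"]


def installed_versions(packages=REQUIRED):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```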
The entry points for all experiments performed for the study are the scripts in the src/suppression_study/experiments directory. You can run an experiment with a command like this:
python -m suppression_study.experiments.<NameOfExperiment>
Some experiments have dependencies on others. For example, you can run all experiments in this order:
- RunCheckersOnLatestCommit.py: Get warnings in the newest studied commit.
- CountSuppressionsNumOnMainCommits.py: Extract and count suppressions for all commits.
- CountSuppressionsOnLatestCommit.py: Extract and count suppressions for a specific commit.
- ComputeSuppressionHistories.py: Extract histories of suppressions.
- ComputeWarningSuppressionMapsOnLatestCommit.py: Compute the mappings between warnings and suppressions for the newest commit.
- ComputeIntermediateChains.py: Extract intermediate line number chains for the histories.
- ComputeAccidentallySuppressedWarnings.py: Find potentially unintended suppressions.
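The ordering above can be scripted. The following is a minimal sketch, not part of the repository: it assumes the `suppression_study` package is importable from the current environment (for example, because the command is run from `src` or the package is installed), and the helper names `build_command` and `run_all` are hypothetical. By default it only prints the commands; pass `dry_run=False` to actually execute them.

```python
import subprocess
import sys

# Order taken from the list above; module names are the file names without ".py".
EXPERIMENTS = [
    "RunCheckersOnLatestCommit",
    "CountSuppressionsNumOnMainCommits",
    "CountSuppressionsOnLatestCommit",
    "ComputeSuppressionHistories",
    "ComputeWarningSuppressionMapsOnLatestCommit",
    "ComputeIntermediateChains",
    "ComputeAccidentallySuppressedWarnings",
]


def build_command(name):
    """Build the `python -m suppression_study.experiments.<name>` command."""
    return [sys.executable, "-m", f"suppression_study.experiments.{name}"]


def run_all(experiments=EXPERIMENTS, dry_run=True):
    """Print each command in order; execute it too unless dry_run is set."""
    for name in experiments:
        cmd = build_command(name)
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the sequence at the first failing experiment,
            # so later experiments never run on missing inputs.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(dry_run=True)
```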
Visualization:
- DistributionOfSuppressionsNumOnMainCommits.py
- DistributionOfSuppressionsOnLatestCommit.py
- PlotSuppressionDistributionJavaAndJS.py
- VisualizeWarningSuppressionMapsOnLatestCommit.py
- VisualizeLifetimeForAllSuppressions.py
Inspections:
- InspectSuppressionHistories.py: Shuffle all extracted suppression histories and get the Git URLs of the add and delete commits of each change event for manual inspection.
- InspectSuppressionRelatedCommits.py: Randomly sample commits that either add or remove a suppression and prepare them for manual inspection.
- InspectAccidentallySuppressedWarnings.py: Collect all potentially unintended suppressions from the projects and prepare them for manual inspection.
The data directory contains the following subdirectories and files, most of which are created by running the experiments:
- Files:
  - data/specific_numeric_type_map.csv contains the mappings between Pylint warning types and their numeric codes.
  - 2020_CodeReview_Spec.pdf and 2022_CodeReview_Spec.pdf are course requirements for the Java/JavaScript student projects.
  - Other files are about the studied repositories.
- Subdirectories:
  - data/repos contains the repositories we study:
    - It is initially empty; the experiments check whether each repository exists and clone it as needed.
  - data/results contains the results of running the experiments, or the pre-computed results once they are loaded:
    - [Individual perspective] data/results/<repo_name> contains all results for the repository repo_name.
      - For each project, the following records the detailed results for RQ1 to RQ5:
        - Related to RQ1
          - data/results/<repo_name>/commits/<commit_id> contains all results for a specific commit hash <commit_id>.
        - Related to RQ2
          - grep records the suppressions in the 1,000 commits.
          - histories_suppression_level_all.json is the history file.
        - Related to RQs 3 and 4
          - suppression2warnings contains the files that record the mappings between suppressions and warnings, and the useless/useful suppressions.
        - RQ5 is based on all result files above.
    - [Overall perspective] Folders RQ1 to RQ5 contain the overall result of the corresponding research question.
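To check that a results folder matches the layout described above, a small script can walk data/results and report which per-repository entries are present. This is a rough sketch under assumptions: the helper name `summarize_results` is hypothetical, `commits`, `grep`, and `suppression2warnings` are assumed to sit directly under each `<repo_name>` folder, and the history file is searched for recursively because its exact location is not stated here.

```python
from pathlib import Path

# Folders holding the overall results per research question.
OVERALL_DIRS = {"RQ1", "RQ2", "RQ3", "RQ4", "RQ5"}


def summarize_results(results_dir):
    """Report, per repository folder, which expected entries exist."""
    results_dir = Path(results_dir)
    summary = {}
    for entry in sorted(results_dir.iterdir()):
        # Skip plain files and the overall RQ1..RQ5 folders.
        if not entry.is_dir() or entry.name in OVERALL_DIRS:
            continue
        summary[entry.name] = {
            "commits": (entry / "commits").is_dir(),
            "grep": (entry / "grep").exists(),
            # Assumption: the history file may live in a subfolder.
            "history": any(entry.rglob("histories_suppression_level_all.json")),
            "suppression2warnings": (entry / "suppression2warnings").exists(),
        }
    return summary


if __name__ == "__main__":
    results = Path("data/results")
    if results.is_dir():
        for repo, present in summarize_results(results).items():
            print(repo, present)
```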
Choose between SLOW MODE, which extracts the suppressions and warnings and may take several hours depending on hardware, and FAST MODE, which generates the tables and plots from pre-computed results and should take less than 30 minutes (including cloning the repositories).
Note: Unless a path is given explicitly, all code files are in src/suppression_study/experiments and all result files are in data/results.
- Run RunCheckersOnLatestCommit.py.
- Analyze the results to get the unified warning kinds and the number of each kind -> Table 1.
- Run ComputeSuppressionHistories.py.
- Run CountSuppressionsOnLatestCommit.py -> suppressions_per_repo.tex (part of Table 2).
- Run DistributionOfSuppressionsNumOnMainCommits.py -> Figure 4.
- Run VisualizeLifetimeForAllSuppressions.py -> Figure 5 and commits_and_histories.tex (remaining part of Table 2).
- suppressions_per_repo.tex + commits_and_histories.tex -> Table 2.
- Run ComputeWarningSuppressionMapsOnLatestCommit.py
- Run VisualizeLifetimeForAllSuppressions.py -> Figure 5
- Run VisualizeWarningSuppressionMapsOnLatestCommit.py -> Table 3 and Figure 6.
- Run ComputeIntermediateChains.py
- Run ComputeAccidentallySuppressedWarnings.py
- Manual analysis. -> Table 5
- Run InspectSuppressionRelatedCommits.py
- Manual analysis. -> Table 6
Before generating the tables and plots, load the pre-computed results. There are two options:
- [Option 1] Run LoadPreComputedResults.py, which automatically loads the results folder and places it in the correct location.
- [Option 2] Manually download results.zip from the latest release and extract it into the data folder.
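For Option 2, the extraction step can also be done from Python. The sketch below is illustrative only: it assumes results.zip has already been downloaded and that the archive contains a top-level `results` folder, so that extracting into `data` yields `data/results`. The helper name `extract_results` is hypothetical and is not the repository's LoadPreComputedResults.py.

```python
import zipfile
from pathlib import Path


def extract_results(zip_path, data_dir):
    """Extract the archive into data_dir and return the expected results path."""
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(data_dir)
    # Assumption: the archive's top-level folder is named "results".
    return data_dir / "results"


if __name__ == "__main__":
    archive_path = Path("results.zip")
    if archive_path.is_file():
        print(extract_results(archive_path, "data"))
```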
The structure of data/results is described in the data directory overview above.
To generate all tables and figures (excluding those that require manual analysis) from the pre-computed results:
- Run CountSuppressionsOnLatestCommit.py -> suppressions_per_repo.tex (part of Table 2).
- Run DistributionOfSuppressionsNumOnMainCommits.py -> Figure 4.
- Run VisualizeLifetimeForAllSuppressions.py -> Figure 5 and commits_and_histories.tex (remaining part of Table 2).
- suppressions_per_repo.tex + commits_and_histories.tex -> Table 2.
- Run VisualizeLifetimeForAllSuppressions.py -> Figure 5
- Run VisualizeWarningSuppressionMapsOnLatestCommit.py -> Table 3 and Figure 6.