This project analyzes log files to track IP request counts, identify the most accessed endpoints, and flag suspicious activity based on failed login attempts. It outputs results in CSV format and visualizes key insights using bar charts.

revanthchristober/Security-Log-Files-Analysis

Log File Analysis

This project parses log files to extract information about user activity and potential security issues. The analysis identifies the IP addresses that made the most requests, the most accessed endpoints, and the IP addresses whose failed login attempts exceed a specified threshold.

Features

  • Requests per IP: Analyzes the log file to count the number of requests made by each IP address.
  • Most Accessed Endpoint: Identifies the most frequently accessed endpoint in the log file.
  • Suspicious Activity: Flags IP addresses with failed login attempts exceeding a defined threshold.
  • Data Visualization: Displays bar charts of the top IP addresses by request count and of the IP addresses flagged for failed login attempts.
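
To make the first two features concrete, here is a minimal sketch of counting requests per IP and per endpoint. It is not the code in log_analysis.py: it uses only the standard library rather than pandas, the LOG_PATTERN regex assumes Common Log Format style lines, and the helper names and sample lines are made up for illustration.

```python
import re
from collections import Counter

# Hypothetical pattern for Common Log Format style lines; adjust to your logs.
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>[A-Z]+) (?P<endpoint>\S+)[^"]*" (?P<status>\d{3})'
)

def parse_lines(lines):
    """Yield (ip, endpoint, status) for every line that matches LOG_PATTERN."""
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match:
            yield match["ip"], match["endpoint"], int(match["status"])

def count_requests(lines):
    """Return (requests per IP, accesses per endpoint) as two Counters."""
    ip_counts, endpoint_counts = Counter(), Counter()
    for ip, endpoint, _status in parse_lines(lines):
        ip_counts[ip] += 1
        endpoint_counts[endpoint] += 1
    return ip_counts, endpoint_counts

# Made-up sample lines, for illustration only.
sample = [
    '192.168.1.1 - - [03/Dec/2024:10:12:34 +0000] "GET /home HTTP/1.1" 200 512',
    '192.168.1.2 - - [03/Dec/2024:10:12:35 +0000] "POST /login HTTP/1.1" 401 128',
    '192.168.1.1 - - [03/Dec/2024:10:12:36 +0000] "POST /login HTTP/1.1" 200 256',
]
ip_counts, endpoint_counts = count_requests(sample)
print(ip_counts.most_common(1))        # [('192.168.1.1', 2)]
print(endpoint_counts.most_common(1))  # [('/login', 2)]
```

Lines that do not match the pattern are skipped silently, so a different log layout shows up as empty counts rather than an error.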

Installation

  1. Clone this repository or download the code files.
  2. Install the required dependencies:
pip install pandas matplotlib

Usage

  1. Ensure log_file_path points to the log file you want to analyze.
  2. Adjust the FAILED_LOGIN_THRESHOLD value if necessary to detect suspicious activity based on your log data.
  3. Run the script using Python:
python log_analysis.py
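
As a rough illustration of how FAILED_LOGIN_THRESHOLD affects the result, the sketch below flags IPs whose failed login count is strictly greater than the threshold. The function name, the value 10, and the choice to treat HTTP 401 as a failed login are assumptions made for this example, not details taken from the script.

```python
from collections import Counter

# Illustrative value only; check log_analysis.py for the actual setting.
FAILED_LOGIN_THRESHOLD = 10

def flag_suspicious(records, threshold=FAILED_LOGIN_THRESHOLD):
    """Return {ip: failed_count} for IPs whose failed logins exceed threshold.

    records is an iterable of (ip, status) pairs. Treating HTTP 401 as a
    failed login is an assumption of this sketch.
    """
    failed = Counter(ip for ip, status in records if status == 401)
    return {ip: count for ip, count in failed.items() if count > threshold}

# Made-up records: three failures from one IP, one failure from another.
records = (
    [("192.168.1.1", 401)] * 3
    + [("192.168.1.2", 401)]
    + [("192.168.1.2", 200)] * 5
)
print(flag_suspicious(records, threshold=2))  # {'192.168.1.1': 3}
```

Lowering the threshold flags more IPs, so a value that suits a busy production log will likely be too high for a short sample file.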

The script will output:

  • CSV File: The results are saved to a single CSV file:

    • log_analysis_results.csv: Contains all the analysis data.
  • Bar Charts: The analysis results will also be visualized in the form of bar charts:

    • A bar chart showing the top 10 IP addresses by request count.
    • A bar chart showing the suspicious IP addresses with failed login attempts.
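
One way to fit three result tables into a single CSV file is to write them as labeled sections separated by blank rows. The sketch below shows that idea with the standard csv module. The section layout and the write_results helper are assumptions for illustration; the actual layout of log_analysis_results.csv may differ.

```python
import csv
import io

def write_results(out, ip_counts, top_endpoint, suspicious):
    """Write three labeled sections to one CSV stream, separated by blank rows.

    ip_counts and suspicious are lists of (ip, count) pairs; top_endpoint is
    a single (endpoint, count) pair.
    """
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Requests per IP"])
    writer.writerow(["IP Address", "Request Count"])
    writer.writerows(ip_counts)
    writer.writerow([])  # blank row between sections
    writer.writerow(["Most Accessed Endpoint"])
    writer.writerow(["Endpoint", "Access Count"])
    writer.writerow(top_endpoint)
    writer.writerow([])
    writer.writerow(["Suspicious Activity"])
    writer.writerow(["IP Address", "Failed Login Count"])
    writer.writerows(suspicious)

# An in-memory buffer stands in for the output file in this example.
buffer = io.StringIO()
write_results(
    buffer,
    ip_counts=[("192.168.1.1", 150), ("192.168.1.2", 120)],
    top_endpoint=("/login", 200),
    suspicious=[("192.168.1.1", 15)],
)
print(buffer.getvalue())
```

When writing to a real file, open it with newline="" as the csv module documentation recommends, so that line endings are not translated a second time.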

File Structure

  • log_analysis.py: The script containing the analysis code.
  • sample_data.log: Example log file for analysis (replace with your own log file).
  • processed_outputs/: Directory for storing the output CSV files.

Sample Output

  1. Requests per IP: Displays a table of IP addresses and their request count.
  2. Most Accessed Endpoint: Displays the most frequently accessed endpoint.
  3. Suspicious Activity: Displays IP addresses that have exceeded the failed login threshold.

Example:

Requests per IP:
   IP Address    Request Count
0  192.168.1.1          150
1  192.168.1.2          120
...

Most Accessed Endpoint:
     Endpoint  Access Count
0      /login           200
1  /dashboard           150
...

Suspicious Activity:
   IP Address    Failed Login Count
0  192.168.1.1          15
...

License

This project is licensed under the GNU General Public License v3.0. See the LICENSE file for details.

Acknowledgments

  • This project uses pandas for data analysis and matplotlib for data visualization.
  • The log parsing is based on common log formats, but it can be adapted for different log file structures.
