🌐LinkedIn Alumni Scraper

This project is a web scraper that automates the extraction of LinkedIn data about the alumni of a specific university. It collects information such as names, job titles, profile links, experience, education, and certifications.

🚀Features

Automated Login: Supports automatic login that is designed to get past LinkedIn's bot detection.
Manual Login: A safer option for avoiding LinkedIn's bot detection.
Dynamic Scrolling: Loads more profiles dynamically for comprehensive data extraction.
Profile Scraping: Extracts detailed information such as work experience, education, and certifications.
Data Storage: Saves extracted data into a CSV file for further analysis.
City-Based Search: Scrapes alumni based on city names from a predefined list.
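
As an illustration of the Dynamic Scrolling feature, here is a minimal sketch of how a results page could be scrolled until no more profiles load. It is not taken from the project's code: the function name `scroll_until_stable` and its parameters are hypothetical, and `driver` is assumed to be a Selenium WebDriver (only its `execute_script` method is used).

```python
import time


def scroll_until_stable(driver, pause_seconds=2.0, max_rounds=20):
    """Scroll to the bottom repeatedly until the page height stops growing.

    `driver` is assumed to be a Selenium WebDriver; any object exposing
    `execute_script` works. Returns the number of scrolls performed.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    rounds = 0
    while rounds < max_rounds:
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        rounds += 1
        time.sleep(pause_seconds)  # give lazily loaded profiles time to render
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # nothing new was loaded, so stop scrolling
        last_height = new_height
    return rounds
```

The `max_rounds` cap is there so the loop always ends, even on a page that keeps growing.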

✅Requirements

Before running the scraper, install the dependencies:

pip install -r requirements.txt

Dependencies

Web Scraping & Automation

selenium>=4.10.0 – Automates web interactions.
webdriver-manager>=4.0.1 – Manages WebDriver installations.
beautifulsoup4>=4.12.0 – Parses HTML content.
lxml>=4.9.0 – Faster XML and HTML parsing.

Data Processing

pandas>=2.1.0 – Data manipulation and analysis.
numpy>=1.25.0 – Numerical computing.
openpyxl>=3.1.2 – Reads and writes Excel files.
python-dotenv>=1.0.0 – Loads environment variables.

🔧Setup

Install Dependencies:
```
pip install -r requirements.txt
```
Set Up Credentials: Create a .env file in the project directory and add your LinkedIn credentials:
```
LINKEDIN_EMAIL=your-email@example.com
LINKEDIN_PASSWORD=your-password
```
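
For reference, a minimal sketch of how these two variables could be read with python-dotenv. The helper names `read_credentials` and `load_credentials_from_dotenv` are hypothetical and may not match what the script does internally; only the variable names `LINKEDIN_EMAIL` and `LINKEDIN_PASSWORD` come from the .env example above.

```python
import os


def read_credentials(env=None):
    """Return (email, password) from a mapping of environment variables."""
    env = os.environ if env is None else env
    email = env.get("LINKEDIN_EMAIL", "").strip()
    password = env.get("LINKEDIN_PASSWORD", "")
    if not email or not password:
        raise ValueError(
            "LINKEDIN_EMAIL and LINKEDIN_PASSWORD must both be set in .env"
        )
    return email, password


def load_credentials_from_dotenv():
    """Load a .env file into the environment, then read the two variables."""
    from dotenv import load_dotenv  # provided by the python-dotenv package

    # load_dotenv() adds values from .env to os.environ without
    # overriding variables that are already set.
    load_dotenv()
    return read_credentials()
```

Failing early with a clear error is usually easier to debug than a login that silently submits empty fields.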
Download ChromeDriver: Ensure a ChromeDriver version that matches your installed Chrome browser is available. The script will attempt to download it automatically using webdriver-manager.
Prepare City List: Ensure that the Data/Person Locations/indonesia_cities.csv file contains a list of cities in a column named City.
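
To check the city list before a run, something like the following sketch could be used. It uses the standard-library `csv` module instead of pandas so that it runs without extra packages. The function name `load_cities` is hypothetical, and skipping blank or repeated rows is an assumption about what is useful, not a description of the script's behaviour.

```python
import csv


def load_cities(csv_path):
    """Return the non-empty, de-duplicated values of the `City` column."""
    cities = []
    seen = set()
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        if reader.fieldnames is None or "City" not in reader.fieldnames:
            raise ValueError("expected a column named 'City'")
        for row in reader:
            city = (row["City"] or "").strip()
            if city and city not in seen:  # skip blanks and repeats
                seen.add(city)
                cities.append(city)
    return cities
```

Calling `load_cities("Data/Person Locations/indonesia_cities.csv")` would then return the cities in file order.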
Prepare Class Code:
```
('div', {
         'class': 'YqprdwMdlHkSDMqLRuVsNMDuqpfpOSlCY EUugwXMAWHNSsJUZCvVoLYGTUzCejokiBUPPY aDbiGyAraCVAtqkDKUGRiLuhDZgkXmYiMA' # Make sure this Code is UP TO DATE
     })
```
Ensure that the class value in the code matches the one currently used on LinkedIn. LinkedIn generates the class names of these sections dynamically, so the value hard-coded in the program may be out of date. This class is used to get data from the Experience, Education, and Licenses & Certifications sections.

This class value is used to get location information.

💡Tip: While using the browser's inspect tool, place your cursor on the border of the section.
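
After updating the class value, a quick check like the sketch below can help confirm that it still matches something on a saved copy of a profile page. It uses the standard library's `HTMLParser` instead of BeautifulSoup so that it runs on its own. The names `ClassCounter` and `count_sections` are hypothetical, and it assumes you have the page HTML as a string (for example from Selenium's `driver.page_source`).

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Count <div> elements whose class attribute contains all target classes."""

    def __init__(self, target_classes):
        super().__init__()
        self.target = set(target_classes.split())
        self.matches = 0

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        classes = set((dict(attrs).get("class") or "").split())
        if self.target and self.target <= classes:
            self.matches += 1


def count_sections(page_html, target_classes):
    """Return how many <div> elements carry every class in `target_classes`."""
    parser = ClassCounter(target_classes)
    parser.feed(page_html)
    return parser.matches
```

If `count_sections(html, "<your class string>")` returns 0, the class value has probably changed and should be copied again from the browser's inspector.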

▶️Usage

Run the script with the following command:

python main.py

Manual Login

The script will prompt you to log in manually to LinkedIn.
After logging in, press Enter in the terminal to continue.

Auto Login

The script will log in to your LinkedIn account automatically. Make sure the email and password in .env are correct.
Don't scrape too much in one session. Otherwise, LinkedIn's anti-scraping system may notice the unusual requests and restrict your account.
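
One common way to keep the request rate down is to wait a random interval between profile visits. The sketch below is only an illustration: the name `polite_pause` and the 3 to 8 second range are assumptions, not values from the script, and no delay guarantees that an account will not be restricted.

```python
import random
import time


def polite_pause(min_seconds=3.0, max_seconds=8.0, sleep=time.sleep):
    """Sleep for a random duration in [min_seconds, max_seconds] and return it."""
    if min_seconds < 0 or max_seconds < min_seconds:
        raise ValueError("need 0 <= min_seconds <= max_seconds")
    delay = random.uniform(min_seconds, max_seconds)
    sleep(delay)  # `sleep` is injectable so the function is easy to test
    return delay
```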

User Prompts During Scraping

Press Enter to continue scraping the next profile.
Type next to skip to the next city.
Type exit to stop the script immediately.
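
The three prompts above can be sketched as a small command loop. This is an illustration under assumptions, not the script's actual control flow: the names `parse_command` and `scrape_with_prompts`, the prompt text, and asking after each profile (rather than before) are all hypothetical.

```python
PROMPT = "Enter = next profile, 'next' = next city, 'exit' = stop: "


def parse_command(raw):
    """Map the text typed at the prompt to an action name."""
    command = raw.strip().lower()
    if command == "":
        return "continue"
    if command == "next":
        return "skip_city"
    if command == "exit":
        return "stop"
    return "unknown"


def scrape_with_prompts(profiles_by_city, scrape_profile, ask=input):
    """Scrape profiles city by city, asking the user what to do after each one."""
    for city, profiles in profiles_by_city.items():
        for profile in profiles:
            scrape_profile(city, profile)
            action = parse_command(ask(PROMPT))
            while action == "unknown":  # re-ask until the input is recognised
                action = parse_command(ask(PROMPT))
            if action == "skip_city":
                break  # move on to the next city
            if action == "stop":
                return  # stop the whole run immediately
```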

💾Output

The extracted data will be saved in:

Data/LinkedIn_SCU_Alumni.csv

with the following fields:

City
Name
Job Title
LinkedIn Profile Link
Profile Picture URL
Experience
Education
Licenses & Certifications
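
As a rough sketch of how rows with these fields could be written, the example below uses the standard-library `csv` module. The script itself may use pandas instead. The names `FIELDS` and `write_rows` are hypothetical, and leaving missing fields empty is an assumption.

```python
import csv

FIELDS = [
    "City",
    "Name",
    "Job Title",
    "LinkedIn Profile Link",
    "Profile Picture URL",
    "Experience",
    "Education",
    "Licenses & Certifications",
]


def write_rows(handle, rows):
    """Write a header plus one CSV line per row dict to an open text file."""
    # restval="" leaves a cell empty when a row lacks that field.
    writer = csv.DictWriter(handle, fieldnames=FIELDS, restval="")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)
```

When writing to disk, open the file with `newline=""` and `encoding="utf-8"`, for example `open("Data/LinkedIn_SCU_Alumni.csv", "w", newline="", encoding="utf-8")`, so that line endings and non-ASCII names are handled consistently.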

🔖Notes

Scraping LinkedIn data is against their terms of service; use this tool responsibly.
Avoid running the script too frequently to prevent detection.
Ensure your LinkedIn account is in good standing before scraping.

🙏Acknowledgement

This project is carried out under the name of Soegijapranata Student Career Centre (SSCC) and is intended only for academic purposes.

Author: Faiz Noor Adhytia
Contact: faizadhytia24@gmail.com
