Collecting and scraping KBO data, including player stats and game results.

KBO Data Collector


This repository is dedicated to collecting and scraping KBO (Korea Baseball Organization) data. It includes scripts and processes for gathering player statistics, team data, game results, and other related information.

Installation

Requirements: Python 3.12+

  1. Clone the repository:

    git clone https://github.com/leewr9/kbo-data-collector.git
    cd kbo-data-collector
  2. Install dependencies:

    pip install -r requirements.txt

Usage

Main Command

The tool can be run via the command line and offers four main commands: game, player, schedule, and team.

python run.py <command> [options]

Commands

game: Scrapes KBO game data

This command scrapes data related to specific KBO games. It will internally fetch schedule data as well.

python run.py game --date <target_date>

Options:

  • -p, --path: Path to the schedule file to be parsed.
  • -d, --date: Specify a date (in YYYYMMDD format) to fetch data for that day.
  • -f, --full: Scrape all available data from April 5, 2001, to today.

Note: Because the game command fetches schedule data internally, its -d and -f options behave exactly as they do for the schedule command: they select which scheduled games are scraped.
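The YYYYMMDD format expected by -d can be checked before a scrape is started. The following is a minimal sketch, not code from this repository: the function name parse_target_date is hypothetical, and rejecting dates before April 5, 2001 is an assumption based on the earliest date mentioned for --full.

```python
from datetime import date, datetime

# Earliest date covered by --full according to the option description.
# Treating it as a hard lower bound is an assumption of this sketch.
EARLIEST_DATE = date(2001, 4, 5)


def parse_target_date(value: str) -> date:
    """Parse a YYYYMMDD string, rejecting malformed or too-early dates."""
    if len(value) != 8 or not (value.isascii() and value.isdigit()):
        raise ValueError(f"expected YYYYMMDD, got {value!r}")
    # strptime raises ValueError for impossible dates such as 20240230.
    parsed = datetime.strptime(value, "%Y%m%d").date()
    if parsed < EARLIEST_DATE:
        raise ValueError(f"{value} is earlier than {EARLIEST_DATE:%Y%m%d}")
    return parsed
```

A value that passes this check can be handed to the command unchanged, for example `python run.py game --date 20240205`.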


player: Scrapes KBO player data

This command scrapes player statistics by category: batting, pitching, fielding, and base running.

python run.py player --player <player_type> --season <target_season>

Options:

  • -p, --player: Specify the type of player data to scrape. Valid options are:
    • hitter for batting statistics
    • pitcher for pitching statistics
    • fielder for fielding statistics
    • runner for base running statistics
  • -a, --all: Scrape data for all players.
  • -s, --season: Specify the season year (e.g., 2024) to scrape data for that year.
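The options above could be declared with Python's standard argparse module roughly as follows. This is an illustrative sketch only: the README does not say which CLI library run.py uses, and the names PLAYER_TYPES and build_player_parser are hypothetical.

```python
import argparse

# The four player types listed in the options above.
PLAYER_TYPES = ("hitter", "pitcher", "fielder", "runner")


def build_player_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented player options (assumed layout)."""
    parser = argparse.ArgumentParser(prog="run.py player")
    parser.add_argument("-p", "--player", choices=PLAYER_TYPES,
                        help="type of player data to scrape")
    parser.add_argument("-a", "--all", action="store_true",
                        help="scrape data for all players")
    parser.add_argument("-s", "--season", type=int,
                        help="season year, e.g. 2024")
    return parser


# Parse the same arguments used in the command shown above.
args = build_player_parser().parse_args(["--player", "hitter", "--season", "2024"])
```

Using `choices` means an unsupported player type is rejected at parse time with a usage message, before any scraping starts.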

schedule: Scrapes KBO schedule data

This command scrapes the schedule data for KBO games. You can fetch data for a specific date, or scrape everything from the start of the 2001 KBO season (April 5, 2001) to today.

python run.py schedule --date <target_date>

Options:

  • -d, --date: Specify a date (in YYYYMMDD format) to fetch data for that day.
  • -f, --full: Scrape all available data from April 5, 2001, to today.
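To give a sense of what a full scrape covers, the sketch below enumerates every YYYYMMDD value between April 5, 2001 and a chosen end date. It is an assumption about how such a range could be walked, not the repository's actual implementation (which may, for instance, request data by month or season); iter_dates is a hypothetical helper.

```python
from datetime import date, timedelta
from typing import Iterator

# Start of the range covered by --full, per the option description.
FULL_START = date(2001, 4, 5)


def iter_dates(start: date, end: date) -> Iterator[str]:
    """Yield every day from start to end (inclusive) as a YYYYMMDD string."""
    current = start
    while current <= end:
        yield current.strftime("%Y%m%d")
        current += timedelta(days=1)


# First three dates a day-by-day full scrape would visit.
first_days = list(iter_dates(FULL_START, date(2001, 4, 7)))
```

Passing `date.today()` as the end date reproduces the "to today" behavior described for -f.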

team: Scrapes KBO team data

This command scrapes data related to KBO teams.

python run.py team

Help

For more detailed information on any command, you can use the --help flag:

python run.py <command> --help

Functions

Each command is mapped to a corresponding function in the code:

  • scrape_game_data_command: Handles scraping of game data.
  • scrape_player_data_command: Handles scraping of player data.
  • scrape_schedule_data_command: Handles scraping of schedule data.
  • scrape_team_data_command: Handles scraping of team data.

These functions take care of the web scraping and data processing based on the command-line arguments passed.
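The command-to-function mapping can be pictured as a simple dispatch table. In the sketch below, the four function names come from the list above, but their bodies are placeholders and their signatures are assumptions; COMMANDS and dispatch are hypothetical names, and the real run.py may wire things up differently.

```python
from typing import Callable, Dict


# Stand-ins for the real handlers: each returns a description of the call
# instead of scraping anything. Signatures are assumed for illustration.
def scrape_game_data_command(**options) -> str:
    return f"game {options}"


def scrape_player_data_command(**options) -> str:
    return f"player {options}"


def scrape_schedule_data_command(**options) -> str:
    return f"schedule {options}"


def scrape_team_data_command(**options) -> str:
    return f"team {options}"


# Command name on the CLI -> function that handles it.
COMMANDS: Dict[str, Callable[..., str]] = {
    "game": scrape_game_data_command,
    "player": scrape_player_data_command,
    "schedule": scrape_schedule_data_command,
    "team": scrape_team_data_command,
}


def dispatch(command: str, **options) -> str:
    """Look up the handler for a command and call it with the parsed options."""
    if command not in COMMANDS:
        raise ValueError(f"unknown command: {command}")
    return COMMANDS[command](**options)
```

With this layout, `dispatch("schedule", date="20240205")` plays the role of `python run.py schedule --date 20240205`.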


Example Commands

  1. Scrape Game Data:

    python run.py game --date 20240205
  2. Scrape Player Data for Batters in the 2024 Season:

    python run.py player --player hitter --season 2024
  3. Scrape KBO Schedule Data for a Specific Date:

    python run.py schedule --date 20240205

License

This project is licensed under the MIT License. See the LICENSE file for details.
