This program calculates the number of train required to cover a given train service schedule.
Curious about Taiwan High Speed Rail's trainset utilization rate, I decided to write a program to solve this question.
python3 main.py <service-timetable> <positioning-train-library>
- service-timetable: file name of a timetable located in
data/
folder - positioning-train-library: file name of a list of all possible positioning trains, located in
data/
folder.
The alogrithm will choose a set of trains from the positioning train list to minimize the required trainsets to cover the whole service timetable.
python3 approxmain.py <service-timetable>
Without a specified list of possible positioning trains, the algorith assumes that a trainset can be repositioned anytime within the following times:
- Nangang ⇄ Taichung: 60 minutes
- Taichung ⇄ Zuoying: 50 minutes
Each line consists a train with the format:
<train number> <origin-station> <origin-time> <destination> <destination-time>
e.g: 1202 ZUY 06:30 NAG 08:20
Each line consists a positioning train with the format:
<train number> <origin-station> <origin-time> <destination> <destination-time> <conflict-trains>
e.g: 1455 NAG 20:30 TAC 21:30 1557
conflict-trains is a list of conflicting service trains (specified by train number, seperated by space), , the algorithm will avoid selecting this positioning train when the service train list contains a train in conflict.
Tip: To get better results, Split Nangang ⇄ Zuoying positioning trains into two legs, Nangang ⇄ Taichung and Taichung ⇄ Zuying, respectively. Leave it to the algorithm to decide whether to position a trainset all the way back to Nangang/Zuoying or just to Taichung.
A script to crawl the timetable down from THSR's website.
(Requires python3.6 and pipenv)
pipenv --python 3.6
pipenv install
pipenv shell
python timetable-crawler.py <YYYYMMDD>
A interactive script for generating train schedule list (Helpful for generating positioning train lists for peroidic timetables)
python3 train-generator.py
THSR currently has 34 trainsets.
Statistics obtained from this program (without specifying the positioning train list):
Day | utilization |
---|---|
Mon | 28 trainsets |
Tue-Thu | 27 trainsets |
Fri, Sun | 30 trainsets |
Sat | 29 trainsets |
Date | utilization |
---|---|
1/31 | 29 trainsets |
2/1 | 32 trainsets |
2/2 - 2/4 | 31 trainsets |
2/5 | 32 trainsets |
2/6 - 2/10 | 33 trainsets |
2/11 | 30 trainsets |
The algorithm is an extension of this problem, which can be transformed into the Bipartite Maching Prblem, and then the Maximum Flow Problem, and be solved by the Ford-Fulkerson Algorithm.
For the Given Service timetable, without positioning train list version, It is exactly that codejam problem.
Construct a directed graph with each node representing a train service. Add a directed edge between two nodes if both trains can be assigned to the same trainset (Whether the terminus of the first train equals the origin of the second train and the second train departs after the first train ends its journey, or they share different terminus/origins but there's plenty of time for the trainset to be positioned to the origin of the second train).
The Given Service timetable and a list of possible positioning trains version, however, requires some modification since some nodes(positioning trains) are optional.
Consider this timetable:
train no. | origin | depart time | destination | arrival time | type |
---|---|---|---|---|---|
101 | A | 8 | B | 9 | Service |
105 | A | 9 | B | 10 | Service |
406 | B | 9 | A | 10 | Positioning |
109 | A | 10 | B | 11 | Service |
110 | B | 10 | A | 11 | Service |
113 | A | 11 | B | 12 | Service |
114 | B | 11 | A | 12 | Service |
In the maximum flow problem, add those optional nodes in the middle (without connection to the source or sink) as follows: