Merge pull request #1 from dnouri/getdata-cli
A command for OpenSky_Get_Data.py allows passing of parameters
simonrp84 authored May 2, 2020
2 parents 7e85104 + 11fa6de commit 2128b0b
Showing 2 changed files with 48 additions and 51 deletions.
78 changes: 32 additions & 46 deletions OpenSky_Get_Data.py
@@ -3,29 +3,19 @@
a small perimeter is set up around the airport to catch the approach path
"""

from datetime import datetime, timedelta
from traffic.data import opensky
from datetime import datetime, timedelta, timezone
from importlib import import_module
import multiprocessing as mp
import numpy as np
import os

# Use this line to change the airport to retrieve.
import OS_Airports.VABB as AIRPRT
import click
import numpy as np
from traffic.data import opensky

# Use these lines if you need debug info
# from traffic.core.logging import loglevel
# loglevel('DEBUG')

# This line sets the output directory
outdir = '/gf2/eodg/SRP002_PROUD_ADSBREP/GO_AROUNDS/VABB/INDATA/'

# Setting up the start and end times for the retrieval
start_dt = datetime(2019, 8, 10, 0, 0)
end_dt = datetime(2019, 8, 20, 23, 59)

# Sets the number of simultaneous retrievals
nummer = 6


def get_bounds(rwys):
"""
@@ -50,7 +40,7 @@ def get_bounds(rwys):
return bounds


def getter(init_time, bounds, timer, anam):
def getter(init_time, bounds, timer, anam, outdir):
"""
This function downloads the data, which is done in
one hour segments. Each hour is downloaded separately
@@ -77,33 +67,29 @@ def getter(init_time, bounds, timer, anam):
return


bounds = get_bounds(AIRPRT.rwy_list)

# Loop over timestamps to retrieve all the data.
while True:

dtst = start_dt.strftime("%Y%m%d%H%M")
outf = outdir + 'OS_' + dtst+'_' + AIRPRT.icao_name + '.pkl'
print("Now processing:",
start_dt.strftime("%Y/%m/%d %H:%M"),
'for', AIRPRT.airport_name + ' / ' +
AIRPRT.icao_name)

# Create processes for each hour, in total 'nummer' hours are
# processed simultaneously
processes = [mp.Process(target=getter,
args=(start_dt, bounds, i, AIRPRT.icao_name))
for i in range(0, nummer)]

# Start, and then join, all processes
for p in processes:
p.start()
for p in processes:
p.join()

# Move on to the next block of times
start_dt = start_dt + timedelta(hours=nummer)

# If we have reached the end of the block then exit
if (start_dt >= end_dt):
break
@click.command()
@click.option('--airport', default='VABB')
@click.option('--start-dt', default='2019-08-10')
@click.option('--end-dt', default='2019-08-21')
@click.option('--outdir', default='INDATA/')
@click.option('--n-jobs', default=1)
def main(airport, start_dt, end_dt, outdir, n_jobs):
    os.makedirs(outdir, exist_ok=True)
    airport = import_module('OS_Airports.' + airport)
    bounds = get_bounds(airport.rwy_list)
    start_dt = datetime.strptime(start_dt, '%Y-%m-%d').replace(
        tzinfo=timezone.utc)
    end_dt = datetime.strptime(end_dt, '%Y-%m-%d').replace(
        tzinfo=timezone.utc)
    hours = int((end_dt - start_dt).total_seconds() / 60 / 60 + 0.5)

    pool = mp.Pool(n_jobs)

    pool.starmap(getter, [
        (start_dt, bounds, hour, airport.icao_name, outdir)
        for hour in range(hours)
    ])


if __name__ == '__main__':
    main()
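
For reviewers who want to check the date handling in `main()`, here is a minimal standalone sketch of how the `--start-dt` and `--end-dt` strings turn into a list of hour offsets. The helper names `parse_utc_date` and `hourly_offsets` are hypothetical and not part of the commit. The body of `getter` is not shown in this diff, so the final line assumes it adds the offset to `init_time` as hours.

```python
from datetime import datetime, timedelta, timezone


def parse_utc_date(text):
    """Parse a YYYY-MM-DD string into a timezone-aware UTC datetime."""
    return datetime.strptime(text, '%Y-%m-%d').replace(tzinfo=timezone.utc)


def hourly_offsets(start_text, end_text):
    """Return the start time and the hour offsets between start and end.

    Uses the same rounding as main(): the span is rounded to the
    nearest whole hour.
    """
    start_dt = parse_utc_date(start_text)
    end_dt = parse_utc_date(end_text)
    hours = int((end_dt - start_dt).total_seconds() / 60 / 60 + 0.5)
    return start_dt, list(range(hours))


# The script's default date range covers 11 days, i.e. 264 one-hour tasks.
start_dt, offsets = hourly_offsets('2019-08-10', '2019-08-21')
print(len(offsets))

# Assumption: each task covers the hour starting at init_time + offset.
last_start = start_dt + timedelta(hours=offsets[-1])
print(last_start.isoformat())
```

With the defaults, the end date itself is not covered under that assumption: the last task starts at 23:00 UTC on 2019-08-20.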
21 changes: 16 additions & 5 deletions README.md
@@ -10,18 +10,29 @@ Requires:

Usage:
First you must download aircraft data, which can be done using the `OpenSky_Get_Data` script. You can then point `GA_Detect` at the download location to scan for go-arounds.
This tool is in very early development, so has many manual tweaks that would ideally be changeable via a config file or directly via the command line call. The most important of these tweaks are listed below:
This tool is in very early development, so has manual tweaks that would ideally be changeable via a config file or directly via the command line call. The most important of these tweaks are listed below:

### In `OpenSky_Get_Data.py`:
### `OpenSky_Get_Data.py`:

`outdir`, the output directory, must be manually set in the script
Use the script's `--outdir` option to set the output directory. This defaults to `INDATA` in your current working directory.

`nummer` specifies the number of concurrent retrievals from the OpenSky database. I have found that six works well, but this may be different for you.
Use `--n-jobs` to specify the number of concurrent retrievals from the OpenSky database. I have found that six works well, but this may be different for you.

The airport region to retrieve data for is specified with the import line: `import airport.VABB as AIRPRT`, which will import Mumbai airport (VABB). You should create your own airport definition in the `./airports` directory.
The airport region to retrieve data for is specified with the `--airport` option. The default is `VABB`, which will import Mumbai airport (VABB). You should create your own airport definition in the `OS_Airports` directory, since the script imports `OS_Airports.<airport>`.
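
As a rough illustration of how the `--airport` value is resolved, the sketch below loads an airport definition by name with `importlib.import_module`, as the script does. Only the attribute names `icao_name`, `airport_name` and `rwy_list` come from the script. The `demo_airports` package, the `XXXX` code, the `load_airport` helper and the empty runway list are hypothetical placeholders, because the real runway format is not shown in this diff.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical airport definition module. The attribute names match those
# the script reads; the values are placeholders.
AIRPORT_SOURCE = '''
icao_name = "XXXX"
airport_name = "Example Airport"
rwy_list = []  # runway definitions; real format not shown in the diff
'''


def load_airport(code, package='OS_Airports'):
    """Import <package>.<code> and return the module object."""
    return importlib.import_module(package + '.' + code)


with tempfile.TemporaryDirectory() as tmp:
    # Build a throwaway package so the example runs without the repository.
    pkg_dir = Path(tmp) / 'demo_airports'
    pkg_dir.mkdir()
    (pkg_dir / '__init__.py').write_text('')
    (pkg_dir / 'XXXX.py').write_text(AIRPORT_SOURCE)

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        airport = load_airport('XXXX', package='demo_airports')
        print(airport.icao_name, airport.airport_name)
    finally:
        sys.path.remove(tmp)
```

An unknown airport code raises `ModuleNotFoundError`, so a typo in `--airport` fails before any download starts.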

The border region around the airport is manually specified (as `0.45 deg`) in `get_bounds()`. You may wish to change this.
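
To make the effect of the border concrete, here is a hedged sketch of padding a bounding box by a fixed margin in degrees. It is not the script's `get_bounds()`, whose body is not shown in this diff. The input format (a list of `(lat, lon)` pairs), the `(west, south, east, north)` return order and the `padded_bounds` name are all assumptions made for illustration.

```python
def padded_bounds(points, margin_deg=0.45):
    """Return (west, south, east, north) around points, padded by margin_deg.

    points is assumed to be an iterable of (lat, lon) pairs in degrees.
    """
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    return (min(lons) - margin_deg, min(lats) - margin_deg,
            max(lons) + margin_deg, max(lats) + margin_deg)


# Hypothetical runway end points, not real airport data.
points = [(19.0, 72.5), (19.5, 73.0)]
west, south, east, north = padded_bounds(points)
print(west, south, east, north)
```

A larger margin catches more of the approach path at the cost of downloading more traffic per hour.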

Running the script without parameters defaults to downloading data for
the `VABB` airport between 2019-08-10 and 2019-08-21 and saving that
data into the `INDATA` directory in your working directory. Thus,
these two calls are equivalent:

```bash
python OpenSky_Get_Data.py # does the same thing as the next command:
python OpenSky_Get_Data.py \
--airport=VABB --start-dt=2019-08-10 --end-dt=2019-08-21 \
--outdir=INDATA --n-jobs=1
```

### In `GA_Detect.py`
The directory structure is set at the beginning of `main()`. You will probably want to adjust this to your own requirements.