Migrated to digital ocean (#39)
Cheaper
jorgecardleitao authored Feb 7, 2024
1 parent c1df0c0 commit 62460f2
Showing 17 changed files with 357 additions and 271 deletions.
11 changes: 5 additions & 6 deletions Cargo.toml
@@ -17,7 +17,7 @@ reqwest-middleware = "*"
rand = {version="*", default_features = false, features = ["std", "std_rng", "getrandom"]}

# to perform time-based calculations
time = {version="*", default_features = false, features = ["formatting", "parsing", "macros"]}
time = {version="*", default_features = false, features = ["formatting", "parsing", "macros", "serde"]}

# compute distances between geo-points
geoutils = {version="*", default_features = false}
@@ -33,11 +33,10 @@ futures = "0.3"
# logging
log = "*"

# azure integration
azure_storage = "*"
azure_storage_blobs = "*"
azure_core = "*"
bytes = "1.5"
# S3 integration
aws-config = { version = "1.1.4", features = ["behavior-version-latest"] }
aws-sdk-s3 = "*"
aws-credential-types = "*"

[dev-dependencies]
tinytemplate = "1.1"
28 changes: 13 additions & 15 deletions README.md
@@ -4,7 +4,7 @@

This repository contains a CLI application to analyze flights of private jets.

It is supported by an Azure Blob storage container for caching data, thereby
It is supported by an S3 Blob storage container for caching data, thereby
reducing its impact to [https://adsbexchange.com/](https://adsbexchange.com/).

## Risk and impact
@@ -20,9 +20,9 @@ to cache all hits to [https://adsbexchange.com/](https://adsbexchange.com/)
on a horizontally scaled remote storage and therefore remove its impact to adsbexchange.com
of future calls.

All data cached is available on Azure blob storage:
* account: `privatejets`
* container: `data`
All cached data is available on S3 blob storage at endpoint

> `https://private-jets.fra1.digitaloceanspaces.com`
and has anonymous and public read permissions.
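
Because reads are anonymous, a cached object can be fetched over plain HTTPS by appending its key to the endpoint. A minimal sketch, assuming keys sit directly under the endpoint (as in the README links changed in this commit); `object_url` is a hypothetical helper, not part of the `flights` crate:

```rust
/// Public endpoint of the bucket, as stated in the README.
const ENDPOINT: &str = "https://private-jets.fra1.digitaloceanspaces.com";

/// Hypothetical helper (not part of the `flights` crate): builds the public
/// URL of a cached object from its key. A leading `/` in the key is ignored.
fn object_url(key: &str) -> String {
    format!("{}/{}", ENDPOINT, key.trim_start_matches('/'))
}
```

For example, `object_url("private_jets/2023/11/06/data.csv")` yields the URL of the private jets dataset linked in the README.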

@@ -35,20 +35,19 @@ to perform actual calculations. To use one of such examples:
2. run `cargo run --example single_day -- --tail-number "OY-GFS" --date "2023-10-20"`
3. open `OY-GFS_2023-10-20_0.md`

Step 2. has an optional argument, `--azure-sas-token`, specifying an Azure storage SAS
token.
When used, cache is written to the remote container, as opposed to disk.
Step 2. has two optional arguments, `--access-key` and `--secret-access-key`, specifying
credentials to write to the remote storage, as opposed to disk.

Finally, setting `--backend disk` ignores the Azure's remote storage altogether and
Finally, setting `--backend disk` ignores the remote storage altogether and
only uses disk for caching (resulting in higher cache misses and thus more
interactions with ADS-B exchange).

In general:
* Use the default parameters when creating ad-hoc stories
* Use `--azure-sas-token` when improving the database with new data.
* Use `--access-key` when improving the database with new data.
* Use `--backend disk` when testing the caching system

As of today, the flag `--azure-sas-token` is only available when the code is executed
As of today, the flag `--access-key` is only available when the code is executed
from `main`, as writing to the blob storage must be done through a controlled code base
that preserves data integrity.

@@ -62,9 +61,8 @@ cargo run --example country -- --from=2024-01-13 --to=2024-01-21 --country=denma
# Story about Portuguese private jets that flew between two dates
cargo run --example country -- --from=2024-01-13 --to=2024-01-21 --country=portugal

# Story about German private jets that flew between in 2023, where the azure-sas-token
# is on the file token.txt
cargo run --example country -- --from=2023-01-01 --to=2024-01-01 --country=germany --azure-sas-token=$(cat token.txt)
# Story about German private jets that flew in 2023, where secret is on a file
cargo run --example country -- --from=2023-01-01 --to=2024-01-01 --country=germany --access-key=DO00AUDGL32QLFKV8CEP --secret-access-key=$(cat secrets.txt)
```

## Methodology
@@ -75,5 +73,5 @@ The methodology used to extract information is available at [`methodology.md`](.

### Set of worldwide aircraft whose primary use is to be a private jet:

* [Data](https://privatejets.blob.core.windows.net/data/database/private_jets/2023/11/06/data.csv)
* [Description](https://privatejets.blob.core.windows.net/data/database/private_jets/2023/11/06/description.md)
* [Data](https://private-jets.fra1.digitaloceanspaces.com/private_jets/2023/11/06/data.csv)
* [Description](https://private-jets.fra1.digitaloceanspaces.com/private_jets/2023/11/06/description.md)
4 changes: 2 additions & 2 deletions examples/cache_state.rs
@@ -6,7 +6,7 @@ use itertools::Itertools;
use flights::Aircraft;

async fn private_jets(
client: Option<&flights::fs_azure::ContainerClient>,
client: Option<&flights::fs_s3::ContainerClient>,
) -> Result<Vec<Aircraft>, Box<dyn std::error::Error>> {
// load datasets to memory
let aircrafts = flights::load_aircrafts(client).await?;
@@ -29,7 +29,7 @@ async fn main() -> Result<(), Box<dyn std::error::Error>> {
})
.collect::<Vec<_>>();

let client = flights::fs_azure::initialize_anonymous("privatejets", "data");
let client = flights::fs_s3::anonymous_client().await;

let existing = flights::existing_months_positions(&client).await?;

32 changes: 15 additions & 17 deletions examples/country.rs
@@ -70,7 +70,7 @@ pub struct Context {
#[derive(clap::ValueEnum, Debug, Clone)]
enum Backend {
Disk,
Azure,
Remote,
}

fn parse_date(arg: &str) -> Result<time::Date, time::error::Parse> {
@@ -181,10 +181,13 @@ impl Country {
#[derive(Parser, Debug)]
#[command(author, version, about, long_about = None)]
struct Cli {
/// The Azure token
/// The access key ID for the remote storage
#[arg(long)]
azure_sas_token: Option<String>,
#[arg(long, value_enum, default_value_t=Backend::Azure)]
access_key: Option<String>,
/// The secret access key for the remote storage
#[arg(long)]
secret_access_key: Option<String>,
#[arg(long, value_enum, default_value_t=Backend::Remote)]
backend: Backend,

/// Name of the country to compute on
@@ -213,7 +216,7 @@ async fn legs(
to: Date,
icao_number: &str,
location: Option<Location>,
client: Option<&flights::fs_azure::ContainerClient>,
client: Option<&flights::fs_s3::ContainerClient>,
) -> Result<Vec<Leg>, Box<dyn Error>> {
let positions = flights::aircraft_positions(from, to, icao_number, client).await?;
let mut positions = positions
@@ -278,18 +281,13 @@ async fn main() -> Result<(), Box<dyn Error>> {

let cli = Cli::parse();

// optionally initialize Azure client
let client = match (cli.backend, cli.azure_sas_token) {
(Backend::Disk, None) => None,
(Backend::Azure, None) => Some(flights::fs_azure::initialize_anonymous(
"privatejets",
"data",
)),
(_, Some(token)) => Some(flights::fs_azure::initialize_sas(
&token,
"privatejets",
"data",
)?),
// initialize client
let client = match (cli.backend, cli.access_key, cli.secret_access_key) {
(Backend::Disk, _, _) => None,
(_, Some(access_key), Some(secret_access_key)) => {
Some(flights::fs_s3::client(access_key, secret_access_key).await)
}
(Backend::Remote, _, _) => Some(flights::fs_s3::anonymous_client().await),
};
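
The new `match` changes which client is built for each flag combination. The sketch below reproduces only the decision, with a hypothetical `ClientMode` enum standing in for the `flights::fs_s3` clients (which need network access), so that each branch can be checked in isolation:

```rust
/// Where cached data is read from (mirrors the `Backend` enum in this diff).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Backend {
    Disk,
    Remote,
}

/// Hypothetical stand-in for the client that the examples construct.
#[derive(Debug, PartialEq, Eq)]
enum ClientMode {
    /// Corresponds to `flights::fs_s3::client(access_key, secret_access_key)`.
    Authenticated {
        access_key: String,
        secret_access_key: String,
    },
    /// Corresponds to `flights::fs_s3::anonymous_client()`.
    Anonymous,
}

/// Same arm order as the `match` in the examples; `None` means "disk only".
fn select_client(
    backend: Backend,
    access_key: Option<String>,
    secret_access_key: Option<String>,
) -> Option<ClientMode> {
    match (backend, access_key, secret_access_key) {
        // Disk is checked first, so credentials are ignored here.
        (Backend::Disk, _, _) => None,
        // Both keys present: authenticated (read and write) access.
        (_, Some(access_key), Some(secret_access_key)) => Some(ClientMode::Authenticated {
            access_key,
            secret_access_key,
        }),
        // Zero or one key present: anonymous (read-only) access.
        (Backend::Remote, _, _) => Some(ClientMode::Anonymous),
    }
}
```

Two consequences follow from the arm order. `--backend disk` now wins even when credentials are passed, whereas the previous `(_, Some(token))` arm built a remote client in that case. Supplying only one of the two keys silently falls back to anonymous access.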

// load datasets to memory
15 changes: 7 additions & 8 deletions examples/export_legs.rs
@@ -12,9 +12,12 @@ const ABOUT: &'static str = r#"Builds the database of all private jet positions
#[derive(Parser, Debug)]
#[command(author, version, about = ABOUT)]
struct Cli {
/// The Azure token
#[arg(short, long)]
azure_sas_token: Option<String>,
/// The access key ID for the remote storage
#[arg(long)]
access_key: String,
/// The secret access key for the remote storage
#[arg(long)]
secret_access_key: String,
}

#[tokio::main(flavor = "multi_thread")]
@@ -26,11 +29,7 @@ async fn main() -> Result<(), Box<dyn Error>> {

let cli = Cli::parse();

// optionally initialize Azure client
let client = match cli.azure_sas_token.clone() {
None => flights::fs_azure::initialize_anonymous("privatejets", "data"),
Some(token) => flights::fs_azure::initialize_sas(&token, "privatejets", "data")?,
};
let client = flights::fs_s3::client(cli.access_key, cli.secret_access_key).await;

// load datasets to memory
let aircrafts = load_aircrafts(Some(&client)).await?;
36 changes: 17 additions & 19 deletions examples/export_private_jets.rs
@@ -9,21 +9,24 @@ use flights::{load_aircrafts, load_private_jet_models};
#[derive(clap::ValueEnum, Debug, Clone)]
enum Backend {
Disk,
Azure,
Remote,
}

const ABOUT: &'static str = r#"Exports the database of all worldwide aircrafts whose primary use is to be a private jet to "data.csv"
and its description at `description.md` (in disk).
If `azure_sas_token` is provided, data is written to the public blob storage instead.
If `access_key` and `secret_access_key` are provided, data is written to the public blob storage instead.
"#;

#[derive(Parser, Debug)]
#[command(author, version, about = ABOUT)]
struct Cli {
/// The Azure token
#[arg(short, long)]
azure_sas_token: Option<String>,
#[arg(short, long, value_enum, default_value_t=Backend::Azure)]
/// The access key ID for the remote storage
#[arg(long)]
access_key: Option<String>,
/// The secret access key for the remote storage
#[arg(long)]
secret_access_key: Option<String>,
#[arg(short, long, value_enum, default_value_t=Backend::Remote)]
backend: Backend,
}

@@ -36,18 +39,13 @@ async fn main() -> Result<(), Box<dyn Error>> {

let cli = Cli::parse();

// optionally initialize Azure client
let client = match (cli.backend, cli.azure_sas_token.clone()) {
(Backend::Disk, None) => None,
(Backend::Azure, None) => Some(flights::fs_azure::initialize_anonymous(
"privatejets",
"data",
)),
(_, Some(token)) => Some(flights::fs_azure::initialize_sas(
&token,
"privatejets",
"data",
)?),
// initialize client
let client = match (cli.backend, cli.access_key, cli.secret_access_key) {
(Backend::Disk, _, _) => None,
(_, Some(access_key), Some(secret_access_key)) => {
Some(flights::fs_s3::client(access_key, secret_access_key).await)
}
(Backend::Remote, _, _) => Some(flights::fs_s3::anonymous_client().await),
};

// load datasets to memory
@@ -77,7 +75,7 @@ It contains 3 columns:
Both `icao_number` and `tail_number` are unique keys (independently).
"#;

if cli.azure_sas_token.is_some() {
if client.as_ref().map(|c| c.can_put()).unwrap_or(false) {
let client = client.unwrap();
client
.put("database/private_jets/2023/11/06/data.csv", data_csv)
Expand Down
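
The `can_put()` check followed by `client.unwrap()` can be folded into a single `Option` combinator. A minimal sketch, assuming `can_put()` reports whether the client holds write credentials; `Client` and `Destination` are hypothetical stand-ins, not the crate's types:

```rust
/// Hypothetical stand-in for the remote storage client.
struct Client {
    writable: bool,
}

impl Client {
    /// Assumed semantics: true only when the client can write to the bucket.
    fn can_put(&self) -> bool {
        self.writable
    }
}

/// Where the exported files end up.
#[derive(Debug, PartialEq, Eq)]
enum Destination {
    Remote,
    Disk,
}

/// Picks the destination without a separate `unwrap()`:
/// `filter` keeps the client only if it exists and can write.
fn destination(client: Option<&Client>) -> Destination {
    match client.filter(|c| c.can_put()) {
        Some(_) => Destination::Remote,
        None => Destination::Disk,
    }
}
```

In the real example the `Some(client)` arm would bind the client and call `put` on it, so the "exists" and "can write" checks cannot drift apart from the `unwrap()`.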
30 changes: 14 additions & 16 deletions examples/period.rs
@@ -40,7 +40,7 @@ fn render(context: &Context) -> Result<(), Box<dyn Error>> {
#[derive(clap::ValueEnum, Debug, Clone)]
enum Backend {
Disk,
Azure,
Remote,
}

fn parse_date(arg: &str) -> Result<time::Date, time::error::Parse> {
@@ -53,10 +53,13 @@ fn parse_date(arg: &str) -> Result<time::Date, time::error::Parse> {
#[derive(Parser, Debug)]
#[command(author, version, about, long_about = None)]
struct Cli {
/// The Azure token
/// The access key ID for the remote storage
#[arg(long)]
azure_sas_token: Option<String>,
#[arg(long, value_enum, default_value_t=Backend::Azure)]
access_key: Option<String>,
/// The secret access key for the remote storage
#[arg(long)]
secret_access_key: Option<String>,
#[arg(long, value_enum, default_value_t=Backend::Remote)]
backend: Backend,

/// The tail number
@@ -79,18 +82,13 @@ async fn main() -> Result<(), Box<dyn Error>> {

let cli = Cli::parse();

// optionally initialize Azure client
let client = match (cli.backend, cli.azure_sas_token) {
(Backend::Disk, None) => None,
(Backend::Azure, None) => Some(flights::fs_azure::initialize_anonymous(
"privatejets",
"data",
)),
(_, Some(token)) => Some(flights::fs_azure::initialize_sas(
&token,
"privatejets",
"data",
)?),
// initialize client
let client = match (cli.backend, cli.access_key, cli.secret_access_key) {
(Backend::Disk, _, _) => None,
(_, Some(access_key), Some(secret_access_key)) => {
Some(flights::fs_s3::client(access_key, secret_access_key).await)
}
(Backend::Remote, _, _) => Some(flights::fs_s3::anonymous_client().await),
};

// load datasets to memory
34 changes: 16 additions & 18 deletions examples/single_day.rs
@@ -32,7 +32,7 @@ pub struct Context {
#[derive(clap::ValueEnum, Debug, Clone)]
enum Backend {
Disk,
Azure,
Remote,
}

const ABOUT: &'static str = r#"Writes a markdown file per leg (named `{tail-number}_{date}_{leg}.md`) on disk with a description of:
@@ -54,11 +54,14 @@ struct Cli {
/// The date in format `yyyy-mm-dd`
#[arg(short, long, value_parser = parse_date)]
date: time::Date,
/// Optional azure token to write any new data to the blob storage
#[arg(short, long)]
azure_sas_token: Option<String>,
/// The access key ID for the remote storage
#[arg(long)]
access_key: Option<String>,
/// The secret access key for the remote storage
#[arg(long)]
secret_access_key: Option<String>,
/// The backend to read cached data from.
#[arg(short, long, value_enum, default_value_t=Backend::Azure)]
#[arg(short, long, value_enum, default_value_t=Backend::Remote)]
backend: Backend,
}

@@ -75,7 +78,7 @@ async fn flight_date(
owners: &Owners,
aircraft_owners: &AircraftOwners,
aircrafts: &Aircrafts,
client: Option<&fs_azure::ContainerClient>,
client: Option<&fs_s3::ContainerClient>,
) -> Result<Vec<Event>, Box<dyn Error>> {
let models = load_private_jet_models()?;
let airports = airports_cached().await?;
@@ -177,18 +180,13 @@ async fn main() -> Result<(), Box<dyn Error>> {

let cli = Cli::parse();

// optionally initialize Azure client
let client = match (cli.backend, cli.azure_sas_token) {
(Backend::Disk, None) => None,
(Backend::Azure, None) => Some(flights::fs_azure::initialize_anonymous(
"privatejets",
"data",
)),
(_, Some(token)) => Some(flights::fs_azure::initialize_sas(
&token,
"privatejets",
"data",
)?),
// initialize client
let client = match (cli.backend, cli.access_key, cli.secret_access_key) {
(Backend::Disk, _, _) => None,
(_, Some(access_key), Some(secret_access_key)) => {
Some(flights::fs_s3::client(access_key, secret_access_key).await)
}
(Backend::Remote, _, _) => Some(flights::fs_s3::anonymous_client().await),
};

let owners = load_owners()?;
2 changes: 1 addition & 1 deletion methodology.md
@@ -61,7 +61,7 @@ has a continuous sequence of ADS-B positions in time where the aircraft is flyin
The aircraft at a given segment between two ADS-B positions is considered grounded (not flying) when any of:
1. both positions are on the ground
2. the time between these positions is > 5m and any of the positions is below 10.000 feet
3. the time between these positions is > 10h
3. the time between these positions is > 4h

Condition 1. is the normal case where ADS-B signal was received when the aircraft landed.
Condition 2. is used to mitigate the risk that ADS-B receivers sometimes
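
The three grounded conditions combine with a logical OR. A minimal sketch of the rule with the new 4 h threshold, assuming a simplified, hypothetical `Position` type (the crate's own position type is not shown in this diff) and reading `10.000 feet` as ten thousand feet:

```rust
/// Hypothetical, simplified ADS-B position (not the crate's own type).
#[derive(Debug, Clone, Copy)]
struct Position {
    /// Seconds since an arbitrary epoch.
    timestamp_s: i64,
    /// Whether the transponder reported the aircraft as on the ground.
    on_ground: bool,
    /// Reported altitude in feet (0 when on the ground).
    altitude_feet: f64,
}

const LOW_ALTITUDE_FEET: f64 = 10_000.0;
const LOW_ALTITUDE_GAP_S: i64 = 5 * 60; // 5 minutes
const MAX_GAP_S: i64 = 4 * 60 * 60; // 4 hours (10 hours before this commit)

/// Returns true when the segment between two consecutive positions
/// is considered grounded (not flying).
fn is_grounded(prev: &Position, next: &Position) -> bool {
    let elapsed = next.timestamp_s - prev.timestamp_s;
    // Condition 1: both positions are on the ground.
    let both_on_ground = prev.on_ground && next.on_ground;
    // Condition 2: gap > 5 min and either position is below 10,000 feet.
    let any_low =
        prev.altitude_feet < LOW_ALTITUDE_FEET || next.altitude_feet < LOW_ALTITUDE_FEET;
    let low_with_gap = elapsed > LOW_ALTITUDE_GAP_S && any_low;
    // Condition 3: gap > 4 h, regardless of altitude.
    let long_gap = elapsed > MAX_GAP_S;
    both_on_ground || low_with_gap || long_gap
}
```

All comparisons are strict, matching the `>` in the text: a gap of exactly 4 h at cruise altitude still counts as flying.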