add covid code
nikosbosse committed Nov 27, 2024
1 parent fff92f1 commit de82d07
Showing 3 changed files with 230 additions and 0 deletions.
126 changes: 126 additions & 0 deletions .github/workflows/submit-covid-forecasts.yaml
@@ -0,0 +1,126 @@
name: Make COVID Forecast Submission

on:
  schedule:
    # Every Wednesday at 6pm CET (17:00 UTC)
    - cron: '0 17 * * 3'
  workflow_dispatch: # Allow manual triggering

jobs:
  submit-forecast:
    runs-on: ubuntu-latest
    permissions:
      contents: write
      pull-requests: write

    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install requests pandas numpy
      - name: Create required directories
        run: |
          mkdir -p covid/submissions
      - name: Run covid forecasting script
        run: python covid/run-covid-forecasts.py
        env:
          PYTHONPATH: ${{ github.workspace }}

      - name: Commit forecasts to repository
        run: |
          # Configure git
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          # Add and commit the new forecasts
          git add covid/submissions/
          # Only commit if there are changes
          if git diff --staged --quiet; then
            echo "No new forecasts to commit"
          else
            git commit -m "Store covid forecasts $(date +%Y-%m-%d)"
            git push
          fi
      - name: Fork and sync target repository
        run: |
          # Authenticate the GitHub CLI (preinstalled on GitHub-hosted runners)
          gh auth login --with-token <<< "${{ secrets.PRIVATE_ACCESS_TOKEN }}"
          # Fork the repository (if not already forked)
          gh repo fork CDCgov/covid19-forecast-hub --clone=true || true
          # Sync the fork with upstream
          pushd covid19-forecast-hub
          # Add upstream remote if it doesn't exist
          git remote add upstream https://github.com/CDCgov/covid19-forecast-hub.git 2>/dev/null || true
          # Set the origin remote with authentication token
          git remote set-url origin "https://${{ secrets.PRIVATE_ACCESS_TOKEN }}@github.com/${{ github.actor }}/covid19-forecast-hub.git"
          # Fetch and sync with upstream
          git fetch upstream
          git checkout main
          git reset --hard upstream/main
          git push origin main --force
          popd
      - name: Copy forecast files
        run: |
          # Create target directory if it doesn't exist
          mkdir -p covid19-forecast-hub/model-output/Metaculus-cp
          # Copy new forecasts
          cp -r covid/submissions/* covid19-forecast-hub/model-output/Metaculus-cp/
      - name: Create Pull Request
        run: |
          cd covid19-forecast-hub
          # Setup git config
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          # Create a new branch with timestamp to ensure uniqueness
          BRANCH_NAME="covid-forecast-update-$(date +%Y%m%d-%H%M%S)"
          git checkout -b "$BRANCH_NAME"
          # Add and commit changes
          git add model-output/Metaculus-cp/
          # Only commit if there are changes
          if git diff --staged --quiet; then
            echo "No changes to commit"
            exit 0
          else
            git commit -m "Update covid forecasts $(date +%Y-%m-%d)"
            # Set the remote URL with authentication token
            git remote set-url origin "https://${{ secrets.PRIVATE_ACCESS_TOKEN }}@github.com/${{ github.actor }}/covid19-forecast-hub.git"
            # Force push to fork with the new unique branch
            git push -f origin "$BRANCH_NAME"
            # Create PR
            gh pr create \
              --title "Update covid forecasts $(date +%Y-%m-%d)" \
              --body "Automated covid forecast submission from Metaculus" \
              --repo CDCgov/covid19-forecast-hub \
              --base main \
              --head "${{ github.actor }}:$BRANCH_NAME"
          fi
        env:
          PRIVATE_ACCESS_TOKEN: ${{ secrets.PRIVATE_ACCESS_TOKEN }}
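
The workflow above runs on Wednesdays and copies whatever lands in `covid/submissions/`. As a rough illustration of which file a given run should produce, the sketch below derives the Saturday reference date and the matching file name from a run date. The helper names `reference_date_for` and `submission_path` are hypothetical; only the path pattern is taken from `covid/run-covid-forecasts.py` in this commit.

```python
from datetime import date, timedelta


def reference_date_for(run_date: date) -> date:
    """Return the first Saturday on or after run_date (Monday=0, Saturday=5)."""
    days_until_saturday = (5 - run_date.weekday()) % 7
    return run_date + timedelta(days=days_until_saturday)


def submission_path(run_date: date) -> str:
    """Build the CSV path the forecasting script writes (hypothetical helper)."""
    return f"covid/submissions/{reference_date_for(run_date)}-Metaculus-cp.csv"


if __name__ == "__main__":
    # 2024-11-27 was a Wednesday, so the reference date is Saturday 2024-11-30.
    print(submission_path(date(2024, 11, 27)))
```

Because the file name depends only on the reference date, a manual rerun in the same week would overwrite the same CSV rather than create a second one.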
28 changes: 28 additions & 0 deletions covid/Metaculus-cp.yaml
@@ -0,0 +1,28 @@
team_name: "Metaculus"
team_abbr: "Metaculus"
model_name: "Metaculus Community Prediction"
model_abbr: "cp"
model_version: "1.0"
model_contributors:
  [
    {
      "name": "Ryan Beck",
      "affiliation": "Metaculus",
      "email": "ryan@metaculus.com",
    },
    {
      "name": "Nikos Bosse",
      "affiliation": "Metaculus",
      "email": "nikos@metaculus.com",
    },
  ]
website_url: "https://www.metaculus.com/questions/30049/us-covid-hospitalization-forecasts-2024-25/"
repo_url: "https://github.com/Metaculus/respiratory-diseases"
license: "CC-BY-4.0"
designated_model: true
team_funding: "This project is supported by the National Science Foundation under Award No. 2438211. Any opinions, findings and conclusions or recommendations expressed in this project are those of Metaculus and our forecasters, and do not necessarily reflect the views of the National Science Foundation."
methods: "A recency-weighted average of predictions made by forecasters on the Metaculus prediction platform."
data_inputs: "Users are allowed to make use of any data they choose. The recency-weighted average takes only the numeric forecasts made by forecasters on the platform into account."
methods_long: "The Metaculus Community Prediction is a consensus of recent forecaster predictions. It is designed to respond to big changes in forecaster opinion while still being fairly insensitive to outliers. For every forecaster, only their most recent prediction is kept. Predictions are assigned a number n, from oldest to newest (oldest is 1). Every prediction is weighted proportionally to exp(sqrt(n)). Predictions are then aggregated by creating a mixture distribution of all available weighted forecasts."
ensemble_of_models: true
ensemble_of_hub_models: false
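
The weighting described in `methods_long` can be sketched in a few lines. This is only an illustration of the stated rule (weights proportional to exp(sqrt(n)), then a mixture), not Metaculus's production code; the function names and the toy CDFs are made up for the example. It relies on the fact that the CDF of a mixture distribution is the weighted average of the component CDFs.

```python
import numpy as np


def recency_weights(num_forecasts: int) -> np.ndarray:
    """Normalized weights proportional to exp(sqrt(n)), n = 1 (oldest) .. num_forecasts (newest)."""
    n = np.arange(1, num_forecasts + 1)
    raw = np.exp(np.sqrt(n))
    return raw / raw.sum()


def mixture_cdf(cdfs: np.ndarray) -> np.ndarray:
    """Weighted average of per-forecaster CDFs; rows are ordered oldest to newest."""
    weights = recency_weights(cdfs.shape[0])
    return weights @ cdfs


if __name__ == "__main__":
    grid = np.linspace(0, 1, 5)
    # Three made-up forecaster CDFs evaluated on a shared grid
    cdfs = np.vstack([grid, grid**2, np.sqrt(grid)])
    print(recency_weights(3))
    print(mixture_cdf(cdfs))
```

Since exp(sqrt(n)) grows slowly, newer predictions count for more without a single latest forecaster dominating the aggregate.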
76 changes: 76 additions & 0 deletions covid/run-covid-forecasts.py
@@ -0,0 +1,76 @@
# https://www.metaculus.com/questions/30049/us-covid-hospitalization-forecasts-2024-25/

# The Challenge Period will begin November 20, 2024, and will run until May 31, 2025. Participants are asked to submit weekly nowcasts and forecasts by 11PM Eastern Time each Wednesday (herein referred to as the Forecast Due Date).

import requests
from datetime import datetime, timedelta
import pandas as pd
from utils import internal_to_actual
import numpy as np

question_id = 30049
url = f"https://metaculus.com/api/posts/{question_id}"
response = requests.get(url).json()

# get the reference date, which is the Saturday following the submission due date
today = datetime.now().date()  # this is the submission due date, a Wednesday
days_until_saturday = (5 - today.weekday()) % 7 # 5 is Saturday
reference_date = today + timedelta(days=days_until_saturday)

forecasts = []

subquestions = response["group_of_questions"]["questions"]
for subquestion in subquestions:
    # obtain the target end date
    question_title = subquestion["title"]
    target_end_date = question_title.split("(")[1].split(")")[0].strip()
    target_end_date = datetime.strptime(target_end_date, "%B %d, %Y").date()

    # calculate horizon based on target_end_date and reference_date
    # target_end_date should be equal to the reference_date + horizon*(7 days).
    horizon = (target_end_date - reference_date).days // 7

    # only process forecasts whose horizon is -1, 0, 1, 2, or 3
    if horizon not in [-1, 0, 1, 2, 3]:
        continue

    # obtain the scaling of the x-axis
    range_max = subquestion["scaling"]["range_max"]
    range_min = subquestion["scaling"]["range_min"]
    zero_point = subquestion["scaling"]["zero_point"]

    try:
        cdf = subquestion["aggregations"]["recency_weighted"]["latest"][
            "forecast_values"
        ]
    except TypeError:
        print(f"No forecast for {question_title}")
        continue

    internal_x_grid = np.linspace(0, 1, 201)
    actual_x_grid = internal_to_actual(
        internal_x_grid, zero_point, range_min, range_max, is_linear=False
    )

    desired_quantile_levels = np.concatenate(
        [[0.01, 0.025], np.arange(0.05, 0.95 + 0.05, 0.05), [0.975, 0.99]]
    ).round(3)
    desired_quantiles = np.interp(desired_quantile_levels, cdf, actual_x_grid)

    # need a dataframe with the quantiles and the quantile levels
    latest_forecast_df = pd.DataFrame(
        {
            "reference_date": reference_date,
            "target": "wk inc covid hosp",
            "horizon": horizon,
            "target_end_date": target_end_date,
            "location": "US",
            "output_type": "quantile",
            "output_type_id": desired_quantile_levels,
            "value": desired_quantiles,
        }
    )
    forecasts.append(latest_forecast_df)

forecasts_df = pd.concat(forecasts)
forecasts_df.to_csv(f"covid/submissions/{reference_date}-Metaculus-cp.csv", index=False)
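
The script imports `internal_to_actual` from a `utils` module that is not part of this commit. The sketch below shows one plausible shape for that helper, assuming the logarithmic scaling convention used for Metaculus continuous questions (the formula here is an assumption and the real `utils` implementation may differ), together with the CDF inversion the script performs with `np.interp`. The helper `quantiles_from_cdf` and all numeric values are hypothetical.

```python
import numpy as np


def internal_to_actual(x, zero_point, range_min, range_max, is_linear=False):
    """Map internal locations in [0, 1] to the question's actual scale (assumed formula)."""
    x = np.asarray(x, dtype=float)
    if is_linear or zero_point is None:
        return range_min + (range_max - range_min) * x
    # Log scaling: ratio of the distances of both bounds from the zero point
    deriv_ratio = (range_max - zero_point) / (range_min - zero_point)
    return range_min + (range_max - range_min) * (deriv_ratio**x - 1) / (deriv_ratio - 1)


def quantiles_from_cdf(cdf, x_grid, levels):
    """Invert a non-decreasing CDF by interpolating x values over the CDF values."""
    return np.interp(levels, cdf, x_grid)


if __name__ == "__main__":
    internal_grid = np.linspace(0, 1, 201)
    # Made-up scaling values, for illustration only
    actual_grid = internal_to_actual(
        internal_grid, zero_point=0.0, range_min=100.0, range_max=10000.0
    )
    toy_cdf = internal_grid  # uniform on the internal scale
    print(quantiles_from_cdf(toy_cdf, actual_grid, [0.025, 0.5, 0.975]))
```

Note that `np.interp` expects its second argument to be increasing, so this inversion is only well defined where the CDF values are non-decreasing along the grid.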
