feat: finemapping template and DAG for UKB PPP #10

Merged: 10 commits merged on Sep 18, 2024

Contributor

tskir commented Jul 9, 2024

The idea is to have a common finemapping template, which specific DAGs can reuse and modify according to their needs.

tskir marked this pull request as ready for review on September 18, 2024 14:25
Contributor Author

tskir commented Sep 18, 2024

@project-defiant This has been quite substantially rewritten compared to the first draft. Could you do another round of reviews, please?

Collaborator

project-defiant left a comment

I think this is a good setup that I can use for the gwas_catalog ETL step (and for other steps that require finemapping).

    **common.shared_dag_kwargs,
) as dag:
    (
        FinemappingBatchOperator.partial(
Collaborator

The partial will not work beyond the mapped-task threshold. I have tested this on a local Airflow DAG: it breaks at around 5k partial tasks, even with the threshold increased.

https://airflow.apache.org/docs/apache-airflow/stable/authoring-and-scheduling/dynamic-task-mapping.html#placing-limits-on-mapped-tasks

Contributor Author

Indeed, but just to clarify: in this PR the partial/expand routine iterates not over individual loci (of which there are potentially hundreds of thousands in the worst case), but over chunks of the manifest, of which there are fewer than 10 in either case.
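
To illustrate the chunking idea, here is a minimal sketch in plain Python. It is not the code from this PR: the function name `chunk_manifest`, the `max_chunks` parameter, and the locus identifiers are all hypothetical. It only shows how a large list of loci could be grouped into a handful of chunks, so that a partial/expand call would map over the chunks rather than over every locus, keeping the mapped-task count far below Airflow's `max_map_length` limit.

```python
from math import ceil


def chunk_manifest(loci: list[str], max_chunks: int) -> list[list[str]]:
    """Split loci into at most `max_chunks` contiguous chunks (hypothetical helper)."""
    if max_chunks < 1:
        raise ValueError("max_chunks must be at least 1")
    if not loci:
        return []
    # Round the chunk size up so that no more than max_chunks chunks are produced.
    chunk_size = ceil(len(loci) / max_chunks)
    return [loci[i : i + chunk_size] for i in range(0, len(loci), chunk_size)]


# Assumed worst case for illustration: 250,000 loci grouped into at most 8 chunks.
loci = [f"locus_{i}" for i in range(250_000)]
chunks = chunk_manifest(loci, max_chunks=8)
```

Under these assumptions, each element of `chunks` would correspond to one mapped task, so the number of mapped tasks depends on `max_chunks` and not on the number of loci.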

tskir merged commit a838e93 into dev on Sep 18, 2024
2 checks passed
tskir deleted the tskir-finemapping branch on September 18, 2024 15:14
project-defiant pushed a commit that referenced this pull request Sep 23, 2024
* feat: template for creating finemapping jobs

* feat: example DAG for creating finemapping jobs

* fix: quote parameters containing = for Hydra

* chore: add GENTROPY_DOCKER_IMAGE to common layer

* feat: always use a list of jobs in the DAG

* refactor: use manifest as input

* feat: implement generate_manifests_for_finemapping

* refactor: rewrite the DAG to use new functions

* fix: import errors in DAG

* fix: multiple fixes following test runs