Code crashes when chromosome ID's are digit-only #120

Open
jmoellmann opened this issue Oct 31, 2024 · 1 comment
Labels: bug (Something isn't working), todo (This issue will be addressed in a future update)

jmoellmann commented Oct 31, 2024

When running pixy on datasets where chromosome IDs consist only of digits and start with zeros (e.g. ["0001", [...], "0016"]), the program breaks when reading from the temp files: pd.read_csv (line 328, main.py) automatically converts the IDs to numerics, stripping the leading zeros, which results in a KeyError on lines 363 and 370.
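
The behaviour described above can be reproduced outside pixy with a few lines of pandas. This is only a minimal sketch: the file contents, the position of the chromosome ID (column 3), and the `chrom_lengths` lookup are assumptions made up for illustration, not pixy's actual temp-file layout or code.

```python
import io
import pandas as pd

# Hypothetical stand-in for a temp file: tab-separated, no header,
# with the chromosome ID assumed to sit in column 3.
temp_file = io.StringIO(
    "pi\tpop1\t1\t0001\t100\n"
    "pi\tpop1\t1\t0016\t200\n"
)

outpanel = pd.read_csv(temp_file, sep="\t", header=None)

# Column 3 is inferred as an integer column, so "0001" is read as 1.
chrom_values = outpanel[3].tolist()

# Any lookup keyed by the original string IDs now fails.
chrom_lengths = {"0001": 1000, "0016": 2000}  # made-up mapping
try:
    chrom_lengths[chrom_values[0]]
    lookup_failed = False
except KeyError:
    lookup_failed = True
```

Because the dtype is inferred per column, the leading zeros are lost before any downstream code sees the IDs.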

This is certainly very much an edge case, as most chromosome IDs will not be digit-only, but some tools output numeric-only chromosome IDs.

This bug occurs irrespective of the pixy command and populations file used and of the system architecture, and it is very easy to reproduce.

A line from an exemplary VCF:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT X1 [...] XY
0001 1 . A . 100 . DP=[...] GT:[...]

I suggest the following fix at line 328, main.py:

< outpanel = pandas.read_csv(temp_file, sep='\t', header=None)
---
> outpanel = pandas.read_csv(temp_file, sep='\t', header=None, dtype={3: 'string'})
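
As a quick check of the suggested change, the sketch below reads a small made-up table with `dtype={3: 'string'}`. The table contents are hypothetical. The column index is taken from the suggestion above and has not been verified against pixy's temp-file format.

```python
import io
import pandas as pd

# Made-up tab-separated data with digit-only, zero-padded IDs in column 3.
data = "pi\tpop1\t1\t0001\t100\npi\tpop1\t1\t0016\t200\n"

# With header=None the columns are labelled 0..4, so the dtype mapping
# can be keyed by the integer position of the chromosome column.
fixed = pd.read_csv(io.StringIO(data), sep="\t", header=None, dtype={3: "string"})

chrom_ids = fixed[3].tolist()     # IDs are kept as strings, zeros intact
other_values = fixed[4].tolist()  # other columns are still inferred as usual
```

Restricting the dtype override to the one column leaves type inference for the numeric columns untouched.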

@jmoellmann jmoellmann added the bug Something isn't working label Oct 31, 2024
@ksamuk ksamuk added the todo This issue will be addressed in a future update label Jan 24, 2025
ksamuk (Owner) commented Jan 24, 2025

Thanks for the fix, we will address this soon!
