
Use overloads or generics instead of return with | #92

Closed
deanm0000 opened this issue Sep 16, 2024 · 2 comments

Comments

@deanm0000
Contributor

deanm0000 commented Sep 16, 2024

For example, in compute.pyi these functions return a union instead of a type tied to the input:

def mean -> lib.DoubleScalar | lib.Decimal128Scalar: ...

def exp -> lib.FloatArray | lib.DoubleArray: ...

etc.
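As a rough sketch of what the title is asking for, the union returns could be split into overloads (for mean) or expressed with a constrained TypeVar (for exp). The classes below are hypothetical stand-ins, not the real pyarrow stub types, and the input-to-output mapping shown is an assumption for illustration only.

```python
from typing import TypeVar, overload


# Hypothetical stand-ins for the stub classes; not the real pyarrow types.
class DoubleArray: ...
class Decimal128Array: ...
class FloatArray: ...
class DoubleScalar: ...
class Decimal128Scalar: ...


# Overloads: each input type maps to exactly one return type,
# so a type checker never sees the union.
@overload
def mean(array: DoubleArray) -> DoubleScalar: ...
@overload
def mean(array: Decimal128Array) -> Decimal128Scalar: ...
def mean(array):
    # Toy runtime body so the sketch executes; a .pyi stub would hold only the overloads.
    if isinstance(array, Decimal128Array):
        return Decimal128Scalar()
    return DoubleScalar()


# Generics: a constrained TypeVar says "same array type out as in"
# (assumed here for exp; not confirmed by the thread).
_FloatArrayT = TypeVar("_FloatArrayT", FloatArray, DoubleArray)


def exp(array: _FloatArrayT) -> _FloatArrayT:
    return array  # placeholder body, no real computation
```

With either form, a checker can infer the precise result type from the argument, so callers do not need to narrow the union themselves.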

@deanm0000
Contributor Author

I put a tiny dent in this with #105

@deanm0000
Contributor Author

I prodded at the pyarrow compute functions to try to build an in/out tree of their types and came up with this. The first file is the script and the second is the JSON output (so you don't have to run the first).

About the results

The top-level key is the name of the pyarrow compute function. Within each function, the keys are output types. For one-parameter functions, the value is the list of input types that produce that output. For two-parameter functions, the value is a 3-element list: the first two elements are the types of the first and second parameters, respectively, and the third is a bool indicating whether the second parameter was a scalar (False means array).
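To make the layout concrete, here is a minimal sketch that walks a made-up sample in that shape and prints one overload line per input combination. The function names, type names, and the `Array[...]`/`Scalar[...]` notation are all placeholders rather than actual results, and it assumes each output type of a two-parameter function maps to a list of such 3-element entries.

```python
import json

# Made-up sample in the described shape; names are placeholders, not real results.
SAMPLE = json.loads("""
{
  "unary_fn": {"OutA": ["InA", "InB"], "OutB": ["InC"]},
  "binary_fn": {"OutA": [["InA", "InB", true], ["InA", "InB", false]]}
}
""")


def to_overloads(name, outputs):
    """Turn one function's {output_type: entries} mapping into overload stub text."""
    stubs = []
    for out_type, entries in outputs.items():
        for entry in entries:
            if isinstance(entry, str):
                # One-parameter function: the entry is just the input type.
                params = f"x: {entry}"
            else:
                # Two-parameter function: [first type, second type, second-is-scalar].
                first, second, second_is_scalar = entry
                kind = "Scalar" if second_is_scalar else "Array"
                params = f"x: Array[{first}], y: {kind}[{second}]"
            stubs.append(f"@overload\ndef {name}({params}) -> {out_type}: ...")
    return stubs


for fn_name, fn_outputs in SAMPLE.items():
    print("\n".join(to_overloads(fn_name, fn_outputs)))
```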

I'm assuming that all one-parameter functions can take a Scalar or an Array, and that two-parameter functions take only an Array as the first parameter. (As I type that, I'm less happy with that assumption than I was when I wrote the code.)

About the process

I originally just wrapped every attempted function call in a basic try/except, but some of the two-parameter functions would crash the kernel. I had to hack up a multiprocessing approach so that a crash would take down a separate process and the main run could keep going. The one-parameter functions finished in about a second, but adding the two-parameter functions made the run take over 10 minutes.
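The isolation idea can be sketched as below. This is not the script from the thread: it swaps multiprocessing for the standard library's subprocess module to keep the example short, and `probe` plus its outcome labels are hypothetical names. Each call runs in a fresh interpreter, so a hard crash only kills the child.

```python
import subprocess
import sys


def probe(snippet: str, timeout: float = 30.0) -> str:
    """Run `snippet` in a fresh interpreter and classify the outcome."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", snippet],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    if result.returncode == 0:
        # The child prints whatever it wants to report, e.g. the result's type name.
        return result.stdout.strip() or "ok"
    if result.returncode == 1:
        # An uncaught Python exception makes the interpreter exit with status 1.
        return "error"
    # Any other status (e.g. killed by a signal) is treated as a hard crash.
    return "crash"


if __name__ == "__main__":
    print(probe("print(type(1.5).__name__)"))
    print(probe("raise TypeError('unsupported input')"))
    print(probe("import os; os.abort()"))  # simulates a crash in native code
```

Starting a new interpreter for every call is expensive, so in practice one would likely batch several attempts per child process and only fall back to one-per-process when a batch crashes.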

@zen-xu zen-xu closed this as completed Oct 8, 2024