v 1.3 (06/10/23)
* Moved second half of onedim_fseries_kernel() to GPU (with a simple heuristic
based on nf1 to switch between the CPU and the GPU version).
* Melody fixed a bug where MAX_NF became 0 because 1e11 was typecast to int
(thanks to Elliot Slaughter for catching that).
* Melody fixed kernel evaluation so that it is done w*d rather than w^d times;
this speeds up 2D a little and 3D quite a lot! (PR#130)
* Melody added 1D support for type 1 (GM-sort and SM methods) and type 2
(GM-sort), in C++/CUDA with their test executables (but not in the Python
interface).
* Various fixes to package config.
* Miscellaneous bug fixes.
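
The w*d versus w^d change above can be illustrated with a small Python sketch.
It is not the library's CUDA code: the kernel form exp(beta*sqrt(1 - z^2)), the
value of beta, the coordinate mapping, and all function names are assumptions
made for illustration. It shows that a separable kernel needs only w
evaluations per dimension, with the w^d grid weights formed as products.

```python
import itertools
import math


def kernel_1d(z, beta=2.0):
    """Illustrative 1D kernel (assumed form), supported on |z| <= 1."""
    return math.exp(beta * math.sqrt(1.0 - z * z)) if abs(z) <= 1.0 else 0.0


def coord(offset, i, w):
    """Map grid index i in [0, w) to z in [-1, 1] for a point offset in [-0.5, 0.5]."""
    return (i + 0.5 - w / 2.0 - offset) * 2.0 / w


def weights_naive(offsets, w):
    """Re-evaluate the 1D kernel at every one of the w^d grid points."""
    d = len(offsets)
    evals = 0
    out = {}
    for idx in itertools.product(range(w), repeat=d):
        val = 1.0
        for dim, i in enumerate(idx):
            val *= kernel_1d(coord(offsets[dim], i, w))
            evals += 1
        out[idx] = val
    return out, evals


def weights_separable(offsets, w):
    """Evaluate the 1D kernel w times per dimension, then take products."""
    d = len(offsets)
    evals = 0
    per_dim = []
    for dim in range(d):
        row = []
        for i in range(w):
            row.append(kernel_1d(coord(offsets[dim], i, w)))
            evals += 1
        per_dim.append(row)
    out = {
        idx: math.prod(per_dim[dim][i] for dim, i in enumerate(idx))
        for idx in itertools.product(range(w), repeat=d)
    }
    return out, evals
```

In this sketch the naive version performs d*w^d kernel evaluations and the
separable version w*d, which is why the saving grows with dimension.
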
v 1.2 (02/17/21)
* Warning: Following are Python interface changes -- not backwards compatible
  with v 1.1 (See examples/example2d1,2many.py for updated usage)
  - Made opts a kwarg dict instead of an object:
      def __init__(self, ... , opts=None, dtype=np.float32)
      => def __init__(self, ... , dtype=np.float32, **kwargs)
  - Renamed arguments in plan creation `__init__`:
      ntransforms => n_trans, tol => eps
  - Changed order of arguments in plan creation `__init__`:
      def __init__(self, ... , isign, eps, ntransforms, opts, dtype)
      => def __init__(self, ... , ntransforms, eps, isign, opts, dtype)
  - Removed M in `set_pts` arguments:
      def set_pts(self, M, kx, ky=None, kz=None)
      => def set_pts(self, kx, ky=None, kz=None)
* Python: added multi-gpu support (in beta)
* Python: added more unit tests (wrong input, kwargs, multi-gpu)
* Fixed various memory leaks
* Added index bound check in 2D spread kernels (Spread_2d_Subprob(_Horner))
* Added spread/interp tests to `make check`
* Fixed calculation of kernel width (w) from user-requested tolerance (eps)
* Set default kernel evaluation method to 0, i.e. exp(sqrt()), since it is faster
* Removed outdated benchmark codes, cleaner spread/interp tests
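
The shape of the v 1.2 Python interface changes listed above can be sketched
with a stand-in class. This is not the cufinufft class: PlanSketch, its default
values, and the option name some_option are hypothetical, and the leading
arguments elided as "..." in the entries above are left out entirely.

```python
class PlanSketch:
    """Hypothetical stand-in that mirrors only the changes listed above."""

    def __init__(self, n_trans=1, eps=1e-6, isign=None, dtype="float32", **kwargs):
        # Renamed arguments: ntransforms => n_trans, tol => eps.
        self.n_trans = n_trans
        self.eps = eps
        self.isign = isign
        self.dtype = dtype
        # Options arrive as plain keyword arguments instead of an opts object.
        self.opts = dict(kwargs)
        self.M = None
        self.dim = None

    def set_pts(self, kx, ky=None, kz=None):
        # M is no longer passed in; it is taken from the length of kx.
        coords = [k for k in (kx, ky, kz) if k is not None]
        M = len(kx)
        if any(len(k) != M for k in coords):
            raise ValueError("all coordinate arrays must have the same length")
        self.M = M
        self.dim = len(coords)


# Usage: options are keywords, and set_pts takes only the coordinates.
plan = PlanSketch(n_trans=2, eps=1e-4, some_option=1)
plan.set_pts([0.1, 0.2, 0.3], [0.4, 0.5, 0.6])
```
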
v 1.1 (09/22/20)
* Python: extended the mode tuple to 3D and reordered it from the C/Python
ndarray.shape style input (nZ, nY, nX) to the (F) order expected by the
low level library (nX, nY, nZ).
* Added bound checking on the bin size
* Added dual-precision support to spread/interp tests
* Improved documentation of spread/interp tests
* Added dummy call of cufftPlan1d to avoid timing the constant cost of the
cuFFT library.
* Added heuristic decision of maximum batch size (number of vectors with the
same nupts to transform at the same time)
* Reported execution throughput in the test codes
* Fixed timing in the test codes
* Improved handling of too-small eps (requested tolerance)
* Rewrote README.md and added cuFINUFFT logo.
* Added support for advanced Makefile usage, e.g. make -site=olcf_summit
* Removed FFTW dependency
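
The mode-tuple reordering described above amounts to reversing the shape
tuple. A minimal sketch follows; the helper name to_library_order is
hypothetical and is not part of the Python interface.

```python
def to_library_order(modes):
    """Reverse a C/NumPy-style shape such as (nZ, nY, nX) into (nX, nY, nZ)."""
    return tuple(modes)[::-1]


# Usage: a 3D ndarray.shape-style tuple and its reversed, F-order counterpart.
c_style = (4, 5, 6)                       # (nZ, nY, nX)
library_style = to_library_order(c_style)  # (nX, nY, nZ)
```
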
v 1.0 (07/29/20)