0.3-5 2023-11-29
* add format casts
0.3-4 2023-11-28
* pass max.size argument through chunk.map() (#39)
* minor change to work around rchk not being able to follow
protections across functions.
0.3-3 2022-12-09
* fix error/segfault (depending on R version) in as.output()
when a type that doesn't support LENGTH() is passed (such as
NULL).
* CH.MAX.SIZE was ignored in chunk.apply() for parallel jobs
* add CH.BINARY flag which can be set to TRUE if the merge step
should be performed continually as a call to a binary CH.MERGE
function instead of collecting all results and then calling
CH.MERGE.
Analogously, CH.INITIAL has been added which is a function
called on the first result. If NULL then
CH.MERGE(NULL, result) is called instead.
Note: in previous versions regular chunk.apply() was behaving
like CH.BINARY=FALSE, but when parallel was set then it
behaved like CH.BINARY=TRUE. Now CH.BINARY is explicit.
* new parallel chunk.apply() implementation
The related arguments have been re-named to avoid clashes with
actual function arguments. CH.MERGE now behaves the same way
as with sequential processing for consistency.
CH.PARALLEL - if set to 2 or higher triggers parallel
processing of chunks
CH.SEQUENTIAL - if FALSE then parallel processing is allowed
to change the order of the chunks so that chunks which
yield results faster are handled first.
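The two merge modes described for CH.BINARY in this release can be sketched in base R. This only illustrates the semantics as stated in these notes and is not the package implementation; merge_collect(), merge_binary() and the sample results list are hypothetical names introduced here.

```r
# Hypothetical per-chunk results, standing in for what FUN would return.
results <- list(1:2, 3:4, 5:6)

# CH.BINARY=FALSE as described above: collect all results first, then
# call the merge function once with all of them.
merge_collect <- function(results, merge) {
  do.call(merge, results)
}

# CH.BINARY=TRUE as described above: merge continually, two at a time.
# If `initial` is NULL the first step is merge(NULL, result); otherwise
# initial(result) is called on the first result.
merge_binary <- function(results, merge, initial = NULL) {
  acc <- NULL
  for (i in seq_along(results)) {
    if (i == 1L && !is.null(initial)) {
      acc <- initial(results[[i]])
    } else {
      acc <- merge(acc, results[[i]])
    }
  }
  acc
}

collected <- merge_collect(results, c)
folded    <- merge_binary(results, c)
```

With a merge function such as c() both calls return the same combined vector; the binary form simply never needs the full list of results at once.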
0.3-2 2021-07-23
* minor changes for compatibility with write-barrier
and R-devel (no functional difference)
0.3-1 2020-03-10
* make sure connections are closed in examples so
check doesn't complain
* add PROTECT() to chunk.apply() and string singletons
0.3-0 2020-03-09
* integers incorrectly parsed empty strings to 0
instead of NA (#27)
* add as.output.raw() which supports both direct file
descriptors and connections
* Extend the handling of as.output()
as.output() now supports three modes:
1) con=NULL: a raw vector is created
2) con=connection: writes output to binary connection
3) con=iotools.stderr/stdout/fd(fd): writes directly
to a file descriptor
Also as.output() is now pass-through for raw vectors.
Finally, most methods now support keys to be either a
logical value to suppress names/row names or it can
also be a character vector in which case its content
is used as keys.
0.2-6 2018-02-05
* add support for logical vectors in fdrbind
0.2-5 2018-01-24
* disable non-blocking raw fd reads on Windows since select()
does NOT work on FDs there.
0.2-4 2017-04-13
* remove unnecessary reference to stdout
* increase temporary buffer to (hopefully) appease gcc7
* add stdout_writeBin C code
* add fdrbind()
0.2-3 2016-09-16
* fix a bug in timeout parameter of read.chunk() where subsecond
timeouts were computed incorrectly
0.2-2 2016-04-26
* add support for raw file descriptors and timeout in the chunk
reader
0.2-1 2015-08-20
* use R_GetConnection() API in R >=3.3.0
* add chunk.map to mimic hmr locally
* fix col.names handling in write.csv.raw() (#26)
* clean up as_output_matrix to be 64-bit safe
* use internal C methods for all output
support ragged lists (with recycling) and long vectors in
as_output_dataframe
* support I() to tag objects that don't want to use
as.character()
* make string coercion rules consistent
* re-factor as.output.data.frame to use dybuf
* support binary connection con in as.output() instead of
buffering
* add support for quoting via quote= parameter (#25)
0.1-12 2015-07-28
* don't import parallel::mc* since it doesn't exist on Windows
0.1-11 2015-07-28
* fix issues, mostly convert to 64-bit
0.1-10 2015-06-22
* remove old stdio API
* add quoting to read.csv.raw
* support quotes in character fields (#24)
0.1-9 2015-06-22
* fix handling of Windows line endings (#23)
0.1-8 2015-03-18
* add support for iterators - imstrsplit/idstrsplit
(Thanks to Mike Kane! - #19)
* add tests and fixes to make them run on edge cases
* fix mstrsplit when given length zero input
* re-factor as.output() to use dynamic buffers
0.1-7 2015-02-10
* add C implementations of as.output()
0.1-6 2015-02-08
* support tab/comma separated files with as.output() when x is a
data.frame or matrix
* make loading hmr silently the default until we rename hmr and
go to CRAN
* fix header=TRUE bug
* treat NAs in dstrsplit list input as a way to skip columns
0.1-5 2014-12-15
* Removed "pipeline" parameter for chunk.apply and updated the
documentation
* Parallel option added to chunk.apply()
* major re-structuring of the raw parsers (dstrsplit and
mstrsplit)
------------------------------------------------------------------------
previous versions included code for Hadoop Map/Reduce, that code
has now been moved to a separate package:
https://github.com/s-u/hmr
------------------------------------------------------------------------
0.1-4 2014-06-09
* support names from colspec, support list colspec
* add experimental remote submission capability
0.1-3 2014-05-20
* add hadoop.opt option and hadoop.conf support
0.1-2
* fix missing PROTECT in chunk.tapply
0.1-1 2014-04-17
* add key-awareness when splitting
* add ctapply() - more efficient implementation of tapply()
for contiguous keys
* add support for Hadoop 2.x
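The idea behind applying a function over contiguous keys, as mentioned for ctapply() in this release, can be sketched in base R with rle(). This is a hypothetical illustration, not the ctapply() implementation; contiguous_apply() is a name made up here. In this sketch a key that reappears after a different key starts a new group; how ctapply() itself treats unsorted keys is not stated in these notes.

```r
# Apply `fun` to each run of identical adjacent keys.
# Only one pass over the keys is needed because equal keys are
# assumed to sit next to each other.
contiguous_apply <- function(x, keys, fun) {
  runs   <- rle(as.character(keys))
  ends   <- cumsum(runs$lengths)
  starts <- ends - runs$lengths + 1L
  res <- lapply(seq_along(starts), function(i) fun(x[starts[i]:ends[i]]))
  names(res) <- runs$values
  res
}

sums <- contiguous_apply(c(1, 2, 3, 4, 5), c("a", "a", "b", "b", "a"), sum)
```

Here sums holds one entry per run of keys, in input order, so the trailing "a" forms its own group rather than being combined with the first one.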
0.1-0 2013-05-23
* initial public release