Releases: dathere/qsv
0.86.0
Added
- apply: added thousands operation, which adds thousands separators to numeric values. Specify the separator policy with --comparand (default: comma). The valid policies are: comma, dot, space, underscore, hexfour (place a space every four hex digits) and indiancomma (place a comma every two digits, except the last three digits). #748
- searchset: added --unmatched-output option. This was done to allow Datapusher+ to screen for PIIs more efficiently, writing PII candidate records to one CSV file and the "clean" records to another CSV in just one pass. #742
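The separator policies listed for the thousands operation can be illustrated with a minimal Python sketch. This is not qsv's implementation: the function names and the POLICIES table are hypothetical, input is assumed to be an unsigned string of digits (or hex digits for hexfour), and grouping is assumed to count from the right.

```python
def group_from_right(digits: str, sep: str, size: int) -> str:
    """Insert `sep` between groups of `size` characters, counting from the right."""
    parts = []
    while len(digits) > size:
        parts.append(digits[-size:])
        digits = digits[:-size]
    parts.append(digits)
    return sep.join(reversed(parts))


def indian_comma(digits: str) -> str:
    """Keep the last three digits together, then group the rest in twos."""
    if len(digits) <= 3:
        return digits
    head, tail = digits[:-3], digits[-3:]
    return group_from_right(head, ",", 2) + "," + tail


# Hypothetical policy table; the keys mirror the policy names in the release note.
POLICIES = {
    "comma": lambda d: group_from_right(d, ",", 3),
    "dot": lambda d: group_from_right(d, ".", 3),
    "space": lambda d: group_from_right(d, " ", 3),
    "underscore": lambda d: group_from_right(d, "_", 3),
    "hexfour": lambda d: group_from_right(d, " ", 4),
    "indiancomma": indian_comma,
}


def add_thousands(value: str, policy: str = "comma") -> str:
    """Apply one of the separator policies to an unsigned digit string."""
    return POLICIES[policy](value)
```

For example, add_thousands("1234567", "indiancomma") groups the value as 12,34,567, while the default comma policy groups it as 1,234,567.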
Changed
- fetch & fetchpost: expanded usage text info on HTTP2 Adaptive Flow Control support
- fetchpost: added more detail about --compress option
- stats: added more tests
- updated prebuilt zip archive READMEs 072973e
- Bump redis from 0.22.2 to 0.22.3 by @dependabot in #741
- Bump ahash from 0.8.2 to 0.8.3 by @dependabot in #743
- Bump jql from 5.1.4 to 5.1.6 by @dependabot in #747
- applied select clippy recommendations
- cargo update bump several indirect dependencies
- pin Rust nightly to 2023-01-27
Fixed
- stats: fixed antimodes null display. Use the literal NULL instead of just "" when listing NULL as an antimode. #745
- tojsonl: fixed invalid escaping of JSON values #746
Full Changelog: 0.85.0...0.86.0
0.85.0
Added
- Update csvs_convert by @kindly in #736
- sniff: added --delimiter option #732
- fetchpost: add --compress option in #737
- searchset: several tweaks for PII screening requirement of Datapusher+. --flag option now shows regex labels instead of just row number; new --flag-matches-only option sends only matching rows to output when used with --flag; --json option returns rows_with_matches, total_matches and rowcount as json to stderr. #738
Changed
- luau: minor tweaks to increase code readability 31d01c8
- stats: now normalizes after rounding. Normalizing strips trailing zeroes and converts -0.0 to 0.0. f838272
- safenames: mention CKAN-specific options f371ac2
- fetch & fetchpost: document decompression priority 43ce13c
- Bump actix-governor from 0.3.2 to 0.4.0 by @dependabot in #728
- Bump sysinfo from 0.27.6 to 0.27.7 by @dependabot in #730
- Bump serial_test from 0.10.0 to 1.0.0 by @dependabot in #729
- Bump pyo3 from 0.17.3 to 0.18.0 by @dependabot in #731
- Bump reqwest from 0.11.13 to 0.11.14 by @dependabot in #734
- cargo update bump for other dependencies
- pin Rust nightly to 2023-01-21
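The "normalize after rounding" behavior noted for stats above can be sketched in Python as follows. This is an illustration only, not qsv's code: the function name and the default of four decimal places are assumptions, and the sketch renders a normalized zero as "0", whereas the release note describes -0.0 becoming 0.0.

```python
def round_and_normalize(value: float, places: int = 4) -> str:
    """Round to `places` decimals, then strip trailing zeroes and negative zero."""
    text = f"{value:.{places}f}"  # rounding happens first
    if "." in text:
        # Normalizing: drop trailing zeroes, then a dangling decimal point.
        text = text.rstrip("0").rstrip(".")
    if text == "-0":
        # A tiny negative value rounds to negative zero; report plain zero.
        text = "0"
    return text
```

Rounding before normalizing matters: a value such as -0.00001 only becomes negative zero once it has been rounded, so the sign cleanup has to run afterwards.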
Fixed
- sniff: now checks that --sample size is greater than zero cd4c390
Full Changelog: 0.84.0...0.85.0
0.84.0
Added
- headers: added --trim option to trim quotes and spaces from headers #726
Changed
- input: --trim-headers option also removes excess quotes #727
- safenames: trim quotes and spaces from headers 0260833
- cargo update bump dependencies
- pin Rust nightly to 2023-01-13
Full Changelog: 0.83.0...0.84.0
0.83.0
Added
- stats: add sparsity to "streaming" statistics #719
- schema: also infer enum constraints for integer fields. Not only good for validation, this is also required by tojsonl for smarter boolean inferencing #721
Changed
- stats: change --typesonly so it will not automatically --infer-dates. Let the user decide. #718
- stats: if median is already known, use it to calculate Median Absolute Deviation 08ed08d
- tojsonl: smarter boolean inferencing. It will infer a column as boolean if it only has a domain of two values, and the first characters of the values form one of the following case-insensitive "truthy/falsy" combinations: t/f; t/null; 1/0; 1/null; y/n & y/null, which are treated as true/false. #722 and #723
- safenames: process --reserved option before --prefix option. b333549
- strum and strum-macros are no longer optional dependencies as we use them with all the binary variants now bea6e00
- Bump qsv-stats from 0.6.0 to 0.7.0
- Bump sysinfo from 0.27.3 to 0.27.6
- Bump hashbrown from 0.13.1 to 0.13.2 by @dependabot in #720
- Bump actions/setup-python from 4.4.0 to 4.5.0 by @dependabot in #724
- change MSRV from 1.66.0 to 1.66.1
- cargo update bump indirect dependencies
- pin Rust nightly to 2023-01-12
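The tojsonl boolean inferencing rule described above can be sketched in Python. This is a rough illustration, not the actual implementation: the names are hypothetical, an empty string is assumed to count as null, and the two-value domain is assumed to be compared exactly (before lowercasing).

```python
from typing import Iterable, Optional

# (truthy first character, falsy first character); None stands for a null value.
TRUTHY_FALSY_PAIRS = {
    ("t", "f"), ("t", None),
    ("1", "0"), ("1", None),
    ("y", "n"), ("y", None),
}


def first_char(value: Optional[str]) -> Optional[str]:
    """Lowercased first character, or None for a null/empty value."""
    if value is None or value == "":
        return None
    return value[0].lower()


def is_boolean_column(values: Iterable[Optional[str]]) -> bool:
    """True if the column has exactly two distinct values forming a truthy/falsy pair."""
    domain = set(values)
    if len(domain) != 2:
        return False
    a, b = (first_char(v) for v in domain)
    return (a, b) in TRUTHY_FALSY_PAIRS or (b, a) in TRUTHY_FALSY_PAIRS
```

Under these assumptions a column containing only "Yes" and "No" is inferred as boolean, while a column containing "yes" and "maybe" is not, because y/m is not one of the listed combinations.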
Fixed
- safenames: fixed --prefix option. When checking for invalid underscore prefix, it was checking for hyphen, not underscore, causing a problem with Datapusher+ 4fbbfd3
Full Changelog: 0.82.0...0.83.0
0.82.0
Added
- diff: Find the difference between two CSVs ludicrously fast! by @janriemer in #711
- stats: added Median Absolute Deviation (MAD) #715
- added Testing section to README 517d69b
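For readers unfamiliar with the Median Absolute Deviation statistic added to stats, here is a minimal Python sketch of the standard definition: the median of the absolute deviations from the median. It is not qsv-stats code; the function name and the optional known_median parameter are illustrative assumptions.

```python
from statistics import median
from typing import Optional, Sequence


def median_absolute_deviation(
    data: Sequence[float], known_median: Optional[float] = None
) -> float:
    """Median of |x - median(data)|; reuses a precomputed median when supplied."""
    center = median(data) if known_median is None else known_median
    return median(abs(x - center) for x in data)
```

For the values 1, 1, 2, 2, 4, 6, 9 the median is 2, the absolute deviations are 1, 1, 0, 0, 2, 4, 7, and their median, the MAD, is 1.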
Changed
- validate: schema-less validation error improvements #703
- stats: faster date inferencing #706
- stats: minor performance tweaks 15e6284 3f0ed2b
- stats: refactored modes compilation; antimodes compilation no longer unnecessarily compiles more than 10 antimodes, since the rest would not be shown anyway. 6e448b0
- stats: simplify if condition ae7cc85
- luau: show luau version when invoking --version f7f9c42
- excel: add "sheet" suffix to end msg for readability ae3a8e3
- cache util::count_rows result, so when a CSV without an index is queried, future calls to count_rows in the same session will be instantaneous e805ded
- Bump console from 0.15.3 to 0.15.4 by @dependabot in #704
- Bump cached from 0.41.0 to 0.42.0 by @dependabot in #709
- Bump mlua from 0.8.6 to 0.8.7 by @dependabot in #712
- Bump qsv-stats from 0.5.2 to 0.6.0 with the new MAD statistic support and faster, more memory-efficient antimodes compilation
- cargo update bump dependencies - notably mimalloc from 0.1.32 to 0.1.34, luau0-src from 0.4.1_luau553 to 0.5.0_luau555, csvs_convert from 0.7.9 to 0.7.11 and regex from 1.7.0 to 1.7.1
- pin Rust nightly to 2023-01-08
Fixed
- tojsonl: fix escaping of unicode string. Replace hand-rolled escape fn with built-in escape_default fn #707. Fixes #705
- tojsonl: more robust boolean inferencing #710. Fixes #708
New Contributors
- @janriemer made their first contribution in #711
Full Changelog: 0.81.0...0.82.0
0.81.0
[0.81.0] - 2023-01-02
Added
- stats: added range statistic #691
- stats: added additional mode stats. For mode, added mode_count and mode_occurrences. Added "antimode" (opposite of mode - least frequently non-zero occurring value), antimode_count and antimode_occurrences. #694
- qsv-dateparser now recognizes unix timestamp values with fractional seconds to nanosecond precision as dates. stats, sniff, apply datefmt and schema, which all use qsv-dateparser, now infer unix timestamps as dates - a29ff8e #702
USAGE NOTE: As timestamps can be float or integer, and data type inferencing will guess dates last, preprocess timestamp columns with apply datefmt first to more date-like, non-timestamp formats, so they are recognized as dates by other qsv commands.
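To illustrate what "more date-like" means in the usage note above, the following Python sketch converts a unix timestamp, with or without fractional seconds, into an RFC 3339 style string. It is not how apply datefmt is implemented: the function name is hypothetical, UTC is assumed, and a Python float only carries fractional seconds to roughly microsecond precision rather than the nanosecond precision qsv-dateparser recognizes.

```python
from datetime import datetime, timezone


def timestamp_to_rfc3339(raw: str) -> str:
    """Convert a unix timestamp string (integer or fractional seconds) to UTC text."""
    seconds = float(raw)
    return datetime.fromtimestamp(seconds, tz=timezone.utc).isoformat()
```

A bare value such as 1672531200 is ambiguous (integer, float or date), whereas the converted form 2023-01-01T00:00:00+00:00 can only be read as a date.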
Changed
- apply: document numtocurrency --comparand & --replacement behavior cc88fe9
- index: explicitly flush buffer after creating index ee5d790
- sample: no longer requires an index to do percentage sampling 45d4657
- slice: removed unneeded utf8 check 5a199f4
- schema: expand usage text regarding --strict-dates 3d22829
- stats: date stats refactor. Date stats are returned in rfc3339 format. Dates are converted to timestamps with millisecond precision while calculating date stats. #690 e7c2977
- filter out variance/stddev in tests as float precision issues are causing flaky CI tests #696
- Bump qsv-dateparser from 0.4.4 to 0.6.0
- Bump qsv-stats from 0.4.6 to 0.5.2
- Bump qsv-sniffer from 0.5.0 to 0.6.0
- Bump serde from 1.0.151 to 1.0.152 by @dependabot in #692
- Bump csvs_convert from 0.7.7 to 0.7.8 by @dependabot in #693
- Bump once_cell from 0.16.0 to 0.17.0 d3ac255
- Bump self-update from 0.32.0 to 0.34.0 5f95933
- Bump cpc from 1.8 to 1.9; set csvs_convert dependency to minor version ee91648
- applied select clippy recommendations
- deeplink to Cookbook from Table of Contents
- pin Rust nightly to 2023-01-01
- implementation comments on stats, sample, sort & Python distribution
Fixed
- stats: prevent premature rounding, and make sum statistic use the same rounding method 879214a 1a13620
- fix autoindex so we return the index path properly d3ce6a3
- fetch & fetchpost: corrected typo 684036b
Full Changelog: 0.80.0...0.81.0
0.80.0
Added
- new to command. Converts CSVs "to" PostgreSQL, SQLite, XLSX, Parquet and Data Package by @kindly in #656
- apply: add numtocurrency operation #670
- sort: add --ignore-case option #673
- stats: now computes summary statistics for dates as well #684
- added --updatenow option, resolves #661 #662
- replace footnotes in Available Commands list with emojis 😄
Changed
- apply & applydp: expose --batch size option #679
- validate: add last valid row to validation error 7680011
- input: add last valid row to error message 492e51f
- upgrade to csvs-convert 0.7.5 by @kindly in #668
- Bump serial_test from 0.9.0 to 0.10.0 by @dependabot in #671
- Bump csvs_convert from 0.7.5 to 0.7.7 by @dependabot in #674
- Bump num_cpus from 1.14.0 to 1.15.0 by @dependabot in #678
- Bump robinraju/release-downloader from 1.6 to 1.7 by @dependabot in #677
- Bump actions/stale from 6 to 7 by @dependabot in #676
- Bump actions/setup-python from 4.3.1 to 4.4.0 by @dependabot in #683
- added concurrency check to CI tests so that redundant CI tests are canceled when new ones are launched
- instead of saying "descriptive statistics", use more understandable "summary statistics"
- changed publishing workflows to enable to feature for applicable target platforms
- cargo update bump dependencies, notably qsv-stats from 0.4.5 to 0.4.6 and qsv_currency from 0.5.0 to 0.6.0
- pin Rust nightly to 2022-12-22
Fixed
Full Changelog: 0.79.0...0.80.0
0.79.0
Added
- safenames: add --reserved option, allowing user to specify additional "unsafe" names #657
- safenames: add --prefix option #658
- fetch & fetchpost: added simple retry backoff multiplier - e343398
Changed
- excel: refactored --metadata processing; added more debug messages; minor perf tweaks f137bab
- set MSRV to Rust 1.66.0
- cargo update bump several dependencies, notably qsv-dateparser
- pin Rust nightly to 2022-12-15
Full Changelog: 0.78.2...0.79.0
0.78.2
Changed
- cargo update bump paste 1.0.9 to 1.0.10
- pin Rust nightly to 2022-12-12
Removed
- excel: remove --safenames option. If you need safenames, use the safenames command e5da73b
Full Changelog: 0.78.1...0.78.2
0.78.1
Changed
- qsvdp: apply now available in qsvdp as applydp - removing the geocode and calconv subcommands, and removing all operations that require third-party crates EXCEPT dynfmt and datefmt, which are needed for Datapusher+ #652
- excel: fine-tune --metadata processing 09530d4
- bump serde from 1.0.149 to 1.0.150
- qsvdp is now included in CI tests
Full Changelog: 0.78.0...0.78.1