pxi-dsv teaser

🧚pxi-dsv is a delimiter-separated values plugin for pxi (pixie), the small, fast, and magical command-line data processor.

See the pxi GitHub repository for more details!


Installation

👌 pxi-dsv comes preinstalled in pxi. No installation necessary. If you still want to install it, proceed as described below.

pxi-dsv is installed in ~/.pxi/ as follows:

npm install pxi-dsv

The plugin is included in ~/.pxi/index.js as follows:

const dsv = require('pxi-dsv')

module.exports = {
  plugins:  [dsv],
  context:  {},
  defaults: {}
}

For a much more detailed description, see the .pxi module documentation.

Extensions

This plugin comes with the following pxi extensions:

| Extension | Description |
|-----------|-------------|
| dsv deserializer | Deserializes delimiter-separated values files. The delimiter, quote, and escape characters, as well as several other options make it very flexible. |
| csv deserializer | Deserializes comma-separated values files. Follows RFC4180 for the most part. Uses dsv internally and accepts the same options. |
| tsv deserializer | Deserializes tab-separated values files. Useful for processing tabular database and spreadsheet data. Uses dsv internally and accepts the same options. |
| ssv deserializer | Deserializes space-separated values files. Useful for processing command line output from ls, ps, and the like. Uses dsv internally and accepts the same options. |
| csv serializer | Serializes JSON into CSV format. |
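
To illustrate what configurable delimiter, quote, and escape characters mean in practice, here is a minimal sketch of a quote-aware line splitter. It is not pxi-dsv's implementation: the function name `splitLine` and its option names are hypothetical and chosen only for this example, and the plugin's actual options may be named and behave differently.

```javascript
// Hypothetical helper, NOT part of pxi-dsv: splits one line into values
// using configurable delimiter, quote, and escape characters.
function splitLine (line, { delimiter = ',', quote = '"', escape = '"' } = {}) {
  const values = []
  let value = ''
  let inQuotes = false

  for (let i = 0; i < line.length; i++) {
    const ch = line[i]
    const next = line[i + 1]

    if (inQuotes) {
      if (ch === escape && next === quote) {
        // An escaped quote inside a quoted value becomes a literal quote.
        value += quote
        i++
      } else if (ch === quote) {
        // An unescaped quote closes the quoted value.
        inQuotes = false
      } else {
        value += ch
      }
    } else if (ch === quote) {
      inQuotes = true
    } else if (ch === delimiter) {
      values.push(value)
      value = ''
    } else {
      value += ch
    }
  }

  values.push(value)
  return values
}

// CSV-style: comma delimiter, quotes escaped by doubling them.
const csvValues = splitLine('a,"b,c","say ""hi"""')

// TSV-style with a backslash escape: the same logic, different characters.
const tsvValues = splitLine('a\t"b\\"c"', { delimiter: '\t', escape: '\\' })
```

In this sketch, the csv, tsv, and ssv cases differ only in the characters passed in, which mirrors the statement above that those deserializers use dsv internally and accept the same options.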

Known Limitations

This plugin has the following limitations:

  1. No type casting: The deserializers do not cast strings to other data types, like numbers or booleans. This is intentional. Since different use cases need different data types, and some use cases need their integers to be strings, e.g. in case of IDs, there is no way to know for sure when to cast a string to another type. If you need different types, you may cast strings by using functions.
  2. Integer header order: Headers that are integers are always printed before other headers. This is an implementation detail of the way JavaScript orders object keys internally. Although this is an inconvenience, this behaviour will stay for now, since changing it would reduce performance. If you have a good way to solve this and retain performance, please let me know.
  3. Non-optimal tsv (de-)serializer implementations: The tsv deserializer is implemented in terms of the dsv deserializer and thus supports quotes and escaping tabs. Other implementations of tsv deserializers do not allow tabs in values and thus have no need for quotes and escapes. This means the current tsv implementation works just fine, but an implementation without quotes should be faster. Such an implementation may come at some point in the future.
  4. No multi-line CSV files: The csv deserializer does not support multi-line values, i.e. values with line breaks inside quotes. In fact, no pxi deserializer could support this feature on its own, since chunking data is the chunkers' responsibility. Currently there is no dedicated chunker that supports chunking multi-line csv files, but there may be one in the future.
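
Limitations 1, 2, and 4 can be reproduced in plain JavaScript, independent of pxi. The following is a small sketch under stated assumptions: the sample records, the `cast` function, and the newline split standing in for a line chunker are all made up for illustration and are not taken from pxi or pxi-dsv.

```javascript
// 1. No type casting: deserialized values are strings, so cast explicitly.
//    The id stays a string on purpose, which keeps its leading zeros.
const row = { id: '007', price: '4.50', active: 'true' }
const cast = r => ({
  id: r.id,
  price: Number(r.price),
  active: r.active === 'true'
})
const typed = cast(row)

// 2. Integer header order: JavaScript lists integer-like keys first, in
//    ascending order, regardless of the order in which they were inserted.
const record = {}
record.name = 'a'
record['2'] = 'b'
record['1'] = 'c'
const headers = Object.keys(record)

// 4. No multi-line values: splitting on line breaks (a stand-in for a line
//    chunker) cuts a quoted value in two before any deserializer sees it.
const csv = 'id,note\n1,"first line\nsecond line"\n'
const chunks = csv.split('\n').filter(line => line !== '')
```

Here `headers` starts with the keys `1` and `2` although `name` was inserted first, and `chunks` contains three entries for what is logically a header plus one record.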

Reporting Issues

Please report issues in the tracker!

License

pxi-dsv is MIT licensed.