doc: update commands
sumeshi authored Jun 30, 2024
1 parent 4d5d61b commit 1117446
Showing 2 changed files with 266 additions and 17 deletions.
281 changes: 265 additions & 16 deletions README.md
@@ -1,7 +1,12 @@
# snip-snap-csv
[![MIT License](http://img.shields.io/badge/license-MIT-blue.svg?style=flat)](LICENSE)
[![PyPI version](https://badge.fury.io/py/sscsv.svg)](https://badge.fury.io/py/sscsv)
[![Python Versions](https://img.shields.io/pypi/pyversions/sscsv.svg)](https://pypi.org/project/sscsv/)

A tool designed for rapid CSV file processing and filtering, specifically designed for log analysis.

## Description

A tool designed for rapid data processing and filtering, specifically tailored for handling CSV files for log analysis.

> [!NOTE]
> This project is in the early stages of development. Please be aware that frequent changes and updates are likely to occur.
@@ -34,28 +39,252 @@ shape: (3, 5)


## Architecture
This tool processes CSV files by chaining three kinds of commands: an initializer, chainable manipulations, and a finalizer.
For example, the initializer reads the file, one or more chainable commands filter or transform it in sequence, and the finalizer outputs the result.

Each command is explicitly separated from the next by a standalone "-".

![](https://gist.githubusercontent.com/sumeshi/644af27c8960a9b6be6c7470fe4dca59/raw/74764568e282ad173a9a51659c65c9f0a029ae38/sscsv.svg)
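The separator convention can be illustrated with a minimal sketch. `split_stages` is a hypothetical helper written for this example and is not the tool's actual parser; it only shows how a standalone "-" could divide a command line into an initializer, chainable steps, and a finalizer.

```python
from typing import List


def split_stages(argv: List[str], separator: str = "-") -> List[List[str]]:
    """Split a flat argument list into stages at each standalone separator token.

    Hypothetical helper for illustration; not part of sscsv.
    """
    stages: List[List[str]] = [[]]
    for token in argv:
        if token == separator:
            stages.append([])  # a bare "-" starts the next stage
        else:
            stages[-1].append(token)
    # Drop empty stages produced by leading, trailing, or doubled separators.
    return [stage for stage in stages if stage]


stages = split_stages(["load", "./Security.csv", "-", "head", "10", "-", "show"])
print(stages)  # [['load', './Security.csv'], ['head', '10'], ['show']]
```

Only a token that is exactly "-" acts as a separator, so a quoted argument such as "Date and Time-Event ID" stays in one piece.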

### initializer
#### load
Loads the specified CSV files.

```
initializer -> chainable manipulations... -> finalizer
Arguments:
path*: str
```

### initializer
- load
examples

```
$ sscsv load ./Security.csv
```

```
$ sscsv load ./logs/*.csv
```

### chainable manipulation
- select
- isin
- contains
- head
- tail
- sort
- changetz
#### select
Displays the specified columns.

```
Arguments:
columns: Union[str, tuple[str]]
```

examples

```
$ sscsv load ./Security.csv - select 'Event ID'
```

```
$ sscsv load ./Security.csv - select "Date and Time-Event ID"
```

```
$ sscsv load ./Security.csv - select "'Date and Time,Event ID'"
```

#### isin

Displays rows whose value in the specified column is one of the given values.

```
Arguments:
colname: str
values: list
```

examples

```
$ sscsv load ./Security.csv - isin 'Event ID' 4624,4634
```
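The filtering semantics of `isin` can be sketched with the standard library alone. This is a sketch under stated assumptions, not the tool's implementation: the `isin` function and the sample rows below are made up for illustration, and values are compared as strings because `csv.DictReader` yields strings.

```python
import csv
import io
from typing import Dict, Iterable, List

# Illustrative rows only; they are not taken from a real log.
SAMPLE_CSV = """Level,Date and Time,Source,Event ID,Task Category
Information,10/7/2016 06:38:24 PM,Microsoft-Windows-Security-Auditing,4658,File System
Information,10/7/2016 06:38:25 PM,Microsoft-Windows-Security-Auditing,4624,Logon
Information,10/7/2016 06:38:26 PM,Microsoft-Windows-Security-Auditing,4634,Logoff
"""


def isin(rows: Iterable[Dict[str, str]], colname: str, values: Iterable[str]) -> List[Dict[str, str]]:
    """Keep rows whose value in `colname` is one of `values` (hypothetical helper)."""
    wanted = set(values)  # set membership keeps each lookup cheap
    return [row for row in rows if row[colname] in wanted]


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
matched = isin(rows, "Event ID", "4624,4634".split(","))
print([row["Event ID"] for row in matched])  # ['4624', '4634']
```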

#### contains

Displays rows in which the specified column matches the given regular expression.

```
Arguments:
colname: str
regex: str
```

examples

```
$ sscsv load ./Security.csv - contains 'Date and Time' '10/6/2016'
```
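Because the second argument is named `regex`, the match is presumably an unanchored regular-expression search rather than a plain substring test. The sketch below assumes that behavior and uses Python's `re` module; the `contains` function and the two rows are hypothetical and exist only for this example.

```python
import re
from typing import Dict, Iterable, List


def contains(rows: Iterable[Dict[str, str]], colname: str, regex: str) -> List[Dict[str, str]]:
    """Keep rows where `regex` matches anywhere in the `colname` value (hypothetical helper)."""
    pattern = re.compile(regex)
    return [row for row in rows if pattern.search(row[colname])]


# Illustrative rows only.
rows = [
    {"Date and Time": "10/6/2016 01:00:35 PM"},
    {"Date and Time": "10/7/2016 12:59:59 AM"},
]

print(len(contains(rows, "Date and Time", "10/6/2016")))       # 1
print(len(contains(rows, "Date and Time", r"^10/[67]/2016")))  # 2
```

Under this assumption, characters such as `.` or `(` in a search term carry their regex meaning and would need escaping to be matched literally.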

#### head

Displays the first specified number of rows of the data.

```
Options:
number: int = 5
```

examples

```
$ sscsv load ./Security.csv - head 10
```

#### tail

Displays the last specified number of rows of the data.

```
Options:
number: int = 5
```

examples

```
$ sscsv load ./Security.csv - tail 10
```

#### sort

Sorts the data by the values of the specified column.

```
Arguments:
columns: str
Options:
desc: bool = False
```

examples

```
$ sscsv load ./Security.csv - sort 'Date and Time'
```

#### changetz

Changes the timezone of the specified date column.

```
Arguments:
columns: str
Options:
timezone_from: str = "UTC"
timezone_to: str = "Asia/Tokyo"
new_colname: str = None
```

examples

```
$ sscsv load ./Security.csv - changetz 'Date and Time' --timezone_from=UTC --timezone_to=Asia/Tokyo --new_colname='Date and Time(JST)'
```
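The conversion itself can be sketched as two steps: interpret the naive timestamp in the source timezone, then convert it to the target timezone. The example below is a simplified illustration, not the tool's code. `change_tz` is a hypothetical helper, the timestamp format is inferred from the sample data in this README, and a fixed UTC+9 offset stands in for Asia/Tokyo (which observes no daylight saving time) so the sketch does not depend on a timezone database.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc
# Assumption: a fixed +09:00 offset stands in for the "Asia/Tokyo" zone name.
JST = timezone(timedelta(hours=9), "JST")


def change_tz(value: str, tz_from: timezone, tz_to: timezone,
              fmt: str = "%m/%d/%Y %I:%M:%S %p") -> datetime:
    """Parse a naive timestamp, attach `tz_from`, and convert it to `tz_to`."""
    naive = datetime.strptime(value, fmt)
    aware = naive.replace(tzinfo=tz_from)  # declare which zone the value is in
    return aware.astimezone(tz_to)         # shift the wall-clock time to the target zone


converted = change_tz("10/6/2016 01:00:35 PM", UTC, JST)
print(converted.isoformat())  # 2016-10-06T22:00:35+09:00
```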

### finalizer
- headers
- stat
- showquery
- show
- dump
#### headers

Displays the column names of the data.

```
Options:
plain: bool = False
```

examples

```
$ sscsv load ./Security.csv - headers
2024-06-30T13:17:53+0000 [DEBUG] 1 files are loaded. Security.csv
┏━━━━┳━━━━━━━━━━━━━━━┓
┃ # ┃ Column Name ┃
┡━━━━╇━━━━━━━━━━━━━━━┩
│ 00 │ Level │
│ 01 │ Date and Time │
│ 02 │ Source │
│ 03 │ Event ID │
│ 04 │ Task Category │
└────┴───────────────┘
```

#### stats

Displays the statistical information of the data.

examples

```
$ sscsv load ./Security.csv - stats
2024-06-30T13:25:53+0000 [DEBUG] 1 files are loaded. Security.csv
shape: (9, 6)
┌────────────┬─────────────┬───────────────────────┬─────────────────────────────────┬─────────────┬─────────────────────────┐
│ statistic ┆ Level ┆ Date and Time ┆ Source ┆ Event ID ┆ Task Category │
│ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- │
│ str ┆ str ┆ str ┆ str ┆ f64 ┆ str │
╞════════════╪═════════════╪═══════════════════════╪═════════════════════════════════╪═════════════╪═════════════════════════╡
│ count ┆ 62031 ┆ 62031 ┆ 62031 ┆ 62031.0 ┆ 62031 │
│ null_count ┆ 0 ┆ 0 ┆ 0 ┆ 0.0 ┆ 0 │
│ mean ┆ null ┆ null ┆ null ┆ 5058.625897 ┆ null │
│ std ┆ null ┆ null ┆ null ┆ 199.775419 ┆ null │
│ min ┆ Information ┆ 10/6/2016 01:00:35 PM ┆ Microsoft-Windows-Eventlog ┆ 1102.0 ┆ Credential Validation │
│ 25% ┆ null ┆ null ┆ null ┆ 5152.0 ┆ null │
│ 50% ┆ null ┆ null ┆ null ┆ 5156.0 ┆ null │
│ 75% ┆ null ┆ null ┆ null ┆ 5157.0 ┆ null │
│ max ┆ Information ┆ 10/7/2016 12:59:59 AM ┆ Microsoft-Windows-Security-Aud… ┆ 5158.0 ┆ User Account Management │
└────────────┴─────────────┴───────────────────────┴─────────────────────────────────┴─────────────┴─────────────────────────┘
```

#### showquery
Displays the query plan for the data processing pipeline.

examples

```
$ sscsv load Security.csv - showquery
2024-06-30T13:26:54+0000 [DEBUG] 1 files are loaded. Security.csv
naive plan: (run LazyFrame.explain(optimized=True) to see the optimized plan)
Csv SCAN Security.csv
PROJECT */5 COLUMNS
```

#### show
Outputs the processing results to the standard output.

examples

```
$ sscsv load Security.csv - show
2024-06-30T13:27:34+0000 [DEBUG] 1 files are loaded. Security.csv
2024-06-30T13:27:34+0000 [DEBUG] heading 5 lines.
Level,Date and Time,Source,Event ID,Task Category
Information,10/7/2016 06:38:24 PM,Microsoft-Windows-Security-Auditing,4658,File System
Information,10/7/2016 06:38:24 PM,Microsoft-Windows-Security-Auditing,4656,File System
Information,10/7/2016 06:38:24 PM,Microsoft-Windows-Security-Auditing,4658,File System
Information,10/7/2016 06:38:24 PM,Microsoft-Windows-Security-Auditing,4656,File System
Information,10/7/2016 06:38:24 PM,Microsoft-Windows-Security-Auditing,4658,File System
```

#### dump
Outputs the processing results to a CSV file.

```
Options:
path: str = yyyymmdd-HHMMSS_{QUERY}.csv
```

examples

```
$ sscsv load Security.csv - dump ./Security-sscsv.csv
```
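When no path is given, the default name follows the pattern `yyyymmdd-HHMMSS_{QUERY}.csv`. A minimal sketch of building such a name is shown below. `default_dump_path` is a hypothetical helper, and how the tool derives the `{QUERY}` part is not documented here, so the example simply takes it as an argument.

```python
from datetime import datetime
from typing import Optional


def default_dump_path(query: str, now: Optional[datetime] = None) -> str:
    """Build a file name shaped like yyyymmdd-HHMMSS_{QUERY}.csv (hypothetical helper)."""
    now = now or datetime.now()
    # %Y%m%d-%H%M%S renders the yyyymmdd-HHMMSS timestamp prefix.
    return f"{now.strftime('%Y%m%d-%H%M%S')}_{query}.csv"


# "load-head" is a made-up query label used only for this example.
print(default_dump_path("load-head", datetime(2024, 6, 30, 13, 27, 34)))
# 20240630-132734_load-head.csv
```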

## Planned Features:
- CSV cache (.pkl)
@@ -65,5 +294,25 @@ initializer -> chainable manipulations... -> finalizer
- Config Batch
- Export Config

## Installation
### from PyPI
```
$ pip install sscsv
```

### from GitHub Releases
A standalone binary compiled with Nuitka is also available.

#### Ubuntu
```
$ chmod +x ./sscsv
$ ./sscsv {{options...}}
```

#### Windows
```
> sscsv.exe {{options...}}
```

## License
snip-snap-csv is released under the [MIT](https://github.com/sumeshi/snip-snap-csv/blob/master/LICENSE) License.
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -1,7 +1,7 @@
[tool.poetry]
name = "sscsv"
version = "0.1.1"
description = "A tool designed for rapid data processing and filtering, specifically tailored for handling CSV files for log analysis."
description = "A tool designed for rapid CSV file processing and filtering, specifically designed for log analysis."
authors = ["sumeshi <sum3sh1@protonmail.com>"]
license = "MIT"
readme = "README.md"