doc: add example of SlimBytes; add desc of SlimBytes to README
drmingdrmer committed Jan 16, 2021
1 parent 31ec831 commit 1f00019
Showing 5 changed files with 210 additions and 1 deletion.
4 changes: 3 additions & 1 deletion .github/settings.yml
@@ -3,6 +3,8 @@ _extends: gh-config

repository:
name: slimarray
description: SlimArray compresses uint32 into several bits, by using a polynomial to describe overall trend of an array.
description: |
SlimArray compresses uint32 into several bits, by using a polynomial to describe overall trend of an array.
SlimBytes uses SlimArray to index a record array, reducing memory overhead.
homepage: https://openacid.github.io/
topics: go, golang, memory, compacted, compress, array, space
52 changes: 52 additions & 0 deletions README.md
@@ -18,6 +18,11 @@ With a SlimArray with a million sorted number in range `[0, 1000*1000]`,
- reading a `uint32` with `Get()` takes **7 ns**.
- batch reading with `Slice()` takes **3.8 ns**/elt.

SlimBytes is an array of variable-length records (each record is a `[]byte`), indexed by a SlimArray.
The memory overhead of storing the `offset` and `length` of each record is therefore very low, about **8 bits/record**,
compared to a typical implementation that stores an offset of type int (`32 to 64 bits/record`).
A `Get()` takes **15 ns**.
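
To put the per-record figures above in perspective, here is a minimal arithmetic sketch. The helper `indexOverheadBytes` and the record count of one million are illustrative assumptions, not part of the slimarray package:

```go
package main

import "fmt"

// indexOverheadBytes is a hypothetical helper: it returns how many bytes an
// index needs for n records when each record costs bitsPerRecord bits.
func indexOverheadBytes(n, bitsPerRecord int) int {
	return n * bitsPerRecord / 8
}

func main() {
	const n = 1000 * 1000 // assumed record count, for illustration only

	// 8 bits/record is the figure quoted for SlimBytes;
	// 32 and 64 bits/record correspond to a plain int32 or int64 offset.
	for _, bits := range []int{8, 32, 64} {
		fmt.Printf("%2d bits/record -> %d bytes\n", bits, indexOverheadBytes(n, bits))
	}
}
```

Under these assumptions the index for a million records takes roughly 1 MB at 8 bits/record, against 4 MB to 8 MB for plain integer offsets.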

Introduction in Chinese: [https://blog.openacid.com/algo/slimarray/](https://blog.openacid.com/algo/slimarray/)

<!-- START doctoc generated TOC please keep comment here to allow auto update -->
@@ -30,6 +35,7 @@ With a SlimArray with a million sorted number in range `[0, 1000*1000]`,
- [Install](#install)
- [Synopsis](#synopsis)
- [Build a SlimArray](#build-a-slimarray)
- [Build a SlimBytes](#build-a-slimbytes)
- [How it works](#how-it-works)
- [The General Idea](#the-general-idea)
- [What It Is And What It Is Not](#what-it-is-and-what-it-is-not-1)
@@ -149,6 +155,52 @@ func ExampleSlimArray() {
}
```


## Build a SlimBytes

```go
package slimarray_test

import (
	"fmt"

	"github.com/openacid/slimarray"
)

func ExampleSlimBytes() {

	records := [][]byte{
		[]byte("SlimBytes"),
		[]byte("is"),
		[]byte("an"),
		[]byte("array"),
		[]byte("of"),
		[]byte("var-length"),
		[]byte("records(a"),
		[]byte("record"),
		[]byte("is"),
		[]byte("a"),
		[]byte("[]byte"),
		[]byte("which"),
		[]byte("is"),
		[]byte("indexed"),
		[]byte("by"),
		[]byte("SlimArray"),
	}

	a, err := slimarray.NewBytes(records)
	_ = err

	for i := 0; i < 16; i++ {
		fmt.Print(string(a.Get(int32(i))), " ")
	}
	fmt.Println()

	// Output:
	// SlimBytes is an array of var-length records(a record is a []byte which is indexed by SlimArray
}
```

# How it works

Package slimarray uses polynomial to compress and store an array of uint32. A
12 changes: 12 additions & 0 deletions docs/README.md.j2
@@ -10,6 +10,11 @@ With a SlimArray with a million sorted number in range `[0, 1000*1000]`,
- reading a `uint32` with `Get()` takes **7 ns**.
- batch reading with `Slice()` takes **3.8 ns**/elt.

SlimBytes is an array of variable-length records (each record is a `[]byte`), indexed by a SlimArray.
The memory overhead of storing the `offset` and `length` of each record is therefore very low, about **8 bits/record**,
compared to a typical implementation that stores an offset of type int (`32 to 64 bits/record`).
A `Get()` takes **15 ns**.

Introduction in Chinese: [https://blog.openacid.com/algo/slimarray/](https://blog.openacid.com/algo/slimarray/)

<!-- START doctoc generated TOC please keep comment here to allow auto update -->
@@ -78,6 +83,13 @@ go get github.com/openacid/slimarray
{% include 'example_slimarray_test.go' %}
```


## Build a SlimBytes

```go
{% include 'example_slimbytes_test.go' %}
```

# How it works

{% include 'docs/slimarray-package.md' %}
103 changes: 103 additions & 0 deletions docs/slimarray.md
@@ -147,6 +147,13 @@ SlimArray compact `Seg` into a dense format:

## Usage

```go
var (
	BytesTooLarge = errors.New("total bytes exceeds max value of uint32")
	TooManyRows   = errors.New("row count exceeds max value of int32")
)
```

```go
var File_slimarray_proto protoreflect.FileDescriptor
```
@@ -218,6 +225,16 @@ Get returns the uncompressed uint32 value. A Get() costs about 7 ns

Since 0.1.1

#### func (*SlimArray) Get2

```go
func (sm *SlimArray) Get2(i int32) (uint32, uint32)
```
Get2 returns the two uncompressed uint32 values at i and i + 1. A Get2() costs
about 15 ns.

Since 0.1.4

#### func (*SlimArray) GetBitmap

```go
@@ -317,3 +334,89 @@ Since 0.1.1
```go
func (x *SlimArray) String() string
```

#### type SlimBytes

```go
type SlimBytes struct {

	// Positions is the array of start positions of every record.
	// There are n + 1 int32 in it.
	// The last one equals len(Records)
	Positions *SlimArray `protobuf:"bytes,21,opt,name=Positions,proto3" json:"Positions,omitempty"`
	// Records is the byte slice of all records packed together.
	Records []byte `protobuf:"bytes,22,opt,name=Records,proto3" json:"Records,omitempty"`
}
```

SlimBytes is an array of variable-length []byte records.

Internally it uses a SlimArray to store record positions. Thus the memory
overhead is about 8 bits/record.

Since 0.1.4
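
The layout described above (n + 1 start positions plus one packed byte slice) can be sketched as follows. This is a simplified illustration, not the package's implementation: the type `packedRecords` and the functions `pack` and `get` are hypothetical names, and the positions are kept in a plain `[]uint32` where SlimBytes uses a compressed SlimArray.

```go
package main

import "fmt"

// packedRecords is a hypothetical stand-in for SlimBytes. It stores the
// positions in a plain []uint32 instead of a compressed SlimArray.
type packedRecords struct {
	positions []uint32 // n + 1 start offsets; the last one equals len(records)
	records   []byte   // all records packed back to back
}

// pack concatenates the records and remembers where each one starts.
func pack(recs [][]byte) *packedRecords {
	p := &packedRecords{positions: make([]uint32, 0, len(recs)+1)}
	for _, r := range recs {
		p.positions = append(p.positions, uint32(len(p.records)))
		p.records = append(p.records, r...)
	}
	// The extra trailing position marks the end of the last record.
	p.positions = append(p.positions, uint32(len(p.records)))
	return p
}

// get returns the i-th record as a sub-slice of the packed bytes.
// It needs the positions at i and i + 1, which is the pair Get2 returns.
func (p *packedRecords) get(i int32) []byte {
	start, end := p.positions[i], p.positions[i+1]
	return p.records[start:end]
}

func main() {
	p := pack([][]byte{[]byte("SlimBytes"), []byte("is"), []byte("an"), []byte("array")})
	fmt.Println(p.positions)      // [0 9 11 13 18]
	fmt.Println(string(p.get(3))) // array
}
```

Because the start positions never decrease, they form the kind of sorted sequence that SlimArray is designed to compress, which is where the low per-record overhead comes from.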

#### func NewBytes

```go
func NewBytes(records [][]byte) (*SlimBytes, error)
```
NewBytes creates a SlimBytes, an array of byte slices, from a series of
records.

Since 0.1.14

#### func (*SlimBytes) Descriptor

```go
func (*SlimBytes) Descriptor() ([]byte, []int)
```
Deprecated: Use SlimBytes.ProtoReflect.Descriptor instead.

#### func (*SlimBytes) Get

```go
func (b *SlimBytes) Get(i int32) []byte
```
Get returns the i-th record.

A Get costs about 17 ns.

Since 0.1.14

#### func (*SlimBytes) GetPositions

```go
func (x *SlimBytes) GetPositions() *SlimArray
```

#### func (*SlimBytes) GetRecords

```go
func (x *SlimBytes) GetRecords() []byte
```

#### func (*SlimBytes) ProtoMessage

```go
func (*SlimBytes) ProtoMessage()
```

#### func (*SlimBytes) ProtoReflect

```go
func (x *SlimBytes) ProtoReflect() protoreflect.Message
```

#### func (*SlimBytes) Reset

```go
func (x *SlimBytes) Reset()
```

#### func (*SlimBytes) String

```go
func (x *SlimBytes) String() string
```
40 changes: 40 additions & 0 deletions example_slimbytes_test.go
@@ -0,0 +1,40 @@
package slimarray_test

import (
	"fmt"

	"github.com/openacid/slimarray"
)

func ExampleSlimBytes() {

	records := [][]byte{
		[]byte("SlimBytes"),
		[]byte("is"),
		[]byte("an"),
		[]byte("array"),
		[]byte("of"),
		[]byte("var-length"),
		[]byte("records(a"),
		[]byte("record"),
		[]byte("is"),
		[]byte("a"),
		[]byte("[]byte"),
		[]byte("which"),
		[]byte("is"),
		[]byte("indexed"),
		[]byte("by"),
		[]byte("SlimArray"),
	}

	a, err := slimarray.NewBytes(records)
	_ = err

	for i := 0; i < 16; i++ {
		fmt.Print(string(a.Get(int32(i))), " ")
	}
	fmt.Println()

	// Output:
	// SlimBytes is an array of var-length records(a record is a []byte which is indexed by SlimArray
}
