Use offsets instead of pointers to reduce memory footprint and allow simple copying (offsets stay valid when the buffer is copied or relocated)
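
A minimal sketch of the offset idea, not the actual code. `Node`, `Arena`, `kNull` and `sum_chain` are hypothetical names made up for illustration; the assumption is that links are 32-bit byte offsets from the start of a single owning buffer.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical sentinel meaning "no next node".
constexpr std::uint32_t kNull = 0xFFFFFFFFu;

// Hypothetical node: the link is a 32-bit offset, not a pointer
// (half the size of a pointer on a typical 64-bit target).
struct Node {
    std::uint32_t next;   // byte offset of the next node, or kNull
    std::int32_t  value;
};

// Hypothetical arena: owns the bytes and hands out offsets into them.
class Arena {
public:
    // Appends a node and returns its offset from the start of the buffer.
    std::uint32_t push(std::int32_t value, std::uint32_t next) {
        const auto off = static_cast<std::uint32_t>(bytes_.size());
        bytes_.resize(bytes_.size() + sizeof(Node));
        const Node n{next, value};
        std::memcpy(bytes_.data() + off, &n, sizeof n);
        return off;
    }

    // Reads the node stored at the given offset.
    Node at(std::uint32_t off) const {
        assert(off + sizeof(Node) <= bytes_.size());
        Node n;
        std::memcpy(&n, bytes_.data() + off, sizeof n);
        return n;
    }

private:
    std::vector<unsigned char> bytes_;
};

// Walks the chain by offset and sums the values.
inline long sum_chain(const Arena& a, std::uint32_t head) {
    long total = 0;
    for (std::uint32_t off = head; off != kNull; off = a.at(off).next)
        total += a.at(off).value;
    return total;
}
```

Because nothing inside the buffer refers to an absolute address, `Arena b = a;` is a plain byte copy and the same head offset can be used with `b` without any fix-up pass.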
- Remove excessive template parameters
- Common part of buffer via type-erasure + trampolines
- Non-templated allocator
- Introduce .cpp files?
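
A rough sketch of what "type-erasure + trampolines" could look like here, under assumptions: `BufferRef`, `of` and `append_header` are hypothetical names, and the storage type is assumed to expose `size()` and `resize()`. The point is that only the small trampolines remain templated, so the common logic can be non-templated and could move into a .cpp file.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical non-templated handle: an erased object pointer plus plain
// function pointers ("trampolines") that remember the concrete type.
class BufferRef {
public:
    // The only templated entry point: captures the concrete storage type.
    template <class Storage>
    static BufferRef of(Storage& s) {
        return BufferRef(&s, &size_trampoline<Storage>, &grow_trampoline<Storage>);
    }

    std::size_t size() const { return size_(obj_); }
    void grow(std::size_t n) { grow_(obj_, n); }

private:
    using SizeFn = std::size_t (*)(const void*);
    using GrowFn = void (*)(void*, std::size_t);

    BufferRef(void* obj, SizeFn size, GrowFn grow)
        : obj_(obj), size_(size), grow_(grow) {
        assert(obj_ != nullptr);
    }

    // Trampolines: cast the erased pointer back and forward the call.
    template <class Storage>
    static std::size_t size_trampoline(const void* p) {
        return static_cast<const Storage*>(p)->size();
    }
    template <class Storage>
    static void grow_trampoline(void* p, std::size_t n) {
        auto* s = static_cast<Storage*>(p);
        s->resize(s->size() + n);
    }

    void*  obj_;
    SizeFn size_;
    GrowFn grow_;
};

// Common, non-templated code: grows the buffer by a hypothetical 4-byte
// header and returns the new size. Could be defined in a .cpp file.
inline std::size_t append_header(BufferRef buf) {
    buf.grow(4);
    return buf.size();
}
```

With `std::vector<unsigned char> v(2);`, calling `append_header(BufferRef::of(v))` grows `v` to 6 bytes without `append_header` being instantiated per storage type.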
It's much simpler to debug encode independently of decode