This repository has been archived by the owner on Sep 20, 2024. It is now read-only.
martin schouwenburg edited this page Apr 2, 2014 · 5 revisions

The Grid is the internal storage for raster-based data

Design Rationale

Raster imagery can be very big. This means that operations on a raster can take a lot of time and that optimizing access to this data can be worthwhile. The fastest way to access data is to have it directly in memory. But memory is limited, even more limited than the physical memory your computer appears to have: contiguous memory is (much) smaller, your OS might not allow full access, and other software might be running. So we need a way to keep "as much as possible" in main memory, while whatever does not fit is cached on local storage and swapped into memory on demand. A memory-mapped file as backing store is not sufficient, as it lacks the raw performance (it is fast, but not that fast). As far as the programmer knows, they are working directly on a memory image of the data. The grid takes care of all of these things.

The grid is implemented as a storage of doubles and not as a templated class. Though the templated approach seems to give more flexibility, it comes at a price. Data needs to flow from one data source to another, and this becomes rather troublesome if the data representation of one source isn't compatible in an obvious way with the representation at the other side. Programming against a generic "typename T" then becomes rather complex. Though template metaprogramming can be helpful here, it makes the code harder to maintain and rather unforgiving for the uninitiated. Doubles are simpler and cover almost all cases (grids are multi-dimensional). Note that this doesn't mean that the data is a double, but that it fits inside a double.
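To illustrate what "fits inside a double" means for integer data, here is a minimal sketch. It relies only on the fact that a 64-bit IEEE double has a 53-bit significand, so every integer with magnitude up to 2^53 is stored exactly. The names `kMaxExactInt` and `fitsInDouble` are hypothetical and not part of Ilwis-Objects.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical illustration, not Ilwis-Objects code.
// A double has std::numeric_limits<double>::digits (53) significand bits,
// so all integers in [-2^53, 2^53] are represented without loss.
constexpr std::int64_t kMaxExactInt =
    std::int64_t{1} << std::numeric_limits<double>::digits;

// Conservative check: true when the integer is guaranteed to survive
// being stored in a double cell and read back unchanged.
// (Some larger integers are also exact, but not all of them.)
constexpr bool fitsInDouble(std::int64_t value) {
    return value >= -kMaxExactInt && value <= kMaxExactInt;
}
```

Under this check, all 8-, 16- and 32-bit integer pixel values can be held in a double cell without loss; only 64-bit integers beyond 2^53 in magnitude would need care.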

Description

A grid is a stack of 2D rasters (1..n) in which all layers of the stack have the same raster size and pixel size. The z component is referred to as the ‘layer-index’ or ‘index’. A layer is always rectangular, as are its pixels. Each layer is subdivided into a number of blocks of a fixed number of raster lines (apart from the last block, which might have fewer lines). Blocks may reside in memory (preferred) or in a cache on disk.
Each layer starts at a new block. For example, a three-layer grid coverage with dimensions 1300 x 1200 might be represented as follows (assuming block size = 500 lines).
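The subdivision rule can be sketched in a few lines of code. This is only an illustration of the rule described here, not the actual Ilwis-Objects implementation; `BlockInfo` and `layoutBlocks` are hypothetical names. For the example it is assumed that 1200 is the number of raster lines per layer, which the text does not state explicitly.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical description of one block: which layer it belongs to,
// the first raster line it covers, and how many lines it holds.
struct BlockInfo {
    std::size_t layer;
    std::size_t firstLine;
    std::size_t lineCount;
};

// Splits every layer into blocks of linesPerBlock lines. Each layer
// starts at a new block, and the last block of a layer may be shorter.
std::vector<BlockInfo> layoutBlocks(std::size_t layers,
                                    std::size_t linesPerLayer,
                                    std::size_t linesPerBlock) {
    std::vector<BlockInfo> blocks;
    if (linesPerBlock == 0) {
        return blocks;  // nothing sensible to lay out
    }
    for (std::size_t layer = 0; layer < layers; ++layer) {
        for (std::size_t first = 0; first < linesPerLayer; first += linesPerBlock) {
            const std::size_t count = std::min(linesPerBlock, linesPerLayer - first);
            blocks.push_back({layer, first, count});
        }
    }
    return blocks;
}
```

With the assumption above, `layoutBlocks(3, 1200, 500)` yields nine blocks: three per layer, of 500, 500 and 200 lines.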

The framework determines how much memory may be consumed by this raster (based on the actual memory) and determines how many blocks can stay in memory (in many cases, all). If not all blocks fit in memory, it uses a simple queue to determine which blocks stay in memory and which don't. Access to a block promotes it to the top of the queue, so that often-accessed blocks (usually the block an algorithm is currently working on) stay on top and are not swapped out to the cache. The grid is thread-safe and can be used in a multithreaded environment. Many of the native algorithms in Ilwis-Objects run multiple threads for maximum performance. All cells use 64-bit double-precision numbers. Of course this doesn't mean the data always consists of floating-point numbers, but it will fit inside one.
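The queue behaviour described above resembles a least-recently-used policy. The following is a minimal sketch of such a queue, assuming blocks are identified by a plain index and that a single mutex is enough for thread safety. The class `BlockQueue` and its methods are hypothetical and do not reflect the actual Ilwis-Objects code.

```cpp
#include <cstddef>
#include <list>
#include <mutex>
#include <unordered_map>
#include <vector>

// Hypothetical sketch: tracks which blocks stay in memory.
class BlockQueue {
public:
    explicit BlockQueue(std::size_t maxInMemory) : _max(maxInMemory) {}

    // Registers an access to blockId and moves it to the top of the queue.
    // Returns the ids of blocks that fell off the bottom and should be
    // written to the disk cache by the caller.
    std::vector<std::size_t> touch(std::size_t blockId) {
        std::lock_guard<std::mutex> lock(_mutex);
        std::vector<std::size_t> evicted;
        auto found = _position.find(blockId);
        if (found != _position.end()) {
            // Already in memory: promote to the top, nothing is evicted.
            _order.splice(_order.begin(), _order, found->second);
            return evicted;
        }
        _order.push_front(blockId);
        _position[blockId] = _order.begin();
        while (_order.size() > _max) {
            const std::size_t victim = _order.back();  // least recently used
            _order.pop_back();
            _position.erase(victim);
            evicted.push_back(victim);
        }
        return evicted;
    }

    bool inMemory(std::size_t blockId) const {
        std::lock_guard<std::mutex> lock(_mutex);
        return _position.count(blockId) != 0;
    }

private:
    std::size_t _max;
    mutable std::mutex _mutex;
    std::list<std::size_t> _order;  // front = most recently used
    std::unordered_map<std::size_t, std::list<std::size_t>::iterator> _position;
};
```

A `std::list` combined with a hash map keeps both promotion and eviction at constant time, which matters because every block access passes through the queue.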
