
OS20

The Idea behind this approach:

The main idea is that some images contain a lot of unimportant information: for example, a person standing in the middle of a desert or on a wide street occupies only a small portion of the image. We could store the part where that person is faithfully and accept some loss on the unimportant information in the image (the parts surrounding that person).

The Algorithm:

My algorithm went through a lot of different implementation phases, but the final form consists of the following:

• A deep learning model used to determine the location of the main (important) object in the image. The model used can be found here: https://www.pyimagesearch.com/2017/09/11/object-detection-with-deep-learning-and-opencv/

• An averaging scheme that stores only a subset of pixels, roughly one third of the total pixels of the image.

• Quantizing the chosen pixels to values whose units digit is 3, 6 or 9.

• Applying Huffman encoding on the result of the quantization.

My algorithm essentially applies lossy averaging to the unimportant parts of the image, so that the important part is preserved and stays almost lossless. To achieve that, I initially set a ratio manually, taken from each edge of the image: for example, a quarter of the width from the left and right and a quarter of the height from the top and bottom, leaving a quarter of the image in the center. The lossy averaging method is applied to those edge regions, while the remaining central part is stored as it is. This way, roughly ¾ of the image is reduced to a third of its size and the remaining ¼ is kept untouched. Still, this approach was not satisfactory, since it applies loss to a large fixed portion of the image regardless of its content, which is not what we want for most images. So I found a deep learning model that detects objects in an image and bounds each one with a rectangle, which was exactly what I wanted, and I use it to determine the bounds for me. I take only the biggest object detected by the model, store its pixels untouched, and apply the averaging algorithm to everything around that object. There is another averaging method that causes less loss, but it only reduces the size to 2/3, which is not very optimal; it could be applied to the object itself, but it is better to leave the object untouched.
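As a rough sketch of that step (not the repository's exact code; the confidence threshold and helper name are assumptions), the largest object's bounding box can be obtained with OpenCV's dnn module and the MobileNet-SSD files passed on the command line shown at the end of this README:

```python
import cv2
import numpy as np

def largest_object_box(image, prototxt="MobileNetSSD_deploy.prototxt.txt",
                       model="MobileNetSSD_deploy.caffemodel", min_conf=0.2):
    """Return (x1, y1, x2, y2) of the largest object MobileNet-SSD detects,
    or None if nothing passes the confidence threshold."""
    net = cv2.dnn.readNetFromCaffe(prototxt, model)
    h, w = image.shape[:2]
    # MobileNet-SSD expects a 300x300 mean-subtracted blob.
    blob = cv2.dnn.blobFromImage(cv2.resize(image, (300, 300)),
                                 0.007843, (300, 300), 127.5)
    net.setInput(blob)
    detections = net.forward()          # shape: (1, 1, N, 7)

    best_box, best_area = None, 0
    for i in range(detections.shape[2]):
        confidence = detections[0, 0, i, 2]
        if confidence < min_conf:
            continue
        # Box coordinates are normalized to [0, 1]; scale back to the image.
        x1, y1, x2, y2 = (detections[0, 0, i, 3:7] * np.array([w, h, w, h])).astype(int)
        area = max(0, x2 - x1) * max(0, y2 - y1)
        if area > best_area:
            best_box, best_area = (x1, y1, x2, y2), area
    return best_box
```

Everything inside the returned box is stored untouched; the averaging below is applied only outside it.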

Regarding the averaging algorithms, I implemented several. One averaged blocks of four pixels, and on decoding all four pixels take that average.

Another averaged blocks of three pixels to be less lossy, but both of these algorithms caused pixelation of the image, so I applied a Gaussian blur afterwards to smooth the image a bit.
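A minimal sketch of the four-pixel-block variant, assuming 2×2 blocks and even image dimensions (the function names are illustrative, not the repository's):

```python
import numpy as np
import cv2

def encode_blocks(channel):
    """Average each 2x2 block of a single channel into one value."""
    h, w = channel.shape
    blocks = channel[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2)
    return blocks.mean(axis=(1, 3)).astype(np.uint8)

def decode_blocks(averages):
    """Replicate every average back into its 2x2 block, then blur to hide the pixelation."""
    restored = averages.repeat(2, axis=0).repeat(2, axis=1)
    return cv2.GaussianBlur(restored, (3, 3), 0)
```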

The latest averaging method, and the one currently used, averages along the diagonals. It gives results similar to the three-pixel-block method followed by blurring, but it does not require any blurring, since it already blurs the image slightly on its own and does not cause pixelation, while decreasing the size to roughly a third.

[Figure: chosen pixels for encoding]

The pixels in blue are the ones disregarded.
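The exact pattern is the one shown in the figure above; as a purely illustrative stand-in, keeping the pixels whose coordinates satisfy (row + col) % 3 == 0 also retains roughly one third of the image along diagonal lines:

```python
import numpy as np

def diagonal_mask(h, w):
    """Boolean mask keeping roughly one third of the pixels along diagonals
    (illustrative pattern, not necessarily the exact one used by OS20)."""
    rows, cols = np.indices((h, w))
    return (rows + cols) % 3 == 0

def select_pixels(channel):
    """Return the kept pixel values in row-major order."""
    return channel[diagonal_mask(*channel.shape)]
```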

After selecting these pixels, I quantize them to values whose units digit is 3, 6 or 9, and then apply Huffman encoding. Quantization is essential for the Huffman step to have a significant effect, since it shrinks the set of distinct values to encode.
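A sketch of these two steps, assuming that quantization means snapping each byte to the nearest value ending in 3, 6 or 9, and using a generic heap-based Huffman table builder (not necessarily the exact implementation in this repository):

```python
import heapq
from collections import Counter
import numpy as np

# Allowed levels: every value in 0..255 whose last decimal digit is 3, 6 or 9.
LEVELS = np.array([v for v in range(256) if v % 10 in (3, 6, 9)])

# Lookup table mapping every possible byte value to its nearest allowed level.
_LUT = np.array([LEVELS[np.abs(LEVELS - v).argmin()] for v in range(256)], dtype=np.uint8)

def quantize(pixels):
    """Snap each pixel to the nearest allowed level (shrinks the alphabet,
    which is what makes the subsequent Huffman coding effective)."""
    return _LUT[np.asarray(pixels, dtype=np.uint8)]

def huffman_codes(symbols):
    """Build a {symbol: bitstring} table from symbol frequencies,
    e.g. huffman_codes(quantize(pixels).ravel())."""
    heap = [[freq, i, [sym, ""]] for i, (sym, freq) in enumerate(Counter(symbols).items())]
    heapq.heapify(heap)
    next_id = len(heap)                      # unique tie-breaker for the heap
    while len(heap) > 1:
        lo = heapq.heappop(heap)
        hi = heapq.heappop(heap)
        for pair in lo[2:]:                  # prefix '0' to the lighter subtree
            pair[1] = "0" + pair[1]
        for pair in hi[2:]:                  # prefix '1' to the heavier subtree
            pair[1] = "1" + pair[1]
        heapq.heappush(heap, [lo[0] + hi[0], next_id] + lo[2:] + hi[2:])
        next_id += 1
    return {sym: code for sym, code in heap[0][2:]}
```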

Decoding starts from the bottom-left corner: any missing pixel that is surrounded by at least two original pixels is reconstructed as the average of those pixels. I then continue to the remaining pixels, averaging whichever already-known pixels surround them.

[Figure: chosen pixels for encoding]
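The order of reconstruction depends on the stored-pixel pattern; the sketch below captures the idea under the assumptions that "surrounded" means the 4-neighbourhood and that the two-known-neighbour requirement is relaxed to one when a pass gets stuck:

```python
import numpy as np

def decode_fill(values, mask):
    """Rebuild a channel from the stored pixel `values` (row-major) and the
    boolean `mask` of which positions were kept. Missing pixels are filled in
    repeated passes, scanning from the bottom-left corner and averaging their
    already-known 4-neighbours."""
    assert mask.any(), "need at least one stored pixel"
    h, w = mask.shape
    out = np.zeros((h, w), dtype=np.float32)
    known = mask.copy()
    out[known] = values
    need = 2                                  # require two known neighbours first
    while not known.all():
        progress = False
        for i in range(h - 1, -1, -1):        # bottom row first
            for j in range(w):                # left to right
                if known[i, j]:
                    continue
                neigh = [out[i + di, j + dj]
                         for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))
                         if 0 <= i + di < h and 0 <= j + dj < w and known[i + di, j + dj]]
                if len(neigh) >= need:
                    out[i, j] = sum(neigh) / len(neigh)
                    known[i, j] = True
                    progress = True
        need = 2 if progress else 1           # relax the requirement if stuck
    return np.clip(out, 0, 255).astype(np.uint8)
```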

Compared to other Compression Algorithms

| Algorithm | OS20 | JPEG | PNG |
|---|---|---|---|
| Compression ratio | 3.5 – 10.1 | 10 | 2.7 |
| Encoding time | 4.5 s – 45 s | 0.4 s | 6 s |
| Decoding time | 6 s – 47 s | 0.1 s | 1.5 s |
| Peak SNR (dB) | 47 | 35 | Lossless |

My results were taken on images ranging from 136 KB to 1.103 MB.

Encoding and decoding times are approximate, as they vary with image size and there were no good sources for accurate JPEG and PNG timings.

Note: all times mentioned for my algorithm are from running it in Python; a C++ implementation would be much faster.
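For reference, the peak SNR values in the table are presumably the standard PSNR in dB; a small helper one could use to reproduce such a comparison (an assumption, not code from this repository):

```python
import numpy as np

def psnr(original, decoded, peak=255.0):
    """Standard peak signal-to-noise ratio in dB between two uint8 images."""
    mse = np.mean((original.astype(np.float64) - decoded.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(peak ** 2 / mse)
```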

Some Results:

Images on the left (or on top) are the original ones.

[Figure: original and decoded image]

For example, this image has the best results (4.5 seconds encoding time, 6 seconds decoding time and a compression ratio of 10).

[Figure: original and decoded image]

This is the largest one, with 45 seconds encoding, 47 seconds decoding and a compression ratio of only 3.5.

[Figure: original and decoded image]

This one has a compression ratio of 4.38.

[Figure: original and decoded image]

Lastly, this image has a compression ratio of 4.45.

To run the project, run the following command in the terminal:

```
python main.py --prototxt MobileNetSSD_deploy.prototxt.txt --model MobileNetSSD_deploy.caffemodel
```
