Data Compression: A Way to Reduce Data Size



Data compression is the process of reducing the size of data by removing redundant information. Compressing a file saves storage space and shortens the time needed to transfer it. Many compression algorithms exist, each suited to different kinds of data. Most of the images we see online are compressed in formats such as JPEG or GIF, and some file systems can compress files automatically as they are stored. Large files that are not compressed automatically, we compress ourselves. Compression involves two components: an encoding algorithm and a decoding algorithm. The encoding algorithm generates a compressed representation of a message, and the decoding algorithm takes that compressed representation and reconstructs the original message or, with lossy methods, a close approximation of it.
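As a rough illustration of the encoding and decoding steps, here is a minimal sketch that uses Python's standard zlib module. The sample message is made up for the example, and zlib is only one of many possible encoders, chosen here because it ships with Python.

```python
import zlib

# A made-up, highly repetitive message (an assumption for this example).
message = b"to be or not to be, that is the question. " * 20

compressed = zlib.compress(message)     # encoding: build the compressed representation
restored = zlib.decompress(compressed)  # decoding: rebuild the message from it

print(len(message), "bytes before,", len(compressed), "bytes after")
print("round trip exact:", restored == message)
```

Because zlib is a lossless encoder, the restored bytes match the input exactly. How much smaller the compressed form is depends on how repetitive the input is.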

Data compression algorithms fall into two classes: lossless and lossy.

The terms lossless and lossy describe the difference between the two kinds of compression. With lossless compression, every single bit of data that was originally in the file is still there once the file is uncompressed. A lossless algorithm can restore the original message exactly from its compressed state. When losing words or financial data would pose a problem, lossless compression is the appropriate technique, which is why it is necessary for text and other data that must come back unchanged. The trade-off is that lossless algorithms generally cannot reach the high compression levels that lossy algorithms can. Huffman coding, run-length encoding, arithmetic coding and dictionary-based encoding are some of the main techniques used in lossless data compression.
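Run-length encoding is the simplest of these techniques to show in code. The sketch below is a minimal illustration, not a production codec; the function names rle_encode and rle_decode and the sample string are chosen only for this example.

```python
from itertools import groupby

def rle_encode(text):
    """Collapse each run of identical characters into a (char, count) pair."""
    return [(ch, len(list(run))) for ch, run in groupby(text)]

def rle_decode(pairs):
    """Expand each (char, count) pair back into the run it stands for."""
    return "".join(ch * count for ch, count in pairs)

original = "AAAABBBCCD"
encoded = rle_encode(original)
decoded = rle_decode(encoded)

print(encoded)              # [('A', 4), ('B', 3), ('C', 2), ('D', 1)]
print(decoded == original)  # True
```

The decoded string is identical to the input, which is what makes the scheme lossless. Run-length encoding only saves space when the data contains long runs; on text with few repeated characters, the list of pairs can end up larger than the original.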

Lossy algorithms are used to compress images and sound, because a small loss in resolution is usually not noticeable once the image or sound is decompressed. When a few errors are allowed in the reconstructed data, higher levels of compression can be achieved. Lossy compression is usually more challenging to design, because the reconstructed pixels must stay close to the original pixels. A lossy algorithm also carries a risk of error accumulation: each time the data is decompressed and compressed again, more information can be lost. Lossy compression techniques cannot reconstruct the original message from its compressed state without losing some information.
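One simple way to see where the loss comes from is quantization, where nearby values are mapped to the same code. The sketch below is an illustration only: the pixel values, the step size of 16 and the helper names quantize and dequantize are assumptions made for the example, and real image codecs such as JPEG are considerably more involved.

```python
def quantize(samples, step=16):
    """Lossy encode: keep only which step-sized bucket each 0-255 value falls in."""
    return [s // step for s in samples]

def dequantize(codes, step=16):
    """Decode: map each bucket back to its midpoint value."""
    return [c * step + step // 2 for c in codes]

pixels = [12, 13, 15, 200, 203, 90]   # made-up 8-bit brightness values
codes = quantize(pixels)              # each code now fits in 4 bits instead of 8
approx = dequantize(codes)
errors = [abs(a - b) for a, b in zip(pixels, approx)]

print(codes)   # [0, 0, 0, 12, 12, 5]
print(approx)  # [8, 8, 8, 200, 200, 88]
print(errors)  # [4, 5, 7, 0, 3, 2]
```

The reconstructed values are close to the originals but not equal to them, and the exact originals cannot be recovered from the codes. A larger step gives fewer distinct codes and therefore more compression, at the price of larger errors.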

Lossy compression generally achieves better space efficiency than lossless compression, and it can also be faster, although that depends on the specific algorithms being compared. The performance of either kind of algorithm depends on the size of the source file and on how the symbols in it are organized. The choice of compression algorithm therefore depends on the size and type of the file to be compressed, as well as on how much space is available and how much space needs to be saved.
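The effect of symbol organization can be seen with a small experiment. The sketch below again uses Python's zlib module as a stand-in for a lossless compressor, and compresses two inputs that contain exactly the same bytes in a different order. The data is randomly generated for the example, and the helper name compressed_size is hypothetical.

```python
import random
import zlib

def compressed_size(data):
    """Return the number of bytes zlib needs to store the given data."""
    return len(zlib.compress(data))

rng = random.Random(0)  # fixed seed so the run is repeatable
shuffled = bytes(rng.randrange(256) for _ in range(10_000))
ordered = bytes(sorted(shuffled))  # same symbols, arranged into long runs

print("original size:  ", len(shuffled))
print("shuffled input: ", compressed_size(shuffled))
print("ordered input:  ", compressed_size(ordered))
```

Both inputs have the same size and the same symbol counts, yet the ordered one compresses far better because its long runs are easy for the compressor to exploit. The shuffled input has almost no structure, so it barely shrinks at all.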