Simple Compression! Compressing a sequence of characters using run-length encoding


Not one that is in the book, but I have used this with students in class to investigate simple ways to compress a small bitmap image of black and white pixels.







Given a bitmap image encoded using 'B' for a black pixel and 'W' for a white pixel...

    WWBBBBWW
    WBBBBBBW
    BBWBBWBB
    BBWBBWBB
    BBBBBBBB
    BBWWWWBB
    WBBWWBBW
    WWBBBBWW

...the program will look for contiguous blocks of pixels (those that are all the same colour). For each block of pixels that are the same colour, it will display the colour of the block and the number of pixels that it contains. This data can be used in many cases (but not all!) to store the image using less data.
The first row of the image above can then be encoded as W24BW2. This means "White pixels: block of 2, Black pixels: block of 4, White pixels: block of 2."

Knowing that the image is on an 8x8 grid, the entire sequence becomes:

    W2B4W3B6W1B2W1B2W1B4W1B2W1B8W4B2W1B2W2B2W3B4W2

Get a piece of squared paper and try it!

 If you want to try out the program, select and copy the C++ source-code at the bottom of this post.


Paste into a text editor, such as Nano or Geany.

Then save the new file, ending in .cpp

I used  comp.cpp



To compile from the command-line:



    g++ -o comp comp.cpp



To run from the command-line:



    ./comp



Here's the code:



#include <iostream>     // Program uses screen and keyboard

#include <sstream>      // To convert between integer/string values

using namespace std;    // Simple abbreviations for screen, keyboard, end-of-line

int main()
{

    // Data input
    cout << "Type in your string of data using 'b' and 'w' for black and white: " << endl;
    string data;
    cin >> data;
    cout << "Thanks." << endl;



    // Validate data - check it is ready for processing
    int len = data.length();


    // Start at the first character
    int pos = 0;


    // Start by assuming there are no errors
    bool allOK = true;


    // Examine each symbol in the string
    while ( allOK == true and pos < len )
    {


        // Deal with any invalid character
        if ( data.at( pos ) != 'w'  and  data.at( pos ) != 'b' )
        {
            allOK = false;
            cout << "Problem in data!" << endl;
        }
  // end of if


        pos = pos + 1;

    }  // end of while


    // Compress the valid data if no problems were found in the original data
    if ( allOK == true )
    {
        cout << "Compressing..." << endl;

        char currentLetter;
        char currentBlockColour;
        int blockLength = 0;
        string compressedData = "";


        // Start at beginning of data string
        pos = 0;


        while ( pos < len )
        {
            currentLetter = data.at( pos );  // Get single letter

            cout << "At position " << pos << " it is a " << currentLetter << endl;

            if ( pos==0 )
            {

                // If is the first char, then it's the start of a new block
                blockLength = 1;
                currentBlockColour = currentLetter;
                cout << "Colour " << currentLetter << endl;
                cout << "Started new " << currentLetter << " block" << endl;
            }
            else
            {

                // Not the first character - is it the same colour as previous character
                if ( currentLetter == currentBlockColour )
                {
                    cout << "Still in the same colour..." << endl;
                    blockLength = blockLength + 1;
                    cout << "Block length is now " << blockLength << endl;

                }
                else
                {

                  // They are different colours - end of one colour block, start of another
                  // Put the number on to the compressed string

                  stringstream converter;
                  converter << blockLength;
                  string asAString;
                  converter >> asAString;

                  // Add the number and the colour code to the compressed string
                  compressedData.append( asAString );
                  compressedData = compressedData + currentBlockColour;

                  // Change the block to the new colour
                  currentBlockColour = currentLetter;
                  blockLength = 1;

                }  // end of if-else
            }  // end of if-else

            cout << "Compressed data so far is: " << compressedData << endl << endl;

            cout << "- - - - - - - - - - - -" << endl << endl;

            pos++;

            // If this is the last letter at the end of the string
            // Dump out the block

            if ( pos == len )
            {
                cout << "END OF ORIGINAL STRING - DUMP OUT THE FINAL BLOCK." << endl;


                stringstream converter;
                converter << blockLength;
                string asAString;
                converter >> asAString;


                // Add the number and the colour code to the compressed string
                compressedData.append( asAString );
                compressedData = compressedData + currentBlockColour;
            }
            else
              cout << "Still in the string - keep going..." << endl;
  // end of the string - block dumped!


        }  // end of the while loop that examines the string


        // DISPLAY FINAL RESULTS!        cout << "COMPRESSED DATA: " << compressedData << endl;
        cout << endl << "Thank you and good-night!" << endl;
    }
    else
     {
        cout << "Not compressed." << endl;
        return 13;  // Report an error code
    }


    return 0;
}



Find out more - buy the book on Amazon...