4-bit Encoder/Decoder

12 Feb 2019, CPOL
Code for a 4-bit encoder to store 15 different symbols with higher efficiency

Introduction

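This class packs a string of 8-bit characters into 4-bit symbols. A 4-bit value has 16 states and one of them is reserved as the end marker, so at most 15 different characters are allowed.

In other words, two 8-bit characters of the input are stored in one 8-bit character of the output.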
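
As a minimal sketch of the packing idea, the snippet below combines two 4-bit table indices into one byte and splits them again. The helper names `pack_pair` and `unpack_pair` are made up for illustration and are not part of the class shown later.

```python
def pack_pair(high, low):
    """Store two 4-bit indices (0..15) in a single byte value."""
    return (high << 4) | low


def unpack_pair(byte):
    """Recover the two 4-bit indices from a byte value."""
    return (byte & 0xF0) >> 4, byte & 0x0F


# In the default table, '6' has index 7 and '-' has index 11.
packed = pack_pair(7, 11)
print(hex(packed))          # 0x7b
print(unpack_pair(packed))  # (7, 11)
```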

Through this conversion, a string needs only about half the space of a regular string. This can be useful for large amounts of data that use at most 15 different characters, such as phone numbers.
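
To make "about half" concrete, here is a small sketch of the size arithmetic. It assumes the scheme used by the encoder in this article: one END symbol is appended, the symbol count is padded to an even number, and two symbols share one output character. The function name `packed_length` is hypothetical.

```python
def packed_length(num_symbols):
    """Number of output characters for an input of num_symbols characters."""
    total = num_symbols + 1      # one END symbol is always appended
    if total % 2 != 0:
        total += 1               # pad so that symbols pair up evenly
    return total // 2


for n in (0, 10, 11, 12):
    print(n, "->", packed_length(n))
```

An 11-character phone number therefore needs 6 output characters, while a 12-character one needs 7.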

Background

I thought that storing telephone numbers in a database as strings was a waste of memory, but storing them as integers is not possible either. My solution was to use an encoded string.
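
The article does not spell out why integers do not work. Two likely reasons are leading zeros and separator characters, which the following sketch illustrates. The helper `survives_int_round_trip` is an illustrative name, not part of the article's code.

```python
def survives_int_round_trip(text):
    """True if text can be stored as an int and read back unchanged."""
    try:
        return str(int(text)) == text
    except ValueError:
        # int() cannot parse separators such as '-'
        return False


print(survives_int_round_trip("49301234"))     # True
print(survives_int_round_trip("0049301234"))   # False: leading zeros are lost
print(survives_int_round_trip("067-845-512"))  # False: '-' is not a digit
```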

Using the Code

Below is the implementation of the class. At the bottom, a test() function shows how to use the code.

To customize the symbols that can be encoded, change Encode4Bits._mappingTable. The first entry must remain the END marker, so never use more than 15 custom values.
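
One possible way to prepare a custom table is sketched below. The helper `build_mapping_table` is hypothetical. It only checks the constraints described above (END first, at most 15 unique single-character symbols) and pads the rest with empty strings, as the default table does.

```python
END = '\0'


def build_mapping_table(symbols):
    """Return a 16-entry table: END marker, the symbols, then '' padding."""
    if len(symbols) > 15:
        raise ValueError("at most 15 symbols fit into 4 bits next to END")
    if len(set(symbols)) != len(symbols):
        raise ValueError("symbols must be unique")
    if any(len(s) != 1 for s in symbols) or END in symbols:
        raise ValueError("each symbol must be one character other than END")
    return [END] + list(symbols) + [''] * (15 - len(symbols))


# Digits plus '+', '#', '*' and a space: 14 symbols, one slot left unused.
table = build_mapping_table("0123456789+#* ")
print(len(table))  # 16
```

The resulting list could then be assigned to `_mappingTable` on an `Encode4Bits` instance.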

Python
class Encode4Bits:
    def __init__(self):
        # first element is always "END"
        self._mappingTable = ['\0', \
                              '0','1','2','3','4','5','6','7','8','9', \
                              '-','','','','']

    def _encodeCharacter(self,char):
        """@return index of element or None, if not exists"""
        for p in range(len(self._mappingTable)):
            if(char == self._mappingTable[p]):
                return p
        return None

    def encode(self, string):
        strLen = len(string)

        # ===== 1. map all chars to an index in our table =====
        mappingIndices = []
        for i in range(strLen):
            char = string[i]
            index = self._encodeCharacter(char)
            if(index is None):
                raise ValueError("ERROR: Could not encode '" + char + "'.")
            mappingIndices.append(index)
        mappingIndices.append(0)
        
        # ===== 2. Make num values even =====
        # 4 bit => 2 chars in one byte. Therefore: need even num values
        if(len(mappingIndices) % 2 != 0):
            mappingIndices.append(0)

        # ===== 3. create string =====
        ret = ""
        for i in range(0, len(mappingIndices), 2):
            # first index goes into the high nibble, second into the low nibble
            mixed = (mappingIndices[i] << 4) | mappingIndices[i+1]
            ret += chr(mixed)

        return ret

    def decode(self, string):
        ret = ""
        for char in string:
            index1 = (ord(char) & 0xF0) >> 4
            index2 = (ord(char) & 0x0F)
            for index in (index1, index2):
                # index 0 is "END": stop instead of emitting '\0' characters
                if(index == 0):
                    return ret
                ret += self._mappingTable[index]

        return ret

def test():
    numberCompressor = Encode4Bits()
    original = "067-845-512"
    encoded = numberCompressor.encode(original)
    decoded = numberCompressor.decode(encoded)
    print(decoded == original)
    print(len(decoded))
    print(len(encoded))


if __name__ == "__main__":
    test()
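
One caveat about the class above: it returns a Python str. Characters with code points 128 to 255 take two bytes each when such a string is stored as UTF-8, so the saving depends on how the result is written out. The following sketch is a variant that produces bytes instead. It is not the article's implementation. The names `SYMBOLS`, `pack` and `unpack` are assumptions, and the symbol set mirrors the default table.

```python
SYMBOLS = "\0" + "0123456789-"   # index 0 is the END marker


def pack(text):
    """Encode text into bytes, two 4-bit indices per byte."""
    # index(c, 1) skips the END marker; unknown symbols raise ValueError
    indices = [SYMBOLS.index(c, 1) for c in text]
    indices.append(0)                      # END marker
    if len(indices) % 2 != 0:
        indices.append(0)                  # pad to an even count
    return bytes((indices[i] << 4) | indices[i + 1]
                 for i in range(0, len(indices), 2))


def unpack(data):
    """Decode bytes produced by pack(), stopping at the END marker."""
    out = []
    for byte in data:
        for index in (byte >> 4, byte & 0x0F):
            if index == 0:
                return "".join(out)
            out.append(SYMBOLS[index])
    return "".join(out)


packed = pack("067-845-512")
print(len(packed))                  # 6
print(unpack(packed))               # 067-845-512
print(len("\x8b".encode("utf-8")))  # 2: one such str character costs two bytes
```

Keeping the existing str output and encoding it with `latin-1` when writing would be another option, since that codec maps code points 0 to 255 to one byte each.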

History

  • 8th February, 2019: Initial version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL).


Written By
Engineer Telefonica Germany
Germany

Comments and Discussions

 
Bug: Error Decoding
Gammed, 11-Feb-19 1:04
I just checked and noticed that the decoded string is not equal to the original input string.

The cause is in the decode function:
index1 = (ord(char) & 120) >> 4
The value 120 needs to be 240.

When using bitwise operators like & or |, it is better to use hexadecimal literals, for example:
index1 = (ord(char) & 0xF0) >> 4
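
To illustrate the reported mask problem: 120 is 0x78 (binary 01111000), so it drops the top bit of the high nibble, and every index of 8 or more decodes to the wrong symbol. The helper `high_nibble` below is a hypothetical name used only for this demonstration.

```python
def high_nibble(byte, mask):
    """Extract the upper 4 bits of byte using the given mask."""
    return (byte & mask) >> 4


byte = (11 << 4) | 3            # high index 11 ('-'), low index 3 ('2')
print(high_nibble(byte, 120))   # 3: the top bit is masked away
print(high_nibble(byte, 0xF0))  # 11: correct
```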

Praise: Thank you!
D4rkTrick, 12-Feb-19 15:22
