4-bit Encoder/Decoder

12 Feb 2019, CPOL
Code for a 4-bit encoder to store 15 different symbols with higher efficiency

Introduction

Converts a string of 8-bit characters into a string that uses 4 bits per character (at most 15 different characters are allowed).

In other words: two 8-bit characters are packed into one 8-bit character.

Through this conversion, a string can be stored in roughly half the size of a usual string. This can be useful for large amounts of data that use at most 15 different characters (such as phone numbers).
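
To make the packing concrete, here is a minimal sketch of how two 4-bit table indices share one byte. The helper names `pack_pair` and `unpack_pair` are made up for this illustration and are not part of the class shown later.

```python
def pack_pair(high, low):
    """Pack two 4-bit values (0..15) into one 8-bit value."""
    if not (0 <= high <= 15 and 0 <= low <= 15):
        raise ValueError("both values must fit into 4 bits")
    return (high << 4) | low


def unpack_pair(byte):
    """Split an 8-bit value back into its two 4-bit halves."""
    return (byte & 0xF0) >> 4, byte & 0x0F


packed = pack_pair(7, 12)
print(hex(packed))          # 0x7c
print(unpack_pair(packed))  # (7, 12)
```

The first value lands in the upper four bits and the second in the lower four, so a single byte carries two symbols.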

Background

I was thinking that storing telephone numbers in a database as strings is a waste of memory. Storing them as integers is not possible either. My solution was to use an encoded string.
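
A quick illustration of why a plain integer does not work, using a made-up number: leading zeros disappear, and separators such as `-` cannot be parsed at all.

```python
number = "0678455"            # made-up example number with a leading zero

as_int = int(number)
print(as_int)                 # 678455
print(str(as_int) == number)  # False: the leading zero is lost

try:
    int("067-845-512")
except ValueError:
    print("'-' cannot be stored in an integer")
```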

Using the Code

Below is the implementation of the class. At the bottom, there is a test() function that shows how to use the code.

To customize the symbols that can be encoded, change Encode4Bits._mappingTable. Never use more than 15 custom values: index 0 is reserved as the end marker, which leaves 15 of the 16 possible 4-bit values.

Python
class Encode4Bits:
    def __init__(self):
        # first element is always "END"; '' marks an unused slot
        self._mappingTable = ['\0',
                              '0','1','2','3','4','5','6','7','8','9',
                              '-','','','','']

    def _encodeCharacter(self,char):
        """@return index of element or None, if not exists"""
        for p in range(len(self._mappingTable)):
            if(char == self._mappingTable[p]):
                return p
        return None

    def encode(self, string):
        strLen = len(string)

        # ===== 1. map all chars to an index in our table =====
        mappingIndices = []
        for i in range(strLen):
            char = string[i]
            index = self._encodeCharacter(char)
            if(index is None):
                raise ValueError("ERROR: Could not encode '" + char + "'.")
            mappingIndices.append(index)
        mappingIndices.append(0)
        
        # ===== 2. Make num values even =====
        # 4 bit => 2 chars in one byte. Therefore: need even num values
        if(len(mappingIndices) % 2 != 0):
            mappingIndices.append(0)

        # ===== 3. create string =====
        # pack two 4-bit indices into each character: first index in the high 4 bits
        ret = ""
        for i in range(0, len(mappingIndices), 2):
            val1 = mappingIndices[i] << 4
            val2 = mappingIndices[i+1]
            ret += chr(val1 | val2)

        return ret

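    def decode(self, string):
        ret = ""
        for char in string:
            index1 = (ord(char) & 0xF0) >> 4
            index2 = (ord(char) & 0x0F)
            # index 0 is the "END" marker (or padding): stop decoding there
            if index1 == 0:
                break
            ret += self._mappingTable[index1]
            if index2 == 0:
                break
            ret += self._mappingTable[index2]

        return ret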

def test():
    numberCompressor = Encode4Bits()
    encoded = numberCompressor.encode("067-845-512")
    decoded = numberCompressor.decode(encoded)
    print(len(decoded))
    print(len(encoded))


if __name__ == "__main__":
    test()
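
One caveat with the class above: in Python 3, `chr()` returns a Unicode character, and characters above 0x7F take more than one byte once the string is encoded as UTF-8. The following is a sketch of a `bytes`-based variant of the same scheme (index 0 as end marker, two indices per byte). It is not the author's implementation; the names `SYMBOLS`, `encode_bytes` and `decode_bytes` are assumptions made for this example.

```python
# Index 0 is reserved as the end marker, so up to 15 symbols can follow it.
SYMBOLS = "0123456789-"


def encode_bytes(text, symbols=SYMBOLS):
    """Pack each character of text into 4 bits and return a bytes object."""
    if len(symbols) > 15:
        raise ValueError("at most 15 symbols fit into 4 bits")
    lookup = {ch: i + 1 for i, ch in enumerate(symbols)}
    try:
        indices = [lookup[ch] for ch in text]
    except KeyError as err:
        raise ValueError("Could not encode " + repr(err.args[0])) from None
    indices.append(0)               # end marker
    if len(indices) % 2 != 0:
        indices.append(0)           # pad to an even number of values
    return bytes((indices[i] << 4) | indices[i + 1]
                 for i in range(0, len(indices), 2))


def decode_bytes(data, symbols=SYMBOLS):
    """Unpack bytes produced by encode_bytes, stopping at the end marker."""
    chars = []
    for byte in data:
        for index in (byte >> 4, byte & 0x0F):
            if index == 0:          # end marker or padding
                return "".join(chars)
            chars.append(symbols[index - 1])
    return "".join(chars)


encoded = encode_bytes("067-845-512")
print(len("067-845-512"), len(encoded))   # 11 6
print(decode_bytes(encoded))              # 067-845-512
```

A different symbol set can be passed as the second argument, for example `encode_bytes("+49 30", "0123456789+ ")`, as long as the same set is used for decoding.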

History

  • 8th February, 2019: Initial version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Engineer Telefonica Germany
Germany

Comments and Discussions

 
Re: Not Integers (YvesDaoust, 11-Feb-19 2:25)
Using 64-bit integers goes in the opposite direction of compression! 10 digits can be packed into 40 bits using BCD, and into 34 bits using plain binary.
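
The figures in this reply can be checked with a few lines of arithmetic (a sketch; the variable names are chosen for this example). A 4-bit-per-digit encoding spends 4 bits on each decimal digit, while a plain binary integer needs enough bits to represent the largest 10-digit value.

```python
digits = 10

four_bit_total = digits * 4                    # 4 bits per decimal digit
binary_bits = (10**digits - 1).bit_length()    # bits needed for 9,999,999,999

print(four_bit_total)  # 40
print(binary_bits)     # 34
```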

