
What is Base64 and how to base anything?

When you see a string like RG8gdSBrbm93IEJBU0U2ND8=, a run of seemingly random letters and digits with = at the end, there is a good chance it is a Base64 string. How does it work? And how do you write your own BaseShittyThings?

Before we start, let me make one thing clear: Base64 (or Base32, Base16, ...) is not an encryption method. It’s a way to encode binary data as printable characters.

There are many Base-like encodings. Here I’ll just walk through the classic one, Base64:

  • Convert your text to ASCII and write each character as 8 binary bits (any other source of binary data works too, as long as you end up with bits).
  • From left to right, take 6 bits at a time and read each group as a decimal number. That number is the table index.
  • Look up the encoded character at that index in the table.
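
The three steps can be sketched in a few lines of C++. This is only a minimal illustration, assuming the standard Base64 alphabet from RFC 4648; the names `kBase64Table` and `encodeBase64Sketch` are made up for this example. It also zero-fills leftover bits and appends = so the result lines up with what a normal Base64 tool prints.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Standard Base64 alphabet (RFC 4648): position in the string = table index.
static const std::string kBase64Table =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Hypothetical helper name; a sketch of the steps, not a tuned encoder.
std::string encodeBase64Sketch(const std::string& text) {
  assert(kBase64Table.size() == 64);

  // Step 1: turn every byte into 8 bits, most significant bit first.
  std::vector<bool> bits;
  for (unsigned char c : text)
    for (int i = 7; i >= 0; --i) bits.push_back((c >> i) & 1);

  // Zero-fill so the bit count is a multiple of 6.
  while (bits.size() % 6 != 0) bits.push_back(false);

  // Steps 2 and 3: read 6 bits as an index, then look up the character.
  std::string out;
  for (std::size_t i = 0; i < bits.size(); i += 6) {
    unsigned index = 0;
    for (int j = 0; j < 6; ++j) index = (index << 1) | bits[i + j];
    out.push_back(kBase64Table[index]);
  }

  // Standard Base64 pads the output with '=' to a multiple of 4 characters.
  while (out.size() % 4 != 0) out.push_back('=');
  return out;
}
```

Calling `encodeBase64Sketch("Do u know BASE64?")` gives back the string from the top of this post, RG8gdSBrbm93IEJBU0U2ND8=.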


Okay, here are some questions you might ask:

  • Why 8-bit?
    • Because that’s how ASCII text is stored: one 8-bit byte per character. Actually it doesn’t matter; all the encoder needs is a stream of bits.
  • Why every 6 bit?
    • Since we are using Base64, the index range is 0-63. $63 = 111111_2$, so 6 bits ($2^6 = 64$) is the widest group whose every value still fits in our Base64 table.
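  • What if Original.length % 6 != 0 (the bit length is not a multiple of 6)?
    • Simply append 0 bits to the end of the binary string until its length is a multiple of 6. Most of the time we also append = to the end of the encoded string as a padding mark; standard Base64 adds enough = to make the output length a multiple of 4.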
  • What if I wanna write Base63? 63 is not a power of 2.
    • With the bit-slicing method above, you can’t. Indexing 63 symbols still takes 6 bits, and if your binary contains a group like 111111 (63 in decimal), it goes out of your table’s range of 0-62.
    • You can still do it another way, but then it is number-system conversion (treat the whole input as one big number and rewrite it in base 63) rather than a bit operation.
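
To make the "number-system conversion" answer a bit more concrete, here is a rough sketch. It treats the input bytes as one big base-256 number and repeatedly divides it by the table size, collecting remainders as digits. Everything named here (`kBase63Table`, `toBaseN`) is hypothetical, and the 63-symbol alphabet is simply the Base64 one with `/` dropped.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical 63-symbol alphabet: the Base64 alphabet without '/'.
static const std::string kBase63Table =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+";

// Rewrites a big-endian base-256 number in base alphabet.size().
// `number` is taken by value because the long division overwrites it.
std::string toBaseN(std::vector<std::uint8_t> number,
                    const std::string& alphabet) {
  const unsigned base = static_cast<unsigned>(alphabet.size());
  assert(base >= 2 && base <= 256);  // keeps every quotient digit in a byte

  std::string out;
  std::size_t start = 0;  // index of the first byte that may be non-zero
  while (start < number.size()) {
    // One pass of schoolbook long division of `number` by `base`.
    unsigned remainder = 0;
    for (std::size_t i = start; i < number.size(); ++i) {
      unsigned cur = remainder * 256 + number[i];
      number[i] = static_cast<std::uint8_t>(cur / base);
      remainder = cur % base;
    }
    out.push_back(alphabet[remainder]);  // least significant digit first
    // Skip bytes that have become leading zeros of the quotient.
    while (start < number.size() && number[start] == 0) ++start;
  }
  std::reverse(out.begin(), out.end());
  return out;
}
```

For example, the single byte 63 becomes the two digits 1 and 0, so `toBaseN({63}, kBase63Table)` returns "BA". One caveat of this approach is that leading zero bytes vanish (a number has no leading zeros), so a real scheme would have to record the length or count those zeros separately.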

Here is an example of my own Base8 code:

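#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Encodes `text` with an arbitrary symbol table. Only the first 2^k entries
// of `baseTable` are used, where k = floor(log2(baseTable.size())).
std::vector<std::string> getYourBaseCode(
    const std::string& text, const std::vector<std::string>& baseTable) {
  if (baseTable.size() < 2)
    throw std::invalid_argument("baseTable needs at least 2 symbols");

  // Bits per symbol, rounded down so an index can never leave the table.
  // __builtin_clzll is a GCC/Clang builtin (count leading zeros, 64-bit).
  const std::size_t baseBitLength = 63 - __builtin_clzll(baseTable.size());

  // Unpack every byte into 8 bits, most significant bit first.
  std::vector<bool> textBits;
  for (unsigned char c : text)
    for (int i = 7; i >= 0; i--) textBits.push_back((c >> i) & 1);

  // Zero-pad so the bit count is a multiple of baseBitLength.
  if (textBits.size() % baseBitLength)
    textBits.resize(
        textBits.size() + (baseBitLength - textBits.size() % baseBitLength),
        false);

  // Read baseBitLength bits at a time and look up the matching symbol.
  std::vector<std::string> res;
  const std::size_t symbolCount = textBits.size() / baseBitLength;
  for (std::size_t i = 0; i < symbolCount; i++) {
    std::size_t index = 0;
    for (std::size_t j = 0; j < baseBitLength; j++)
      index = (index << 1) | textBits[i * baseBitLength + j];
    res.push_back(baseTable[index]);
  }
  return res;
}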

And the main function looks like this:

#include <iostream>

int main() {
  // 8 symbols, so every symbol carries 3 bits.
  std::vector<std::string> baseTable = {"🧐", "🤣", "🫠", "💩",
                                        "😎", "🦅", "😍", "Bird"};
  std::string text = "Hello, World!";
  std::cout << "Original text: " << text << std::endl;
  auto encoded = getYourBaseCode(text, baseTable);
  std::cout << "Encoded text: ";
  for (const auto& s : encoded) {
    std::cout << s;
  }
  std::cout << std::endl;
  return 0;
}

Code execution result:

Original text: Hello, World!
Encoded text: 🫠🫠🧐😍🫠🦅🦅😎💩💩🧐😍Bird😎🦅😎🤣🧐🧐🦅💩🦅🦅Bird💩😎😎😍😍🤣😎😎🤣🧐🫠
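
Going the other way is the same trick in reverse. The sketch below is not part of the code above; `decodeYourBaseCode` is a hypothetical name, and it assumes the symbols were produced with the same table, most significant bit first, with zero bits as the only padding.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical inverse of the encoder: symbols -> indexes -> bits -> bytes.
std::string decodeYourBaseCode(const std::vector<std::string>& symbols,
                               const std::vector<std::string>& baseTable) {
  if (baseTable.size() < 2)
    throw std::invalid_argument("baseTable needs at least 2 symbols");

  // Bits per symbol = floor(log2(table size)).
  int bitsPerSymbol = 0;
  while ((baseTable.size() >> (bitsPerSymbol + 1)) != 0) ++bitsPerSymbol;

  // Reverse lookup: symbol -> index, for the 2^bitsPerSymbol usable entries.
  std::unordered_map<std::string, unsigned> indexOf;
  const std::size_t usable = std::size_t{1} << bitsPerSymbol;
  for (std::size_t i = 0; i < usable; ++i)
    indexOf[baseTable[i]] = static_cast<unsigned>(i);

  std::string out;
  unsigned buffer = 0;   // bits collected for the current byte
  int bitsInBuffer = 0;
  for (const auto& s : symbols) {
    auto it = indexOf.find(s);
    if (it == indexOf.end())
      throw std::invalid_argument("symbol is not in the table");
    // Push the index bits, most significant first, and emit full bytes.
    for (int j = bitsPerSymbol - 1; j >= 0; --j) {
      buffer = (buffer << 1) | ((it->second >> j) & 1u);
      if (++bitsInBuffer == 8) {
        out.push_back(static_cast<char>(buffer));
        buffer = 0;
        bitsInBuffer = 0;
      }
    }
  }
  // Fewer than 8 leftover bits are the encoder's zero padding; drop them.
  return out;
}
```

With the emoji table from the main function, the first eight symbols of the output, 🫠🫠🧐😍🫠🦅🦅😎, decode back to "Hel".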
This post is licensed under CC BY 4.0 by the author.