January | 2022 | Adam Scherlis

Monthly Archives: January 2022

Fun with bitstrings and bijections

Posted on January 1, 2022 | Leave a comment

Here are two bijections between (infinite) bitstrings and (finite or infinite) sequences of positive integers:

Run-length encoding

You can think of a bitstring as being made up of consecutive runs of one or more 0s or 1s, like “0000” and “111” and “0”.

From bits to integers: Prepend a 0 to the bitstring, then slice it up into runs of 0s and 1s and write down the length of each run. If the string ends in an infinite run, stop the sequence of integers after the last finite run.

From integers to bits: Write down alternating runs of 0s and 1s, with lengths given by the numbers in the sequence. If the sequence stops, write an infinite run of whichever bit value is next. Then delete the first 0.

For example, “001110011100111…” maps to [3, 3, 2, 3, 2, 3…] and “110001100011000…” maps to [1, 2, 3, 2, 3, 2, 3…].

The annoying extra-0 trick is to distinguish pairs of bitstrings like the above, which are flipped versions of each other.

Note that “00000…” maps to [] and “11111…” maps to [1].

Gaps between 1s

On the other hand, you can think of a bitstring as being made up of runs of zero or more 0s ending in a single 1, like “001” and “000001” and “1”.

From bits to integers: For each 1 in the bitstring, write down the number of 0s immediately preceding it plus one. If the bitstring ends in an infinite run of zeros, stop the sequence of integers after the last 1.

From integers to bits: For each number N in the sequence, write down N-1 0s and then one 1. If the sequence stops, write an infinite run of 0s.

Note that “00000…” maps to [] and “11111…” maps to [1, 1, 1, 1, 1…]

Both at once?

The all-zeros string maps to an empty list under both bijections, but the all-ones string is different. It’s also easy to see that these bijections are pretty different on a generic bitstring.

Suppose we start with the list [1] and repeatedly do the following:

Encode the list into a bitstring using the run-length-encoding map
Decode that bitstring into a list using the gaps-between-1s map

What does the sequence of bitstrings look like?

The first few steps look like this:

[1] –> 11111…

11111… –> [1, 1, 1, 1, 1…]

[1, 1, 1, 1, 1…] –> 101010…

101010… –> [1, 2, 2, 2…]

[1, 2, 2, 2…] –> 110011…

So our first few bitstrings are:

11111111…
10101010…
11001100…

I’ll let you discover the pattern for yourself (pen and paper is sufficient, but Python is faster), but if you get impatient CLICK HERE to see the first 64 bits of each of the first 64 strings.

Monthly Archives: January 2022

[Linkpost] Understanding the two-head strategy for teaching ML to answer questions honestly

Fun with bitstrings and bijections

Run-length encoding

Gaps between 1s

Both at once?

Recent Posts

Top Posts Today

Archives

Tweets

Email Subscription

Meta