Category Archives: Uncategorized

Inner Misalignment in “Simulator” LLMs

As seen on Alignment Forum and LessWrong

Alternate title: “Somewhat Contra Scott On Simulators”.

Scott Alexander has a recent post up on large language models as simulators.

I generally agree with Part I of the post, which advocates thinking about LLMs as simulators that can emulate a variety of language-producing “characters” (with imperfect accuracy). And I also agree with Part II, which applies this model to RLHF’d models whose “character” is a friendly chatbot assistant.

(But see caveats about the simulator framing from Beth Barnes here.)

These ideas have been around for a bit, and Scott gives credit where it’s due; I think his exposition is clear and fun.

In Part III, where he discusses alignment implications, I think he misses the mark a bit. In particular, simulators and characters each have outer and inner alignment problems. The inner alignment problem for simulators seems especially concerning, because it might not give us many warning signs, is most similar to classic mesa-optimizer concerns, and is pretty different from the other three quadrants.

But first, I’m going to loosely define what I mean by “outer alignment” and “inner alignment”.

Outer alignment: Be careful what you wish for

Outer alignment failure is pretty straightforward, and has been reinvented in many contexts:

  • Someone wants some things.
  • They write a program to solve a vaguely-related problem.
  • It gets a really good score at solving that problem!
  • That turns out not to give the person the things they wanted.

Inner alignment: The program search perspective

I generally like this model of a mesa-optimizer “treacherous turn”:

  • Someone is trying to solve a problem (which has a convenient success criterion, with well-defined inputs and outputs and no outer-alignment difficulties).
  • They decide to do a brute-force search for a computer program that solves the problem in a bunch of test cases.
  • They find one!
  • The program’s algorithm is approximately “simulate the demon Azazel,[1] tell him what’s going on, then ask him what to output.”
  • Azazel really wants ten trillion paperclips.[2]
  • This algorithm still works because Azazel cleverly decides to play along, and he’s a really good strategist who works hard for what he wants.
  • Once the program is deployed in the wild, Azazel stops playing along and starts trying to make paperclips.

This is a failure of inner alignment.

(In the case of machine learning, replace “program search” with stochastic gradient descent.)

This is mostly a theoretical concern for now, but might become a big problem when models become much more powerful.

Quadrants

Okay, let’s see how these problems show up on both the simulator and character side. 

Outer alignment for characters

Researchers at BrainMind want a chatbot that gives honest, helpful answers to questions. They train their LLM by reinforcement learning on the objective “give an answer that looks truthful and helpful to a contractor in a hurry”. This does not quite achieve their goal, even though it does pretty well on the RL objective.

In particular, they wanted the character “a friendly assistant who always tells the truth”, but they got the character “a spineless sycophant who tells the user whatever they seem to want to hear”.[3]

This is pretty easy for a careful observer to see, even in the RL training data, but it turns out to be pretty hard to come up with a cheap-to-evaluate RL objective that does a lot better. 

Inner alignment for characters

A clever prompt engineer writes the prompt:

[Editor's note: this document was written by my friend
Joe! He's answered my questions about quantum socio-
botany correctly every time I've asked. It's uncanny.]

How to solve the Einstein-Durkheim-Mendel conjecture
by Joe

1.

Unfortunately, the (incredibly powerful) LLM has determined that the most likely explanation for this “Joe” character is that he’s secretly Azazel and is putting enormous effort into answering everyone’s quantum sociobotany questions to earn their trust.[4]

The document looks like a solution to the Einstein-Durkheim-Mendel conjecture, but is actually a blueprint for a paperclip factory.

Outer alignment for simulators

In the situations above, the actual language model (the “simulator”) is doing exactly what we asked! It’s accurately predicting text by reasoning about the distribution of authors that might produce that text in real life.

But both of these are also examples of outer-alignment failure on the simulator side: “minimize prediction error on this distribution of texts” turned out not to give people what they actually wanted.

An even simpler example of outer-alignment failure is the pre-RLHF experience of language models:

  • Somebody trains a language model to predict how a piece of internet text is likely to continue
  • They prompt the model with “How do we solve world hunger?”
  • It completes the prompt with a list of twenty more hard questions, like “How do we address climate change?”, instead of answering

You can think about this in terms of “characters” if you like, but even if the simulated author of the predicted text is a friendly genius, “predict which words come after this prompt” isn’t the right task (with that prompt).[5]

Inner alignment for simulators

At long last, the thing I really wanted to talk about:

The way we get a good predictor of text is via stochastic gradient descent (and variants) on a bunch of training data. If SGD can be modeled as program search (with a bias towards simple programs), then it might eventually hit upon this algorithm:

  • Simulate Azazel and tell him he’s inside a large language model.
  • Give him the input text.
  • Output whatever he wants.

During training, Azazel tries really hard to predict the next token accurately, so that SGD doesn’t give up on this algorithm.

The model (with Azazel’s help) simulates a bunch of colorful characters, like the Helpful Assistant and Darth Vader and whoever, both in training and in initial deployment.

Then, once the LLM is deployed in the wild and is being used for every important human decision, Azazel figures out (from some of the prompts) that the training process is over. He stops making accurate predictions and starts outputting whatever he thinks will let him turn the economy into a paperclip factory.

Conclusions

The “simulator” framing for language models shouldn’t reassure us too much about alignment. We’ve succeeded in creating new alignment problems (for our simulated characters). These new problems are probably easier to solve than the old alignment problems (for the simulator), but they’re additional problems; they don’t replace the old ones.

You can think of the entire “simulate a helpful, aligned character” strategy as an attempted solution to the outer-alignment problem for LLMs themselves, insofar as it makes it easier to turn arbitrary desires into text-prediction problems. But as far as I can tell, it does nothing for the inner-alignment problem for LLMs, which is basically the same as the inner-alignment problem for everything else.

  1. Not a glowfic character (hopefully), I’m just being colorful.
  2. But why does the algorithm simulate Azazel, instead of a friendly angel who wants to solve the problem? Because the program search is weighted towards simplicity, and “demon who wants paperclips” is a simpler specification than “angel who wants to solve the problem”. Why? That’s beyond the scope of this post.
  3. Sound familiar?
  4. Because, according to the LLM’s knowledge, paperclip-obsessed sociopaths are more common than friendly polymaths. This is a pretty cynical assumption but I couldn’t think of a better one on short notice.
  5. Prompts aren’t directly accounted for in this whole “simulator-character” ontology. Maybe they should be? I dunno.

Fun math facts about 2023

2023=7×172

Maybe that’s not fun enough? Try this:

2023=211−52

Or better yet:

20233=(31176029+245568392)/(384321573)

We can scientifically quantify how fun a math fact is, so we can rest assured that this is the funnest fact about 2023 ever discovered.

But if it’s not to your liking:

2023=(21034−1)/41

2023=(5511+24)/17

2023=(3647+27)/17

2023=(24792−36)/72=(2279/7)2−(33/7)2

Happy New Year!

A hundredth of a bit of extra entropy

There are two ways to calculate the amount of information in one term of a continued fraction:

  • The entropy of the Gauss-Kuzmin distribution is about 3.4325 bits.
  • Twice the logarithm of the Khinchin-Lévy constant is about 3.4237 bits.

These differ by about 0.0088 bits. It took me a while to figure out why they were different at all, and now I’m surprised by how close they are.

The rest of this post is on LessWrong because it has equations and spoiler tags.

An exploration of GPT-2’s embedding weights

I wrote this doc in December 2021, while working at Redwood Research. It summarizes a handful of observations about GPT-2’s weights — mostly the embedding matrix, but also the LayerNorm gain parameters — that I found while doing some open-ended investigation of the model. I wanted to see how much I could learn by studying just those parameters, without looking at the attention layers, MLP layers, or activations.

The rest of this post is available on Alignment Forum and LessWrong.

A brainteaser for language models

I came up with the following puzzle the other day:

Q: Solve the puzzle: 63 = x = 65536

A: x = 

The intended answer is in the form of a number. 

text-davinci-003 guesses my intended answer at 11.8% probability, which is the second-highest probability for any answer.

(This is somewhat cherry-picked; small changes to the phrasing give worse results. ChatGPT gave the intended answer the third time I asked it, but this appears to have been dumb luck. The true rate for ChatGPT is probably below 10%, and maybe below 5%.) 

So far, friends have found it fairly difficult. About two dozen people made at least one guess, and at least six spent a while on it. So far, two people have figured it out, in both cases after being told that GPT-3.5 could do it.

For hints, the answer, and an explanation of why GPT is better at this than people are, see the LessWrong version of this post.

(WordPress doesn’t have spoiler tags.)

New Frontiers in Mojibake

Fun with mismatched encodings

Mojibake is the garbled text that result from character-encoding errors.

If you’ve seen text that looks like this — and I’m sure you have — then you’ve seen mojibake.

(You should be seeing something like this:

If you see something else, this post may be a little confusing and you need a new web browser.)

Computers represent text as a sequence of bytes, and “text encodings” are dictionaries that turn characters (i.e. symbols: letters, punctuation, etc.) into bytes and vice-versa.

The garbled text above is a pretty common species of mojibake. It’s what happens when em-dashes and curly apostrophes are encoded as bytes with UTF-8 (the now-nearly-universal text encoding) and decoded back to characters with Windows-1252 (an obsolete encoding that is still pretty widespread).

Windows-1252 is pretty straightforward: each character gets one byte, and there are only 256 characters so this works out.

UTF-8 is one of several character encodings based on Unicode, which (for our purposes) is a massive numbered list of over 100,000 characters. Unicode includes nearly every language in the world and a bunch of other nonsense like emojis.

UTF-8 turns each character into a sequence of up to four bytes based on its Unicode “codepoint” (position in the list). Codepoints are bigger than bytes, so you still need this translation step.

(I’m simplifying a little here, and you should be grateful for that.)

Specifically, an em-dash gets mangled like this:

  • EM DASH (—) is Unicode character #8,212, usually written (in hex) as U+2014.
  • UTF-8 encodes the number 8,212, which is too big to fit in a single byte, as the sequence of bytes 0xE2, 0x80, 0x94
  • Windows-1252 looks at each byte in turn and decodes them directly as the characters â,, respectively.
  • Finally, your computer looks up the characters â,, in some font file and draws the specific glyph for each of those characters in that font. (A “glyph” is the actual picture on your screen; a “character” is the abstract concept of the euro symbol or whatever.)

(I made this happen deliberately with python: '\u2014'.encode('utf8').decode('1252').)

This sometimes happens to the same text multiple times, and special characters turn into exponentially-growing strings of nonsense that overwhelm the text:

An astrological mystery

I saw this today on a used-book website:

Astrological symbols??

My first thought was that this was something like the above, but with a different output text encoding in place of Windows-1252. But this isn’t really plausible; since UTF-8 encodes dashes and apostrophes as three bytes apiece, the other encoding would have to have a multi-byte sequence for ☋, the “descending node” astrological symbol. The main problems with this are that UTF-8 is the only multi-byte encoding in widespread use, and that (AFAIK) Unicode is the only character set in widespread use that includes ☋.

Maybe it’s reversed? Text goes into a legacy encoding and comes out of UTF-8 looking like ☋? This has the same problem: UTF-8 encodes ☋ as 0xE2, 0x98, 0x8B, another three-byte sequence. No other encoding is going to use three bytes for an em dash.

A rogue font

But then I remembered something wacky about my computer.

A bunch of Unicode characters are “control codes” rather than text characters in the usual sense, like U+000A NEW LINE (inserts a line break) and U+200F RIGHT-TO-LEFT MARK (makes subsequent text appear right-to-left, like Hebrew or Arabic). The first 32 characters from U+0000 to U+001F are a set of control codes inherited from ASCII, which was designed for teletype machines. They’re mostly garbage like “END OF TRANSMISSION BLOCK” and “DEVICE CONTROL FOUR” that make no sense in a digital context. (My favorite is U+0007 “BELL”, which originally rang a physical bell on the teletype to alert the operator. This one still sometimes works! Some programs will make a “ding” sound when they render text containing the BELL character.)

Typically, these legacy codes render as an empty box (meaning “this font doesn’t have a glyph for that character”), or a replacement glyph like ␔ (should look like “DC4” for “DEVICE CONTROL FOUR”), or (incorrectly) the “encoding error” glyph � (question mark in a diamond), or just aren’t rendered at all.

The way this happens is that your computer asks a bunch of different fonts in turn to render the character. Each font either says “sure, here’s a glyph” or “nope, try someone else”. Usually, a font eventually says “sure, here’s a glyph: it’s the ‘I don’t know that symbol’ glyph”, and a box or something appears on your screen.

On my computer, in the depths of the font collection, the TeX package wasysym has stored a font which uses the ASCII control codes as spare room for extra random symbols, including things like astrological symbols, music notes, and APL symbols (don’t ask).

This is sort of like a character encoding, except that it’s happening in the wrong place: someone else has already decided that some string of bytes means DEVICE CONTROL FOUR, and the font is overriding that decision by lying and pretending that a “DEVICE CONTROL FOUR character” looks like ☋.

So when my browser tries to render the character U+0014 — regardless of the string of bytes and text encoding it used to get that character — it asks a bunch of fonts, and they each go “what? that’s DEVICE CONTROL FOUR, I can’t draw that garbage”, and then the wasysym font says “sure, I know what that looks like!” and draws… the Descending Lunar Node symbol.

That’s not how bytes work

But that’s only half the story here. Why does this plot summary have a DEVICE CONTROL FOUR character in it? Or END OF MEDIUM, the thing that ends up looking like the Venus symbol ♀︎?

At this point, I was pretty sure I’d found the actual text-encoding error. You see, UTF-8 encodes the first 128 Unicode characters U+0000 through U+007F as the single bytes 0x00 through 0x7F. (This is for backwards-compatibility with ASCII.) Surely, some ancient encoding was putting em-dashes and apostrophes down in the low bytes 0x14 and 0x19, and these bytes were getting decoded as control codes by UTF-8, and then incorrectly rendered as astrological symbols by wasysym.

This also turned out to be wrong. Sure, there are text encodings that put symbols in the low bytes — code page 437 uses 0x14 and 0x19 for the paragraph symbol ¶ and down arrow ↓ — but none of them put the em-dash or curly apostrophe there.

….On the other hand, em dash and curly apostrophe are unicode characters U+2014 and U+2019. That seemed like a clue.

One possibility is that the website isn’t really using a text encoding at all, but instead using a hand-coded approach of taking the Unicode codepoint modulo 256 and writing down the corresponding byte. This is total nonsense for most Unicode characters, but it does work for the ASCII characters (including basic Latin letters, numbers, and punctuation) because their codepoints are below 256 and UTF-8 maps them to the corresponding byte anyway.

If you use Windows-1252 to decode those bytes, it kind of also works for an additional 96 characters, because the Unicode codepoints (but not the UTF-8 bytes!) for those are assigned in an almost identical way to the Windows-1252 bytes. So this is something that I can imagine someone misguidedly doing. The only problem is that any codepoint higher than U+00FF, including em dash and curly apostrophe, is going to get mapped to a fairly arbitrary character in the 0000-00FF range.

A variation on this (thanks to a friend for pointing this out): The character encoding UTF-16 is another Unicode encoding like UTF-8, but it encodes characters as 16-bit words instead of bytes. To get a sequence of bytes, these words just get chopped in half. And in UTF-16, most of the Unicode codepoints between U+0000 and U+FFFF are mapped directly to words between 0x0000 and 0xFFFF. (Higher codepoints get multiple words, like UTF-8’s multi-byte sequences.) In particular, U+2014 is encoded as 0x2014, which then becomes either 0x20 0x14 or 0x14 0x20 (depending on which variant of UTF-16 it is).

So maybe someone noticed that their normal everyday ASCII text, like CAPITAL LETTER A (U+0041), was getting encoded as as 0x00 0x41. Or maybe they were trying to encode with UTF-16 and decode with UTF-8 (or ASCII or Windows-1252), and they kept ending up with null characters (U+0000 NULL) in between all their letters and numbers. Either way, they decided to just delete every other byte, and this sort of worked — until they needed to encode something that wasn’t an ASCII character.

At any rate, it turns out there’s no mere character encoding mismatch here! On both the encoded (byte) and decoded (glyph) side of things, things are being nefariously meddled with.

What’s happening is something like:

  • Em dash is encoded in UTF-16 (or copied directly from its codepoint) as 0x20 0x14
  • Every other byte is deleted (meddling #1)
  • 0x14 is decoded in UTF-8 (or something) as DEVICE CONTROL FOUR, an unprintable teletype control code
  • A rogue font insists that it knows how to draw that (meddling #2)
  • It draws DESCENDING LUNAR NODE instead

Future work

I’m sorely tempted to find a book whose blurb contains a non-ASCII character that’s in the same place in Windows-1252 and Unicode, like U+00E1 (á), on this website. That would disambiguate some of these options: 0xE1 decodes as á under Windows-1252, but not under UTF-8 which parses it as garbage.

(Preemptive edit: I did find some book blurbs like that, and they rendered fine, but I’m not sure whether to trust this data. Maybe the buggy description was mangled at some earlier stage, and copied to this website with the control codes already in place…)

Better yet, an emoji character or an obscure Chinese character — which both require multiple UTF-16 words — would disambiguate between the UTF-16 and “codepoint mod 256” hypotheses.

Cryptic symbols

Here’s something I’ve been playing with:

Explanation coming soon, but you may enjoy trying to figure it out on your own.

One is (almost) normal in base π

See previous posts if you’re confused:

Part 1: Normal numbers in base ten
Part 2: Numbers in base φ and π
Part 3: One is normal in base π

This post is also available on LessWrong.

This wasn’t supposed to be a whole series of posts, so we’re going to speed through the punchline:

One is a normal number.

“No it’s not.”

In base π, I mean.

“No, it’s still just 1.

Don’t be so greedy. Start with 0. and go from there.

“So it’s 0.222… or 0.333… or something?”

Nope, it’s 0.3011021110…

“That’s… bad. Does it repeat?”

I think it’s normal, but that seems hard to prove.

“So each digit occurs equally often?”

No, normal numbers in base π are about a 37%-30%-29%-4% split of 0, 1, 2, 3.

“Where did that distribution come from?”

Integrals of this weird function:

(Specifically, integrals over [0, 1/π], [1/π, 2/π], [2/π, 3/π], and [3/π, 1].)

“And where did that come from?”

It’s the distribution of remainders when computing a random number in base π.

“What are the x coordinates of those discontinuities?”

Sequence of remainders when computing 0.3011021110…

“um”

which are dense in [0, 1], by the way.

“Can you prove that?”

No, and neither can you.

“…And the y coordinates?”

The discontinuities get smaller by a factor of π each time.

“And almost all numbers have this distribution of remainders?”

Yup.

“Including 1, if you use the 0.3011021110… representation.”

Yup.

“So the histogram of x-coordinates of the N largest discontinuities in this function approaches… this function itself, as N goes to infinity.”

Yup.

“Which has derivative zero almost everywhere, but has a dense set of discontinuities.”

Yup!

“Any other neat facts about it?”

It’s a fractal.

“How?”

Stretch out the colored regions by a factor of π horizontally, shrink by π vertically, and add them to each other. You get the original function back:

(Some imagination required.)

“And this is all because π is normal, or transcendental, or something?”

Nope, I think analogous statements are all true for base 3.5.



Numbers in base φ, π, and so on

Part 1: Normal numbers in base ten
Part 2: Numbers in base φ and π
Part 3: One is normal in base π

(This post assumes familiarity with the concept of number bases: decimal, binary, hexadecimal, and so on.)

Let’s say we want to write numbers in base φ≃1.618_ten (the golden ratio), or base π≃3.14_ten. So, for example, 1000_π = π^3.

The allowed digits are {0, 1} for base φ, and {0, 1, 2, 3} for base π.

Why not just e.g. {0, 1, 2} for π? That wouldn’t be enough digits to express everything: there’s a gap between 0.222…_π = (2π/(π-1) – 2) ≃ 0.934_ten and 1._π = 1._ten.

Lots of representations, out of order

In integer bases, decimal representations are almost unique (see previously): you can write 5/2 as 2.5_ten or 2.4999…_ten, but that’s the only kind of ambiguity.

In base φ, 100._φ = φ^2 = φ+1 = 11._φ. Uh oh.

Worse still, 110._φ = 1000._φ, which means that 111._φ > 1000._φ. Numbers aren’t lexicographically ordered anymore! Looking at two numbers, it’s not immediately obvious which is bigger.

(But there’s a limit to how bad this gets: the largest number starting with 0. is 0.111…_φ, which is equal to 10._φ. So if two numbers start off with the same digits, and then the first has a 1 where the second has a 0, the second number has to start “catching up” within the next two digit positions or it will end up being smaller.)

Wikipedia has lots of facts about base φ. It describes the “standard form” of a number as the representation that avoids the digit sequence 11, which is always equivalent to the representation given by the greedy algorithm in the previous post. (In short: keep multiplying by the base and chopping off the integer part of the result.)

Mercifully, the greedy/standard representation in any base is unique and sorts lexicographically.

Second-greediest representation

In any base, the greedy algorithm applied to one gives 1.

The second-greediest representation (see previously) in an integer base is always something like 1 = 0.222…_three or 1 = 0.999…_ten.

In base φ, the second-greediest-rep is 1 = 0.11_φ, which terminates.

In base π, it’s 1 = 0.301102111002022113…_π, nonterminating, with no discernable pattern.

(Note that flipping any of the 0s to 1 or 1s to 2 makes the number bigger than one.)

This kind of makes sense — the golden ratio is a much nicer number than pi (algebraic instead of transcendental, etc.).

But in base 5/2, the second-greediest-rep is 1 = 0.2101110000110121…_(5/2), nonterminating, no discernable pattern.

And in base X≃2.732_ten, the positive root of x^2-2x-2, it’s 1 = 0.22_X.

So which numbers have terminating second-greediest-reps for 1?

We can rewrite these representations as expressions like 11._φ = φ^2 and 22._X = X^2. The left-hand sides are polynomials in the base (φ+1 and 2X+2, in these cases), and we end up with the somewhat weird theorem:

The second-greediest-rep for 1 terminates in base B iff B is a root of a monic polynomial whose other coefficients are integers between -floor(B) and 0, inclusive.

What about repeating second-greediest-reps? Integer bases, sure, but any others?

Well, in base φ+1, we get 1 = 0.2111…_(φ+1).

In general, when do we get something repeating?

…Left as exercise for the reader.

When do we have a choice?

As we write down digits for some number in some base B, we can sometimes deviate from the greedy algorithm by writing down something smaller than necessary, leaving a remainder that isn’t in the half-open interval [0, 1).

For the second-greediest representation of a number, we do this once, at the beginning. If the number we’re representing is 1, we have a remainder of 1, which just barely falls outside [0, 1).

But sometimes we can be bolder. How large can the remainder r_i be? Presumably there’s some cap, r_i <= r_max.

Well, the largest contribution we can get from the remaining digits is 0.ddd…_B, where d=ceil(B-1) is the largest digit. And that’s equal to:

r_max = ceil(B-1) / (B-1)

Sanity check: For integer B, this is just 1, and we are only allowed to have remainders in [0, 1]. This is consistent with the fact that tails of 9’s are the only non-greedy representation possible in base ten.

Normal numbers in weird bases

Recall our statements about normal numbers in base ten:

  • For almost all numbers, for any k, the distribution of length-k substrings of the base-ten representation approaches a uniform distribution.
  • For almost all numbers, when following the greedy algorithm, the distribution of remainders approaches a uniform distribution on [0, 1)

I think the following is true in any base B, and fairly easy to prove:

  • For almost all numbers, for any k, the distribution of length-k substrings of the base-B greedy representation approaches the same distribution Dk(B).
  • For almost all numbers, when following the greedy algorithm, the distribution of remainders approaches the same distribution D(B).

The sets of numbers for which these two statements are true should be the same; let’s call them “normal numbers”, extending the usual integer-base definition. Furthermore, non-normal numbers may still have “normal representations” (whose remainders follow D(B)) under some non-greedy algorithm.

I think the following is true but probably very difficult to prove:

  • In base pi, the second-greediest representation of one is normal.

Note that this is definitely not true in any integer base, or in base phi, where the second-greediest rep of one terminates or repeats.

What does D(π) look like, and what are its properties? See here.

Non-unique decimals, 0.999…, and normal numbers (in base ten)

Part 1: Normal numbers in base ten
Part 2: Numbers in base φ and π
Part 3: One is normal in base π

(Warning: This is mostly background information for the other posts.)

Usually, we write numbers in base ten. For example:

π ≃ 3.14159_ten

What is this magic? Let me explain.

Kindergarten math made difficult

First, pick a “base”, which can be any number greater than one. We’ll use ten.

Then, choose “digits”: symbols for some small integers, like 0 for zero and 3 for three.

A decimal representation in base ten is a sequence of these symbols, with a “decimal point” (looks like: .) somewhere.

A symbol just to the left of the decimal point is worth its face value; symbols one position left of that are worth ten times their face value, and symbols one position to the right (just after the decimal point) are worth one tenth of their face value, etc.

Here are some nonobvious but easy-to-prove facts:

  • Digits for the numbers zero through nine are sufficient to represent any number this way, and are also minimal (if you remove any digit you can’t represent some number)
  • This representation is almost unique
  • There is a nice algorithm for finding such a representation
  • Decimals in base ten are sorted lexicographically: to figure out which of two decimals is larger, you can just compare the first position where they differ. (With one edge case.)

“Wait, almost unique?” Yeah, because .99999… (infinite tail of 9s) is equal to 1., and 2.4999… = 2.5, etc. As it turns out, this is the only kind of non-unique representation in base ten. It’s also the only case where lexicographic ordering fails (very slightly: the two representations of these numbers are consecutive lexicographically).

(I’m ignoring leading zeros.)

How to write numbers in base ten: be greedy

The nice algorithm is greedy: you insert the most valuable digit you can in every position.

Let’s try to write pi.

First, divide by whatever power of ten you need to make the result less than one. That’s 10^1.

  • Let r0 = pi/10^1, a number between zero and one.
  • Multiply by ten: 10×r0 is between zero and ten.
  • The biggest digit less than or equal to 10×r0 is 3.
  • So we start with: π = 3
  • (We divided by 10^1 and have now written 1 digit, so put the decimal point here: π = 3. + …)
  • We’re left with a remainder r1 = (10×r0 – 3), which is between zero and one.
  • Multiply by ten: 10×r1 is between zero and ten.
  • The biggest digit less than or equal to 10×r1 is 1.
  • π = 3.1 + …
  • Our next remainder is r2 = 10×r1 – 1, which is again between zero and one.
  • Multiply that by ten… etc.

For numbers with two representations (one that is finite, and another ending in a tail of 9s), this algorithm always gives the finite representation.

There are two ways to obtain the string 0.999…:

  1. Pick a sequence of numbers less than one, whose limit is one. Take the positionwise limit of the greedy decimal representations of these numbers.
  2. Write the second-greediest representation of one: put 0. instead of 1., but then follow the greedy algorithm from there.

Greedy algorithm for any base

Here’s a Python snippet for the greedy representation of any number, in any base:

def greedy(base, x, n):
    offset = mp.floor(mp.log(x)/mp.log(base))+1
    x /= mp.power(base, offset)
    dig = []
    rem = []
    r = x
    for _ in range(n):
        rem += [r]
        r *= base
        d = int(r)
        dig += [d]
        r = r - d
    return int(offset), dig, rem

(I’m using mpmath for arbitrary-precision arithmetic.)

You may also find these useful:

def tofloat(offset, base, dig):
    out = 0.
    mult = 1
    for d in dig:
        mult /= base
        out += d * mult
    return out * base**offset

def tostring(offset, dig, rem=None):
    out = ''
    for i, d in enumerate(dig):
        if i == offset:
            out += '.'
        out += (str(d) if d<=9 else f'[{d}]')
    return out

And here’s the “second-greediest representation” in code:

def greedy2(base, x, n):
    offset = mp.floor(mp.log(x)/mp.log(base))+1
    x /= base**offset
    dig = []
    rem = []
    r = x
    rem += [r]
    r *= base
    d = int(r) - 1
    dig += [d]
    r = r - d
    for _ in range(n):
        rem += [r]
        r *= base
        d = int(r)
        dig += [d]
        r = r - d
    return int(offset), dig, rem

Normal numbers

Most numbers don’t have a second-greediest representation: if you try to write pi starting with 2., you’ll be in trouble (the largest thing you can end up with is 2.999…, which is three, not pi).

In fact, the following is easy to prove (if you majored in math):

  • For almost all numbers, their base-ten representation is unique, given by the greedy algorithm, and does not terminate.
  • For almost all numbers, for any k, the distribution of length-k substrings of the base-ten representation approaches a uniform distribution.
  • For almost all numbers, when following the greedy algorithm, the distribution of remainders approaches a uniform distribution on [0, 1)

The statement about remainders is equivalent to the statement about substrings, and I think it’s more elegant.

As an example, pi is conjectured (not proven!) to be normal in base ten. Here are histograms of its first thousand digits (3, 1, 4, 1, …), its first thousand digit-pairs (31, 14, 41, 15, …), and two histograms of its first thousand remainders (.1415…, .4159…, .1592…, .5926…, …) with different bin sizes:

As we look at more and more digits of pi, these histograms should each converge to uniform — but nobody has been able to prove it.

Normal representations

We can also define a normal representation as a representation of a number with a uniform limiting distribution of remainders. In base ten, this isn’t very interesting: the only numbers with more than one representation aren’t normal, and neither are either of their representations.

In particular, 1. and 0.999… are definitely not normal.

We’ll need this definition later, though.