Nokia Composer on ATtiny85: ringtones on a chip the size of a fingernail
If you grew up in the late 90s or early 2000s, you probably remember the Nokia Composer. Not as an app, not as a feature on some settings page. You remember it as the thing you opened on a Nokia 3310 during a boring class, punching in notes one at a time, trying to recreate a song from memory. No internet to look up the notes. No copy-paste. Just you, the keypad, and your ear.
The ringtones were monophonic. One note at a time. Square wave through a tiny piezo speaker. They sounded awful by any reasonable standard. But somehow, when the Imperial March came through that speaker – tinny and flat and about as musical as a smoke detector – you recognized it instantly. Everyone on the bus did too.
People shared ringtones by texting the note sequences to each other. Not audio files. Text. Something like:
16e2 16d2 8#f 8#g 16#c2 16b 8d 8e 16b 16a 8#c 8e 2a 2-
That’s the Nokia Tune. The one that shipped on every Nokia phone and became one of the most recognized melodies in the world. All encoded in a string shorter than a tweet.
I wanted to bring that back.
Why
I’ve been maintaining ssd1306xled, a lightweight OLED driver for the ATtiny85 and similar constrained microcontrollers. I wrote about its history a few months ago. The library recently hit its first proper v1.0.0 release. Contributors have submitted pull requests, the CI pipeline is solid, the documentation is actually useful. It feels like a real project now.
With the v1.0.0 milestone behind me, I wanted to build something fun. Something that shows off what the library can do on minimal hardware. Not another LED blinker. Something with personality.
I stumbled on zserge’s Nokia Composer, a web-based recreation of the Nokia 3310 ringtone editor. You type in note sequences and it plays them back with that familiar square-wave sound. The whole thing runs in about 500 bytes of JavaScript. Looking at it, I thought: if this fits in 500 bytes of JS, it should fit on an ATtiny85. An 8-pin chip with 8KB of flash, a tiny OLED, and a piezo buzzer. Nokia Composer in hardware.
The notation
Before getting into the build, it helps to understand the notation. Nokia Composer uses a space-separated format where each token describes one note.
The format is `[duration][.][#][note][octave]`:
- Duration: 1, 2, 4, 8, 16, or 32. These are fractions of a whole note. An `8` is an eighth note. A `2` is a half note. Default is 4 (quarter note) if you leave it out.
- Dot: A `.` after the duration makes it a dotted note – 1.5 times the normal length.
- Sharp: A `#` raises the pitch by one semitone.
- Note: A letter from `a` through `g`. Or `-` for a rest (silence).
- Octave: 1, 2, or 3. Default is 1. These map to piano octaves 4, 5, and 6.
So 8.#f2 means: dotted eighth note, F-sharp, octave 2. In musical terms, that’s F#5 at 1.5 times the duration of a regular eighth note.
The beauty of this format is density. The entire Nokia Tune fits in 56 characters. Harry Potter’s Hedwig’s Theme fits in about 200. You can store dozens of songs in a few kilobytes of flash.
Fitting it on an ATtiny85
The ATtiny85 has eight pins. One is VCC, one is GND, one is RESET (not usable as GPIO without fuse hacking). That leaves five. Here’s how they’re all spoken for:
| Pin | Function |
|---|---|
| PB0 | I2C data (SDA) to OLED |
| PB2 | I2C clock (SCL) to OLED |
| PB4 | Piezo buzzer |
| PB1 | “Next tune” button |
| PB3 | “Play” button |
Five pins, five functions, zero slack. The constraints shaped everything else.
The parser
The tunes live in PROGMEM (flash memory) as plain strings. The parser reads them one character at a time using pgm_read_byte() – no string copies, no intermediate buffers, no RAM allocation. It walks through each token, extracts the duration, dot, sharp, note letter, and octave, then hands those off to the frequency and duration calculators.
The whole parser is about 200 bytes of compiled code. On a chip where every byte counts, that matters.
// Read next note token from PROGMEM string at *pos.
static bool parse_next_note(const char *pgm_str, uint16_t *pos, Note *out) {
    uint8_t ch;
    // Skip spaces
    while ((ch = pgm_read_byte(pgm_str + *pos)) == ' ')
        (*pos)++;
    if (ch == '\0')
        return false;
    // Duration digits
    uint8_t dur = 0;
    while (ch >= '0' && ch <= '9') {
        dur = dur * 10 + (ch - '0');
        (*pos)++;
        ch = pgm_read_byte(pgm_str + *pos);
    }
    out->duration = dur ? dur : 4;
    // Dotted
    out->dotted = 0;
    if (ch == '.') {
        out->dotted = 1;
        (*pos)++;
        ch = pgm_read_byte(pgm_str + *pos);
    }
    // Sharp, note letter, octave follow...
Each parsed note exists only as a handful of local variables on the stack. Nothing gets stored. Parse a note, play it, move on. The RAM footprint is essentially zero beyond the stack frame.
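The excerpt stops before the sharp, note letter, and octave. Here is a rough sketch of how the rest of a token might be read. It is not the firmware's code: it reads from an ordinary C string instead of PROGMEM so it compiles on a desktop, and `parse_token`, `NOTE_REST`, the `semitone` and `octave` fields, and the decision to reject `#b` are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define NOTE_REST 0xFF  /* hypothetical marker for a rest */

/* Hypothetical note record. duration and dotted follow the excerpt above;
   semitone and octave are assumed field names. */
typedef struct {
    uint8_t duration;  /* 1, 2, 4, 8, 16 or 32 */
    uint8_t dotted;    /* 1 if the token had a '.' */
    uint8_t semitone;  /* 0..11 counted from C, or NOTE_REST */
    uint8_t octave;    /* 1..3 */
} Note;

/* Semitone offsets from C for the letters a, b, c, d, e, f, g. */
static const uint8_t letter_semitone[7] = { 9, 11, 0, 2, 4, 5, 7 };

/* Parse one token from a plain RAM string, advancing *pos past it.
   Returns false at end of string or on a malformed token. */
static bool parse_token(const char *str, uint16_t *pos, Note *out) {
    char ch;
    while ((ch = str[*pos]) == ' ')
        (*pos)++;
    if (ch == '\0')
        return false;

    /* Duration digits, defaulting to a quarter note. */
    uint8_t dur = 0;
    while (ch >= '0' && ch <= '9') {
        dur = (uint8_t)(dur * 10 + (ch - '0'));
        ch = str[++(*pos)];
    }
    out->duration = dur ? dur : 4;

    out->dotted = 0;
    if (ch == '.') {
        out->dotted = 1;
        ch = str[++(*pos)];
    }

    uint8_t sharp = 0;
    if (ch == '#') {
        sharp = 1;
        ch = str[++(*pos)];
    }

    if (ch == '-') {
        out->semitone = NOTE_REST;
    } else if (ch >= 'a' && ch <= 'g') {
        uint8_t semi = (uint8_t)(letter_semitone[ch - 'a'] + sharp);
        if (semi > 11)
            return false;  /* "#b" would spill into the next octave */
        out->semitone = semi;
    } else {
        return false;      /* not a note letter or rest */
    }
    ch = str[++(*pos)];

    /* Optional octave digit, defaulting to 1. */
    out->octave = 1;
    if (ch >= '1' && ch <= '3') {
        out->octave = (uint8_t)(ch - '0');
        (*pos)++;
    }
    return true;
}
```

Calling it repeatedly on `"8.#f2 2-"` yields a dotted eighth F-sharp in octave 2, then a half-note rest, then `false` at the end of the string.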
Frequency mapping
I needed to convert note names to frequencies. The standard approach is a lookup table with every frequency pre-calculated. For three octaves of twelve semitones each, that’s 36 entries at 2 bytes each – 72 bytes. Not terrible, but unnecessary.
Instead, I store one octave of twelve base frequencies (C4 through B4) in PROGMEM – 24 bytes. Then I use bit-shifting for octaves:
static const uint16_t base_freq[12] PROGMEM = {
    262, 277, 294, 311, 330, 349,  // C4 C#4 D4 D#4 E4 F4
    370, 392, 415, 440, 466, 494   // F#4 G4 G#4 A4 A#4 B4
};
uint16_t freq = pgm_read_word(&base_freq[semitone]);
if (octave == 2) freq <<= 1;       // multiply by 2
else if (octave == 3) freq <<= 2;  // multiply by 4
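Bit-shifting is cheap on AVR: shifting a 16-bit value left by one position takes two single-cycle instructions. No floating point, no pow() calls. Doubling the table values also doubles their rounding error, so the octave 3 frequencies land within about 1.5Hz of the mathematically correct values – close enough that no human ear could tell the difference through a piezo buzzer.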
A separate 7-byte table maps note letters to their semitone index within an octave. C maps to 0, D to 2, E to 4, and so on. Add 1 for sharps. Total flash cost for the entire frequency system: 31 bytes.
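Put together, the two tables and the shift might look something like this. It is a desktop-friendly sketch rather than the firmware itself: the tables live in ordinary memory instead of PROGMEM, and the names `letter_to_semitone` and `note_freq`, plus the choice to return 0 for anything invalid, are assumptions.

```c
#include <stdint.h>

/* One octave of base frequencies in Hz, C4 through B4. */
static const uint16_t base_freq_hz[12] = {
    262, 277, 294, 311, 330, 349,
    370, 392, 415, 440, 466, 494
};

/* Semitone index within the octave for the letters a, b, c, d, e, f, g. */
static const uint8_t letter_to_semitone[7] = { 9, 11, 0, 2, 4, 5, 7 };

/* Returns the frequency in Hz, or 0 for anything that is not a valid note. */
static uint16_t note_freq(char letter, uint8_t sharp, uint8_t octave) {
    if (letter < 'a' || letter > 'g' || octave < 1 || octave > 3)
        return 0;
    uint8_t semitone = (uint8_t)(letter_to_semitone[letter - 'a'] + (sharp ? 1 : 0));
    if (semitone > 11)
        return 0;  /* "#b" is not handled in this sketch */
    /* Each octave up doubles the frequency: shift left by (octave - 1). */
    return (uint16_t)(base_freq_hz[semitone] << (octave - 1));
}
```

With these tables, `note_freq('a', 0, 1)` gives 440 and `note_freq('f', 1, 2)` gives 740, the F#5 from the notation example. `note_freq('c', 0, 3)` gives 1048 against a true C6 of about 1046.5Hz, which is where the rounding error from the shift shows up.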
Duration math
The original Nokia Composer calculates note duration as:
total_ms = 240000 / BPM / duration_value
For dotted notes, multiply by 1.5. On the ATtiny85, I avoid floating point by using integer math:
uint32_t ms = 240000UL / bpm / n->duration;
if (n->dotted) ms += ms >> 1; // +50% via bit shift
At 120 BPM, a quarter note lasts 500ms. An eighth note, 250ms. A sixteenth, 125ms. These durations control the buzzer timing.
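To check those numbers, the same arithmetic can be wrapped in a small function and run on a desktop. The name `note_ms` and the guard against zero are assumptions for this sketch; the formula is the one above.

```c
#include <stdint.h>

/* Duration of one note in milliseconds, integer math only.
   duration is the note value (1, 2, 4, 8, 16, 32). */
static uint32_t note_ms(uint16_t bpm, uint8_t duration, uint8_t dotted) {
    if (bpm == 0 || duration == 0)
        return 0;                      /* avoid dividing by zero */
    uint32_t ms = 240000UL / bpm / duration;
    if (dotted)
        ms += ms >> 1;                 /* +50% via bit shift */
    return ms;
}
```

At 120 BPM this gives 500, 250, and 125 for a quarter, eighth, and sixteenth, and 375 for a dotted eighth. Integer division truncates at each step, so an eighth note at 112 BPM comes out as 267ms rather than 267.86ms. The loss is under a millisecond per note here.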
The square wave
Sound on the ATtiny85 is direct and physical. There’s no DAC, no audio codec, no PWM-to-analog conversion. The chip toggles a pin high and low at the target frequency, and a piezo buzzer converts those voltage swings into sound.
static void play_tone(uint16_t freq, uint16_t duration_ms) {
    if (freq == 0) {
        // Rest: just wait
        for (uint16_t i = 0; i < duration_ms; i++)
            _delay_ms(1);
        return;
    }
    uint16_t half_period = 500000UL / freq;
    uint16_t cycles = ((uint32_t)freq * duration_ms) / 1000;
    for (uint16_t i = 0; i < cycles; i++) {
        PORTB |= (1 << BUZZER_PIN);
        delayMicroseconds(half_period);
        PORTB &= ~(1 << BUZZER_PIN);
        delayMicroseconds(half_period);
    }
}
For A4 (440Hz), the half-period is 1136 microseconds. The pin goes high for 1136us, low for 1136us, roughly 440 times per second. That’s a 50% duty-cycle square wave – the same waveform the original Nokia phones produced.
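The two numbers that drive the loop can be computed separately from the pin toggling, which makes them easy to check off-chip. This is only a sketch of that arithmetic: `TonePlan` and `plan_tone` are made-up names, and it assumes the frequency is at least 8Hz so the half-period fits in 16 bits (the lowest table entry is 262Hz).

```c
#include <stdint.h>

typedef struct {
    uint16_t half_period_us;  /* time the pin spends high, then low */
    uint16_t cycles;          /* number of full high/low cycles to emit */
} TonePlan;

/* Work out the loop parameters for one tone. A frequency of 0 means a rest. */
static TonePlan plan_tone(uint16_t freq_hz, uint16_t duration_ms) {
    TonePlan p = { 0, 0 };
    if (freq_hz == 0)
        return p;
    p.half_period_us = (uint16_t)(500000UL / freq_hz);
    p.cycles = (uint16_t)(((uint32_t)freq_hz * duration_ms) / 1000);
    return p;
}
```

For A4 held for 350ms this gives a half-period of 1136us and 154 cycles. Since 154 cycles of 2272us add up to just under 350ms, the truncation costs a fraction of a millisecond. Loop overhead on real hardware adds a little on top.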
The 70/30 rule
Here’s a detail that took me a while to get right. If you play each note for its full duration and then immediately start the next note, it sounds wrong. Too smooth. Too connected. Nokia ringtones have a distinctive staccato quality – each note is clearly separated.
I dug into zserge’s JavaScript implementation to understand why. The source is about 20 lines of code. The relevant part:
t = t + d*7; // note ON for 70% of duration
v(g.gain, 0); // mute
t = t + d*3; // silence for 30% of duration
Every note plays for 70% of its duration. The remaining 30% is silence. That gap between notes is what makes Nokia ringtones sound like Nokia ringtones. Without it, the Nokia Tune sounds like an organ. With it, it sounds like a phone.
I initially used 90/10. Too legato. The tunes sounded vaguely right but not recognizably Nokia. Switching to the 70/30 split made an immediate difference. Mission Impossible’s opening rhythm went from muddy to crisp.
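In integer math the split is one multiply and one divide. A minimal sketch, assuming the total note length is already known in milliseconds; `NoteTiming` and `split_note` are names invented here:

```c
#include <stdint.h>

typedef struct {
    uint16_t tone_ms;  /* buzzer on */
    uint16_t gap_ms;   /* silence before the next note */
} NoteTiming;

/* Split a note into 70% tone and 30% silence without floating point. */
static NoteTiming split_note(uint16_t total_ms) {
    NoteTiming t;
    t.tone_ms = (uint16_t)(((uint32_t)total_ms * 7) / 10);
    t.gap_ms = (uint16_t)(total_ms - t.tone_ms);  /* remainder, so the parts sum to total_ms */
    return t;
}
```

A 500ms quarter note becomes 350ms of tone and 150ms of silence. Taking the gap as the remainder keeps the two parts summing to the original length even when the division truncates, as in 125ms splitting into 87 and 38.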
The timing problem
The first version had a bug that manifested as a subtle drag in faster tunes. The loop for each note went: update display, play tone, wait for gap. The display update writes to the OLED over I2C – clearing pages, drawing the note name, rendering a pitch bar, updating the scrolling history. All of that takes time. About 20-30 milliseconds.
For a whole note at 120 BPM, 25ms of overhead in a 2-second note is barely noticeable. For a sixteenth note at 100 BPM, 25ms in a 150ms note adds 17% to the duration. That’s audible. Mission Impossible sounded sluggish. The rhythm dragged, especially during the fast repeated phrases.
The fix was to rearrange the sequence. Instead of display-then-tone, the code plays the tone first, then uses the silent gap to update the display. A millis() measurement tracks how long the display update takes, and the remaining gap time is adjusted accordingly:
play_tone(freq, tone_dur);
// Update display during the silent gap
unsigned long t0 = millis();
draw_playback(&n, tune_idx);
unsigned long elapsed = millis() - t0;
if (gap_dur > elapsed) {
    play_tone(0, gap_dur - elapsed);
}
The display update gets absorbed into time that was going to be silence anyway. No extra latency between notes. The rhythm stays tight.
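The one case this cannot hide is a gap shorter than the display update itself. The arithmetic below is a sketch for reasoning about that, using the rough 25ms display estimate from earlier. The function names are hypothetical.

```c
#include <stdint.h>

/* Silence still owed after the display update has eaten into the gap. */
static uint16_t remaining_gap_ms(uint16_t gap_ms, uint16_t elapsed_ms) {
    return (gap_ms > elapsed_ms) ? (uint16_t)(gap_ms - elapsed_ms) : 0;
}

/* How far the note runs long when the update takes more than the gap. */
static uint16_t overrun_ms(uint16_t gap_ms, uint16_t elapsed_ms) {
    return (elapsed_ms > gap_ms) ? (uint16_t)(elapsed_ms - gap_ms) : 0;
}
```

A sixteenth note at 100 BPM has a 45ms gap, so a 25ms update leaves 20ms of silence and the note ends on time. A thirty-second note at 125 BPM lasts 60ms with an 18ms gap, so the same update would overrun by about 7ms.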
The display
The OLED shows two screens. While idle, it displays the current tune name, its index in the list, and button hints (“A:Next B:Play”). During playback, it switches to the current note name, a horizontal pitch bar that stretches wider for higher frequencies, and a scrolling history of the last six notes along the bottom.
I went with the 6x8 pixel font exclusively. The larger 8x16 font looks nicer, but it pulls in 1,722 bytes of font data. On an 8KB chip, that’s over 20% of total flash consumed by letter shapes. The 6x8 font is 576 bytes and perfectly readable on a 128x64 screen.
One thing I learned: centering text on the OLED requires knowing the string length in advance. The font is 6 pixels wide per character, the screen is 128 pixels wide. A quick calculation at draw time positions the text properly. Not hard, but the kind of thing you don’t think about until “Mission Impossible” overflows the first line and wraps into garbage on the second.
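The centering calculation is roughly the following. This is a sketch under assumptions: it works on a plain C string rather than one in PROGMEM, and `centered_x` and the two constants are illustrative names, not part of the ssd1306xled API.

```c
#include <stdint.h>
#include <string.h>

#define SCREEN_WIDTH_PX 128
#define FONT_WIDTH_PX   6

/* Left edge, in pixels, that centers the text on one line.
   Text too wide for the screen is pinned to the left edge. */
static uint8_t centered_x(const char *text) {
    size_t width_px = strlen(text) * FONT_WIDTH_PX;
    if (width_px >= SCREEN_WIDTH_PX)
        return 0;
    return (uint8_t)((SCREEN_WIDTH_PX - width_px) / 2);
}
```

“Nokia Tune” is 10 characters, or 60 pixels, so it starts at x = 34. Anything past 21 characters no longer fits on a line and gets pinned to the left edge instead of producing a negative offset.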
Seven tunes in 59% flash
The current build carries seven tunes:
| Tune | BPM | Notes |
|---|---|---|
| Nokia Tune | 120 | The original. Everyone knows it. |
| Harry Potter | 125 | Hedwig’s Theme. The dotted notes give it that waltz feel. |
| Imperial March | 112 | Darth Vader’s entrance music. Heavy on the low A’s. |
| Mission Impossible | 100 | The one with the iconic 5/4 rhythm. Fast sixteenth notes. |
| Nacht | 125 | Lots of thirty-second note trills that test the parser. |
| Star Wars | 100 | The main theme. Opens with those three repeated C-sharps. |
| Airtel | 140 | The Indian telecom jingle. Higher BPM with dotted eighth patterns. |
All seven tunes, plus the parser, frequency tables, display code, button handling, and the ssd1306xled library, fit in 4,822 bytes. That’s 59% of the ATtiny85’s flash. There’s room for about 15 more tunes before things get tight.
Adding a new tune takes a name, a note string in PROGMEM, and a BPM value. The two strings look like this:
static const char tune_name[] PROGMEM = "My Tune";
static const char tune_data[] PROGMEM = "4c 4d 4e 4f 4g 4a 4b 4c2";
The Nokia Composer notation is human-readable enough that you can transcribe tunes by ear, or find thousands of transcriptions online from the era when people actually did this for fun.
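One way to tie the name, notes, and BPM together is a small table of structs. The layout below is a guess at how that could be organised, not the project's actual data structure: `Tune`, `tunes`, `TUNE_COUNT`, and `next_tune` are hypothetical, and the 120 BPM for “My Tune” is an arbitrary choice. The `PROGMEM` fallback only exists so the sketch also compiles off-chip.

```c
#include <stdint.h>

#ifdef __AVR__
#include <avr/pgmspace.h>
#else
#define PROGMEM  /* no-op on a desktop compiler */
#endif

/* Hypothetical record tying a tune's strings to its tempo. */
typedef struct {
    const char *name;  /* points at a PROGMEM string on AVR */
    const char *data;  /* points at a PROGMEM string on AVR */
    uint8_t bpm;
} Tune;

static const char nokia_name[] PROGMEM = "Nokia Tune";
static const char nokia_data[] PROGMEM =
    "16e2 16d2 8#f 8#g 16#c2 16b 8d 8e 16b 16a 8#c 8e 2a 2-";
static const char my_name[] PROGMEM = "My Tune";
static const char my_data[] PROGMEM = "4c 4d 4e 4f 4g 4a 4b 4c2";

/* The table itself is small, so this sketch leaves it in RAM on AVR. */
static const Tune tunes[] = {
    { nokia_name, nokia_data, 120 },
    { my_name,    my_data,    120 },  /* BPM chosen arbitrarily */
};

#define TUNE_COUNT (sizeof(tunes) / sizeof(tunes[0]))

/* What a "Next tune" button press would do: advance and wrap around. */
static uint8_t next_tune(uint8_t idx) {
    return (uint8_t)((idx + 1) % TUNE_COUNT);
}
```

With this layout a new tune is two strings and one extra row in the table.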
Try it
The whole thing runs in your browser. No hardware needed. I put it on Wokwi.
Click “Play” to start the simulation. The blue “Next” button cycles through tunes. The green “Play” button starts playback. You’ll hear the buzzer through your speakers.
The library is at github.com/tejashwikalptaru/ssd1306xled, and the documentation site has the full API reference.
The thing about square waves
There’s something I keep coming back to about this project. The Nokia 3310 was released in 2000. It had a monochrome screen, a piezo buzzer, and enough processing power to play Snake. The ringtones were square waves – literally the simplest possible waveform. On and off. High and low. Binary, in the most literal sense.
Twenty-six years later, I’m playing those same square waves through a chip that costs less than a dollar, displayed on an OLED that would have been science fiction in 2000. The notation strings that people used to text each other still work. The format is so simple it’s almost impossible to break.
The Nokia Tune, compressed into 56 characters of text, played through an 8-pin microcontroller the size of a fingernail. The tunes survived because they were small enough to share, simple enough to remember, and annoying enough to never forget.
I think that’s worth building.