1 Billion Row Challenge

Calculate the min, max, and average of 1 billion measurements

Accept the challenge

Original blog post

1BRC in C/C++

Try your hand at processing 12 GB of text using low-level C code! ⚡

Submit your solution!

1BRC in Python

Use the power of snakes to read 1 billion lines of text! 🐍

Submit your solution!

1BRC in Go

Go get started to see if you can average 1B measurements in Go! 🐹

Submit your solution!

1BRC in JavaScript

Wrangle with the world's most popular programming language to process 1B rows! 💻

Submit your solution!

1BRC in Rust

Embrace your inner iron crab and read a ginormous file in Rust! 🦀

Submit your solution!

1BRC in Zig

Use this new language to process 1B rows of text! 🧩

Submit your solution!

1BRC in PHP

ElePHPants are not as slow as one might think! 🐘

Submit your solution!

1BRC in C#

Sharpen your wide span<T>(ing) skills and refresh your memory<T>

Submit your solution!

1BRC in Java closed

~~The original 1BRC language! 🎉~~

View historical submissions

1BRC in Julia closed

~~The original 1BRC language! 🎉~~

View historical submissions

Don't see your favorite language listed above? Open an Issue to add it!

Choose one of the languages listed above to see the language-specific leaderboard and instructions for submitting your solution to that language's repository.

Global leaderboard

TODO: Make sure this is up-to-date

	Time	Solution	Language	Author
1.	6.159s	link	Java	royvanrijn
2.	6.532s	link	Java	Thomas Wuerthinger
3.	7.620s	link	Java	Quan Anh Mai
4.	9.062s	link	Java	obourgain
5.	9.338s	link	Java	Elliot Barlas
6.	10.589s	link	Java	Artsiom Korzun
7.	10.613s	link	Java	Sam Pullara
8.	11.038s	link	Java	Andrew Sun
9.	11.222s	link	Java	Jamie Stansfield
10.	13.277s	link	Java	Yavuz Tas
	4m 13.449s	link	Java	Reference implementation

You can view language-specific leaderboards on each language's competition page.

💪 The challenge

Your mission, should you choose to accept it, is to write a program that retrieves temperature measurement values from a text file and calculates the min, mean, and max temperature per weather station. There's just one caveat: the file has 1,000,000,000 rows! That's more than 10 GB of data! 😱

The text file has a simple structure with one measurement value per row:

Hamburg;12.0
Bulawayo;8.9
Palembang;38.8
Hamburg;34.2
St. John's;15.2
Cracow;12.6
... etc. ...

The program should print out the min, mean, and max values per station, alphabetically ordered. The format that is expected varies slightly from language to language, but the following example shows the expected output for the first three stations:

Bulawayo;8.9;22.1;35.2
Hamburg;12.0;23.1;34.2
Palembang;38.8;39.9;41.0
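
As a point of reference, the task described above can be sketched in a few lines of Python. This is a deliberately naive baseline, not any repository's reference implementation: the function name `summarize` is hypothetical, the `station;min;mean;max` line format simply mirrors the example above (the exact format varies by language), and sorting by Unicode code point is an assumption about what "alphabetically" means.

```python
from typing import Dict, Iterable, List


def summarize(lines: Iterable[str]) -> List[str]:
    """Return 'station;min;mean;max' lines, sorted by station name."""
    # station name -> [min, max, sum, count]
    stats: Dict[str, List[float]] = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # tolerate a trailing empty line
        # Station names never contain ';', so the first ';' is the separator.
        station, _, value = line.partition(";")
        temp = float(value)
        entry = stats.get(station)
        if entry is None:
            stats[station] = [temp, temp, temp, 1]
        else:
            entry[0] = min(entry[0], temp)
            entry[1] = max(entry[1], temp)
            entry[2] += temp
            entry[3] += 1
    # Station names are unique keys, so sorting the items sorts by name.
    return [
        f"{name};{s[0]:.1f};{s[2] / s[3]:.1f};{s[1]:.1f}"
        for name, s in sorted(stats.items())
    ]


sample = ["Hamburg;12.0\n", "Bulawayo;8.9\n", "Palembang;38.8\n", "Hamburg;34.2\n"]
for row in summarize(sample):
    print(row)
```

Any iterable of text lines works, including an open file object. Expect a loop like this to be slow on the full file; making it fast is the actual challenge.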

Oh, and the input file is different for each submission since it's generated on demand. So no hard-coding the results! 😉

Choose a language from the cards at the top of this page to get started! 🚀

Rules and limits

No external library dependencies may be used. That means no lodash, no numpy, no Boost, no nothing. You're limited to the standard library of your language.
Implementations must be provided as a single source file. Try to keep it relatively short; don't copy-paste a library into your solution as a cheat.
The computation must happen at application runtime; you cannot process the measurements file at build time.
Input value ranges are as follows:
- Station name: non-null UTF-8 string with a minimum length of 1 character and a maximum length of 100 bytes (i.e. this could be 100 one-byte characters, or 50 two-byte characters, etc.)
- Temperature value: non-null double between -99.9 (inclusive) and 99.9 (inclusive), always with one fractional digit
There is a maximum of 10,000 unique station names.
Implementations must not rely on specifics of a given data set. Any valid station name as per the constraints above and any data distribution (number of measurements per station) must be supported.
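
Because every temperature has exactly one fractional digit and lies between -99.9 and 99.9, one option is to skip floating-point parsing and keep values as integer tenths of a degree. The sketch below illustrates that idea only; `parse_tenths` and `format_tenths` are hypothetical helpers, and the code assumes well-formed input as described by the rules above.

```python
def parse_tenths(raw: bytes) -> int:
    """Parse b'-12.3' into -123 (tenths of a degree) without float()."""
    negative = raw[:1] == b"-"
    digits = raw[1:] if negative else raw
    # With exactly one fractional digit the layout is always <whole>.<d>,
    # so the last byte is the fraction and everything before '.' is whole.
    whole, frac = digits[:-2], digits[-1:]
    value = int(whole) * 10 + int(frac)
    return -value if negative else value


def format_tenths(value: int) -> str:
    """Render integer tenths back into a one-decimal string."""
    sign = "-" if value < 0 else ""
    whole, frac = divmod(abs(value), 10)
    return f"{sign}{whole}.{frac}"


print(format_tenths(parse_tenths(b"-99.9")))
```

Min, max, and sum can then be tracked as plain integers, and only the mean needs a division at the end.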

Entering the challenge

Some languages have special instructions but in general here's what you can expect:

Create a fork of the 1BRC repository for your language on your own GitHub profile. This will let you submit your solution via a pull request.
Create a new implementation file in the repository. How you do this varies by language: in JavaScript you might create a new src/<username>.js file, while in C++ you might create a new src/<username>.cpp file. It's recommended to copy the default reference solution and modify it from there.
Make that implementation fast. Really fast.
Test & benchmark your solution! There are usually language-specific instructions on how to do this, but in general you run <some-command> bench <username> to run your solution against the reference implementation. If your output differs from the reference output, fix the differences before submitting your implementation.
Create a pull request against the upstream repository! 🎉 The pull request template usually asks for additional information, such as how long your solution took on your computer and your computer's specs.
Someone or some robot will run your solution "officially" on the same hardware as everyone else's solution (so no hardware differences) and report the results. If you're the fastest, you win! 🏆 If not, you'll still probably go on the leaderboard. 🥉
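
If your language's tooling doesn't show differences for you, the comparison in the testing step can be approximated by hand. The following is a minimal sketch using Python's standard `difflib` module; `output_diff` is a hypothetical helper, and it assumes you have captured both the reference output and your own output as strings.

```python
import difflib
from typing import List


def output_diff(expected: str, actual: str) -> List[str]:
    """Return unified-diff lines; an empty list means the outputs match."""
    return list(
        difflib.unified_diff(
            expected.splitlines(),
            actual.splitlines(),
            fromfile="reference",
            tofile="candidate",
            lineterm="",
        )
    )


reference = "Bulawayo;8.9;22.1;35.2\nHamburg;12.0;23.1;34.2"
candidate = "Bulawayo;8.9;22.1;35.2\nHamburg;12.0;23.2;34.2"
for line in output_diff(reference, candidate):
    print(line)
```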

If you'd like to discuss any potential ideas for implementing 1BRC with the community, you can use the GitHub Discussions of this @1brc GitHub organization or the language-specific repository discussions. Please keep it friendly and civil.

Prize 🎁

If you enter this challenge, you may learn something new, get to inspire others, and take pride in seeing your name listed on the leaderboard above. Rumor has it that the winner of the Java competition (the original challenge language) may receive a unique 1️⃣🐝🏎️ t-shirt, too!

FAQ

Make sure you check your language-specific FAQ as well. 😉

What is the encoding of the measurements.txt file?

The file is encoded as UTF-8.

Can I make assumptions on the names of the weather stations showing up in the data set?

No. While only a fixed set of station names is used by the data set generator, any solution should work with arbitrary UTF-8 station names. For the sake of simplicity, names are guaranteed to contain no ; character.
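
One practical consequence is that a record can be split at the byte level without decoding the station name first. In UTF-8, every byte of a multi-byte character is 0x80 or higher, so the byte for `;` (0x3B) can only ever be the separator. The sketch below shows this in Python; `split_record` is a hypothetical helper, not part of any reference solution.

```python
from typing import Tuple


def split_record(line: bytes) -> Tuple[bytes, bytes]:
    """Split b'name;value' at the separator without decoding the name."""
    name, sep, value = line.rstrip(b"\n").partition(b";")
    if not sep:
        raise ValueError("missing ';' separator")
    return name, value


# Works for arbitrary UTF-8 names; decode only when producing output.
name, value = split_record("Zürich;5.1\n".encode("utf-8"))
print(name.decode("utf-8"), value.decode("ascii"))
```

Keeping names as raw bytes until output time avoids decoding one billion strings, although whether that pays off depends on the language and runtime.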

Can I copy code from other submissions?

Yes, you can. The primary focus of the challenge is about learning something new, rather than "winning". When you do so, please give credit to the relevant source submissions. Please don't re-submit other entries with no or only trivial improvements.

My solution runs in 2 sec on my machine. Am I the fastest 1BRC-er in the world?

Probably not. 😊 1BRC results are reported in wall-clock time, so results of different implementations are only comparable when obtained on the same machine. If, for instance, an implementation is faster on a 32-core workstation than on the 8-core evaluation instance, that doesn't allow for any conclusions. When sharing 1BRC results, always share the result of running the baseline implementation on the same hardware as well.
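
To report a result alongside the baseline, it helps to time both on the same machine in the same way. The following is a rough sketch using Python's `time.perf_counter`; `wall_clock` and `compare` are hypothetical helper names, the two callables stand in for whatever runs the baseline and your solution, and the official evaluation is still the one that counts.

```python
import time
from typing import Callable, Tuple


def wall_clock(fn: Callable[[], object]) -> float:
    """Run fn once and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


def compare(
    baseline: Callable[[], object], candidate: Callable[[], object]
) -> Tuple[float, float, float]:
    """Time both callables on this machine and return (base, cand, speedup)."""
    base_s = wall_clock(baseline)
    cand_s = wall_clock(candidate)
    speedup = base_s / cand_s if cand_s > 0 else float("inf")
    return base_s, cand_s, speedup


# Stand-in workloads; replace them with calls that run the real programs.
base_s, cand_s, speedup = compare(
    lambda: sum(range(200_000)), lambda: sum(range(100_000))
)
print(f"baseline {base_s:.4f}s, candidate {cand_s:.4f}s, speedup {speedup:.2f}x")
```

A single run is noisy, so repeating the measurement a few times and reporting all runs gives a more honest picture.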

Why 1️⃣🐝🏎️?

It's the abbreviation of the project name: the One Billion Row Challenge.
