Programming Blockchain with Jimmy Song
Last week, I had the pleasure of attending Jimmy Song's Programming Blockchain Interactive 2-day Seminar. For those unfamiliar with the course, it's an intensive deep dive into the Bitcoin protocol covering details like Finite Fields, Elliptic Curve Cryptography, Creating/Signing/Parsing Transactions, Merkle Tree Construction, and much more. It's all done in Python, so it's hands-on live coding, not a theoretical lecture.
Taking this seminar may cause your brain to explode - but in a very good way. There is so much content to cover in 2 days that you might feel like you're drinking out of a firehose. It's not just about the knowledge, either; it's also about furthering the developer ecosystem and joining a community of like-minded devs to collaborate and build things. I was impressed with the diversity of my classmates, who were of varied backgrounds, ages, and genders. A high percentage of folks had been involved in blockchain/Bitcoin since before 2015, which is somewhat unusual at blockchain events these days.
Ok so what do you need to know to take this course? At least a basic understanding of blockchain/Bitcoin and some Python. If you are a coder, you can probably figure out Python syntax on the fly, but then it will be for sure firehose time (it's ok - if I can do it, you can do it too!)
Before your seminar starts, you are asked to set up your environment such that you're running the correct version of Python and are able to do some prelim exercises on Jupyter notebook. This ensures that the class is working with the same dev environment and that you have a little familiarity with Jupyter notebook.
Upon arrival, you get a copy of Mastering Bitcoin: Programming the Open Blockchain and a handy dandy laminated cheatsheet.
Eat Your Vegetables
To understand how this all works, first we must spend some time on foundational math. So if it's been a while, dust off those quanty thinking caps.
A finite field is a set of numbers that:
- Is finite
- Is closed under +, -, *, and / (except division by 0), meaning the result of any operation is still in the set
- In our case, is used with Elliptic Curves for cryptography
What is modulo? It can be thought of like a remainder.
Example: If we are working in the prime field of 19...
11 + 6 = 17 (mod 19)
8 + 14 = 3 (mod 19)
Normally, 8 + 14 = 22. Now if we are in mod 19, that means 22-19 = 3 (mod 19)
Note that negatives also work. Normally, 4 - 12 = -8. In mod 19, that means -8 + 19 = 11 (mod 19)
4 - 12 = 11 (mod 19)
You go through a lot more examples in class, including all the different operations of course.
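The examples above can be tried out directly in Python, whose % operator already returns a non-negative remainder when the modulus is positive. This is a minimal sketch, not code from the seminar; the multiplication and division lines are extra illustrations that are not in the list above.

```python
# Arithmetic in the prime field of 19 (illustrative sketch).
PRIME = 19

add_1 = (11 + 6) % PRIME   # 17
add_2 = (8 + 14) % PRIME   # 22 wraps around to 3
sub_1 = (4 - 12) % PRIME   # -8 wraps around to 11

# Multiplication works the same way: 5 * 7 = 35, and 35 - 19 = 16.
mul_1 = (5 * 7) % PRIME

# Division is multiplication by the inverse. By Fermat's little theorem,
# n^(PRIME - 2) is the inverse of n when PRIME is prime and n is not 0.
inverse_of_7 = pow(7, PRIME - 2, PRIME)   # 11, because 7 * 11 = 77 = 4 * 19 + 1
div_1 = (2 * inverse_of_7) % PRIME        # 2 / 7 = 3, because 3 * 7 = 21 = 2 (mod 19)

print(add_1, add_2, sub_1, mul_1, div_1)  # 17 3 11 16 3
```

Every result lands back in the range 0 to 18, which is what "closed" means here.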
Now why do we care about Finite Fields? Because they are used in Elliptic Curve Cryptography (ECC), which is what Bitcoin uses.
Elliptic Curves are important because of something called point addition. If we go back to algebra class, you may remember the concept of linear, quadratic, and cubic. Elliptic Curves follow the format of y^2 = x^3 + ax + b. Specifically, Bitcoin uses secp256k1, which is y^2 = x^3 + 7 (that is, a = 0 and b = 7).
An Elliptic Curve is a plane algebraic curve that is non-singular; that is, it has no cusps or self-intersections. It is symmetric about the x-axis, so for any x-coordinate on the curve, you have a positive y-coordinate and a matching negative y-coordinate.
As mentioned earlier, Elliptic Curves are important because of Point Addition. It is possible to "add" 2 different points on an Elliptic Curve.
Given a straight line going through an Elliptic Curve, if it intersects the curve at 2 points, it will intersect a 3rd time (vertical lines and tangent lines are the exceptions, and they are handled as special cases). To add points P and Q in the below example:
- You draw the line connecting P and Q
- See where it intersects the 3rd time, at point R
- P + Q = the mirror point of point R
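There is a concept of Point at Infinity, which can be thought of as a zero: adding it to any point P gives you back P. It is what you get when the line through the 2 points is vertical, such as when you add a point to its own mirror image.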
What does this have to do with Bitcoin? This whole rigamarole is to create a public key from a private key! A public key is actually a point!
The Elliptic Curve pictured above is a Curve over Reals. You can also have Curves over Finite Fields, and point addition still works.
To confirm that a point is on a curve over a finite field, plug its x and y into the curve equation and check that both sides are equal modulo the prime.
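Here is a rough sketch of that check in Python, along with point addition over a finite field. It assumes the Bitcoin curve y^2 = x^3 + 7 over a deliberately tiny prime field of 223; the prime and the sample points are my own choices for illustration and are not taken from the text above. The Point at Infinity is represented as None.

```python
# Curve y^2 = x^3 + A*x + B over the prime field of P (toy-sized on purpose).
P = 223
A, B = 0, 7

def on_curve(point):
    """Return True if the point satisfies the curve equation modulo P."""
    if point is None:          # None stands for the Point at Infinity
        return True
    x, y = point
    return (y * y - (x ** 3 + A * x + B)) % P == 0

def add(p1, p2):
    """Add two points on the curve using the line-and-mirror rule."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None            # vertical line: the result is the Point at Infinity
    if p1 == p2:
        # Tangent line: slope is (3*x1^2 + A) / (2*y1)
        slope = (3 * x1 * x1 + A) * pow((2 * y1) % P, P - 2, P) % P
    else:
        # Line through two distinct points: slope is (y2 - y1) / (x2 - x1)
        slope = (y2 - y1) * pow((x2 - x1) % P, P - 2, P) % P
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P   # mirror of the 3rd intersection
    return (x3, y3)

print(on_curve((192, 105)))   # True
print(on_curve((17, 56)))     # True
print(on_curve((200, 119)))   # False
print(add((192, 105), (17, 56)))   # (170, 142)
```

Division is done by multiplying with the inverse, computed as n^(P - 2) mod P, which only works because P is prime.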
Public Key Cryptography
Breezing over much of the detail you would learn in class, know that in Bitcoin, there is a fixed Generator Point (G) that everyone uses. Your private key is some scalar (s), and you use these to derive your public key (P) by adding G to itself s times. In Bitcoin, secp256k1 private keys are 256 bits of data.
P = sG
In a really large group with on the order of 2^256 elements (generated by finite field math), where s is really large, finding P when we know s is easy. Finding s when we know P is not. This is referred to as the Discrete Log Problem, and it is part of what helps make Bitcoin secure. If you have heard of public and private keys being mathematically related, this is what they mean.
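To make P = sG concrete, here is a minimal sketch of the "easy" direction using the published secp256k1 constants and the standard double-and-add technique. The function names and the toy private key are hypothetical, this is not the seminar's code, and it is not safe for real keys (it is not constant-time, for one).

```python
# Published secp256k1 parameters for the curve y^2 = x^3 + 7.
P = 2**256 - 2**32 - 977   # the prime of the finite field
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # order of G
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)

def add(p1, p2):
    """Point addition; None stands for the Point at Infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if p1 == p2:
        slope = 3 * x1 * x1 * pow((2 * y1) % P, P - 2, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, P - 2, P) % P
    x3 = (slope * slope - x1 - x2) % P
    return (x3, (slope * (x1 - x3) - y1) % P)

def multiply(s, point):
    """Compute s * point with double-and-add: about 256 steps, not s steps."""
    result = None
    addend = point
    while s:
        if s & 1:
            result = add(result, addend)
        addend = add(addend, addend)
        s >>= 1
    return result

secret = 12345                      # hypothetical toy private key
public_key = multiply(secret, G)    # a point (x, y) on the curve
print(hex(public_key[0]))
```

Going from secret to public_key takes a few hundred additions. Going backwards would mean searching through roughly 2^256 candidates for s.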
For what it's worth, 2^256 is a really big number.
2^256 is about the same as 10^77
Number of atoms in and on earth ~ 10^50
Number of atoms in the solar system ~ 10^57
Trillion computers doing a trillion operations every trillionth of a second for a trillion years < 10^56 operations
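The comparison above is easy to sanity-check with Python's arbitrary-precision integers. This is just back-of-the-envelope arithmetic; the assumption of 365.25 days per year is mine.

```python
import math

keyspace = 2 ** 256
print(math.log10(keyspace))     # a little over 77, so 2^256 is about 10^77

TRILLION = 10 ** 12
SECONDS_PER_YEAR = 31_557_600   # 365.25 days, an assumption for this estimate

# A trillion computers, each doing a trillion operations every trillionth
# of a second, i.e. 10^12 * 10^12 * 10^12 operations per second in total.
ops_per_second = TRILLION * TRILLION * TRILLION
total_ops = ops_per_second * SECONDS_PER_YEAR * TRILLION   # over a trillion years

print(total_ops < 10 ** 56)     # True
print(total_ops / keyspace)     # the tiny fraction of the keyspace covered
```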
Nitty Gritty of Addresses
Ok so if a public key is just a POINT (!), what's this with addresses? Here's where the handy dandy cheatsheet gets handy. You will find that there are standard formats for things like transactions, blocks, signatures, etc. You can parse them in code because you know, for instance, that if the first 2 hex digits (the first byte) of an SEC string are '04', then that is the Uncompressed Format.
With some time and practice, you can start decoding these with ease (says Jimmy).
Know that there are 2 formats for SEC. The first 2 hex digits tell you whether it's compressed or uncompressed format. In the uncompressed format (prefix 04), the point is explicitly named by x and y. In the compressed format, only the x-coordinate is given, and the first 2 digits tell you which of the 2 mirror image y-coordinates to use: 02 if y is even, 03 if y is odd.
For all of you in ICO-land, SEC does NOT stand for Securities and Exchange Commission. It stands for Standards for Efficient Cryptography, as in secp256k1. The 256 refers to the number of bits in the prime of the field.
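The SEC formats can be sketched in a few lines of Python. The function names sec and parse_sec are hypothetical, and the example point is the secp256k1 Generator Point from the published curve parameters. Recovering y from x uses the fact that the secp256k1 prime leaves remainder 3 when divided by 4, which gives a shortcut for square roots.

```python
# secp256k1 prime and Generator Point coordinates (published constants).
P = 2**256 - 2**32 - 977
GX = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798
GY = 0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8

def sec(x, y, compressed=True):
    """Serialize a point: 02/03 + x (compressed) or 04 + x + y (uncompressed)."""
    x_bytes = x.to_bytes(32, "big")
    if compressed:
        prefix = b"\x02" if y % 2 == 0 else b"\x03"
        return prefix + x_bytes
    return b"\x04" + x_bytes + y.to_bytes(32, "big")

def parse_sec(data):
    """Turn SEC bytes back into an (x, y) point."""
    prefix = data[0]
    x = int.from_bytes(data[1:33], "big")
    if prefix == 4:
        return x, int.from_bytes(data[33:65], "big")
    if prefix not in (2, 3):
        raise ValueError("unknown SEC prefix")
    # Solve y^2 = x^3 + 7 for y; since P % 4 == 3, a square root of v
    # is v^((P + 1) / 4) mod P.
    y = pow((pow(x, 3, P) + 7) % P, (P + 1) // 4, P)
    if y % 2 != prefix % 2:   # prefix 02 wants the even y, 03 the odd y
        y = P - y
    return x, y

compressed = sec(GX, GY)
print(compressed.hex()[:2], len(compressed))                   # 02 33
print(len(sec(GX, GY, compressed=False)))                      # 65
print(parse_sec(compressed) == (GX, GY))                       # True
```

The compressed form is 33 bytes instead of 65.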
To actually calculate an address, you:
- Start with either the compressed or uncompressed SEC format
- SHA-256 it and then RIPEMD160 that hash (aka HASH160), which leaves you with 20 bytes
- Prepend the network prefix (00 for mainnet, 6F for testnet)
- Add a 32-bit checksum at the end: the first 4 bytes of the double-SHA256 of the prefix plus the HASH160
- Encode in Base58
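The steps above can be sketched with nothing but Python's hashlib. The function names are hypothetical. One caveat: hashlib only offers RIPEMD160 if the underlying OpenSSL build enables it, so hash160 below may raise ValueError on some systems. For that reason, the demo at the bottom starts from an already-computed HASH160: the widely published one for the genesis block's public key, which is not from the text above.

```python
import hashlib

BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def hash160(data):
    """SHA-256, then RIPEMD160 (needs an OpenSSL build with ripemd160 enabled)."""
    return hashlib.new("ripemd160", hashlib.sha256(data).digest()).digest()

def encode_base58(data):
    """Encode bytes in Base58; each leading zero byte becomes a '1'."""
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    num = int.from_bytes(data, "big")
    encoded = ""
    while num > 0:
        num, remainder = divmod(num, 58)
        encoded = BASE58_ALPHABET[remainder] + encoded
    return "1" * leading_zeros + encoded

def address_from_hash160(h160, testnet=False):
    """Prefix + HASH160 + 4-byte double-SHA256 checksum, encoded in Base58."""
    prefix = b"\x6f" if testnet else b"\x00"
    payload = prefix + h160
    checksum = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    return encode_base58(payload + checksum)

def address_from_sec(sec_bytes, testnet=False):
    """Full pipeline starting from a public key in SEC format."""
    return address_from_hash160(hash160(sec_bytes), testnet)

# HASH160 of the genesis block's public key (a well-known public value).
genesis_h160 = bytes.fromhex("62e907b15cbf27d5425399ebf6f0fb50ebb88f18")
print(address_from_hash160(genesis_h160))   # 1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa
print(address_from_hash160(genesis_h160, testnet=True))
```

The 00 prefix byte is why mainnet addresses of this type start with a 1.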
Yes all that! Before attending Jimmy's seminar, I knew Bitcoin used the SHA-256 hash algorithm but had no idea it also used RIPEMD160 and then DOUBLE hashed the checksum and THEN encoded in Base58. Reminder that this is all on top of the Discrete Log Problem mentioned earlier. Bless the paranoid crypto people. Respect.
But Wait There's More
Does your brain hurt yet? I just finished broadly summarizing Sessions 1 and 2 out of 8! As you might imagine, we take the building blocks from ground zero and keep going until you have the whole Bitcoin protocol. For some context, we covered Sessions 1 and 2 before lunch on Day 1. For the full mind-expanding experience, apply at www.programmingblockchain.com.
You get discounts for signing up early, and there are currently scheduled sessions for NY Blockchain Week, Toronto, Denver, Sao Paulo, and Athens. In a decentralized world, other locations will be considered; just email! Scholarships are available, and you can currently apply for one for the upcoming New York session.
Speaking of scholarships, I was one of the lucky recipients of a female developer scholarship for this seminar.
I am very thankful for the opportunity and wanted to give a shout out to sponsors John Pfeffer and Chaincode Labs. Some people like to *talk* about supporting female developers, and it's really refreshing when people and companies actually go ahead and do that with ACTION. Thank you!
Your generous contribution enabled 10 women to participate in this awesome experience: one is a new intern at Lightning Labs and another is only 19 years old! As for how they did - everyone was able to create and broadcast a Bitcoin transaction on the testnet from scratch. Whoever says women and female developers are not interested in blockchain and Bitcoin is just plain wrong. There were many more applications for the scholarships than there were spots. Here's to building an inclusive decentralized future.
The course gives you enough to be dangerous and to start building your own wallets and block explorers. While you may not know all the answers - and with a technology developing so quickly, it's uncertain anyone knows all the answers - it provides the knowledge and context to ask the right questions and dig in.
We as a group discussed what we'd like to do with our new knowledge. Several people, including myself, hope to contribute to open source. Many believe in the ethos of transparency, owning our own data, and disintermediating centralized sources of power and control. Some hope to use the technology for social impact and inclusion. TO THE MOON!
Big thanks to Jimmy Song for being our fearless leader and educator, and for allowing me to share some of his content with you.