Learnings from a stint in the cryptocurrency space
It was the early 2017s, and cryptocurrency had just come off a huge boost to it’s market value. I had just finished off my project and was deciding the next step in my future, and I figured why not catch a ride off this wave of curiosity in technology and markets? So when I got an offer from a friend wanting a high performance stock exchange backend I just dived into it. My first stint in the space was to work on porting a high performance backend for a cryptocurrency exchange. It was also my entry into the world of network security. The other stint, an intersection between using the Ethereum blockchain and games, was a project involving putting a number guessing game as a smart contract on the Ethereum blockchain.
Cryptography
One can’t really understand how crypto works without the base fundamentals of cryptography. Cryptography addresses the problem of how to pass secret data without a man in the middle peeking at it and deciphering the secret. With some supplements from the field of computer science, the eventual effect is that it can make the data trustworthy despite the incentives of multiple parties.
Math fundamentals
Cryptography is based on math, and one of the most important mathematical constructs is the modulo (remainder) function and it’s relation to primes. If you mod a number by a prime, there is mathematical proof that the range of values produced by doing the modulo on the input number produces a remainder that covers each number from 0 to module minus 1. This is extremely helpful because it gives us a property whereby many inputs map to one output, similar to a Hash function in computer science. With some mathematical trickery, we can make it extremely hard, if not impossible, to reverse the output to the input.
Other math theorems used as a basis are Co-Primes, Euclid’s Greatest Common Divisor(GCD) Algorithm, Extended Euclid’s Algorithm, Galios Finite Fields, Fermat’s Little Theorem, Euler’s Totient Function, Euler’s Theorem, and the Chinese Remainder Theorem.
Early Ciphers
The Caesar Cipher was one of the earliest ways to encrypt a message, which shifts each alphabet by 3 places and wraps around. Similar ciphers that involve substitution can be easily broken by statistical methods. Classic substitution ciphers were used until the early 20th century, where computing machines started to emerge.
Symmetric Key Ciphers
Using a secret key known to both parties, one encrypts a message using it and the other uses the same key to decrypt the ciphertext. This key is shared between 2 parties and can be a liability due to it’s shared nature.
A good starting point is the Feistel Cipher, which takes a plaintext block of length 2w bits and key K, and splits it into 2 halves L and R. The split is then taken through a number of rounds. For each round, either L or R is fed into an encryption function and XOR’ed with the other, and then swapped for the next round. For decryption, switch out the encryption function for it’s decryption function and apply the rounds in reverse order.
The DES (Data Encryption Standard) uses 16 rounds of the above Feistel Cipher, with each round using a 48-bit subkey from a 56-bit overall key (which is another algorithm I won’t go through here). Over time, it has become prone to various cryptanalysis attacks. Differential cryptanalysis and Linear cryptanalysis involves analyzing plaintext and abusing XOR on the plain/ciphertext in different ways.
The current standard for symmetric key algorithms is AES (Advanced Encryption Standard).
Subjects/Theories to look at: Claude Shannon’s Theory of Confusion and Diffusion, S-Box
Public Key Cryptography
For this we generate 2 secret keys such that one (private) key is used to encrypt the plaintext and the other (public) key decrypts the ciphertext. Common uses other than encryption include using the public key and ciphered text as part of verification for the sender of the message and non-repudiation (ie. the sender cannot deny that he didn’t send the message).
The oldest algorithm is called RSA (Rivest–Shamir–Adleman). Here’s some excerpts I wrote while going through learning on cryptography via textbook.
Other well-known public key algorithms:
- The Diffie-Hellman key exchange algorithm addresses the key exchange problem between two parties and is used in many web systems in the world today.
- Elliptic Curve Cryptography (ECC) is used in Bitcoin and many other cryptocurrency systems. You start with an equation y2 = x3 + ax + b, which shows up as a rotated ‘ohm’ on a graph. To get a corresponding output for a pair of points, you transform it using reflection about the axis and line segment intersections.
Computer Science
Merkle Trees
I won’t go through a full 101 on this, but what’s relevant are Merkle Trees. Every node that isn’t a leaf has a cryptographic hash of the data and it’s child nodes. This forms the basis of many blockchain applications, and gives the advantage that you can’t change a piece of data in the tree without changing a great number of those hashes.
Backend network infrastructure for crypto exchanges
Having been assigned to port an existing codebase to our infrastructure for use, I had an opportunity to take a look at how their server backends are made. It’s a microservice architecture, there’s one program that manages all the trades in the exchange called the matching engine. Then there’s one that exclusively reads data from the mysql databases, and one more that reads price data from kafka, likely made to prepare for read bottlenecks. The others are either reporting or alert programs, or frontend http or websocket APIs, and can be scaled horizontally.
Of note is how the infrastructure is geared for reliability, failover and low downtime. Each of the horizontal-scaled programs, depending on it’s needs, spawns up to 4 local processes via fork() and they communicate via linux process signals. There is no concept of multithreading, which is what as a gamedev I’ve come to expect. The main process that spawned 4 child processes then waits around and listens to signals from it’s children. For example. if a child process crashes, the main process receives a signal and respawns these child processes.
Deposits, Withdrawals and Network Security
Fortunately, many blockchains have an API for listening to funds being deposited into wallets. With an understanding of how public addresses and transactions work, it’s just a matter of reading up their various APIs and pinging them for new transactions once in a while.
Withdrawing funds out is a whole other risky issue entirely. For security reasons, the exchange generally has hot and cold wallets, and their total funds are usually split amongst many of them. Our security specialist creates a key-pairs using a master key and a program. These location of these keys and the program are on an isolated laptop, and the laptop is only connected to the internet in the event the exchange needs to approve withdrawals. To approve withdrawals, the keys on the laptop are used to sign transactions sent to it from a local computer. This is called an air-gap in network security; the computer signing the transactions is never connected to the internet.
Zero Knowledge Proofs and a Number Guessing game
Inspired by games of putting battleships on the blockchain, I thought it would be an enlightening experience to make a small game of my own and put it on the Ethereum blockchain.
Language: Solidity
Tool for translating programs into smart contracts: Zokrates
Smart Contract Testing framework: Truffle
Sample Ethereum Blockchain generator: Ganache
Concepts studied: Making NFTs
The results are at https://github.com/ddengster/zk_numberguesser. My smart contract basically takes in a number, and reports if the number is greater or smaller or if it matches, all the while keeping the hashed number on the blockchain. This evaluation is done using Zokrates; it will spit out smart contract code for verification, and that smart contract code is uploaded to the Ethereum blockchain. A guesser sends his number to the backend server, then the backend server runs Zokrates and sends back a proof of the computation. This proof is uploaded to the verification smart contract along with the number, which gives us the result.
Conclusions
Coming from the world of game development, these experiences include a pioneering field, with lots of free reign with some projects and a codebase made by experienced people to learn from. I got a number of leads to many different computer science fields of interest, and it expanded my horizons quite a bit by exposing me to real world markets and economies.