Before Ethereum and smart contracts, high transaction fees were mostly a temporary phenomenon. Demand for Bitcoin transactions was low enough that, most of the time, transaction fees stayed under a dollar. Smart contract blockchains enable many new use cases beyond simple payments, such as ETH staking, liquidity farming and NFTs, so demand for block space and transaction fees are now high more often than not.
Over the last few years there has been a lot of research into scaling blockchains, and we now have a better picture of the options blockchains have to increase their processing capacity and combat high transaction fees, along with the trade-offs involved.
This article will go through most of these options, one by one.
Scaling describes the capacity of a blockchain to facilitate more transactions and is usually measured in transactions per second (TPS). The higher the TPS the less competition there is for block space and the lower the transaction fees.
Blockchain TPS is closely tied to how decentralized the blockchain is. Blockchains that require higher node specs can process more transactions, but the goal of a blockchain is to be verifiable and censorship resistant, and requiring expensive or specialized hardware means that most people won't be able to verify what's going on. In essence, blockchains can only be decentralized when they are limited by what can run on consumer hardware.
To validate the chain, nodes need time to both process and propagate transactions in between blocks. If they can process the transactions but can't propagate them fast enough, they will still fall behind nodes with higher specs.
So for a 12-second block time, only about 2 seconds can be allocated to processing transactions, which means ~400 transactions or ~16.6 TPS (about the same as Ethereum).
Note: These calculations apply to blockchains with a very large number of nodes (for example, Ethereum, which has ~600k validators). Blockchains with fewer nodes can have block times of less than 10 seconds and their blocks would still propagate effectively.
A good place to start understanding scalability is the blockchain trilemma, which Vitalik Buterin introduced to describe the trade-offs of different scalability ideas. The trilemma states that out of 1) decentralization (i.e. node count), 2) scalability (i.e. TPS) and 3) security (i.e. resistance against malicious parties), a single blockchain can only have two.
As we've seen above, a single blockchain can only do about 15 to 20 TPS. That leaves us two options: 1) somehow improve the efficiency of a single blockchain or 2) use multiple blockchains and combine their throughput to achieve higher TPS.
This leaves us with a handful of concrete ways to scale blockchains: sharding, consensus optimizations, reduced replication, supernodes and layer 2 chains.
Let's investigate them one by one in detail.
One common approach to scaling is splitting the work so that more work can be done in total. This approach is called sharding. Sharding had a central place in the Ethereum scaling roadmap as one of the two ways the chain was going to scale beyond the ~15 TPS limit of a single home computer, but it was later dropped in favor of layer 2 scaling. Today sharding has a partial implementation in NEAR Protocol and is on the roadmap of other L1s such as Polkadot and Zilliqa.
For this technique to work, nodes are divided into smaller groups called committees. Each committee is responsible for processing only a portion of the transactions, which means that each committee is expected to process that ~15 TPS by itself. Roughly speaking, if a chain has 10 committees, they will be able to process 150 TPS. Note that there are limits to what can be done, as these groups need to communicate with each other, and that requires extra work and extra network traffic.
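The rough arithmetic above can be sketched in a few lines. This is only a back-of-the-envelope model: the `cross_shard_overhead` parameter is a hypothetical fraction of each committee's capacity lost to talking to other committees, and the 20% value is an assumption chosen for illustration, not a measured figure.

```python
def sharded_tps(committees: int, tps_per_committee: float = 15.0,
                cross_shard_overhead: float = 0.0) -> float:
    """Estimate the total throughput of a sharded chain.

    cross_shard_overhead is a hypothetical fraction (0 <= x < 1) of each
    committee's capacity spent on communicating with other committees.
    """
    if not 0.0 <= cross_shard_overhead < 1.0:
        raise ValueError("cross_shard_overhead must be in [0, 1)")
    return committees * tps_per_committee * (1.0 - cross_shard_overhead)


# Ideal case from the text: 10 committees at ~15 TPS each.
print(sharded_tps(10))
# Same chain with an assumed 20% communication overhead.
print(sharded_tps(10, cross_shard_overhead=0.2))
```

The second call shows why the 150 TPS figure is an upper bound: any capacity spent on cross-committee communication comes straight out of the total.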
A trade-off sharding has to make is security. If transactions are processed by fewer validators, there is a higher chance that these nodes act dishonestly. To mitigate some of these problems, committees provide two types of evidence to the rest of the network with every block. The first is evidence that the shard data (i.e. the blockchain data the shard is responsible for) is stored by all nodes in the shard, and the second is a vote showing that the committee came to an agreement about the transactions in that block.
Moreover, since committee members are picked at random every few blocks, it's improbable that they manage to collude, and with data availability proofs other committees know that the shard data can be restored even if most committee members go offline. That means that a sharded chain can "go around" the blockchain trilemma and increase its TPS without giving up its security or decentralization.
Sharding splits transaction processing into committees picked at random. Each committee shares its vote and a proof that its nodes have the blockchain data they are responsible for.
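To make the random committee selection more concrete, here is a minimal sketch of how nodes could agree on a fresh assignment every few blocks. The function name, the validator names and the `epoch_seed` values are all hypothetical. A real chain would derive the seed from on-chain randomness and would not rely on Python's `random` module, which is not cryptographically secure.

```python
import hashlib
import random


def assign_committees(validators, num_committees, epoch_seed: bytes):
    """Shuffle validators with a shared seed and deal them into committees."""
    # Every node derives the same RNG state from the same seed, so all
    # nodes compute the same assignment without extra communication.
    digest = hashlib.sha256(epoch_seed).digest()
    rng = random.Random(int.from_bytes(digest, "big"))
    shuffled = list(validators)
    rng.shuffle(shuffled)
    # Deal the shuffled validators round-robin into the committees.
    return [shuffled[i::num_committees] for i in range(num_committees)]


validators = [f"validator-{i}" for i in range(12)]
epoch_1 = assign_committees(validators, 3, b"epoch-1")
epoch_2 = assign_committees(validators, 3, b"epoch-2")
print(epoch_1)
print(epoch_2)
```

Because membership changes with every new seed, a group of validators that wants to collude cannot know in advance whether they will end up in the same committee.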
This section discusses different methods that blockchains can use to improve transaction propagation. These optimizations don't have a major impact on TPS, as the delay that comes from propagating transactions is small compared to processing them, and most benefits of different consensus mechanisms come in the form of quicker finality and faster block confirmation times.
One alternative consensus mechanism is Avalanche. Its premise is that nodes can know in advance whether transactions are valid by randomly asking other nodes. That way nodes can skip waiting for transactions to be propagated around the network and can start processing more transactions before the next block.
Another example of an alternative consensus mechanism is Algorand. The Algorand consensus uses a leader selection mechanism to reduce transaction propagation time. Every block, the network elects a leader based on a random calculation that every node performs, and knowing in advance which node will create the next block makes the process a lot faster. Similarly to Avalanche, Algorand reduces transaction propagation delays, which again gives the protocol more time to process transactions.
Consensus optimizations can speed up transaction propagation. Examples are Avalanche's transaction sampling and Algorand's random leader selection.
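The idea of "randomly asking other nodes" can be illustrated with a heavily simplified sketch of repeated random sampling. This is not the Avalanche protocol itself: the parameter names `k`, `alpha` and `beta` and their values are assumptions for illustration, and the peers here hold fixed opinions, whereas in a real network every peer runs the same loop and updates its own preference.

```python
import random


def sample_decision(my_preference, peer_preferences, k=5, alpha=4, beta=3,
                    rng=None, max_rounds=1000):
    """Decide whether a transaction is accepted by polling random peers.

    k: peers polled per round
    alpha: matching votes needed for a round to count
    beta: consecutive successful rounds needed to decide
    Returns True/False once decided, or None if max_rounds is exhausted.
    """
    rng = rng or random.Random()
    preference = my_preference
    streak = 0
    for _ in range(max_rounds):
        votes = rng.sample(peer_preferences, k)
        accept_votes = sum(votes)
        if accept_votes >= alpha:
            winner = True
        elif k - accept_votes >= alpha:
            winner = False
        else:
            streak = 0  # inconclusive round resets the streak
            continue
        if winner == preference:
            streak += 1
        else:
            preference, streak = winner, 1  # switch to the majority view
        if streak >= beta:
            return preference
    return None


# A node that starts out rejecting, surrounded by peers that all accept.
peers = [True] * 20
print(sample_decision(False, peers, rng=random.Random(1)))
```

The node only ever talks to `k` peers per round, so the cost of reaching a decision does not grow with the total number of nodes in the network.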
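As a rough illustration of a "random calculation that every node performs", the sketch below has every node hash a shared seed together with each node's identifier and pick the lowest result. This is an assumption-laden stand-in: the real Algorand protocol uses verifiable random functions rather than a plain hash, and the node names and `round_seed` value are hypothetical.

```python
import hashlib


def elect_leader(node_ids, round_seed: bytes):
    """Pick the node whose hash(seed || id) is lowest.

    Every node can run this locally and arrive at the same leader,
    as long as they share the seed and the list of node ids.
    """
    def ticket(node_id: str) -> bytes:
        return hashlib.sha256(round_seed + node_id.encode()).digest()

    return min(node_ids, key=ticket)


nodes = ["node-a", "node-b", "node-c", "node-d"]
leader = elect_leader(nodes, b"round-42")
print(leader)
```

Since the result depends only on shared inputs, no extra messages are needed to agree on who proposes the next block.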
Most blockchains have every node process every transaction, so the more transactions there are, the more processing the blockchain has to perform. An alternative approach is to require only a subset of the validators to process a transaction for it to be deemed confirmed. With this approach, if TPS increases, the system can add more nodes to handle the inflow.
For example, suppose the network has 100 nodes, each transaction is processed by 10 of them and there are 1000 transactions per block. Each node then needs to process 100 transactions. In contrast, a conventional blockchain would need all 1000 transactions to be validated by all the nodes. If the number of transactions increases to 2000, a system with reduced replication can double its node count so that each node again processes 100 transactions per block.
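The per-node workload under reduced replication follows from a simple ratio: transactions times replication factor, divided by the number of nodes. The sketch below works through it, assuming a hypothetical network of 100 nodes (the node count is an assumption chosen so that the figures come out to 100 transactions per node).

```python
import math


def per_node_load(transactions: int, replication: int, nodes: int) -> float:
    """Average number of transactions each node processes per block."""
    return transactions * replication / nodes


def nodes_needed(transactions: int, replication: int, target_load: int) -> int:
    """Nodes required to keep each node at or below target_load."""
    return math.ceil(transactions * replication / target_load)


# Reduced replication: each transaction is processed by 10 of 100 nodes.
print(per_node_load(1000, replication=10, nodes=100))   # 100.0
# Conventional chain: every one of the 100 nodes processes everything.
print(per_node_load(1000, replication=100, nodes=100))  # 1000.0
# Nodes needed to keep the load at 100 when traffic doubles.
print(nodes_needed(2000, replication=10, target_load=100))  # 200
```

In this model the load per node stays flat as long as the node count grows in step with the transaction count, which is what lets the system scale by adding hardware.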
In contrast to sharding, committees only store the data they are responsible for and don't have to merge with other shards at some point in the future. This makes storing data very cheap but comes with its own issues. Since computation is cheap, some blockchains with reduced replication charge for data storage on a per-block basis. Moreover, as only a constant number of validators check the validity of each transaction, there's a higher chance that some validators will collude at some point. A way to remedy this is to shuffle committees every few blocks to make it hard for committee members to collude.
A notable example of this approach is probably the Internet Computer.
Reduced replication processes transactions in fixed-size committees of randomly selected members. It trades security for scalability.
Supernodes are a straightforward approach to increasing TPS by raising the nodes' hardware requirements. At first the idea of increasing hardware requirements was not popular, especially in the Bitcoin ecosystem, where censorship resistance is a core feature of the chain. In the last couple of years we have been seeing more layer 1s willing to make this sacrifice, making the case that nodes can be expensive to run as long as they are cheap to verify with light clients.
Another side effect of blockchains with hardware-heavy nodes is that they start to hit different limits. At some point their processing power is enough to process all incoming transactions, but they require more network bandwidth to share these transactions around the network and reach consensus.
The most popular implementation of supernodes is probably Solana, which became widely known back in 2021 as a place for launching and trading NFTs with low transaction fees. Solana nodes have 12+ CPU cores, 128 GB of RAM and a few TB of storage for saving the transactions, and require at least a 300 Mbps symmetric internet connection, with 1 Gbps preferred.
A blockchain can process more transactions by using more powerful computers. It will also require better networking infrastructure for its nodes and more disk space to store the resulting volume of cheaper transactions.
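A quick estimate shows how bandwidth becomes the bottleneck as TPS grows. The model below is deliberately naive and every number in it is an assumption: a transaction size of 250 bytes, and a node that forwards each transaction in full to 8 peers. Real networks use more efficient propagation schemes, so treat the output as an order-of-magnitude illustration only.

```python
def required_mbps(tps: float, tx_bytes: int = 250, peers: int = 8) -> float:
    """Outbound bandwidth if a node forwards every transaction to `peers` peers.

    tx_bytes and peers are hypothetical defaults, not measured values.
    """
    bits_per_second = tps * tx_bytes * 8 * peers
    return bits_per_second / 1_000_000


for tps in (15, 1_000, 50_000):
    print(f"{tps} TPS -> {required_mbps(tps):.2f} Mbps")
```

Under these assumptions a ~15 TPS chain fits comfortably on a home connection, while a chain processing tens of thousands of transactions per second needs a link in the hundreds of Mbps.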
Layer 2 chains fall into two categories: layer 2 rollups and layer 2 validiums. Both systems process transactions off-chain and checkpoint on-chain with a proof that the state of the layer 2 is valid. The difference between the two is that only layer 2 rollups store all of their transaction data on the layer 1.
There are also two categories of these proofs: optimistic and zero-knowledge. Optimistic proofs have a grace period of 1 or 2 weeks during which they can be evaluated. If the proof is wrong, anyone can post a challenge on the layer 1 to validate the proof, and the layer 1 acts as the final arbiter of truth. Zero-knowledge proofs contain all the information needed about the state of the L2 and don't need a grace period. At the time of writing, zero-knowledge proofs are still experimental and, although cheap to verify, they are expensive to produce.
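The grace period of an optimistic proof can be sketched as a small state machine. Everything here is a simplified assumption: the `Checkpoint` class, the one-week `CHALLENGE_PERIOD` and the string state roots are hypothetical, and comparing against `recomputed_root` stands in for the layer 1 re-checking the disputed state.

```python
from dataclasses import dataclass

CHALLENGE_PERIOD = 7 * 24 * 60 * 60  # assumed one-week grace period, in seconds


@dataclass
class Checkpoint:
    state_root: str          # claimed L2 state posted to the L1
    posted_at: int           # L1 timestamp of the checkpoint
    challenged: bool = False

    def challenge(self, now: int, recomputed_root: str) -> bool:
        """Dispute the checkpoint; succeeds only inside the grace period
        and only if the recomputed state differs from the claimed one."""
        in_window = now < self.posted_at + CHALLENGE_PERIOD
        if in_window and recomputed_root != self.state_root:
            self.challenged = True
        return self.challenged

    def is_final(self, now: int) -> bool:
        """Final once the grace period has passed without a successful challenge."""
        return not self.challenged and now >= self.posted_at + CHALLENGE_PERIOD


honest = Checkpoint(state_root="0xabc", posted_at=0)
print(honest.is_final(now=3600))              # still inside the grace period
print(honest.is_final(now=CHALLENGE_PERIOD))  # grace period over, unchallenged

faulty = Checkpoint(state_root="0xbad", posted_at=0)
faulty.challenge(now=3600, recomputed_root="0xabc")
print(faulty.is_final(now=CHALLENGE_PERIOD))  # successfully challenged
```

The sketch also shows the cost of the optimistic approach: even an honest checkpoint has to wait out the full grace period before it counts as final.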
As a side note, layer 2 rollup chains only store a compressed version of the transaction data on the parent chain to save space. These transaction bundles, or rollups as they are called, only store the minimum information necessary to check that their respective transactions are valid.
Rollups are able to inherit many security properties from their respective L1, but they have to make one sacrifice: the L1's ability to stay online. Since most L2 chains don't have as many validators, there is a higher chance that their validators censor transactions or go offline. In that case, L2s face a longer delay until new nodes pick up the transactions from the L1 to reconstruct the L2. To mitigate this, L2 nodes provide a storage proof with each new block. These proofs include random samples of the blockchain state that are much smaller than the full L2 state, and they make the L2 state more accessible, since a node that deletes or hides its state would be caught right away.
A blockchain can scale by processing transactions off-chain and requiring proofs that the off-chain state is valid and that the data is stored correctly.
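Why small random samples are enough can be seen from a short probability calculation. Assuming each sample independently checks a uniformly random piece of the state, a node that is missing a fraction f of the data passes k samples with probability (1 - f)^k. The function name and the 25% figure below are assumptions for illustration.

```python
def detection_probability(missing_fraction: float, samples: int) -> float:
    """Chance that at least one of `samples` independent random checks
    lands on a piece of state the node has deleted or is hiding."""
    return 1.0 - (1.0 - missing_fraction) ** samples


# A node hiding a quarter of its state, checked with more and more samples.
for k in (1, 10, 30):
    print(k, round(detection_probability(0.25, k), 4))
```

The detection chance climbs toward certainty after a few dozen samples, which is why a proof built from random samples can stay much smaller than the state it vouches for.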
In this article we went through various options blockchains can employ to scale.
Sharding splits transaction processing into randomly selected committees. Consensus optimizations reduce node communication times. Reduced replication splits both transaction processing and storage into committees. Supernodes equip nodes with beefier hardware, and layer 2 chains allow processing transactions off-chain.
All of these methods can help a blockchain scale, but most of them pair gains in transaction capacity with trade-offs, usually in security or decentralization.