A Tale of Gas and Approximation (part 2)

An in-depth analysis of Uniswap V2 gas consumption on Starknet

ResearchSmart Contract

Dec 22nd 2023

Share on

A Tale of Gas and Approximation (part 2)

Chapter 2 - May the Gas be with you

Introduction

Last time, we studied how “transfers” and “approves” are priced, and we’ve seen that their cost was almost entirely dependent of the cost of updating storage slots which is quite expensive. We’ve also seen that these two operations on the ETH ERC20 had small advantages compare to the other ERC20 due to the cost amortization performed by Starknet. Moreover, we derived a formal model of an AVNU swap that will help

In this chapter, we will use the mathematical models and the intuition we built so far to analyse Uniswap V2 implementation in the context of MySwap, JediSwap and SithSwap, deriving for each one of them an approximation function that will allow us to predict the gas used when we swap through them.

Approximation of AMM

Now that we understand how to price ERC20 calls, we need to figure out how to price the cost of the pools. AVNU supports AMM with the following implementations:

Uniswap V2
Stable Curve
Uniswap V3
Uniswap V4

Without any surprises, we expect that the Uniswap V2 implementation should be quite easy to price because we expect something of the order $\Theta(C)$ , with $C$ being a constant. On the other hand, the gas cost of Uniswap V3-4 will probably be a function of the number of ticks crossed, while the Stable Curve implementation should also be of the order $O(C)$ , but it will depend on the algorithm used to find the roots of the pricing curve, which is a third-degree polynomial.

To figure out the characteristic formula of each AMM, we will use the following methodology. First, let's rewrite our pricing formula as follows:

\begin{equation} \begin{split} Gas^{avnu}_{swap}((Pool_i)_{i \in \N}) &\leq (2_{ERC20}+1_{Nonce}+2^{ERC20}_{User})C \\ &\quad + \sum_{i \in \N}C(1_{ERC20}+2^{ERC20}_{Pool_i}+f_{storage}(pool_i)+f_{computation}(pool_i)) \end{split} \end{equation}

Which we can rewrite as follows to make it less verbose.

\begin{equation} \begin{split} Gas_{AVNU}((Pool_i)_{i \in \N}) &\leq Gas_{User} + Gas_{Transfers}((Pool_i)_{i \in \N}) \\ &\quad +\sum_{i \in \N}Cf_{storage}(pool_i)+f_{computation}(pool_i)) \end{split} \end{equation}

What we can do is simulate direct swaps for each given AMM, subtract from the result the gas cost induced by the first two terms to get the gas cost induced by the AMM. Then, we can use a combination of intuition to guess the form of the characteristic function along with some curve fitting based on the empirical data to derive our model.

Therefore, here is the precise methodology we will use:

Methodology

Snapshot the state of all the pools at block N for a given AMM.
Generate 500 direct swaps from X USDC to T where $X \in [0.1,20000]$ and $T \in Tokens$ are sampled randomly for each swap. We denote each such swap with a tuple $(X_i,T_i)$ .
For each swap, generate a direct route. We denote each route as a tuple $(X_i, T_i, R_i)$ where $R_i$ is a valid route swapping amount $X_i$ from USDC to $T_i$ .
Simulate each route through AVNU, i.e a transaction composed of an initial approve on USDC for $X_i$ tokens followed by a call to AVNU contract with the route generated in 3. Repeat step 4. for every tuple $(X_i,T_i,R_i),i\in \{1..500\}$ and record the gas used minus the gas cost induced by the update of the nonce, the transaction fee and the different ERC20 transfers.

The reader may wonder why USDC and not another base token. The reason is actually quite simple: we need to be able to simulate with a large amount of money to induce potentially important changes in the pool (typically for concentrated liquidity). Therefore, we need a wallet with enough money to do that, and we found that USDC was the best token for that purpose. Additionally, we always run a few tests with other tokens to ensure that the implementation is token independent and symmetric in the swap direction, which is the case for the implementation we are considering.

Once we have enough data, we will try to derive functions that characterize them and then measure the accuracy of the prediction made by these models

An easy start

As a warm-up and to get accustomed to the methodology, we start with the MySwap - Uniswap V2 implementation. This is an interesting case already because we don't have access to the code, so we have to treat it as a black box. However, we know that the implementation has the following property.

Property

All the pools are managed by a single contract, which means that all the pool balances are updated for the same address.

Let's start by plotting the gas used for different pairs and different amounts. On the left-hand side graph, each point represents the gas used returned by the simulation as a function of the amount. As expected, even though there are 3 distinct values, they are close enough to assume that the gas cost is constant and independent of the amount in. Additionally, we observe that we could fit 2 distinct storage slot updates with that amount of gas since we would get 1 more contract updated + 2 storage slots changed, resulting in an overall cost of 3737 gas used. Intuitively, this also makes sense because, as we said, all the pools are managed by a single contract, and for each pool, we must update the reserves when we perform a swap, so it is quite reasonable to assume that such a swap induces 2 reserves being updated.

Assuming that this is the case, we plot on the right the gas used if we subtract these storage updates, thus concluding that the computational part is roughly equal to 400 gas.

In conclusion, we would have

\begin{equation} Gas_{MySwap}(pool) = 1249 \cdot (1_{MySwap}+2^{MySwap}_{Reserves})+400 \end{equation}

To check how our model performs, we can generate a dataset where we allow any number of hops consisting only of MySwap (potentially including multiple times the same pool) and plot the simulated gas against our prediction. Note that the blue line marks the points where the prediction is equal to the simulation, and points of different colors indicate a different number of hops.

We see that as we increase the number of hops, our approximation becomes less and less accurate because we are overestimating the actual cost. However, this is not really an issue because in the worst case, we are only off by approximately a storage slot update. Furthermore, we could reduce the error with multi-hops by decreasing the computational coefficient. However, this initial approximation is quite satisfactory, and we will leave it as it is.

May the Gas be with you

The analysis of 10kSwap is exactly the same as that of JediSwap, so we won’t include it.

We now consider a slightly different implementation, the one from JediSwap. This implementation differs from that of MySwap as follows.

Properties

Each pool has its own contracts.
The number of storage slots updated varies depending on the position of the transaction in the block (more on that below).

First, note that since each pool has its own contract, they each have their own balance in the ERC20 involved. In addition, in the gas formula, each distinct pool used will increase the number of contracts modified, so we expect it to cost more than the MySwap implementation.

Let’s plot two sets of data! On the left-hand side, we see that again the gas used is not dependent on the amount in, but we see that there are two quite distinct values. The reason for this comes from the second property we mentioned. More specifically, JediSwap keeps track of the cumulative price for each reserve along with the block timestamp, which are values that are modified at most once per block. In our model, this translates to a swap where 3 additional storage slots are modified, explaining the variance observed on the graph.

Let’s ignore that detail for now and figure out the static cost. We see that the overall cost is roughly 4.2k gas, which would mean that we could fit 2 storage slot updates (remember we need to add 1 because we modify a new contract), the remaining being the computational cost. Intuitively, we know that again we need to update both reserves, so let’s assume that we do, and the remaining cost is computational only. On the right-hand side, we plot the gas used if we were to subtract the gas resulting from our hypothesis.

We can see from the bottom line that this is quite accurate, yielding a computational cost of approximately 630, which is quite close to what we had for MySwap. Now the question remains as to how to deal with the variance. First, note that we cannot anticipate the position of the pool in the block, so we essentially have three approaches. The first one is to always assume the worst case, so that we always end up with an upper bound. However, the more hops we have, the worse our approximation would get. The other two approaches are based on the observation that the probability of a pool being used for the first time should decrease rapidly the further we get from when the last block was mined. Therefore, we could either always use the lower bound or use a value in-between.

We decided to perform a least-square fit of a constant function to our data to find the value 1533, which is a bit more than a single storage slot update. Intuitively, this value makes sense because, as we mentioned, we expect to see more and more pools without the cumulative price updated the further time passes between two blocks.

Consequently, we get the following formula.

\begin{equation} Gas_{JediSwap}(pool) = 1249 \cdot (1_{MySwap}+2^{MySwap}_{Reserves})+1533 \end{equation}

Let's evaluate our model to see how it performs when we increase the number of hops.

Are Jedi Knights Cheaper than Siths?

Now that we have characterized our proud Jedi, we can apply the same approach to SithSwap and see if we can approximate both the stable and volatile pool using a similar model. We know that the volatile pool implementation is roughly equivalent to that of JediSwap (not the same code but same properties, including the block-dependent storage slot update). On the other hand, the stable pool relies on a different curve, and the swap relies on Newton’s Method to find roots or a third-degree polynomial with a bounded number of iterations. Based on empirical data, we have observed that the convergence of the algorithm is generally reached for a small number of iterations, but it could potentially reach the upper bound of 80 iterations.

Additionally, SithSwap manages the LP fees differently than other AMMs in the sense that they are directly transferred to the owner of the pool at the end of each swap. Consequently, we expect each swap to cost more than for the other AMMs as we need to count the storage slot update of the balance of the pool owner.

Let’s plot our data for both pool types. Now, this is quite interesting! First, we see that we don’t have 2 distinct gas.

Let's start with the volatile pools. We see the same type of pattern that we've seen with JediSwap but with 3 distinct values instead of 2. The reason is that, on top of having the same cumulative price features, they also store observations every 30 minutes or so. Observations are structures composed of three fields that take the equivalent of 3 Felts, and the rest is the computational cost.

Regarding the stable pools, we now know why there are 3 distinct values around which the gas value is found, but there is something more interesting going on. Indeed, we can observe a lot of jitter around the bottom value, which seems to disappear for the other two gas regions. The explanation is found again by looking at the original gas formula and, more specifically, the term pricing the computation. Computation is priced as the maximum between different quantities, and this is exactly what happens here - for the bottom value - the operations done by Newton's Method are dominating and thus picked by the maximum function, while for the two other scenarios, the computation induced by the rest of the logic dominates the root finding algorithm, so the max becomes a constant value. This comforts us in the hypothesis we made that the cost of this algorithm could be considered as constant.

With all that in mind, we use the same approach as in the preceding two sections to derive an approximation on the AMM, and you can find in the two graphs below the comparison between simulation and prediction. Note also the gas scale compared to that of JediSwap or MySwap, which is almost twice as big.

Conclusion

Time to catch our breath ! It was not a walk in the park, but we see that the intuition we've built in the first chapter has been worth it. It is also interesting to observe that even a rough approximation gives already very good result. Obviously the journey does not end here, and we will see that everything get a bit more difficult with more complex implementation in the next chapter.

Romain Jufer

Wizard at AVNU

A Tale of Gas and Approximation (part 1)

Embark on a journey exploring Gas on Starknet to build the AVNU gas approximation model

Dec 22nd 2023

ResearchSmart Contract

A Tale of Gas and Approximation (part 3)

The final chapter of our gas exploration ending with the analysis of Uniswap V3 and a few words on generalized gas functions

Dec 22nd 2023

ResearchSmart Contract

Building the Exchange Infrastructure for Starknet

Delve into AVNU's milestones: major gas fee savings, superior trading innovations and more. Read the full story!

Oct 1st 2023

Release note

Our mission is to create the Liquidity Infrastructure that empowers traders and dApps with best execution, competitive pricing, and access to a vast range of assets on Layer 2s. Our commitment is to deliver an industry-leading user experience and reshape the swap market landscape.

FAQ Documentation Blog Status

About us Terms & Conditions Privacy policy Contact

2024 -

A Tale of Gas and Approximation (part 2)

Chapter 2 - May the Gas be with you

Introduction

Approximation of AMM

An easy start

May the Gas be with you

Are Jedi Knights Cheaper than Siths?

Conclusion

Read more