The most bandwidth-efficient way of obtaining historical gasPrice's per tx

aliatiia · July 6, 2020, 5:33pm

I would like to obtain all gasPrice's used by all transactions in all blocks. Is there a more bandwidth-efficient way than the naive approach of:

for each block B since genesis: # hitting infura to get next block
     for each tx in B: # hitting infura for next tx
         extract tx.gasPrice

Thank you.
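One way to cut the per-transaction round trips in the loop above: the standard `eth_getBlockByNumber` JSON-RPC method takes a second boolean parameter, and when it is `true` the block comes back with full transaction objects (including `gasPrice`) rather than just hashes. The sketch below builds a JSON-RPC batch for a block range and parses the response. The helper names `build_batch` and `extract_gas_prices` are made up for illustration. Whether a given provider accepts batch requests, and what batch size it allows, is an assumption to check against the provider's documentation.

```python
def build_batch(start_block: int, end_block: int) -> list:
    """Build one JSON-RPC batch requesting full blocks (inclusive range)."""
    return [
        {
            "jsonrpc": "2.0",
            "id": n,
            "method": "eth_getBlockByNumber",
            # Second param True => full transaction objects, not only hashes.
            "params": [hex(n), True],
        }
        for n in range(start_block, end_block + 1)
    ]


def extract_gas_prices(batch_response: list) -> list:
    """Return (block_number, tx_hash, gas_price_wei) for every tx in the batch."""
    rows = []
    for item in batch_response:
        block = item.get("result")
        if not block:  # missing block or error entry: skip it
            continue
        for tx in block["transactions"]:
            rows.append(
                (int(block["number"], 16), tx["hash"], int(tx["gasPrice"], 16))
            )
    return rows
```

The payload from `build_batch` would be sent as the JSON body of a single HTTP POST to the provider endpoint, which means one request per block range instead of one per transaction. Each response still carries every transaction field, so this reduces the request count more than the bytes transferred.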

Leiya_Kenney · July 6, 2020, 7:01pm

Hi @aliatiia - welcome to the Infura community!

We’re looking into this, but in the meantime, what are you trying to achieve with this data?

aliatiia · July 6, 2020, 7:35pm

Thanks. I am doing some EIP-1559-related analysis and would like to backtest a simulation that determines whether a hypothetical tx with a certain gasPrice would have been included in a block with high probability, given the gasPrices of the other real txs in that block.
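To make the shape of such a backtest concrete, here is a deliberately simplified sketch. It is not the poster's actual method. It assumes a hypothetical tx "would have been included" in a block whenever its gasPrice is at least the lowest gasPrice among the real txs in that block, and it ignores block gas limits and transactions a miner includes regardless of price. The function name `inclusion_rate` is made up for illustration.

```python
from typing import Dict, List


def inclusion_rate(hypothetical_price: int, blocks: Dict[int, List[int]]) -> float:
    """Fraction of non-empty blocks whose cheapest real tx paid no more than
    hypothetical_price (simplifying assumption: that is enough for inclusion)."""
    considered = 0
    included = 0
    for prices in blocks.values():
        if not prices:  # an empty block gives no price signal: skip it
            continue
        considered += 1
        if hypothetical_price >= min(prices):
            included += 1
    return included / considered if considered else 0.0
```

Because the criterion depends on the minimum price in each block, every transaction's gasPrice is needed as input.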

Leiya_Kenney · July 6, 2020, 10:18pm

It doesn’t sound like there’s any metadata that would allow you to compute this more efficiently. However, you may be able to sample gas prices over block ranges. Individual blocks can be outliers, but the gas market itself doesn’t change dramatically from block to block, so you could compute a rough average gas price by taking all the transactions from every 100th block or something similar, if that would work for your purposes.
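A minimal sketch of the sampling idea, assuming a step of 100 blocks as in the reply above. The helper names are hypothetical, and fetching each sampled block's gas prices is left to whichever client is in use.

```python
import statistics
from typing import Iterable, List


def sampled_block_numbers(start: int, end: int, step: int = 100) -> range:
    """Block numbers to fetch: every `step`-th block in [start, end]."""
    return range(start, end + 1, step)


def rough_average_gas_price(sampled_prices: Iterable[List[int]]) -> float:
    """Mean gasPrice over all transactions of the sampled blocks."""
    flat = [price for block_prices in sampled_prices for price in block_prices]
    return statistics.mean(flat)
```

This gives an estimate of the price level only. It does not preserve the per-block distribution of prices.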

aliatiia · July 7, 2020, 1:13am

Unfortunately, sampling wouldn’t work because it would skew the analysis; it has to be all transactions.

It is possible to obtain this data from Google BigQuery, but I ran out of quota pretty quickly. This is the query, in case it’s useful to anyone:

SELECT block_number, gas_price FROM `bigquery-public-data.crypto_ethereum.transactions` ORDER BY block_number

aliatiia · July 10, 2020, 7:04pm

Update: transactions can be exported with Ethereum-ETL using this command:

ethereumetl export_blocks_and_transactions --start-block STARTBLOCK --end-block ENDBLOCK --provider-uri https://mainnet.infura.io/v3/[PROJECT_ID] --blocks-output blocks.csv --transactions-output transactions.csv

You can remove --blocks-output blocks.csv if you only want transactions.

transactions.csv has these columns: hash,nonce,block_hash,block_number,transaction_index,from_address,to_address,value,gas,gas_price,input,block_timestamp.
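Given the column list above, the fields of interest can be pulled out of `transactions.csv` with the standard `csv` module. This is a minimal sketch; the function name `iter_gas_prices` is made up for illustration, and it assumes `gas_price` and `block_number` are written as decimal integers.

```python
import csv
from typing import Iterable, Iterator, Tuple


def iter_gas_prices(lines: Iterable[str]) -> Iterator[Tuple[str, int, int]]:
    """Yield (hash, block_number, gas_price) for each row of transactions.csv.

    `lines` can be an open file or any iterable of CSV lines with a header row.
    """
    for row in csv.DictReader(lines):
        # Assumption: both numeric columns are decimal integers.
        yield row["hash"], int(row["block_number"]), int(row["gas_price"])


# Usage (assumes transactions.csv exists in the working directory):
# with open("transactions.csv", newline="") as f:
#     for tx_hash, block_number, gas_price in iter_gas_prices(f):
#         print(block_number, tx_hash, gas_price)
```

Streaming row by row keeps memory use flat even when the export covers a large block range.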

Infura could save a lot of bandwidth if the getTransactionByBlockNumberAndIndex method supported finer granularity, letting users specify the fields they need through an extra param (in my case, for example, I only want hash and gas_price).

aliatiia · August 6, 2020, 3:24pm

Update: this EIP solves the issue generally