Abstract
To simulate blockchain systems as realistically as possible, we need accurate estimates of the probability distributions of the variables involved. In this paper we obtain distributions for Ethereum smart contract transactions with respect to Gas Limit, Used Gas, Gas Price and CPU Time. To determine these distributions we use publicly available Ethereum smart contract information, augmented with experimental data for over 300,000 smart contracts obtained on a test bed. We conclude that Gaussian Mixture Models are appropriate for the distributions of smart contracts with respect to Used Gas and Gas Price, and we use a uniform distribution for Gas Limit. A correlation analysis shows that CPU Time is strongly correlated with Used Gas, and we therefore apply regression techniques to estimate CPU Time conditioned on Used Gas. We experiment with three ensemble regression methods, namely Random Forest, Gradient Boosting Machine and Adaptive Boosting, and conclude that Random Forest is both fast and accurate.
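To illustrate how such distributions might feed a simulator, the following is a minimal sketch that draws synthetic transaction attributes: Used Gas and Gas Price from one-dimensional Gaussian mixtures, and Gas Limit from a uniform distribution. All names and numbers here (`USED_GAS_GMM`, `GAS_PRICE_GMM`, `BLOCK_GAS_LIMIT`, `sample_gmm`, `sample_transaction`, the mixture weights, means and standard deviations) are hypothetical placeholders rather than fitted values, and the choice to draw Gas Limit uniformly between Used Gas and an upper bound is an assumption made for the example. CPU Time is left out; in a full pipeline it would be predicted from Used Gas by a trained regression model.

```python
import random

# Hypothetical mixture parameters as (weight, mean, std dev) tuples.
# These are illustrative placeholders, NOT values fitted in the paper.
USED_GAS_GMM = [(0.6, 50_000.0, 10_000.0), (0.4, 200_000.0, 40_000.0)]
GAS_PRICE_GMM = [(0.7, 20.0, 5.0), (0.3, 60.0, 15.0)]  # assumed unit: Gwei
BLOCK_GAS_LIMIT = 8_000_000.0  # assumed upper bound on Gas Limit


def sample_gmm(components, rng, lower=0.0, upper=float("inf")):
    """Draw one value from a 1-D Gaussian mixture.

    A component is picked with probability equal to its weight, then a
    Gaussian draw is taken from it. Draws outside (lower, upper] are
    rejected, so the result follows the mixture truncated to that range.
    """
    weights = [w for w, _, _ in components]
    while True:
        _, mean, std = rng.choices(components, weights=weights, k=1)[0]
        value = rng.gauss(mean, std)
        if lower < value <= upper:
            return value


def sample_transaction(rng):
    """Return one synthetic transaction as a dict of sampled attributes."""
    used_gas = sample_gmm(USED_GAS_GMM, rng, upper=BLOCK_GAS_LIMIT)
    gas_price = sample_gmm(GAS_PRICE_GMM, rng)
    # Assumption: Gas Limit is uniform between Used Gas and the upper bound,
    # so that a sampled transaction never runs out of gas.
    gas_limit = rng.uniform(used_gas, BLOCK_GAS_LIMIT)
    return {"used_gas": used_gas, "gas_price": gas_price, "gas_limit": gas_limit}


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed for reproducible draws
    for _ in range(3):
        print(sample_transaction(rng))
```

Passing an explicit `random.Random` instance keeps the sampler reproducible and lets a simulator run several independent streams side by side.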