Abstract
Heavy tails in workloads (file sizes, flow lengths, service times, etc.) have a significant negative impact on the performance of queues and networks. In the context of the famous Internet file size data of Crovella and some very recent data sets from a wireless mobility network, we examine the new class of LogPH distributions introduced by Ramaswami for modeling heavy-tailed random variables. The fits obtained are validated using separate training and test data sets, and also in terms of the model's ability to predict performance measures accurately when compared with a trace-driven NS-2 simulation of a bottleneck Internet link running TCP. The use of the LogPH class is motivated by the fact that these distributions have a power-law tail and can approximate any distribution arbitrarily closely, not just in the tail but over its entire range. In many practical contexts, although the tail exerts a significant effect on performance measures, the bulk of the data is in the head of the distribution. Our results, based on a comparison of the LogPH fit with other classical model fits such as Pareto, Weibull, LogNormal, and Log-t, demonstrate the greater accuracy achievable with LogPH distributions and confirm the importance of modeling the distribution over its entire range and not just in the tail.