Abstract
This paper proposes a new cost-driven approach for detecting non-technical loss (NTL) of electricity in a resolution-constrained setting. NTLs are caused by fraudulent customer behavior and are reported to cost utility companies $96 billion annually. Because global adoption of smart meters is still at an early stage, with 14% market penetration, many utility companies must detect NTLs from low-resolution signals. Our proposed method optimizes for the expected economic return. It employs a synthetic control approach and an ensemble boosting model that jointly outperform the state-of-the-art support vector machine and random forest methods described in the literature. We also use a class-imbalance-agnostic precision-recall metric to validate our approach under various conditions. The analysis was conducted on a subset of a dataset of customer accounts from a large utility company that serves a population of over 30 million people. The utility company tested our proposed method, and initial results show approximately 75% precision in detecting new NTL cases.