Abstract
Big data computation and related services have become a focus of research and of widely used applications, driven by rapid progress in big data technology and statistical data analysis solutions. Several data quality issues contribute to erroneous decisions in organizations and institutions. Current research covers only how to validate data adequately to ensure its validity; here, data integrity is treated as synonymous with data validity. Validation is a difficult task that is often performed by national statistical organizations and institutes, and there is a significant need for a general system for validating the integrity of big data. This work presents a model for data integrity, particularly for big data, and for carrying out the validation process. The model covers the validity of the data fields, the validity of data measurement, and compliance with the data cycle chain. For the integrity of big data, both the processing speed and the accuracy of the verification process are taken into account. The research used the Python programming language and real test data, and relied on recent technologies and programming languages.