Abstract
To exploit large-scale data sets effectively within limited storage space, the data must first be reduced. Several methods serve this purpose, clustering among them. However, clustering reaches its limits on large-scale data sets. In this paper, we propose reducing the workspace with Principal Component Analysis (PCA). We consider fuzzy clustering of a data set for which users do not know the optimal number of clusters to generate. We show the effectiveness of applying PCA as a preprocessing step before any clustering operation.