Abstract
Nowadays, the Internet offers data to anyone at any time. Websites have been warehousing data for many years, i.e., for 10 years or more. In the meantime, many of them have become obsolete: they no longer have an owner, either because no one maintains them or because they have become unavailable for indexing by spiders, the programs that retrieve information about documents to be referenced. As a result, these websites can no longer be accessed from Internet browsers and are therefore referred to as abandoned websites. This paper focuses on the problem of how to identify abandoned websites and how to preserve and reconstruct the data they hold. We concentrate mainly on abandoned sport websites, which in general contain very important data about the results achieved at various sporting competitions in the past. The proposed solution consists of four steps: analyzing the abandoned servers that held these websites, identifying the structure of the abandoned web page sets, web scraping, and preserving and visualizing these page sets. To test the prototype solution, some of these steps were applied to reconstruct and preserve the data on abandoned web servers used for tracking the results of running competitions. Additionally, the opportunities and challenges of applying data mining techniques to reconstructed websites are listed.