Abstract
Data on the Internet is available in various models and formats, one of which is the table. Table extraction is needed when tables from several different Internet sources must be processed together. Currently this is usually done by copy and paste, which is not an automatic process. This article presents an approach to preparing the table area so that tables in HTML format can be extracted and converted into a database, which makes it easier to combine data from many sources. Two algorithms were tested: Algorithm 1, which determines the actual number of columns and rows of a table, and Algorithm 2, which determines the boundary line of the property. Tests were conducted on 100 tables in HTML format. The results show an accuracy of 99.9% for Algorithm 1 and 84% for Algorithm 2.
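To make the task of Algorithm 1 concrete, the following is a minimal sketch of one way to count the actual rows and columns of an HTML table when cells use `colspan` and `rowspan`. It is an illustration only and is not the article's algorithm, which is not specified in this abstract. The names `TableSizeParser`, `_span`, and `table_size` are hypothetical, and the sketch assumes a single, non-nested table whose span attributes are plain integers.

```python
from html.parser import HTMLParser


def _span(value):
    """Parse a colspan/rowspan attribute; fall back to 1 if missing or invalid."""
    try:
        return max(1, int(value))
    except (TypeError, ValueError):
        return 1


class TableSizeParser(HTMLParser):
    """Collects (colspan, rowspan) for every cell, grouped by <tr>.

    Assumption: the input holds one table with no nested tables.
    """

    def __init__(self):
        super().__init__()
        self.rows = []  # one list of (colspan, rowspan) tuples per <tr>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th") and self.rows:
            attributes = dict(attrs)
            self.rows[-1].append(
                (_span(attributes.get("colspan")), _span(attributes.get("rowspan")))
            )


def table_size(html):
    """Return (row_count, column_count) of the grid after expanding spans."""
    parser = TableSizeParser()
    parser.feed(html)

    occupied = set()  # grid positions (row, column) covered by some cell
    for r, cells in enumerate(parser.rows):
        c = 0
        for colspan, rowspan in cells:
            # Skip positions already filled by a rowspan from an earlier row.
            while (r, c) in occupied:
                c += 1
            for dr in range(rowspan):
                for dc in range(colspan):
                    occupied.add((r + dr, c + dc))
            c += colspan

    if not occupied:
        return (0, 0)
    row_count = max(r for r, _ in occupied) + 1
    column_count = max(c for _, c in occupied) + 1
    return (row_count, column_count)
```

For example, a table whose first row contains a header cell with `colspan="2"` followed by a cell with `rowspan="2"` has three actual columns, even though its second row contains only two `<td>` elements. Counting tags alone would miss this, which is why the sketch expands every span into grid positions before measuring.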