Tuesday, December 13, 2011

Data Mining


Data mining is a relatively new branch of computer science. People still debate which field it belongs to, because data mining draws on databases, artificial intelligence, statistics, and more. Some argue that data mining is nothing more than machine learning or statistical analysis run on top of a database. Others argue that the database plays a central role, because data mining works on very large data sets (up to terabytes), which makes database query optimization particularly important.

So what is data mining? Is it closely related to the world of mining: gold mines, tin mines, and so on? A simple definition is that data mining is the extraction of important or interesting information and patterns from the data held in large databases. In scientific journals, data mining is also known as Knowledge Discovery in Databases (KDD).

Data mining emerged against the background of the recent data explosion, in which many organizations have accumulated years of data (purchasing data, sales data, customer data, transaction data, and so on). Almost all of this data is entered through the computer applications that handle daily transactions, most of which are OLTP (On-Line Transaction Processing) systems. Imagine how many transactions a hypermarket such as Carrefour records, or how many credit card transactions a bank processes, in a single day, and then imagine how large that data becomes after a few years. The question is whether that data is left to pile up unused and eventually thrown away, or whether we can mine it for the 'gold' and 'diamonds', namely information that is useful to our organization. Many of us are drowning in data but poor in information.

If you have a credit card, you have surely received letters containing brochures offering goods or services. If your card-issuing bank has 1,000,000 customers and sends each of them just one offer at a mailing cost of Rp. 1,000 per piece, the cost is Rp. 1 billion. If the bank sends offers once a month, or 12 times a year, the annual budget is Rp. 12 billion. Of that Rp. 12 billion, what percentage of recipients actually buy? Perhaps only 10%. Put bluntly, 90% of the money is wasted.

This is just one of the many potential problems that data mining can address. Data mining can analyze credit card purchase transactions to identify which customers are genuinely likely to buy a particular product.
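
The mailing arithmetic in the credit card example can be checked with a short Python sketch. The figures (1,000,000 customers, Rp. 1,000 per piece, 12 mailings a year, a guessed 10% response rate) come from the example itself; the function name `mailing_cost` and the `targeted_percent` parameter are assumptions introduced only for illustration, and whole-number percentages are used so the amounts stay exact.

```python
def mailing_cost(customers, cost_per_piece, mailings_per_year, targeted_percent=100):
    """Annual cost in rupiah of mailing offers to a percentage of customers.

    targeted_percent is a hypothetical knob (not part of the original example):
    100 means every customer receives every mailing.
    """
    recipients = customers * targeted_percent // 100
    return recipients * cost_per_piece * mailings_per_year


CUSTOMERS = 1_000_000
COST_PER_PIECE = 1_000   # Rp. per mailed offer
MAILINGS_PER_YEAR = 12   # one offer per month
RESPONSE_PERCENT = 10    # the post's guess at how many recipients buy

one_mailing = mailing_cost(CUSTOMERS, COST_PER_PIECE, 1)
annual = mailing_cost(CUSTOMERS, COST_PER_PIECE, MAILINGS_PER_YEAR)
# Share of the annual budget spent on recipients who do not buy.
wasted = annual * (100 - RESPONSE_PERCENT) // 100

print(f"One mailing:         Rp. {one_mailing:,}")
print(f"Per year:            Rp. {annual:,}")
print(f"Spent on non-buyers: Rp. {wasted:,}")
```

Passing a smaller `targeted_percent` shows how the yearly bill shrinks if mining the transaction data lets the bank mail only the customers most likely to respond.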
We may never be precise enough to mail only the 10% who buy, but imagine if we could narrow the mailing to 20% of customers: the remaining 80% of the funds could be used for other things.

So how does data mining differ from data warehousing and OLAP (On-Line Analytical Processing)? The short answer is that data mining makes full use of existing data warehouse and OLAP technology. Data warehouse technology is used to perform OLAP, while data mining is used to discover information intended for data analysts and business analysts (with added visualization, of course). In practice, data mining also takes its data from a data warehouse. The difference is that data mining applications are more specialized than OLAP, given that databases are not the only field that shapes data mining. Many other fields enrich it, such as information science, high-performance computing, visualization, machine learning, statistics, neural networks, mathematical modeling, information retrieval, information extraction, and pattern recognition. Even image processing is used to mine image and spatial data.

By integrating OLAP with data mining, users are expected to be able to do the things usually done in OLAP, such as drilling down or rolling up to see the data in more or less detail, pivoting, slicing, and dicing. All of this is expected to be done interactively and with visualization.

Data mining is not limited to transaction data. Research in the field is now extending to advanced database systems such as object-oriented databases, image and spatial databases, time-series and temporal databases, text (known as text mining), the web (known as web mining), and multimedia databases.
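
The OLAP operations named in this post (rolling up, drilling down, slicing, dicing, pivoting) can be illustrated on a tiny in-memory table. This is only a sketch: the `SALES` rows, the dimension names, and every function below are hypothetical and merely stand in for what a real OLAP engine does against a data warehouse.

```python
from collections import defaultdict

# Hypothetical fact table: (year, region, product, sales amount).
SALES = [
    (2010, "Jakarta", "rice", 120),
    (2010, "Jakarta", "sugar", 80),
    (2010, "Bandung", "rice", 60),
    (2011, "Jakarta", "rice", 150),
    (2011, "Bandung", "sugar", 40),
    (2011, "Bandung", "rice", 70),
]
DIMENSIONS = {"year": 0, "region": 1, "product": 2}


def roll_up(rows, *dims):
    """Total the sales amount per combination of the given dimensions."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[DIMENSIONS[d]] for d in dims)
        totals[key] += row[3]
    return dict(totals)


def slice_(rows, dim, value):
    """Slice: fix one dimension to a single value."""
    return [r for r in rows if r[DIMENSIONS[dim]] == value]


def dice(rows, **allowed):
    """Dice: keep rows whose dimensions fall inside the allowed value sets."""
    return [
        r for r in rows
        if all(r[DIMENSIONS[d]] in values for d, values in allowed.items())
    ]


def pivot(rows, row_dim, col_dim):
    """Pivot: cross-tabulate total sales, row_dim as rows, col_dim as columns."""
    table = defaultdict(lambda: defaultdict(int))
    for r in rows:
        table[r[DIMENSIONS[row_dim]]][r[DIMENSIONS[col_dim]]] += r[3]
    return {k: dict(v) for k, v in table.items()}


by_year = roll_up(SALES, "year")                    # coarse, rolled-up view
by_year_region = roll_up(SALES, "year", "region")   # drill down one level
sales_2011 = slice_(SALES, "year", 2011)
jakarta_rice = dice(SALES, region={"Jakarta"}, product={"rice"})
region_by_product = pivot(SALES, "region", "product")

print(by_year)
print(by_year_region)
print(region_by_product)
```

Drilling down and rolling up are the same aggregation at different levels of detail, which is why a single `roll_up` helper covers both here: adding a dimension drills down, removing one rolls up.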
Although the buzz may not be as loud as when client/server databases first appeared, companies such as IBM, Microsoft, SAS, SGI, and SPSS continue to research data mining aggressively. Current research aimed at advancing data mining includes improving performance on terabyte-sized data, making visualization more appealing to users, and developing a data mining query language that is as similar to SQL as possible. The aim is simply to let end users perform data mining easily and quickly and get accurate results.
