@MASTERSTHESIS{IMM2006-04962, author = "N. Arnth-Jensen", title = "Applied Data Mining for Business Intelligence", year = "2006", keywords = "Business Intelligence, Data Mining, Knowledge Discovery in Databases, partition clustering algorithms, kNN, {FCM,} {UFP-ONC,} classification, cluster validity criteria.", school = "Informatics and Mathematical Modelling, Technical University of Denmark, {DTU}", address = "Richard Petersens Plads, Building 321, {DK-}2800 Kgs. Lyngby", type = "", note = "Supervised by Associate Professor Jan Larsen, {IMM-DTU}. Thesis co-supervisors are Christian Leth, Project Chief, gatetrade.net A/S and Allan Eskling-Hansen, Chief Financial Officer, gatetrade.net A/S.", url = "http://www2.compute.dtu.dk/pubdb/pubs/4962-full.html", abstract = "Business Intelligence (BI) solutions have for many years been a hot topic among companies due to their optimization and decision making capabilities in business processes. The demand for yet more sophisticated and intelligent {BI} solutions is constantly growing due to the fact that storage capacity grows with twice the speed of processor power. This unbalanced growth relationship will over time make data processing tasks more time consuming when using traditional {BI} solutions. Data Mining (DM) offers a variety of advanced data processing techniques that may beneficially be applied for {BI} purposes. This process is far from simple and often requires customization of the {DM} algorithm with respect to a given {BI} purpose. The comprehensive process of applying {BI} for a business problem is referred to as the Knowledge Discovery in Databases (KDD) process and is vital for successful {DM} implementations with {BI} in mind. In this project the emphasis is on developing a number of advanced {DM} solutions with respect to desired data processing applications chosen in collaboration with the project partner, gatetrade.net.
To gatetrade.net this project is meant as an eye opener to the world of advanced data processing and to all of its advantages. In the project, gatetrade.net is the primary data supplier. The data is mainly of a transactional character (order headers and lines) since gatetrade.net develops and maintains e-trade solutions. Three different segmentation approaches (k-Nearest Neighbours (kNN), Fuzzy {C-}Means (FCM) and Unsupervised Fuzzy Partitioning - Optimal Number of Clusters (UFP-ONC)) have been implemented and evaluated in the pursuit of finding a good clustering algorithm with a high, consistent performance. In order to determine optimal numbers of segments in data sets, ten different cluster validity criteria have also been implemented and evaluated. To handle gatetrade.net data types a Data Formatting Framework has been developed. Addressing the desired data processing applications is done using the capable {UFP-ONC} clustering algorithm (supported by the ten cluster validity criteria) along with a number of custom developed algorithms and methods. For future gatetrade.net interest a draft for a complete {BI} framework using some or all of the developed data processing algorithms is suggested." }