Scenarios when data preprocessing is imperative with examples in Dell Statistica 12 & RapidMiner Studio 6.5

Several data preprocessing steps are usually required before applying any machine learning algorithm to data. These steps are dictated by the nature of the available data and of the algorithms. Listed below are a few common instances where data preprocessing is required. Recall that in this context, attributes are variables (columns in the data spreadsheet) and each row in this column is a…

Assessing Clustering Tendency in R

In clustering, a researcher or analyst faces two major questions. First, does the given dataset have any clustering tendency? Second, how does one determine the optimal number of clusters in a dataset and validate the clustered results? In this post, I have attempted to answer these questions using R.

Data Processing with Weka (Part II)

Today, I will discuss and elaborate on data processing in Weka 3.6 (it’s the same in version 3.7 too). This post is the second part of the series “Data pre-processing with Weka”. If you have not seen my earlier post, please read that first. Continuing further, assuming that you have cleaned…