Save water! Save life! Proposed methods

At this point in the project, I am still preparing the data for exploration and subsequent analysis. I therefore provide brief guidelines below on how I will perform the data analysis. The sub-chapter on Analysis in particular is not final; it only outlines the approach that I should consider. I now…

“Give me data and I promise you clusters”: The case of the k-means algorithm

Introduction. The title of this week’s essay is derived from the famous speech (“Give me blood and I promise you freedom!”) delivered by the Indian nationalist Subhash Chandra Bose in Burma on July 4th, 1944. An essay makes more sense if its title relates to its contents. Thus, after considerable debate on how to title it aptly,…

To penalise or not to penalise: The curious case of automatic feature selection

What is Lasso Regression? The LASSO (Least Absolute Shrinkage and Selection Operator) is a shrinkage and selection method for linear regression. It works by penalizing the absolute size of the regression coefficients. A good layman's description is given in this SO post; to quote: “By penalizing (or equivalently constraining the sum of the absolute…
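
To make the idea of "penalizing the absolute size of the regression coefficients" concrete, here is a minimal sketch in plain Python. It is not taken from the post: the toy data, the penalty value `lam=0.5`, and the function names `soft_threshold` and `lasso_coordinate_descent` are all assumptions chosen for illustration. It uses coordinate descent, one common way to fit a lasso, on a no-intercept model. The second feature carries no extra signal, so the penalty should push its coefficient to exactly zero while only shrinking the first.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; anything within [-lam, lam] becomes exactly 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimise (1/2n) * ||y - Xb||^2 + lam * sum(|b_j|), with no intercept term."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # sum of squares of feature j
            for i in range(n):
                # Prediction using every feature except j.
                pred_without_j = sum(X[i][k] * b[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            # The L1 penalty enters only through the soft-threshold step.
            b[j] = soft_threshold(rho / n, lam) / (z / n)
    return b


# Made-up toy data: y depends only on the first column (y = 2 * x0).
X = [[1.0, 0.5], [2.0, -0.3], [3.0, 0.1], [4.0, -0.2], [5.0, 0.4]]
y = [2.0, 4.0, 6.0, 8.0, 10.0]

coef = lasso_coordinate_descent(X, y, lam=0.5)
print(coef)
```

The first coefficient comes out slightly below the unpenalized value of 2 (shrinkage), and the second is exactly zero (selection), which is the "automatic feature selection" the title refers to. In practice one would normally standardize the features and reach for an established library implementation rather than this hand-rolled loop.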

A random forest approach to predicting breast cancer in working-class women

What is a Random Forest? A random forest is an ensemble (a group or combination) of trees that collectively vote for the most popular class, which helps cancel out the noise of individual trees. Ensemble learning: in machine learning, this refers to methods that generate many classifiers…
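
The voting idea can be sketched in a few lines of plain Python. This is a deliberately simplified illustration, not the analysis from the post: the data are made up, every "tree" is only a depth-1 stump, and the names `fit_stump`, `fit_forest`, and `predict_forest` are hypothetical. Real random forests grow much deeper trees and sample a subset of features at every split. What the sketch does keep are the two sources of randomness (a bootstrap sample of rows and a randomly chosen feature per tree) and the majority vote.

```python
import random
from collections import Counter


def majority(labels):
    """Most common label in a list."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(rows, labels, feature):
    """Depth-1 tree on one feature: keep the threshold with the fewest training errors."""
    values = sorted(set(r[feature] for r in rows))
    fallback = majority(labels)
    best = (len(rows) + 1, None, fallback, fallback)  # (errors, threshold, left, right)
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [y for r, y in zip(rows, labels) if r[feature] <= t]
        right = [y for r, y in zip(rows, labels) if r[feature] > t]
        l_lab, r_lab = majority(left), majority(right)
        errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
        if errors < best[0]:
            best = (errors, t, l_lab, r_lab)
    _, t, l_lab, r_lab = best
    return feature, t, l_lab, r_lab


def predict_stump(stump, row):
    feature, t, l_lab, r_lab = stump
    # t is None when the sample had a single distinct value; predict the fallback label.
    if t is None or row[feature] <= t:
        return l_lab
    return r_lab


def fit_forest(rows, labels, n_trees=25, seed=0):
    rng = random.Random(seed)
    n, p = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap: sample rows with replacement
        feature = rng.randrange(p)                  # each tree sees one random feature
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feature))
    return forest


def predict_forest(forest, row):
    """Every tree casts one vote; the most popular class wins."""
    return majority([predict_stump(s, row) for s in forest])


# Made-up toy data: class 0 has small values in both features, class 1 has large ones.
rows = [(float(i), i + 0.5) for i in range(10)]
labels = [0] * 5 + [1] * 5

forest = fit_forest(rows, labels)
print(predict_forest(forest, (1.0, 1.5)), predict_forest(forest, (8.0, 8.5)))
```

An individual stump trained on an unlucky bootstrap sample can be a poor classifier, but because the final answer is a vote across all trees, such mistakes tend to be outvoted. That is the sense in which the ensemble cancels out noise.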