Survival Analysis using Logistic Regression

While doing a project involving survival analysis a while back, I learned a useful technique for reducing survival analysis to binary regression. The following discussion is adapted from Chapter 7 of Survival Analysis by David G. Kleinbaum. To illustrate how the technique works, consider a small data set with three subjects. Subjects 1 and 3 are …

Agile Data Science: Applying Kanban in the Analytics Life Cycle

Sometimes I can’t help but wonder whether data science as a discipline is still in the dark ages. From my vantage point as a practicing data scientist, I see that too many projects still rely on personal heroics for success. By and large, ours is still a discipline dominated by the need for stars and …

A Note on Shopping Efficiency and Product Placement

Most of us have heard of this little puzzle in retail analytics: Assuming we know from market basket analysis that products A and B are usually purchased together, how do we place them on shelves to maximise sales? The often-stated “sophisticated” answer is that A and B should be placed far away from each other because …

A Text-Based and Self-Adapting Product Recommendation Engine

Content-based recommendation engines are typically built in two steps. In the first, a user-preference model is constructed using a set of predefined features. (For example, in the context of retail, we may have features like Price Sensitivity, Promotion Sensitivity, Coupon Redeemer, Calorie Counter, Working Mums, etc.) In the second step, products are mapped, either directly …