Views Prediction - A Quora Challenge - Final (EDA, Feature Engineering, and More)

I have found that porting big Jupyter notebooks to a markdown document takes a lot of time; sure, Jupyter can do that for you, but tweaking the markdown document so that code and tables match the style of your blog takes an unnecessary amount of time. It takes even more time to explore the data and derive insights from visualizations. Now, this is a necessary task. That’s why for the remaining journey of this Quora Challenge, I will be posting cleaned IPython notebooks along with a short summary of what I did in there....

September 6, 2017 · 2 min · Yohannes

Views Prediction - A Quora Challenge - Part II (Regression)

Recap: In the last post, we imported the data provided by Quora. It was split 9:1 into training and test data (the test data is unlabeled, with __ans__ missing). But we will also need to split our data even further: I am talking about using part of the labeled training data to check the accuracy of our model. In this post, I will build regression models and evaluate their accuracy. I am new to regression, so this post won’t focus on the Quora challenge; rather, it is an exposition to help me better understand regression models....

August 30, 2017 · 16 min · Yohannes

Views Prediction - A Quora Challenge

Hello! I am doing the Quora Views challenge. In upcoming posts, I will chronicle my progress. In this post, I introduce the challenge and then discuss the data munging. Prompt: The questions the challenge wants to address are the following: Can you tell which questions can organically attract the most viewers? What about questions that eventually become viral? Which questions are timeless and can sustain traffic? “Organically attract.” Hmm, what exactly does that mean?...

August 29, 2017 · 24 min · Yohannes