Friday, March 20, 2015

Hotel Demand Forecasting


The Hotel Demand Forecasting competition was held on CrowdAnalytix in February 2015.

Objective
The objective was to build a forecasting model to predict hotel demand using historic inquiries.

Data
The training data consisted of historic inquiries (reservation, denial, regret) for five different hotels in 2011, 2012, and 2013.

The test set required predicting the demand for the hotels in 2014.

Approach
The historic demand showed strong weekly patterns, as you would intuitively expect. A naive submission that used the previous year's demand for the same weekday (e.g., the previous year's Friday demand to predict this year's Friday demand in the corresponding week) gave a very good score. So I decided to work with historic averages and optimize them. I ended up using two different versions of historic averages and made it to the top 5 without a sophisticated model. I wouldn't be surprised if the other top finishers used similar ideas.
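
As a rough illustration of the naive baseline, here is a minimal Python sketch. It assumes "corresponding week" means the same ISO week number, which the post does not specify; the function names and the `history` dictionary (date to demand) are hypothetical and not taken from the competition code.

```python
from datetime import date


def same_weekday_last_year(d: date, years_back: int = 1) -> date:
    """Return the date with the same ISO week and weekday, `years_back` years earlier."""
    iso_year, iso_week, iso_weekday = d.isocalendar()
    target_year = iso_year - years_back
    try:
        return date.fromisocalendar(target_year, iso_week, iso_weekday)
    except ValueError:
        # ISO week 53 does not exist in every year; fall back to week 52 (an assumption).
        return date.fromisocalendar(target_year, 52, iso_weekday)


def naive_forecast(history: dict, d: date) -> float:
    """Predict demand on `d` as last year's demand on the matching weekday."""
    return history[same_weekday_last_year(d)]


# Made-up demand value, for illustration only.
history = {date(2013, 3, 22): 120.0}
friday_2014 = date(2014, 3, 21)
prediction = naive_forecast(history, friday_2014)
```

Matching on ISO week rather than calendar date keeps Fridays aligned with Fridays, which is the point of the baseline.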

The first method took a weighted average of the previous three years' demand on the same weekday of the corresponding week.
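
A minimal sketch of this weighted average follows. The weights shown (more weight on recent years) and the demand values are placeholders I made up for illustration; the post does not state the actual weights.

```python
def weighted_same_weekday(demand_by_year: dict, weights: dict) -> float:
    """Weighted average of same-weekday demand across the years present in `demand_by_year`."""
    total_weight = sum(weights[year] for year in demand_by_year)
    weighted_sum = sum(weights[year] * demand_by_year[year] for year in demand_by_year)
    return weighted_sum / total_weight


# Hypothetical same-weekday demand for one hotel and hypothetical year weights.
demand_by_year = {2011: 100.0, 2012: 110.0, 2013: 120.0}
year_weights = {2011: 1.0, 2012: 2.0, 2013: 3.0}
estimate = weighted_same_weekday(demand_by_year, year_weights)
```

Normalizing by the total weight means the weights do not need to sum to one, and a missing year simply drops out of the average.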

The second method aggregated the weekly demand of the previous three years in the corresponding week and split it across weekdays according to each weekday's historic share of demand, which was calculated separately for each quarter.
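
The splitting step might look something like the sketch below. It takes the aggregated weekly total as given (how the three years are combined is left open here) and only shows the per-quarter weekday shares. All names and numbers are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta


def quarter(d: date) -> int:
    """Calendar quarter (1 to 4) of a date."""
    return (d.month - 1) // 3 + 1


def weekday_shares(history: dict) -> dict:
    """Map each quarter to a list of 7 demand shares (Monday first) that sum to 1."""
    totals = defaultdict(lambda: [0.0] * 7)
    for d, demand in history.items():
        totals[quarter(d)][d.weekday()] += demand
    return {q: [v / sum(vals) for v in vals] for q, vals in totals.items()}


def split_week(weekly_total: float, shares: list) -> list:
    """Distribute a weekly total across the 7 weekdays using the given shares."""
    return [weekly_total * s for s in shares]


# One made-up week of history in Q1, starting on a Monday.
week_start = date(2013, 1, 7)
daily = [10, 10, 10, 10, 20, 30, 10]
history = {week_start + timedelta(days=i): v for i, v in enumerate(daily)}

shares = weekday_shares(history)
forecast = split_week(200.0, shares[quarter(date(2014, 1, 10))])
```

Computing the shares per quarter lets the weekday profile shift with the season, while the weekly total carries the overall level.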

Model
The final model was an average of the two historic averages, along with some smoothing. The smoothing redistributed the predictions across three weeks (the week before, the current week, and the week after) using a weighted average. This smoothing gave me the biggest jump in my LB score and pushed me into the top 5.
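
Here is one way the blend and the three-week smoothing could be sketched. It is my reading of the description, not the actual implementation: each day's prediction is replaced by a weighted average of the same weekday one week earlier, the day itself, and one week later. The weights (0.25, 0.5, 0.25) and the edge handling are assumptions.

```python
def blend(first: list, second: list) -> list:
    """Simple average of two daily forecast series of equal length."""
    return [(a + b) / 2 for a, b in zip(first, second)]


def smooth_across_weeks(pred: list, weights=(0.25, 0.5, 0.25)) -> list:
    """Weighted average of each day with the same weekday one week before and after.

    At the start and end of the series, where a neighbouring week is missing,
    the remaining weights are renormalized (an assumption).
    """
    n = len(pred)
    smoothed = []
    for t in range(n):
        num = 0.0
        den = 0.0
        for offset, w in zip((-7, 0, 7), weights):
            j = t + offset
            if 0 <= j < n:
                num += w * pred[j]
                den += w
        smoothed.append(num / den)
    return smoothed


# Three made-up weeks of daily predictions from the two methods.
method_one = [100.0] * 7 + [220.0] * 7 + [300.0] * 7
method_two = [100.0] * 7 + [180.0] * 7 + [300.0] * 7
final = smooth_across_weeks(blend(method_one, method_two))
```

Smoothing along the same weekday keeps the weekly pattern intact while damping week-to-week jumps.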

Code
View on Github

Results
I stood 5th on the public LB out of 45 teams. My model achieved a MAPE of ~0.26; the best score was ~0.25.
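
For reference, MAPE (mean absolute percentage error) expressed as a fraction can be computed as in the sketch below. Skipping days with zero actual demand is my assumption; the post does not say how the competition handled them.

```python
def mape(actual: list, forecast: list) -> float:
    """Mean absolute percentage error as a fraction, ignoring zero actuals."""
    errors = [abs(a - f) / abs(a) for a, f in zip(actual, forecast) if a != 0]
    return sum(errors) / len(errors)


# Made-up values: both predictions are off by 10%.
score = mape([100.0, 200.0], [110.0, 180.0])
```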

The top-5 models were evaluated further and I still stood 5th :-|

Views
This is my second competition on CrowdAnalytix (after Exacerbation), and I'm glad I could finish in the top 5 in both of them. Though these competitions are not as popular as the ones on Kaggle, I enjoyed exploring this forecasting model, especially since it was one without any features.

Check out My Best CrowdAnalytix Performances

2 comments:

  1. How did you choose the weights for different years and the smoothing factors for weeks?

    1. A little intuition, a little guesswork, but mainly something stable that reduced errors.
