The Future of Flying, with Emirates Airlines and CMU Innovation Lab

AUTHOR:
Sampath Chanda

Emirates and the Integrated Innovation Institute of Carnegie Mellon University recently sponsored a “travel hackathon” in Santa Clara, CA with Intel Nervana as one of the partners. This coding marathon brought together 175 hackers in 37 teams who worked for 24 hours to solve one of three travel problem statements using artificial intelligence and machine learning technologies. The top 5 teams were awarded prizes, and each member of the grand prize team received a round trip ticket on Emirates.

Intel Nervana was the lead sponsor and was instrumental to the hackathon, giving participants access to its high-performance computing cluster featuring Intel® Xeon Phi™ processors. These massively parallel, vectorized processors were developed to support computation-intensive AI workloads and allowed participating teams to rapidly prototype their ideas within the 24-hour deadline. Each team was given access to a node within the Intel Nervana DevCloud to stage code and data, compile, and submit tasks to a queue. Jobs were scheduled on the 64-core Intel® Xeon Phi™ processor 7210 with access to 96 GB of on-platform RAM (DDR4) and 16 GB of high-bandwidth memory (MCDRAM), configured in flat mode. On the software side, each node in the cluster included access to Intel® Nervana™’s neon™ and Intel-optimized versions of Python*, TensorFlow*, Caffe*, Keras* and Theano*.

24 Hours – and Many Problems to Solve

Emirates travel hackathon participants were challenged with three problem statements:

  1. How to identify the current city of residence of Emirates “Skywards” rewards program members using social media data
  2. How to better understand the influence of external factors on airline pricing
  3. How to construct new attribution models and determine the success of marketing tools

Though Emirates Skywards rewards program members provide a city of residence during registration, this profile data can become outdated if the member does not manually update it. As a result, Emirates cannot reliably serve those Skywards members with relevant offers and benefits. To solve this problem, Emirates suggested that hackathon teams use available social media information along with existing Skywards member profile data to infer the current city of residence of members. However, due to confidentiality clauses, personal information in the user database was encrypted. Therefore, the data provided was not labeled, which made this an unsupervised problem. The encryption also made it challenging to directly map member profiles to their respective social media accounts. Hackathon teams were expected to use the different features available in the given data to identify and predict the current city of residence of Skywards members.

In the second problem statement, participants were asked to determine the influence of external factors like weather information, geopolitical data and stock exchange activity on airline pricing. By better understanding the influence of external factors on demand, Emirates can optimize its pricing. Emirates also wanted to understand the longer-term influences of these factors, which would help it prepare for future demand. To address this problem, Emirates provided hackathon participants with a dataset of standardized airline fares from the International Air Transport Association (IATA) for select US destinations. The primary challenge was to develop better models for external factors, which vary widely.

The third problem statement challenged teams to build better attribution models, that is, ways to determine how a given marketing channel or element affects sales. If companies learn that a certain marketing channel delivers better results, they can invest more in that channel and increase sales further.
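
To make the idea of an attribution model concrete, here is a minimal sketch of two common textbook baselines: last-touch attribution, which credits the final channel before a sale, and linear attribution, which splits the credit evenly across all touchpoints. These baselines, the channel names and the booking values are illustrative assumptions and were not part of the Emirates problem statement or dataset.

```python
from collections import defaultdict


def last_touch(journeys):
    """Give all credit for each sale to the final channel the customer touched."""
    credit = defaultdict(float)
    for channels, sale_value in journeys:
        credit[channels[-1]] += sale_value
    return dict(credit)


def linear(journeys):
    """Split the credit for each sale equally across every touchpoint."""
    credit = defaultdict(float)
    for channels, sale_value in journeys:
        share = sale_value / len(channels)
        for channel in channels:
            credit[channel] += share
    return dict(credit)


# Hypothetical journeys: (ordered channel touchpoints, booking value in dollars)
journeys = [
    (["email", "search", "display"], 900.0),
    (["search"], 600.0),
    (["display", "email"], 400.0),
]

print("last touch:", last_touch(journeys))
print("linear:    ", linear(journeys))
```

The two baselines distribute the same total revenue very differently across channels, which is why the choice of attribution model matters when deciding where to invest.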

And the Winners Are...

The Grand Prize went to Team Thunderbirds, who proposed solutions for all three problems. For the first problem, they hand-picked features such as boarding point, disembark point, passport nationality, EMCG score and preferred departure airport, and used an ensemble of 400 support vector machine models from the scikit-learn package in Python. Their solution achieved around 90% accuracy in predicting the residence city, and they claimed that accuracy could go up with more training data. They approached the second problem by making two predictions: one about market share and the other about ticket price. The intuition behind this approach was that market share depends on internal factors like origin airport, departure airport, sale timing, season information, etc. This data was then combined with data from external factors like natural disasters and the Dow Jones Industrial Average, and a Random Forest algorithm was applied to predict the new ticket price. For the third problem statement, their approach considered factors like real-time tracking, type of ads, demand curve and historical ad campaign results. Using these factors, they trained a neural network that was able to predict the attribution for each of the features.
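
For readers curious what an ensemble of support vector machines can look like in scikit-learn, here is a minimal sketch. How the team actually built and combined its 400 models was not described, so the use of bagging is an assumption, as are the synthetic data (three clusters standing in for three residence cities), the RBF kernel and the smaller ensemble size.

```python
from sklearn.datasets import make_blobs
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in data (assumption): three well-separated clusters play the
# role of three residence cities. Real member features would first need to be
# encoded numerically.
X, y = make_blobs(
    n_samples=300,
    centers=[[0, 0], [10, 10], [-10, 10]],
    cluster_std=1.0,
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Bagging (assumption): each SVM is trained on a bootstrap sample of the
# training set and the ensemble combines the members' predictions.
# 25 members keep the demo fast; the team reported using 400.
ensemble = BaggingClassifier(
    SVC(kernel="rbf", gamma="scale"),
    n_estimators=25,
    random_state=0,
)
ensemble.fit(X_train, y_train)

accuracy = ensemble.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

The accuracy on this toy data says nothing about the team's reported 90%; the point is only to show the shape of the approach.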

Team Go Flames, who won the 1st prize, tackled the price prediction problem statement. The team used PESTEL (Political, Economic, Social, Technological, Environmental, Legal) analysis, a framework for analyzing and monitoring external factors that can impact a project or an organization. They initially applied an Attention LSTM fully convolutional net to the features obtained from this analysis and got a mean absolute error (MAE) of $79. To improve on this, they used an XGBoost-based architecture whose price prediction reached an MAE of $23 with 10 attributes. Also, by plotting the feature importance, they found that most of the profit came from business class tickets.
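
Since MAE is the metric behind the $79 and $23 figures above, a small worked example may help. MAE is the average of the absolute differences between predicted and actual values, so an MAE of $23 means predictions were off by $23 on average. The fares below are made-up numbers for illustration and are not taken from the IATA dataset.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one value is required")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical fares in dollars (illustrative only)
actual_fares = [420.0, 615.0, 980.0, 1310.0]
predicted_fares = [400.0, 640.0, 950.0, 1330.0]

mae = mean_absolute_error(actual_fares, predicted_fares)
print(f"MAE: ${mae:.2f}")  # absolute errors are 20, 25, 30, 20 -> MAE: $23.75
```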

Team Purple is the new Red and Blue won the 2nd prize. The team attacked the first problem statement of predicting the residence city. Their technology stack included DaVinci Labs, JS, the Google API, the Twitter API, etc. They proposed tracking users’ Twitter timelines to estimate the probability of each location. The features they claimed to be useful were adjacent words, Twitter features and semantics. They further proposed a simple algorithm that uses these features along with data like nationality, frequency of travel location, and time of stay at a location to label the data with the residence city. Using this labeled data, they trained a deep neural network with three hidden layers that predicts the customer’s residence city.

Many other teams had interesting approaches to the given problems. Team Newbie worked on the problem of finding the residence city of the users. They ranked the features by importance using correlation metrics and removed features with little relevance. They also used a distance metric from source to destination airports as a feature. Team Cortex tackled the same problem, but by labeling the data first and then using a supervised algorithm. For labeling, they used the frequency with which a user visited a particular airport. Then, they trained a deep neural network with input features like gender, age, Skywards tier, flying frequency and average score to predict the residence city of a user. They reported an accuracy of 81.9% using this approach and also suggested other ways of labeling the data, using methods like TF-IDF.
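
Two of the ideas above are simple enough to sketch in a few lines: labeling a member with the airport they visit most often, and using the distance between source and destination airports as a feature. The teams did not say which distance metric they used, so the great-circle (haversine) distance here is an assumption. The function names, the sample travel history and the approximate airport coordinates are also illustrative.

```python
import math
from collections import Counter


def most_frequent_airport(airport_visits):
    """Label a member with the airport code seen most often in their history."""
    if not airport_visits:
        return None
    return Counter(airport_visits).most_common(1)[0][0]


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


# Hypothetical travel history for one member (airport codes)
visits = ["DXB", "SFO", "DXB", "JFK", "DXB", "SFO"]
label = most_frequent_airport(visits)
print("inferred home airport:", label)

# Approximate coordinates for Dubai (DXB) and San Francisco (SFO)
dxb_sfo_km = haversine_km(25.2532, 55.3657, 37.6213, -122.3790)
print(f"DXB to SFO: about {dxb_sfo_km:.0f} km")
```

A frequency-based label like this is only a heuristic, which is presumably why Team Cortex treated it as a starting point for training a supervised model.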

Conclusion

The Emirates Travel Hackathon started with some interesting problem statements and ended with a wide variety of exciting solutions proposed by the teams. The expectation of a hackathon is that participants arrive at different solutions that could, in turn, become the best performing solution in the industry. Clearly, very intuitive approaches were taken by the teams, making the hackathon a success. Intel Nervana helped the participants by providing access to their cutting-edge compute infrastructure so teams could develop their best possible solutions. As an added benefit, the hackathon was a great opportunity to network with industry developers and academic researchers. Overall, the hackathon was an amazing experience that allowed participants to increase their.

This blog was written by Sampath Chanda. Sampath is an Intel Nervana AI Student Ambassador at CMU. To follow Sampath, check out his ambassador profile and stay up to date on his current projects. To become an ambassador, apply here.

The Intel® Nervana™ AI Academy is the place to discover AI tools, training, optimized frameworks and a community of your fellow peers, professors and industry experts. Join the academy today.

Copyright © 2017, Intel Corporation. All rights reserved. *Other names and brands may be claimed as property of others.
