Analyze, predict and optimize: data mining tips from our Analytics Director

With more than 570 million reviews and opinions covering 1.2 million hospitality businesses, TripAdvisor processes a lot of data. Good thing we have a data mining authority on our team who has also written several popular books on the subject. With 20 years of experience, Michael Berry helps us stay on top of what’s going on and get ahead of what’s coming. And Michael’s sharing his expertise with you in this exclusive interview!

1. What is data mining?

Data mining is the business process for exploring large amounts of data for meaningful insights regarding business goals. For someone in the hospitality industry, that goal might be driving more bookings or increasing repeat visits from frequent diners.

2. Why is it important for hospitality businesses to use data? 

Hospitality is changing rapidly, in ways the industry hasn’t figured out yet. You need to know what works now, what will work soon and what used to work that doesn’t work anymore. The actionable tips you uncover from looking at data are how you get there.

Pricing optimization, what special offers will work for you, where you should advertise – these are all things you can better decide by looking at the data. Even small improvements add up over many guests or many years. Analytics has mainly been the territory of bigger chains but smaller properties can take advantage, too.

3. What are the key lessons for someone new to data mining? 

Be skeptical. Explore data with an open mind – don’t assume that what you expect to see is actually what you’ll see. Often, something completely different is revealed in the data.

It also helps to look at data visually. Using a program like Microsoft Excel to represent data as graphs, scatterplots or bar charts can help you notice something you wouldn’t otherwise.

4. What are the top three tips you have for getting the most from data? 

First, always test your hypotheses. Verify whether something that seems true actually is. For instance, if you think business travelers plan travel a week out while leisure travelers plan months ahead, you could test that by comparing the dates bookings are made with the dates the reservations are for. If the data supports your hypothesis, that could influence when you advertise and what kinds of special offers you include.
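A minimal sketch of that lead-time comparison in Python. The booking records, segment labels and dates are hypothetical placeholders, not TripAdvisor data; a real check would use an export from your own reservation system and far more rows.

```python
from datetime import date
from statistics import median

# Hypothetical bookings: (segment, date the booking was made, date of the stay).
bookings = [
    ("business", date(2018, 1, 2), date(2018, 1, 8)),
    ("business", date(2018, 1, 5), date(2018, 1, 12)),
    ("business", date(2018, 1, 10), date(2018, 1, 15)),
    ("leisure", date(2018, 1, 3), date(2018, 4, 3)),
    ("leisure", date(2018, 1, 7), date(2018, 3, 8)),
    ("leisure", date(2018, 1, 9), date(2018, 5, 9)),
]


def median_lead_days(rows, segment):
    """Median number of days between booking and stay for one segment."""
    leads = [(stay - booked).days for seg, booked, stay in rows if seg == segment]
    return median(leads)


business_lead = median_lead_days(bookings, "business")
leisure_lead = median_lead_days(bookings, "leisure")
print(f"business: {business_lead} days, leisure: {leisure_lead} days")
```

The median is used here rather than the average so that a handful of unusually early or late bookings does not dominate the comparison.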

Second, try to find the strange values and figure out what they mean. Recently, I was researching average daily rates and saw lots of $1 transactions. Turns out, they’re for credit card validation, but if I hadn’t removed them from the overall numbers they would have skewed my results.
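To see how much a few odd values can move an average, here is a small Python sketch. The transaction amounts are made up, and treating every charge of exactly $1 as a card validation is an assumption for illustration; in practice you would confirm what the strange values are before dropping them.

```python
from statistics import mean

# Hypothetical transaction amounts in dollars. The $1 entries stand in for
# credit card validation charges rather than real room rates.
transactions = [1.00, 189.00, 1.00, 210.00, 175.00, 1.00, 226.00]

VALIDATION_AMOUNT = 1.00  # assumption: validations are always exactly $1

# Keep only the amounts that represent real room rates.
real_rates = [amount for amount in transactions if amount != VALIDATION_AMOUNT]

raw_average = mean(transactions)
clean_average = mean(real_rates)
print(f"with $1 charges: {raw_average:.2f}, without: {clean_average:.2f}")
```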

Third, go beyond the raw data. The most valuable information is often something you can’t collect directly but can derive by combining the data points you do have. If you have data from your restaurant’s reservation system, you can look at the number of reservations a repeat customer has made. But a more useful indicator of frequent dining is visits per unit of time: a relatively new customer may dine more frequently than a long-time customer with a higher visit count.
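The visits-per-unit-time idea can be sketched in a few lines of Python. The customer names, dates and visit counts are hypothetical, and the 30.44-day average month is a simplifying assumption.

```python
from datetime import date

# Hypothetical reservation history: customer -> (first visit date, total visits).
customers = {
    "long_time_guest": (date(2015, 1, 1), 24),
    "new_guest": (date(2017, 7, 1), 12),
}
as_of = date(2018, 1, 1)  # the day the analysis is run


def visits_per_month(first_visit, total_visits, today):
    """Visit count divided by months as a customer (30.44-day average month)."""
    months = (today - first_visit).days / 30.44
    return total_visits / months


rates = {
    name: visits_per_month(first, count, as_of)
    for name, (first, count) in customers.items()
}
for name, rate in rates.items():
    print(f"{name}: {rate:.2f} visits per month")
```

In this made-up data the long-time guest has twice as many total visits, yet the newer guest dines roughly three times as often per month.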

5. How can someone start gathering data? 

You might not realize it, but you have lots of opportunities to collect useful information, including your booking or reservation system, guest surveys and loyalty programs. Give guests an incentive to boost your response rate, like, “Tell us your birthday and we’ll send you a special offer as a birthday gift!” or “Fill out this short survey and we’ll thank you for your opinions with 500 loyalty points.”

Google Analytics is a free way to learn how visitors come to and use your site, so you can test your marketing efforts and improve website performance over time. If you have a TripAdvisor Business Listing, you also have access to a lot of useful analytics on your property and your competitors. There are also third-party sources of industry-wide data that can show you, for example, whether you’re overpriced in the offseason compared to competitors.

6. What would be a good goal for someone with a fair amount of data experience? 

While reporting on the past is very important, your goal should be to see patterns in data to help predict what will happen in the future. When you can predict occupancy rates for the next few quarters, you can make more informed decisions on important things like pricing, promotions and staffing.

To go from reactive to predictive, search out a mentor. While you learn by doing, knowing someone who’s already on that level is a great accelerator. Ask around if experienced people would be willing to mentor you, or if they know anyone else who would. Especially if you’re new, attend conferences and networking events. They’re a great way to learn and meet people who might be able to help.

One conference in particular I’d recommend is Predictive Analytics World (PAW). It happens every few months, usually somewhere in North America, and I’ve gotten real value from it when I’ve attended.

7. What should someone consider when evaluating analytics software or tools? 

There’s a wide range of tools out there. I’ve used a lot of different software and tools over the last 20 years, and the key factors to consider are:

Does this fit with my skill set or the skill set of my staff? Even if a tool is great otherwise, it just doesn’t make sense if it uses SQL or SAS and you’re not familiar with them.

Does this fit my IT (information technology) environment? On a smaller scale, you shouldn’t choose a tool that’s only for PCs if you have a Mac. On an enterprise scale, you want to make sure this great program spits out data in a format you can handle.

Is this too good to be true? Don’t believe in “secret sauce” – some programs say you don’t need to know anything about analytics and you just push a button to get lots of valuable stuff. That’s not how it works. Data mining takes effort, but when it’s done right, the rewards outweigh the effort you put in.

And, lastly, I think it’s worth noting that even as an industry veteran, I mostly use the TripAdvisor database and Excel. So keep in mind that you don’t necessarily have to buy tools you don’t already have.

8. Do you have any anecdotal lessons or tips to share based on your experiences at TripAdvisor? 

While working on a recent project on average daily rates, I discovered an issue of sample bias.

I was researching international average rates and noticed that Botswana and Lesotho had higher average room rates than Switzerland and other places where you might expect higher rates. As it turns out, the sample was skewed because in Botswana and Lesotho generally only higher-end accommodations have online booking or relationships with an online travel agent. And because TripAdvisor calculates ADR (average daily rate) based on prices from those sources, we weren’t factoring in the rates from more budget-friendly accommodations.

It was a good reminder that if the sample of data you have doesn’t accurately represent the overall segment at large, you’re not going to get accurate results.
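A small Python sketch of the same effect. The rates and the online-booking flags are invented for illustration and do not reflect any real market.

```python
from statistics import mean

# Hypothetical properties in one market: (nightly rate in dollars, bookable online?).
properties = [
    (40, False), (55, False), (60, False), (75, False),
    (320, True), (410, True),
]

# What you see if your data only covers properties with online booking.
online_only = [rate for rate, online in properties if online]
# What the whole market actually looks like.
all_rates = [rate for rate, _ in properties]

sample_average = mean(online_only)
market_average = mean(all_rates)
print(f"online-only sample: {sample_average}, whole market: {market_average}")
```

The calculation itself is correct in both cases; the sample average is misleading only because of which properties made it into the sample.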

9. What are common data/analytics mistakes I should avoid? 

There are four common mistakes that can wreak havoc on your results.

Don’t act as if the data you have is all that you need. Sample bias, which we just discussed, is one of the biggest risks here.

Don’t mistake correlation for causation. Just because two things are related doesn’t mean one caused the other. It’s easy to convince yourself that’s true because it’s human nature to expect it. If you notice that customers who order a bottle of wine tend to spend more than customers who don’t, you might infer that customers who drink will spend more. So you give everyone a free glass of wine but then notice their checks aren’t going up. Why? Maybe the price of the bottle itself is what increased those checks, rather than drinking wine causing people to buy extra entrees or splurge on dessert.

Don’t look at summarized data and infer what’s happening on an individual level. You always need to drill down deeper for an accurate look.

And don’t learn what isn’t there. You might be drawn to “discover” something that isn’t really there in the data, especially when you’re new. It comes back to not expecting to find a certain outcome. Never assume.

10. What hospitality-related businesses make good use of data? 

Accommodations with loyalty programs definitely have an advantage, because they can collect more data to learn more about guests and better serve them.

A particularly good example is the gaming industry. Casinos get an incredible amount of data from loyalty players. Whenever someone inserts their card into a slot machine or checks in at a table game, the casino can see in real time what games the player likes, how long they play, how much they bet and whether they’re winning or losing. The benefit for the player is earning loyalty points and a chance at receiving perks, like complimentary meals. The benefit for the property is deep insight into what its audience likes and which games are the most popular and profitable. That can inform game selection and placement within the casino, as well as special offers to draw in more guests.

11. Is there anything else you think people in the hospitality industry should know when it comes to data mining and analytics? 

It’s okay to start small. You don’t need a huge loyalty program to do something good. If you’re a B&B, you can still ask guests if they’d like to take a customer satisfaction survey (either online or with pen and paper). And if you ask guests if they’d like to receive special offers and news about your property, you can collect email addresses to build a mailing list of interested travelers.

For medium-size properties and brands, professional help is available. Especially if you’re new to this, find a data consultant. Do it in a smart way, so that you learn to do this stuff yourself along the way instead of just getting a report at the end. Once you have a better understanding of what to do, you can try doing more on your own from that point onward.

Last Updated: January 10, 2018