The Helio Platform + Gartner’s Data Sophistication Journey

CircleUp’s primary asset is Helio, a collection of data and algorithms, on top of which there are business applications. Today those applications are equity and credit funds that invest directly into early-stage CPG brands. Helio finds, classifies, and evaluates CPG companies on a set of dimensions calibrated for success outcomes.

Through years of facilitating investments, we have built the world’s most robust repository of early-stage CPG intelligence. As we build out our capabilities, we frame them in terms of Gartner’s Data Sophistication Journey, which moves from descriptive to diagnostic, predictive, and finally prescriptive analytics. The data paints a descriptive picture of the current landscape and tells a diagnostic story of why it came to be. Rooted in the data, our team has gone on to build a series of algorithms to help predict likely future outcomes and ultimately be prescriptive about ideal behavior to generate preferable outcomes.

Let’s dive into each step of the data journey.

1. Descriptive Analytics

There is an immense amount of data in the CPG sector. Every product has a unique identifier (a UPC), each transaction is documented at the unit level, and product reviews add a qualitative and quantitative dimension to the data. Furthermore, entrepreneurs are highly motivated to get their brand out there (in contrast to the secrecy of tech), be it through distribution or social media channels. To create an accurately descriptive map of the industry, someone has to capture and aggregate data on distribution, brand, and team. Helio is that aggregator.

While the bulk of the country’s distribution data available for purchase is concentrated within the largest retailers (think Walmart, Costco, Safeway, Amazon, etc.), much of the future value and insight comes from the rest of the picture: the long tail of retailers where most emerging brands find their beginnings. Brands don’t often start at Kroger. They typically build a following either by selling through their own targeted direct-to-consumer (D2C) websites, be it for apparel or color cosmetics, or through the long tail of retailers. Once these brands reach retail, they will typically start at an indie beauty store in LA, a pet store in Dallas, or a bodega in NYC. Without insight into that long tail (all the tiny puzzle pieces), investors don’t have differentiated insights, entrepreneurs don’t know the competitive landscape, and retailers fall behind consumer trends.

Players with the full picture will have the advantage. Players without it will be left behind, trying to find the right companies in this zoo.

2. Diagnostic Analytics

It’s one thing to have a snapshot of today’s CPG and retail landscape but quite another to understand why it looks that way. The key to diagnostic analysis (the why) is access to historical time series data in combination with algorithmic analysis. Historical data first explains how things have (or haven’t) changed so that algorithms can better explain why things have changed.
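To make the "how things have changed" step concrete, here is a minimal sketch in Python. It is not Helio's actual pipeline: the function name `period_changes` and the store counts are hypothetical, invented purely for illustration. It takes dated snapshots of a single metric (say, the number of stores carrying a brand) and returns the change between consecutive snapshots.

```python
from datetime import date


def period_changes(snapshots):
    """Return (snapshot_date, change_since_previous_snapshot) pairs.

    `snapshots` maps a date to one observed metric value. The metric here
    is a hypothetical store count, not real data.
    """
    ordered = sorted(snapshots.items())  # oldest snapshot first
    return [
        (later_date, later_value - earlier_value)
        for (_, earlier_value), (later_date, later_value) in zip(ordered, ordered[1:])
    ]


# Hypothetical store counts for an imaginary brand, captured monthly.
store_counts = {
    date(2017, 1, 1): 40,
    date(2017, 3, 1): 65,
    date(2017, 2, 1): 50,
}

for snapshot_date, change in period_changes(store_counts):
    print(snapshot_date.isoformat(), change)
```

A table of changes like this only describes how the picture moved. The diagnostic work of explaining why it moved still has to be layered on top.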

Investors can be right for the wrong reasons. Typically, for a private investment with a five-year duration, what matters most is just being right, not the reasons. But at CircleUp we care about making decisions for the right reasons. Being right for the wrong reasons (luck) may yield strong returns one time, but it is not repeatable and therefore not sustainable or scalable. CircleUp aims to be scalable and repeatable. Our mission is to help entrepreneurs to thrive, which means we care about helping tens of thousands of entrepreneurs over time. The why matters a lot.

Let’s take an example from tech: Slack. Currently a multibillion-dollar messaging application, Slack grew out of a company that Stewart Butterfield launched to build a non-violent video game, which turned out to be a total flop. But before Butterfield and his team ran out of their VC funding, they pivoted to build the Slack we know today. By any outcome measure, this was the “right” investment for those early investors.

But were the reasons right? If the thesis was based on a capable team, then yes. If the thesis was based on the non-violent video game, then no. Algorithmically knowing the difference in tech is almost impossible, given the lack of a standard data set and the wildly different business models. But in the consumer space, we are excited by the challenge.

To build the ability to diagnose, CircleUp has been capturing data since 2012. And while some information anyone can go back and find (e.g. Halo Top’s original packaging), most of it is nearly unattainable if not captured in the moment (e.g. the number of stores that carried Supergoop products in March of 2017).

Over time, the market will reward the best diagnosticians. Understanding the why means repeatable and scalable success.

3. Predictive Analytics

The value of predictive analytics is easy to understand – it’s more compelling to know what will likely happen than what already happened. Everyone wants to predict the future but no data company in CPG or retail can do it in isolation. Enter Helio. Helio has several predictive elements, including future product distribution. With predictive insights retailers can stay ahead of trends, entrepreneurs can move into a growing category with confidence, and investors can invest based on quantifiable future potential rather than just historical metrics.

This is the stage of data analytics that relies heavily on machine learning algorithms.  

As humans, we develop simple predictive algorithms all the time. Some are accurate, but most are pretty lousy and rely on limited information. First, you notice a pattern: you save 20 minutes on your commute when you leave five minutes earlier, or you nail a presentation at work when you wear your lucky socks. These observations lead to a change in behavior because you believe a specific variable (departure time or sock choice) causes a specific outcome (commute duration or presentation success). Correct or not, these are simple algorithms.

Now get data scientists involved, and suddenly algorithms can process thousands of variables and trillions of data points and test many different outcomes. At CircleUp we implement both supervised and unsupervised modeling techniques to look at thousands of input variables ranging from distribution expansion to product packaging. The outcomes we care most about are revenue growth and exit events, which are aligned with the goals of both entrepreneurs and investors and happen to be very relevant for most industry participants.

I cannot overemphasize the importance of the training data that we have collected over the past six years to shape these algorithms. A statistician would tell you that the larger and more representative the data sample, the more confidence you can place in the conclusions drawn from it. Think of how many presentations you would have to make with those lucky socks before your theory is statistically significant. CircleUp wouldn’t be predictive if we didn’t first have enough data to be descriptive.
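The lucky-socks question can actually be worked out. The sketch below is an illustration under stated assumptions, not anything from Helio: it assumes you already know your baseline success rate without the socks (the values 0.5 and 0.7 are hypothetical), uses a one-sided exact binomial test, and takes the conventional 0.05 significance threshold. The helper names `binomial_tail` and `min_perfect_streak` are made up for this example.

```python
from math import comb


def binomial_tail(successes, trials, p_null):
    """P(X >= successes) when X ~ Binomial(trials, p_null)."""
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )


def min_perfect_streak(p_null, alpha=0.05):
    """Smallest n where n successes out of n trials gives a p-value below alpha."""
    n = 1
    while binomial_tail(n, n, p_null) >= alpha:
        n += 1
    return n


# Hypothetical baselines: you nail 50% or 70% of presentations without the socks.
print(min_perfect_streak(0.5))  # 5
print(min_perfect_streak(0.7))  # 9
```

Under these assumptions, even a flawless run needs five sock-wearing presentations against a 50% baseline and nine against a 70% baseline, and any miss along the way pushes the requirement higher. Significance here still says nothing about causation, which is why sample size is only the starting point.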

4. Prescriptive Analytics

Imagine being able to algorithmically effect change at a company by identifying levers it could pull to grow more quickly. Nobody does this today. And for good reason. It’s hard. Sure, board members give portfolio companies advice and serial entrepreneurs claim to have a “winning formula,” but there has never been a way to set strategy or tactics based on algorithmic prescriptive insights. The human brain can’t identify (let alone replicate) the factors that drive success. We are dealing with thousands of variables that potentially drive outcomes, not to mention the interaction effects between those variables. The number of possible combinations moves beyond the limits of human cognition and requires machine learning to make sense of it all.
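A bit of arithmetic shows how quickly those combinations pile up. The sketch below assumes a round figure of 1,000 candidate variables (a hypothetical number chosen for illustration, not Helio's actual feature count) and counts the possible pairwise and three-way interactions, plus the number of distinct subsets of variables you could consider. The function name `interaction_counts` is invented for this example.

```python
from math import comb


def interaction_counts(n_variables, max_order=3):
    """Map interaction order k to the number of distinct k-variable combinations."""
    return {k: comb(n_variables, k) for k in range(2, max_order + 1)}


n = 1000  # hypothetical number of input variables
counts = interaction_counts(n)
print(counts[2])  # 499500 possible pairwise interactions
print(counts[3])  # 166167000 possible three-way interactions

# Each variable is either in or out of a subset, so there are 2**n subsets.
print(len(str(2**n)))  # 302 digits
```

Nearly half a million pairs and over 166 million triples is already far more than a person can reason about, and the count of all possible subsets runs to 302 digits.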

What if we had data to help an entrepreneur differentiate their ingredient list, determine the best package size, and redirect their distribution strategy?

That is where Helio is going.

