How to Build Great Data Products
Products fueled by data and machine learning can be a powerful way to solve users’ needs. They can also create a “data moat” that can help stave off the competition. Classic examples include Google search and Amazon product recommendations, both of which improve as more users engage. But the opportunity extends far beyond the tech giants: companies of a range of sizes and across sectors are investing in their own data-powered products. At Coursera, we use machine learning to help learners find the best content to reach their learning goals, and to ensure they have the support — automated and human — that they need to succeed.
The lifecycle of a so-called “data product” mirrors standard product development: identifying the opportunity to solve a core user need, building an initial version, and then evaluating its impact and iterating. But the data component adds an extra layer of complexity. To tackle the challenge, companies should emphasize cross-functional collaboration, evaluate and prioritize data product opportunities with an eye to the long-term, and start simple.
Stage 1: Identify the opportunity
Data products are a team sport
Identifying the best data-product opportunities demands marrying the product-and-business perspective with the tech-and-data perspective. Product managers, user researchers, and business leaders traditionally bring the intuition and domain expertise to identify key unsolved user and business needs. Meanwhile, data scientists and engineers have a keen eye for identifying feasible data-powered solutions and a strong sense of what can be scaled and how.
To identify and prioritize the right data product opportunities, bring these two sides of the table together. A few norms can help:
Prioritize with an eye to the future
The best data products get better with age, like a fine wine. This is true for two reasons:
First, data product applications generally accelerate data collection, which in turn improves the application. Consider a recommendations product powered by users’ self-reported profile data. With limited profile data today, the initial (or “cold start”) recommendations may be uninspiring. But if users are more willing to fill in a profile when it’s used to personalize their experience, launching recommendations will accelerate profile collection, improving the recommendations over time.
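To make the cold-start dynamic concrete, here is a minimal, hypothetical sketch of a recommender that personalizes when a user’s profile has enough skills and otherwise falls back to popular items. The function and data names (`recommend`, `catalog`, `popularity`) are illustrative assumptions, not Coursera’s actual system.

```python
# Illustrative cold-start fallback: personalize when the profile is rich enough,
# otherwise recommend globally popular items. All names are hypothetical.
from collections import Counter
from typing import Dict, List

def recommend(profile_skills: List[str],
              catalog: Dict[str, List[str]],
              popularity: Counter,
              k: int = 3,
              min_profile_size: int = 2) -> List[str]:
    """Return up to k item ids, personalized if the profile is rich enough."""
    if len(profile_skills) < min_profile_size:
        # Cold start: fall back to globally popular items.
        return [item for item, _ in popularity.most_common(k)]
    # Simple personalization: rank items by skill overlap with the profile.
    scored = sorted(
        catalog.items(),
        key=lambda kv: len(set(kv[1]) & set(profile_skills)),
        reverse=True,
    )
    return [item for item, _ in scored[:k]]

catalog = {"ml-101": ["python", "statistics"],
           "sql-basics": ["sql"],
           "deep-learning": ["python", "ml"]}
popularity = Counter({"sql-basics": 50, "ml-101": 30, "deep-learning": 20})

print(recommend([], catalog, popularity))                 # cold start -> popular items
print(recommend(["python", "ml"], catalog, popularity))   # personalized by skill overlap
```

Every profile a user fills in moves them from the fallback branch to the personalized branch, which is the flywheel described above.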
Second, many data products can be built out to power multiple applications. This isn’t just about spreading costly R&D across different use-cases; it’s about building network effects through shared data. If the data produced by each application feeds back to the underlying data foundations, this improves the applications, which in turn drives more utilization and thus data collection, and the virtuous cycle continues. Coursera’s Skills Graph is one example. A series of algorithms that map a robust library of skills to content, careers, and learners, the graph powers a range of discovery-related applications on the site, many of which generate training data that strengthen the graph and in turn improve its applications.
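The sketch below illustrates the shared-foundation idea in miniature: one skill-to-content mapping serves several applications, and feedback from any of them flows back into the shared mapping. It is a toy assumption-laden example, not Coursera’s actual Skills Graph.

```python
# Illustrative shared data foundation (not Coursera's Skills Graph): one mapping
# powers multiple applications, and each application's feedback strengthens it.
from collections import defaultdict

class SkillsIndex:
    def __init__(self):
        self.skill_to_content = defaultdict(set)   # the shared foundation

    def add_mapping(self, skill, content_id):
        self.skill_to_content[skill].add(content_id)

    # Application 1: skills-based search
    def search(self, skill):
        return sorted(self.skill_to_content.get(skill, set()))

    # Application 2: suggestions that close a learner's skill gap
    def suggest_for_gap(self, missing_skills):
        return sorted({c for s in missing_skills
                         for c in self.skill_to_content.get(s, set())})

    # Feedback from any application feeds back into the shared foundation.
    def record_positive_feedback(self, skill, content_id):
        self.add_mapping(skill, content_id)

index = SkillsIndex()
index.add_mapping("sql", "sql-basics")
index.record_positive_feedback("python", "ml-101")  # e.g. a click-through seen in search
print(index.search("python"))          # ['ml-101']
print(index.suggest_for_gap(["sql"]))  # ['sql-basics']
```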
Too much focus on near-term performance can yield underinvestment in promising medium- or long-term opportunities. More generally, the importance of high-quality data cannot be overstated; investments in collecting and storing data should be prioritized at every stage.
Stage 2: Build the product
De-risk by staging execution
Data products generally require validation both of whether the algorithm works and of whether users like it. As a result, builders of data products face an inherent tension between how much to invest in upfront R&D and how quickly to get the application out to validate that it solves a core need.
Teams that over-invest in technical validation before validating product-market fit risk wasted R&D efforts pointed at the wrong problem or solution. Conversely, teams that over-invest in validating user demand without sufficient R&D can end up presenting users with an underpowered prototype, and so risk a false negative. Teams on this end of the spectrum may release an MVP powered by a weak model; if users don’t respond well, it may be that with stronger R&D powering the application the result would have been different.
While there’s no silver bullet for simultaneously validating the tech and the product-market fit, staged execution can help. Starting simple will accelerate both testing and the collection of valuable data. In building out our Skills Graph, for example, we initially launched skills-based search — an application that required only a small subset of the graph, and that generated a wealth of additional training data. A series of MVP approaches can also reduce time to testing:
Stage 3: Evaluate and iterate
Consider future potential when evaluating data product performance
Evaluating results after a launch to make a go or no-go decision for a data product is not as straightforward as for a simple UI tweak. That’s because the data product may improve substantially as you collect more data, and because foundational data products may enable much more functionality over time. Before canning a data product that does not look like an obvious win, ask your data scientists to quantify answers to a few important questions. For example, at what rate is the product improving organically from data collection? How much low-hanging fruit is there for algorithmic improvements? What kinds of applications will this unlock in the future? Depending on the answers to these questions, a product with uninspiring metrics today might deserve to be preserved.
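One simple way to quantify “at what rate is the product improving organically from data collection” is to fit an offline metric against training-set size at past retraining points and extrapolate. The sketch below shows the idea; the numbers are made up for illustration, and the log-linear shape is an assumption that should be checked against your own learning curves.

```python
# Minimal sketch: estimate how fast an offline metric improves as data accrues.
# The data points are invented; a log-linear fit is a common but assumed shape.
import numpy as np

# (training examples, offline accuracy) observed at past retraining points
sizes = np.array([1_000, 5_000, 20_000, 80_000], dtype=float)
accuracy = np.array([0.62, 0.68, 0.73, 0.77])

# Fit accuracy as roughly linear in log(data size) over this range.
slope, intercept = np.polyfit(np.log(sizes), accuracy, 1)

projected = slope * np.log(320_000) + intercept  # if the dataset grows 4x again
print(f"~{slope:.3f} accuracy points per e-fold of data; "
      f"projected {projected:.2f} at 320k examples")
```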
Speed of iteration matters
Data products often need iteration on both the algorithms and the UI. The challenge is to determine where the highest-value iterations will come from, based on data and user feedback, so teams know which functions are on the hook for driving improvements. Where algorithmic iterations will be central — as they generally are in complex recommendation or communication systems like Coursera’s personalized learning interventions — consider designing the system so that data scientists can independently deploy and test new models in production.
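One common way to realize that design choice is a thin model registry with percentage-based traffic routing, so a candidate model can be ramped behind the same interface without touching application code. The sketch below is a hypothetical illustration of that pattern, not a description of Coursera’s infrastructure.

```python
# Hypothetical sketch: a thin model registry plus deterministic traffic routing
# lets data scientists ship and test a candidate model behind one interface.
import hashlib
from typing import Callable, Dict, List

class ModelRouter:
    def __init__(self):
        self.models: Dict[str, Callable[[dict], List[str]]] = {}
        self.ramp: Dict[str, float] = {}  # model name -> share of traffic

    def register(self, name: str, predict_fn: Callable[[dict], List[str]], ramp: float):
        self.models[name] = predict_fn
        self.ramp[name] = ramp

    def predict(self, user_id: str, features: dict) -> List[str]:
        # Deterministic bucketing so each user consistently sees one variant.
        bucket = int(hashlib.md5(user_id.encode()).hexdigest(), 16) % 100 / 100
        cumulative = 0.0
        for name, share in self.ramp.items():
            cumulative += share
            if bucket < cumulative:
                return self.models[name](features)
        return self.models[name](features)  # fall back to the last registered model

router = ModelRouter()
router.register("baseline", lambda f: ["sql-basics"], ramp=0.9)
router.register("candidate_v2", lambda f: ["ml-101"], ramp=0.1)
print(router.predict("user-42", {}))
```

Because the router owns the interface, swapping or ramping a model is a registration change rather than an application change, which is what makes fast algorithmic iteration possible.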
By fostering collaboration between product and business leaders and data scientists, prioritizing investments with an eye to the future, and starting simple, companies of all shapes and sizes can accelerate their development of powerful data products that solve core user needs, fuel the business, and create lasting competitive advantage.