In this post, we'll be going behind the scenes to see how we created the initial version of Corpus — our business metrics platform — and how that allowed us to power Attentive’s Reports Library. Corpus serves as a single source of truth for reporting data with APIs that can be easily used in the frontend.
We’ll discuss how we went from scattered definitions and chaotic reporting to a centralized platform that allowed us to create a reporting product that customers love to use.
"The reporting and super easy user-friendly dashboards are amazing!"
We’ll start by taking a look at the problems that kicked it all off, our scrappy team’s approach to tackling it head-on, the tools we used, the puzzle pieces that made it all happen, and of course, the final picture: our shiny new platform. At the end, we’ll talk about some lessons we've learned along the way. So grab your favorite beverage, kick back, and let's dive into the nitty-gritty of Corpus.
Attentive is a fast-growing company where change can happen quickly. This can pose challenges when it comes to reporting and the state of data.
Before Corpus, we had a legacy reporting system that served us well for many years. It was a platform where teams could define their reports using raw SQL (and there was some very gnarly SQL!), reports could be activated for a user, and the user could request a report through the reporting UI. Minutes (sometimes many minutes) later, they would receive an email with an Excel workbook export of the report's SQL output. The report consisted of many rows, sometimes thousands, that the customer had to parse through and visualize themselves.
The legacy reports aggregated the data every time they ran, and in some cases, they would fail due to large amounts of data or other issues. After they failed, the customer would have to request them again or contact Customer Support for help.
Customers wanted to get their data quickly in an easy-to-digest format, which wasn't possible with the existing platform.
The SQL logic powering those reports? It was confined to each individual report and couldn't easily be reused in other reports or in the UI. If we wanted to reuse any data in other services or the UI, we had to copy-paste the logic. As the business grew and evolved, the duplicated logic needed to change as well. How do you ensure the logic stays consistent across the board? And what do you do next month when the UI and reports no longer match? Which one is correct? It was becoming very difficult to maintain the logic and keep the data accurate.
We needed a solution that would provide a single source of truth and could expose that data anywhere it's needed.
To satisfy customer needs and decrease the load on Customer Support and other internal teams, we wanted to improve the reporting product. We wanted to build The Reports Library:
A configurable report builder that lets clients create their own in-app and downloadable reports by selecting metrics, timescales, filters, and other options. This feature empowers clients to create and customize the reports they need, without requiring Attentive CS to manually create those reports.
Our awesome designers then worked on initial designs for our new platform. This was a cross-functional effort where the engineers, our product managers, and designers collaborated closely to come up with the final designs for our new reporting platform.
Unfortunately, the existing reporting platform could not achieve this very easily. This is when we decided that, before we could satisfy the product needs, we needed to build a better metrics platform: Corpus. And we could dogfood it ourselves by using it to build the Reports Library.
Given the above problems and requests, we summarized the requirements for the Corpus platform as follows:
There was one problem: We were a team of five engineers, and we had never built something like this before. Where and how do we start? Research!
We spent a lot of time understanding existing data engineering tools and reading blog posts about solutions and experiences. Out of all the research we did, one particular blog series resonated with us: Airbnb's Minerva. Given our own requirements, an excerpt from Part 2 of the Minerva blog series was exactly what we needed:
Unfortunately, Minerva is not open source, but at least we now had our North Star! We just had to figure out the simplest puzzle pieces to start with.
Given our requirements and the North Star, we worked on defining the smallest puzzle pieces we would need to satisfy the platform requirements.
We wanted to make the metric queries performant and minimize the sizes of the datasets. To achieve this, we decided to pre-aggregate all the data. The vast majority of data from other systems, be it databases or events, was already available in Snowflake. This meant that Snowflake was the best place to do all the raw data aggregations.
Based on reporting requirements, we needed to support multiple levels of aggregation, from 15-minute intervals up to yearly.
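To give a sense of what that looks like in practice, here's a small illustrative sketch (the names and grain set are hypothetical, not Corpus's actual code) of how aggregation grains might map to Snowflake date-bucketing expressions:

# Illustrative (hypothetical) mapping from aggregation grains to the Snowflake
# expressions used to bucket a timestamp column when pre-aggregating.
GRAIN_TO_EXPRESSION = {
    "15min":   "TIME_SLICE({col}, 15, 'MINUTE')",
    "hourly":  "DATE_TRUNC('HOUR', {col})",
    "daily":   "DATE_TRUNC('DAY', {col})",
    "weekly":  "DATE_TRUNC('WEEK', {col})",
    "monthly": "DATE_TRUNC('MONTH', {col})",
    "yearly":  "DATE_TRUNC('YEAR', {col})",
}

def grain_expression(grain: str, column: str) -> str:
    """Return the SQL expression that buckets `column` at the given grain."""
    return GRAIN_TO_EXPRESSION[grain].format(col=column)

# grain_expression("daily", "converted") -> "DATE_TRUNC('DAY', converted)"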
Now that we knew where we would be doing our data aggregations, we had to figure out how.
Our BI teams already utilized dbt to model their datasets and used Airflow to run those models. Perfect!
dbt is a modern data transformation tool that allows analysts to easily materialize data views. By simply writing select statements, tables can be defined, materialized, and referenced in other models. It gives us many capabilities we don't have to build ourselves: we could define models using SQL, and dbt would manage the relationships between all the models, execute them, and automatically handle merging the data. By using dbt, we could also take advantage of dbt docs to catalog our metrics and track data lineage for anything we defined. A lot of freebies that let us move much faster!
By creating custom Airflow DAGs, we could run the dbt models in two different ways: incrementally, re-aggregating only a recent lookback window of data, or as a full refresh that rebuilds a metric from scratch.
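A minimal sketch of what such a DAG could look like, assuming dbt is invoked through a BashOperator (the task names, selectors, and schedule are hypothetical; the real DAGs do quite a bit more orchestration):

# Hypothetical sketch of an Airflow DAG that runs the Corpus dbt models.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="corpus_metric_aggregation",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@hourly",  # regular incremental runs
    catchup=False,
) as dag:
    # Incremental run: only re-aggregate the recent lookback window.
    dbt_run_incremental = BashOperator(
        task_id="dbt_run_incremental",
        bash_command="dbt run --select tag:corpus_metrics",
    )

    # Full refresh: rebuild the metric tables from scratch (in practice this
    # would likely live in a separate, manually triggered DAG).
    dbt_full_refresh = BashOperator(
        task_id="dbt_full_refresh",
        bash_command="dbt run --select tag:corpus_metrics --full-refresh",
    )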
The dbt tool gave us a lot, but we still had to ensure that each model was able to handle some specific logic needs:
While the data warehouse is great for storing and aggregating a large amount of data, it's not best suited for powering the data in the UI, where we could have hundreds or thousands of customers looking at metrics.
As part of the Airflow DAG that handled aggregation, we also synced the data from Snowflake to a PostgreSQL database. Using PostgreSQL also allowed us to properly index the metric data and improve the overall query performance.
We used another off-the-shelf tool, called Meltano, to handle this. Meltano allows us to define ETL pipelines and automatically sync data from Snowflake to PostgreSQL based on the last-modified date. And since we only update the last-modified date when data actually changes, we were able to minimize the amount of data being copied.
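Conceptually, this is the standard key-based incremental replication pattern: remember the newest last-modified value you've synced, and only copy rows that are newer. A rough, generic sketch of the idea (column and table names are hypothetical, and this isn't Meltano's actual implementation):

# Rough, generic sketch of key-based incremental replication: only rows whose
# last-modified timestamp is newer than the stored bookmark are copied over.
def incremental_sync(source_cur, dest_cur, table: str, last_synced: str) -> str:
    source_cur.execute(
        f"SELECT id, value, updated_at FROM {table} "
        "WHERE updated_at > %s ORDER BY updated_at",
        (last_synced,),
    )
    new_bookmark = last_synced
    for row_id, value, updated_at in source_cur.fetchall():
        dest_cur.execute(
            f"INSERT INTO {table} (id, value, updated_at) VALUES (%s, %s, %s) "
            "ON CONFLICT (id) DO UPDATE "
            "SET value = EXCLUDED.value, updated_at = EXCLUDED.updated_at",
            (row_id, value, updated_at),
        )
        new_bookmark = max(new_bookmark, str(updated_at))
    return new_bookmark  # persisted and reused as the starting point next run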
Full metric refreshes were a different beast, and we discovered early on that they weren't something Meltano could perform efficiently. For some of the larger metrics, a full refresh through Meltano could take hours. Instead, we ended up handling full refreshes by exporting the data from Snowflake to S3 and then loading it into PostgreSQL from S3. We can now do full refreshes in minutes! Load the CSV into a temporary table, run a table swap, and we have an entire metrics table refreshed without any downtime. Magic!
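Here's a simplified sketch of the swap step, assuming psycopg2 and a local copy of the exported CSV (table names and connection handling are illustrative, not the production code):

# Simplified sketch: bulk-load the exported CSV into a staging table, then swap
# it in so readers never see a partially refreshed metrics table.
import psycopg2

def full_refresh(csv_path: str, dsn: str, table: str = "conversion_count_daily") -> None:
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        # 1. Create an empty staging table with the same shape as the live table.
        cur.execute(f"CREATE TABLE {table}_staging (LIKE {table} INCLUDING ALL)")

        # 2. Bulk-load the CSV that was exported from Snowflake to S3.
        with open(csv_path) as f:
            cur.copy_expert(
                f"COPY {table}_staging FROM STDIN WITH (FORMAT csv, HEADER true)", f
            )

        # 3. Swap staging in for the live table; the whole block runs as one transaction.
        cur.execute(f"ALTER TABLE {table} RENAME TO {table}_old")
        cur.execute(f"ALTER TABLE {table}_staging RENAME TO {table}")
        cur.execute(f"DROP TABLE {table}_old")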
We can now execute the dbt models via Airflow to aggregate data in our data warehouse. Awesome! What about our platform users?
dbt is a great tool, but we didn't expect every engineer to learn its syntax just to define their metrics, especially given the additional logic required for each dbt model. Plus, if we ever needed to introduce new logic across all models, hard-coded definitions would complicate maintenance. We also needed a way to catalog metrics and their metadata.
These needs were addressed by introducing the Metric Definition Language (MDL). MDL is a YAML file that defines metric metadata and how metrics are aggregated. Below is an example MDL for the Conversions metric:
dataSource: snowflake
table: attentive_dev.dw.attributable_conversions
metrics:
  - id: conversion_count
    field: id
    type: count
    name: 'Conversions'
    description: 'Number of conversions that resulted from a message based on your attribution window settings. A conversion typically means a purchase.'
    format: number
    connotation: positive
    filters: [ ]
dimensions:
  - id: date
    field: converted
    type: timestamp
    name: 'Conversion Date'
    grains: [daily]
    inCompanyTimezone: true
  - id: company_id
    field: company_id
    type: list
    name: 'company id'
preAggregations:
  - id: conversion_count_aggregation
    lookback_days: 2
    type: groupBy
    metric: conversion_count
    dimensions: [ date, company_id ]
The Airflow DAGs would parse the MDLs, generate the dbt models, and aggregate the data. This allowed us to introduce new logic or bug fixes to all the metrics without impacting any of the definitions.
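To make that concrete, here's a heavily simplified sketch of what generating a dbt model from an MDL file might look like (the real generator also handles lookback windows, timezones, and incremental merges; the function and variable names here are hypothetical):

# Heavily simplified, hypothetical sketch: parse an MDL file and emit the SQL
# body of a dbt model that pre-aggregates the metric by its dimensions.
import yaml

AGG_FUNCTIONS = {"count": "COUNT", "sum": "SUM"}  # small subset for illustration

def generate_dbt_model(mdl_path: str) -> str:
    with open(mdl_path) as f:
        mdl = yaml.safe_load(f)

    metric = mdl["metrics"][0]
    agg = mdl["preAggregations"][0]
    dims_by_id = {d["id"]: d for d in mdl["dimensions"]}

    select_dims = []
    for dim_id in agg["dimensions"]:
        dim = dims_by_id[dim_id]
        if dim["type"] == "timestamp":
            # Bucket timestamp dimensions to the configured grain (daily here).
            select_dims.append(f"DATE_TRUNC('DAY', {dim['field']}) AS {dim_id}")
        else:
            select_dims.append(f"{dim['field']} AS {dim_id}")

    agg_fn = AGG_FUNCTIONS[metric["type"]]
    return (
        f"SELECT {', '.join(select_dims)}, "
        f"{agg_fn}({metric['field']}) AS {metric['id']}\n"
        f"FROM {mdl['table']}\n"
        f"GROUP BY {', '.join(str(i + 1) for i in range(len(select_dims)))}"
    )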
For the initial implementation, we decided to use denormalized tables instead of dimensional modeling. We knew we would eventually want to transition to dimension and fact tables, but for now, denormalized tables were sufficient. Once aggregated, the above metric would resemble the following table.
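Illustrative example (the values are made up, not real data):

date       | company_id | conversion_count
-----------|------------|-----------------
2024-03-01 | 1001       | 42
2024-03-01 | 1002       | 7
2024-03-02 | 1001       | 57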
As part of metric definitions, we also focused on creating self-serve tooling that would allow platform users to easily create and test new reports. We developed a Corpus CLI tool that enabled users to compile any new or existing metrics, aggregate data from any environment, and output the data to their own personal schema in Snowflake, which they could query and further validate.
The Corpus CLI is a Python tool that reuses much of the existing aggregation code, so users can work with metrics without needing Airflow. This gave us consistency between processes run locally and processes run in a production environment.
Additionally, it allowed for rerunning aggregations and automatically comparing the new output against existing data. This made it much easier to validate issues and see the impact of metric definition changes.
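A rough sketch of that comparison, assuming the existing and re-run aggregates are loaded as pandas DataFrames keyed by their dimensions (names are hypothetical, not the CLI's actual implementation):

# Rough sketch: diff a re-aggregated metric against the currently published
# data to surface rows that were added, removed, or whose value changed.
import pandas as pd

def diff_metric(existing: pd.DataFrame, rerun: pd.DataFrame,
                keys: list, value_col: str) -> pd.DataFrame:
    merged = existing.merge(
        rerun, on=keys, how="outer", suffixes=("_existing", "_rerun"), indicator=True
    )
    changed = merged[f"{value_col}_existing"] != merged[f"{value_col}_rerun"]
    return merged[(merged["_merge"] != "both") | changed]

# e.g. diff_metric(current, rerun, keys=["date", "company_id"], value_col="conversion_count")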
While the aggregated data in our data warehouse was usable directly in internal reporting via Looker, legacy reports, and other internal processes, we needed to make the PostgreSQL data available to the UI and other services. Enter our query layer, metrics-api!
We exposed two mechanisms to get the data:
The APIs support retrieving the metric catalog (such as human-readable metric and dimension names) as well as the actual data in multiple formats.
The API can also be used to further filter, group, and aggregate the metric data as needed. For example, the following query would return a month's worth of daily data for the given company ID and include the monthly total value as aggregateValue.
{
  node(id: $companyId) {
    metrics(
      metricIds: ["conversion_count"],
      groupings: [
        { dimensionId: "date", granularity: TIME_DIMENSION_GRANULARITY_DAILY }
      ],
      filters: [
        {
          dimensionId: "date",
          startDate: "2024-03-01",
          endDate: "2024-03-31"
        },
        {
          dimensionId: "company_id",
          operator: FILTER_OPERATOR_EQUALS,
          value: $companyId
        }
      ]) {
      definition {
        metricId
      }
      aggregateValue
      groupedValues {
        groupingDimensionValues
        value
      }
    }
  }
}
Now that all the data was easily queryable, we needed to display it.
We worked closely with our Product and Design team members to create a suite of React components that worked natively with the metrics-api GraphQL queries. Using Storybook allowed us to build these components in isolation without needing the APIs to exist. From line and bar charts to date pickers and reporting tables, these components were meant to be easily reused within the Reports Library and by any other team in other parts of the Attentive UI. A few component examples are below:
While the legacy reports could be updated to use the new metrics data in our data warehouse, implementing the Reports Library also required a new way to define reports: one that used Corpus and could be accessed directly in the UI.
In the new world, instead of each report being one large SQL query, reports were now collections of metrics, their filters, and dimensions. The new reporting platform would be able to take a report definition, query the metrics API to get the data, and use the new presentation components to display a nice, user-friendly report that allows in-UI filtering, grouping, and other advanced features. Below is a sample report definition for one of the more important reports, Attributed Revenue:
{
  "id": "attributed_revenue",
  "name": "Attributed Revenue",
  "domain": "billing",
  "class": "attributed_revenue_class",
  "metrics": [
    "conversion_count",
    "conversion_total_revenue",
    "average_conversion_revenue"
  ],
  "filters": [
    {
      "dimensionId": "date",
      "type": "timestamp",
      "datePeriod": "LAST_30_DAYS"
    }
  ],
  "groupings": [
    {
      "dimensionId": "date",
      "type": "timestamp",
      "granularity": "daily"
    }
  ],
  "summaryMetricIds": [
    "conversion_count",
    "conversion_total_revenue",
    "average_conversion_revenue"
  ],
  "summaryMetrics": {
    "summaryMetricIds": [
      "conversion_count",
      "conversion_total_revenue",
      "average_conversion_revenue"
    ]
  }
}
You might have noticed that we have five unique puzzle pieces, and we have five Engineering members on the team. This wasn't an accident!
Once the puzzle piece blueprints were ready and we were sure they would all fit together, each of us took on our respective piece and implemented it in parallel:
With a lot of ongoing teamwork, communication, and collaboration with our Product and Design teams, we smoothed out any rough edges and put all the pieces together. They fit perfectly! We got our final picture.
All the puzzle pieces are now in place! We can define metrics, aggregate them, serve them via APIs, and have components to display them. What does that actually look like?
This is the architectural diagram of the initial version of Corpus. Since then, we’ve added many more features and improvements, but this initial version solved the problem we set out to resolve. We had a standardized, declarative, performant way to define and consume our metrics.
Powered by our new metrics platform, Corpus, we could now build the Reports Library! Customers no longer have to parse through thousands of Excel workbook rows and create their own visualizations. They can self-serve their reporting needs right in the UI, getting a quicker view into their marketing performance. This reflects our core value of Championing the Customer. Our success is defined by our customers' success, and by understanding their needs, we craft sustainable and extraordinary solutions to help them thrive.
And since we built the platform to be easily usable by other teams, they could use Corpus for their own metrics and fully remove many legacy systems. Our mission was a success; we have built the single source of truth. The data customers see in their reports is the same data they'll see throughout the Attentive UI.
As a summary in numbers, the Corpus platform currently:
This was our initial version of the platform, and we built something we never had before. Since then, we've added many new features to support both our internal platform users and our external customers. However, with that initial version, there were a few important things we learned:
We had to work together and not be afraid to challenge and question each other, to make sure what we were building made sense. Even though we spent a lot of upfront time planning, not every detail could be captured.
It was crucial to cultivate a comfortable environment within the team, where issues could be raised early if things veered off course. Despite being a fully remote team, we maintained constant synchronization and communication to ensure alignment as all the pieces came together. This cohesive effort was vital to the success of our platform.
Effective teamwork extended beyond the Engineering realm; involvement of Product and Design team members was equally essential. Through collaborative discussions and necessary compromises, we ensured that every element fit together seamlessly, even if it meant adjusting our initial plans.
Since we initially weren't sure what to build, early research was critical to our overall success. Delving into existing tooling and solutions, both internally and externally, was instrumental in charting our course. This approach not only helped us find our North Star but also identified tools within Attentive that could help us build our platform.
Utilizing existing tooling helped us quickly build our initial version and be able to optimize when needed. Despite the extended research phase delaying the start of the platform's development, it proved instrumental in defining component contracts, enabling parallel development, and ensuring a cohesive end product. This early research is what helped all the puzzle pieces fit.
We initially used the Corpus platform ourselves to build the Reports Library. This internal usage served as a crucial validation step, allowing us to identify any shortcomings and refine the platform before introducing it to other teams. By experiencing the platform firsthand, we gained invaluable insights into its functionality, usability, and areas needing improvement. Ultimately, using the platform ourselves proved instrumental in enhancing its quality, reliability, and user experience. This approach shows the significance of iterative development and continuous improvement in ensuring that the final product meets the needs and expectations of its intended users.
Our experience highlights the crucial role of user support in driving platform adoption and success. As we worked with other teams to define the initial metrics for the Reports Library, we had to deal with diverse datasets. This emphasized the need for meticulous data validation and cleanup processes. By providing user support and implementing self-service tooling early on, we not only facilitated smoother data validation but also empowered teams to independently create their own metrics.
Looking back, these lessons reshaped our project management approach, emphasizing the importance of teamwork, research, validation, and user-centric design principles. They served as guiding principles throughout our journey, enabling us to navigate challenges and ultimately achieve our goals.
Ready to help us tackle our next challenge, together? Find your perfect role.