Deriving Valuable Insights and Making Informed Data-Driven Decisions by Leveraging Data Engineering

About the Client

The client is a US company that develops, markets, and sells natural health products to healthcare professionals and select health food retailers. Its product line spans categories such as women’s well-being, therapeutics, and liposomal products, to name a few. The company’s product development process involves a critical understanding of market data, extensive R&D, and application of the latest scientific research. Its sales, operations, and administration departments work in tandem to provide customers with flawless service.


To increase product sales, the Sales and Marketing teams need an integrated view of existing sales and real-time insights for demand-focused promotions.
However, data silos pose a significant challenge and make visualization difficult. The company needs to extract terabytes of data relating to marketing campaigns, sales promotions, past region-wise demand forecasts, product-wise performance, sales by channel, and more. The data from multiple sources needs to be processed and loaded into a data warehouse system where it can then be analyzed. The data warehouse solution needs to be:
Robust – Enable data to be searched and extracted quickly
Scalable – Handle high volume real-time and batch data updates
Reliable – Ensure validity and accuracy of data being analyzed
Additionally, the company aspires to develop a single source of truth forecast that can be leveraged across the product, sales, and marketing teams, making the process more efficient.


To realize its goal of data-driven decision-making, the company worked with Trigent to design an integrated solution that provides a unified view of the sales and marketing data. Trigent proposed a strategy that balances the need for a unified, resilient data engineering and data analytics infrastructure with the need to protect existing investments where possible.
Scope determination & solution design
The first step was to understand the current software implementation, the flow of data between systems, the corresponding business processes, and the issues faced by the technical and product teams. Trigent studied the company’s overall product portfolio, product categories, distribution channels, and existing tools in order to design an architecture that addresses future needs.
Data Pipelines & ETL Process
The next step was to collate data from the existing systems using RESTful APIs, and to process the data to address issues with format or structure, data availability, and accuracy.
Trigent suggested leveraging Amazon Managed Workflows for Apache Airflow (MWAA) for workflow management, orchestration set-up, and end-to-end operations of data pipelines. The team designed a scalable and reliable data pipeline/data infrastructure using AWS Glue (ETL) workflow. The proposed data pipeline was customized to the company’s business needs, wherein the data was ingested from several sources such as customer order databases, payment gateways, and emails, to name a few.
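As a rough illustration of what orchestration means here, the sketch below orders a handful of pipeline tasks by their dependencies. It is not Airflow or MWAA code: it uses Python's standard-library graphlib as a stand-in, and the task names are hypothetical, chosen only to mirror the sources mentioned above.

```python
from graphlib import TopologicalSorter

# Hypothetical task graph: each task maps to the set of tasks that must
# finish before it can start. Names are illustrative assumptions only.
PIPELINE = {
    "extract_orders": set(),
    "extract_payments": set(),
    "extract_emails": set(),
    "transform_unify": {"extract_orders", "extract_payments", "extract_emails"},
    "load_warehouse": {"transform_unify"},
}


def run_order(dag):
    """Return one valid execution order that respects every dependency."""
    return list(TopologicalSorter(dag).static_order())


order = run_order(PIPELINE)
print(order)
```

A workflow manager applies the same idea at scale: extraction tasks can run independently, while transformation and loading wait for their upstream tasks to complete.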
The team proposed utilizing Amazon MSK for streaming data from all the available sources. The extracted raw data from disparate sources would then be transformed and given a unified format and structure for model building and data recovery, which would ultimately facilitate discovering insights. This transformed data would then be loaded into the warehouse on the cloud data platform architecture. Amazon Redshift was selected for designing the data warehouse.
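The idea of giving raw data a unified format can be sketched as follows. This is a minimal example under stated assumptions, not the pipeline that was built: the two raw record layouts, the field names, and the target schema are all hypothetical.

```python
from datetime import datetime, timezone

# Hypothetical raw records from two sources. Field names, units, and
# timestamp formats are assumptions made for illustration.
RAW_ORDER = {"order_id": "A-1001", "sku": "LIPO-01", "total": "49.90",
             "placed": "2023-03-14T09:30:00+00:00"}
RAW_PAYMENT = {"ref": "A-1001", "item": "LIPO-01", "amount_cents": 4990,
               "ts": 1678786200}


def from_order_db(rec):
    """Map an (assumed) order-database row to the unified schema."""
    return {
        "order_id": rec["order_id"],
        "sku": rec["sku"],
        "amount_usd": round(float(rec["total"]), 2),
        "event_time": datetime.fromisoformat(rec["placed"]).astimezone(timezone.utc),
        "source": "orders",
    }


def from_payment_gateway(rec):
    """Map an (assumed) payment-gateway event to the unified schema."""
    return {
        "order_id": rec["ref"],
        "sku": rec["item"],
        "amount_usd": round(rec["amount_cents"] / 100, 2),
        "event_time": datetime.fromtimestamp(rec["ts"], tz=timezone.utc),
        "source": "payments",
    }


unified = [from_order_db(RAW_ORDER), from_payment_gateway(RAW_PAYMENT)]
for row in unified:
    print(row)
```

Once every source is mapped to the same keys, units, and time zone, downstream loading and analysis can treat the records uniformly.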
Cloud Data Warehouse Solution
For building the cloud data warehouse solution, Trigent suggested Amazon Redshift for hosting and processing terabytes of data and running thousands of highly performant queries in parallel, thereby enabling the company to capture, store and analyze large volumes of data and deliver real-time insights to the product, sales, and marketing teams.
Additionally, the research and development team designed a demand forecasting solution for the next quarter by leveraging ML techniques. Algorithms were designed with the help of Amazon SageMaker to improve the weighted absolute percentage error (WAPE). Time series and artificial neural network (ANN) models were used to design the demand forecasting solution. The solution, if used, would surface observations such as sales values and sales percentage changes, to name a few.
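For reference, WAPE is commonly defined as the sum of absolute forecast errors divided by the sum of absolute actual values, so a lower value indicates a better forecast. The sketch below computes it in plain Python; the function name and the sample numbers are illustrative and do not come from the engagement.

```python
def wape(actual, forecast):
    """Weighted absolute percentage error: sum|actual - forecast| / sum|actual|."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    total_actual = sum(abs(a) for a in actual)
    if total_actual == 0:
        raise ValueError("WAPE is undefined when all actual values are zero")
    total_error = sum(abs(a - f) for a, f in zip(actual, forecast))
    return total_error / total_actual


# Made-up unit sales for three periods, for illustration only.
actual_units = [100, 200, 300]
forecast_units = [110, 190, 330]
score = wape(actual_units, forecast_units)
print(f"WAPE = {score:.2%}")
```

Because the errors are weighted by total actual volume, WAPE is less distorted by low-volume products than a simple average of per-item percentage errors.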
Data Visualization & Reporting Options
Useful insights were derived from demand forecasting and anomaly detection. Using Trigent’s data visualization solution, the team designed dashboards, reports, and visuals from the ML-generated insights. These customized reports would enable the teams to make informed decisions. The team designed the reports using several tools, such as Power BI, Tableau, and Amazon QuickSight, to generate customized visualizations.

Business Impact/Outcome

The cloud data warehouse would ensure that large volumes of data are processed simultaneously without delays.
Data visualization would enable the company to quickly recognize emerging trends in product-specific demand and make informed decisions.

Technology Stack: