CASE STUDIES
-
Send Time Optimization
CHALLENGE
The client wanted to maximise email engagement by ensuring campaigns were sent at the optimal day and time for each recipient. Traditional approaches offered no personalisation and did not explore potentially better send times.
SOLUTION
We developed a Send Time Optimiser (STO) using a Bayesian approach, balancing exploitation of known high-performing send times with exploration of potential new optimal times. This allowed the model to continuously learn and refine the best send windows.
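As an illustration only, the exploration/exploitation trade-off can be sketched with Beta-Bernoulli Thompson sampling. The slot names, click rates, and function names below are hypothetical and not taken from the client project. The sketch runs a single bandit, whereas a per-recipient or per-segment optimiser would keep one set of counts for each recipient or segment.

```python
import random

def choose_slot(stats, rng):
    """Sample a plausible click rate per slot and pick the highest draw."""
    best_slot, best_draw = None, -1.0
    for slot, (clicks, non_clicks) in stats.items():
        # Beta(1 + clicks, 1 + non_clicks) is the posterior under a uniform prior.
        draw = rng.betavariate(1 + clicks, 1 + non_clicks)
        if draw > best_draw:
            best_slot, best_draw = slot, draw
    return best_slot

def update(stats, slot, clicked):
    """Record the outcome of one send in the chosen slot."""
    clicks, non_clicks = stats[slot]
    stats[slot] = (clicks + 1, non_clicks) if clicked else (clicks, non_clicks + 1)

def simulate(true_rates, n_sends, seed=0):
    """Simulate sends against assumed click rates (illustrative data only)."""
    rng = random.Random(seed)
    stats = {slot: (0, 0) for slot in true_rates}
    for _ in range(n_sends):
        slot = choose_slot(stats, rng)
        clicked = rng.random() < true_rates[slot]
        update(stats, slot, clicked)
    return stats

# Hypothetical send slots and click rates, invented for this sketch.
TRUE_RATES = {"Mon 09:00": 0.02, "Wed 18:00": 0.05, "Sat 11:00": 0.03}
stats = simulate(TRUE_RATES, n_sends=5000)
sends_per_slot = {slot: c + n for slot, (c, n) in stats.items()}
print(sends_per_slot)
```

Slots with few observations have wide posteriors, so they still win an occasional draw and get explored. As evidence accumulates, sends tend to concentrate on the slots that perform best.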
IMPACT
The STO achieved a 50% higher clickthrough rate than sending emails at the most popular day and time from the previous two years, and twice the clickthrough rate of random scheduling.
TECH STACK: R
STATISTICAL METHODOLOGIES: Thompson sampling, Multi-armed bandit problem
-
Fragrance Recommendation Engine
CHALLENGE
Fragrance selection is highly personal, influenced by individual preferences, personality, and past choices. The goal was to develop a fragrance recommendation engine: a “Spotify for perfumes” that could provide personalised fragrance suggestions for new consumers based on their personality traits and past fragrance ownership.
SOLUTION
We developed five different recommendation engines, each using a distinct logic to suggest fragrances, along with a random recommendation model as a control group. The models were tested on 75 UK and 75 US consumers, evaluating their effectiveness based on purchase intention, likeability, and wearability.
IMPACT
Three out of the five recommendation engines significantly outperformed the control group, demonstrating a strong ability to match consumers with fragrances they were more likely to enjoy. The uplift was identified based on consumer ratings on likeability, wearability and purchase intention.
TECH STACK: Python
STATISTICAL METHODOLOGIES: Random Forest, Association Rules
-
Predicting Churn
CHALLENGE
An airline client wanted to reduce customer churn, as passengers were switching to competitors. They already had a churn prediction model with 80% precision and 80% recall but needed improvements to better identify at-risk customers and take proactive retention measures.
SOLUTION
We implemented an ensemble modeling approach, developing four separate logistic regression models, each focusing on different aspects of customer behavior. The outputs of these models were then fed into a neural network, which provided a final churn prediction with improved accuracy.
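The stacking idea described here can be sketched in plain Python: four small logistic regressions, each seeing one group of features, feed their scores into a tiny one-hidden-layer network. Everything below is an assumption made for illustration (the feature groups, the synthetic data, the network size, and the learning rates); the production models were built in R and Scala Spark and are not reproduced here.

```python
import math
import random

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def predict_logistic(w, x):
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + w[-1])

def fit_logistic(X, y, lr=0.05, epochs=50):
    w = [0.0] * (len(X[0]) + 1)  # last weight is the bias
    for _ in range(epochs):
        for x, target in zip(X, y):
            err = predict_logistic(w, x) - target
            for j, xj in enumerate(x):
                w[j] -= lr * err * xj
            w[-1] -= lr * err
    return w

def forward(net, z):
    W1, W2 = net
    h = [math.tanh(sum(w * v for w, v in zip(row, z)) + row[-1]) for row in W1]
    return h, sigmoid(sum(w * v for w, v in zip(W2, h)) + W2[-1])

def fit_network(Z, y, hidden=3, lr=0.05, epochs=50, seed=0):
    rng = random.Random(seed)
    n_in = len(Z[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(hidden)]
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden + 1)]
    for _ in range(epochs):
        for z, target in zip(Z, y):
            h, p = forward((W1, W2), z)
            err = p - target  # log-loss gradient at the output pre-activation
            for k in range(hidden):
                grad_h = err * W2[k] * (1.0 - h[k] ** 2)
                W2[k] -= lr * err * h[k]
                for j in range(n_in):
                    W1[k][j] -= lr * grad_h * z[j]
                W1[k][-1] -= lr * grad_h
            W2[-1] -= lr * err
    return W1, W2

# Hypothetical feature groups: column indices seen by each base model.
GROUPS = {"booking": [0, 1], "loyalty": [2, 3], "service": [4, 5], "pricing": [6, 7]}

def make_data(n, rng):
    """Synthetic customers; the churn rule is invented for this sketch."""
    X, y = [], []
    for _ in range(n):
        x = [rng.gauss(0, 1) for _ in range(8)]
        signal = x[0] - x[2] + 0.5 * x[4] * x[6]
        X.append(x)
        y.append(1 if signal + rng.gauss(0, 0.5) > 0 else 0)
    return X, y

def select(x, cols):
    return [x[c] for c in cols]

rng = random.Random(42)
X_base, y_base = make_data(300, rng)  # used to fit the base models
X_meta, y_meta = make_data(300, rng)  # held back to fit the network
X_test, y_test = make_data(200, rng)

base_models = {name: fit_logistic([select(x, cols) for x in X_base], y_base)
               for name, cols in GROUPS.items()}

def base_scores(x):
    return [predict_logistic(base_models[name], select(x, cols))
            for name, cols in GROUPS.items()]

net = fit_network([base_scores(x) for x in X_meta], y_meta)
churn_probs = [forward(net, base_scores(x))[1] for x in X_test]
```

Fitting the network on a sample the base models have not seen is one common way to stop the second stage from learning from overfitted first-stage scores.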
IMPACT
Our enhanced model achieved 93% precision and 97% recall on an unseen future dataset, significantly improving the client's ability to identify and retain high-risk customers before they churned.
TECH STACK: R, Scala Spark
STATISTICAL METHODOLOGIES: Artificial Neural Networks, Regression
-
Customer Lifetime Value Ranking
CHALLENGE
Our client needed a new framework for customer loyalty tiering due to the introduction of new KPIs, shifting the focus from customer profit to customer profitability. Additionally, there was stakeholder disagreement on how to define customer segments, creating the need for a more flexible and data-driven approach.
SOLUTION
We developed a ranking system based on expected customer lifetime value (CLV). Using a Bayesian approach, we estimated CLV for new customers based on overall data patterns, with predictions becoming more accurate as additional customer information was collected.
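One simple way to picture this behaviour is a conjugate normal-normal update, in which a customer's estimated value starts at the population average and moves towards their own history as observations arrive. This is a minimal sketch under assumed numbers; the prior, the noise variance, and the customer data are hypothetical, and the project's actual CLV model is not described in enough detail to reproduce.

```python
def posterior_value(observations, prior_mean, prior_var, noise_var):
    """Posterior mean and variance of a customer's expected yearly value."""
    n = len(observations)
    precision = 1.0 / prior_var + n / noise_var
    mean = (prior_mean / prior_var + sum(observations) / noise_var) / precision
    return mean, 1.0 / precision

# Hypothetical population-level prior and observation noise.
PRIOR_MEAN, PRIOR_VAR, NOISE_VAR = 200.0, 2500.0, 10000.0

# Hypothetical customers with zero, three, and one observed yearly spend values.
customers = {"A": [], "B": [400.0, 380.0, 420.0], "C": [50.0]}

estimates = {
    name: posterior_value(obs, PRIOR_MEAN, PRIOR_VAR, NOISE_VAR)
    for name, obs in customers.items()
}

# Rank customers by expected value, highest first.
ranking = sorted(estimates, key=lambda name: estimates[name][0], reverse=True)
print(ranking)
```

A brand-new customer (A) is scored at the population average, while customers with more history get estimates that sit closer to their own data and carry a smaller posterior variance.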
IMPACT
Stakeholders gained the ability to slice and dice customer data dynamically, allowing them to create their own segmentations and analyses. This enhanced customer understanding, personalization, and decision-making, ensuring a more strategic and adaptable loyalty framework.
TECH STACK: R, Python
STATISTICAL METHODOLOGIES: Bayesian Statistics
-
Propensity Modelling
CHALLENGE
In the automotive industry, dealers struggle to prioritise leads effectively due to limited customer insights. With only a few opportunities to connect each day, they risk spending time on low-quality leads while missing high-potential buyers. A lack of informed decision-making leads to inefficiencies, reduced conversions, and a weaker understanding of customer needs.
SOLUTION
We developed a propensity scoring system that assigns each contact a score from 1 to 3, ranking their likelihood to purchase a specific vehicle model. These scores are tailored per region and contact type (Customers vs. Prospects), ensuring dealers focus on the highest-value leads. To generate accurate predictions, we trained machine learning models on historical customer data, using past behaviors to estimate future purchasing probabilities. This “time machine” approach allowed us to validate the model’s effectiveness by simulating past scenarios and testing against real purchase outcomes.
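As a rough sketch of the scoring step, predicted purchase probabilities can be converted into a 1 to 3 score by ranking contacts within each region and contact type and splitting them into three equal bands. The field names and sample contacts are hypothetical, and the code assumes 3 is the highest propensity, which the description above does not state.

```python
from collections import defaultdict

def assign_scores(contacts):
    """Map each contact id to a score of 1, 2 or 3 within its own group.

    Assumption for this sketch: 3 marks the highest-propensity third.
    """
    groups = defaultdict(list)
    for contact in contacts:
        groups[(contact["region"], contact["contact_type"])].append(contact)

    scores = {}
    for members in groups.values():
        ranked = sorted(members, key=lambda c: c["probability"])
        n = len(ranked)
        for rank, contact in enumerate(ranked):
            # Integer tercile: bottom third -> 1, middle -> 2, top -> 3.
            scores[contact["id"]] = 1 + (3 * rank) // n
    return scores

# Hypothetical model outputs for two region/contact-type groups.
contacts = [
    {"id": "c1", "region": "UK", "contact_type": "Customer", "probability": 0.10},
    {"id": "c2", "region": "UK", "contact_type": "Customer", "probability": 0.40},
    {"id": "c3", "region": "UK", "contact_type": "Customer", "probability": 0.90},
    {"id": "p1", "region": "DE", "contact_type": "Prospect", "probability": 0.02},
    {"id": "p2", "region": "DE", "contact_type": "Prospect", "probability": 0.05},
    {"id": "p3", "region": "DE", "contact_type": "Prospect", "probability": 0.03},
]

scores = assign_scores(contacts)
print(scores)
```

Ranking within each group means a prospect with a low absolute probability can still receive the top score if it is among the best leads available in that region and contact type.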
IMPACT
With high-propensity contacts 5x more likely to convert, dealers can now prioritise the right leads, improving efficiency and increasing sales. The latest iteration of our model delivered a 16-25% improvement over its first-generation version, providing a data-driven advantage to automotive sales teams.
TECH STACK: R, Python
STATISTICAL METHODOLOGIES: Random Forest, Experimental design
-
Customer Chat Experience Evaluation
CHALLENGE
Our client introduced live web chat for its sales team and needed to evaluate its effectiveness. The main objectives were to assess the impact of high-volume events like Black Friday and Cyber Monday, identify consumer pain points for different products, measure the performance of contact center agents, and find a way to semi-automate the evaluation of large chat volumes. Additionally, the business needed to demonstrate the value of live chat as a key sales and support channel.
SOLUTION
We developed a Natural Language Processing (NLP) model to analyse over 40,000 chat logs and 500,000 lines of text. The model evaluated chat transcripts, customer and agent survey data, and transactional data to extract meaningful insights. Real-time performance feedback was provided to senior management, allowing immediate visibility into contact center operations.
IMPACT
The model provided instant insights into the performance of high-volume sales events, enabling the business to measure the impact of major campaigns. Analysis revealed that positive sentiment in chats contributed to a 30% increase in consumer conversion rates. One-page evaluation reports were created for contact center staff to improve individual performance, and a comprehensive report outlined key findings and strategic recommendations.
TECH STACK: R, Python
STATISTICAL METHODOLOGIES: Various text mining techniques
-
Deduplication Algorithm
CHALLENGE
A client’s database contained duplicate customer records with different IDs, making it difficult to perform accurate analytics and draw customer insights. The goal was to develop an algorithm that could identify and merge duplicate records, creating a single, unified customer profile while preserving all relevant information.
SOLUTION
We built a deduplication algorithm that analysed multiple data points such as email addresses, physical addresses, and preferences to identify records likely belonging to the same customer. A second step ensured that the master record retained the most accurate and complete customer information, including correct addresses, transaction counts, and interaction points.
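The two steps can be illustrated with a much simpler stand-in for the real matching logic: records are linked when they share a normalised email or address, linked records are grouped with a union-find structure, and each group is merged into one master record. Exact matching on normalised keys replaces the tf-idf and association-rule techniques used in the project, and all field names, merge rules, and sample records are hypothetical.

```python
def normalise_email(email):
    return email.strip().lower()

def normalise_address(address):
    return " ".join(address.lower().replace(",", " ").split())

def find(parent, i):
    """Union-find lookup with path compression."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i

def group_duplicates(records):
    """Step 1: group records that share a normalised email or address."""
    parent = list(range(len(records)))
    first_seen = {}
    for i, rec in enumerate(records):
        keys = []
        if rec.get("email"):
            keys.append(("email", normalise_email(rec["email"])))
        if rec.get("address"):
            keys.append(("address", normalise_address(rec["address"])))
        for key in keys:
            if key in first_seen:
                parent[find(parent, i)] = find(parent, first_seen[key])
            else:
                first_seen[key] = i
    groups = {}
    for i, rec in enumerate(records):
        groups.setdefault(find(parent, i), []).append(rec)
    return list(groups.values())

def merge(group):
    """Step 2: build one master record from a group of duplicates."""
    emails = [normalise_email(r["email"]) for r in group if r.get("email")]
    addresses = [r["address"] for r in group if r.get("address")]
    return {
        "ids": sorted(r["id"] for r in group),
        "email": emails[0] if emails else None,
        # Crude completeness proxy for this sketch: keep the longest address.
        "address": max(addresses, key=len) if addresses else None,
        "transactions": sum(r.get("transactions", 0) for r in group),
    }

# Hypothetical customer records.
records = [
    {"id": 101, "email": "Ann@Example.com", "address": "1 High St, London", "transactions": 3},
    {"id": 205, "email": "ann@example.com ", "address": "", "transactions": 2},
    {"id": 309, "email": "", "address": "1 high st london", "transactions": 1},
    {"id": 410, "email": "bob@example.com", "address": "9 Low Rd, Leeds", "transactions": 5},
]

masters = [merge(group) for group in group_duplicates(records)]
print(masters)
```

Matching on address alone would merge different people in the same household, so a real implementation would need additional evidence before linking two records.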
IMPACT
The algorithm successfully deduplicated 11% of the customer database without any known incorrect merges. This resulted in cleaner, more reliable data, enabling better analytics, improved customer insights, and more effective marketing strategies.
TECH STACK: Python
STATISTICAL METHODOLOGIES: Association Rules, tf-idf
-
Segmentation
CHALLENGE
Our client needed audience insights across four major markets (UK, US, France, Germany) to inform advertising strategy. The goal was to transition from an anonymous social audience to a known, engaged community.
SOLUTION
We conducted an on-site session to define key questions, uncover knowledge gaps, and align on audience research objectives. Then, we applied machine learning techniques to segment the audience and extract key behavioural insights. Finally, we developed an interactive dashboard, enabling the team to quickly access and analyse real-time audience insights for strategic decision-making.
IMPACT
The analysis led to the identification of six distinct audience segments, enabling more targeted and effective marketing strategies. The findings were later presented at Advertising Week Europe, where they helped shape discussions around Gen Z engagement.
TECH STACK: R, Sisense
STATISTICAL METHODOLOGIES: Principal Component Analysis, Self-Organising Maps, Hierarchical Clustering
-
Interactive Dashboard
CHALLENGE
Our client needed a solution to drastically reduce the time spent generating reports and insights about their customers. They relied on complex spreadsheets to manually merge various data sources, making the process slow, error-prone, and inefficient.
SOLUTION
To solve this, we split the project into two phases. First, we built an automated data pipeline to clean, preprocess, and consolidate customer data into a single, unified view. Then, we developed a custom interactive dashboard using Power BI, designed to meet the client’s visual and analytical needs, delivering real-time insights.
IMPACT
The result was a dramatic reduction in report generation time from hours to minutes. The client gained access to up-to-date insights, enabling faster, more informed decisions. The automated pipeline also boosted scalability and reduced manual workload, allowing teams to focus on strategic tasks.
TECH STACK: Python, SQL, Power BI, AWS Lambda
STATISTICAL METHODOLOGIES: Descriptive statistics
-
Call Center Volume Forecast
CHALLENGE
A banking client’s call center experienced highly unpredictable call volumes, leading to long wait times on some days and idle staff on others. They needed a solution to forecast daily call volumes based on historical trends and the impact of outbound campaigns, allowing them to optimise staffing levels and improve customer service efficiency.
SOLUTION
We developed a SARIMAX forecasting model to account for seasonal patterns and external factors such as specific product campaigns and their volumes. To make predictions accessible for non-technical users, we built an interactive Excel-based calendar, allowing the client to input past call volumes and upcoming campaigns to generate forecasts dynamically.
IMPACT
The client improved staff allocation, reducing both customer wait times and underutilised agent hours. By integrating predictive modeling into their workflow, the call center gained a data-driven approach to managing call volumes more efficiently.
TECH STACK: R, Excel
STATISTICAL METHODOLOGIES: Time Series Analysis (SARIMAX)