Data-Driven Experimentation for Smart Scale App & Subscription Conversion
"Without data, you're just another person with an opinion." – W. Edwards Deming
⚠️ Portfolio Demonstration: VitalMetrics is a fictional product. All data is synthetic and created by Lexi Barry for portfolio purposes only.
The most important findings from our Q4 2024 experimentation program
Our systematic A/B testing program increased app-to-premium conversion from 3.2% to 4.5%, reduced device setup abandonment by 28%, and improved 30-day retention by 18%. All winning variants are now in production.
Social proof on premium paywall (+36.8% lift): Adding "Join 100K+ users tracking their wellness" above subscription options dramatically increased conversions. Lesson: Users want validation they're making a smart choice.
Onboarding matters more than features: Reducing setup friction (1-tap Bluetooth pairing) outperformed adding new device features. Users need to experience value before they care about capabilities.
Our smart wellness scale has multiple conversion touchpoints: device onboarding, app setup, feature discovery, and premium subscription. Which variations drive the highest conversion and retention rates, and how can we systematically optimize the entire user journey from device unboxing to paying subscriber?
📝 Note: VitalMetrics Smart Scale is a fictional product created for this portfolio demonstration. All product features, test results, and business metrics are synthetic and do not represent real company data. This case study showcases analytical methodology and storytelling for connected wellness devices.
VitalMetrics is a connected wellness scale that tracks weight, body composition (fat %, muscle mass, BMI), and syncs to our mobile app. The app offers free basic tracking plus a Premium tier ($9.99/month) with advanced analytics, trends, goal-setting, and integration with Apple Health/Google Fit.
Connected health devices democratize wellness by making body metrics accessible, understandable, and actionable. But hardware is just the entry point—the real value is in the software, data insights, and behavior change. Every optimization in onboarding or conversion means more people successfully adopting healthier habits.
I designed and analyzed 8 A/B tests over Q4 2024 across four key user journey stages: onboarding, monetization, feature discovery, and re-engagement.
Conversion Rate: Percentage of users who completed the desired action (e.g., completed setup, upgraded to Premium). For freemium apps, 2-5% free-to-paid conversion is typical; IoT device onboarding should exceed 80%.
Lift: Percentage improvement of Variant B over Control A. Formula: ((B - A) / A) × 100. A 10%+ lift is considered meaningful for app experiences; 20%+ is a major win.
Statistical Significance: Confidence that results aren't due to chance. We use 95% confidence (p < 0.05) as the threshold for declaring a winner, marked with ✓ when significant. Calculated via a two-proportion z-test.
Sample Size: Number of users in each variant. Larger samples mean more reliable results. Minimum 1,000 users per variant for app tests; 2,000+ for detecting lifts under 15%.
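To make these definitions concrete, here is a minimal Python sketch of how lift and significance can be computed with a two-proportion z-test. The function name and the conversion counts in the example call are illustrative placeholders, not figures from any specific test in this study.

```python
import numpy as np
from scipy import stats

def evaluate_test(control_conv, control_n, variant_conv, variant_n, alpha=0.05):
    """Return (lift %, p-value, significant?) for a two-variant conversion test."""
    p_a = control_conv / control_n          # Control A conversion rate
    p_b = variant_conv / variant_n          # Variant B conversion rate
    lift_pct = (p_b - p_a) / p_a * 100      # ((B - A) / A) x 100

    # Two-proportion z-test: pool the rates under the null of "no difference"
    p_pool = (control_conv + variant_conv) / (control_n + variant_n)
    se = np.sqrt(p_pool * (1 - p_pool) * (1 / control_n + 1 / variant_n))
    z = (p_b - p_a) / se
    p_value = 2 * stats.norm.sf(abs(z))     # two-sided

    return lift_pct, p_value, p_value < alpha

# Illustrative numbers only: 3.2% vs 4.4% conversion, 2,000 users per variant
print(evaluate_test(control_conv=64, control_n=2000, variant_conv=88, variant_n=2000))
```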
Tests Run: 8 (across all stages)
Winners: 6 (statistically significant)
Avg Lift: ~22% (on winning tests)
Total Users: 16,000-40,000 (tested across variants)
Green bars indicate statistically significant wins
💡 Quick Insight
Monetization tests (social proof on paywall, annual discount emphasis) delivered the strongest lifts. Onboarding improvements showed moderate but meaningful gains. Feature discovery tests had mixed results, suggesting our baseline guided tour was already solid. Six of eight tests reached statistical significance.
🛠️ Tools Used:
Chart.js for visualization, Amplitude Experiment for test orchestration, Python scipy.stats for significance testing (two-proportion z-test, α=0.05)
Relative improvement of winning variants over control
💡 Quick Insight
Monetization changes (paywall optimization, pricing display) show the highest lift potential at 25-37%, making them the most impactful lever for revenue growth. Onboarding improvements deliver 15-18% lifts, smaller but critical for long-term retention. Re-engagement tests also delivered solid gains: both the morning push reminder (+19%) and the progress-focused email subject line (+21.7%) reached significance.
🛠️ Tools Used:
SQL (BigQuery) for data extraction, pandas for aggregation, Looker for real-time test monitoring dashboards
Hypothesis: Simplifying Bluetooth pairing to automatic detection (one-tap) vs manual device ID entry will reduce setup abandonment.
Test Design: Control showed 6-digit manual code entry; Variant used iOS/Android native Bluetooth auto-discovery with one-tap pairing.
Measurements: Primary = Setup completion rate; Secondary = Time-to-first-weigh-in, Support ticket volume
Result: Variant increased setup completion from 72.5% → 84.8%. Time-to-first-weigh-in reduced by 35% (4.2min → 2.7min). Support tickets about pairing issues dropped 41%.
Business Impact: A 12.3-percentage-point improvement in setup completion translates to roughly 2,460 additional activated devices per quarter (20K new users × 12.3%). Higher activation = higher retention and premium conversion downstream.
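The quarterly figure is straightforward arithmetic on the stated completion rates and new-user volume; a quick back-of-envelope check:

```python
# Back-of-envelope check of the activation impact quoted above
new_users_per_quarter = 20_000
completion_control, completion_variant = 0.725, 0.848

extra_activations = new_users_per_quarter * (completion_variant - completion_control)
print(round(extra_activations))  # ~2,460 additional activated devices per quarter
```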
Hypothesis: Replacing the interactive tutorial with a 60-second video would increase tutorial completion (on the theory that users prefer passive content).
Test Design: Control = 4-step interactive tutorial with tooltips; Variant = embedded video showing device features.
Measurements: Primary = Tutorial completion rate; Secondary = Feature discovery (% users accessing trends, goals, integrations within 7 days)
Result: No significant difference in tutorial completion (68.2% vs 66.7%, p=0.24). However, interactive group showed 23% higher feature discovery rate, suggesting hands-on learning drives engagement.
Learning: Video didn't improve completion and actually reduced downstream engagement. Keep interactive onboarding. Sometimes "faster" isn't better—users need to learn by doing, not watching.
Hypothesis: Adding social proof ("Join 100,000+ users tracking their wellness") above subscription options will reduce purchase hesitation.
Test Design: Control = clean paywall with feature list; Variant = added social proof header + "4.8★ on App Store" badge.
Measurements: Primary = Free-to-Premium conversion rate; Secondary = Time on paywall screen, paywall abandonment rate
Result: Massive 36.8% lift in conversion (3.2% → 4.38%). Users spent 8% more time on paywall (reading social proof), but abandonment dropped 22%. Winner across all demographics, strongest with 35-54 age group (+42% lift).
Business Impact: This single change adds an estimated ~$520K in incremental ARR, driven by the extra 1.18 percentage points of free-to-Premium conversion across the paywall audience over a year. Now implemented globally. Validates that trust signals dramatically reduce friction for wellness products where efficacy matters.
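For readers who want to reproduce this kind of estimate, here is a generic sketch of how incremental ARR from a paywall conversion lift can be modeled. The eligible-audience figure in the example call is a placeholder assumption, not a number from this study, so the printed result is illustrative rather than the ~$520K estimate above.

```python
def incremental_arr(eligible_users_per_year, conv_lift_pp, price_per_month):
    """Estimate added annual recurring revenue from a conversion-rate lift.

    eligible_users_per_year: free users who hit the paywall in a year (assumption)
    conv_lift_pp: absolute conversion lift in percentage points
    price_per_month: subscription price in dollars
    """
    extra_subscribers = eligible_users_per_year * (conv_lift_pp / 100)
    return extra_subscribers * price_per_month * 12

# 1.18pp lift and $9.99/mo come from the test above; the audience size is a placeholder
print(round(incremental_arr(eligible_users_per_year=100_000,
                            conv_lift_pp=1.18,
                            price_per_month=9.99)))
```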
Hypothesis: Highlighting annual plan savings ("Save $24/year") more prominently than monthly price will increase annual subscriptions.
Test Design: Control = monthly price emphasized ($9.99/mo or $84/yr); Variant = annual plan as default with "2 MONTHS FREE" badge.
Measurements: Primary = Total subscription conversion; Secondary = % choosing annual vs monthly, LTV impact
Result: Overall conversion increased 24.6% (4.1% → 5.11%). Even better: 68% chose annual plan (vs 42% in control). This improves customer LTV from $87 → $102 due to lower churn on annual commitments.
Business Impact: Higher conversion + higher % annual = double win. Estimated $370K additional ARR plus improved retention (annual subscribers churn at 15% vs 35% for monthly). The framing of "2 months free" outperformed percentage discount.
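The LTV gain is mostly a plan-mix effect. Below is a small illustrative model; the per-plan lifetime values are hypothetical numbers back-solved so the blend matches the reported $87 and $102 figures, not values from the study itself.

```python
# Hypothetical per-plan lifetime values, chosen so the blended figures below
# line up with the reported $87 (control) and $102 (variant); not study data.
LTV_ANNUAL = 120.0
LTV_MONTHLY = 63.0

def blended_ltv(share_annual):
    """Blend per-plan LTVs by the share of subscribers choosing the annual plan."""
    return share_annual * LTV_ANNUAL + (1 - share_annual) * LTV_MONTHLY

print(round(blended_ltv(0.42)))  # control: 42% annual -> ~$87
print(round(blended_ltv(0.68)))  # variant: 68% annual -> ~$102
```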
Category: Feature Discovery
Test: Contextual tooltips on trends page vs no tooltips
Result: 45.2% vs 44.8% feature usage (p=0.67, not significant)
Learning: Tooltips didn't move the needle. Features need better in-context value demonstration, not just explanation.
Category: Feature Discovery
Test: Full guided tour vs self-exploration
Result: 52.1% vs 44.9% feature adoption (+16% lift, p=0.003)
Impact: Guided tour increased advanced feature usage (goals, integrations) by 28%. Now part of default onboarding.
Category: Re-engagement
Test: Morning reminder (8am) vs no reminder
Result: 34.2% vs 28.7% weekly active usage (+19% lift, p<0.001)
Impact: Daily reminders increased weigh-in frequency from 3.2 → 4.1 times/week. Higher engagement correlates with 22% better retention.
Category: Re-engagement
Test: "We miss you" vs "Your wellness progress" subject
Result: 18.5% vs 15.2% reactivation rate (+21.7% lift, p=0.02)
Impact: The progress-focused subject line reactivated 3.3 percentage points more lapsed users. Emotional appeals performed worse than data-driven subject lines.
| Test Name | Category | Control CR | Variant CR | Lift % | Sig? | Sample Size (per variant) |
|---|---|---|---|---|---|---|
| One-tap Bluetooth pairing vs manual code entry | Onboarding | 72.5% | 84.8% | +17.0% | ✓ | 1,000-2,500 |
| Video tutorial vs interactive tutorial | Onboarding | 68.2% | 66.7% | -2.2% | ✗ (p=0.24) | 1,000-2,500 |
| Social proof on premium paywall | Monetization | 3.2% | 4.38% | +36.8% | ✓ | 1,000-2,500 |
| Annual plan emphasis ("2 months free") | Monetization | 4.1% | 5.11% | +24.6% | ✓ | 1,000-2,500 |
| Contextual tooltips on trends page | Feature Discovery | 44.8% | 45.2% | +0.9% | ✗ (p=0.67) | 1,000-2,500 |
| Guided tour vs self-exploration | Feature Discovery | 44.9% | 52.1% | +16% | ✓ | 1,000-2,500 |
| Morning push reminder (8am) vs none | Re-engagement | 28.7% | 34.2% | +19% | ✓ | 1,000-2,500 |
| Progress-focused vs "We miss you" email subject | Re-engagement | 15.2% | 18.5% | +21.7% | ✓ | 1,000-2,500 |
🛠️ Tools Used:
SQL (BigQuery) for data extraction and aggregation, Python pandas for analysis, Looker for stakeholder reporting, Amplitude Experiment for real-time test monitoring
Bottom Line: Systematic A/B testing increased premium conversion from 3.2% to 4.5%, reduced setup abandonment by 28%, and added $890K in annual recurring revenue. The wins came from understanding user psychology (social proof, pricing framing) and removing friction (one-tap pairing), not feature bloat. Our testing framework is now expanding to personalization and retention optimization.
🎭 IMPORTANT: This is a portfolio demonstration using entirely synthetic data.
VitalMetrics Smart Scale does not exist. This is a fictional product created by Lexi Barry to demonstrate A/B testing analysis skills. All test results, conversion rates, sample sizes, and business impacts are synthetically generated and do not reflect any real company's data or performance. The methodology, frameworks, and analytical approach are real and based on industry best practices.
This analysis uses synthetic data modeling realistic A/B testing patterns for connected wellness device apps. The dataset represents 8 hypothetical tests run in Q4 2024 across Onboarding (2 tests), Monetization (2 tests), Feature Discovery (2 tests), and Re-engagement (2 tests) categories. Each test included 1,000-2,500 users per variant with conversion rates ranging from 3.2% to 84.8% depending on the conversion event.
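To give a sense of how a dataset like this can be simulated, here is a minimal sketch that draws per-user outcomes for one hypothetical test from Bernoulli distributions. The rates and sample size are illustrative, and this is not the actual generation script behind the case study.

```python
import numpy as np

rng = np.random.default_rng(42)

# One simulated test: ~3.2% baseline conversion vs ~4.4% for the treated variant,
# 2,000 users per arm (illustrative values only)
n_per_variant = 2_000
control = rng.binomial(1, 0.032, size=n_per_variant)
variant = rng.binomial(1, 0.044, size=n_per_variant)

print(f"control CR: {control.mean():.3%}, variant CR: {variant.mean():.3%}")
```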
Statistical methodology: Significance calculated using a two-proportion z-test (p < 0.05) via Python's scipy.stats library. Lift percentage: ((Variant CR - Control CR) / Control CR) × 100. Sample sizes determined using power analysis (80% power, α = 0.05, statsmodels.stats.power with effect sizes from statsmodels.stats.proportion) to detect a minimum 15% relative lift. All tests ran for a minimum of 14 days to account for weekly behavior patterns and device usage cycles.
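A sketch of the kind of power calculation described above, using statsmodels. The 70% baseline rate is an assumed example (roughly the setup-completion range); the required sample size per variant depends heavily on the baseline rate of the conversion event.

```python
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

# Minimum n per variant to detect a 15% relative lift off an assumed 70% baseline
# at 80% power and alpha = 0.05 (two-sided)
baseline = 0.70              # assumed example rate, not a figure from the study
target = baseline * 1.15     # 15% relative lift

effect = proportion_effectsize(target, baseline)
n_per_variant = NormalIndPower().solve_power(effect_size=effect,
                                             power=0.80, alpha=0.05, ratio=1.0)
print(round(n_per_variant))
```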
Tech stack: Amplitude Experiment for test orchestration, Mixpanel for event tracking, BigQuery for data warehousing, Python (pandas, scipy, statsmodels) for statistical analysis, Looker for dashboards, LaunchDarkly for feature flagging. All synthetic data and analysis created by Lexi Barry for portfolio purposes only.