Most people run cold email campaigns, get mediocre results, and move on.
At imisofts, we run 100+ A/B tests monthly. Each test teaches us something. Compound those lessons, and you get 65-75% open rates and 3-5% reply rates.
This is our A/B testing framework.
The Cold Email A/B Testing Hierarchy
Not all tests have equal impact. We prioritize:
- Subject line (highest impact)
- Opening line (second highest)
- Value statement (medium impact)
- CTA (medium-low impact)
- Send time (lowest impact)
Never test low-impact variables first. You'll waste samples before finding the real wins.
Testing Subject Lines (Highest Impact)
Subject lines determine open rate. Open rate determines everything else.
Test structure:
Variant A (baseline): [Current subject line that gets 45% opens]
Variant B (test): [New subject line with different formula]
Sample size: 50-100 prospects each
Duration: 5-7 days
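To run the split cleanly, randomize prospects into the two groups rather than splitting by list order or alphabet. A minimal sketch (the helper name is ours, not from any particular sending tool):

```python
import random

def split_test_groups(prospects, seed=42):
    """Randomly split a prospect list into two equal-sized A/B groups."""
    shuffled = prospects[:]                # copy so the original list is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split reproducible
    midpoint = len(shuffled) // 2
    return shuffled[:midpoint], shuffled[midpoint:]

prospects = [f"prospect_{i}@example.com" for i in range(200)]
variant_a, variant_b = split_test_groups(prospects)
print(len(variant_a), len(variant_b))  # 100 100
```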
Example test:
Variant A: "hi john, noticed you launched [product]" (45% open rate)
Variant B: "quick thought on [industry]" (52% open rate)
Winner: Variant B (+7 percentage points)
This 7-point improvement compounds. Across 10,000 email prospects, that's 700 additional opens. 700 additional opens means 7-35 additional replies (at 1-5% reply rate).
That's 7-35 additional replies, and potential customers, from changing a single subject line.
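The arithmetic behind that claim, as a quick back-of-the-envelope script:

```python
prospects = 10_000
open_lift = 0.07            # +7 percentage points from the winning subject line
reply_range = (0.01, 0.05)  # 1-5% of the additional opens turn into replies

extra_opens = prospects * open_lift
print(f"Extra opens: {extra_opens:.0f}")  # 700
print(f"Extra replies: {extra_opens * reply_range[0]:.0f}-"
      f"{extra_opens * reply_range[1]:.0f}")  # 7-35
```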
Subject line tests we run:
- Personalization (first name + achievement vs. generic)
- Question format vs. statement format
- Lowercase vs. Title Case
- Short vs. specific
- Different achievement angles
Testing Opening Lines (Second Highest Impact)
Opening line determines whether they read past the first sentence.
Test structure:
Email 1, Variant A: [Opening line A] + [rest of email unchanged]
Email 1, Variant B: [Opening line B] + [rest of email unchanged]
Sample size: 100 prospects each
Duration: 5-7 days
Metric: read-through rate past the first sentence (hard to track without advanced tools)
Alternative: Track reply rate as a proxy for engagement with the content
Example test:
Variant A: "I noticed you launched [product] last month." (1.2% reply rate)
Variant B: "Most SaaS founders spend 15 hours/week on prospecting. You probably do too." (1.8% reply rate)
Winner: Variant B (+0.6 percentage points reply rate)
0.6 points seems small. Across 10,000 prospects, it's 60 additional replies.
Opening line tests we run:
- Specific personalization vs. general observation
- Problem-first vs. achievement-first
- Curiosity gap vs. direct statement
- Industry pattern vs. company-specific observation
Testing Value Statements (Medium Impact)
Value statement is your chance to prove relevance before the pitch.
Test structure:
Email 1, Variant A: [Value statement A]
Email 1, Variant B: [Value statement B]
Metric: Email 1 reply rate or Email 2 open rate (if they engage with Email 1, they'll engage with Email 2)
Example test:
Variant A: "We help SaaS teams automate their prospecting and save 12 hours/week." (2% Email 1 reply)
Variant B: "One of your competitors just booked 8 qualified deals this month using [tactic]." (2.8% Email 1 reply)
Winner: Variant B (social proof outperforms direct benefit)
This changes your entire Email 1 strategy. Across campaigns, social proof hooks outperform benefit hooks by 20-40%.
Value statement tests we run:
- Direct benefit vs. social proof
- Specific metric vs. general statement
- Industry pattern vs. company-specific observation
- Problem-agitation vs. opportunity-excitement
Testing CTAs (Medium-Low Impact)
CTA wording has lower impact than subject/opening, but still matters.
Test structure:
Email 2, Variant A: "[CTA A] or reply with your timeline."
Email 2, Variant B: "[CTA B] or reply with your timeline."
Metric: Email 2 reply rate
Example test (SaaS):
Variant A: "Book a 15-min strategy call" (3.2% reply)
Variant B: "Are you open to a quick conversation?" (2.8% reply)
Winner: Variant A (direct booking link outperforms vague ask)
But this varies by industry. Medicare-focused campaigns might see opposite results (phone-first CTAs outperform calendar links).
CTA tests we run:
- Direct booking link vs. "reply to schedule"
- Soft ask vs. hard ask
- Specific time ("15 min") vs. vague ("quick call")
- Phone number vs. calendar link (varies by industry)
Testing Send Times (Lowest Impact)
When you send affects open rate, but much less than what you send.
Test structure:
Group A: Send on Tuesday, 10 AM
Group B: Send on Thursday, 10 AM
Metric: Open rate
Our data across 50M+ emails:
Monday: 38% open rate
Tuesday: 45% open rate
Wednesday: 47% open rate
Thursday: 48% open rate
Friday: 40% open rate
Best day: Thursday, 10 AM
Worst day: Monday, 10 AM
Difference: ~10 percentage points
That matters, but nowhere near as much as subject line testing (which can change open rate by 30+ points).
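If your sending platform exports a per-email log, you can compute by-day numbers from your own data instead of trusting ours. A minimal sketch, assuming a hypothetical export of (weekday, opened) records:

```python
from collections import defaultdict

# Hypothetical export: one (weekday, opened) record per email sent
send_log = [("Tue", True), ("Tue", False), ("Thu", True), ("Mon", False)]

sent = defaultdict(int)
opened = defaultdict(int)
for weekday, was_opened in send_log:
    sent[weekday] += 1
    opened[weekday] += was_opened  # True counts as 1, False as 0

for weekday, count in sent.items():
    print(f"{weekday}: {opened[weekday] / count:.0%} open rate")
```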
Send time tests we run:
- Weekday vs. weekend
- Morning vs. afternoon vs. evening
- Time zone-specific sends
- Industry-specific patterns (e.g., healthcare sees higher open rates on Friday due to weekly planning)
How to Measure Statistical Significance
You don't need a PhD in statistics. Here's the simple rule:
Sample size of 50+ per variant. If you see a difference of 5+ percentage points, it's probably real.
More rigorous approach:
Use an A/B test significance calculator (under the hood, these compare two binomial proportions).
Example:
- Variant A: 45 opens out of 100 (45%)
- Variant B: 52 opens out of 100 (52%)
- Difference: 7 percentage points
Question: Is this real or random?
Plug into calculator. If p-value < 0.05, it's statistically significant (95% confidence). You can trust the result.
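If you'd rather compute it yourself, here is a minimal standard-library sketch of the underlying two-proportion z-test (the function name is ours). Note that with only 100 prospects per variant, even this 7-point gap comes out around p ≈ 0.32, which is why the rule of thumb below deliberately trades statistical rigor for speed:

```python
from math import erf, sqrt

def two_proportion_p_value(opens_a, n_a, opens_b, n_b):
    """Two-sided p-value for a difference between two open rates (pooled z-test)."""
    p_a, p_b = opens_a / n_a, opens_b / n_b
    pooled = (opens_a + opens_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    return 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # two-sided normal tail

# The example above: 45/100 opens vs. 52/100 opens
print(f"p-value: {two_proportion_p_value(45, 100, 52, 100):.2f}")  # ~0.32
```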
For cold email, we use this rule of thumb:
Sample < 50: Don't trust the result. Run more samples.
Sample 50-100: If difference > 5 points, probably real.
Sample 100-200: If difference > 3 points, probably real.
Sample 200+: If difference > 2 points, probably real.
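Those thresholds are easy to encode if you want to apply them automatically across many tests. A minimal sketch:

```python
def rule_of_thumb(sample_per_variant, point_difference):
    """Apply the sample-size thresholds above (difference in percentage points)."""
    if sample_per_variant < 50:
        return "Don't trust the result. Run more samples."
    if sample_per_variant < 100:
        threshold = 5
    elif sample_per_variant < 200:
        threshold = 3
    else:
        threshold = 2
    return "Probably real." if point_difference > threshold else "Could be noise."

print(rule_of_thumb(100, 7))  # Probably real.
```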
The Weekly Testing Cycle
Monday: Review last week's tests. Declare winners.
Tuesday-Wednesday: Roll out winning variant to 50% of new prospects.
Wednesday-Thursday: Run new tests on remaining 50%.
Friday: Measure results.
Monday: Repeat.
This weekly cycle compounds. Each week you find one new winning variant. Month 1, you're at baseline. Month 3, you're 30-40% above baseline.
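The compounding math behind that claim, assuming a modest ~2.5% relative lift from each weekly winner:

```python
weekly_lift = 0.025  # assumed relative improvement per winning variant
weeks = 12           # roughly three months of weekly cycles
print(f"{(1 + weekly_lift) ** weeks - 1:.0%} above baseline")  # ~34%
```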
What Not to Test
Don't test too many things at once
Wrong: Test subject line, opening line, CTA, send time simultaneously.
You won't know which variable drove the win. Testing everything at once is called multivariate testing, and it requires huge sample sizes.
Right: Test one variable per week.
Subject line week 1. Opening line week 2. CTA week 3. You learn faster and with smaller samples.
Don't test on tiny samples
Wrong: Test on 10 people per variant.
Too much variance. Random chance plays a huge role.
Right: Test on 50+ people per variant minimum.
This gives signal above noise.
Don't declare winners too early
Wrong: Run test for 24 hours. Declare winner.
Time of day matters. Day of week matters. One day isn't enough.
Right: Run test for 5-7 days minimum.
This accounts for daily/weekly patterns.
Testing Template: What We Track
| Element | Variant A | Variant B | Winner | Notes |
|---------|----------|----------|--------|-------|
| Subject Line | hi john, noticed [product] | quick thought on [industry] | B | +7 points open rate |
| Opening Line | I noticed... | Most [industry]... | B | Engagement higher |
| Value Statement | Direct benefit | Social proof | B | Social proof +0.8% reply |
| CTA | Book call | Reply to schedule | A | Direct link better |
| Send Time | Tuesday 10 AM | Thursday 10 AM | B | +3 points open rate |
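If you prefer keeping this log in code rather than a spreadsheet, a minimal record structure works (the field names are ours):

```python
from dataclasses import dataclass

@dataclass
class ABTestResult:
    """One row of the testing log above."""
    element: str
    variant_a: str
    variant_b: str
    winner: str
    notes: str

log = [
    ABTestResult("Subject Line", "hi john, noticed [product]",
                 "quick thought on [industry]", "B", "+7 points open rate"),
]
print(log[0].winner)  # B
```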
Tools for A/B Testing
At imisofts, we use:
- Instantly (built-in A/B testing)
- SmartLead (rotation + analytics)
- Clay + Apollo (data merge + manual testing)
- Custom scripts (for complex multivariate tests)
Most platforms now offer native A/B testing. Use it.
Results: What Testing Gets You
Baseline campaign (no testing):
- Subject: 35% open
- Reply: 1.5%
After 3 months of weekly testing:
- Subject: 55% open (+20 points)
- Opening: better engagement
- CTA: better conversion
- Reply: 3.5% (+2 points)
That 2-point improvement in reply rate is massive. It more than doubles your results.
What We Recommend at imisofts
We run A/B testing for all managed clients:
- Weekly testing cycles
- Subject line, opening, CTA, send time
- Multivariate testing for scaled campaigns
- Statistical significance validation
- Monthly optimization reports
Packages range from $497/month (Management with testing) to $2,450/year (Enterprise with full testing suite).
Explore imisofts Cold Email Packages