"In planning an advertising campaign, the first step should be to clear the decks of all opinions … The next step should be to find a scientific method of testing."
- John Caples, Tested Advertising Methods
The science of testing has evolved far beyond A/B splits. Yet, as a small group of statisticians toiled over the last six decades to create powerful new test designs, they largely ignored—and were ignored by—the marketing world.
Today, as more test experts bridge the gap between theory and practice, this is beginning to change. Terms like interaction effects, fractional-factorial designs, and multivariable testing may seem like a foreign language, but cutting-edge techniques like these have given a few market leaders a formidable competitive advantage.
Multivariable Matrix Test Designs
Multivariable matrix test designs let you change many variables at once and still isolate the impact of each. With these advanced techniques, a number of test cells are combined within the same test, following scientifically defined "recipes." Each recipe provides one unique piece of information about every variable in the test.
Analyzed together, the recipes pinpoint the real-world effect of each variable and the interactions among variables (where the impact of one variable changes depending on how the others are set). In addition, you can use the same sample size whether testing one or thirty-one variables. With one test design made up of a number of recipes showing many different pieces of the same puzzle, multivariable matrix tests are fast and powerful, and their results are clear, robust, and actionable.
Furthermore, in a well-run test, all of the statistical complexity can remain transparent to your marketing team. With greater freedom to explore new ideas and opportunities, the statistics and strategy are like a framework that adds structure and scientific discipline to the execution of your ideas.
The Science Behind the Success
Later on we'll look at an e-mail test with 19 variables tested simultaneously, but first let's look at a simple test to see how these matrix designs work.
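Say you want to test two elements of your e-mail: price (A), comparing the control's lower price with a higher price of $29.99, and offer (B), comparing a discount with a free gift.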
You can test these two elements as two separate split-run tests, or you can combine them into one. As one test, you send out four recipes of your e-mail, each with one combination of the two elements:
The test can be written with a "-" (or -1) to represent the control and a "+" (or +1) for the new level, as in the following matrix. The +/- combinations in the A and B columns represent the four e-mails you need to drop. The AB column is used only for the calculation of the interaction effect (discussed below). Each plus and minus in this column is calculated by multiplying the signs in columns A and B. For example, (-1)x(-1)=(+1) in recipe #1 and (+1)x(-1)=(-1) for recipe #2.
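As a minimal sketch, the matrix described above can be generated in a few lines of Python. The recipe order follows the text (recipe 1 is A-, B-; recipe 2 is A+, B-); the variable and function names are illustrative only.

```python
from itertools import product

# Levels are coded -1 (control) and +1 (new level), as in the text.
# Looping over B in the outer position makes A alternate fastest, which gives
# the recipe order used in the text:
#   1 = (A-, B-), 2 = (A+, B-), 3 = (A-, B+), 4 = (A+, B+)
recipes = [(a, b) for b, a in product((-1, 1), repeat=2)]

# The AB column is the product of the A and B signs in each row.
matrix = [(a, b, a * b) for a, b in recipes]


def sign(value):
    """Show +1 as '+' and -1 as '-'."""
    return "+" if value > 0 else "-"


print("Recipe  A  B  AB")
for number, row in enumerate(matrix, start=1):
    print(f"{number:>6}  " + "  ".join(sign(value) for value in row))
```

Each column contains two plus signs and two minus signs, which is the balance the analysis relies on.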

The perfect balance among test recipes (i.e. rows in the matrix) makes the analysis fairly simple. Looking at column A, you see two + recipes (2, 4) and two - recipes (1, 3). For the two A+ recipes, you have one B+ and one B-. Statistically, when you average recipes 2 and 4, the effect of B "averages out," so you end up with A+, the high price, independent of the impact of changing the offer. Therefore, the main effect of A is the average of recipes 2 and 4 minus the average of recipes 1 and 3. The effect of B is also calculated as the average of all + recipes (3, 4) minus the average of all - recipes (1, 2). The interaction effect is calculated similarly. Therefore, the effects are:
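The arithmetic can be sketched in Python as follows. The response counts are hypothetical, chosen only to illustrate the calculation; they are not the results of the test described here.

```python
# Design matrix from the text: one row per recipe, columns A, B, AB.
design = {
    1: (-1, -1, +1),
    2: (+1, -1, -1),
    3: (-1, +1, -1),
    4: (+1, +1, +1),
}

# HYPOTHETICAL responses (orders per 10,000 e-mails), for illustration only.
responses = {1: 200, 2: 140, 3: 240, 4: 230}


def effect(column):
    """Average response of the '+' recipes minus the average of the '-' recipes."""
    plus = [responses[r] for r, signs in design.items() if signs[column] > 0]
    minus = [responses[r] for r, signs in design.items() if signs[column] < 0]
    return sum(plus) / len(plus) - sum(minus) / len(minus)


effect_a = effect(0)   # main effect of A (price)
effect_b = effect(1)   # main effect of B (offer)
effect_ab = effect(2)  # AB interaction effect

print(effect_a, effect_b, effect_ab)
```

With these made-up counts, the higher price lowers response by 35 orders per 10,000 on average, the free gift raises it by 65, and the interaction effect is 25.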
The interaction effect is more difficult to interpret, but a simple plot can help. In the chart below, the top line shows both recipes with the free gift (B+) and the bottom line is recipes with the discount (B-). The left side shows both recipes with the lower price (A-) and the right points are recipes at the higher price (A+). An interaction is present when the lines are not parallel.

Looking at this plot, the lower price and free gift pull best (upper left), but the higher price leads to a negligible drop in response when the free gift is offered (upper right). Therefore, the team should ask, "Is the $29.99 price more profitable if we offer a free gift?"
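The "not parallel" check can also be done numerically. The sketch below uses hypothetical response counts (not the article's data) shaped like the pattern just described, and compares the slope of the free-gift line with the slope of the discount line. The difference between the two slopes is exactly twice the interaction effect.

```python
# HYPOTHETICAL responses (orders per 10,000 e-mails), keyed by (A, B) sign.
# A: -1 = lower price, +1 = higher price; B: -1 = discount, +1 = free gift.
responses = {
    (-1, -1): 200,
    (+1, -1): 140,
    (-1, +1): 240,
    (+1, +1): 230,
}


def price_slope(b):
    """Change in response when moving from the lower to the higher price,
    holding the offer at level b."""
    return responses[(+1, b)] - responses[(-1, b)]


slope_gift = price_slope(+1)      # top line in the plot (B+)
slope_discount = price_slope(-1)  # bottom line in the plot (B-)

# Interaction effect: average of the AB+ recipes minus average of the AB- recipes.
interaction = (
    (responses[(-1, -1)] + responses[(+1, +1)]) / 2
    - (responses[(+1, -1)] + responses[(-1, +1)]) / 2
)

lines_parallel = slope_gift == slope_discount
print(slope_gift, slope_discount, interaction, lines_parallel)
```

Here the higher price costs 10 orders per 10,000 with the free gift but 60 with the discount, so the lines are not parallel and an interaction is present.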
Comparing this simple "full-factorial" matrix design to common split-run tests:
Now let's look at a large test that really leverages the creative freedom and statistical power that matrix designs offer.
One E-mail Test of 19 Elements
With growing competition, dropping sales, and a small budget, one marketing VP needed to find new ways to increase profit. With speed, flexibility, and low production cost, e-mail was the ideal channel for testing. In addition, what works in e-mail may translate directly to their website and may also work in print.
After running numerous A/B splits, the marketing executive decided the benefits of multivariable matrix testing were worth a try. Working with an expert, the team brainstormed 101 different ideas and zeroed in on 19 specific elements they wanted to test in their next drop of 500,000 names. Elements included products, prices, offers and call-to-action, copy, graphics, links, and layout.
After defining the 19 test elements, the team created the control (-) level and the test level (+) for each element. Then the test expert gave them the list of 20 recipes they needed to create. The additional effort to create all 20 recipes added just two days to their usual schedule. All e-mails were dropped at the same time and data were collected and analyzed over the next few days. All 19 main effects are summarized in the following bar chart.

In this chart, the test elements are listed on the left (with the test level shown in parentheses for each significant effect). The "line of significance" below the 9th element, N, is a measure of experimental error: everything above the line is statistically significant and everything below can be explained by random market variation. The length of each bar shows the magnitude of the effect and the label at the end shows whether the test level increased response ("helps") or decreased response ("hurts") and by how much.
On the next drop, they implemented the five changes proven to increase sales and avoided the four "good" ideas that hurt. Response jumped 24% over the control. Interestingly, if they had tested all 19 elements as separate one-variable tests with the same 500,000 names, nothing would have been statistically significant. If they had selected two or three elements for split-run tests, they might have gained some improvement, but nowhere near the 24% lift from implementing the optimal combination of all 19 elements at once. They also found two interactions (not shown here) that one-variable tests would never have uncovered.
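A rough sketch of why the separate tests would have had less power: in a balanced matrix test every effect compares half of the names against the other half, while separate tests must divide the names among many small cells. The calculation below rests on assumptions not stated in the article: a hypothetical 2% response rate, names split equally across recipes, and a split-run layout of one control cell plus 19 test cells of equal size.

```python
import math

TOTAL_NAMES = 500_000
ELEMENTS = 19
BASE_RATE = 0.02  # HYPOTHETICAL response rate; the article does not give one


def se_difference(rate, n_plus, n_minus):
    """Standard error of the difference between two response rates."""
    return math.sqrt(rate * (1 - rate) * (1 / n_plus + 1 / n_minus))


# Matrix test: each element is at '+' for half of the names and '-' for the rest.
se_matrix = se_difference(BASE_RATE, TOTAL_NAMES // 2, TOTAL_NAMES // 2)

# ASSUMED split-run layout: one control cell plus 19 test cells, all equal in size.
cell = TOTAL_NAMES // (ELEMENTS + 1)
se_split = se_difference(BASE_RATE, cell, cell)

ratio = se_split / se_matrix
print(f"names per split-run cell: {cell:,}")
print(f"standard error, matrix test: {se_matrix:.5f}")
print(f"standard error, split runs:  {se_split:.5f}")
print(f"ratio: {ratio:.2f}")
```

Under these assumptions the standard error of each split-run comparison is about 3.2 times (the square root of 10) that of the matrix test, whatever response rate is assumed, so an effect has to be roughly three times larger before a split run can detect it.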
Diverse Test Designs Offer Flexibility and Power
The full-factorial type of design used in the first example does not quite work for this 19-element test, since it would require 524,288 test recipes (2^19). However, more advanced designs balance the simplicity of A/B splits with the efficiency of full-factorial designs. In the 19-element test, statistical techniques were used to select as few as 20 recipes out of the half-million combinations that provided the greatest wealth of information with the least amount of data.
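To show how 19 two-level elements can fit into 20 balanced recipes, the sketch below builds a 20-run Plackett-Burman design. The article does not say which design was actually used, so treat this as one standard option of the right size rather than as the design behind the results above.

```python
ELEMENTS = 19

# Full factorial: every combination of two levels across 19 elements.
full_factorial_recipes = 2 ** ELEMENTS
print(f"full-factorial recipes: {full_factorial_recipes:,}")

# Generator row of the 20-run Plackett-Burman design.
GENERATOR = "++--++++-+-+----++-"
first_row = [1 if symbol == "+" else -1 for symbol in GENERATOR]

# Recipes 1-19 are cyclic shifts of the generator row;
# recipe 20 sets every element to its '-' (control) level.
design = [first_row[-shift:] + first_row[:-shift] for shift in range(ELEMENTS)]
design.append([-1] * ELEMENTS)


def column(j):
    """Signs of element j across all 20 recipes."""
    return [row[j] for row in design]


# Balance: every element is '+' in 10 recipes and '-' in 10.
balanced = all(sum(column(j)) == 0 for j in range(ELEMENTS))

# Orthogonality: for any two elements, the sign products across recipes sum to
# zero, so each main effect can be estimated independently of the others.
orthogonal = all(
    sum(x * y for x, y in zip(column(i), column(j))) == 0
    for i in range(ELEMENTS)
    for j in range(i + 1, ELEMENTS)
)

print(f"recipes: {len(design)}, balanced: {balanced}, orthogonal: {orthogonal}")
```

Because every column is balanced and every pair of columns is orthogonal, each main effect is still calculated as the average of the ten "+" recipes minus the average of the ten "-" recipes, just as in the four-recipe example.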
Cutting-edge designs offer a wide range of test options. Between the two extremes—split-run tests and large full-factorial designs—are designs to test many elements quickly, few elements in greater depth, and elements with multiple levels. Whatever your goals, an expert can help you design and execute the right test to streamline learning and increase ROI. For example, you can test:
The Opportunities are Wide Open
Multivariable matrix tests let you cast a wide net for new ideas, increase the power and efficiency of your tests, and gain deeper insights into main effects and interactions among numerous elements of your marketing programs.
Marketers who only test new subject lines, offers, and lists may be missing out on the profit and insights that can come from searching where competitors never look. Certainly, you need to leverage your experience, test bold changes, and focus most of your resources on what you believe are the most important elements. But step back every now and then and question. Question the experts, question your theories, and question whether more of the "art" of marketing can be quantified… especially when testing a multitude of elements costs little more than a test of one.
About the Author
GORDON H. BELL is the principal of consulting firm Montlake Marketing. Skilled in marketing science and strategy, Gordon has advanced the field of market testing and continues to teach and apply cutting-edge techniques. He can be reached at gbell@montlakemarketing.com or (865) 693-1222.