Rules for a successful multivariate test (Billy’s Optimization Guide Part 3)


Rules of Six Detail

If you missed it, see Part 1 (A/B Split Testing) and Part 2 (Multivariate Test Basics).

With the basics of Part 2 down, it’s time to start designing a multivariate test.  Every optimization project has different challenges and goals.  Luckily, a few rules apply to every multivariate test design.  These rules fit into two categories: technical rules and content rules.

Technical rules:

  1. Choose the appropriate multivariate test type (full or fractional factorial)
  2. Determine the number of factors and levels that can be tested based on estimated conversion traffic (choose a test array)
  3. Stop the test when it has stabilized, not based on your earlier estimations

These rules ensure statistical significance by constraining the test to the appropriate size at the beginning and then letting the test gather the proper amount of data at the end.

Running a full factorial test, if your traffic supports it, may be a good choice if you believe the content you’re testing has many interactions or if you only want to test 2 factors with 2 levels each.  (Note: the smallest fractional factorial test size is 3 factors with 2 levels each.)  Typically, though, you’ll want to run a fractional factorial test to save time and expand the number of factors and levels you can test.
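To make the size difference concrete, here is a minimal Python sketch of the smallest case mentioned above: 3 factors with 2 levels each.  It builds the full factorial (8 experiments) and a standard 4-experiment half fraction, then checks that the fraction is balanced.  The names `full`, `fraction` and `is_balanced` are my own for illustration and are not taken from any testing tool.

```python
from itertools import combinations, product

# Full factorial for 3 factors with 2 levels each: 2**3 = 8 experiments.
full = list(product([0, 1], repeat=3))

# Half fraction: keep only the runs where the third factor's level equals
# the XOR of the first two. This yields a 4-experiment orthogonal array.
fraction = [run for run in full if run[2] == run[0] ^ run[1]]


def is_balanced(runs):
    """True if every pair of factors shows each level pairing equally often."""
    n_factors = len(runs[0])
    for a, b in combinations(range(n_factors), 2):
        counts = {}
        for run in runs:
            key = (run[a], run[b])
            counts[key] = counts.get(key, 0) + 1
        # Two levels per factor gives 4 possible pairings per factor pair.
        if len(counts) != 4 or len(set(counts.values())) != 1:
            return False
    return True


print(len(full), len(fraction))
print(fraction)
print(is_balanced(fraction))
```

The fraction needs half the experiments of the full design, which is where the time savings come from.  The trade-off is that it can no longer separate some interactions between factors from the main effects.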

To find out how many factors and levels you can test, you need some idea of your predicted page views and conversions, as well as an estimate of lift.  Lift matters because a large lift gets you more conversions, so your test stabilizes more quickly.  Because of this, I would be conservative with lift estimates to ensure that the test is not designed too large.  At Widemile, we have a large list of arrays available to our tool and have calculated the approximate conversions each one needs to stabilize, allowing me to look at the three criteria I listed and find the arrays that are statistically viable for testing.  You should look for something similar with your tool of choice.
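As a rough illustration of this sizing step, the Python sketch below filters a list of candidate test sizes by the conversions you expect to collect.  Everything in it is an assumption made for the example: the function name `viable_designs`, the threshold of 100 conversions per experiment, and the simple model that lift scales conversions proportionally.  Your tool’s real stabilization numbers will differ.

```python
def viable_designs(baseline_conversions, lift, designs, per_experiment=100):
    """Return the designs whose conversion needs fit the expected traffic.

    baseline_conversions: conversions expected during the test at the
        current conversion rate (predicted page views x conversion rate).
    lift: estimated relative lift, e.g. 0.1 for 10%. Keep it conservative.
    designs: mapping of design name -> number of experiments it contains.
    per_experiment: ASSUMED conversions each experiment needs to stabilize.
    """
    expected = baseline_conversions * (1 + lift)
    return [
        name
        for name, experiments in designs.items()
        if experiments * per_experiment <= expected
    ]


# Hypothetical candidate sizes, listed smallest to largest.
designs = {
    "4 experiments": 4,
    "8 experiments": 8,
    "16 experiments": 16,
    "256 experiments (4x4x4x4 full factorial)": 256,
}

print(viable_designs(1000, 0.1, designs))
```

With 1,000 baseline conversions and a 10% lift estimate, only the 4- and 8-experiment designs fit under these assumed numbers.  An optimistic lift estimate would let a larger design through, which is why the estimate should stay conservative.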

To figure out when a test has stabilized, I prefer to look primarily at level influence stabilization, with experiment conversion rate stabilization for support.  Widemile Optimize shows this using graphs, so I simply look for lines that trend horizontally, meaning winning levels and experiments stay winners and their levels of influence or conversion rates stay fairly constant over 3-5 days.  If you don’t have graphs available, look at the historical cumulative conversion rate for your experiments and see whether it still varies much over the latest few days of your test.
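If you only have daily numbers to work with, the check described above can be sketched in a few lines of Python.  This is one possible reading of "stays fairly constant", not how Widemile Optimize calculates it: the 5-day window and the 5% tolerance are assumptions you should tune, and the function names are hypothetical.

```python
def cumulative_rates(daily_visitors, daily_conversions):
    """Cumulative conversion rate at the end of each day of the test."""
    rates = []
    visitors = conversions = 0
    for v, c in zip(daily_visitors, daily_conversions):
        visitors += v
        conversions += c
        rates.append(conversions / visitors if visitors else 0.0)
    return rates


def is_stable(rates, window=5, tolerance=0.05):
    """True if the rate moved less than `tolerance` (relative to its mean)
    over the last `window` days. Both defaults are assumptions."""
    if len(rates) < window:
        return False
    recent = rates[-window:]
    mean = sum(recent) / window
    if mean == 0:
        return False
    return (max(recent) - min(recent)) / mean <= tolerance


# Made-up data for one experiment: noisy early days, then settling down.
visitors = [1000] * 10
conversions = [80, 30, 60, 45, 52, 50, 49, 51, 50, 50]

rates = cumulative_rates(visitors, conversions)
print(is_stable(rates[:5]))  # first five days only
print(is_stable(rates))      # all ten days
```

Run the same check for every experiment in the test, and also confirm that the ranking of winners has not changed over the window.  A flat line for one experiment is not enough on its own.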

Content rules:

  1. Every item you test should answer an important question
  2. Test variety not quantity
  3. Test opposites first then refine
  4. Remember you can run more than one test

The content rules are closely tied together.  In effect, they ensure that the items selected for testing have purpose and that they don’t needlessly expand the size of your test, reducing its efficiency.  I begin designing tests by creating hypotheses about issues with the page and then choosing factors and designing levels to address those issues.

An example hypothesis is “Having a hero shot on the right side of the page causes users to ignore the important value proposition on the left side.”  To test this, I would choose hero shot position as a factor and then have “left side hero shot” as the baseline level and “right side hero shot” as the second level.  This example also illustrates that you aren’t limited to headlines and images: testing layout is possible with creative use of CSS and sometimes JavaScript.  As long as you can switch from one level to another and each level works with the other factors and levels, you are at liberty to test anything.

Coming back to the rules, make sure that you are testing as few items as possible to find out what you need.  Before testing a collection of lifestyle hero shots, choose one and test it against an iconic hero shot.  This will save you the time of going down a path of testing something that may not work.

Lastly, you aren’t going to be able to get the best page on the first run or even second, third, etc.  If you knew what your audience liked 100% of the time then you wouldn’t need testing.  Remember to think of your overall test plan beyond just the first run, so that you can answer all the questions you need without having to force everything into one test.

In summary, determine what you’re trying to achieve, select the proper testing method to meet those goals and then make sure to be purposeful and efficient with the content you end up testing in front of your visitors.  Testing and optimization is not difficult, although it can be tough to start.  Follow these rules and you’ll be on your way to conquering conversion rates, bounce rates, funnel drop-offs and many other metrics.

Photo credit: Aranda\Lasch (CC)

Breaking down multivariate testing (Billy’s Optimization Guide Part 2)


If you missed it, see Part 1 (A/B Split Testing).  Update: Part 3 on Rules for a Successful Multivariate Test is here.

The technical and statistical aspects of multivariate testing can be complicated, but to design successful tests you don’t need to know everything, just the basics of how it works and some guidelines.  I’m assuming you already have some understanding of multivariate testing.  Still, I want to cover the basics and make sure we’re on the same level before going into how to design good multivariate tests.

Check out the wireframe below.  Pretty standard for a landing page, right?  To properly design a multivariate test, we have to look at the page in a certain way.  Using three key terms, factors, levels and experiments, we can break down a test and describe its framework.

[Figure: wireframe of the example landing page]

Factor: An element of the Web page (headline, image, text) being tested.  The element can also be groups of content, e.g. left column, button and hero shot together, or all banner ads on the page.

Level: Content that is assigned to a specific factor to be tested.  For example, one variation of a hero shot.

Below are 4 factors from our example page (headline, hero shot, offer and button), each shown with 4 levels represented by the different colors.  Note that the levels of one factor do not have to relate in any way to the levels of other factors.

[Figure: factors and levels]

The last term, experiments, makes use of both factors and levels.

Experiment: A unique combination of levels used during a test.

Here you can see 4 different experiments.  Each experiment is different and holds different combinations of levels.  Note that there actually are many more variations (4×4×4×4 = 256 combinations).

[Figure: experiments example]
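The count of 256 is easy to verify by enumerating every combination.  The short Python sketch below does that for the four factors from the example page.  The level labels 1 to 4 are placeholders of my own, since the real levels would be actual headlines, images and so on.

```python
from itertools import product

# Four factors from the example page, each with four placeholder levels.
factors = {
    "headline": [1, 2, 3, 4],
    "hero shot": [1, 2, 3, 4],
    "offer": [1, 2, 3, 4],
    "button": [1, 2, 3, 4],
}

# An experiment is one unique combination of levels, one level per factor.
experiments = [
    dict(zip(factors, combo)) for combo in product(*factors.values())
]

print(len(experiments))  # 256
print(experiments[0])
print(experiments[-1])
```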

Essentially a multivariate test involves showing these experiments randomly to live traffic, while tracking how each experiment performs.  The one that performs the best wins.  Each experiment is shown to many people, but each person only sees one experiment.  (There is some complexity in this, if you are still confused or want to know more, go to my primer on full and fractional factorial testing.)
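One common way to make sure each person sees only one experiment is to derive the assignment from a visitor identifier instead of rolling the dice on every page view.  The Python sketch below hashes a visitor ID into an experiment number.  This is a substitute technique I am using for illustration, not a description of how any particular testing tool assigns traffic, and `assign_experiment` and the visitor IDs are hypothetical.

```python
import hashlib


def assign_experiment(visitor_id: str, n_experiments: int) -> int:
    """Map a visitor to one experiment, always the same one for that visitor.

    The hash spreads visitors roughly evenly across experiments, so the
    split looks random overall while staying stable per visitor.
    """
    digest = hashlib.sha256(visitor_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_experiments


# The same visitor gets the same experiment on every visit.
first_visit = assign_experiment("visitor-123", 256)
return_visit = assign_experiment("visitor-123", 256)
print(first_visit == return_visit)  # True
```

In practice the visitor ID would usually come from a cookie, so a returning visitor keeps seeing the same version of the page.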

In my next post, I will use these terms to outline the rules for creating a great multivariate test.

How to do efficient optimization


A beginner’s mistake is to test every idea with every test. This seems like the most obvious way of being efficient. If I can test 50 things in a week, why not?

In my experience, efficiency has more to do with careful test design and doing things right the first time than with trying to test everything and rushing the process. By testing a few big ideas quickly and then designing the next test based on those results, you can run a set of small tests and get answers fast without exposing your page to many bad ideas.

Every test should have specific questions it’s trying to answer. Not just “What’s the best performing page?” but questions that lead to that. A car salesman doesn’t blindly try every tactic in the book to get you to buy a car. A real salesman probes you with a few questions and changes their technique accordingly.

That’s how you should design your tests.

Here’s an example test plan that works for most clients:

  • Step 1 (Split Test) – Find an optimal template/design: What template and/or design effectively gets visitors to stick, click and convert? At this stage, you aren’t testing messaging yet; you’re merely re-skinning and moving elements around to find a good design. Some techniques to use are simplifying the page by de-emphasizing unimportant content (shrink the company logo, move ads to the bottom of the page), emphasizing core content (moving 3rd party validation near the call to action) and adding more whitespace to the page to enhance readability. These are in addition to a well-done creative design. This test usually has the greatest impact, although it all depends on your current page and the audience. (Read more on template testing)
  • Step 2 (Multivariate Test) – Find the biggest converting segment: This test focuses on finding the correct messaging by appealing to different segments that you know or hypothesize visit your page. If your product were Google Apps, you might test appealing to business users and freelancers. Or if you are selling a cell phone, you might test features against benefits.
  • Step 3 (Multivariate Test) – Find the perfect way to communicate to the segment: Step 2 points you in the right direction, but this step helps you find the exact place you should be with your page. Use what you learned (freelance messaging won) and try variations on that winning theme to really grab your audience and give them what they want. Also, step 2 may have revealed 2 or more segments that are worth targeting. If you can segment them out, run multiple tests that are customized for each segment, and you’ll raise conversions even higher.

The alternative is to test 50 ideas, many of which overlap. Why test any ideas that are remotely similar until you know that they work in general? If I go to a dealership wanting a sports car and the dealer offers me 5 colors of minivans, I’m still not going to buy a minivan. Show me 4 types of cars, let me pick the one I like and then we might talk about color.

Let your visitors lead you!

This really is a simple process, but it drives results. Be methodical to be efficient. By course correcting in each test, you get closer and closer to what you need and don’t spend a lot of time testing losing elements. Follow a test plan like this and you’ll get results and learn a lot about your core converting visitors.

Google Web Optimizer officially launched, no AdWords required


I just got news that Google Web Optimizer is out of beta. In addition, it no longer requires an AdWords account. This is great news for the testing industry and for all online marketers. Check it out here. There is also now a dedicated Official Google Web Optimizer blog.

I’ll see if I can get some tests running just to see what the isolated tool looks like versus the integrated one. They also upgraded the setup of multivariate tests for all versions.

On another note, it’s good to see Google saying things like “it’s hard to find a serious advertiser who doesn’t at least plan to do content testing this year.” They even mention some best practices that I’ve talked about on this blog:

  • “don’t be shy: big changes generally yield big differences in performance”
  • “We recommend letting your experiments run for at least two weeks, no matter how much traffic you get and how strong the results initially appear, just so the data has enough time to normalize.” – I recommended the same things in my Multivariate Testing Primer.

Also there’s a forum for Google Web Optimizer users, which isn’t new, but expect it to grow quickly with this latest announcement.

If you’re waiting for the last post in my 3 part series about difficult test results, I apologize. I’ve been sick all week and wanted to go over my last post with Vladimir Brayman, Widemile’s chief scientist, before I posted it for the world to see. It’s a very important topic and a challenging one too. I’ll try to get it out next week for sure.

1 quick but powerful test design tip


Find out if it works

I was going over my testing plans with my boss, Frans Keylard, today and he reminded me of a very powerful rule.

Test if something works before you try variations of it.

In this case, I was testing out two testimonials. They were quite different in their messaging, but do I even know if testimonials are read or impact visitors at all? If I test a testimonial against no testimonial, I will immediately know if I should continue trying testimonials. If testimonials win or compare favorably against having no testimonial, then I know to test additional testimonials.

Not that I have never tested factors on/off or tried totally different factors, e.g. a testimonial against a product shot. I had a strong feeling testimonials were going to work, so I assumed they would, although I know I shouldn’t assume anything. An honest mistake, but a good reminder.

Ideally I would be testing variations along with showing nothing, or “off”, as a variation. In this case, however, the page didn’t get much traffic, so I was limiting my testing to the most important variations and factors.

There might be some fringe cases where this isn’t necessarily true, but in most cases you should just save extra variations for future runs and first find out if your factor has any impact on the page. Maybe I need to read some of my old posts more often.