    An Essential Primer on Full and Fractional Factorial Test Design

    By Billy | July 24, 2008

    What are full and fractional factorial test designs? How do they relate to optimization and what about interactions?

    Once you get down and dirty with testing, these questions matter. Whether selecting an optimization platform or trying to thoroughly understand the tests you are building, grasping these concepts will put you in greater control and allow you to design and analyze your tests more effectively.

    As simply as possible, I hope to educate you and other marketers about full and fractional factorial test designs and why fractional factorial is the best choice for multivariate testing of online campaigns.

    Note: “Partial factorial” and “fractional factorial” are the same thing. Also, if you don’t have a thorough understanding of experiments and interactions, please read up on those topics first.

    The tests used in optimization are from the design of experiments field. (From Wikipedia: “Design of experiments is the design of all information-gathering exercises where variation is present, whether under the full control of the experimenter or not.”) The two types of tests I will focus on are fractional factorial and full factorial.

    Here is an example I will use to explain these concepts. Below is a test matrix outlining a test for a landing page with 5 factors with 2 levels each. Don’t let the vocabulary scare you away. This simply means that there are 5 parts of the page being tested and 2 variations of each.

    [Figure: Recipe Matrix. 5 factors = 5 parts (hero shot, headline, etc.) and 2 levels = 2 variations]

    These factors and their respective levels make up the possible combinations for a landing page. The combinations displayed are called experiments.

    Let’s calculate the total number of possible experiments (even if you already know how to do this, it is important for understanding the distinction between fractional and full factorial). There are 2 levels for each factor, so you can have 2×2×2×2×2 (2 to the 5th power) = 32 possible experiments. This means there are exactly 32 combinations of hero shots, headlines, sub headlines, button text and main copy from the matrix outlined above. Note that if we add another 2-level factor, it becomes 2 to the 6th power, or 64 possible experiments. Likewise, if you add 2 more levels to any one of the existing 5 factors, the total increases from 32 to 4×2×2×2×2 = 64 experiments.
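The counting above can be checked with a short Python sketch. The factor names follow the article, but the level labels ("A", "B") are placeholders assumed for illustration, since the original matrix image is not reproduced here.

```python
from itertools import product
from math import prod

# Five factors with two levels each. The level labels are hypothetical
# placeholders; only the counts matter for this example.
factors = {
    "hero_shot": ["A", "B"],
    "headline": ["A", "B"],
    "sub_headline": ["A", "B"],
    "button_text": ["A", "B"],
    "main_copy": ["A", "B"],
}


def count_experiments(levels_per_factor):
    """Number of possible experiments = product of the level counts."""
    return prod(levels_per_factor)


# Enumerate every combination of levels (the full factorial set).
experiments = list(product(*factors.values()))

base_count = len(experiments)                        # 2^5
with_sixth_factor = count_experiments([2] * 6)       # 2^6
with_four_levels = count_experiments([4, 2, 2, 2, 2])

print(base_count, with_sixth_factor, with_four_levels)
```

Enumerating the combinations and multiplying the level counts give the same number, which is why the total grows so quickly as factors or levels are added.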

    In testing, each experiment must get a minimum number of measurable conversions, known as the sample size per experiment. This ensures that there is enough data for a solid statistical analysis. Therefore, the more experiments you have, the more conversions you need. You can also think of conversion data as time, since the longer you leave your web page up, the more data you get.

    Now we’re ready to go back to the difference between the two test designs. Full factorial testing requires that every possible experiment is shown, so our 5-factor test would need to display all 32 experiments. This means that if the sample size is 100 conversions per experiment, 3,200 conversions will be required. Fractional factorial works differently: it displays a much smaller number of experiments, about 8 in this case, so it would need about 800 conversions.
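As a rough sketch of that arithmetic in Python: the experiment counts (32 and 8) and the sample size of 100 come from the text above, while the daily conversion rate of 50 is purely a hypothetical figure, assumed here only to show how conversions translate into test duration.

```python
from math import ceil


def required_conversions(num_experiments, sample_size_per_experiment):
    """Total conversions needed when every experiment must reach the sample size."""
    return num_experiments * sample_size_per_experiment


def estimated_days(total_conversions, conversions_per_day):
    """Days needed to collect the conversions, rounded up to whole days."""
    return ceil(total_conversions / conversions_per_day)


SAMPLE_SIZE = 100         # conversions per experiment, from the example above
CONVERSIONS_PER_DAY = 50  # hypothetical traffic figure, not from the article

full_total = required_conversions(32, SAMPLE_SIZE)
fractional_total = required_conversions(8, SAMPLE_SIZE)

print(full_total, estimated_days(full_total, CONVERSIONS_PER_DAY))
print(fractional_total, estimated_days(fractional_total, CONVERSIONS_PER_DAY))
```

Whatever the real traffic level is, the ratio stays the same: a quarter of the experiments needs a quarter of the conversions, and therefore roughly a quarter of the time.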

    Since full factorial gathers additional data, it reveals all possible interactions, but as seen by the numbers above, there is a trade-off. More data equals more information but more data also equals a longer test duration. The minimum data requirements for full factorial are very high since you are showing every experiment.

    Even if you are using full factorial to get the same amount of information as a fractional factorial test, it will take more time since you need more data to see statistically relevant differences between the many experiments.

    You might be wondering how fractional factorial can be accurate if interactions are possible.

    Random interactions of high relevance are very rare, especially when looking for interactions of more than 2 factors. You really need to design tests where you look for meaningful interactions that are based on true business requirements rather than hoping for a random and low influence interaction between a red button, a hero shot and a headline.

    Whatever the interaction is, you need to understand your audience and be able to infer why the interaction occurred in the first place. Only then are you ready to start designing for interactions.

    Tests should not be filled with random levels. They should be carefully designed for success by focusing on testable hypotheses about the audience. Could a 1-pixel drop shadow on a button interacting with the copyright statement ever be truly significant, and not a victim of random error? Is it worth sacrificing thousands of conversions to learn a lesson that won’t result in any relevant increase in real-world conversions?

    Some interactions might make sense to measure, and others should be left out because of the amount of testing time they add.

    This brings me to fractional factorial. It is possible for fractional factorial tests to detect interactions. How so? Using our example of a 5-factor test, a fractional factorial design can include everything from main effects only all the way up to 4-factor interaction effects. The only difference with full factorial is that it is the full extension and also includes the 5-factor interaction effect.
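To make the range of effects concrete, here is a small Python sketch that counts how many distinct effects exist at each order for a 5-factor test. It uses the standard combinatorial count (the number of ways to choose k factors out of 5); the variable names are my own, not from any testing tool.

```python
from math import comb

NUM_FACTORS = 5

# Order 1 = main effects, order 2 = 2-factor interactions, and so on.
effects_by_order = {k: comb(NUM_FACTORS, k) for k in range(1, NUM_FACTORS + 1)}

# Everything a fractional design could cover at most: orders 1 through 4.
up_to_four_factor = sum(effects_by_order[k] for k in range(1, NUM_FACTORS))

# Full factorial adds the single 5-factor interaction on top of that.
all_effects = sum(effects_by_order.values())

for order, count in effects_by_order.items():
    print(f"order {order}: {count} effects")
print(up_to_four_factor, all_effects)
```

There are only 5 main effects but 26 interaction effects of various orders, which gives a sense of how much of the full factorial's extra data goes toward interactions.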

    Fractional factorial is not a one-trick pony. It is a continuum ranging from testing for no interactions (only main effects) to testing interactions one order below full factorial. It is exactly what the name implies: even one order less is a “fraction” of full factorial. It gives you the power to make trade-offs, anywhere from testing only main effects to testing for interactions, based on intelligent test design.
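For readers curious what the main-effects end of that continuum can look like, below is a minimal Python sketch of one textbook 8-run design for 5 two-level factors. It is an assumption on my part, not the design any particular platform uses: levels are coded as -1 and +1, three factors get all 8 combinations, and the remaining two are derived from the generators D = A×B and E = A×C. The price of the smaller size is that D and E can no longer be separated from those two interactions.

```python
from itertools import combinations, product


def fractional_design():
    """Build an 8-run design for 5 two-level factors (levels coded -1/+1)."""
    runs = []
    for a, b, c in product((-1, 1), repeat=3):  # full factorial in A, B, C
        d = a * b  # generator D = A x B (assumed for illustration)
        e = a * c  # generator E = A x C (assumed for illustration)
        runs.append((a, b, c, d, e))
    return runs


def is_balanced(design):
    """Each factor shows each of its two levels equally often."""
    return all(sum(column) == 0 for column in zip(*design))


def is_orthogonal(design):
    """Every pair of factor columns is uncorrelated across the runs."""
    columns = list(zip(*design))
    return all(
        sum(x * y for x, y in zip(c1, c2)) == 0
        for c1, c2 in combinations(columns, 2)
    )


design = fractional_design()
for run in design:
    print(run)
print(len(design), is_balanced(design), is_orthogonal(design))
```

Because the columns are balanced and orthogonal, each main effect can be estimated from just these 8 experiments, as long as you are willing to assume the aliased interactions are negligible.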

    Once you decide to test for all possible interactions, you are committing to a full factorial test and incurring the associated traffic requirements. I’d love to see a test that is designed for full interactions and still makes sense! Not being able to reduce the number of interactions tested is a huge detriment, not a benefit, of solutions limited to full factorial testing.

    Radically shorter test times allow many more smart marketing ideas to be tested and adapted based on what you learn from each test run. You, the marketer, have the ability to analyze your results and tweak follow-on tests to capitalize on what you learn. This common-sense approach is what hypothesis-based testing is all about, and it is very powerful. Focus on testing smart ideas to increase your conversion rate – that’s what matters most.

    The graph below illustrates how much information is gained and the amount of testing needed, based on the number of interactions tested.

    [Figure: effects graph]

    In my experience, the red area shows how valuable the data is based on which effects are being tested, while the blue area shows the amount of data (or time) needed to gather the data to confirm those effects. The x-axis goes from left to right, from main effects to full factorial (5-factor effects).

    At Widemile, we believe it is more effective to perform quick, successive tests detecting only main effects rather than randomly hoping for interactions. While interactions might give you small or even large gains, they will likely never trump the gains from additional testing, nor make up for the time and money lost looking for random interactions. The additional time required for full factorial tests is large, and not many marketers want to wait more than a month for a test to complete.

    Fractional factorial is preferred by a few camps, including Widemile, Omniture’s Test&Target (formerly Offermatica) and Interwoven’s Optimost. Full factorial is used in Google’s free Website Optimizer and some tools offered by smaller providers.

    Testing for all interactions sacrifices a lot of time. With the speed that audiences, marketing campaigns and seasons can change, it is important to get the most testing done in the least amount of time without sacrificing the quality of the data. Fractional factorial allows you to do just that, making it the wisest choice for multivariate testing.

    4 Responses to “An Essential Primer on Full and Fractional Factorial Test Design”

    1. Shinchi Kotomaru Says:
      July 25th, 2008 at 1:27 pm

      Greetings,

      I appreciated this article very much. I have been studying DOE techniques (実験計画法) for social sciences at Kyoto Univ in Japan since 2007. One important thing to point out is that the ability to measure main effects is equal for both full and fractional tests.

      Along with the “camps” that you mention, we are using a product called SiteSpect for the study of controlled experiments using both fractional and full factorials. This has helped us with optimization in online media in Japan. They have a large repository of documentation that can be accessed from their website http://www.sitespect.com.

      thank you!
      ありがとうございました
      Shinchi

    2. Curious Cat Management Improvement Blog » Full and Fractional Factorial Test Design Says:
      July 29th, 2008 at 11:47 am

      [...] An Essential Primer on Full and Fractional Factorial Test Design Since full factorial gathers additional data, it reveals all possible interactions, but as seen by the numbers above, there is a trade-off. More data equals more information but more data also equals a longer test duration. The minimum data requirements for full factorial are very high since you are showing every experiment. [...]

    3. seo pixy Says:
      August 7th, 2008 at 3:03 am

      Congratulations for the great article! I found it very interesting and helpful, thanks:)

    4. Web Analytics Research Blog » Multivariate Testing News: Google Website Optimizer Now Features MVT Experiment Pruning Says:
      August 21st, 2008 at 11:05 am

      [...] announced the introduction of ‘Pruning’ as part of the GWO multivariate testing tool.  Billy Shih, an expert in Multivariate Testing as well as a close watcher of the Google Website Optimizer tool took note of this and immediately [...]
