
    What is Taguchi? How does it relate to testing?

    By Billy | February 14, 2008


    Multivariate testing is a buzzword these days, but the buzzword within multivariate testing seems to be Taguchi. However, that term is being abused. Do you know what Taguchi really means? I wasn’t entirely sure myself, so to get some background, I did some research and talked with Vladimir (Widemile’s Chief Scientist).

    The name and the method come from Genichi Taguchi. His method, also known as Robust Design, was developed to improve the quality of manufactured products. It therefore falls into an area of engineering called Quality Engineering.

    Does this sound aligned with website testing? Not really, and this is the problem with using the term Taguchi for website testing. The goals of manufacturing and the goals of a website are not the same.

    What most people are reaching for when they use the term Taguchi is fractional factorial test design. (I discussed this at length in my post about the difference between Widemile’s technology and Google Optimizer.) The Taguchi method uses a fractional factorial test design and falls under the umbrella of fractional factorial testing, but it is neither the only nor the best fractional factorial method. In fact, even within manufacturing, the Taguchi method inspired many new techniques, yet many statisticians find it flawed.*
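
    To make "fractional factorial" concrete, here is a minimal sketch in Python. The three page elements and their two versions each are hypothetical, and the half fraction shown is just one textbook way of picking a subset of recipes. It is not a description of Widemile's platform or of any other tool.

```python
from itertools import product

# Hypothetical page elements, each with two versions coded -1 / +1.
factors = ["headline", "image", "button"]

# Full factorial: every combination of the three factors (2**3 = 8 recipes).
full = list(product([-1, 1], repeat=len(factors)))

# Half fraction: keep only the recipes whose levels multiply to +1
# (defining relation I = ABC), which leaves 2**(3-1) = 4 recipes.
half = [run for run in full if run[0] * run[1] * run[2] == 1]


def is_balanced(runs):
    """True if each factor appears at -1 and +1 equally often."""
    return all(sum(run[i] for run in runs) == 0 for i in range(len(factors)))


def is_orthogonal(runs):
    """True if every pair of factor columns has a zero dot product."""
    n = len(factors)
    return all(
        sum(run[i] * run[j] for run in runs) == 0
        for i in range(n)
        for j in range(i + 1, n)
    )


print(f"full factorial recipes: {len(full)}, half fraction recipes: {len(half)}")
for run in half:
    print(dict(zip(factors, run)))
print("balanced:", is_balanced(half), "orthogonal:", is_orthogonal(half))
```

    The half fraction tests four recipes instead of eight and still lets each element's main effect be estimated on its own. The price is that, in this particular fraction, each main effect is confounded with the interaction of the other two elements.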

    It is important to differentiate the Taguchi method from fractional factorial test design, since one was built for manufacturing while the other belongs purely to design of experiments. You need to ensure that the math and science behind your testing are based on methods whose only end goal is optimizing your website. So if your testing tool uses the Taguchi method for testing, you had better ask what that really means.

    So does Widemile use Taguchi? We don’t use the Taguchi method, but we do use fractional factorial test design. I like to say that our platform goes beyond Taguchi because it was specifically made for optimizing web content.

    Don’t get sucked into the Taguchi method. It is just a buzzword used by your fellow marketers. Just because a technology doesn’t use Taguchi doesn’t mean you should count it out.

    *Read on for Vladimir’s explanation of the Taguchi method and its criticisms.
    The following is written by Vladimir Brayman, Chief Scientist at Widemile. If you have any questions for him or me, leave a comment and I will try to get back to you ASAP.

    Sometimes the term Taguchi method is used mistakenly to mean fractional factorial design. In fact, the Taguchi method is much narrower in both its scope and objectives. The Taguchi method (also known as robust design) belongs to an engineering discipline called Quality Engineering. The main objective of Quality Engineering design is to minimize variability in the performance of a product under different environmental conditions. The main characteristics of the Taguchi method stem from that objective. Among the steps involved in the Taguchi method are:

    1. Defining two types of factors, control and noise. Control factors can be manipulated by a production team during the manufacturing process whereas noise factors model environmental impacts on the product and thus cannot be controlled precisely.
    2. Defining two orthogonal arrays – usually with mixed levels and of strength 2 (this implies that only main effects can be detected) – one array for the control factors and the other for the noise factors.
    3. Maximizing the signal-to-noise ratios, a logarithmic function of the ratio between the square of the average responses due to the control factors and the estimate of the variance due to the noise factors.
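
    As a rough illustration of step 3, the sketch below computes a signal-to-noise ratio for two control-factor settings. It assumes the commonly used "nominal-the-best" form, 10 * log10(mean^2 / variance), which matches the description above but is only one of several S/N definitions in the robust design literature. The measurements and setting names are made up.

```python
import math
from statistics import mean, variance


def signal_to_noise(responses):
    """Nominal-the-best S/N ratio in decibels: 10 * log10(mean^2 / variance).

    Assumes at least two responses that are not all identical, since the
    sample variance must be positive.
    """
    return 10 * math.log10(mean(responses) ** 2 / variance(responses))


# Hypothetical measurements: one entry per control-factor setting,
# one value per noise-factor setting that the product was exposed to.
observations = {
    "control setting A": [10.0, 10.2, 9.8, 10.0],
    "control setting B": [10.0, 12.0, 8.0, 10.0],
}

ratios = {name: signal_to_noise(ys) for name, ys in observations.items()}
best = max(ratios, key=ratios.get)

for name, ratio in ratios.items():
    print(f"{name}: S/N = {ratio:.2f} dB")
print("most robust setting:", best)
```

    Both settings hit the same average response, but setting A varies far less across the noise conditions, so it gets the higher ratio. That is the sense in which the method rewards robustness rather than a higher average.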

    Statisticians have criticized some researchers’ unjustified claims that the Taguchi method is almost limitlessly applicable. Among the critiques are:

    1. There is no possibility of detecting interactions among the control factors.
    2. There are N1*N2 observations needed, where N1 is the number of level combinations of the control array and N2 is that of the noise array. However, the confounding structure for the control factors is the same as that of an array of size N1. This implies that the same resolution can be obtained with a much smaller number of runs.
    3. The influence of the noise factors on the response variables cannot be detected.
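
    The second critique is easy to see by counting runs. The sketch below crosses a small control array with a small noise array. Both arrays are hypothetical and chosen only to keep the numbers small.

```python
from itertools import product

# Hypothetical control array: 4 runs for 3 two-level control factors.
control_array = [(-1, -1, 1), (-1, 1, -1), (1, -1, -1), (1, 1, 1)]
# Hypothetical noise array: 4 runs for 2 two-level noise factors.
noise_array = [(-1, -1), (-1, 1), (1, -1), (1, 1)]

# In the crossed layout every control run is repeated under every noise run.
crossed_plan = list(product(control_array, noise_array))

n1, n2 = len(control_array), len(noise_array)
print(f"observations needed: {len(crossed_plan)} (N1*N2 = {n1}*{n2})")

# Only N1 distinct control settings are ever tried, so the control factors
# are confounded exactly as they would be in the 4-run array alone.
distinct_control = {control for control, _ in crossed_plan}
print(f"distinct control settings: {len(distinct_control)}")
```

    Here 16 observations are collected, yet what can be separated among the control factors is no better than with 4 runs.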

    To conclude, some people mistakenly call fractional factorial design the Taguchi method. Use of the genuine Taguchi method for Landing Page Optimization is not justified.

    Topics: Terminology

    8 Responses to “What is Taguchi? How does it relate to testing?”

    1. Terry Polyak Says:
      February 20th, 2008 at 9:36 am

      Nice article. I always wondered how auto manufacturing QC and website Optimization through MVT were related……only that it was a process to arrive to an end with mathematics helping you get to a result quicker. Zoom zoom.

    2. Dale Jelinek Says:
      February 21st, 2008 at 12:37 pm

      I am reading Ranjit Roy’s “Primer on the Taguchi Method” and I sense that his understanding differs from that of Vladimir in a few areas. Being a non-statistician and neophyte in this area I don’t claim to know the answers, but here is where I see the differences:

      - “There is no possibility of detecting interactions among the control factors”
      + In Roy’s book he covers this topic in many sections, starting in Section 5-5-2 “Interaction Affects”. He also publishes Taguchi’s triangular tables and linear graphs which provide information on how to isolate interaction effects of the control factors in the orthogonal arrays.

      - “However the confounding structure for the control factors is the same as that of an array of size N1. This implies that the same resolution can be obtained with much smaller number of runs. ”
      + The full factorial array is N1 * N2 in size, but isn’t that the point of the research done by Taguchi - that by using orthogonal fractional factorial arrays that you can use a fraction of the number of experimental runs to capture the essential information to determine main effects and interaction effects?

      I have heard the comment in the past that the Taguchi approach does work in marketing, but there are “adjustments” that need to be made to make it useful in that context. Have you any idea what those “adjustments” might be?

      Thank you in advance for your expanded comments in reply, I am looking forward to advancing my understanding of optimization techniques applied to marketing.

    3. Billy Says:
      February 25th, 2008 at 2:43 pm

      Hi Dale,
      It’s difficult to understand what you’re saying without having read Roy’s book. We are not saying that the Taguchi method has no standing or merit, just that by itself it is not the best method to decipher marketing data. The goals of the Taguchi method are far from what we, as marketers, are trying to achieve.

      What you described as “adjustments” is close to what I am saying. The Taguchi method is a part of fractional factorial design, but is not all encompassing. Marketers that say they are using “the Taguchi method” probably are just doing fractional factorial design and incorrectly attributing it to the Taguchi method.

      I just want to clarify that the true Taguchi method is not made for marketing and should not be sought as the answer for multivariate tests.

    4. Web Analytics Roundup - Recent Links of Interest | Field Guide to Programmers Says:
      March 19th, 2008 at 4:39 pm

      [...] in the Taguchi method? Billy gives a nice overview of the Taguchi Method in this post: The Taguchi method (also known as robust design) belongs to an engineering [...]

    5. Tim Ash Says:
      April 19th, 2008 at 9:08 pm

      Hi Billy,

      I agree with you that the Taguchi Method is the wrong way to go for landing page optimization tests. I am currently working on a whitepaper called “The Truth About The Taguchi Method for Landing Page Optimization”. It will be available as a download here: http://SiteTuners.com/downloads.html

      However, I believe that you are splitting hairs a bit. As you mentioned, the Taguchi Method is a subset of fractional factorial design of experiments (DoE). All fractional factorial designs basically suffer from the same constraints as Taguchi. They assume that there are no interactions (or at best can detect very few low level ones). The other big drawback (and this applies to all parametric “model building” approaches), is that the number of parameters in the model grows very quickly as the size of the test grows. So it becomes impossible to run very large tests.

      Landing page optimization tuning methods consist of two main activities: how you collect data, and how you analyze it. Your choice on the data collection approach often dictates and restricts the possible kinds of data analysis that you can conduct. Fractional factorial data collection is absolutely the wrong way to go in the presence of strong variable interactions. Variable interactions are very common in online marketing, so to assume that they do not exist is silly and dangerous. It often leads you to the wrong answers.

      The proper way to collect data is full factorial. This has two advantages: better estimation of main effects (even if you do not model interactions), and the ability to detect and model important interactions. Google Website Optimizer does full factorial data collection. Even though their reports are currently only main effect reports, at least you can export the raw data and model interactions.

      However, both fractional and full factorial data collection is still commonly analyzed using parametric methods. This leaves the problem of scaling up the size of your test. Basically you will hit a wall with any of these methods if you try to run a very large test.

      That is why we took a long time to develop our own proprietary TuningEngine technology. It is a non-parametric method that takes variable interactions into account, and also allows you to run much larger tests (1000 to 10,000 times larger on the same data rate). It does have the disadvantage of needing at least 100 transactions per day to operate. But if you have higher data rates, it allows you to get results in one test instead of several back to back parametric multivariate tests.

      I wrote a whole chapter in my Landing Page Optimization book ( http://LandingPageOptimizationBook.com ) about various tuning methods. It discusses the trade-offs above in a lot more detail.

      The TuningEngine is available as part of our full-service engagements or as a technology license for interactive agencies and companies that run many landing page tests in-house.

    6. Billy Says:
      April 28th, 2008 at 1:34 pm

      Hi Tim,

      Thanks for your post. We obviously disagree about the best method of testing, but I am glad that there is discussion around this.

      I am “splitting hairs a bit” when I describe Taguchi as inadequate since it is a form of fractional factorial test design, but I just wanted to make clear that it is not the exact same thing as what we do at Widemile and that the differentiation is both important and relevant to getting the best test results.

      However, I have to take issue with a few of your statements. Fractional factorial designs can detect interactions, up to the total number of factors in the test. Fractional factorial design does not inherently mean that no interactions are taken into account. As Vladimir, Widemile’s Chief Scientist, describes, “practically any level of interactions (up to the total number of factors in the test) can be specified. The fractional factorial designs must be viewed, in fact, as a continuum of the designs, starting from the main effects (no interactions), then 2-factor interactions, and so on, up to the full factorial (interactions of all the factors in the test). The designer then chooses the appropriate level of interactions depending on his/her insights and the time constraints of the test.”

      Also, I am not sure what exactly you mean by “very large tests” but the strength of fractional factorial design is being able to run large tests in shorter periods of time than full factorial.

      I never assume that interactions don’t exist, but I also think that going to the other extreme and assuming all interactions exist in all situations is a mistake. Part of being an expert is being able to draw out what interactions may come up. Although you may find all the interactions by doing full factorial test designs, it is highly likely that many of those interactions are low influence and are not worth the time or effort to discover. The better alternative is for an expert to choose which interactions they believe may occur and to test only for those.

      In regards to parametric vs nonparametric, 100 conversions per day is a very high amount for a single page. With that amount of traffic, fractional factorial design can create very large tests also. In addition, the information gained about individual factor influences through parametric MV tests is VERY important. The influence measurements I get from parametric testing helps me design future tests and ensures that I don’t throw away ideas that only barely lost to the optimal or think that marginally important factors are very important. Since nonparametric design only tells you what wins and not the actual influence levels, you lose the knowledge of knowing why the optimal is the best.

      -Billy

    7. Tim Ash Says:
      June 19th, 2008 at 9:47 am

      Hi Billy,

      I am afraid that you have some inaccurate comments and assertions in your reply to my comment above. I will try to address each of them below:

      1)

      [Also, I am not sure what exactly you mean by “very large tests” but the strength of fractional factorial design is being able to run large tests in shorter periods of time than full factorial.]

      This is a bunch of B.S. - all parametric approaches build a model of the expected conversion rate and then “score” each possible recipe in the test to try to predict which is the best. This predicted-best should then be run in a head-to-head against the original to see the actual amount of improvement. If you are doing a main-effects analysis only, there is absolutely NO SPEED ADVANTAGE at all of fractional versus full factorial data collection. In fact, you get better main effects estimates in the (almost inevitable) presence of variable interactions, and you have the ability to create models with interactions to see if your main effects model is appropriate. I cover all of this in detail in my Taguchi whitepaper at http://SiteTuners.com/downloads.html

      2)

      [Part of being an expert is being able to draw out what interactions may come up.]

      This comment is off-base as well - you can’t “draw out” interactions which are not captured by your fractional factorial model in the first place. If you do not collect data on the interactions (by using full-factorial or at least very dense, as opposed to the commonly used very sparse, fractional factorial sampling), there is simply no data to build the required models.

      3)

      [Although you may find all the interactions by doing full factorial test designs, it is highly likely that many of those interactions are low influence and are not worth the time or effort to discover.]

      This is very misguided. In marketing we want to create synergies among the page elements. In other words we want the variables in the test to interact. So to say that the interactions “are not worth the time or effort to discover” means that you are willing to leave a lot of money on the table. We recently completed a test for RedEnvelope using the Google Website Optimizer. If we had used only a main-effects analysis without interactions the test showed only two out of the eight variables as being significant. When we re-ran the analysis with simple two way interactions, we found that there were four significant main effects, and eight very strong interactions of similar magnitude. The bottom line is that by ignoring the presence of many wickedly strong interactions, you would have not found the best answer.

      http://sitetuners.com/site_tuners_re_case_study_detailed.pdf

      4)

      [The better alternative is for an expert to choose which interactions they believe may occur and to test only for those.]

      This is not possible. There are no “experts” capable of understanding the psychology of an anonymous audience visiting your site. Even if you could, there would be inevitable conflicts and contradictions because the needs of each individual are different. Domain subject matter insights in creating the model are only helpful when you are dealing with physical processes like manufacturing (which is where the Taguchi Method originated many decades ago). If you have access to such “experts” for the human brain, I will hire them in a heartbeat.

      5)

      [In regards to parametric vs nonparametric, 100 conversions per day is a very high amount for a single page. With that amount of traffic, fractional factorial design can create very large tests also.]

      100 conversions per day is not a lot - we work with many clients whose data rates are many orders of magnitude higher than this. Also, as I have already said repeatedly, fractional factorial design for large experiments would have to be very sparse and ignore most variable interactions. If they include even low (second and third) order interactions, the number of parameters in the model would grow geometrically and prevent you from making any kind of reasonable predictions.

      6)

      [In addition, the information gained about individual factor influences through parametric MV tests is VERY important. The influence measurements I get from parametric testing helps me design future tests and ensures that I don’t throw away ideas that only barely lost to the optimal or think that marginally important factors are very important. Since nonparametric design only tells you what wins and not the actual influence levels, you lose the knowledge of knowing why the optimal is the best.]

      This is a tired and incorrect mindset that I encounter often. So-called “learnings” are of very limited value. You basically have a choice: run large scale tests and find the best answer among millions of landing page versions, or run small “toy” tests with inaccurate simple models and pretend that you understand something about why your predicted answer is better. There is no free lunch - you can either devote your data sampling to finding the best answer or to building a model - you can’t have it both ways. As I’ve also said many times, this kind of main effects after-the-fact meaning making will lead you to the wrong answer (see the RedEnvelope case study in #3 above).

      I hope this helps,

      Tim Ash

    8. Billy Says:
      July 16th, 2008 at 3:14 pm

      I strongly disagree with your responses, but I think this comment thread has reached its end. If you’d like to continue our discussion, feel free to contact me as you have in the past.

      Through this blog, I will continue to speak about the virtues of fractional factorial testing and limitations of full factorial testing.

      Multivariate testing of web pages is still growing, so there will be many conflicts on the best testing methodologies, best practices and statistical methods.

      Best of luck,
      Billy
