Nearly a decade in the making, our club testing protocols and data processing methodologies reflect countless hours of refinement. MyGolfSpy has worked with R&D experts from some of the largest and most well-known companies in the golf equipment industry with a singular goal: creating the most comprehensive and informative golf club tests.

Everything in our tests – the number of testers, the number of shots we collect, the way we analyze data, and even the way we sort and randomize clubs during the tests – is directly influenced by our conversations with industry experts.

The result is what we believe to be the largest independent, 100% Datacratic golf club test conducted on an annual basis.

Our Club Tests

While MyGolfSpy is a huge proponent of custom fitting, we also realize that a significant percentage of golfers choose not to be fully fitted for their golf equipment. It’s within that reality that Most Wanted was conceived as a test of off-the-rack golf equipment. As manufacturers have added adjustability to their clubheads and increased the number of stock shaft offerings, our tests have evolved into what we call “fit from stock.” We fit our testers to the degree possible with each manufacturer’s off-the-rack options. This includes changing shafts, making loft/face angle adjustments, and leveraging adjustable weights. Within the constraints of the options at our disposal, we make every reasonable effort to optimize each club for each tester.

For each test, our pool of golfers spans a wide range of ages, swing speeds, handicaps, and abilities. Among our regular testers are collegiate golfers, retirees, and average golfers just like you.

To collect a valid sample size across all testers and all clubs while minimizing the risk of fatigue, each tester takes part in multiple test sessions. Depending on the number of clubs, 3 to 8 sessions per test may be required of each tester.

During each session, testers are asked to take three to four swings with each club before moving to the next club in the rotation. Club groupings and the order in which clubs are hit are randomized for each golfer. Obvious outliers, including worm burners, pop-ups, and balls hit severely offline, are removed during the test session. The process repeats until we achieve a sample size of 10-12 good shots for each tester with each club.
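The session structure described above can be sketched in code. This is a minimal simulation, not MyGolfSpy's actual tooling: the club names, the outlier rate, and the `run_session` helper are all illustrative assumptions.

```python
import random

def run_session(clubs, swings_per_turn=4, target_good=10, seed=7):
    """Sketch of one test session: clubs are hit in a randomized
    rotation, a few swings at a time, until each club has enough
    clean shots. All names and rates here are illustrative."""
    rng = random.Random(seed)
    order = clubs[:]
    rng.shuffle(order)                      # randomized club order per golfer
    good_shots = {club: 0 for club in clubs}
    while any(n < target_good for n in good_shots.values()):
        for club in order:
            if good_shots[club] >= target_good:
                continue
            for _ in range(swings_per_turn):
                # assume roughly 1 in 8 swings is an obvious outlier
                # (worm burner, pop-up, severely offline)
                is_outlier = rng.random() < 0.125
                if not is_outlier:
                    good_shots[club] += 1
    return good_shots

counts = run_session(["Driver A", "Driver B", "Driver C"])
```

Because a rotation adds up to four swings at a time, a club can finish a session with slightly more than the 10-shot target, which matches the 10-12 range described above.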

Ball flight and clubhead data are collected with Foresight GCQuad launch monitors. To reduce the number of variables, all testers hit Titleist Pro V1 golf balls. Balls are inspected after each session. Any ball showing signs of wear is replaced.

When the session is complete, the launch monitor data is exported and checked for errors. Outliers across several metrics are identified programmatically using a statistical approach called Median Absolute Deviation (MAD). This process is automated to eliminate potential bias. Shots flagged as outliers are excluded when the final averages and grades are calculated.
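A Median Absolute Deviation check looks something like the sketch below. The 0.6745 scaling and 3.5 cutoff follow the common Iglewicz–Hoaglin modified z-score convention; the exact thresholds in MyGolfSpy's pipeline may differ, and the sample numbers are made up.

```python
from statistics import median

def mad_outliers(values, threshold=3.5):
    """Flag outliers using the Median Absolute Deviation.
    The modified z-score scales the MAD by 0.6745 so it is
    comparable to a standard deviation for normal data. The 3.5
    cutoff is a common default, assumed here for illustration."""
    med = median(values)
    mad = median(abs(x - med) for x in values)
    if mad == 0:
        return [False] * len(values)  # no spread: nothing to flag
    return [abs(0.6745 * (x - med) / mad) > threshold for x in values]

# Hypothetical carry distances (yards) for one tester/club pairing;
# the 212 is a mishit that survived the on-site screening.
carry = [268, 271, 265, 270, 266, 269, 212, 267]
flags = mad_outliers(carry)
```

Because both the center and the spread are estimated with medians, a single wild shot like the 212 cannot drag the baseline toward itself the way it would with a mean and standard deviation.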

While we collect and publish averages for standard launch monitor metrics, we use Strokes Gained as our key performance metric.

Here is an overview of the process:

For each shot, we calculate the strokes gained value. Next, we calculate the strokes gained average for every tester with each club.
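The per-shot calculation follows the standard strokes gained definition: the baseline expected strokes from the starting position, minus the baseline from the finishing position, minus one for the stroke taken. The baseline table below is a made-up placeholder, not the benchmark data used in the tests.

```python
from statistics import mean

# Placeholder baseline: expected strokes to hole out from a given
# (lie, distance-in-yards) condition. Illustrative values only.
BASELINE = {
    ("tee", 420): 4.08,
    ("fairway", 158): 2.98,
    ("rough", 172): 3.21,
}

def strokes_gained(start, end):
    """SG for one shot = baseline(start) - baseline(end) - 1."""
    return BASELINE[start] - BASELINE[end] - 1

# A drive that finds the fairway gains strokes; one in the rough loses them.
sg_fairway = strokes_gained(("tee", 420), ("fairway", 158))
sg_rough = strokes_gained(("tee", 420), ("rough", 172))

# Per tester, per club: average the per-shot values.
avg_sg = mean([sg_fairway, sg_rough])
```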

For each tester, we identify the top performing club (the one with the highest average strokes gained). Then, for each golfer, using an 85% confidence interval (90% for putter tests), we identify any other clubs for which the strokes gained average is not reliably different from the top performer. Any club that is not shown to be reliably different from the top performer is considered to be as good. The number of clubs in this statistical top group varies between testers. For some testers, there is a single statistically significant best club, while for others, more than half the field is shown not to be reliably different from an individual’s top performing club.
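The top-group step for a single tester can be sketched as follows. This uses a simple normal-approximation test on the difference of means; the exact statistical test MyGolfSpy applies may differ, and the club names and shot values are invented for the example.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def top_group(club_sg, confidence=0.85):
    """For one tester: find the club with the best average strokes
    gained, then keep every club whose average is not reliably
    different from it at the given confidence level.
    club_sg maps club name -> list of per-shot SG values.
    Normal approximation assumed for illustration."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # per club: (mean, variance of the mean)
    stats = {c: (mean(v), stdev(v) ** 2 / len(v)) for c, v in club_sg.items()}
    best = max(stats, key=lambda c: stats[c][0])
    best_mean, best_var = stats[best]
    group = []
    for club, (m, var) in stats.items():
        margin = z * sqrt(best_var + var)  # CI half-width for the difference
        if best_mean - m <= margin:        # difference CI includes zero
            group.append(club)
    return best, group

shots = {
    "Club A": [0.30, 0.25, 0.35, 0.28, 0.32, 0.27, 0.33, 0.29, 0.31, 0.30],
    "Club B": [0.29, 0.24, 0.34, 0.27, 0.31, 0.26, 0.32, 0.28, 0.30, 0.29],
    "Club C": [-0.10, -0.15, -0.05, -0.12, -0.08, -0.13, -0.07, -0.11, -0.09, -0.10],
}
best, group = top_group(shots)
```

Here Club B averages only 0.01 strokes less than Club A, well within the 85% margin, so it lands in the top group; Club C is reliably worse and drops out.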

Our Most Wanted winner for overall performance is the club that finishes in the statistically significant top group for the highest percentage of our testers. In the event of a tie, the club that was the top performer for the highest percentage of testers is our Most Wanted winner.
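Reducing the per-tester results to a single winner is then a counting exercise. The sketch below is a hypothetical reduction: each tester contributes a top performer and a top group, and the tiebreak on top-performer share is applied via a secondary sort key.

```python
def most_wanted(results):
    """results: list of per-tester outcomes, each a tuple
    (top_club, top_group_set). The winner is the club in the
    statistical top group for the highest share of testers;
    ties break on how often a club was the outright top performer.
    Illustrative reduction, not MyGolfSpy's actual code."""
    clubs = set()
    for _, group in results:
        clubs |= set(group)

    def score(club):
        in_group = sum(club in g for _, g in results)   # primary criterion
        was_top = sum(club == t for t, _ in results)    # tiebreaker
        return (in_group, was_top)

    return max(clubs, key=score)

# Four hypothetical testers: (their top club, their statistical top group)
results = [
    ("A", {"A", "B"}),
    ("B", {"A", "B", "C"}),
    ("A", {"A", "C"}),
    ("C", {"B", "C"}),
]
winner = most_wanted(results)
```

In this example every club makes three of four top groups, so the tie is broken by Club A having been the outright top performer for two testers.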

For iron tests, we test multiple irons from each set. To arrive at our Most Wanted winner, we calculate the top-group performance percentages separately for each of the three irons tested and then average the three values to determine our final grades.

We realize that data is inherently interpretive, which is why we share our data with you. For those who prefer things clean and simple, we provide performance grades. For those of you who want to dig deeper, make your own interpretations, and draw your own conclusions, we make a portion of our data available to you.

This is your test.

Note: This page is frequently updated to reflect the latest information about how we test. As such, it may not be wholly applicable to previous tests.