Interview with data analyst and lecturer, Bill Gerrard

Bill Gerrard has built a career in business and sports data analysis, advising rugby and football clubs such as Saracens, San Jose Earthquakes and AZ Alkmaar. With the growing use of Moneyball techniques and metrics in football, Gerrard has developed a keen insight into how football clubs are adapting, in terms of performance analysis and team building. In this interview, we speak about the effect of coronavirus on the transfer market, the importance of Microsoft Excel, data visualisation, sequence-based analytics and scouting.

How did you first get interested in football and rugby after your experiences in baseball? What were your particular interests with these sports?

I’m a proud Scot and was brought up playing football so the “beautiful game” is my first love but I have always tried to follow all sports. My interest in baseball really dates back to the late 1980s when I spent three summers teaching at the University of Manitoba in Winnipeg, Canada. There was a lot of coverage of the Toronto Blue Jays and I went on an organised trip to see the Blue Jays play a weekend series in Minneapolis. I was hooked. Although I have never worked in baseball, I have been massively influenced by Moneyball. I was very fortunate to be contacted by Billy Beane who had been told about my work in applying data analytics to football. Billy has a passion for football and the Oakland A’s ownership group also own the San Jose Earthquakes in Major League Soccer. I ended up working with Billy as a consultant for three years, analysing MLS data. Billy has been a great mentor and it gave me a chance to see how the A’s operated from inside.

In a normal sporting calendar, what does your week look like in terms of your responsibilities at the University of Leeds and various clients in football and rugby?

My major club involvements over the last 10 years have been with Saracens and London Irish in English rugby and AZ Alkmaar in Dutch football. Typically I undertake game reviews and opposition analysis. So the normal working week is usually working Sundays/Mondays on game reviews, then updating my database with the game data from the rest of the games in the league, and preparing reports on the next opponents. It’s a kind of Indiana Jones existence since my day job is as a university professor in the Business School where I teach courses on business analytics and sports analytics. Things have been slightly less hectic this season since I’m not involved with any team currently. I’ve been doing some mentoring, talking with teams and potential investors about possible challenges, and acting as an expert witness in some sports law cases requiring data analysis.

How will Coronavirus influence the summer transfer market in your view? More loans, a wider search for bargains or bigger use of players from the academy?

The transfer market always reflects the fundamentals of the football business as regards expectations of future revenue growth. Coronavirus has been an extreme event affecting every aspect of our lives. Its economic and social effects will last for years. In the case of football, the huge loss in revenues will inevitably restrict the ability of clubs to spend big money on transfer fees. Clubs will have to be smarter and this will favour clubs who have been adept in using Moneyball-type strategies to find undervalued talent, developing and bringing through academy players into the first team, and acquiring talented players on loans.

With amortisation of transfer fees, is it more pertinent for a team to invest a larger sum in its youth academy rather than pursue a continuous buy-low, sell-high strategy? For instance, if Team A sells a player to Team B for €30 million amortised over the length of a five-year contract and then buys his replacement for a slightly lower figure, doesn’t Team A lose money on its balance sheet in the short term? Or are there other variables to consider?

Amortisation is just an accounting method for how capital purchases (such as the costs of player acquisitions) are treated in the company accounts. It doesn’t really change anything or affect decisions. Ultimately player transfers involve cash flows in and out of the club. These cash flows will be volatile year-on-year but ultimately a club has to align its net transfer spending over time with its net operating cash flows – its revenues from matchdays, TV, sponsorship and merchandising minus its operating costs, mainly wages. Big clubs able to generate large net operating cash flows can afford to be net buyers in the transfer market. Smaller clubs are often struggling to afford the player wages required to be competitive. Necessarily, they must be net sellers in the transfer market. It’s crucial for these clubs to be great at developing players, either developing young players in their academy or acquiring undervalued players with potential and successfully developing them in the first team.
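
As an editorial illustration of the accounting treatment discussed above (not something Gerrard presented), the sketch below assumes straight-line amortisation, where a transfer fee is spread evenly over the length of the contract. The function names and the figures in the usage note are hypothetical; fees are in millions of euros.

```python
def annual_amortisation(fee: float, contract_years: int) -> float:
    """Yearly charge under straight-line amortisation (an assumption)."""
    if contract_years <= 0:
        raise ValueError("contract_years must be positive")
    return fee / contract_years


def book_value(fee: float, contract_years: int, years_elapsed: int) -> float:
    """Remaining value of the player registration in the accounts."""
    # Clamp elapsed years to the contract length so the value never goes negative.
    years = min(max(years_elapsed, 0), contract_years)
    return fee - annual_amortisation(fee, contract_years) * years


def profit_on_sale(sale_fee: float, fee: float,
                   contract_years: int, years_elapsed: int) -> float:
    """Accounting profit (or loss) booked when the player is sold."""
    return sale_fee - book_value(fee, contract_years, years_elapsed)
```

For a hypothetical €30 million signing on a five-year contract, `annual_amortisation(30.0, 5)` gives a €6 million yearly charge, and selling after two years for €25 million books a €7 million accounting profit even though the sale fee is below the original purchase price. That is why, as the answer says, the cash flows matter more than the accounting presentation.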

You have blogged about the importance of Excel in data science. With programming languages like Python available, do you feel it would be pertinent for a data scientist to have a good knowledge of Excel first, and why?

There’s a lot of “technical snobbery” in analytics around which programming languages and statistical software analysts use. Don’t get me wrong – there is a need for Python, R etcetera to do the really specialist statistical stuff but much of the more routine work of cleaning up data, doing basic analysis and generating visuals can be done in Excel, which is a lot more powerful than many people recognise.

In your old blogs, you use something called the Pythagorean Expected Wins model. Could you explain this and its effect in layman’s terms?

Pythagorean expected wins was developed in baseball by the great sabermetrics guru, Bill James. Basically all he did was to create a formula to link runs scored, runs allowed and a team’s win percentage. It is a really important tool for helping teams plan changes in their rosters. So, for example, suppose you are considering changing your pitchers. Once you have identified possible replacements and estimated the expected reduction in runs allowed, you can use the Pythagorean expected win formula to predict the impact of the replacement pitchers on the team’s win percentage. You could look at different possible combinations of player trades, assess the impact on team performance and compare to the costs. A similar approach is possible in other team sports. It’s just more complicated to calculate the expected effect of an individual player on team performance.
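
For readers who want to see the arithmetic, here is a minimal sketch of the Pythagorean expectation in its classic form, where win percentage is runs scored squared divided by the sum of runs scored squared and runs allowed squared. The exponent of 2 is Bill James’s original choice; later refinements use other exponents, so it is left as a parameter. The function names and the team figures in the usage note are hypothetical and are not drawn from Gerrard’s own work.

```python
def pythagorean_win_pct(runs_scored: float, runs_allowed: float,
                        exponent: float = 2.0) -> float:
    """Expected win percentage from runs scored and runs allowed."""
    if runs_scored < 0 or runs_allowed < 0:
        raise ValueError("runs must be non-negative")
    if runs_scored == 0 and runs_allowed == 0:
        raise ValueError("at least one of the run totals must be positive")
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return scored / (scored + allowed)


def expected_wins(runs_scored: float, runs_allowed: float,
                  games: int = 162, exponent: float = 2.0) -> float:
    """Expected wins over a season of `games` matches (162 in MLB)."""
    return games * pythagorean_win_pct(runs_scored, runs_allowed, exponent)
```

Take a hypothetical team that scores 800 runs and allows 700: the formula gives a win percentage of roughly 0.566. If new pitchers were expected to cut runs allowed to 650, the figure rises to roughly 0.602, and the difference between the two `expected_wins` values is the estimated gain to weigh against the cost of the trade.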

You have also written about the role of the analyst in bridging the gap between themselves and the coach. Former Philadelphia 76ers GM, Sam Hinkie, explained that he would drop two pages of data on the coach’s lap on the plane ride after games to make the data easier to read and transfer into the coach’s thinking. How important is it for an analyst to manage data presentation and language when transferring data to a player or a team? Do you have any examples of how you have done this at stages of your career that you can give?

It’s an essential skill of the analyst to be able to build up a great relationship with the coaches and other decision makers in order to be able to produce relevant analysis. Ultimately the analyst’s role is to provide the best possible evidential base for coaches and others to make the best possible decisions to allow the team to be as successful as possible. The analyst must be highly game-knowledgeable but must also be humble. The numbers only tell part of the story. Knowing the limitations to the analysis is every bit as important as finding patterns in the data. Data visualisation is crucial to getting the analyst’s message across. My reports always include lots of graphs and whenever possible I colour-code tables of numbers using a traffic-lights system so that you can instantly see differences in performance levels.

In Baseball, there is a position where a former player bridges the gap between the data analysts and the playing staff to better explain the metrics to the playing staff, so they can receive and apply the information. Do you feel that a role like this has a future in football?

Anyone who can bridge the gap between analysis and coaching/playing is invaluable, especially those with experience of playing at the elite level. When I was at Saracens the team captain, Steve Borthwick, was a great supporter of the value of data analysis and that was important for getting buy-in from the other players. He is now a very successful coach. Ultimately it is all about building the right culture, one of being as thorough and as well prepared as possible in everything you do, with data analytics as a key part of how the team does things.

Even though some Directors of Football have a playing history, and a higher proportion of coaches have a top-level playing history than in American sports, would clubs still need someone in this role to bridge the gap between the analysts and the playing staff and create more synergy?

I think so. Combining an understanding of the potential value of data analysis with the experience of playing at the elite level is still a pretty unique skill set and a very valuable one.

In Baseball, sabermetrics discovered underlying value in staying on base, in Basketball, it was the three-point shot, in American Football, it was going for it in Fourth Down situations. Where is the underlying value that defines the difference between wins and losses in Football?

That’s the Holy Grail that many are chasing after in football analytics. Some think that it is expected goals. I’m still searching.

Tony Pulis proved the value of throw-ins with Rory Delap at Stoke a few years ago, which triggered a school of thought about attacking throw-ins. What is the underlying value in throw-ins and their creation of opportunities?

Tony Pulis “discovered” and put into practice something that the original football analyst, Charles Reep, had quantified several years previously – a long throw near the opposing goal is around six times more likely to result in a goal than a short throw.

Brighton and Hove Albion have also looked to have increased success with set-pieces. They have built the apparatus with tall centre-backs and skilled set-piece takers like Pascal Gross while promoting Nick Stanley to work on their set-pieces. Do you think the marginal gains of set-pieces will affect a bigger part of a team’s coaching and recruitment strategies?

Again there is nothing particularly new in this. It was a key element in the approach of the first “Moneyball” team in Premiership football – Bolton Wanderers under Sam Allardyce.

More coaches are hiring throw-in coaches or specialists as the importance of set-pieces is growing. Do you feel football is trending towards a place where there would be situation-specific specialists permanently hired on coaching staffs like in American football? For instance, a throw-in specialist, a corner-kick specialist or a goal-kick specialist?

It’s definitely a trend in rugby union and one that is beginning to emerge in football. I’m a big believer in specialist coaches.

In pre-season, former Stuttgart manager Tim Walter had his team take most of their set-pieces short. Do you feel there is underlying value in short corner-kicks and goal-kicks?

It’s exactly the sort of question that is analysable with data, as Charles Reep showed with his so-called “yield analysis”. But, of course, the more a team adopts a specific tactic, the more the opposition will be prepared. Data analysis can only take you so far. You cannot reduce winning to a formula. Having a way of doing things is really important but it needs to encompass flair and creativity. Otherwise it becomes rigid, predictable and beatable.

Do you believe there will be more growth in sequence-based analytics? In baseball you have limited outcomes in each situation. In football there are more variables. For instance, it is outdated to grade a player’s creativity on the number of assists he has. There are players who can have multiple crucial touches in the build-up to a goal but are not credited with the assist. Data has tried to remedy this by valuing passes in the attacking phase to better understand the crucial moments in a sequence.

Yes, definitely. The leading-edge developments in football analytics are in analysing playing sequences and player positioning decisions in and out of possession.

In terms of scouting, how important is it for a team to have a defined style of play for a recruitment staff to look for players to fill certain roles with certain attributes?

There is a very simple adage – “recruit the best team for the players, not the best players for the team”. What I mean by this is that you should look for players that will be effective in your team. Be clear on exactly what it is you are looking for first, then try to identify the players who might be able to provide what you are looking for. It’s how the New England Patriots have done things for years and one of the reasons why they have been so successful. They recruit players who would be the best Patriots players, not necessarily the best NFL players.

Data has become more important as clubs have access to large video libraries like InStat or Wyscout, giving them more equal access to the wider market. Data analysts believe that statistics are about giving traditional scouts a wider lens to properly analyse and grade talent. Is that your belief?

Effective player recruitment is about combining the expertise and experience of the scouts with the data analysis produced by the analysts. Players are more than just numbers. The numbers are important but only part of the story. I’ve always seen data analytics as a very cost-effective way of doing an initial screening of all potential recruits to identify a small subset of players who look, on their numbers, to be the most promising possibilities to be scouted in much more detail. Scouting is a scarce resource so anything that helps allocate scouts’ time and effort more efficiently should be welcomed. And if a team is really embracing an evidence-based approach, it should be using scouting reports as an important source of data to be analysed, especially qualitative, expert data that can’t be easily quantified.
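
The initial screening described above can be pictured as a simple filter-and-rank step. The sketch below is an editorial illustration only, not Gerrard’s actual process: the `Player` fields, the metric names and the thresholds are all hypothetical, and a real screen would be defined by what the team is looking for.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Player:
    name: str
    age: int
    minutes: int               # minutes played, to exclude tiny samples
    metrics: Dict[str, float]  # hypothetical per-90 metrics


def shortlist(players: List[Player], max_age: int, min_minutes: int,
              thresholds: Dict[str, float], rank_by: str,
              top_n: int) -> List[Player]:
    """Keep players who pass every filter, then rank by one metric."""
    missing = float("-inf")  # a missing metric fails any threshold
    eligible = [
        p for p in players
        if p.age <= max_age
        and p.minutes >= min_minutes
        and all(p.metrics.get(k, missing) >= v for k, v in thresholds.items())
    ]
    eligible.sort(key=lambda p: p.metrics.get(rank_by, missing), reverse=True)
    return eligible[:top_n]
```

The output is only a list of names for the scouts to watch in detail. It narrows where scarce scouting time is spent and does not replace the scouts’ judgement.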

Can metrics aligned with playing ideals lead to more effective scouting? For instance, if Team A plays a possession-based style and is looking for a deeper-lying midfielder in a certain age range, how would you go about beginning to put the data together? Does PACKING data, which grades the opposition players removed by a certain pass, as well as successful dribbling take precedence over other metrics associated with defensive rigidity?

Yes, it fits with everything I’ve said about how scouting should be team-specific and analytics should be integrated into the scouting function. Scouting and analytics working together are a powerful combination.

When people speak of data, they normally think of numbers on a spreadsheet and graphs. However, information like touch maps and video sequences can also be used to better inform a data analyst, right?

Data includes both quantitative measured data and qualitative categorical data. Data is usually thought of as numbers but it encompasses all forms of evidence. Expert judgments especially of those things that aren’t easily measurable but are critical for success such as resilience and attitude can be very important sources of data.

With increased data in football, there is also the need for data analysts to use predictive analysis, for instance, so that a club buying a player from a less competitive league can be sure the player will be able to adjust within a certain time-span after signing. Are there any methods you use when trying to predict this?

There’s a lot of work being done on trying to compare performances across different leagues to determine how Player X in League A will perform in League B. One of the key ways of doing this is to analyse the performance data of those players who have previously played in both leagues.
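
One very simple way to picture the “players who have played in both leagues” idea is a conversion factor: take each such player’s output per 90 minutes in League A and in League B, compute the ratio, and use the typical ratio to translate a new player’s numbers. The sketch below is an editorial illustration under strong simplifying assumptions. It ignores age, role, team strength and sample size, which real cross-league models would need to control for, and the function names are hypothetical.

```python
from statistics import median
from typing import List, Tuple


def league_adjustment_factor(paired_outputs: List[Tuple[float, float]]) -> float:
    """Median ratio of League B output to League A output.

    Each pair is (per-90 output in League A, per-90 output in League B)
    for one player who has played in both leagues.
    """
    ratios = [b / a for a, b in paired_outputs if a > 0]
    if not ratios:
        raise ValueError("need at least one player with positive League A output")
    # The median is less sensitive than the mean to one unusual transfer.
    return median(ratios)


def project_output(per90_in_league_a: float, factor: float) -> float:
    """Translate a League A figure into an expected League B figure."""
    return per90_in_league_a * factor
```

With three hypothetical players whose output fell to 80%, 90% and 70% of its League A level after moving, the factor is 0.8, so a player producing 0.5 per 90 in League A would be projected at about 0.4 in League B.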