IS A CZ STARTER AS GOOD AS A B NAKED?

AND OTHER QUESTIONS ABOUT PITCHING YOU WERE AFRAID TO ASK ABOUT MAIL LEAGUE PITCHING

 

By Ethan Conner-Ross and Marc Howard Ross

Published in the APBA Journal July, 2001


One of the regular challenges to draft league managers is to decide how much pitching they need to be competitive and to figure out when upgrading starting pitching is worthwhile and when it is too costly.

 

It's not that a manager wouldn't want the best pitching he or she could get; it's just that good pitching isn't given away these days, and managers have to decide what's a fair price to pay in a trade or in the draft for pitching.

 

We explore this issue here, asking what a pitcher of each grade is worth in comparison with league average pitchers and in comparison with how a DZ "replacement" pitcher drafted to fill needed starts might perform.

 

Using the tools that Bill James, Pete Palmer and other sabermetricians have developed, we attempt to provide a reasonable estimate of what pitching is worth and how much pitching grades matter to an APBA draft league team's fate. A hitter's offensive value, James has taught, is best measured through runs created, which can roughly be indicated by a player's OPS (on-base plus slugging) and compared across players with an equivalent number of plate appearances in a season.

 

Runs add up to wins, we all know, and Palmer has estimated that for every ten runs more a team scores than it gives up over a season, it is likely to win one more game. Our study looks at the flip side of this insight by focusing on runs allowed (and not allowed) rather than runs produced.

 

It does this by examining the relationship between pitching grades and runs yielded to estimate the number of wins starting pitchers with different APBA grades are worth while holding hitting constant. For example, we ask if a manager adds a BZ starter to his staff, how much better is the team likely to do?

 

APBA specifies that pitching matters and captures this through the assignment of pitching grades. While replayers have been telling us for years that replay outcomes reflect actual player and team performance, we are especially interested in how differences in pitching grades translate into wins and losses in draft leagues.

 

A draft league manager will want to know how different mixes of pitching grades among the starters are likely to perform and how choices concerning efforts to obtain hitting versus pitching might affect chances of making the playoffs or winning a pennant.[1]

 

After all, draft league managers are often faced with real choices such as whether to stick with a CZ starter or to try and obtain a B but without control. Is giving up a good deal of hitting to upgrade a starter worthwhile?[2]

 

There are at least two obvious ways to examine the question of not just whether pitching grades matter, but how much they affect a team's wins and losses. One is to run a million or so versions of the computer game with the same position players in the lineup day after day, varying only the pitching, and then analyze what difference the pitchers make. This would be easy and appealing from a scientific point of view.

 

Instead we decided to take a second approach: analyze data from a real draft league. For one thing, we have more confidence in the dice than in the computer (except for doing our data analysis). Second, we have what is almost a real-life situation: managers have drafted players and are competing for real playoff spots. They have made the choices about rosters and about player usage. Third, as our results show, managerial decisions regarding player usage turn out to be relevant to our analysis in ways that a computer simulation might not have fully anticipated.

 

To study this question, Ethan examined the actual performance of starting pitchers who had at least 15 starts in Mail 3, a 20-team draft league which plays a 160 game schedule, over three full seasons: 1996, 1997, and 1998.[3] Mail 3 limits starters to the actual number of starts they have in the MLB season provided at least 2/3 of the pitcher's appearances were as a starter. The data in Table 1 show clearly that pitching grades matter.

 

There is a clear drop-off in performance, measured in a variety of ways, as one moves from A to D grades. Pitchers with better grades give up fewer runs (and earned runs), fewer hits and walks per 9 innings, earn more wins and fewer losses, hurl more shutouts, and have more complete games.

 

In addition, it is clear from Table 2 that for pitchers at the same grade, those with a Z control rating do better than those without it, although control doesn't matter as much as the grade itself. Finally, the data in Table 2 show that A's throw about 50 more innings a season than B's, who throw about 50 more than C's. So you don't just get a better pitcher when you upgrade, you get him for more innings. But you knew this, right?

 

Since we promised to tell you how much more a higher grade pitcher is worth in wins and losses to a team over a season, we have to move beyond the ERA and the statistics presented already to do this. Because not all starters throw the same number of innings we first need a common standard of comparison.

 

Since the average Mail 3 starter pitched about 210 innings in a season, Ethan decided to calculate runs allowed per 210 innings (the R/210 column in Table 2) to give us a level playing field on which to compare starters over a season. Clearly there is a net increase in runs given up as one moves from AZ to D non-Z pitchers.
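The normalization can be sketched in a few lines of Python. This is only an illustration: it assumes R/210 is simply the runs-per-nine rate scaled up to 210 innings, and the function name is ours. Because it starts from the rounded R/9 figures in Table 2, the result can differ from the printed R/210 by a tenth of a run.

```python
def runs_per_210(runs_per_9: float, innings: float = 210.0) -> float:
    """Scale a runs-per-nine-innings rate to a fixed workload (210 IP by default)."""
    return runs_per_9 * innings / 9.0


# R/9 values taken from Table 2
for label, r9 in [("AZ", 3.12), ("BZ", 4.17), ("D non-Z", 7.45)]:
    print(f"{label:8s} {runs_per_210(r9):6.1f} runs per 210 innings")
```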

 

The runs allowed statistic then allows us to compare pitching value using Palmer's finding that, over baseball history, when a team scores 10 more runs than it yields, this produces one more win over the course of a season.

 

The 10 runs Palmer came up with might seem a bit illogical, of course, because one run can win any ball game, but his data analysis has shown this is a strong and consistent relationship over time. For example, if a team scores 800 runs and allows 800 runs, the best guess is that it will win just half of its games, and if it scores 100 more than it allows, it should go 91-71 over a 162 game season.
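Palmer's rule of thumb is easy to put into code. The sketch below assumes a flat 10 runs per win around a .500 baseline, as described above; the function name and the rounding to whole games are our own choices.

```python
RUNS_PER_WIN = 10.0  # Palmer's rule of thumb as quoted in the article


def expected_record(runs_scored: float, runs_allowed: float, games: int = 162):
    """Return (wins, losses) implied by a run differential."""
    wins = games / 2 + (runs_scored - runs_allowed) / RUNS_PER_WIN
    wins = round(wins)
    return wins, games - wins


print(expected_record(800, 800))  # an even run differential
print(expected_record(900, 800))  # 100 runs to the good
```

The two calls reproduce the examples in the text: an 81-81 record for the even team and 91-71 for the team that outscores its opponents by 100 runs.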

 

Comparing pitching grades (and taking the Z into account) gives us an estimate of the number of wins a pitcher at each grade is worth and is shown in Table 3 column 2. Here we compare pitchers at each grade against what is a League Typical Starter, simply the mean of all starters over the three year period, which is shown in the sixth column in Table 3. This shows that an AZ starter is worth 4.5 more wins a year and a D non-Z starter produces 5.6 more losses for his team, and the other grades, not surprisingly, fall in between these two.
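As a rough check, the column 2 figures can be rebuilt from the R/210 column of Table 2 and the 10 runs per win rule. The dictionary and function names below are ours; the numbers are copied from Table 2.

```python
LEAGUE_R210 = 118.2  # "All" row of Table 2

# Runs allowed per 210 innings by grade and control, from Table 2
R210 = {
    "AZ": 72.8, "A non-Z": 75.6,
    "BZ": 97.3, "B non-Z": 101.7,
    "CZ": 114.8, "C non-Z": 123.4,
    "DZ": 150.0, "D non-Z": 173.8,
}


def norm_win_value(r210: float, league: float = LEAGUE_R210,
                   runs_per_win: float = 10.0) -> float:
    """Wins above (or below) a league typical starter over 210 innings."""
    return (league - r210) / runs_per_win


for grade, r210 in R210.items():
    print(f"{grade:8s} {norm_win_value(r210):+.2f} wins")
```

Rounded to one decimal, these values line up with Table 3 column 2.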

 

It is clear that while the absence of a Z rating hurts pitching performance, as we said earlier, in all cases the non-Z is closer to the Z pitcher with the same grade than to the Z pitcher at the next lower grade. There is no support for the idea that it's better to have a starter with control than one with no control and a higher grade.

 

Interestingly, the results clearly show that the Z rating improves a pitcher's runs yielded less for higher graded pitchers than for those with poor grades. As we move from A to D the magnitude of the difference between Z and non-Z pitchers increases from .2 to .4 to .8 to 2.4 wins per year. The lower the grade the more the Z is worth.
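The Z premium at each grade is just the gap between the Z and non-Z entries in Table 3 column 2. A small sketch, with names of our own choosing:

```python
# (Z, non-Z) win values from Table 3 column 2
NORM_WIN = {"A": (4.5, 4.3), "B": (2.1, 1.7), "C": (0.3, -0.5), "D": (-3.2, -5.6)}


def z_premium(grade: str) -> float:
    """Wins per season that the Z rating adds at a given letter grade."""
    z, non_z = NORM_WIN[grade]
    return round(z - non_z, 1)


for grade in "ABCD":
    print(grade, z_premium(grade))
```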

 

The explanation for this is that better pitchers give up fewer hits, which advance runners and produce runs, while each time D pitchers put runners on base through walks, there is a greater chance of these free passes resulting in runs scored.

 

Pitchers with higher grades and good control not only yield fewer runs but also throw more innings than lower graded, non-control hurlers. The extra innings, shown in Table 3 column 3, are benefits as they take pressure off a team's bullpen.

 

Table 3 provides our estimates of the value of the extra (or fewer) innings a pitcher throws over the 210 average for a starter (column 4) and the following column (5) is our estimate of the benefit derived from the ability of a team to use relievers more selectively. The extra (or fewer) inning value is simply an estimate based on the runs yielded by a starter compared with the league average for the additional (or fewer) innings pitched by a starter at each grade.[4]
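One way to reproduce column 4 is sketched below. It rests on an assumption the text does not spell out: that the "league average" used for the extra innings is the average bullpen's 3.93 runs per nine from Table 4, since those are the innings a starter's extra work replaces. Under that reading the Table 3 figures come out to within rounding, but treat it as our inference rather than the documented method.

```python
PEN_R9 = 3.93  # ASSUMPTION: average bullpen runs/9 (Table 4) is the comparison rate


def extra_inning_wins(starter_r9: float, extra_innings: float,
                      baseline_r9: float = PEN_R9,
                      runs_per_win: float = 10.0) -> float:
    """Wins gained because the starter, not the pen, covers the extra innings.

    A negative extra_innings means the pen must cover innings that a
    210-inning starter would have thrown.
    """
    runs_saved = (baseline_r9 - starter_r9) / 9.0 * extra_innings
    return runs_saved / runs_per_win


# (R/9 from Table 2, extra innings from Table 3 column 3)
print(round(extra_inning_wins(3.12, 83), 2))   # AZ
print(round(extra_inning_wins(7.45, -35), 2))  # D non-Z
```

The D non-Z case shows where the anomaly in footnote 4 comes from: he is worse than the pen and throws 35 fewer innings, so both factors are negative and their product is a gain.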

 

Estimating how a starter affects bullpen usage was a bit more complicated, and our figures are presented in Table 4. To do this, Ethan first calculated that the average team used 360 bullpen innings during the three year period studied.[5] Then he developed a profile of the average bullpen for Mail 3 teams. It shows that the average team had 365 innings of A, B and C relievers each of the three seasons.

 

More specifically, the average pen had 65 innings of A relief, 194 of B relief and 101 of C. Good starters meant that a team could drop C relief innings and use a higher proportion of A and B relievers. Table 4 shows how many runs this is worth over a season for a starter at each grade by comparing the runs an average pen would yield with each mix of innings (above or below 360) with the higher or lower inning total the pitcher at each grade would produce.

 

The run differential is not large but it is consistent, and it shows that the pen innings difference ranges from +.7 wins to 1.0 losses (Table 3, column 5).
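The bullpen comparison in Table 4 can be approximated as follows. The fill order (A relief first, then B, then C, with D relief covering anything left over) and the 106-inning ceiling on C relief are read off the rows of Table 4 rather than stated in the text, so both are assumptions, as are the function names.

```python
R9 = {"A": 3.15, "B": 3.82, "C": 4.64, "D": 6.90}  # runs allowed per 9, Table 4
PEN_AVG_R9 = 3.93                                   # average pen, Table 4
# ASSUMPTION: innings available at each relief grade, inferred from Table 4
CAPS = [("A", 65), ("B", 194), ("C", 106)]


def pen_runs(innings_needed: float):
    """Return (runs at the average pen rate, runs when the best relievers go first)."""
    average = innings_needed * PEN_AVG_R9 / 9.0
    remaining = innings_needed
    selective = 0.0
    for grade, cap in CAPS:
        used = min(cap, remaining)
        selective += used * R9[grade] / 9.0
        remaining -= used
    selective += remaining * R9["D"] / 9.0  # overflow goes to D relief
    return average, selective


def saved_pen_wins(extra_starter_innings: float, base_pen_innings: float = 360.0,
                   runs_per_win: float = 10.0) -> float:
    """Wins gained (or lost) from using the pen more (or less) selectively."""
    average, selective = pen_runs(base_pen_innings - extra_starter_innings)
    return (average - selective) / runs_per_win


print(round(saved_pen_wins(83), 2))   # AZ starter: pen needs 277 innings
print(round(saved_pen_wins(-35), 2))  # D non-Z starter: pen needs 395 innings
```

Computed this way the run totals land within about half a run of the Table 4 entries, and the win values agree with Table 3 column 5 to within a tenth of a win.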

 

The last two columns of Table 3 present two different estimates of a pitcher's overall win/loss value. The more conservative estimate, shown in column six, combines the wins (or losses) from a hurler's grade, the extra (or fewer) innings thrown and the bullpen value in comparison with a league average pitcher. It shows that there is a gain of about six wins a year for an AZ starter over a league average pitcher and a loss of five games when a D non-Z is used. What's more, there is a clear stair step progression just as APBA boards would like it.

 

But some readers might think there is something curious about using the league average starter for a comparison since few managers just have a league average starter sitting on the bench ready to be used if needed.

 

More realistically, draft league managers face a choice between drafting what might be a league average starter in some early to middle round of their draft (or trading for one) and using a DZ (or even a D non-Z) that they might pick up at the end of the draft.

 

While this costs them a roster spot, it is not expensive in terms of draft picks. The final column shows how much a pitcher at each grade is worth over a DZ starter and the results are quite clear. The AZ is worth an extra nine wins and the BZ about five and a half. According to these estimates, if the replacement starter is a D, a team's win total will take a pretty big hit.
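The last two columns of Table 3 are simple arithmetic on the earlier ones, which a short sketch makes explicit. The component values are copied from Table 3; the names are ours.

```python
# (norm win value, extra inning value, saved pen value): Table 3 columns 2, 4, 5
COMPONENTS = {
    "AZ": (4.5, 0.7, 0.7), "A non-Z": (4.3, 0.5, 0.6),
    "BZ": (2.1, -0.1, 0.3), "B non-Z": (1.7, -0.1, 0.1),
    "CZ": (0.3, 0.1, -0.3), "C non-Z": (-0.5, 0.3, -0.5),
    "DZ": (-3.2, 0.1, 0.0), "D non-Z": (-5.6, 1.4, -1.0),
}


def total_wins(grade: str) -> float:
    """Wins versus a league average starter (Table 3 column 6)."""
    return round(sum(COMPONENTS[grade]), 1)


def wins_over_dz(grade: str) -> float:
    """Wins versus a DZ replacement starter (Table 3 column 7)."""
    return round(total_wins(grade) - total_wins("DZ"), 1)


for grade in COMPONENTS:
    print(f"{grade:8s} {total_wins(grade):+5.1f} {wins_over_dz(grade):+5.1f}")
```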

 

How do our calculations help managers think about staff composition? The final table offers our estimate of how many more (or fewer) wins a team is likely to chalk up over a season depending upon the composition of its pitching staff pitching against league average starters.[6]

 

It shows that good pitching with an average hitting team can do very well. In most draft leagues with 16 to 24 teams, 93 wins will garner a playoff berth. On the other hand, a team with all DZ starters needs to make up 16 games (160 runs over opponents) just to play .500 ball, a pretty tough feat, and then needs another 120 runs to match the team with a very good staff. This requires an incredibly good offense.
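A manager can run the same arithmetic for any rotation. The sketch below assumes a five-man staff in which each starter contributes his Table 3 column 6 value, which is how the Table 5 figures appear to have been built; because it adds rounded values, a result can differ from Table 5 by a tenth of a win. The names are ours.

```python
# Total wins versus a league average starter, Table 3 column 6
TOTAL_WINS = {
    "AZ": 5.9, "A non-Z": 5.4, "BZ": 2.3, "B non-Z": 1.7,
    "CZ": 0.1, "C non-Z": -0.7, "DZ": -3.1, "D non-Z": -5.2,
}


def staff_value(staff: dict):
    """Return (wins vs. league average staff, wins vs. an all-DZ staff).

    `staff` maps a grade label to the number of starters at that grade.
    """
    starters = sum(staff.values())
    vs_average = sum(TOTAL_WINS[grade] * count for grade, count in staff.items())
    vs_all_dz = vs_average - starters * TOTAL_WINS["DZ"]
    return round(vs_average, 1), round(vs_all_dz, 1)


print(staff_value({"AZ": 1, "BZ": 2, "B non-Z": 1, "CZ": 1}))  # top row of Table 5
print(staff_value({"DZ": 5}))                                  # bottom row of Table 5
```

Multiplying the first number by 10 gives the run differential an offense would have to supply to offset the staff, which is where the 160 runs for the all-DZ team comes from.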

 

In conclusion, we have shown that good starters produce four benefits for a team. They give up fewer runs, they throw more innings, and they allow a manager to use a bullpen more selectively. In addition, good starters allow a team to go with a smaller bullpen so a manager can save roster spots for a good pinch hitter or prospect. But acquiring good pitchers can be costly.

 

To decide if it is worth upgrading a starter, we offer the figures in Table 3 to estimate the wins he might garner and to ask if this is greater or less than the runs a position player who would be traded (or drafted) in the pitcher's stead might produce.

 


 

 

 

 

 

Table 1: MAIL 3 PITCHING PERFORMANCE BY GRADE OF STARTER, 1996-98

 

Grade         #  R/210    ERA      G     GS     CG     SH      W      L     IP    H/9    R/9   HR/9   BB/9    K/9
A            14   73.5   2.85   32.9   32.9   26.6   3.43     21   10.3    288    5.4   3.15    1.1   2.65   8.81
B            85   99.2   3.82   31.3   31.3   13.0   2.06   15.4   11.0    238    7.6   4.25    1.2   2.99   6.98
C           141  119.7   4.64   29.3   29.2    7.4   1.45   11.4   10.8    194    9.5   5.13    1.2   3.19   6.57
D            67  161.0   6.27   29.8   28.6   7.85    0.6    7.0   15.2    187   11.9   6.90    1.4   3.42   6.91
AVG              118.2   4.59   30.1   29.8    9.9   1.52   12.0   11.8    209   9.09   5.07    1.2   3.14   6.91

 

Table 2: Mail 3 Pitching Performances

By Grade and Control of Starter, 1996-98

 

Grade         #  R/210    ERA      G     GS     CG     SH      W      L     IP    H/9    R/9   HR/9   BB/9    K/9
AZ            9   72.8   2.80   33.7   33.7   27.3    3.8   21.3   10.4  293.4   5.44   3.12   1.13   2.45   8.90
A Non-Z       5   75.6   2.97   31.6   31.6   25.2    2.8   20.4   10.0  277.4   5.47   3.24   1.06   3.05   8.70
BZ           44   97.3   3.72   32.6   32.6   14.6    2.2   16.8   11.5  251.7   7.67   4.17   1.26   2.53   6.86
B Non-Z      41  101.7   3.93   29.8   29.8   11.2    1.9   13.8   10.4  224.1   7.48   4.36   1.21   3.55   7.14
CZ           59  114.8   4.48   29.5   29.5    8.0    1.7   11.7   10.9  198.4   8.91   4.92   1.18   2.64   6.37
C Non-Z      82  123.4   4.76   28.9   28.7    6.9    1.2   11.0   10.6  189.5   8.82   5.29   1.22   3.60   6.74
DZ           31  150.0   5.86   32.5   31.1    8.7    0.7    8.3   15.7  206.2  11.65   6.43   1.39   2.76   6.68
D Non-Z      36  173.8   6.63   28.2   27.1    7.3    0.5    6.2   15.2  175.0  11.96   7.45   1.42   4.09   7.11
All         307  118.2   4.59   30.1   29.8    9.9    1.5  11.99   11.8  209.2   9.09   5.07   1.24   3.14   6.91

 

 

 

 

 

Table 3: Estimates of Pitcher Difference

in Wins and Losses Over One Season

 

GRADE      NORM WIN    EXTRA INN    EXTRA INN   SAVED PEN   TOTAL      TOTAL WINS
           VALUE (2)   THROWN (3)   VALUE (4)   VALUE (5)   WINS (6)   OVER DZ (7)
AZ            4.5          83           .7          .7         5.9         9.0
A NON-Z       4.3          67           .5          .6         5.4         8.5
BZ            2.1          32          -.1          .3         2.3         5.4
B NON-Z       1.7          14          -.1          .1         1.7         4.8
CZ             .3         -12           .1         -.3          .1         3.2
C NON-Z       -.5         -20           .3         -.5         -.7         2.4
DZ           -3.2          -3           .1           0        -3.1           0
D NON-Z      -5.6         -35          1.4          -1        -5.2        -2.1

 

 

 

Table 4: IMPACT OF ADJUSTED BULLPEN USAGE FOR STARTERS OF DIFFERENT GRADES AND CONTROL RATINGS

 

 

Average Bullpen Usage Needed by Mail 3 Teams

                      A       B       C       D     Tot
Innings            65.1   193.6   101.3       -     360
Runs Allowed/9     3.15    3.82    4.64    6.90    3.93

           Bullpen Innings Needed            Bullpen Runs Yielded
Starter      A      B      C      D  Total   Average  Improved  Difference
AZ          65    194     18           277     121.0     114         7.0
A           65    194     34           293     128.0     122.5       5.5
BZ          65    194     69           328     143.2     140.6       2.6
B           65    194     87           346     151.0     149.9       1.1
CZ          65    194    106      7    372     162.4     165.0      -2.6
C           65    194    106     15    380     166.0     171.1      -5.1
DZ          65    194    104           363     158.5     158.6      -0.1
D           65    194    106     30    395     172.5     182.6     -10.1

 

 

 

Table 5: Impact of Grades on Win Totals for Various Staff Mixes

 

Staff makeup                  Versus League Average   Versus All DZ   Record
AZ, 2 BZ, B non-Z, CZ                 +12.3               +27.8       (93-69)
2 BZ, B non-Z, 2 CZ                    +6.4               +22.0       (88-74)
BZ, B non-Z, 2 CZ, C non-Z             +3.5               +19.0       (84-78)
B non-Z, CZ, 2 C non-Z, DZ             -2.7               +12.8       (78-84)
2 CZ, C non-Z, 2 DZ                    -6.7                +8.8       (74-88)
2 C non-Z, 3 DZ                       -10.7                +4.8       (70-92)
5 DZ                                  -15.5                   0       (65-97)

 



[1] One issue we don't consider here, but which can easily be addressed using our framework for analysis, is the size of a league. Few draft leagues we know about have the 30 teams found in the major leagues.

Many have 30 (or larger) player rosters, and as a result have better hitting and pitching than the average major league team. We also realize that leagues vary in their restrictions on player usage.

[2] We are only presenting calculations with one year in mind and assume that the longer term potential of the players in question is equal.

[3] The starting pitchers used in the study accounted for 94% of the games started over the three year period and the average number of starters per team is 5.1.

[4] Note the anomaly that the D non-Z is worth +1.4 wins because he throws an average of 35 innings less than a league average starter.

[5] There is a slight discrepancy between the innings pitched by the starters included in the study and all starters since the study only included those starters who had at least 15 starts.

[6] Because we estimated the impact of saved bullpen innings for individual pitchers, the estimates might be a little different when considering the composition of an entire staff. However, because the impact is relatively small except for pitchers at the extremes, we think the numbers here are good general guidelines.