Response to Reviewer #1’s Comments on: “Subseasonal prediction of springtime Pacific-North American transport using upper-level wind forecasts”

The original motivation for our study was to examine STT to the PBL over North America, because this is the type of STT that can directly influence surface air quality. For this specific type of STT, spring is by far the most important time period, and our study is not the first to point this out (see, e.g., the references we included: Fiore et al. 2003; EPA US 2006; Langford et al. 2009; Lefohn et al. 2011; note that the first column of our Fig. 1 also supports this).

In terms of STT to 500 hPa and TME (and water vapor transport more generally), on the other hand, the reviewer is quite right that transport skill during DJF is an interesting topic. Indeed, water vapor transport is large in magnitude (e.g., atmospheric rivers from the Pacific basin to the west coast of the US) and forecast skill is perhaps at its peak during DJF. This is likely one reason why so many studies (too many to count) discuss the predictability of water vapor transport during DJF. However, very few modern studies (if any) have examined STT and TME forecast skill during spring, despite the fact that this is a dynamically interesting period (see, e.g., the Breeden et al. paper we cite) and an important season for moisture transport over the Pacific-North American region (e.g., Mundhenk et al. 2016).
Thus, because (1) STT to the PBL is so important during spring and (2) very little research has documented STT and TME forecast skill during spring, we feel that focusing on spring rather than other months is a worthwhile endeavor. That said, we agree with the reviewer that we should be more explicit when explaining our motivation for examining spring, so we have rewritten a portion of the Introduction and added some text that expands on our rationale for focusing on MAM (see lines 31-46).
Reviewer wrote: Lines 185-187: It is not clear how the claim of "vertically deep" shifts in the jet stream is made from the 2-D comparison. I am assuming that is because the jet climatology is derived from vertically averaged winds between 100-500 hPa. However, given the strong vertical gradients of wind around the jet stream, the wind changes don't have to be vertically deep to result in shifts to the jet. In addition, the phrase could also be interpreted to imply the jet stream location shifts vertically which is not necessarily the case.
Our response: You are correct in assuming that the jet climatology was calculated for vertically averaged winds between 100-500 hPa. Although we did not include the results in the manuscript, we also tested our results using a "shallow jet" climatology (based on upper tropospheric winds only, and also taken from the Koch et al. 2006 dataset). We found that our results did not depend on the depth of the jet climatology. Thus, the depth does not appear to be particularly important, and given the potential for confusion that you have pointed out, we have simply removed the reference to jet depth.
Reviewer wrote: Line 223 and
Our response: Thank you for pointing out this omission. We included p-values in Table 1 and confidence intervals in Fig. 4, but neglected to do so for Table 2. However, instead of listing p-values in Table 2, we have listed the 95th percentile confidence intervals (we include the p-values in the Table 2 caption). We made this choice for two reasons. First, we have some concerns about whether the p-values are really meaningful for the results in Table 2. For example, the p-value for PC3 at week 6 is <0.05 despite the fact that the correlation is 0.09; that is, we feel that the 0.09 correlation is essentially useless for prediction purposes even though it is technically "statistically significant". Second, confidence intervals at least provide some information about the "usefulness" of the correlation. Indeed, this is the reason that we included confidence intervals for the similar forecast-verification time series correlation calculations shown in Fig. 4.
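The distinction between statistical significance and practical usefulness drawn above can be illustrated with a minimal sketch. It uses the standard Fisher z-transform to approximate the two-sided p-value and 95% confidence interval of a Pearson correlation. The sample size below is a purely hypothetical value chosen for illustration (it is not the sample size used in the manuscript), the samples are assumed to be independent, and this is not necessarily the procedure used to produce Table 2.

```python
import math

def fisher_ci_and_p(r, n, z_crit=1.96):
    """Approximate two-sided p-value and 95% CI for a Pearson correlation r.

    Uses the Fisher z-transform and assumes n independent samples.
    """
    z = math.atanh(r)                # Fisher z-transform of r
    se = 1.0 / math.sqrt(n - 3)      # approximate standard error of z
    # Two-sided p-value from the standard normal distribution
    p = math.erfc(abs(z) / se / math.sqrt(2.0))
    # Build the interval in z-space, then transform back to correlations
    lo = math.tanh(z - z_crit * se)
    hi = math.tanh(z + z_crit * se)
    return p, lo, hi

# Hypothetical sample size, for illustration only (not from the manuscript)
N_HYPOTHETICAL = 1000

r = 0.09
p, lo, hi = fisher_ci_and_p(r, N_HYPOTHETICAL)
explained = r ** 2                   # fraction of variance explained

print(f"p-value: {p:.4f}")
print(f"95% CI: ({lo:.3f}, {hi:.3f})")
print(f"variance explained: {explained:.2%}")
```

With a sufficiently large sample, a correlation of 0.09 passes the p < 0.05 threshold while explaining less than 1% of the variance, and the confidence interval makes this weakness visible in a way that the p-value alone does not.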
Nevertheless, for completeness, we did note the p-values for the full time series in the Table 2 caption, and we also added a note of caution for readers who might over-interpret the "significance" of the small p-values. Specifically, we included additional p-value information in which we tried to account for autocorrelation in the data that might artificially inflate the apparent significance (i.e., make the p-values artificially small). This additional calculation shows that the PC3 correlations at weeks 5 and 6 are indeed likely of negligible usefulness/statistical significance.
Our response: We did check many other combinations, but for the most part, the other combinations are either for regions near those we already show (STT500 and TME over the central Pacific) or are not statistically significant. We have added a sentence on lines 355-358 to note this.
In addition to this new text, it is worth noting that on lines 412-415 we discuss the fact that TME over Alaska for positive phase EOF1 and negative phase EOF2 appears not to be reliably predictable, while on lines 372-375, we discuss the lack of significant forecast skill over the western US.
Reviewer wrote: Many of the figures lack labels on the color bars and/or the axes. These need to be added.
Our response: We find this comment a little confusing because (1) all of the figures have color bars, with the units noted in the figure captions, and (2) all of the figures have axis labels; we simply chose to include the axis labels only at the bottom of every column and on the far right-hand side (or sometimes left-hand side) of each row. We prefer to keep this labeling convention because we believe that it reduces clutter on multi-panel plots (e.g., Figure 3 would otherwise have 18 sets of labels instead of the 6 that we find preferable).
In addition to the inclusion of the new composites and text just mentioned, we have also tried to address your specific comments from this portion of your review (see next three responses immediately below).
Reviewer wrote: L187-190: "While the EOF patterns likely combine jet variability due to both the subtropical and polar front jets (Koch et al. 2006), a strong jet stream of either type will act as a waveguide for Rossby waves (e.g., Schwierz et al. 2004; Rivière 2010 and references therein) with an increased frequency of STT and TME (e.g., Shapiro and Keyser 1990, Koch et al. 2006)."

[and later]
Are TMEs really mentioned in Shapiro and Keyser (1990) and Koch et al. (2006) to be linked to Rossby waveguides?
Our response: You are right that we should be a little more careful with the reference structure here. That is, the Shapiro/Keyser paper is really only related to STT (not TME), while the Koch et al. paper is only related to STT and TME in the sense that STT and TME are related to Rossby waves that tend to be "steered" along the waveguide of the jet. We have switched out the Koch et al. reference for Sprenger et al. 2017 where they discuss the connection between TME, STE, and jet variability, Eady growth rates, etc. We have also added the Higgins et al. 2000 reference to help readers explore the connection between Pacific jet (and teleconnection) variability and TME.

Reviewer wrote: L200-202:
"STTPBL on the other hand (Fig. 3 middle row), have maxima slightly downstream of the 500 hPa maxima, which reflects the fact that deep STT tends to occur as maturing Rossby waves amplify and tropopause folds and potential vorticity streamers extend downwards towards the surface (Wernli and Bourqui 2002; Sprenger et al. 2003; Appenzeller et al. 1996; Wernli and Sprenger 2007; Škerlak et al. 2015)."

What does it mean that a PV streamer extends down to the surface?
Our response: Thank you for pointing out that this section needs to be more carefully written. The point that we were trying to make (perhaps unclearly) was that as a Rossby wave breaks, filamented structures are stretched out along isentropes. Because isentropes slope downwards towards the equator, any filaments that are stretched out towards the equator will consequently "extend downwards", closer to the surface of the Earth. This view is consistent with Škerlak et al. 2014 (see the first paragraph of their Sect. 3.1.3 and their Fig. 4), where they point out that local maxima in STT tend to move equatorward as pressure increases (i.e., maxima in STT are farther poleward at 500 hPa than at 800 hPa).
We have tried to rewrite the text in this section (lines 275-280) to be more careful with our language (note that this section now also includes the mention of the streamer and PV cutoff figures discussed above). One issue here is that it is a little unclear exactly how the mass is exchanged to the PBL, because it is unclear from the time averaged STT climatologies how the isentropic surfaces are warped as the Rossby wave fully breaks (obviously in a climatological sense, the isentropes that intersect the stratosphere in midlatitudes do not intersect the surface over western N. America for example.) In that sense, a better word choice than "extends to the surface" is to say that streamers "extend towards the surface", because clearly at some point turbulent mixing and diabatic circulations must occur to cause irreversible mixing across isentropes. Note that our view is supported by discussions in Škerlak et al. 2014 (see the left column on their page 918), which we now reference on lines 277-278.
Reviewer wrote: Whereas the signals for STT-500hPa and TME are rather clear, it is much more difficult to see the signal of STT-PBL. For instance, in Figure 7 it is really difficult to see the signal in the west-American box for the hindcast simulations. The same applies, to a lesser degree, to the retrospective analysis in Figure 3. I wonder whether this would become somewhat clearer if (especially in Figure 7) the colorbar is adjusted. Overall, I have the impression the link between the jet structures, as expressed in PC1-3, is clearly discernible for STT-500hPa and TME, but rather weak for STT-PBL. Actually, Figure 10 shows that there is some skill, but it is difficult to get it from Figure 7. In summary, I see that the authors are aware of this fact and argue by means of Rossby waves not extending to the surface (see point 2 above) and PBL effects (altitude) to explain this weak signal of STT-PBL compared to STT-500hPa and TME, but the explanation is not fully convincing. Please add, at least, some references supporting the argument.

Our response:
We totally agree that the STT-PBL is somewhat hard to see relative to STT-500 and TME! Rather than a flaw in the choice of color scale though, we see this as fundamentally indicative of the fact that the STT-PBL anomalies are rather weak and difficult to predict. Nevertheless, to help the reviewer confirm that the Fig. 7 patterns of STT-PBL are consistent (spatially) with those shown in Figs. 3 and S3, we include a figure below that shows STT-PBL with a modified color bar and a slightly different color scheme that is easier to see (the figure below is analogous to Fig. 7 in the main manuscript). Still, in the published version of Fig. 7, we have decided to keep the scale of the color bar for STT-PBL equivalent to that used for STT-500 because we feel that changing the color scale to make the STT-PBL anomalies easier to see risks making them seem larger or more robust than they actually are (that is, when a reader thinks that the anomalies are hard to see, thus concluding that the result/signal is weak, we feel that they are correctly interpreting the reality of the problem).
To address the reviewer's concern that we did not fully explain the reason that the forecasted STT-PBL signal is weak compared to STT-500, we have expanded the text (see lines 385-409) to include a discussion highlighting potential reasons why the STT-PBL forecasts might be worse than those for STT-500.
Reviewer wrote: L48: As a more recent study linking STT and PV streamers, the authors might want to add the following reference: Sprenger, M., Wernli, H., & Bourqui, M. (2007).
Our response: Done.