# «June 2015 Abstract Since 2011, Saudi Arabia has dramatically extended its labor market policies to address youth unemployment and low Saudi ...»

**5.1 Regression Kink Design**

The RKD estimates treatment effects using kinks in a continuous policy variable that is based on a potentially endogenous assignment variable. This method is analogous to regression discontinuity design, but can be used in cases where the policy variable is continuous but contains discontinuities in its derivative (i.e. kinks). The treatment effect is identified by the discontinuities in the derivatives (i.e. changes in the slopes) of the outcome variables around the kink point in the policy variable. This is critical to the analysis of the Nitaqat program because the size of the necessary hiring increase approaches zero as firms near the cutoff; a Yellow firm with 7.9 percent Saudization facing an 8 percent quota will have almost no need to adjust its staffing. Yellow firms further below the quota, however, will need to increase Saudization by an amount that grows with their distance below the quota, while Green firms' incentives to change Saudization rates will be uniformly zero regardless of their distance above the cutoff. The RKD method will allow us to exploit this kink in the quota compliance requirements to estimate the effect of the program near the quota cutoffs. The program's treatment effect will be identified by changes in the slopes of the outcome variables around this kink point in the assignment function.

The RKD is formalized by Card, Lee, Pei & Weber [2012], who establish the conditions under which the RKD identifies the local average response, or treatment on the treated, parameter that would be identified if the treatment had been randomly assigned. The necessary identification tests and robustness checks are similar to those for RDD outlined in detail in Lee & Lemieux [2010].

This method has previously been used for the evaluation of programs with kinked benefit structures such as the EITC [Jones 2013], UI benefits [Card et al. 2012], college financial aid [Nielsen, Sorensen & Taber 2010], intergovernmental grants [Dahlberg, Mörk, Rattsø & Ågren 2008], education finance [Guryan 2001], and prescription drug reimbursement [Simonsen, Skipper & Skipper 2010].

**5.1.1 Compliance Requirement**

The RKD analysis in this paper relies on the kinked compliance requirement that was generated by the imposition of Saudization quotas on firms in each industry by size group. As discussed above, the most important quota is the one at the Green/Yellow cutoff. The incentive for these firms to increase their Saudization percentage was increasing in their baseline distance below this cutoff.

For example, the cutoff for medium-sized construction firms was six percent; a firm in the Yellow band with four percent Saudi workers needed to increase its Saudization rate by two percentage points to comply with the program. For Green and Platinum firms above the cutoff, I assume that firms already in compliance (with baseline Saudization rates just above the quota) experienced no incentive to change their Saudization rates as a result of the program. A medium-sized construction firm with eight percent Saudi workers, for example, would have a compliance requirement of zero.

This generates a kinked function mapping initial Saudi percentage to the increase mandated by the program. Figure IIIa shows this compliance requirement for medium construction firms. This rule generates a similar compliance requirement with a kink at the quota level for each of the 109 industry by size cells; these kinked compliance requirements are plotted for each cell in Figure IIIb.
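The rule described above can be sketched in a few lines. This is only an illustration: the function name `compliance_requirement` and its arguments are hypothetical and do not come from the paper, and the sketch assumes, as the text does, that firms at or above the quota face no requirement.

```python
def compliance_requirement(baseline_pct: float, quota_pct: float) -> float:
    """Percentage-point increase in Saudization needed to reach the quota.

    Assumption (from the text): firms at or above the quota face a
    requirement of zero, which is what creates the kink at the quota.
    """
    return max(quota_pct - baseline_pct, 0.0)


# Medium-sized construction firms faced a six percent quota.
yellow_firm = compliance_requirement(baseline_pct=4.0, quota_pct=6.0)
green_firm = compliance_requirement(baseline_pct=8.0, quota_pct=6.0)
print(yellow_firm, green_firm)  # 2.0 0.0
```

The slope of this function with respect to the baseline rate is minus one below the quota and zero above it, which is the kink the RKD exploits.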

We can combine these by normalizing the cutoff to zero and measuring the compliance requirement as the distance below the cutoff, i.e.

$$b(V_{ijs}) = \max(Q_{js} - S_{ijs},\, 0)$$

where $S_{ijs}$ is the initial Saudization percentage for firm $i$ and $Q_{js}$ is the quota for the corresponding industry $j$ and size group $s$. For Yellow and Red firms, this will be positive: a firm with a baseline Saudization rate of 5 percent facing a quota of 8 percent would have $b(3) = \max(3, 0) = 3$. A Green firm with 9 percent Saudization facing the same quota would have $b(-1) = \max(-1, 0) = 0$. This normalization collapses the compliance rules in Figure IIIb into a rule with a single kink at zero shown in Figure IIIc.³²

When examining the effect of the program on variables measured in terms of employees, i.e. number of Saudi employees and number of expatriate employees, it will also be useful to define the distance from the cutoff in terms of the number of Saudis that would have to be hired or expatriates that would have to be downsized to meet the quota. For Saudis, we can express this as:

Note that this normalization pools firms facing different quota cutoffs into a single sample. This approach is standard in the RD literature when cutoffs vary by treatment site or year (see for example Black, Galdo & Smith [2007]), and yields an estimate of the weighted average effect over cells.

For example, a firm with 12 expatriate employees and 1 Saudi employee facing a quota of 10 percent would need to downsize 3 expatriate workers to meet the quota. These normalizations are useful in the interpretation of the effects of the program in terms of the number of different types of workers employed. The normalized compliance requirements are plotted in Figures IVa and IVb.
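The headcount version of the compliance requirement can be illustrated with a short sketch. The paper's own expressions are not reproduced in this text, so the formulas below are an assumption: they are derived directly from the definition of the Saudization share, S / (S + E), holding the other group's headcount fixed. The function names are hypothetical, and the results are left unrounded because the text does not say how fractional workers are treated.

```python
def expats_to_downsize(saudis: float, expats: float, quota_pct: float) -> float:
    """Expatriates to shed so that S / (S + E) reaches the quota, S held fixed.

    Solving S / (S + E') = q / 100 for E' gives E' = S * (100 - q) / q.
    """
    max_expats = saudis * (100.0 - quota_pct) / quota_pct
    return max(expats - max_expats, 0.0)


def saudis_to_hire(saudis: float, expats: float, quota_pct: float) -> float:
    """Saudis to add so that (S + x) / (S + x + E) reaches the quota, E held fixed.

    Solving for the required Saudi headcount gives S' = q * E / (100 - q).
    """
    min_saudis = quota_pct * expats / (100.0 - quota_pct)
    return max(min_saudis - saudis, 0.0)


# The example from the text: 12 expatriates, 1 Saudi, 10 percent quota.
print(expats_to_downsize(saudis=1, expats=12, quota_pct=10.0))  # 3.0
print(saudis_to_hire(saudis=1, expats=12, quota_pct=10.0))      # about 0.33
```

The first call reproduces the three expatriate workers in the example above. Both functions return zero for firms already at or above the quota, so each has a single kink at the cutoff, as in the percentage-based requirement.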

The assumption that Green and Platinum firms have a compliance requirement of zero is consistent with the idea that baseline Saudization rates reflect unobserved differences in propensity to hire Saudis, whether because of fixed investments made in Saudi HR development, physical capital, or employee-driven recruitment networks. If this propensity to hire Saudis generates an optimal number of Saudi workers that is not affected by the presence of non-binding quotas, then we would not expect these firms to change their staffing in response to Nitaqat regulations.³³ However, this assumption will be violated if firms above the quota experienced pressure to change their Saudi percentages. This may be the case if quotas affected equilibrium wages or resulted in other spillovers from treated (Yellow and Red) to non-treated (Green and Platinum) firms. In this case, firms above the quota would have incentives to move down to the quota, implying a compliance requirement with a smaller kink than the one described above.³⁴ The results, however, indicate that firms just above the quota tended not to adjust their Saudi employment in response to Nitaqat requirements.

**5.1.2 RKD Identification and Estimation**

The identification assumptions and estimation procedure for RKD are very similar to those required for RDD, but applied to the discontinuity in the derivative rather than the level of the treatment function. In particular, for outcome $Y$, starting Saudization quota distance $V = Q - S$ and Nitaqat compliance requirement $B$, we can express the effect of the Saudization requirement on the outcome of interest using the generalized nonseparable model $Y = y(B, V, U)$, i.e. define the outcome of interest as a general function of the compliance requirement $B$, baseline Saudization quota distance (and potentially other observable covariates) $V$, and an unobserved error term $U$.

The key relationship of interest is the effect of $B$ on $Y$.³⁵ The policy parameter of interest is therefore

$$E\left[\frac{\partial y(B, V, U)}{\partial B} \,\Big|\, V = 0\right]$$

If variation in baseline Saudization rates is driven by random fluctuations around the median (where the quotas were set), then we would expect Green and Platinum firms to tend to revert to the mean, decreasing their Saudization rates independently of the program.

If the compliance requirement were in fact entirely smooth, then the RKD would find no program effect even if the program in fact had a large effect on firms. This effect may be mitigated by the incentive of these firms to maintain their Nitaqat compliance by replacing these workers.

In this formulation, the error term U may enter the model non-additively, which allows for unrestricted heterogeneity in the response of Y to V. This setup also allows heterogeneity in the response of Y to B.

A1: (Regularity) $y(\cdot, \cdot, \cdot)$ is continuous, and $y_1(b, v, u)$ is continuous in $b$ for all $b$, $v$, and $u$.

The marginal effect of $B$ on $Y$ must therefore be a continuous function of both observables and of the unobserved error term.

A2: (Smooth Effect of $V$) $y_2(b, v, u)$ is continuous in $v$ for all $b$, $v$, and $u$.

$V$ may affect $Y$, but the effect is assumed to be continuously differentiable, so any observed kinks in $Y$ cannot be the direct result of small changes in $V$. In our case, this would rule out a kinked underlying relationship between baseline quota distance and increases in Saudi percentage in the absence of the Nitaqat program.

A3: (First Stage) $b(\cdot)$ is a known function that is everywhere continuous and continuously differentiable on $(-\infty, 0)$ and $(0, \infty)$, but $\lim_{v \to 0^+} b'(v) \neq \lim_{v \to 0^-} b'(v)$. The compliance function must therefore be known and have a kink at $v = 0$. There also must be a positive density around the kink point. In our case, the compliance requirement is $b(V) = \max(V, 0)$, so

$$\lim_{v \to 0^+} b'(v) = 1 \neq 0 = \lim_{v \to 0^-} b'(v)$$

Because the quotas were placed near the median Saudization rates for each industry by size cell, there is also a large density of firms around this kink point.

A4: (Smooth Density) $F_{V \mid U = u}(v)$ is twice continuously differentiable in $v$ for all $u$ and $v$. This condition rules out the manipulation of the assignment variable and is the key identifying assumption.

In summary, if everything else is continuous near the kink, any changes in the slope of the outcome can be attributed to the kink in the compliance requirement $B$. In this case, the RKD will identify the “treatment on the treated” parameter at this point, i.e. the average effect of a marginal increase in the compliance requirement near the cutoff holding the distribution of unobservables constant. The degree to which $V$ and $U$ are correlated will determine the extent to which this treatment effect applies to firms that are further away from the quota.

There are two testable implications of the identification assumptions above. First, in a valid sharp RKD, $f_V(v)$ must be continuously differentiable in $v$. This rules out precise manipulation of baseline Saudization percentage by firms near the quota cutoffs. This is reasonable given that the quotas were not announced prior to the start of the program: although firms had been informed that the government would start enforcing Saudization quotas, firms were not told where the cutoffs would be for their industry and size groups until the start of the program in June 2011. We can test for this by examining the baseline distribution of $V$. In particular, I use a modified McCrary test to test for a break in the density of $V$ around the kink in the compliance function [McCrary 2008].

Figure V plots the density of baseline Saudization percentages relative to the cutoff. A McCrary test shows no evidence of bunching to the right of the quota at the start of the program, and the figure confirms that quotas were set near the median starting Saudization percentages.

The second testable implication is that there should be no kink in baseline covariates around the quota, i.e. $\frac{d \Pr(X \le x \mid V = v)}{dv}$ is continuous in $v$ at $v = 0$ for all $x$.³⁶ Baseline values of several sample covariates (firm size, Saudi employees, and expatriate employees) are plotted in Figure VI; none of these correspond to a statistically significant kink or discontinuity in averages around the cutoff.

The fact that quotas were assigned near cell medians also means that there should be roughly the same number of firms above and below the cutoff within industry by size groups.

If we use a simple, additive model with constant effects:

This RKD estimand is the change in the slope of the conditional expectation function $E[Y \mid V = v]$ at the kink point $v = 0$ divided by the change in the slope of the assignment function $b(\cdot)$ at that same point. In our case, the assignment function is $b(V) = \max(V, 0)$, so the change in the slope of the assignment function is 1 at the cutoff. We therefore have

where $|v| \le h$ for bandwidth $h$ and $p$ is the polynomial order of the fit. The analysis estimates these local polynomial regressions using a symmetric uniform kernel and several estimation and bandwidth selection methods.
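To make the estimation step concrete, the following is a minimal sketch of a conventional local polynomial kink estimator with a uniform kernel. It is not the paper's code, and the paper's exact regression equation is not reproduced in this text: the function name `rkd_local_poly`, the continuity restriction at zero, and the simulated data are all assumptions made for illustration. The estimate is the change in the first-derivative coefficient at $v = 0$, divided by the change in the slope of $b(\cdot)$, which equals one here.

```python
import numpy as np


def rkd_local_poly(y, v, h, p=1, slope_change_b=1.0):
    """Kink estimate at v = 0 from a local polynomial fit with a uniform kernel.

    Fits y on an intercept, v^1..v^p, and D * v^1..v^p for |v| <= h, where
    D = 1(v > 0). The fitted function is continuous at zero by construction.
    """
    y = np.asarray(y, dtype=float)
    v = np.asarray(v, dtype=float)
    keep = np.abs(v) <= h  # uniform kernel: equal weight inside the bandwidth
    y, v = y[keep], v[keep]
    d = (v > 0).astype(float)  # side on which b(v) = max(v, 0) has slope one

    powers = [v**k for k in range(1, p + 1)]
    columns = [np.ones_like(v)] + powers + [d * term for term in powers]
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)

    # coef[p + 1] multiplies D * v: the change in the first derivative at zero.
    return coef[p + 1] / slope_change_b


# Simulated data (hypothetical): a smooth direct effect of v plus a true
# effect of 1.5 per unit of the compliance requirement b = max(v, 0).
rng = np.random.default_rng(0)
v_sim = rng.uniform(-10, 10, size=2000)
b_sim = np.maximum(v_sim, 0.0)
y_sim = 2.0 + 0.3 * v_sim + 1.5 * b_sim + rng.normal(scale=0.5, size=v_sim.size)
kink_estimate = rkd_local_poly(y_sim, v_sim, h=5.0)
print(kink_estimate)  # should be close to the simulated effect of 1.5
```

Re-running the function with different values of `h` and `p` gives a simple way to see how sensitive the estimate is to the bandwidth and polynomial order, which is the motivation for reporting several bandwidth choices.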

This is analogous to a test for true random assignment in an RCT.

In addition to the conventional nonparametric RKD estimator, I also use the bias-corrected estimator proposed by Calonico, Cattaneo & Titiunik [2014] and implemented using Calonico [2014].

This procedure adjusts the local RKD estimate with a bias correction based on a local regression of order p + 1. I use their routine for calculating robust confidence intervals for these estimates using the fixed-matches estimated residuals. Results are also reported for several choices of bandwidth. These include the “rule-of-thumb” (ROT) bandwidth selector described in Fan & Gijbels [1996] and the bandwidth selector proposed by Calonico et al. [2014] (CCT). For consistency across outcome variables I also report the results for bandwidths of 5 and 50. The bandwidth for the bias-correction term is the same as the bandwidth of the local polynomial in the ROT and manual bandwidth selections, and is selected optimally for the CCT bandwidth-selection routine.

Following the analysis in Card, Lee, Pei & Weber [2015], I use the local linear estimates with the ROT bandwidth selections as the preferred specification. However, results are reported for all four bandwidth selections and both the conventional kink estimates and bias-corrected estimates with robust confidence intervals as described in Calonico et al. [2014].

**5.2 Differences in Differences**

While the RKD analysis focuses on changes in incentives to hire around the kink in the policy rule, it is also useful to estimate the overall effects of the Nitaqat program on Saudi employment, expatriate employment, firm size and exit.³⁷ This can be done by estimating the average effect of assignment to the Red or Yellow color bands as compared to firms in the Green band within the same industry by size cell: