
Interpret Interaction Effect Continuous and Dummy Variable

Marinela, the source is Joro Kolev, who says that the following two are algebraic regression facts:

1.) Fact one: if every industry is a singleton, nothing beyond the constant can be estimated. Here I tag exactly one observation per group defined by the variable rep78 (typed as rep, which Stata accepts as an abbreviation):

Code:

. sysuse auto, clear
(1978 Automobile Data)

. keep if !missing(rep)
(5 observations deleted)

. egen tag = tag(rep)

. reg price mpg i.rep if tag, absorb(rep)
note: mpg omitted because of collinearity
note: 2.rep78 omitted because of collinearity
note: 3.rep78 omitted because of collinearity
note: 4.rep78 omitted because of collinearity
note: 5.rep78 omitted because of collinearity

Linear regression, absorbing indicators         Number of obs     =          5
                                                F(0, 0)           =       0.00
                                                Prob > F          =          .
                                                R-squared         =     1.0000
                                                Adj R-squared     =          .
                                                Root MSE          =          0

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mpg |          0  (omitted)
             |
       rep78 |
          2  |          0  (omitted)
          3  |          0  (omitted)
          4  |          0  (omitted)
          5  |          0  (omitted)
             |
       _cons |       6921          .        .       .            .           .
------------------------------------------------------------------------------

As you can see, I cannot estimate anything but the constant.
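The algebra behind this can be sketched as follows. The notation is my own and not from the original post: $y_{ig}$ is the outcome (price) and $x_{ig}$ the regressor (mpg) for observation $i$ in group $g$, $\alpha_g$ is the absorbed group effect, and $n_g$ is the group size. The sketch assumes a single regressor besides the absorbed dummies, as in the regression above.

```latex
% Model with one intercept per group (the absorbed dummies)
y_{ig} = \alpha_g + \beta x_{ig} + \varepsilon_{ig}

% Absorbing the dummies is equivalent to demeaning within each group
\tilde{y}_{ig} = y_{ig} - \bar{y}_g , \qquad
\tilde{x}_{ig} = x_{ig} - \bar{x}_g

% The slope is then estimated from the demeaned data
\hat{\beta} =
  \frac{\sum_g \sum_{i \in g} \tilde{x}_{ig}\,\tilde{y}_{ig}}
       {\sum_g \sum_{i \in g} \tilde{x}_{ig}^{2}}

% In a singleton group the mean is the observation itself
n_g = 1 \;\Longrightarrow\; \bar{x}_g = x_{ig},\ \bar{y}_g = y_{ig}
        \;\Longrightarrow\; \tilde{x}_{ig} = \tilde{y}_{ig} = 0
```

A singleton group therefore adds zero to both the numerator and the denominator. If every group is a singleton, both sums are zero, so the slope is not identified and Stata omits mpg.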

Now I keep the rep==1 and rep==2 groups as singletons, but let the groups with rep>2 keep all of their observations (so they are not singletons):

Code:

. replace tag = tag + 1 if rep>2
(59 real changes made)

. reg price mpg i.rep if tag, absorb(rep)
note: 2.rep78 omitted because of collinearity
note: 3.rep78 omitted because of collinearity
note: 4.rep78 omitted because of collinearity
note: 5.rep78 omitted because of collinearity

Linear regression, absorbing indicators         Number of obs     =         61
                                                F(1, 55)          =      17.25
                                                Prob > F          =     0.0001
                                                R-squared         =     0.3416
                                                Adj R-squared     =     0.2817
                                                Root MSE          =     2573.4

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mpg |  -261.2437   62.89928    -4.15   0.000    -387.2967   -135.1908
             |
       rep78 |
          2  |          0  (omitted)
          3  |          0  (omitted)
          4  |          0  (omitted)
          5  |          0  (omitted)
             |
       _cons |   11945.14   1392.397     8.58   0.000     9154.718    14735.57
------------------------------------------------------------------------------

Now I managed to estimate the slope on mpg. However,

2.) Fact two: the slope I estimated on mpg owes nothing to the two singleton groups. In fact, the regression above simply disregarded the singleton groups.

Below I estimate the regression only for rep>2, that is, I drop the singleton groups manually, and the slope on mpg is still the same.

Code:

. reg price mpg i.rep if rep>2, absorb(rep)
note: 4.rep78 omitted because of collinearity
note: 5.rep78 omitted because of collinearity

Linear regression, absorbing indicators         Number of obs     =         59
                                                F(1, 55)          =      17.25
                                                Prob > F          =     0.0001
                                                R-squared         =     0.2431
                                                Adj R-squared     =     0.2018
                                                Root MSE          =     2573.4

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mpg |  -261.2437   62.89928    -4.15   0.000    -387.2967   -135.1908
             |
       rep78 |
          4  |          0  (omitted)
          5  |          0  (omitted)
             |
       _cons |   11864.94    1398.91     8.48   0.000     9061.463    14668.42
------------------------------------------------------------------------------

Note that the number of observations differs between the previous two regressions, 61 vs 59. And yet the slope on mpg, its standard error, and the residual degrees of freedom (55) are the same.
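One way to see why the two singleton observations are ignored is to do the absorption by hand. The following is a minimal sketch, not part of the original post: it rebuilds the 61-observation sample, demeans price and mpg within each rep78 group, and regresses one demeaned variable on the other. The names mean_price, mean_mpg, dm_price and dm_mpg are my own.

```stata
* Rebuild the 61-observation sample used above
sysuse auto, clear
keep if !missing(rep78)
egen tag = tag(rep78)
replace tag = tag + 1 if rep78 > 2
keep if tag

* Demean price and mpg within each rep78 group (hypothetical variable names)
foreach v of varlist price mpg {
    egen double mean_`v' = mean(`v'), by(rep78)
    generate double dm_`v' = `v' - mean_`v'
}

* In the two singleton groups the demeaned values should be exactly zero
list rep78 dm_price dm_mpg if rep78 <= 2

* Slope from the demeaned data; no constant is needed after demeaning
regress dm_price dm_mpg, noconstant
```

The coefficient on dm_mpg should match the slope on mpg reported above. The standard error will differ, because regress does not subtract the absorbed group means from the residual degrees of freedom the way the absorbing regression does.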

The same will happen in your regression if you include dummies at the 2-digit industry level: the singleton industries will not contribute to the estimation of your slopes.

Finally, what you do is up to you. If you think it is crucial to include dummies at the 2-digit industry level, do so, and live with the fact that the singleton industries are "silenced" and not allowed to say anything about your slope parameters.

On the other hand, if you include dummies at the 1-digit industry level, you control for industry at a coarser level, but you allow every industry to speak about what your slope estimates are.
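Before choosing the level of the dummies, it may help to check how many groups are singletons. A minimal sketch, using the same auto data sample as a stand-in: in your own data you would replace rep78 with your 2-digit industry identifier. The variable name group_size is my own.

```stata
* Stand-in sample with two singleton groups (rep78 == 1 and rep78 == 2)
sysuse auto, clear
keep if !missing(rep78)
egen tag = tag(rep78)
replace tag = tag + 1 if rep78 > 2
keep if tag

* Number of observations in each group (hypothetical variable name)
bysort rep78: generate group_size = _N

* How many observations sit in singleton groups, and which groups they are
count if group_size == 1
tabulate rep78 if group_size == 1
```

In this sample the count should be 2, one observation each for rep78 equal to 1 and 2. In real data you would restrict the sample to the observations actually used in the regression before counting, since missing values in other variables can turn a small group into a singleton.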

Originally posted by Marinela Veleva

Joro Kolev, thank you so much for the invaluable advice! Could you clarify one thing: what do you mean by "if you include dummies at the 2 digits level, you would be throwing out the information coming from those singleton industries."? Is there a source I could use to back up my choice of 1-digit dummies? I ask only because my supervisor questions everything, every time.

Best!


Source: https://www.statalist.org/forums/forum/general-stata-discussion/general/1599790-how-to-interpret-interaction-terms-with-continuous-variables