Econ 2123. Introduction to Econometrics
Sections 80/83. Prof. Williams
due Tuesday March 24
You can work together with other students in the class to answer each of the following
problems but you must write your assignments up separately.
Suppose the number of new home construction projects in the US started in month $i$,
$Y_i$, is regressed on the unemployment rate in the US in period $i$, $X_{1i}$, and 11 dummy
variables, where $X_{2i}$ is equal to 1 if period $i$ is January of some year, $X_{3i}$ is equal to
1 if period $i$ is February, and so on. The data consist of 120 monthly observations
over ten years. Use the FWL theorem to describe a simple regression that would
provide the same estimate for $\beta_1$.
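As a reminder of what the FWL theorem asserts, here is a minimal numerical sketch in Python. It uses synthetic data, not the housing data described above; the seed, the coefficient values, and the helper names `ols` and `residuals` are all illustrative assumptions. It checks that the coefficient on a regressor in a multiple regression matches the slope from regressing residualized outcomes on the residualized regressor.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 120                                  # ten years of monthly observations
month = np.arange(n) % 12                # 0 = January, ..., 11 = December

# Synthetic data with a seasonal component (hypothetical values throughout)
season = rng.normal(size=12)
x1 = 5.0 + season[month] + rng.normal(size=n)
y = 2.0 - 0.5 * x1 + 3.0 * season[month] + rng.normal(size=n)

# Intercept plus 11 monthly dummies (December is the omitted month)
D = np.column_stack([np.ones(n)] + [(month == m).astype(float) for m in range(11)])

def ols(X, y):
    """Return OLS coefficients from regressing y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def residuals(X, y):
    """Return the residuals from regressing y on the columns of X."""
    return y - X @ ols(X, y)

# Full regression: y on x1, the intercept, and the dummies
beta_full = ols(np.column_stack([x1, D]), y)[0]

# FWL: partial the intercept and dummies out of both y and x1,
# then run a simple regression of residual on residual
y_tilde = residuals(D, y)
x_tilde = residuals(D, x1)
beta_fwl = (x_tilde @ y_tilde) / (x_tilde @ x_tilde)

print(beta_full, beta_fwl)
```

The two printed numbers agree up to floating-point error, which is the content of the theorem; describing what the residualized variables mean in the monthly setting is left to the problem.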
(i) Suppose that using data on sales of a company’s product in $n$ different markets
the following regression was estimated
$$S_i = 3080 - \underset{(25{,}000)}{75{,}000}\,P_i + \underset{(1.06)}{4.23}\,A_i - \underset{(0.51)}{1.04}\,B_i$$
where, as usual, the numbers in parentheses are the robust standard errors. The
dependent variable, $S_i$, is the company’s sales and the regressors are price ($P_i$),
advertising expenditures ($A_i$), and advertising expenditures of the company’s
competitor ($B_i$). The company’s marketing department worried that this regression
suffered from imperfect multicollinearity because $A_i$ and $B_i$ have a
correlation coefficient of 0.97. To “solve” this problem $B_i$ was dropped and the
following result was used:
$$S_i = 2586 - \underset{(24{,}000)}{78{,}000}\,P_i + \underset{(4.32)}{0.52}\,A_i$$
a. Using this second regression, what would the company conclude about the
effect of advertising expenditures on sales?
b. If the company had used the first regression, what would they conclude
about the effect of advertising expenditures on sales?
c. Does the second regression suffer from an omitted variables bias due to
the omission of $B_i$? If so, is the bias positive or negative?
d. What can you conclude about how the company handled the “problem”?
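The link between a "long" regression and a "short" regression that drops one regressor can be checked numerically. The sketch below is a simplified illustration on synthetic data with only two regressors (price is left out), and the coefficient values 4.0 and -1.0 are hypothetical, not estimates from the problem. It verifies the in-sample identity that the short-regression slope equals the long-regression slope plus the omitted variable's coefficient times the slope from regressing the omitted variable on the included one.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500

# Synthetic, highly correlated regressors (illustrative only)
a = rng.normal(size=n)
b = 0.97 * a + np.sqrt(1 - 0.97**2) * rng.normal(size=n)
s = 1.0 + 4.0 * a - 1.0 * b + rng.normal(size=n)   # hypothetical true model

def ols(X, y):
    """Return OLS coefficients from regressing y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

ones = np.ones(n)
beta_long = ols(np.column_stack([ones, a, b]), s)   # s on constant, a, b
beta_short = ols(np.column_stack([ones, a]), s)     # s on constant, a
gamma = ols(np.column_stack([ones, a]), b)          # b on constant, a

# Short slope = long slope on a + (long slope on b) * (auxiliary slope)
implied_short = beta_long[1] + beta_long[2] * gamma[1]

print(beta_short[1], implied_short)
```

The sign of the second term, and hence the direction of the gap between the two slopes, depends on the sign of the omitted variable's coefficient and on the sign of its correlation with the included regressor.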
(ii) Consider a dataset consisting of several variables for each of a sample of 1000
bills that were voted on in Congress over the past ten years. Let $Y_i$ denote the
percentage of the members of Congress that voted in favor of the $i$th bill in the
final vote. Then let $X_{1i}$, $X_{2i}$, $X_{3i}$ denote the percentage of poor, middle class,
and wealthy Americans who supported the bill according to public opinion
polling. Using this data the following regression results were obtained:
$$Y_i = 0.1 - \underset{(0.18)}{0.2}\,X_{1i} + \underset{(0.22)}{0.3}\,X_{2i} + \underset{(0.25)}{0.6}\,X_{3i}$$
a. According to these regression results, what would be the effect of an increase
of 10 percentage points in the support of wealthy Americans for a
bill, holding fixed the level of support of the rest of the public? Is this
effect statistically significant?
b. Should one conclude from the regression results that only the support of
wealthy Americans influences the votes of members of Congress? (Hint: The
answer is no. But you should explain why.)
c. What statistical test should be done to determine if there is enough
statistical evidence to conclude that only the support of wealthy Americans
influences the votes of members of Congress? Give the null hypothesis that
you would test, stated as an equation, or equations, using the parameters
of the model. Also give the name of the test you would do to test this
null hypothesis and the command you would use in Stata.
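For intuition about testing several coefficient restrictions at once, the sketch below compares a restricted and an unrestricted least-squares fit in Python. Everything in it is an assumption made for illustration: the data are synthetic, the function name `ssr` is hypothetical, and the statistic computed is the homoskedasticity-only F-statistic. Since the results above are reported with robust standard errors, a heteroskedasticity-robust version of the test would be the natural choice in practice.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000

# Synthetic regressors and outcome (hypothetical data-generating values)
X = rng.uniform(0.0, 1.0, size=(n, 3))
y = 0.1 + 0.6 * X[:, 2] + rng.normal(scale=0.2, size=n)

def ssr(X, y):
    """Sum of squared residuals from an OLS regression of y on X."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ beta
    return e @ e

ones = np.ones((n, 1))
ssr_unrestricted = ssr(np.hstack([ones, X]), y)          # all three regressors
ssr_restricted = ssr(np.hstack([ones, X[:, [2]]]), y)    # first two dropped

q = 2   # number of restrictions being tested
k = 3   # number of slope coefficients in the unrestricted model

# Homoskedasticity-only F-statistic
F = ((ssr_restricted - ssr_unrestricted) / q) / (ssr_unrestricted / (n - k - 1))

print(F)
```

A large value of `F` relative to the critical value of the F distribution with `q` and `n - k - 1` degrees of freedom is evidence against the joint restriction.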
For each of the following regressions interpret the results. Specifically, answer the
question “what is the effect of X on Y?” Pay attention to the standard errors as
i. $$\ln(\text{wage}_i) = 0.6 + 0.07\,\text{education}_i$$
where wage is measured in dollars per hour and education is measured in years
(from 6 years to 22 years).
ii. $$\text{beef consumption}_i = 30 + 100 \ln(\text{income}_i)$$
where the dependent variable is the number of pounds of beef consumed by
the individual in a year and income is measured in dollars.
iii. $$\ln(\text{output}_i) = 30 + 0.8 \ln(\text{labor}_i)$$
where $\text{output}_i$ is the average units of a product produced by plant $i$ in a month
and $\text{labor}_i$ is the average units of labor used by the plant in a month.
iv. $$\ln(\text{wage}_i) = 0.7 + 0.08\,\text{experience}_i - 0.001\,\text{experience}_i^2$$
where wage is measured in dollars per hour and experience is measured in years
(from 0 years to 47 years).
v. For a sample of newly engaged couples
$$\text{carats}_i = 0.1 + 0.05\,\text{income}_i + 0.02\,\text{income}_i^2$$
where $\text{carats}_i$ represents the size of the diamond in the engagement ring (in
carats) and $\text{income}_i$ represents the combined monthly income in thousands of
dollars of the couple.
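The arithmetic behind these functional forms can be sketched with a few small Python helpers. The function names are hypothetical, and the calls at the bottom simply plug in coefficients from the regressions above as examples; they are not a substitute for a written interpretation, which should also state units and the range of the data.

```python
import math

def exact_pct_change_log_level(beta, dx=1.0):
    """Exact percent change in Y when X rises by dx in ln(Y) = a + beta * X."""
    return 100.0 * (math.exp(beta * dx) - 1.0)

def level_change_level_log(beta, pct):
    """Change in Y when X rises by pct percent in Y = a + beta * ln(X)."""
    return beta * math.log(1.0 + pct / 100.0)

def quadratic_marginal_effect(b1, b2, x):
    """Derivative of a + b1 * x + b2 * x**2 with respect to x, evaluated at x."""
    return b1 + 2.0 * b2 * x

def quadratic_turning_point(b1, b2):
    """Value of x at which the marginal effect of a quadratic is zero."""
    return -b1 / (2.0 * b2)

# Log-level: one more year of education, coefficient 0.07
print(exact_pct_change_log_level(0.07))          # roughly 7.25 percent

# Level-log: a 1 percent rise in income, coefficient 100
print(level_change_level_log(100.0, 1.0))        # just under 1 pound

# Quadratic in experience: marginal effect at 10 years, and the turning point
print(quadratic_marginal_effect(0.08, -0.001, 10.0))
print(quadratic_turning_point(0.08, -0.001))
```

For small coefficients the exact log-level figure is close to the usual approximation of 100 times the coefficient. With a quadratic term the effect of X depends on the level of X, so it has to be reported at specific values.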
Use the dataset in the file IPOD3.dta to complete this exercise. This data contains
information on 215 transactions on eBay where an iPod was sold. The variable
names should be self-explanatory. PRICE is measured in dollars. NEW and SCRATCH
are dummies equal to 1 if the iPod was new and scratched, respectively, and equal
to 0 otherwise. BIDRS represents the number of bidders for the iPod in the auction
and PERCENT is the quality of the seller, measured as the percent of past customers
giving the seller a positive rating. Answer the following questions. Also, please hand
in a .do file that lists every command you carried out.
i. New iPods in the sample sold for $62 more, on average, than used iPods. Is
this in part, or entirely, because new iPods are less likely to be scratched? Use
two separate regressions to answer this question and be careful to explain your
ii. According to most economic models of auctions, an increase in the number of
bidders should cause an increase in the final price as it increases competition
for the good. Does this data support this implication of economic theory? To
answer this question be sure to consider the following:
a. omitted variables