Econ 2123. Introduction to Econometrics

Sections 80/83. Prof. Williams

Homework 7

due Tuesday March 24

You can work together with other students in the class to answer each of the following

problems but you must write your assignments up separately.

Q1.

Suppose the number of new home construction projects in the US started in month i,

Yi , is regressed on the unemployment rate in the US in period i, X1i , and 11 dummy

variables where X2i is equal to 1 if period i is January of some year, X3i is equal to

1 if period i is February and so on. The data consists of 120 monthly observations

over ten years. Use the FWL theorem to describe a simple regression that would

provide the same estimate for β1 .
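
As a reminder of what the FWL (Frisch-Waugh-Lovell) theorem says, the following is a minimal sketch on simulated data, not the data described in Q1. It assumes a generic model with one regressor of interest (`x1`) and a single control (`x2`) plus a constant; all variable names, coefficients, and the helper functions `solve`, `ols`, and `residuals` are hypothetical and written only for this illustration. The sketch checks that the coefficient on `x1` from the full regression matches the coefficient from regressing residualized `y` on residualized `x1`.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def ols(y, columns):
    """OLS coefficients of y on the given regressor columns (no constant is added)."""
    k = len(columns)
    xtx = [[sum(u * v for u, v in zip(columns[i], columns[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(u * v for u, v in zip(columns[i], y)) for i in range(k)]
    return solve(xtx, xty)


def residuals(y, columns):
    """Residuals from regressing y on the given columns."""
    beta = ols(y, columns)
    return [yi - sum(b * col[i] for b, col in zip(beta, columns))
            for i, yi in enumerate(y)]


# Simulated data (assumption: an arbitrary data-generating process for illustration).
random.seed(0)
n = 200
const = [1.0] * n
x2 = [random.gauss(0, 1) for _ in range(n)]
x1 = [0.5 * z + random.gauss(0, 1) for z in x2]  # x1 is correlated with x2
y = [1.0 + 2.0 * a - 1.0 * b + random.gauss(0, 1) for a, b in zip(x1, x2)]

# Full regression: y on a constant, x1, and x2; keep the coefficient on x1.
beta_full = ols(y, [const, x1, x2])[1]

# FWL: partial the controls (constant and x2) out of both y and x1,
# then regress the residuals of y on the residuals of x1.
controls = [const, x2]
beta_fwl = ols(residuals(y, controls), [residuals(x1, controls)])[0]

print(beta_full, beta_fwl)  # the two estimates agree up to rounding error
```

The two-step regression needs no constant because both residual series already have mean zero after partialling out the constant.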

Q2.

(i) Suppose that using data on sales of a company’s product in n different markets

the following regression was estimated

$$\hat{S}_i = 3080 - \underset{(25{,}000)}{75{,}000}\,P_i + \underset{(1.06)}{4.23}\,A_i - \underset{(0.51)}{1.04}\,B_i$$

where, as usual, the numbers in parentheses are the robust standard errors. The

dependent variable, Si, is the company’s sales and the regressors are price (Pi),

advertising expenditures (Ai), and advertising expenditures of the company’s

competitor (Bi). The company’s marketing department worried that this

regression suffered from imperfect multicollinearity because Ai and Bi have a

correlation coefficient of 0.97. To “solve” this problem Bi was dropped and the

following result was used:

$$\hat{S}_i = 2586 - \underset{(24{,}000)}{78{,}000}\,P_i + \underset{(4.32)}{0.52}\,A_i$$

a. Using this second regression, what would the company conclude about the

effect of advertising expenditures on sales?

b. If the company had used the first regression, what would they conclude

about the effect of advertising expenditures on sales?

c. Does the second regression suffer from omitted variable bias due to

the omission of Bi? If so, is the bias positive or negative?

d. What can you conclude about how the company handled the “problem”

of multicollinearity?

(ii) Consider a dataset consisting of several variables for each of a sample of 1000

bills that were voted on in Congress over the past ten years. Let Yi denote the

percentage of the members of Congress that voted in favor of the ith bill in the

final vote. Then let X1i, X2i, X3i denote the percentage of poor, middle class,

and wealthy Americans who supported the bill according to public opinion

polling. Using this data the following regression results were obtained:

$$\hat{Y}_i = 0.1 - \underset{(0.18)}{0.2}\,X_{1i} + \underset{(0.22)}{0.3}\,X_{2i} + \underset{(0.25)}{0.6}\,X_{3i}$$

a. According to these regression results, what would be the effect of an

increase of 10 percentage points in the support of wealthy Americans for a

bill, holding fixed the level of support of the rest of the public? Is this

effect statistically significant?

b. Should one conclude from the regression results that only the support of

wealthy Americans influences the votes of members of Congress? (Hint: The

answer is no. But you should explain why.)

c. What statistical test should be done to determine if there is enough

statistical evidence to conclude that only the support of wealthy Americans

influences the votes of members of Congress? Give the null hypothesis that

you would test, stated as an equation, or equations, using the parameters

of the model. Also give the name of the test you would do to test this

null hypothesis and the command you would use in Stata.
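
Several parts of Q2 ask whether a single estimated effect is statistically significant. As a refresher, the sketch below computes a t-statistic and an approximate 95% confidence interval from a coefficient and its standard error. The function names are hypothetical, the numbers are made up rather than taken from the regressions above, and the 1.96 critical value assumes a large-sample normal approximation with a two-sided 5% test. It covers a hypothesis about one coefficient only, not a joint hypothesis about several coefficients.

```python
def t_statistic(coef, se, null_value=0.0):
    """t-statistic for H0: coefficient equals null_value."""
    return (coef - null_value) / se


def confidence_interval(coef, se, critical=1.96):
    """Approximate 95% confidence interval (large-sample normal critical value)."""
    return (coef - critical * se, coef + critical * se)


def is_significant(coef, se, critical=1.96):
    """True if H0: coefficient = 0 is rejected at the two-sided 5% level."""
    return abs(t_statistic(coef, se)) > critical


# Hypothetical estimate and robust standard error (not from the problem set).
coef, se = 0.50, 0.20
t = t_statistic(coef, se)
low, high = confidence_interval(coef, se)
print(t, (low, high), is_significant(coef, se))
```

An estimate is significant at the 5% level exactly when its 95% confidence interval excludes zero, so the two checks always agree.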

Q3.

For each of the following regressions interpret the results. Specifically, answer the

question “what is the effect of X on Y ?” Pay attention to the standard errors as

well.

i.

$$\ln(\text{wage})_i = 0.6 + \underset{(0.01)}{0.07}\,\text{education}_i$$

where wage is measured in dollars per hour and education is measured in years

(from 6 years to 22 years).

ii.

$$\text{beefConsumption}_i = 30 + \underset{(0.01)}{100}\,\ln(\text{income}_i)$$

where the dependent variable is the number of pounds of beef consumed by

the individual in a year and income is measured in dollars.

iii.

$$\ln(\text{output})_i = 30 + \underset{(0.15)}{0.8}\,\ln(\text{labor}_i)$$

where outputi is the average units of a product produced by plant i in a month

and labori is the average units of labor used by the plant in a month.

iv.

$$\ln(\text{wage})_i = 0.7 + \underset{(0.01)}{0.08}\,\text{experience}_i - \underset{(0.0003)}{0.001}\,\text{experience}_i^2$$

where wage is measured in dollars per hour and experience is measured in years

(from 0 years to 47 years).

v. For a sample of newly engaged couples

$$\text{carats}_i = 0.1 + \underset{(0.1)}{0.05}\,\text{income}_i + \underset{(0.001)}{0.02}\,\text{income}_i^2$$

where caratsi represents the size of the diamond in the engagement ring (in

carats) and incomei represents the combined monthly income in thousands of

dollars of the couple.
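
The functional forms in Q3 each change how a slope coefficient is read. The sketch below collects the standard calculations for log-linear, linear-log, log-log, and quadratic specifications. The function names and the coefficient values in the usage lines are hypothetical, chosen for illustration and not taken from the regressions above. It is a reminder of the arithmetic only; a full interpretation should also account for the standard errors.

```python
import math


def log_linear_pct_change(b1, dx=1.0):
    """ln(Y) = b0 + b1*X: exact percent change in Y when X rises by dx units."""
    return 100.0 * (math.exp(b1 * dx) - 1.0)


def linear_log_change(b1, pct=1.0):
    """Y = b0 + b1*ln(X): change in Y (in units of Y) when X rises by pct percent."""
    return b1 * math.log(1.0 + pct / 100.0)


def log_log_pct_change(b1, pct=1.0):
    """ln(Y) = b0 + b1*ln(X): exact percent change in Y when X rises by pct percent."""
    return 100.0 * ((1.0 + pct / 100.0) ** b1 - 1.0)


def quadratic_marginal_effect(b1, b2, x):
    """Y = b0 + b1*X + b2*X^2: derivative of Y with respect to X, evaluated at x."""
    return b1 + 2.0 * b2 * x


# Hypothetical coefficients (assumptions for illustration only).
print(log_linear_pct_change(0.05))                 # about 5.13 percent
print(linear_log_change(200.0))                    # about 1.99 units of Y
print(log_log_pct_change(0.5))                     # about 0.50 percent
print(quadratic_marginal_effect(0.1, -0.002, 10))  # effect depends on the level of X
```

For small coefficients the log-linear result is close to 100 times the coefficient, which is the usual approximation. In the quadratic case the marginal effect has to be evaluated at a chosen value of X.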

Q4.

Use the dataset in the file IPOD3.dta to complete this exercise. This data contains

information on 215 transactions on eBay where an iPod was sold. The variable

names should be self-explanatory. PRICE is measured in dollars. NEW and SCRATCH

are dummies equal to 1 if the iPod was new and scratched, respectively, and equal

to 0 otherwise. BIDRS represents the number of bidders for the iPod in the auction

and PERCENT is the quality of the seller, measured as the percent of past customers

giving the seller a positive rating. Answer the following questions. Also, please hand

in a .do file that lists every command you carried out.

i. New iPods in the sample sold for $62 more, on average, than used iPods. Is

this in part, or entirely, because new iPods are less likely to be scratched? Use

two separate regressions to answer this question and be careful to explain your

results.

ii. According to most economic models of auctions, an increase in the number of

bidders should cause an increase in the final price as it increases competition

for the good. Does this data support this implication of economic theory? To

answer this question be sure to consider the following:

a. omitted variables

b. nonlinearities

c. multicollinearity