Q1: In assessing the predictive power of categorical predictors of a binary outcome,
should logistic regression be used?
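One way to explore Q1 is to fit a logistic regression with a factor predictor and test the factor as a whole. The following is only a minimal sketch on simulated data; the names `grade`, `y`, and `demo`, and the probabilities used to generate the outcome, are hypothetical and not taken from any dataset in this assignment.

```r
# Simulated example: one three-level categorical predictor, one binary outcome.
set.seed(1)
n <- 300
grade <- factor(sample(c("low", "mid", "high"), n, replace = TRUE),
                levels = c("low", "mid", "high"))

# Assumed (made-up) event probabilities per level, used only to simulate y.
p <- c(low = 0.2, mid = 0.4, high = 0.7)[as.character(grade)]
y <- rbinom(n, size = 1, prob = p)
demo <- data.frame(y = y, grade = grade)

# glm() dummy-codes the factor automatically; "low" is the reference level.
fit_cat <- glm(y ~ grade, data = demo, family = binomial)

# Likelihood-ratio (deviance) test of the factor as a whole.
lrt <- anova(fit_cat, test = "Chisq")
print(lrt)

# Odds ratios of each level relative to the reference level.
print(exp(coef(fit_cat)))
```

The deviance test asks whether the categorical predictor improves on an intercept-only model, which is one reasonable reading of "predictive power" here.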
Q2: Objective: Using Logistic Regression to handle a binary outcome.
Given the prostate cancer dataset, in which biopsy results are given for 97 men:
• You are to predict tumor spread in this dataset of 97 men who had undergone a biopsy.
• The measures to be used for prediction are: age, lbph, lcp, gleason, and lpsa. This implies that the binary dependent variable of lcavol will be the outcome variable.
We start by installing and loading the required R packages (ROCR, ggplot2, and aod) as follows:
> install.packages("ROCR")
> install.packages("ggplot2")
> install.packages("aod")
> library(ROCR)
> library(ggplot2)
> library(aod)
Next, we load the CSV file and inspect the structure of the data as follows:
> setwd("C:/RData") # your working directory
> tumor <- read.csv("prostate.csv") # loading the file
> str(tumor) # check the properties of the file
. . . continue from here!
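A possible continuation can be sketched as follows. This is not the assignment's official solution. To stay self-contained it uses a synthetic stand-in data frame, `tumor_demo`, with the column names listed above and made-up values. It also assumes the outcome lcavol is already coded 0/1; if lcavol is continuous in your copy of prostate.csv, it would need to be dichotomized first, and the cutoff is a choice the analyst must justify.

```r
# Synthetic stand-in for the prostate data (NOT the real values).
# Column names follow the predictors named in the assignment.
set.seed(42)
n <- 97
tumor_demo <- data.frame(
  age     = round(rnorm(n, mean = 64, sd = 7)),
  lbph    = rnorm(n),
  lcp     = rnorm(n),
  gleason = sample(6:9, n, replace = TRUE),
  lpsa    = rnorm(n, mean = 2.5, sd = 1)
)

# Assumption: a 0/1 outcome. Here it is simulated from lpsa and lcp.
tumor_demo$lcavol <- rbinom(
  n, size = 1,
  prob = plogis(-2.5 + 1.0 * tumor_demo$lpsa + 0.5 * tumor_demo$lcp)
)

# Logistic regression: binomial family with the logit link.
fit <- glm(lcavol ~ age + lbph + lcp + gleason + lpsa,
           data = tumor_demo, family = binomial(link = "logit"))
print(summary(fit))

# Predicted probabilities on the training data.
prob <- predict(fit, type = "response")

# Area under the ROC curve, computed only if ROCR is installed.
if (requireNamespace("ROCR", quietly = TRUE)) {
  pred <- ROCR::prediction(prob, tumor_demo$lcavol)
  auc  <- ROCR::performance(pred, measure = "auc")@y.values[[1]]
  print(auc)
}
```

With the real data, `tumor_demo` would be replaced by the `tumor` data frame loaded above. The AUC computed this way is measured on the same data used for fitting, so it is likely to be optimistic.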
Reference
R Documentation (2016). Prostate cancer data. Retrieved from
http://rafalab.github.io/pages/649/prostate.html