alcoholSurv {alcoholSurv} | R Documentation |
Light Alcohol Consumption and Survival
Description
Data from 6 NHANES surveys with follow-up for mortality, as a matched comparison of: (i) light daily alcohol, (ii) rare alcohol, and (iii) no alcohol.
Usage
data("alcoholSurv")
Format
A data frame with 5650 observations on the following 21 variables.
SEQN
NHANES ID Number
nh
Identifies the NHANES years. The data are from six NHANES, 2005-2006 to 2015-2016.
female
1=female, 0=male
age
Age in years at the time of the NHANES survey.
education
Level of education in five categories. 1 is less than 9th grade, 3 is high school, 5 is at least a BA degree.
hdl
HDL cholesterol in mg/dL
bmi
BMI or body-mass index
GH
Glycohemoglobin as a percent
smoke
Smoking status at the NHANES interview. An ordered factor with levels Everyday < Somedays < NotNow < Never
z
Treatment indicator, 1 if consumes light daily alcohol, 0 if control.
gDrinks
Daily means: consumes light daily alcohol, between 1 and 3 drinks on at least 260=5x52 days each year. Rare means rarely consumes alcohol. None means consumed no alcohol in the past year. An ordered factor with levels Daily < Rare < None
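The Daily and None definitions above can be sketched as simple checks on aDays and aDrinks. This is only an illustration on a toy data frame with invented values, not the package data. The helper name is_light_daily is hypothetical, and reading "between 1 and 3" as inclusive is an assumption. The cutoff that defines Rare is not stated here, so it is not encoded.

```r
# Toy rows with invented values; column names follow the documented variables.
toy <- data.frame(
  aDays   = c(365, 260, 12, 0, 300),
  aDrinks = c(1, 3, 2, 0, 5)
)

# Hypothetical helper: TRUE when a row meets the documented "Daily" rule,
# i.e. 1 to 3 drinks (assumed inclusive) on at least 260 = 5 x 52 days a year.
is_light_daily <- function(aDays, aDrinks) {
  aDays >= 5 * 52 & aDrinks >= 1 & aDrinks <= 3
}

toy$lightDaily <- is_light_daily(toy$aDays, toy$aDrinks)

# Documented "None" rule: no alcohol consumed in the past year.
toy$noAlcohol <- toy$aDays == 0

print(toy)
```

The last toy row drinks on 300 days but averages 5 drinks, so it fails the light-drinking rule even though it passes the frequency rule.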
aDays
Days consumed alcohol in the past year.
aDrinks
Typical number of alcoholic drinks on drinking days.
a12life
1=consumed at least 12 alcoholic drinks in life, 0=other. Based on NHANES question ALQ110.
aEverBinge
Was there ever a time in your life when you drank 5 or more drinks almost every day? 1=yes, 0=no. The wording of this question changed slightly from one NHANES to another, sometimes asking about 4 drinks for a woman rather than 5. See the NHANES documentation for details.
time
Time to death or censoring in months from the date of the NHANES examination. Taken from the public-use linked mortality file, which is based on the National Death Index.
mortstat
Death/censoring indicator, 1=dead, 0=censored.
cod
Cause of death on the death certificate. See codf.
codf
Cause of death as a factor. An ordered factor with levels Alive < Heart < Cancer < ChronicLung < Accident < Cerebrovascular < Alzheimer < Diabetes < FluPneumonia < Kidney < Other
mset
Matched set indicator, 1 to 1130.
treated
The SEQN for the treated individual in a matched set. Same information as mset, but in a different format.
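To illustrate how mset and treated can carry the same information, the sketch below builds a treated column from SEQN, z and mset and then checks that the two variables define the same grouping. It uses a toy data frame with invented ID numbers, not the package data, and the object names treated_seqn and same_partition are hypothetical.

```r
# Toy data with invented IDs: two matched sets of size 5, one treated row each.
toy <- data.frame(
  SEQN = c(101, 102, 103, 104, 105, 201, 202, 203, 204, 205),
  z    = c(1, 0, 0, 0, 0, 0, 0, 1, 0, 0),
  mset = rep(1:2, each = 5)
)

# SEQN of the treated (z == 1) individual in each matched set.
treated_seqn <- toy$SEQN[toy$z == 1]
treated_mset <- toy$mset[toy$z == 1]

# Give every row the SEQN of the treated individual in its own matched set.
toy$treated <- treated_seqn[match(toy$mset, treated_mset)]

# mset and treated define the same partition when their cross-table has
# exactly one nonzero cell in every row and every column.
tab <- table(toy$mset, toy$treated)
same_partition <- all(rowSums(tab > 0) == 1) && all(colSums(tab > 0) == 1)

print(toy)
print(same_partition)
```

Under this reading, either variable could serve as the stratification variable in a stratified analysis, since both label the same matched sets.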
Details
This is a matched data set with 1130 blocks of size 5. Each block contains one treated individual (light daily alcohol), two controls who rarely consume alcohol, and two controls who consumed no alcohol in the past year. See the description of the gDrinks variable above. For details, see Rosenbaum (2025). The examples below replicate analyses from Rosenbaum (2025).
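The block structure described above can be checked with a cross-tabulation of mset by gDrinks. The sketch below is a hedged illustration on a toy data frame with three invented blocks, not the package data; the names comp, expected and blocks_ok are hypothetical.

```r
# Toy data: three invented blocks, each with 1 Daily, 2 Rare and 2 None.
toy <- data.frame(
  mset    = rep(1:3, each = 5),
  gDrinks = factor(rep(c("Daily", "Rare", "Rare", "None", "None"), times = 3),
                   levels = c("Daily", "Rare", "None"), ordered = TRUE)
)

# Composition of each block: rows are blocks, columns are alcohol groups.
comp <- table(toy$mset, toy$gDrinks)

# Expected composition of every block, in the order of the factor levels.
expected <- c(Daily = 1, Rare = 2, None = 2)

# TRUE only if every block has exactly the expected composition.
blocks_ok <- all(apply(comp, 1, function(row) all(row == expected)))

print(comp)
print(blocks_ok)

# Distribution of block sizes: here, three blocks of size 5.
print(table(table(toy$mset)))
```

Applied to the real data, the same pattern would be expected to report 1130 blocks of size 5, in line with the description above.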
The mortality data are from the public-use linked mortality files. NHANES also has a restricted-use version of the mortality files; it is not used here. The public-use file masks identity in various ways; see its web page referenced below.
Source
Data are from the NHANES webpage www.cdc.gov/nchs/nhanes/index.htm.
Also, 2019 Public-Use Linked Mortality Files are from www.cdc.gov/nchs/data-linkage/mortality-public.htm
References
US National Health and Nutrition Examination Survey. www.cdc.gov/nchs/nhanes/index.htm
Public-Use Linked Mortality Files. www.cdc.gov/nchs/data-linkage/mortality-public.htm
Rosenbaum, P. R. (2025) <doi:10.1080/09332480.2025.2473291> Does a Daily Glass of Wine Lengthen Life? Insight from a Second Control Group. Chance, 38 (1), 25-30.
Examples
#
# The example replicates results from Rosenbaum (2025)
#
oldpar <- par(no.readonly = TRUE)
data(alcoholSurv)
# Three treatment groups
table(alcoholSurv$gDrinks)
# In 1130 matched blocks of size 5
table(table(alcoholSurv$mset))
attach(alcoholSurv)
# Alcohol groups
table(gDrinks,aDays>0)
table(gDrinks,z)
table(gDrinks,aDrinks)
table(gDrinks,a12life)
table(gDrinks,aDays>24)
table(gDrinks,aDays>0)
# Alcohol groups are matched for covariates
tapply(age,gDrinks,mean)
tapply(female,gDrinks,mean)
tapply(aEverBinge,gDrinks,mean)
tapply(education,gDrinks,mean)
prop.table(table(smoke,gDrinks),2)
library(survival)
par(bg="moccasin")
# Make Figure 1
par(mfrow=c(1,3))
boxplot(age~gDrinks,las=1,cex.lab=1,cex.axis=1,xlab="Age",
ylab="Age in Years")
axis(3,at=1:3,labels=round(tapply(age,gDrinks,mean)),cex.axis=1)
boxplot(education~gDrinks,las=1,cex.lab=1,cex.axis=1,xlab="Education",
ylab="Education")
axis(3,at=1:3,labels=round(tapply(education,gDrinks,mean),2),cex.axis=1)
boxplot((aDays*aDrinks)~gDrinks,las=1,cex.lab=1,cex.axis=1,
xlab="Alcoholic Drinks", ylab="Drinks Per Year")
axis(3,at=1:3,labels=round(tapply((aDays*aDrinks),gDrinks,mean)),cex.axis=1)
# Make Table 1
Female<-tapply(female,gDrinks,mean)*100
Age<-tapply(age,gDrinks,mean)
Education<-tapply(education,gDrinks,mean)
EverBinged<-tapply(aEverBinge,gDrinks,mean)*100
NeverSmoked<-tapply(smoke=="Never",gDrinks,mean)*100
NoLongerSmoke<-tapply(smoke=="NotNow",gDrinks,mean)*100
SmokeSomeDays<-tapply(smoke=="Somedays",gDrinks,mean)*100
SmokeEveryDay<-tapply(smoke=="Everyday",gDrinks,mean)*100
tabBal<-rbind(Female,Age,Education,EverBinged,NeverSmoked,NoLongerSmoke,SmokeSomeDays,
SmokeEveryDay)
rm(Female,Age,Education,EverBinged,NeverSmoked,NoLongerSmoke,SmokeSomeDays,
SmokeEveryDay)
tabBal2<-rbind(tabBal,prop.table(table(nh,gDrinks),2)*100)
# Make Figure 2
par(mfrow=c(1,2))
xlim<-c(0,150) # Restrict plots to first 150 months, after which data are thin
coln<-c("blue","red","black")
st<-Surv(time,mortstat)
plot(survfit(st~(gDrinks=="Daily")),col=c("darkgreen","blue"),lty=c(4,1),lwd=2,ylim=c(.5,1),las=1,
ylab="Probability of Survival",xlab="Months",cex.axis=.9,cex.lab=.9,
main="(i) All, I=1130, J=5", cex.main=.8,xlim=xlim)
legend(0.5,.63,c("Daily","Control"),col=c("blue","darkgreen"),lty=c(1,4),lwd=rep(2,2),cex=.8)
plot(survfit(st~gDrinks),col=coln,lty=1:3,lwd=2,ylim=c(.5,1),las=1,
ylab="Probability of Survival",xlab="Months",cex.axis=.9,cex.lab=.9,
main="(ii) All, I=1130, J=5", cex.main=.8,xlim=xlim)
legend(0.5,.66,levels(gDrinks),col=coln,lty=1:3,lwd=rep(2,3),cex=.8)
# Make Figure 3
who<-smoke=="Never"
plot(survfit(st[who]~gDrinks[who]),col=coln,lty=1:3,lwd=2,ylim=c(.5,1),las=1,
ylab="Probability of Survival",xlab="Months",cex.axis=.9,cex.lab=.9,xlim=xlim,
main=paste("Never Smoker, I=",sum(z[who]),", J=5",sep=""),cex.main=.8)
legend(0.5,.66,levels(gDrinks),col=coln,lty=1:3,lwd=rep(2,3),cex=.8)
who<-(aEverBinge==0)
plot(survfit(st[who]~gDrinks[who]),col=coln,lty=1:3,lwd=2,ylim=c(.5,1),las=1,
ylab="Probability of Survival",xlab="Months",cex.axis=.9,xlim=xlim,cex.lab=.9,
main=paste("Never a Binge Drinker, I=",sum(z[who]),", J=5",sep=""),cex.main=.8)
legend(0.5,.66,levels(gDrinks),col=coln,lty=1:3,lwd=rep(2,3),cex=.8)
rm(who)
# Do formal analyses in footnote 1 using Cox's stratified proportional
# hazards model
coxph(st~z+strata(treated))
confint(coxph(st~z+strata(treated)))
exp(confint(coxph(st~z+strata(treated))))
noDrinks<-1*(gDrinks=="None")
coxph(st~z+noDrinks+strata(treated))
confint(coxph(st~z+noDrinks+strata(treated)))
exp(confint(coxph(st~z+noDrinks+strata(treated))))
rm(coln,xlim,noDrinks)
detach(alcoholSurv)
par(oldpar)