Take a moment to explore the data and the variables in the Honolulu dataset. The Honolulu Heart Study was a prospective cohort study of coronary heart disease and stroke among men of Japanese descent in Hawaii who were born between 1900 and 1919 and were living on the island of Oahu in 1965. You are analyzing a random subset of the men’s characteristics at baseline (i.e., at the initial interview in 1969).

In the codebook table below, identify the level of measurement (nominal, ordinal, or scale) for each of the original variables in the data set. (9 pts)

| Variable Label | Variable Name | Values or Units of Measurement | Variable Type |
|---|---|---|---|
| Highest Level of Education Completed | EDUC | 1 = none; 2 = primary school; 3 = intermediate school; 4 = high school; 5 = technical school beyond high school; 6 = university | |
| Age | AGE | in years | |
| Age_cat | Age_cat | 1 = 45-50; 2 = 51-55; 3 = 56-60; 4 = 61 or older | |
| Smoking Status | SMOKE | 0 = no; 1 = yes | |
| Physical Activity Status | PHYS_ACT | 1 = mostly sitting (sedentary); 2 = moderate; 3 = vigorous | |
| Blood Glucose | BLD_GLC | in mg/dL | |
| Cholesterol | CHOLEST | in mg/dL | |
| Cholesterol_Cat | Cholest_Cat | 1 = Normal (less than 200); 2 = Borderline (200-239); 3 = High (240 or above). *NOTE: These values don’t have to be entered into SPSS* | |
| Body Mass Index | BMI | in weight (kg) / height² (m²) | |

1.    A researcher thinks that mean blood glucose will be higher in those who smoke vs. those who don’t smoke.

a.    Please write out the null and research hypothesis.

b.    Is the research hypothesis in this situation one-tailed or two-tailed?  Explain.

c.     Please produce an appropriate visual aid illustrating this scenario.

d.    Please provide a summary table for this scenario that gives the following statistics: N, MEAN, MEDIAN, STD DEVIATION, VARIANCE, MIN, and MAX.
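The grouped summary table in part d can be sketched outside SPSS as follows. This is a minimal illustration using only the Python standard library. The records are made up for the example and are not values from the Honolulu dataset, and the helper name `summarize_by_group` is hypothetical.

```python
import statistics

# Made-up illustrative records (NOT values from the Honolulu dataset).
# SMOKE: 0 = no, 1 = yes; BLD_GLC in mg/dL.
records = [
    {"SMOKE": 0, "BLD_GLC": 100},
    {"SMOKE": 0, "BLD_GLC": 120},
    {"SMOKE": 0, "BLD_GLC": 140},
    {"SMOKE": 1, "BLD_GLC": 110},
    {"SMOKE": 1, "BLD_GLC": 150},
    {"SMOKE": 1, "BLD_GLC": 160},
    {"SMOKE": 1, "BLD_GLC": 180},
]


def summarize_by_group(rows, group_var, value_var):
    """Return {group level: {statistic: value}} for value_var within each level of group_var."""
    groups = {}
    for row in rows:
        groups.setdefault(row[group_var], []).append(row[value_var])

    summary = {}
    for level, values in sorted(groups.items()):
        summary[level] = {
            "N": len(values),
            "MEAN": statistics.mean(values),
            "MEDIAN": statistics.median(values),
            # Sample standard deviation and variance (n - 1 in the denominator);
            # both need at least two observations per group.
            "STD DEVIATION": statistics.stdev(values),
            "VARIANCE": statistics.variance(values),
            "MIN": min(values),
            "MAX": max(values),
        }
    return summary


table = summarize_by_group(records, "SMOKE", "BLD_GLC")
for level, stats in table.items():
    print(f"SMOKE = {level}: {stats}")
```

The same helper could be reused for any scale variable split by any categorical variable by changing the two variable names passed in.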

2.    Please produce individual summary tables displaying the measures of central tendency, dispersion, and distributional shape for the following variables:

a.    BMI

i.     What is the most appropriate measure of central tendency? Why?

b.    PHYS_ACT

i.     What is the most appropriate measure of central tendency? Why?

c.     AGE

i.     What is the most appropriate measure of central tendency? Why?
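As a rough companion to question 2, the sketch below computes the three usual measures of central tendency and a few measures of dispersion with the Python standard library. The values are made up for illustration and are not from the Honolulu dataset, and the function names are hypothetical. The code computes every measure for every variable; deciding which one is meaningful for a given level of measurement is still up to you.

```python
import statistics

# Made-up illustrative values (NOT from the Honolulu dataset).
bmi = [21.0, 22.5, 23.0, 24.0, 25.5, 31.0]  # scale variable
phys_act = [1, 2, 2, 2, 3, 1, 2]            # coded 1 = sedentary, 2 = moderate, 3 = vigorous


def central_tendency(values):
    """Mean, median, and mode(s). multimode returns every value tied for most frequent."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.multimode(values),
    }


def dispersion(values):
    """Range plus sample standard deviation and variance (n - 1 denominator)."""
    return {
        "range": max(values) - min(values),
        "std_dev": statistics.stdev(values),
        "variance": statistics.variance(values),
    }


for name, values in (("BMI", bmi), ("PHYS_ACT", phys_act)):
    print(name, central_tendency(values), dispersion(values))
```

Note that the mean of `phys_act` is an average of category codes, so it is printed here only to show that software will compute it whether or not it is interpretable.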

3.    A researcher thinks that mean cholesterol level will be different between those who smoke and those who don’t smoke.

a.    Please identify the independent and dependent variable.

i.     How is each measured?

b.    Please write out the null and research hypothesis.

4.    Please assess cholesterol for normality.
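One numeric screen that often accompanies a histogram or Q-Q plot when assessing normality is to compare skewness and kurtosis with their standard errors. The sketch below uses the bias-adjusted sample formulas that statistics packages commonly report; treating its output as identical to any particular package's output is an assumption. The values are made up and are not from the Honolulu dataset, and `shape_statistics` is a hypothetical helper name.

```python
import math


def shape_statistics(values):
    """Bias-adjusted sample skewness and excess kurtosis with standard errors (needs n >= 4)."""
    n = len(values)
    if n < 4:
        raise ValueError("at least 4 observations are required")
    mean = sum(values) / n
    # Central moments with n in the denominator.
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    g1 = m3 / m2 ** 1.5        # unadjusted skewness
    g2 = m4 / m2 ** 2 - 3      # unadjusted excess kurtosis
    skewness = math.sqrt(n * (n - 1)) / (n - 2) * g1
    kurtosis = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)
    se_skewness = math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    se_kurtosis = 2 * se_skewness * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))
    return {
        "skewness": skewness,
        "se_skewness": se_skewness,
        "kurtosis": kurtosis,
        "se_kurtosis": se_kurtosis,
    }


# Made-up illustrative cholesterol values in mg/dL (NOT from the Honolulu dataset).
symmetric = [180, 190, 200, 210, 220]
right_skewed = [150, 160, 170, 180, 300]

print(shape_statistics(symmetric))
print(shape_statistics(right_skewed))
```

A common rule of thumb treats a statistic more than about twice its standard error away from zero as a sign of non-normality, but this is only a screen and is best read alongside the plots.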

5.    Please provide a summary table showing the mean, standard deviation, n, min and max for the variable BMI stratified by education.

6.    Please provide a visual showing the proportion of people in the study who are sedentary, moderately active, and vigorously active.
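The proportions behind such a visual can be sketched as below, with a plain-text bar chart standing in for the chart SPSS would draw. The value labels come from the codebook, but the coded values are made up and are not from the Honolulu dataset, and both function names are hypothetical.

```python
from collections import Counter

# Value labels taken from the codebook.
PHYS_ACT_LABELS = {1: "mostly sitting (sedentary)", 2: "moderate", 3: "vigorous"}

# Made-up illustrative codes (NOT values from the Honolulu dataset).
phys_act = [1, 2, 2, 3, 2, 1, 2, 3, 2, 2]


def category_proportions(codes, labels):
    """Return {label: proportion of all codes} for every labeled category, including empty ones."""
    counts = Counter(codes)
    total = len(codes)
    return {labels[code]: counts[code] / total for code in sorted(labels)}


def print_bar_chart(proportions, width=40):
    """Print one text bar per category; a proportion of 1.0 would span `width` characters."""
    for label, share in proportions.items():
        bar = "#" * round(share * width)
        print(f"{label:<28} {bar} {share:.0%}")


proportions = category_proportions(phys_act, PHYS_ACT_LABELS)
print_bar_chart(proportions)
```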

7.    Please show the mean differences in BMI across physical activity status groups paneled by smoking status.
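A paneled comparison like the one in question 7 amounts to computing one mean per combination of the two grouping variables. The sketch below does this and then expresses each mean as a difference from the sedentary group within the same smoking panel. The choice of the sedentary group as the reference is an assumption made for illustration, the records are made up and are not from the Honolulu dataset, and the function names are hypothetical.

```python
import statistics
from collections import defaultdict

# Made-up illustrative records (NOT values from the Honolulu dataset).
# SMOKE: 0 = no, 1 = yes; PHYS_ACT: 1 = sedentary, 2 = moderate, 3 = vigorous.
records = [
    {"SMOKE": 0, "PHYS_ACT": 1, "BMI": 26.0},
    {"SMOKE": 0, "PHYS_ACT": 1, "BMI": 24.0},
    {"SMOKE": 0, "PHYS_ACT": 2, "BMI": 23.0},
    {"SMOKE": 0, "PHYS_ACT": 2, "BMI": 25.0},
    {"SMOKE": 0, "PHYS_ACT": 3, "BMI": 22.0},
    {"SMOKE": 1, "PHYS_ACT": 1, "BMI": 27.0},
    {"SMOKE": 1, "PHYS_ACT": 1, "BMI": 25.0},
    {"SMOKE": 1, "PHYS_ACT": 2, "BMI": 24.0},
    {"SMOKE": 1, "PHYS_ACT": 3, "BMI": 21.0},
    {"SMOKE": 1, "PHYS_ACT": 3, "BMI": 23.0},
]


def panel_means(rows):
    """Return {(SMOKE, PHYS_ACT): mean BMI} for every combination present in rows."""
    cells = defaultdict(list)
    for row in rows:
        cells[(row["SMOKE"], row["PHYS_ACT"])].append(row["BMI"])
    return {cell: statistics.mean(values) for cell, values in sorted(cells.items())}


def differences_from_reference(means, reference_level=1):
    """Within each smoking panel, subtract the mean of the reference activity group."""
    diffs = {}
    for (smoke, activity), mean in means.items():
        reference = means.get((smoke, reference_level))
        if reference is not None:
            diffs[(smoke, activity)] = mean - reference
    return diffs


means = panel_means(records)
diffs = differences_from_reference(means)
for (smoke, activity), mean in means.items():
    print(f"SMOKE={smoke} PHYS_ACT={activity}: mean BMI {mean:.1f}, "
          f"difference from sedentary {diffs[(smoke, activity)]:+.1f}")
```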

8.    How many men are in this dataset?

