# Instructions for SAS Assignment 3 – CHIS data analysis

SAS Assignment 3: Using real data – 2018 California Health Interview Survey
Before you do anything, take a look at the data. Do a PROC CONTENTS. This will give you the names of the variables, and, in most cases, a descriptive label. (1 point)
Before you do anything else, READ THE CODEBOOK. You don’t need to read all 316 pages but you do need to search for any variables that interest you and read the explanation of each value. For example, the value of ‘9’ for education does not mean 9 years of schooling.
Now that you have some understanding of the data, select variables you would like to use to test a hypothesis. State your hypothesis and null hypothesis. (2 points) For example, my hypothesis is that Native Americans see the doctor more often because they can access the Indian Health Service. H0: There is no difference in the frequency with which Indigenous and non-Indigenous people visit the doctor.
Do descriptive statistics for your variables. (2 points) This will probably show you that some changes need to be made to the data.
Fix the data so that it is usable. (2 points) For example, only Native Americans or Alaskan Natives were asked the question on tribal enrollment. For everyone else, a negative value was entered. Also, 1 = Yes, enrolled in a tribe and 2 = No. So, I recoded every answer except a 1 to be 0. If there are no problems, state that the data did not need to be fixed because all of the subjects had a valid response.
Run the appropriate analysis. (2 points) Since I had a continuous variable and two categories, I could do a t-test. Since doctor visits is really only on a scale of 0-10, and there were at least 5 subjects in each cell, I could also do a chi-square. I did both.
State whether you reject or fail to reject your null hypothesis. (1 point)
The p-value for the t-test was .1024, which is greater than .05, so I would fail to reject the null hypothesis.
When I did the chi-square, I looked at the Mantel-Haenszel chi-square, which is the one that considers the order of the categories. It also had a p-value of .1024, so whichever way I looked at it, I failed to reject the null.
*** Assumes the libref IN already points to the folder with the CHIS data ;
PROC CONTENTS DATA=in.chis18_subset ;
RUN ;
PROC MEANS DATA=in.chis18_subset ;
RUN ;
*** Fixing variable to show if Indigenous or not ;
*** aa5c = 1 means enrolled in a tribe, so every other value becomes 0 ;
DATA analysis_set ;
  SET in.chis18_subset ;
  IF aa5c = 1 THEN enrolled = 1 ;
  ELSE enrolled = 0 ;
RUN ;
*** TTEST to compare on number of visits ;
PROC TTEST DATA=analysis_set ;
  CLASS enrolled ;
  VAR acmdnum ;
RUN ;
*** Could also do a chi-square. Look at the Mantel-Haenszel chi-square since the categories are ordinal ;
PROC FREQ DATA=analysis_set ;
  TABLES acmdnum*enrolled / CHISQ ;
RUN ;
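
The descriptive and data-fixing steps above can be checked with two frequency tables. This is only a sketch: it assumes the libref IN is already assigned and that analysis_set was created by the DATA step above. The first table shows every code that actually occurs, including the negative values entered for people who were not asked the question. The second crosses the original variable with the recoded one so you can confirm that only aa5c = 1 ends up as enrolled = 1.

```sas
/* Sketch only. Assumes libref IN is assigned and analysis_set exists. */

/* 1. Raw codes for each variable, including negative "not asked" values */
PROC FREQ DATA=in.chis18_subset ;
  TABLES aa5c acmdnum / MISSING ;
RUN ;

/* 2. Original variable by recoded variable, one row per combination */
PROC FREQ DATA=analysis_set ;
  TABLES aa5c*enrolled / LIST MISSING ;
RUN ;
```

If any row in the second table pairs a value other than 1 with enrolled = 1, the recode needs another look before you run the t-test or chi-square.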