**Assignment 1**

**DUE**: See the course syllabus.

**LATE PENALTIES:** A late penalty of 10% per day late (including weekends and holidays) will apply to written assignments.

**DESCRIPTION**

Assignment 1 requires the student to generate a small data set using SPSS and conduct a series of simple hypothesis tests in SPSS.

The instructions are as follows:

1) Open SPSS and generate 21 columns of data. Column 1 should be 20 1's and 20 2's, in order. The first 20 rows of columns 2-21 should be generated randomly from a Normal distribution with mean of 10 and SD of 5. Specify "number of samples" = 20 and "sample size" = 20. The second 20 rows should be generated in the same way but rather than using a mean of 10, use a mean of 11. You should now have 40 rows and 21 columns looking something like this (2 points):

2) State the appropriate null hypothesis for tests of group 1 against group 2. (1 point)

3) In SPSS, conduct t-tests of the difference between groups for each of the 20 columns of data. List all 20 p-values. (2 points)

4) For an alpha of .05 two-tailed, indicate the tests that lead to a rejection of the null. In how many cases was the p-value less than .05? (1 point)

5) For the same alpha, indicate the tests that did not lead to a rejection of the null. In how many cases was the p-value not less than .05? (1 point)

6) How many wrong decisions did you make? Explain (3 points).
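The assignment itself must be completed in SPSS, but the logic of steps 1 to 5 can be sketched in Python. The block below is a minimal illustration, not the SPSS procedure: the seed, the helper names (`t_pdf`, `two_tailed_p`, `pooled_t_test`) and the use of a pooled-variance t-test are assumptions made for the example. The pooled test should correspond to the "equal variances assumed" row of SPSS output, and the p-value is approximated by numerical integration of the t density.

```python
import math
import random
import statistics

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log(1 + x * x / df))

def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value: 1 minus the area between -|t| and |t| (Simpson's rule)."""
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = 2 * total * h / 3
    return max(0.0, 1.0 - central)

def pooled_t_test(x, y):
    """Independent-samples t-test assuming equal variances."""
    nx, ny = len(x), len(y)
    df = nx + ny - 2
    pooled_var = ((nx - 1) * statistics.variance(x)
                  + (ny - 1) * statistics.variance(y)) / df
    se = math.sqrt(pooled_var * (1 / nx + 1 / ny))
    t = (statistics.mean(x) - statistics.mean(y)) / se
    return t, df, two_tailed_p(t, df)

random.seed(1)  # arbitrary seed, chosen only so the sketch is repeatable
n_per_group, n_columns = 20, 20

# Column 1: group codes (20 ones followed by 20 twos).
group = [1] * n_per_group + [2] * n_per_group

# Columns 2-21: Normal(10, 5) for group 1 rows, Normal(11, 5) for group 2 rows.
columns = []
for _ in range(n_columns):
    col = ([random.gauss(10, 5) for _ in range(n_per_group)]
           + [random.gauss(11, 5) for _ in range(n_per_group)])
    columns.append(col)

# One t-test per data column, comparing group 1 against group 2.
p_values = []
for col in columns:
    g1 = [v for v, g in zip(col, group) if g == 1]
    g2 = [v for v, g in zip(col, group) if g == 2]
    t, df, p = pooled_t_test(g1, g2)
    p_values.append(p)

alpha = 0.05
rejected = [i + 1 for i, p in enumerate(p_values) if p < alpha]
print("p-values:", [round(p, 3) for p in p_values])
print("tests rejecting the null:", rejected, "count:", len(rejected))
```

Because the data are random, your SPSS p-values will differ from anything this sketch prints; only the structure of the analysis carries over.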

**Grading**: The assignment is worth 10% of your course grade. There are 6 questions; the points assigned to each question are given above.

**SUBMITTING YOUR ASSIGNMENT**

Submit ONE MS Word file, converted to a pdf, containing your analyses, results and written work. To demonstrate that you have done your analyses correctly, take screenshots of your SPSS data file (your .sav data file) and of the results of your first t-test. Include all images and written work in the MS Word file. For each question above, you should therefore have either an image or written work.

Name the file "ass_1_first name_second name.pdf". My file name would be: "ass_1_jeremy_jackson.pdf".

Submit your assignment on Blackboard under the "Assignments" tab in the main menu.

**Assignment 2**

**DUE**: See the course syllabus.

**LATE PENALTIES:** A late penalty of 10% per day late (including weekends and holidays) will apply to written assignments.

**DESCRIPTION**

Assignment 2 is an analysis of the data contained in the file: assignment_2_data.xls

The data file contains 25 columns and 50 rows. The data are real and were used in a scientific publication. Column names are in row 1 of the data file, and row names are in column 1. Rows (we might say subjects) are US states, so the file contains data for 50 states. Variables include (selected columns only):

Column D) Estimated average IQ of citizens in the state in 2009.

Column E) The number of murders per 100,000 citizens in 2009.

Column F) The percent of eligible adults with a high school diploma in 2009.

Column K) The percent of adults living below the poverty line in 2009.

Column M) The unemployment rate in 2009.

Column N) The percent of males that were smokers in 2009.

Column W) The Gini coefficient in 2009.

Column X) The percent of total income earned by the top 1% of income earners in 2009.

Column Y) The percent of the adult population that were obese in 2009.

**Analysis**: Use SPSS to perform the following analyses:

1) Use the codes for "Coastal" and "Latitude" to create a new variable called "Middle" that denotes whether the state is in the middle of the country. Make 1="middle" and 0="non-middle". A state is in the middle when it has a code of "1" for Coastal (meaning not coastal) and a code of "2" for Latitude (meaning neither northern nor southern). Use SPSS syntax in a batch syntax file to do this. (1 point)

2) Identify all missing values in the file (there are 6). Use multiple regression with "IQ", "physicians per capita", "percent of female smokers" and "prison rate" as the predictors and "divorce rate" as the criterion. Write down the model (use two decimal places for the weights) and use the model to impute the missing values (use SPSS to do this; do not do it by hand). Enter the imputed values into the data file. (3 points)

3) Calculate the Pearson correlation matrix for all the variables in the file. (1 point)

4) Which of the variables are most strongly related to the state being in the middle of the US? You could ask: how are middle states different from non-middle states on the variables in the file? (1 point)

5) Calculate a PCA on all 25 variables. Are there any outlier states? What does it mean for a state to be an outlier in this context? (2 points).

6) How many components should you retain? Interpret each of the components in terms of the variables they correlate with. (2 points)

7) Bonus: The Gini coefficient describes income inequality in a given area (in this case in a state). Based on this data, discuss what you think income inequality is due to. (2 points)

8) In case you are interested, the second sheet in this file includes Gini coefficients for each state for 10 years between 2003 and 2012. Has income inequality been increasing? If so, where has this happened? Where has it not happened? (no points...just for your interest)
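The recoding in question 1 and the kind of comparison asked for in question 4 can be sketched outside SPSS as follows. This is only an illustration under stated assumptions: the six rows are invented, the state labels and IQ values are hypothetical, and any Coastal or Latitude codes other than "1" (not coastal) and "2" (neither northern nor southern) are placeholders, since the handout does not define them. The function names `middle_flag` and `pearson_r` are also made up for the example.

```python
import math

# Hypothetical rows: (state label, Coastal code, Latitude code, IQ).
# All values are invented; only Coastal == 1 and Latitude == 2 have a
# meaning taken from the assignment text.
rows = [
    ("A", 1, 2, 101.0),
    ("B", 0, 2, 99.5),
    ("C", 1, 1, 98.0),
    ("D", 1, 2, 102.5),
    ("E", 0, 3, 100.0),
    ("F", 1, 3, 97.5),
]

def middle_flag(coastal, latitude):
    """Return 1 for a middle state (not coastal AND mid-latitude), else 0."""
    return 1 if coastal == 1 and latitude == 2 else 0

def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# New 0/1 variable built from the two existing codes.
middle = [middle_flag(coastal, latitude) for _label, coastal, latitude, _iq in rows]
iq = [row[3] for row in rows]

# Correlating a 0/1 variable with a continuous one gives the point-biserial
# correlation; its sign shows which group has the higher mean.
r_middle_iq = pearson_r(middle, iq)
print("Middle:", middle)
print("r(Middle, IQ):", round(r_middle_iq, 3))
```

In the real data file the same correlation would be computed between "Middle" and each of the other variables, and the largest absolute values would point to the variables on which middle and non-middle states differ most.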

**Grading**: The assignment is worth 10% of your course grade. There are 7 questions; marks for each question are indicated above.

**SUBMITTING YOUR ASSIGNMENT**

Submit ONE pdf file containing your analyses and results. Include all output/results in the pdf file. Each page should contain the answer to one question, with the question written at the top of the page. You should have a title page and 7 answer pages for a total of 8 pages.

Name the file "ass_2_first name_second name.pdf". My file name would be: "ass_2_jeremy_jackson.pdf".

Submit your assignment on Blackboard under the "Assignments" tab in the main menu.

**Assignment 3**

**DUE**: See the course syllabus.

**LATE PENALTIES:** A late penalty of 10% per day late (including weekends and holidays) will apply to written assignments.

**DESCRIPTION**

Assignment 3 is an analysis of the data here.

**Analysis**: Use SPSS to perform the following 5 analyses:

1) Calculate cell sample mean "agreement with the message" scores by policy type and messenger. (1 point)

2) Calculate cell sample mean "artistic" and "education" scores by policy type and messenger. What do the mean differences indicate? (1 point)

3) Calculate the Pearson correlation between "artistic" and "agreement with the message" by policy type (there will be two Pearson correlations) and then by policy type and messenger (there will be six Pearson correlations). (1 point). Interpret these correlations (1 point).

4) Conduct an ANOVA with "agreement with the message" as the DV and "policy type" and "messenger" as the IVs. Show the summary table and interpret the results given in the summary table. You must interpret the main effects and interaction p-values. This will require that you clearly state each of the steps of the logic of hypothesis testing and the data/conclusions for each step. (4 points)

5) Repeat "4" but add "artistic" and "education" as covariates. How does this change your interpretation in question 4 above? (2 points)
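The cell means in question 1 and the sums of squares behind the ANOVA summary table in question 4 can be sketched in Python. This is a minimal illustration under several assumptions: the scores are invented, the level names (`policy_1`, `messenger_1`, and so on) are hypothetical, and the design is balanced with 3 scores per cell. For a balanced design these sums of squares should match what SPSS reports; with unequal cell sizes they generally will not. The sketch stops at the F ratios and does not compute p-values.

```python
from statistics import mean

# Hypothetical "agreement with the message" scores, 3 per cell.
# 2 policy types x 3 messengers; level names and values are invented.
cells = {
    ("policy_1", "messenger_1"): [4, 5, 6],
    ("policy_1", "messenger_2"): [5, 6, 7],
    ("policy_1", "messenger_3"): [6, 7, 8],
    ("policy_2", "messenger_1"): [3, 4, 5],
    ("policy_2", "messenger_2"): [3, 4, 5],
    ("policy_2", "messenger_3"): [2, 3, 4],
}
policies = sorted({policy for policy, _ in cells})
messengers = sorted({messenger for _, messenger in cells})
a, b, n = len(policies), len(messengers), 3  # levels per factor, scores per cell

def marginal_mean(index, level):
    """Mean of all scores at one level of a factor (0 = policy, 1 = messenger)."""
    return mean([x for key, scores in cells.items()
                 if key[index] == level for x in scores])

# Question 1: cell means by policy type and messenger.
cell_means = {key: mean(scores) for key, scores in cells.items()}
grand_mean = mean([x for scores in cells.values() for x in scores])

# Question 4: sums of squares for a balanced two-way ANOVA.
ss_policy = n * b * sum((marginal_mean(0, p) - grand_mean) ** 2 for p in policies)
ss_messenger = n * a * sum((marginal_mean(1, m) - grand_mean) ** 2 for m in messengers)
ss_cells = n * sum((cm - grand_mean) ** 2 for cm in cell_means.values())
ss_interaction = ss_cells - ss_policy - ss_messenger
ss_within = sum((x - cell_means[key]) ** 2
                for key, scores in cells.items() for x in scores)

df_policy, df_messenger = a - 1, b - 1
df_interaction = df_policy * df_messenger
df_within = a * b * (n - 1)
ms_within = ss_within / df_within

f_policy = (ss_policy / df_policy) / ms_within
f_messenger = (ss_messenger / df_messenger) / ms_within
f_interaction = (ss_interaction / df_interaction) / ms_within

for key in sorted(cell_means):
    print(key, "mean =", cell_means[key])
print("F(policy) =", round(f_policy, 2))
print("F(messenger) =", round(f_messenger, 2))
print("F(policy x messenger) =", round(f_interaction, 2))
```

Each F ratio compares the variability attributable to one effect with the variability of scores within cells, which is the quantity the p-values in the SPSS summary table are based on.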

**Grading**: The assignment is worth 10% of your course grade. There are 5 questions; the points assigned to each question are given above.

**SUBMITTING YOUR ASSIGNMENT**

Submit ONE PDF file containing your analyses and results. Include all output/results in this file. Do not include the raw data in this file.

Name the file "ass_3_first name_second name.pdf". My file name would be: "ass_3_jeremy_jackson.pdf".

Submit your assignment on Blackboard under the "Assignments" tab in the main menu.