Step 1: Prepare Data
For the data analysis project, use the following data sources:
NHANES 2015-2016
NHANES 2013-2014
NHANES 2011-2012
From the NHANES website, you need to download the “Demographics Data” datasets from two consecutive years and two matching datasets of your research interest. You should restrict the sample to a subpopulation of interest (for example, males only, females only, only adults, …).
After identifying the datasets, you should:
Read the
Describe the data
Identify a research question you want to answer by analyzing the data. This should include examining a relationship between at least two variables.
Draft an outline of what you want to accomplish with the variables you selected and which SAS procedures you intend to use.
Read the data into SAS and generate a permanent SAS data set.
Append, merge, and subset as appropriate to answer your research question(s).
Identify the weight, primary sampling unit, and stratum variables.
Restructure your data as needed. Remove variables that are not of interest.
When and where appropriate, recode variables, transform them, or compute new variable(s).
Apply variable formats and labels as appropriate
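The append, merge, and subset steps above can be pictured with a small sketch. The project itself is to be done in SAS; the Python below only illustrates the logic on made-up records. `SEQN` (respondent ID), `RIDAGEYR` (age in years), `WTMEC2YR` (two-year exam weight), and `BMXBMI` (body mass index) are NHANES variable names. The data values, the helper function names, the adult cutoff of 18, and the name `WT4YR` for the combined weight are assumptions made for illustration. Dividing the two-year weight by the number of cycles follows the NHANES guidance for pooling cycles.

```python
# Made-up records; real NHANES files hold thousands of rows per cycle.
demo_2013 = [
    {"SEQN": 1, "RIDAGEYR": 34, "WTMEC2YR": 20000.0},
    {"SEQN": 2, "RIDAGEYR": 12, "WTMEC2YR": 15000.0},
]
demo_2015 = [
    {"SEQN": 3, "RIDAGEYR": 51, "WTMEC2YR": 30000.0},
    {"SEQN": 4, "RIDAGEYR": 67, "WTMEC2YR": 10000.0},
]
exam_2013 = [{"SEQN": 1, "BMXBMI": 27.5}, {"SEQN": 2, "BMXBMI": 18.2}]
exam_2015 = [{"SEQN": 3, "BMXBMI": 31.0}]  # SEQN 4 has no exam record


def append_cycles(*cycles):
    """Stack the rows of several cycles (like SET a b; in a SAS DATA step)."""
    return [dict(row) for cycle in cycles for row in cycle]


def merge_by_seqn(left, right):
    """One-to-one merge on SEQN, keeping only IDs present in both tables."""
    lookup = {row["SEQN"]: row for row in right}
    return [{**row, **lookup[row["SEQN"]]} for row in left if row["SEQN"] in lookup]


def prepare(min_age=18, n_cycles=2):
    """Append, merge, subset to adults, and rescale the weight (hypothetical helper)."""
    demo = append_cycles(demo_2013, demo_2015)
    exam = append_cycles(exam_2013, exam_2015)
    merged = merge_by_seqn(demo, exam)
    adults = [row for row in merged if row["RIDAGEYR"] >= min_age]
    for row in adults:
        # WT4YR is a name chosen here for the pooled-cycle weight.
        row["WT4YR"] = row["WTMEC2YR"] / n_cycles
    return adults


analysis = prepare()
for row in analysis:
    print(row)
```

With these toy records, only respondents 1 and 3 remain: respondent 2 is under 18 and respondent 4 has no matching exam record.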
Step 2: Analysis
After completing step 1, your data is ready for analysis. Include the following in your analysis:
1. Provide the frequency distribution of 4 categorical variables (for example, education level, race/ethnicity, …).
2. Use key summary statistics to summarize 2-3 quantitative variables in your data (i.e., mean, median, minimum, maximum, and standard deviation).
3. Provide information about missing values for each variable, and make sure missing values are not included in your analysis.
4. Rerun 1 and 2 with the weight variable you identified in step 1 and compare your results with 1 and 2. Why is weighting important?
5. Apply the appropriate statistical test to examine the relationship or association (for example, chi-square for categorical variables or t-test/ANOVA for quantitative variables).
6. Use appropriate graphs to visualize the distribution of each variable involved in your primary research question.
7. Identify the primary sampling unit variable and the stratum variable.
8. Use survey procedure command(s) to rerun (4). Discuss your results by comparing the estimates with results in (2) and (4).
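To see why weighting matters, it can help to compute the same estimate with and without weights on a toy data set. The project itself is to be done in SAS; the Python below is only a sketch of the arithmetic. All values, the column layout, and the function names are made up for illustration. The sketch leaves out the primary sampling unit and stratum, which affect standard errors rather than the point estimates shown here.

```python
# Made-up rows: (education category, BMI, sample weight). None marks a missing value.
rows = [
    ("College", 24.0, 4000.0),
    ("College", 26.0, 4000.0),
    ("High school", 30.0, 1000.0),
    ("High school", None, 1000.0),
    (None, 28.0, 2000.0),
]
education = [r[0] for r in rows]
bmi = [r[1] for r in rows]
weights = [r[2] for r in rows]


def count_missing(values):
    """Number of missing (None) entries in a column."""
    return sum(1 for v in values if v is None)


def mean(values, wts=None):
    """Mean of the non-missing values; weights default to 1 (unweighted)."""
    if wts is None:
        wts = [1.0] * len(values)
    pairs = [(v, w) for v, w in zip(values, wts) if v is not None]
    total_weight = sum(w for _, w in pairs)
    return sum(v * w for v, w in pairs) / total_weight


def percent_distribution(categories, wts=None):
    """Percent in each category, computed over non-missing values only."""
    if wts is None:
        wts = [1.0] * len(categories)
    totals = {}
    for category, w in zip(categories, wts):
        if category is not None:
            totals[category] = totals.get(category, 0.0) + w
    grand_total = sum(totals.values())
    return {c: 100.0 * t / grand_total for c, t in totals.items()}


print("Missing BMI values:", count_missing(bmi))
print("Unweighted mean BMI:", mean(bmi))
print("Weighted mean BMI:", mean(bmi, weights))
print("Unweighted education %:", percent_distribution(education))
print("Weighted education %:", percent_distribution(education, weights))
```

In this toy example the unweighted mean BMI is 27.0 and the weighted mean is 26.0, and the education split moves from 50/50 unweighted to 80/20 weighted. The weighted figures describe the population each respondent represents, while the unweighted figures describe only the sample.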
Step 3: Report
Once you have completed your analysis, you should submit an organized report of your project. The report should be polished and should describe your research and findings, supported by appropriate tables and graphs. It is not appropriate to cut and paste SAS output into the report (with the exception of graphs). All graphs and tables should be described. Key features of tables, graphs, and figures should be written in full sentences.
The rubric for the report is as follows:
Introduction (1-2 paragraphs)
Describe specific research question(s) of interest
Data Source (1-2 paragraphs)
Name of the data source, years included, coverage
How was the data collected (web-based, self-administered, telephone interview)?
Target population: what population was included in the study (sample population)
Sampling: how were respondents selected? Was it simple random sampling, stratified sampling, or cluster sampling?
Inclusion/exclusion criteria (subsetting used to select the study sample for your research)
Analysis
Describe the variables used in your research
Key summary statistics for each quantitative variable (mean, median, max, min, std deviation) (see attached table template “Table 1”)
Show the frequency distribution for each categorical variable (see attached table template “Table 2”)
Include information about missing values for each variable
Report test statistic for quantitative and categorical variables as appropriate
Investigate your research questions based on results from the analysis
Include graphs to visualize the distribution of each variable used in the analysis for your primary research question
Include an appropriate graph to visualize at least one relationship in your research question
Summarize your findings (2-3 paragraphs)
Refer to tables and figures from your analysis. Describe and highlight the major findings.
Describe the variables of interest in your research. For continuous variable(s), check whether the distribution appears normal (check the graph). For categorical variables, check whether one or more categories have few or no observations and discuss how this will affect the analysis.
Describe the main findings from the analysis and discuss how they relate to your research question(s), i.e., interpretation of the results.