MIS 690 Topic 3 CLC Data Cleansing and Data Summary GCU
This is a Collaborative Learning Community (CLC) assignment.
Now that you have identified the business problem, translated it into an analytics problem, identified the data needs, and acquired the data, you will use data that you have found (or with the company’s permission you can use its data for analysis) to resolve the analytics problem. Using one or more of the following software applications (IBM SPSS Modeler, SPSS Statistics, Excel, PowerBI, Tableau, or R), analyze the data so that the findings can be used to address the established business problem in your company.
Conduct an exploratory data analysis and provide a draft outline describing the key features of the data and any significant relationships and information you found in the data set. Include specific screenshots of the graphs, tables, and other outputs that support your findings, and address the following:
- How did you verify that the data was reliable before proceeding?
- What problems did you find and how did you address them?
- What relationships did you find in the data?
- Are there any missing data?
- How are you going to summarize data samples?
- Analyze trends with respect to any appropriate characteristics that you may have discovered. Include relevant line graphs, pie charts, bar charts, and scatter plots.
- What have you done to guard against Simpson’s paradox?
- Next, you will work on descriptive analytics. Supplement your description with appropriate charts and figures, and finish by creating an appropriate dashboard in PowerBI or Tableau. Include a summary that provides a detailed overview of the data behavior you identified in the analysis. Indicate any causal relationships you found.
- Segment the data accordingly, if needed, to help describe the data behavior. Did you have to redo your sample? Can you identify any data anomalies? If there are anomalies, what do they represent and how do you avoid them?
- Indicate the steps you have taken to investigate the quality of the data and indicate any variables you have transformed or discarded as a result.
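Several of the checks in the list above (missing data, sample summaries, anomalies, and Simpson’s paradox) can be sketched in a few lines of code before building charts in the chosen tool. The following is a minimal illustration in Python using only the standard library, not a required method. The data, the column and segment names (`order_values`, `small_accounts`, `large_accounts`, groups `A` and `B`), and all helper functions are hypothetical. The 1.5 × IQR rule is one common convention for flagging anomalies and is assumed here rather than prescribed by the assignment.

```python
from statistics import mean, median, quantiles

# Hypothetical numeric column; None marks a missing entry.
order_values = [120.0, 95.5, None, 110.0, 101.0, 980.0, None, 99.0, 105.5, 115.0]

# Hypothetical (successes, trials) counts for two groups, split by segment.
segments = {
    "small_accounts": {"A": (81, 87), "B": (234, 270)},
    "large_accounts": {"A": (192, 263), "B": (55, 80)},
}


def missing_count(values):
    """Count entries recorded as None."""
    return sum(v is None for v in values)


def summarize(values):
    """Return count, mean, and median of the non-missing entries."""
    present = [v for v in values if v is not None]
    return {"n": len(present), "mean": mean(present), "median": median(present)}


def iqr_outliers(values, k=1.5):
    """Return non-missing values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    present = [v for v in values if v is not None]
    q1, _, q3 = quantiles(present, n=4)  # quartile cut points
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in present if v < low or v > high]


def simpson_reversal(segments, a="A", b="B"):
    """Return True when the per-segment comparison of a vs. b points the
    same way in every segment but the pooled comparison points the other way."""
    parts = list(segments.values())
    # Does group a have the higher rate inside each segment?
    a_wins = [s[a][0] / s[a][1] > s[b][0] / s[b][1] for s in parts]
    # Pool the counts across segments and compare the overall rates.
    pooled_a = sum(s[a][0] for s in parts) / sum(s[a][1] for s in parts)
    pooled_b = sum(s[b][0] for s in parts) / sum(s[b][1] for s in parts)
    same_direction = all(a_wins) or not any(a_wins)
    return same_direction and a_wins[0] != (pooled_a > pooled_b)
```

Calling `missing_count(order_values)`, `summarize(order_values)`, and `iqr_outliers(order_values)` gives a quick picture of gaps, central tendency, and extreme values. A large gap between mean and median is itself a hint that an anomaly is pulling the average. `simpson_reversal(segments)` reports whether the segmented and pooled comparisons disagree, which is the situation the Simpson’s paradox question asks you to rule out by segmenting the data.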
Provide the raw software files (Tableau or PowerBI) that you used for this assignment.
Synthesize the information from your draft outline to complete, in 1,500–2,000 words, the relevant components in the Data Diagnostics and Descriptive Summary section of the “Capstone Project Thesis Template.”
Submit your draft outline, raw data Excel files, screenshots, and the updated “Capstone Project Thesis Template.”
Prepare this assignment according to the guidelines found in the APA Style Guide, located in the Student Success Center. An abstract is not required.
This assignment uses a rubric. Please review the rubric prior to beginning the assignment to become familiar with the expectations for successful completion.
You are required to submit this assignment to LopesWrite. Refer to the for assistance.