UMUC Data 620 Assignment
Given unstructured text data, each student processes the text data and creates valuable business analytics, similar to those in the Elder Research and Jiffy Lube case studies. Each student creates a managerial report identifying the patterns and recommendations uncovered in the data.
Complete instructions may be found below.
Turn in separate files with the following items. Please label each with the TURN IN # statement so we can easily follow it.
| Deliverable | Description | Points |
|---|---|---|
| TURN IN #1 | Managerial report outlining your findings. Name this file “XXXX-text-analysis” where XXXX is your name. See the Grading rubric in this assignment for more details. | 200 |
| TOTAL | | 200 |
- Read the Elder Research and Jiffy Lube case studies and note the types of information they were able to extract from the text data. (We don’t have that amount of information available, or that level of software, so your conclusions won’t be quite as detailed.)
- Elder Research Inc. (2013). Improving customer retention and profitability for a regional provider of wireless services. Retrieved from https://cdn2.hubspot.net/hubfs/2176909/Resources/Elder-Research-Case-Study-Customer-Retention-nTelos.pdf .
- Jiffy Lube Uses OdinText Software to Increase Revenue. https://greenbookblog.org/2018/01/12/shell-oil-identifying-key-revenue-drivers-in-customer-comment-data/ There’s also an interesting interview video here: https://www.youtube.com/embed/2Zxmjir8Zwo?autoplay=1
- These case studies are for inspiration only; there is nothing to turn in from them.
- Choose a readily available text item which recurred over at least three time periods, spaced some distance apart. You want to make sure you have at least several thousand words of text (more than 10 pages) for each time period, and you want something which will change noticeably over that time period. Some options could include:
- The CEO’s letter to shareholders in three different years (I use Amazon.com as an example in these instructions; you may choose any other company). You may want to supplement the letters with additional information from the company if they aren’t long enough.
- The State of the Union Address (or other political speeches) from three different Presidents, such as those from Woodrow Wilson, Lyndon B. Johnson, and Barack Obama
- The State of the State Addresses from any one of our 50 states
- A writeup of something technical (like descriptions of Motor Trend’s Car of the Year and Finalists) from 1950, 1980, and 2010.
- Some industry writeup (such as PC Magazine’s best new computers for 1979, 1989, and 1999).
- Some policy over time (such as Google’s Privacy Policies)
- Recommendations about what to eat (such as the government’s Nutrition Guidelines. It could be interesting to cross-index the low-fat recommendations from 1980 with today’s ketogenic or Paleo diets.)
- Match your timeline to the subject. For political speeches, you will get the best results if they are at least 50 years apart. For faster-changing items (such as the cellular phone user’s manual), you can probably get away with things 10 years apart. You will have a much easier time making good graphs if you give yourself good raw data to work with.
- Select your desired number of time periods. You must have at least three, and can use as many as you like. (For example, 2000, 2005, and 2010 would be three time periods.)
- Take your text items for each year and convert them into a text input file.
- Run a Python program to determine the top X word count for each year. You will need to determine how many words you are going to use in your analysis; you should probably have somewhere between 5 and 30.
- You will need to make decisions about stop words.
- Do the bulk of your text processing in Python, not with the “search/replace” functions in Excel or Word. Exposure to Python is part of this class’s skill set, so use it here.
- Merge your Python word count data into an input file for Tableau. You may find it helpful to use Excel or some other tool for this.
- Analyze your data. Emphasis here will be placed on visual analysis and text analysis.
- As part of your analysis, take the top 3 interesting relevant words from your latest time period. (You can use some judgment here; for Amazon, “kindle” would be more interesting than “book” even if “book” had more occurrences.) Trace the trajectory of each of these 3 words over time – for example, at Amazon, the word “kindle” gains tremendously in popularity over time.
- Create a managerial report in Word outlining your findings.
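The word-count step above could look something like the following. This is only a minimal sketch, not a required approach: the `STOP_WORDS` set, the `top_words` function, and the sample sentence are all illustrative assumptions, and you will need to make your own stop-word decisions for your text.

```python
import re
from collections import Counter

# Hypothetical starter stop-word list (an assumption for illustration).
# Extend or trim it to suit your own text.
STOP_WORDS = {
    "the", "a", "an", "and", "or", "of", "to", "in", "for", "on", "is",
    "are", "was", "were", "be", "we", "our", "that", "this", "with",
    "as", "it", "by", "at",
}

def top_words(text, n=10, stop_words=STOP_WORDS):
    """Return the n most frequent non-stop words as (word, count) pairs."""
    # Lowercase, then keep runs of letters and apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    # Drop stray apostrophes left over from quotation marks.
    words = [w.strip("'") for w in words]
    counts = Counter(w for w in words if w and w not in stop_words)
    return counts.most_common(n)

# Made-up sample sentence, used only to show the call.
sample = (
    "The Kindle is our best-selling product. Customers love the Kindle, "
    "and Kindle books outsell paperback books."
)
print(top_words(sample, n=3))
```

For your own data, read each time period's text file into a string (for example with `pathlib.Path(...).read_text(encoding="utf-8")`) and call `top_words` once per period.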
Use what you already know about visualization. Additionally, some Tableau graphs you may find helpful for this sort of analysis include:
- Bump charts – to show changes in rank of items over time http://www.tableau.com/learn/tutorials/on-demand/bump-charts
- Packed Bubble Charts - http://onlinehelp.tableau.com/current/pro/desktop/en-us/help.htm#buildexamples_bubbles.html
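Charts like these are usually easiest to build from a "long" input file with one row per word per time period. The sketch below shows one way to merge per-period word counts into such a file and add a rank column for a bump chart. The counts are made-up illustrative numbers, and the function names and column names are assumptions, not a required layout.

```python
import csv
import io
from collections import Counter

# Made-up counts for illustration only; use your own word-count results.
counts_by_year = {
    2000: Counter({"books": 12, "customers": 9, "internet": 7}),
    2005: Counter({"customers": 14, "books": 8, "prime": 6}),
    2010: Counter({"kindle": 15, "customers": 11, "prime": 5}),
}

def to_long_rows(counts_by_year, top_n=2):
    """Build one row per (year, word) for every word that is top-N in any year."""
    tracked = set()
    for counts in counts_by_year.values():
        tracked.update(word for word, _ in counts.most_common(top_n))
    rows = []
    for year in sorted(counts_by_year):
        counts = counts_by_year[year]
        # Rank within the year: highest count first, ties broken alphabetically.
        ordered = sorted(tracked, key=lambda w: (-counts[w], w))
        for rank, word in enumerate(ordered, start=1):
            rows.append(
                {"year": year, "word": word, "count": counts[word], "rank": rank}
            )
    return rows

def write_csv(rows, handle):
    """Write the rows as CSV with a header line."""
    writer = csv.DictWriter(handle, fieldnames=["year", "word", "count", "rank"])
    writer.writeheader()
    writer.writerows(rows)

rows = to_long_rows(counts_by_year)
buffer = io.StringIO()  # in-memory stand-in for a real output file
write_csv(rows, buffer)
print(buffer.getvalue())
```

To produce an actual file, pass `open("word_counts.csv", "w", newline="")` (a hypothetical file name) instead of the in-memory buffer. A word that is missing in a given year gets a count of 0, so every tracked word has a row in every period.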
A successful report will
- Contain a title page and a list of references and pass a plagiarism check in Turnitin
- Begin with a results-filled Executive Summary (half a page to one page). The Executive Summary needs to orient the reader to your company and the analysis you are doing, and give actionable outcomes.
- Be otherwise 5-10 pages in length (I will stop reading after page 10). So if you have 1 title page, 1 list of references, and 1 page of Executive Summary, you could turn in up to 13 pages of stuff.
- Contain reasonable typeface and margins (12-point Times New Roman with 1-inch margins work just fine).
- Follow our Modified APA Formatting. This follows standard APA formatting, with one exception: please integrate your figures within the body of your document. Don’t put them in the Appendix. If you are on page 3 talking about Figure 1, make sure the reader can see Figure 1 on page 3.
- Show mastery of the readings in the class to date
- Showcase your data visualization skills in Tableau, with exactly 5 graphs (4 is not enough, and I will not look at a 6th or further graph). The online help at http://www.tableau.com/learn/tutorials/on-demand/formatting contains formatting tips which will make it look really professional.
- Contain action words in your captions. Captions need to tell the reader what is happening here – something like “Count of Top Three Words at Amazon.com” isn’t helpful. “The word ‘Kindle’ is growing fast, overtakes ‘Paperback’ in 2015” is a better choice.
- Contain a list of your top 3 words, with your reasons for choosing them, and trace their trajectory over time
- Analyze anything else you find interesting
- Integrate your charts with your conclusions
- Give a busy executive a clear path to follow in terms of action items and a “to do” list
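When tracing your top 3 words over time, raw counts can mislead if your time periods have very different amounts of text. One optional refinement, sketched here under that assumption, is to report each word's rate per 1,000 words. The texts below are toy strings invented for illustration, and `rate_per_thousand` and `trajectory` are hypothetical helper names.

```python
import re

def rate_per_thousand(text, word):
    """Occurrences of `word` per 1,000 words of `text` (0.0 for empty text)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return 1000 * tokens.count(word.lower()) / len(tokens)

def trajectory(texts_by_year, words):
    """Map each word to a list of (year, rate) pairs in chronological order."""
    years = sorted(texts_by_year)
    return {
        w: [(y, rate_per_thousand(texts_by_year[y], w)) for y in years]
        for w in words
    }

# Toy texts, invented for illustration; real inputs should be
# several thousand words per time period.
texts_by_year = {
    2000: "books books customers internet",
    2005: "customers prime books kindle",
    2010: "kindle kindle prime customers",
}

for word, points in trajectory(texts_by_year, ["kindle", "prime", "customers"]).items():
    print(word, points)
```

The resulting (year, rate) pairs can be added as an extra column in your Tableau input file alongside the raw counts.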