In the Insight Lab we focus on data science and machine learning innovations with applications to business problems brought to us by strategic partners. These unique, collaborative engagements provide organizations with the opportunity to gain insight into a big data challenge.
Organizations with exploratory big data projects bring their staff and data together with institute faculty and students in the institute’s big data lab for a focused 3- to 4-week effort to understand what is in their data and what can be done with it.
Students work in teams on data sprints to find solutions to these real business problems involving data management and applications. The students tackle each project with a company staff member and devise possible solutions.
The objective is to see if specific questions can be answered using the data or if the data may be helpful in other ways.
This project studied social media comments to understand customer feedback on some bank policies. Students first collected large data sets from various social media sites, then pre-processed the texts by removing unrelated variables, filling in missing values, and extracting potential text features. They then applied different machine learning and deep learning techniques to sentiment analysis, text classification, and topic modeling. The project produced suggestions and recommendations for service improvement at Truist Bank.
MSDA students created a beta version of a decision support tool for Starr’s underwriters to help increase their production. Students used both analytical and machine learning approaches (random forest and XGBoost) and created a Power BI dashboard to assist underwriters in understanding and visualizing submission priority.
This project leveraged internal and external data sources to predict monthly maintenance costs for different types of automobiles. Data science and analytics students used machine learning methods such as decision trees, random forests, and support vector machines to forecast car repair and regular maintenance costs.
This project examined factors that can help facilitate rehabilitation completion for substance abuse patients. MSDA students analyzed the available patient features and investigated the most important and influential features with different machine learning methods.
Better Business Bureau
The project analyzed both structured and unstructured data to improve the customer service of Better Business Bureau. In particular, students used text mining techniques to analyze customer reviews for clients of BBB, and also explored the web travel journey of potential clients.
Fast Food Restaurant
MSDA students used machine learning techniques to forecast fast food restaurant sales for both dine-in and online delivery. Social media review data were collected and features were engineered for models such as decision trees, XGBoost, and random forest.
MSDA students examined a number of components reflecting the diversity of the company’s employees at all levels of the organization. They then developed a dashboard with Dash and Plotly to interactively visualize different diversity and inclusion metrics.
Starr Companies (Fall 2020)
MSDA students examined several business areas of the insurance industry, and collected data such as premium, incurred losses, loss ratio, experience modifications and exposures for multiple states. A Power BI dashboard was built to measure territory risk scores.
MSDA students investigated the customer experience of the bank’s mortgage division in order to provide a more customized service to clients. Machine learning techniques such as random forest, SVM, and XGBoost were used in the project.
Florida Center for Capital Representation (FCCR)
Working with a set of court documents filed in Florida state courts, the students extracted key textual information to study prosecutors’ exercise of discretion in seeking the death penalty. The students used a variety of natural language processing and machine learning techniques to classify the documents into relevant categories. They also built visualization dashboards and reporting mechanisms for the FCCR to use in analyzing future sets of court documents.
Starr Companies (Spring 2020)
Students analyzed structured and unstructured data from different sources to predict the severity of Commercial Auto claims, enabling earlier detection of severe claims and more accurate severity predictions, which in turn implies more investment capability. Text mining and machine learning methods such as term frequency, sentiment analysis, word2vec, logistic regression, XGBoost, and random forest were used in this project.
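Term frequency, the simplest of the text features named above, can be sketched in a few lines. This is only an illustration under stated assumptions: the claim note, the tokenizing rule, and the function name `term_frequencies` are hypothetical and are not taken from the project.

```python
import re
from collections import Counter


def term_frequencies(text):
    """Return each token's share of all tokens in the text."""
    # Assumed tokenizer: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return {}
    counts = Counter(tokens)
    return {token: count / len(tokens) for token, count in counts.items()}


# Hypothetical claim note, invented for illustration only.
note = "Rear-end collision; driver reports neck pain, passenger reports neck pain."
tf = term_frequencies(note)

# Show the three most frequent terms in the note.
for token, share in sorted(tf.items(), key=lambda kv: -kv[1])[:3]:
    print(f"{token}: {share:.3f}")
```

Feature vectors of this kind could then be passed to a classifier such as logistic regression or a random forest.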
This project used machine learning methods to understand customer attrition. In particular, students investigated historical banking transactions to identify households that are likely to attrite, distinguish households that leave out of dissatisfaction from those lost to normal churn (evitable vs. inevitable), and identify issues to address based on the traits that distinguish these households. Various machine learning methods were implemented.
The project developed a scoring methodology to rate and rank brokers who have done business with the company. Students used unsupervised machine learning methods such as clustering and principal component analysis to identify weights of various factors that reflect broker performance and created a broker ranking system in Power BI.
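One way principal component analysis can yield factor weights for such a ranking is to normalize the loadings of the first principal component. The sketch below is an assumption about how that could work, not the students' actual method: the broker names, factor names, and figures are invented, and the first component is found by plain power iteration so that no external library is needed.

```python
# Hypothetical factors and broker data, invented for illustration only.
FACTORS = ["premium_volume", "hit_ratio", "retention_rate"]
BROKERS = {
    "Broker A": [120.0, 0.30, 0.80],
    "Broker B": [300.0, 0.45, 0.90],
    "Broker C": [80.0, 0.20, 0.60],
    "Broker D": [200.0, 0.35, 0.85],
}


def standardize(rows):
    """Scale each column to zero mean and unit variance."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    stds = [(sum((x - m) ** 2 for x in c) / n) ** 0.5 for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(r, means, stds)] for r in rows]


def first_component(z, iterations=200):
    """Approximate the first principal component by power iteration."""
    n, d = len(z), len(z[0])
    cov = [[sum(r[i] * r[j] for r in z) / n for j in range(d)] for i in range(d)]
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v


def rank_brokers(brokers):
    """Return broker names sorted by weighted score, plus the factor weights."""
    names = list(brokers)
    z = standardize([brokers[name] for name in names])
    # Assumption: absolute loadings, rescaled to sum to 1, serve as weights.
    loadings = [abs(x) for x in first_component(z)]
    weights = [x / sum(loadings) for x in loadings]
    scores = {name: sum(w * x for w, x in zip(weights, row)) for name, row in zip(names, z)}
    return sorted(scores, key=scores.get, reverse=True), weights


ranking, weights = rank_brokers(BROKERS)
print(ranking)
print(dict(zip(FACTORS, weights)))
```

Taking absolute loadings assumes every factor is oriented so that higher is better; a factor where lower is better would need to be sign-flipped before scoring.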
This project explores blockchain technology. Students experimented with technologies such as Ethereum, Quorum and Hyperledger to develop a blockchain architecture to connect different parties of relevance to TSYS, perform associated analytics, set up relevant accounts in the blockchain and develop smart contracts to manage transactions. A prototype was built for deployment.
Students examined historical sales and pricing data and used machine learning to dynamically predict the optimal price for semi-commodity products. Machine learning techniques such as logistic regression, decision trees, and random forest, along with deep learning methods such as LSTM, were used to predict sales.
Better Business Bureau
This project gave the Better Business Bureau insight into the causative factors that might influence an individual's or customer's decision to do or maintain business with the organization, through data analysis, machine learning, and predictive modeling. In particular, students used both sentiment analysis and topic modeling on customer review and complaint texts to explore the behavior of businesses in different industries.
Students examined historical bank transactions to identify potential instances of money laundering. Different machine learning methods such as decision trees, random forest, support vector machines, logistic regression, and neural networks were implemented to support anti-money laundering (AML) efforts at SunTrust.
Students analyzed various data formats at Starr and worked to automate the data input process, especially for unstructured texts and images. In addition, topic models such as LDA were used to summarize and classify topics in documents, and the sentiments of these documents were analyzed as well.
Metro Atlanta Chamber
Students analyzed unstructured data from different sources including Twitter, news media, Reddit, Facebook, and Google search trends. A system was built to systematically collect, clean, process, and analyze data that is relevant to the reputation of the city of Atlanta. Analyses included relevance filtering, topic modeling, and sentiment analyses. Machine learning algorithms were applied to improve the accuracy of classification and unsupervised learning. The results (refreshed periodically) are pushed to an online and interactive dashboard.
Barrett & Farahany
Students analyzed unstructured text data from legal documents and court records in order to classify lawsuit outcomes and develop a predictive model for forecasting the steps through which a lawsuit would progress and its conclusion. Methodologies used included topic modeling, Word2Vec, and various machine learning classification algorithms.
Georgia-Pacific challenged Robinson students to determine whether image recognition applied to operational images can detect fraud and monitor activity. Students matched same-day inbound/outbound truck images and explored the use of image data in logistics.
Robinson students were asked by SunTrust Banks to explore what customer website behavior leads to a sale and whether the bank can tailor individual interaction in real time. During the project, students measured how “visitor engagement” increased the probability that a customer would acquire a new product.
Robinson students engaged with WestRock to improve its plant operations through image analytics. Students took pictures of corrugated boxes on an assembly line, read the labels captured in the images, and gauged the descriptions' accuracy by comparing them to the physical products. Students also took product inventory through intensity differentiation of the images.
Using Robinson’s big data lab, students used text-mining to predict client attrition for SunTrust Banks. Investigating “unstructured” texts such as underwriter’s notes, client acquisition or risk review, and sales manager’s notes from servicing clients, students provided approaches for supporting SunTrust’s goal.
American Red Cross
To address the ongoing demand and need for blood, students set out to determine whether the American Red Cross could identify those likely to be repeat donors and those likely to be high-value donors. Students analyzed demographic, geographic, and behavioral profiles for donors and offered insights on drivers of donor loss and retention.
Students were challenged to use Starr Companies’ data on customer attributes in an existing line of business to determine what external data is useful and for what purpose in the property-casualty business. Using Robinson’s big data lab, students conducted data analysis and mining on the existing book of business to find correlations and patterns.