In the Insight Lab we focus on data science and machine learning innovations with applications to business problems brought to us by strategic partners. These unique, collaborative engagements provide organizations with the opportunity to gain insight into a big data challenge.
Organizations with exploratory big data projects bring their staff and data together with institute faculty and students in the institute’s big data lab for a focused 3- to 4-week effort to understand what is in their data and what can be done with it.
Students work in teams on data sprints to find solutions to these real business problems involving data management and applications. The students tackle each project with a company staff member and devise possible solutions.
The objective is to see if specific questions can be answered using the data or if the data may be helpful in other ways.
Two teams of MSDA students worked on inventory optimization and quote success prediction.
After engineering a set of related features, students implemented various machine learning models to predict when AVGroup is likely to win a client order. They also recommended inventory levels, safety stock, and product margins to enhance the company’s profits.
In this project, three groups of MSDA students worked with datasets of consumer reviews and company responses. Text mining techniques were used to pinpoint consumer reviews that pertain to Diversity, Equity, and Inclusion (DEI) and to discern the elements of DEI most valued by consumers. Predictive machine learning algorithms were built to help BBB improve its customer service and reputation.
In integrating addressable TV advertising with linear advertising, Cox Media needed to better understand how much inventory (i.e., advertising slots) to dedicate to addressable advertising. Several MSDA students analyzed weekly addressable advertisement pass and fail data to determine when to optimally deliver addressable ads. Recommendations regarding inventory types and quantity were made to help understand when addressable advertising was most likely to be successful.
MSDA students conducted a project analyzing new product launches for a manufacturer. Objectives included assessing launch performance through retailer orders and customer reviews. The project integrated sales data analysis, sentiment analysis, and interactive dashboards, enabling real-time monitoring and evaluation of new product launches.
Nine MSDA students worked on anomaly detection problems for an MBUSA database. They designed a cloud-based pipeline, including data processing algorithms and machine learning models, to detect and analyze anomalies in the database. They also provided a recommendation system that can be integrated as a checkpoint into the organization's data flow.
This project focuses on predicting the optimal personnel allocation based on weather, seasons, and various unforeseen events. The aim is to ensure that staffing is neither excessive nor insufficient, preventing any delay in business operations. Students in the MSDA program initially employed web scraping techniques to collect data. They then integrated various time series models with the latest methods in machine learning and deep learning.
Four teams of MSDA students explored how to use graph databases to model the Truist banking system. In a staged process, these teams generated a synthetic dataset and then each engineered a distinct graph database. Using these databases, they tackled a range of financial tasks, including product recommendation and risk prediction, with different graph machine learning and deep learning models.
This project focuses on comparing Truist Bank's TCFD (Task Force on Climate-Related Financial Disclosures) reports with those of other national and local banks. The aim is to study the work Truist has already done in Environment and Sustainability and identify areas for improvement. Students applied recent NLP (Natural Language Processing) techniques: embedding-based methods to calculate similarities and differences between documents, BERT-based question answering to extract key information from lengthy files, and tools like ChatGPT to automate the generation of comparative results.
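The document-similarity step described above can be illustrated with a minimal sketch. It is not the students' implementation: the `embed` function below is a toy bag-of-words stand-in for a learned embedding model, and the function names and sample texts are hypothetical, chosen only to show how cosine similarity ranks documents against a reference.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector.

    Assumption: a real pipeline would use a learned embedding instead.
    """
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(reference, others):
    """Return (name, score) pairs sorted from most to least similar."""
    ref_vec = embed(reference)
    scores = {
        name: cosine_similarity(ref_vec, embed(text))
        for name, text in others.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical snippets, not taken from any real disclosure.
    reference = "climate risk disclosure and governance"
    others = {
        "bank_a": "governance of climate risk",
        "bank_b": "quarterly earnings call",
    }
    for name, score in rank_by_similarity(reference, others):
        print(f"{name}: {score:.2f}")
```

Swapping `embed` for a sentence-embedding model would leave the ranking logic unchanged, since cosine similarity only needs two vectors in the same space.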
Truist Blockchain Pilot Study with Prototype Project:
MSDA students investigated potential blockchain applications at Truist and built a prototype for a particular Truist business unit. Students explored different blockchain platforms, such as Ethereum, Hyperledger Fabric, Sawtooth, and Solana, among others, and implemented smart contracts for various transactions.
Truist Data Automation with OCR Project:
MSDA students worked on both internal OCR documents and external public data, and automated the data collection and verification process. Techniques used include web scraping, information extraction from both Adobe Acrobat and character recognition files, and risk classification with machine learning.
Four teams of MSDA students worked on forecasting procurement costs and examined the factors and features needed to win customer orders under competition at airline service company AvGroup. The students built both visualizations and machine learning/deep learning models to gain insights.
Twelve MSDA students worked with SERC (Southeast Regional Cooperative) to find opportunities to obtain more food and reduce logistics costs in order to help feed the hungry. Students used both machine learning and network analysis to analyze structured data in Florida and Georgia, and built a dashboard to streamline the data management process. This project is meaningful and helpful to our community and society.
MSDA students used machine learning and deep learning with structured and unstructured text and voice data to help resolve consumers' complaints about various businesses. This sprint project helped improve consumer wellness and the reputations of these businesses over time.
This project focused on studying social media comments to understand customer feedback on some bank policies. Students first collected large data sets from various social media sites, then pre-processed the texts by removing unrelated variables, filling in missing values, and extracting potential text features. Subsequently, they applied different machine learning and deep learning techniques to sentiment analysis, text classification, and topic modeling. The project produced suggestions and recommendations for service improvement at Truist Bank.
MSDA students created a beta version of a decision support tool for Starr’s underwriters to help increase their production. Students used both analytical and machine learning approaches (random forest and XGBoost) and created a Power BI dashboard to assist underwriters in understanding and visualizing submission priority.
This project leveraged internal and external data sources to predict monthly maintenance costs for different types of automobiles. Data science and analytics students used machine learning methods such as decision trees, random forests, and support vector machines to forecast car repair and regular maintenance costs.
This project examined factors that can help facilitate rehabilitation completion for substance abuse patients. MSDA students analyzed the available patient features and identified the most important and influential ones using different machine learning methods.
Better Business Bureau
The project analyzed both structured and unstructured data to improve the customer service of the Better Business Bureau. In particular, students used text mining techniques to analyze customer reviews of BBB clients, and also explored the web journey of potential clients.
Fast Food Restaurant
MSDA students used machine learning techniques to forecast fast food restaurant sales for both dine-in and online delivery. Social media review data were collected and features were engineered for models such as decision trees, XGBoost, and random forest.
MSDA students examined a number of components reflecting the diversity of the company’s employees at all levels of the organization. They then developed a dashboard with Dash and Plotly to interactively visualize different diversity and inclusion metrics.
Starr Companies (Fall 2020)
MSDA students examined several business areas of the insurance industry, and collected data such as premium, incurred losses, loss ratio, experience modifications and exposures for multiple states. A Power BI dashboard was built to measure territory risk scores.
MSDA students investigated the customer experience of the bank's mortgage division in order to provide a more customized service to clients. Machine learning techniques such as random forest, SVM, and XGBoost were used in the project.
Florida Center for Capital Representation (FCCR)
Working with a set of court documents filed in Florida state courts, the students extracted key textual information to study prosecutors’ exercise of discretion in seeking the death penalty. The students used a variety of natural language processing and machine learning techniques to classify the documents into relevant categories. They also built visualization dashboards and reporting mechanisms for the FCCR to use in analyzing future sets of court documents.
Starr Companies (Spring 2020)
Students analyzed structured and unstructured data from different sources to predict the severity of Commercial Auto claims, enabling earlier detection of claim severity and more accurate severity predictions, which implies greater investment capability. Text mining and machine learning methods such as term frequency, sentiment analysis, word2vec, logistic regression, XGBoost, and random forest were used in this project.
This project used machine learning methods to understand customer attrition. In particular, students investigated historical banking transactions to identify households that are likely to attrite, distinguish households that leave out of dissatisfaction from those that leave through normal churn (evitable vs. inevitable), and identify issues to address based on the traits that distinguish these households. Various machine learning methods were implemented.
The project developed a scoring methodology to rate and rank brokers who have done business with the company. Students used unsupervised machine learning methods such as clustering and principal component analysis to identify the weights of various factors that reflect broker performance, and created a broker ranking system in Power BI.
This project explores blockchain technology. Students experimented with technologies such as Ethereum, Quorum and Hyperledger to develop a blockchain architecture to connect different parties of relevance to TSYS, perform associated analytics, set up relevant accounts in the blockchain and develop smart contracts to manage transactions. A prototype was built for deployment.
Students examined historical sales and pricing data and used machine learning to dynamically predict the optimal price for semi-commodity products. Machine learning techniques such as logistic regression, decision trees, and random forest, along with deep learning methods such as LSTM, were used to predict sales.
Better Business Bureau
This project gives the Better Business Bureau insight into the factors that may influence a customer's decision to do or maintain business with the organization, through data analysis, machine learning, and predictive modeling. In particular, students applied both sentiment analysis and topic modeling to customer review and complaint texts to explore the behavior of businesses in different industries.
Students examined historical bank transactions to identify potential cases of money laundering. Different machine learning methods such as decision trees, random forest, support vector machines, logistic regression, and neural networks were implemented to support anti-money laundering (AML) efforts at SunTrust.
Students analyzed various data formats at Starr and worked to automate the data input process, especially for unstructured texts and images. In addition, topic models such as LDA were used to summarize and classify topics in documents, and the sentiment of these documents was analyzed as well.
Metro Atlanta Chamber
Students analyzed unstructured data from different sources including Twitter, news media, Reddit, Facebook, and Google search trends. A system was built to systematically collect, clean, process, and analyze data that is relevant to the reputation of the city of Atlanta. Analyses included relevance filtering, topic modeling, and sentiment analyses. Machine learning algorithms were applied to improve the accuracy of classification and unsupervised learning. The results (refreshed periodically) are pushed to an online and interactive dashboard.
Barrett & Farahany
Students analyzed unstructured text data from legal documents and court records in order to classify lawsuit outcomes and develop a predictive model for forecasting the steps through which a lawsuit would progress and its conclusion. Methodologies used included topic modeling, Word2Vec, and various machine learning classification algorithms.
Georgia-Pacific challenged Robinson students to determine whether image recognition applied to operational images can detect fraud and monitor activity. Students matched same-day inbound/outbound truck images and explored the use of image data in logistics.
Robinson students were asked by SunTrust Banks to explore what customer behavior on the website leads to a sale and whether the bank can tailor individual interactions in real time. During the project, students measured the impact of “visitor engagement” on the probability that a customer would acquire a new product.
Robinson students engaged with WestRock to improve its plant operations through image analytics. Students took pictures of corrugated boxes on an assembly line, read the labels captured in the images, and gauged the descriptions' accuracy by comparing them to the physical products. Students also took product inventory through intensity differentiation of the images.
Using Robinson’s big data lab, students applied text mining to predict client attrition for SunTrust Banks. Investigating “unstructured” texts such as underwriter’s notes, client acquisition or risk reviews, and sales manager’s notes from servicing clients, students proposed approaches to support SunTrust’s goal.
American Red Cross
To address the ongoing need for blood, students set out to determine whether the American Red Cross could identify donors likely to give repeatedly and donors likely to be high-value. Students analyzed demographic, geographic, and behavioral profiles of donors and offered insights on the drivers of donor loss and retention.
Students were challenged to use Starr Companies’ data on customer attributes in an existing line of business to determine what external data is useful and for what purpose in the property-casualty business. Using Robinson’s big data lab, students conducted data analysis and mining on the existing book of business to find correlations and patterns.