Research Projects

Influence Maximization in Complex Networks

Led by Mehmet Aktas

Identifying the most influential structures in a network, known as the influence maximization (IM) problem, is a central issue in graph analytics with a wide range of practical applications, including information diffusion, market advertising, and rumor management. This project focuses on identifying influential structures, such as nodes, edges, and communities, in diverse network models, including graphs and hypergraphs. It also explores the impact of various IM models on social media influencer marketing.
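
As a rough illustration of the IM problem described above, the sketch below greedily selects seed nodes under an independent cascade diffusion model. It is a generic textbook-style example, not the project's own method; the function names, the uniform activation probability `prob`, and the toy graph are all assumptions made for illustration.

```python
import random

def simulate_ic(graph, seeds, prob, rng):
    """Run one independent-cascade simulation and return the number of activated nodes."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for node in frontier:
            for neighbor in graph.get(node, []):
                # Each newly active node gets one chance to activate each inactive neighbor.
                if neighbor not in active and rng.random() < prob:
                    active.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return len(active)

def expected_spread(graph, seeds, prob, runs, rng):
    """Estimate the expected spread of a seed set by averaging repeated simulations."""
    return sum(simulate_ic(graph, seeds, prob, rng) for _ in range(runs)) / runs

def greedy_im(graph, k, prob=0.1, runs=200, seed=0):
    """Pick k seed nodes, each time adding the node with the largest estimated spread."""
    rng = random.Random(seed)
    nodes = set(graph) | {v for nbrs in graph.values() for v in nbrs}
    chosen = []
    for _ in range(k):
        best_node, best_spread = None, -1.0
        for candidate in sorted(nodes - set(chosen)):
            spread = expected_spread(graph, chosen + [candidate], prob, runs, rng)
            if spread > best_spread:
                best_node, best_spread = candidate, spread
        chosen.append(best_node)
    return chosen

# Hypothetical directed toy graph given as adjacency lists.
toy_graph = {"a": ["b", "c", "d"], "e": ["f"]}
print(greedy_im(toy_graph, k=2, prob=1.0, runs=5))
```

With `prob=1.0` every edge fires, so the selection is deterministic; with smaller probabilities the result depends on the number of simulation runs and the random seed.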

Human-AI Interaction: Human Behavior and AI Performance

Led by Kai Zhao

The last ten years have seen a boom in AI technology development. According to the AI Index Report 2022 from Stanford's Institute for Human-Centered AI, the number of AI patents filed in 2021 was more than 30 times higher than in 2015, a compound annual growth rate of 76.9%. Many AI algorithms have been developed to model and predict human behavior for business applications, ranging from predicting how people move through cities for mobile marketing to guiding stock investment decisions with sentiment analysis of unstructured big data. AI can alter human behavior, and humans in turn create, inform, and mold the behavior of AI: we shape AI systems by training them on observations of human behavior drawn from the data we generate daily. Humans are dynamic, and our behavior changes regularly. For example, many people shifted their purchasing from offline to online during COVID-19. As more companies adopt AI in their operations, our research on modeling how AI interacts with the dynamics of human behavior can be a crucial part of advancing new business research theories.

Mutual Fund Risk Disclosures

Led by Anne Tucker and Yusen Xia

This project works to extract mutual fund disclosures and analyze statements of investment strategy and the attendant risks. The research team has developed text extraction code and is leveraging machine learning methodologies combined with the legal subject matter expertise to confirm compliance with SEC regulations, identify and aggregate mutual fund risks, analyze tone and sentiment of strategy statements, and explore relationships between mutual fund disclosure features and fund performance.

Litigation

Led by Charlotte Alexander and Anne Tucker

This project continues earlier Legal Analytics Lab work using docket sheets to explore case pathways, with a particular focus on dispositive motions such as motions to dismiss or for summary judgment. Working with a team of M.S. in Analytics and JD students, Alexander and Tucker are leveraging text analytics and machine learning to gain further insight into the frequency and predictors of certain case pathways and outcomes.

Image Analytics to Improve Firm Operations

Led by Yusen Xia

This research explores the use of image analytics to improve firm operations, with the goal of higher efficiency, greater productivity, and better customer service. The study investigates both traditional image analytics tools (e.g., image enhancement, segmentation, and object identification) and deep learning methods (e.g., convolutional neural networks, U-Nets, and auto-encoders).

Understanding the Content of Online Reviews

Led by Yichen Cheng and Yichuan Zhao

This study explores the content of online reviews. The growth of online shopping has made online reviews a critical source of information for consumers. However, there can be thousands of reviews for a single product. For example, Amazon’s Echo Dot accumulated more than 100,000 reviews in its first two years. This volume makes it difficult to find useful and relevant information in the post-purchase experiences of others. The researchers developed a methodology that produces a simple representation of the information revealed in reviews. Specifically, for each product, they extracted the relevant aspects of the product that are discussed in the reviews and developed a measure of each reviewer’s satisfaction with those aspects. They applied this methodology to a large review dataset from Amazon and showed that initial reviewers report a few salient aspects of the product and their experiences with those aspects, and that subsequent reviewers continue to report their experiences with the same aspects. They found that user satisfaction with these aspects differs sharply between favorable and less favorable reviews. Somewhat surprisingly, aspects that generate strong satisfaction in positive reviews receive only a neutral or muted mention in negative reviews. The results suggest simple strategies that platforms hosting reviews can use to provide relevant and useful information to customers.

Truth Discovery & Crowdsourcing

Led by Houping Xiao

With advances in both data collection and data storage, information veracity (i.e., the fact that individual sources may not be reliable) has become a severe challenge in the big data area. To address this challenge, Dr. Xiao works on Truth Discovery, or Crowdsourcing: the process of discovering the truth that is usually hidden in a large amount of noisy or even conflicting crowdsourced data. Dr. Xiao has developed multiple approaches to resolve conflicts in crowds and harness the wisdom of crowds in different applications, including Crowd Sensing, FinTech, and Computerized Adaptive Testing (CAT). For instance, Dr. Xiao has developed a novel framework that aggregates EPS estimates from multiple platforms and provides a better EPS forecast than Wall Street or IBES. For more information, please refer to the website.
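
To make the idea of truth discovery concrete, the sketch below alternates between estimating each item's value as a reliability-weighted average and re-estimating each source's reliability from its error. It is a generic illustration and not Dr. Xiao's framework; the function name, the log-ratio weighting rule (one choice used in the truth-discovery literature), and the sample numbers are assumptions.

```python
import math

def truth_discovery(claims, iterations=20):
    """Iteratively estimate item truths and source weights.

    claims maps each source name to a dict of {item: numeric value}.
    """
    sources = list(claims)
    items = sorted({item for values in claims.values() for item in values})
    weights = {s: 1.0 for s in sources}  # start by trusting every source equally
    truths = {}
    for _ in range(iterations):
        # Truth update: weighted average of the values claimed for each item.
        for item in items:
            reporting = [s for s in sources if item in claims[s]]
            total_weight = sum(weights[s] for s in reporting)
            truths[item] = sum(weights[s] * claims[s][item] for s in reporting) / total_weight
        # Weight update: sources with a smaller squared error get a larger weight.
        errors = {
            s: sum((value - truths[item]) ** 2 for item, value in claims[s].items()) + 1e-9
            for s in sources
        }
        total_error = sum(errors.values())
        weights = {s: max(-math.log(errors[s] / total_error), 1e-9) for s in sources}
    return truths, weights

# Hypothetical forecasts from three sources; source "C" disagrees with the others.
sample_claims = {
    "A": {"x": 1.0, "y": 2.0},
    "B": {"x": 1.1, "y": 2.1},
    "C": {"x": 5.0, "y": 9.0},
}
estimated_truths, source_weights = truth_discovery(sample_claims)
print(estimated_truths, source_weights)
```

In this toy input the two sources that agree end up with much larger weights than the outlier, so the estimates settle near their values rather than near the plain average.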

A Study of Ten Years of Employee Misclassification Decisions

Led by Charlotte Alexander and Javad Feizollahi

This project examines the text of judges' decisions in employee misclassification cases -- or lawsuits where a worker's status as an independent contractor or employee is in dispute -- to understand how courts distinguish between the two categories. The law does not provide clear rules in these cases, so judges are called upon to apply a loose set of standards, producing written opinions that are highly unstructured. Using text mining and machine learning classification models, this project seeks to find patterns in judges' decision-making and provide more clarity on the state of the law in this area.

Plaintiffs' Attorney Networks as Litigation Drivers

Led by Charlotte Alexander

This project maps four types of network relationships among plaintiffs’ lawyers who filed wage and hour lawsuits under the Fair Labor Standards Act (FLSA) in federal court over 17 years: overlapping college attendance, law school attendance, shared professional association memberships, and co-counseling linkages. The first three linkage types are hypothesized to layer underneath, and predict, the fourth: shared educational experience and affinity group membership may make co-counseling more likely. Further, the project explores whether co-counseling relationships, particularly those across borders, influence case-filing numbers. To adopt a public health frame, attorneys from high-volume FLSA “hot spot” jurisdictions who join forces with lawyers who practice in other courts or states may act as vectors for the spread of FLSA litigation. This project uses an original data set of all federal FLSA cases filed between 2000 and 2016 to explore the existence of layered network relationships within the FLSA plaintiffs’ bar, and to investigate the extent to which these network relationships acted as litigation drivers.

Dimension Reduction for Feature Engineering

Led by Yichen Cheng

One of Yichen Cheng's research areas is dimension reduction, which helps machine learning models fit better when there are many features. When the number of dimensions or features is high, classical analytics methods may fail, so it is important to reduce the dimensionality of the data before applying statistical or machine learning methods. In this project, we developed a supervised dimension reduction method that works especially well for high-dimensional data. Applications include data visualization, feature engineering and selection for prediction, and inference.