RESEARCH PROJECTS
Toward Precision Stroke Rehabilitation: An Integrated Machine Learning and Clinician Feedback Approach
The objective of this project was to identify relationships between key facets of rehabilitation and stroke survivors’ independence following inpatient rehabilitation. In a collaborative partnership between data scientists and clinicians seeking to improve precision stroke rehabilitation, we combined causal machine learning (ML) with iterative clinician feedback. Data came from stroke patients (n = 560) seeking to improve independence during inpatient rehabilitation, where treatments included sessions (e.g., physical therapy, chaplaincy services) and medication management. We found treatment effects for sessions (e.g., therapy) and for medications that varied with patient characteristics. The causal ML analyses confirmed that heterogeneous treatment effects (HTEs) are present in rehabilitation, so tailoring treatments to specific patient characteristics is likely to improve independence for stroke patients. We also found that coupling causal ML with clinician feedback throughout the research process helped minimize unnecessary complexity.
A Graph-Based Approach to Expert Identification in Crowdsourcing
This project introduces a novel expert detection method for crowdsourcing forums that identifies experts at a more granular, tag-specific level, addressing the limitations of current methods that only recognize expertise at the forum-wide level. By modeling users' interactions as graphs and hypergraphs and using the sheaf Laplacian to analyze information diffusion, the proposed approach offers a deeper understanding of how expertise spreads across different question tags. The newly defined sheaf Laplacian centrality further refines expert identification by measuring user expertise within each tag.
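To make the diffusion idea concrete, the following is a minimal sketch of a sheaf Laplacian and one diffusion step on an ordinary graph. It is an illustration under simplifying assumptions, not the project's method: every stalk is one-dimensional (so each restriction map is a single number), hypergraphs are not covered, and the function names `sheaf_laplacian` and `diffuse` are hypothetical. The sheaf Laplacian centrality defined in the project is not reproduced here.

```python
def sheaf_laplacian(n_nodes, edges):
    """Build L = delta^T delta for a sheaf with 1-D stalks.

    edges: list of (u, v, r_u, r_v), where r_u and r_v are the scalar
    restriction maps from nodes u and v into the shared edge.
    The coboundary on each edge is r_v * x[v] - r_u * x[u].
    """
    L = [[0.0] * n_nodes for _ in range(n_nodes)]
    for u, v, r_u, r_v in edges:
        L[u][u] += r_u * r_u
        L[v][v] += r_v * r_v
        L[u][v] -= r_u * r_v
        L[v][u] -= r_u * r_v
    return L


def diffuse(L, x, step=0.1, n_steps=1):
    """Explicit Euler steps of the heat equation dx/dt = -L x."""
    n = len(x)
    for _ in range(n_steps):
        Lx = [sum(L[i][j] * x[j] for j in range(n)) for i in range(n)]
        x = [x[i] - step * Lx[i] for i in range(n)]
    return x


# Toy example (hypothetical): three users on a path 0 - 1 - 2.
# With all restriction maps equal to 1, L is the usual graph Laplacian.
path_edges = [(0, 1, 1.0, 1.0), (1, 2, 1.0, 1.0)]
L_path = sheaf_laplacian(3, path_edges)
spread = diffuse(L_path, [1.0, 0.0, 0.0], step=0.5)
```

Choosing restriction maps other than 1 lets each interaction weight or transform the signal differently on each side of an edge, which is what distinguishes a sheaf Laplacian from a plain graph Laplacian.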
Enhancing Decision-Making with Context-Aware, Multimodal, Knowledge-Infused AI Systems
Online platforms are centered around individuals and communities exchanging multimodal content (e.g., text, images, and video) that influences user behavior and social dynamics. This rich multimodal environment enhances the user experience but complicates computational modeling, because semantic and contextual cues span these modalities. Identifying and incorporating psycho-social cues is crucial for capturing the holistic meaning of the content and retrieving meaningful insights for decision-making. Our research is dedicated to developing intelligent systems that measure and predict the driving factors of communication and human behavior on the social web and generate actionable insights about them. We employ a hybrid neuro-symbolic approach that leverages knowledge graphs to enrich computational models, including large language models (LLMs) and large vision-language models (LVLMs). This integration facilitates a deeper understanding of the nuances in multimodal data, enabling us to address complex challenges in business marketing, healthcare, and social issues more effectively.
Using Generative AI to Improve Firm Operations and Supply Chain Performance
This research explores the use of generative AI methods, especially large language models, to improve firm operations and supply chain performance, with the aim of higher efficiency, greater productivity, and better customer service. The project considers product demand forecasting, customer service improvement, supply chain risk identification and mitigation, and inventory optimization. Prompt engineering, retrieval-augmented generation, and agentic solutions are utilized.
Modeling Social Media Influencer Impact for Enhanced Digital Marketing
This project aims to streamline influencer marketing by developing automated systems that enhance influencer identification and optimize campaign targeting. Representing social media as graphs and hypergraphs and using the sheaf Laplacian, we model information diffusion on social networks for different influencer types, such as macro, micro, and AI influencers, while also considering product characteristics (familiarity, acceptance) and follower demographics (age, gender). The flexibility of the sheaf Laplacian allows for a comprehensive approach to modeling how different influencers impact information diffusion, ultimately helping brands maximize the return on investment from their campaigns in an evolving digital landscape.
An Interpretable and Scalable Multimodal Multiple Instance Learning Model for Understanding Review Ratings
In this project, we propose a multimodal multiple instance learning (MIL) model to infer review ratings from both the review text and the images posted by the user. Our goal is to identify the parts of the text and images that are most informative about the rating. To achieve that, we adopt the MIL framework in a Bayesian setting. To speed up computation and make the algorithm scalable to big data, we use Variational Bayes to approximate the posterior. We applied the resulting model to Yelp restaurant reviews and found interesting results.
Automated Text Analysis of Franchise Disclosure Documents: A Systematic Framework
Drawing on a tradition of scholarship in applied computer science, we outline a method for automated text analysis of franchise disclosure documents. We provide a step-by-step guide by which researchers can acquire, clean and pre-process, extract features from, and apply learning-based models or neural networks in natural language processing to franchise disclosure documents. Our process offers franchising, legal, and B2B scholars a template to utilize advances in artificial intelligence-enabled natural language processing to enrich current research agendas.
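The clean, pre-process, and feature-extraction steps described above can be sketched with the Python standard library alone. This is a minimal illustration, not the project's pipeline: the stopword list, the function names, and the two sample sentences are hypothetical placeholders, and real franchise disclosure documents would first need to be acquired and converted to text.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list for illustration only.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in"}


def clean(text):
    """Lowercase, replace non-letters with spaces, collapse whitespace."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text):
    """Split cleaned text into tokens and drop stopwords."""
    return [tok for tok in clean(text).split() if tok not in STOPWORDS]


def bag_of_words(docs):
    """Return a sorted vocabulary and one term-count row per document."""
    tokenized = [tokenize(doc) for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        rows.append([counts[term] for term in vocab])
    return vocab, rows


# Made-up sample sentences standing in for document text.
sample_docs = [
    "The Franchisee must pay a fee.",
    "Fee: the franchisor sets the FEE!",
]
vocab, rows = bag_of_words(sample_docs)
```

The resulting count matrix is the kind of feature representation that a learning-based model could take as input; richer features such as TF-IDF weights or embeddings would replace the counting step.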
Resolving Conflicts in Crowds: An Earnings Forecast Application
The aim of this project is to combine earnings forecasts from various information sources to produce a more accurate consensus. To resolve conflicts among the sources, we propose an optimization framework that iteratively refines both the consensus earnings forecast and the reliability of each source (i.e., the quality or accuracy of forecasts from each venue). In line with the wisdom-of-crowds principle, the new consensus is more accurate than the Wall Street consensus 67.5% of the time and surpasses the IBES consensus 67.4% of the time. Additionally, this enhanced consensus proves incrementally valuable in forecasting earnings and sheds light on post-analyst revision drifts. For more information about other Fintech projects, please refer to the website https://houpingx.github.io/fintech.html.
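The alternating refinement of a consensus forecast and per-source reliability described in the earnings forecast project can be illustrated with a generic truth-discovery loop. The sketch below is a stand-in under stated assumptions, not the project's optimization framework: it uses squared-error loss and log-ratio reliability weights in the style of conflict-resolution methods from the truth-discovery literature, the function name `resolve_conflicts` is hypothetical, and the forecasts are made-up toy numbers.

```python
import math


def resolve_conflicts(forecasts, n_iter=20, eps=1e-9):
    """Alternate between a weighted consensus and source reliability weights.

    forecasts: dict mapping source name -> list of forecasts, one per firm.
    Returns (consensus, weights). Requires at least two sources.
    """
    sources = list(forecasts)
    if len(sources) < 2:
        raise ValueError("need at least two sources to resolve conflicts")
    n_items = len(forecasts[sources[0]])
    weights = {s: 1.0 for s in sources}  # start from equal reliability
    consensus = [0.0] * n_items
    for _ in range(n_iter):
        # Step 1: consensus is the reliability-weighted mean per firm.
        total_w = sum(weights.values())
        consensus = [
            sum(weights[s] * forecasts[s][i] for s in sources) / total_w
            for i in range(n_items)
        ]
        # Step 2: a source that deviates more from the consensus gets
        # a smaller weight (log of total loss over its own loss).
        losses = {
            s: sum((forecasts[s][i] - consensus[i]) ** 2 for i in range(n_items)) + eps
            for s in sources
        }
        total_loss = sum(losses.values())
        weights = {s: math.log(total_loss / losses[s]) for s in sources}
    return consensus, weights


# Toy data (hypothetical): sources A and B roughly agree, C is noisy.
toy_forecasts = {
    "A": [1.1, 2.0, 2.9, 4.1],
    "B": [0.9, 2.1, 3.0, 3.9],
    "C": [2.0, 1.0, 4.5, 2.5],
}
consensus, weights = resolve_conflicts(toy_forecasts)
```

In this toy setting the noisy source ends up with the smallest weight, so the consensus is pulled toward the sources that agree with one another. Note that the loop measures agreement among sources, not accuracy against realized earnings.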