I'm interested in deep learning, machine learning, graph analytics, and their practical
applications.
Discovering Global False Negatives On the Fly for Self-supervised Contrastive Learning
Vicente Balmaseda, Bokun Wang, Ching-Long Lin, Tianbao Yang
Accepted to Forty-second International Conference on Machine Learning (ICML 2025), 2025
code / arXiv
We introduce GloFND, an optimization-based approach for false negative discovery in self-supervised unimodal and bimodal contrastive learning. GloFND globally detects false negatives across the entire dataset while maintaining a per-iteration computation cost independent of dataset size.
Discriminative Finetuning of Generative Large Language Models without Reward Models and Preference Data
Siqi Guo, Ilgee Hong, Vicente Balmaseda, Changlong Yu, Liang Qiu, Xin Liu, Haoming Jiang, Tuo Zhao, Tianbao Yang
Accepted to Forty-second International Conference on Machine Learning (ICML 2025), 2025
code / arXiv
We introduce Discriminative Fine-Tuning (DFT), an approach for fine-tuning LLMs without preference data or reward models. Unlike Supervised Fine-Tuning (SFT), DFT adopts a discriminative paradigm to efficiently optimize the likelihood of an answer among all possible outputs given an input.
Combinatorial Approximations for Cluster Deletion: Simpler, Faster, and Better
Vicente Balmaseda, Ying Xu, Yixin Cao, Nate Veldt
Forty-first International Conference on Machine Learning (ICML 2024), 2024
paper / code / poster / arXiv / venue
We provide improved deterministic approximation algorithms and guarantees for Cluster Deletion, and the first combinatorial algorithm for the Strong Triadic Closure (STC) relaxation.
Predicting systemic risk in financial systems using Deep Graph Learning
Vicente Balmaseda, María Coronado, Gonzalo de Cadenas-Santiago
Intelligent Systems with Applications, 2023
paper / code
We propose Graph Neural Networks (GNNs) for systemic risk analysis in financial networks, leveraging network structure to overcome ML limitations. Additionally, we introduce the C2R approach to reduce the pre-labeling effort for costly continuous systemic risk metrics, making it possible to model these metrics (e.g., quantiles) using data labeled into just a few classes.
Other Projects
These include coursework and side projects.
TransformerKart: A Full SuperTuxKart AI Controller