I'm interested in deep learning, machine learning, graph analytics, and their practical
applications.
Discovering Global False Negatives On the Fly for Self-supervised Contrastive Learning
Vicente Balmaseda, Bokun Wang, Ching-Long Lin, Tianbao Yang
Accepted to Forty-second International Conference on Machine Learning (ICML 2025), 2025
code / poster / arXiv / venue
In self-supervised contrastive learning, negative samples may inadvertently share semantics with the anchor, leading to false negatives that degrade representation quality. We introduce GloFND, a scalable optimization-based method that discovers false negatives during training. It identifies them globally across the dataset while keeping computation confined to each mini-batch and independent of dataset size. GloFND integrates seamlessly with unimodal and bimodal contrastive frameworks, consistently improving representation quality with minimal overhead.
Discriminative Finetuning of Generative Large Language Models without Reward Models and Preference Data
Siqi Guo, Ilgee Hong, Vicente Balmaseda, Changlong Yu, Liang Qiu, Xin Liu, Haoming Jiang, Tuo Zhao, Tianbao Yang
Accepted to Forty-second International Conference on Machine Learning (ICML 2025), 2025
code / arXiv / venue
We introduce Discriminative Fine-Tuning (DFT), an approach for fine-tuning LLMs without preference data or reward models. Unlike Supervised Fine-Tuning (SFT), DFT adopts a discriminative paradigm to efficiently optimize the likelihood of an answer among all possible outputs given an input.
Combinatorial Approximations for Cluster Deletion: Simpler, Faster, and Better
Vicente Balmaseda, Ying Xu, Yixin Cao, Nate Veldt
Forty-first International Conference on Machine Learning (ICML 2024), 2024
paper / code / poster / arXiv / venue
We provide improved deterministic approximation algorithms and guarantees for Cluster Deletion, and the first combinatorial algorithm for the Strong Triadic Closure (STC) relaxation, enabling us to handle graphs with millions of nodes and edges on a standard 16GB laptop.
Predicting systemic risk in financial systems using Deep Graph Learning
Vicente Balmaseda, María Coronado, Gonzalo de Cadenas-Santiago
Intelligent Systems with Applications, 2023
paper / code
We propose Graph Neural Networks (GNNs) for systemic risk analysis in financial networks, leveraging network structure to overcome ML limitations. Additionally, we introduce the C2R approach, which reduces the pre-labeling effort for costly continuous systemic risk metrics by enabling these metrics (e.g., quantiles) to be modeled using data labeled into just a few classes.
Other Projects
These include coursework and side projects.
TransformerKart: A Full SuperTuxKart AI Controller