About me
Hello. I am currently a Ph.D. candidate in Statistics at the University of Michigan, where I am fortunate to be advised by Professor Jonathan Terhorst and Professor Long Nguyen. Before coming to the University of Michigan, I graduated as the valedictorian of Hanoi University of Science, class of 2019.
Here is a current version of my CV.
Research Interests
My research focuses on:
- Hierarchical Models: Identifiability, Statistical Efficiency, and Model Selection Methods
- Population genetics and Phylogenetics
- Statistical Optimal Transport
I study identifiability and parameter estimation for latent variable models, such as mixture and admixture models with an unknown number of components, using optimal transport, empirical process theory, and Bayesian asymptotic theory. These results lead to rigorous model selection methods for those models. I am also passionate about developing interpretable and computationally efficient hierarchical Bayesian methods with applications in genetics.
Preprints and Publications
* denotes equal contributions.
Dirichlet moment tensors and the correspondence between admixture and mixture of product models. To be submitted to Annals of Statistics.
Dat Do, Sunrit Chakraborty, Jonathan Terhorst, XuanLong Nguyen.

Dendrogram of mixing measures: Hierarchical clustering and model selection for finite mixture models. Under review at the Journal of the American Statistical Association.
Dat Do, Linh Do, Scott McKinley, Jonathan Terhorst, XuanLong Nguyen.

Functional optimal transport: map estimation and domain adaptation for functional data. Journal of Machine Learning Research (JMLR), 2024.
Jiacheng Zhu*, Aritra Guha*, Dat Do*, Mengdi Xu, XuanLong Nguyen, Ding Zhao.

Strong identifiability and parameter learning in regression with heterogeneous response. Under major revision at the Electronic Journal of Statistics.
Dat Do, Linh Do, XuanLong Nguyen.

Minimax Optimal Rate for Parameter Estimation in Multivariate Deviated Models. NeurIPS 2023.
Dat Do*, Huy Nguyen*, Khai Nguyen, Nhat Ho.

Beyond Black Box Densities: Parameter Learning for the Deviated Components. NeurIPS 2022.
Dat Do*, Nhat Ho*, XuanLong Nguyen.

Entropic Gromov-Wasserstein between Gaussian distributions. International Conference on Machine Learning (ICML), 2022.
Khang Le, Dung Le, Huy Nguyen, Dat Do, Tung Pham, Nhat Ho.

Generalized Marcinkiewicz Laws for Weighted Dependent Random Vectors in Hilbert Spaces. Theory of Probability and Its Applications, 2021.
Ta Cong Son, Le Van Dung, Dat Do, Ta Thi Trang.

On Label Shift in Domain Adaptation via Wasserstein Distance. Under review.
Trung Le, Dat Do, Tuan Nguyen, Huy Nguyen, Hung Bui, Nhat Ho, Dinh Phung.