Michael Sun

Protein Structure Tokenization via Geometric Byte Pair Encoding

Nov 13, 2025
Via arXiv

DischargeSim: A Simulation Benchmark for Educational Doctor-Patient Communication at Discharge

Sep 10, 2025
Via arXiv

Foundation Molecular Grammar: Multi-Modal Foundation Models Induce Interpretable Molecular Graph Languages

May 29, 2025
Via arXiv

Directed Graph Grammars for Sequence-based Learning

May 29, 2025
Via arXiv

Two-Stage Pretraining for Molecular Property Prediction in the Wild

Nov 05, 2024
Via arXiv

Multimodal Large Language Models for Inverse Molecular Design with Retrosynthetic Planning

Oct 05, 2024
Via arXiv

Representing Molecules as Random Walks Over Interpretable Grammars

Mar 13, 2024
Via arXiv

X-RiSAWOZ: High-Quality End-to-End Multilingual Dialogue Datasets and Few-shot Agents

Jun 30, 2023
Via arXiv

Improving Representational Continuity via Continued Pretraining

Feb 26, 2023
Via arXiv

Do Neural Networks Generalize from Self-Averaging Sub-classifiers in the Same Way As Adaptive Boosting?

Feb 14, 2023
Via arXiv