Han Liu

Johns Hopkins University

Transformers Simulate MLE for Sequence Generation in Bayesian Networks

Jan 05, 2025

Adaptive Batch Size Schedules for Distributed Training of Language Models with Data and Model Parallelism

Dec 30, 2024

AlignAb: Pareto-Optimal Energy Alignment for Designing Nature-Like Antibodies

Dec 30, 2024

EGSRAL: An Enhanced 3D Gaussian Splatting based Renderer with Automated Labeling for Large-Scale Driving Scene

Dec 20, 2024

A Real-Time System for Scheduling and Managing UAV Delivery in Urban

Dec 16, 2024

FTP: A Fine-grained Token-wise Pruner for Large Language Models via Token Routing

Dec 16, 2024

On Statistical Rates of Conditional Diffusion Transformers: Approximation, Estimation and Minimax Optimality

Nov 26, 2024

Transformers are Deep Optimizers: Provable In-Context Learning for Deep Model Training

Nov 25, 2024

Fundamental Limits of Prompt Tuning Transformers: Universality, Capacity and Efficiency

Nov 25, 2024

One-Layer Transformer Provably Learns One-Nearest Neighbor In Context

Nov 16, 2024