Mengdi Wang

Offline Multitask Representation Learning for Reinforcement Learning

Mar 18, 2024
Via arXiv

Unveil Conditional Diffusion Models with Classifier-free Guidance: A Sharp Statistical Theory

Mar 18, 2024
Via arXiv

Data is all you need: Finetuning LLMs for Chip Design via an Automated design-data augmentation framework

Mar 17, 2024
Via arXiv

Regularized DeepIV with Model Selection

Mar 07, 2024
Via arXiv

Theoretical Insights for Diffusion Guidance: A Case Study for Gaussian Mixture Models

Mar 03, 2024
Via arXiv

Double Duality: Variational Primal-Dual Policy Optimization for Constrained Reinforcement Learning

Feb 16, 2024
Via arXiv

MaxMin-RLHF: Towards Equitable Alignment of Large Language Models with Diverse Human Preferences

Feb 14, 2024
Via arXiv

Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications

Feb 07, 2024
Via arXiv

Embedding Large Language Models into Extended Reality: Opportunities and Challenges for Inclusion, Engagement, and Privacy

Feb 06, 2024
Via arXiv

TurboSVM-FL: Boosting Federated Learning through SVM Aggregation for Lazy Clients

Jan 29, 2024
Via arXiv