William Yang Wang

T2V-Turbo-v2: Enhancing Video Generation Model Post-Training through Data, Reward, and Conditional Guidance Design

Oct 08, 2024

Gödel Agent: A Self-Referential Agent Framework for Recursive Self-Improvement

Oct 06, 2024

A Gradient Analysis Framework for Rewarding Good and Penalizing Bad Examples in Language Models

Aug 29, 2024

Can Editing LLMs Inject Harm?

Jul 29, 2024

Benchmarks as Microscopes: A Call for Model Metrology

Jul 22, 2024

Generalization v.s. Memorization: Tracing Language Models' Capabilities Back to Pretraining Data

Jul 20, 2024

RAG-QA Arena: Evaluating Domain Robustness for Long-form Retrieval Augmented Question Answering

Jul 19, 2024

DebUnc: Mitigating Hallucinations in Large Language Model Agent Communication with Uncertainty Estimations

Jul 08, 2024

MMSci: A Multimodal Multi-Discipline Dataset for PhD-Level Scientific Comprehension

Jul 06, 2024

VSP: Assessing the dual challenges of perception and reasoning in spatial planning tasks for VLMs

Jul 02, 2024