Furu Wei

Context-DPO: Aligning Language Models for Context-Faithfulness

Dec 18, 2024

Multimodal Latent Language Modeling with Next-Token Diffusion

Dec 11, 2024

RedStone: Curating General, Code, Math, and QA Data for Large Language Models

Dec 04, 2024

MH-MoE: Multi-Head Mixture-of-Experts

Nov 26, 2024

Preference Optimization for Reasoning with Pseudo Feedback

Nov 25, 2024

BitNet a4.8: 4-bit Activations for 1-bit LLMs

Nov 07, 2024

Textual Aesthetics in Large Language Models

Nov 05, 2024

ARLON: Boosting Diffusion Transformers with Autoregressive Models for Long Video Generation

Oct 27, 2024

ShifCon: Enhancing Non-Dominant Language Capabilities with a Shift-based Contrastive Framework

Oct 25, 2024

Little Giants: Synthesizing High-Quality Embedding Data at Scale

Oct 24, 2024