
Mikhail Yurochkin

K2-Think: A Parameter-Efficient Reasoning System

Sep 09, 2025
Via arXiv

Limitations of refinement methods for weak to strong generalization

Aug 23, 2025
Via arXiv

Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective

Jun 17, 2025
Via arXiv

Speculate, then Collaborate: Fusing Knowledge of Language Models during Decoding

Feb 11, 2025
Via arXiv

Out-of-Distribution Detection using Synthetic Data Generation

Feb 05, 2025
Via arXiv

CARROT: A Cost Aware Rate Optimal Router

Feb 05, 2025
Via arXiv

SPRI: Aligning Large Language Models with Context-Situated Principles

Feb 05, 2025
Via arXiv

Sloth: scaling laws for LLM skills to predict multi-benchmark performance across families

Dec 09, 2024
Via arXiv

LiveXiv -- A Multi-Modal Live Benchmark Based on Arxiv Papers Content

Oct 15, 2024
Via arXiv

Compress then Serve: Serving Thousands of LoRA Adapters with Little Overhead

Jun 17, 2024
Via arXiv