Kevin Swersky

University of Toronto

Analysis of Optimality of Large Language Models on Planning Problems

Apr 03, 2026

Enhancing LLM Planning Capabilities through Intrinsic Self-Critique

Dec 30, 2025

SWE-fficiency: Can Language Models Optimize Real-World Repositories on Real Workloads?

Nov 11, 2025

Do generative video models learn physical principles from watching videos?

Jan 14, 2025

Training Language Models on the Knowledge Graph: Insights on Hallucinations and Their Detectability

Aug 14, 2024

Exploring and Benchmarking the Planning Capabilities of Large Language Models

Jun 18, 2024

Long-Span Question-Answering: Automatic Question Generation and QA-System Ranking via Side-by-Side Evaluation

May 31, 2024

Greedy Growing Enables High-Resolution Pixel-Based Diffusion Models

May 27, 2024

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Mar 08, 2024

Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models

Dec 22, 2023