Seongyun Lee

The CoT Encyclopedia: Analyzing, Predicting, and Controlling how a Reasoning Model will Think

May 15, 2025

Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning

Apr 24, 2025

Scaling Evaluation-time Compute with Reasoning Models as Process Evaluators

Mar 25, 2025

Efficient Long Context Language Model Retrieval with Compression

Dec 24, 2024

Evaluating Language Models as Synthetic Data Generators

Dec 04, 2024

How Does Vision-Language Adaptation Impact the Safety of Vision Language Models?

Oct 10, 2024

The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models

Jun 09, 2024

Aligning to Thousands of Preferences via System Message Generalization

May 28, 2024

LG AI Research & KAIST at EHRSQL 2024: Self-Training Large Language Models with Pseudo-Labeled Unanswerable Questions for a Reliable Text-to-SQL System on EHRs

May 18, 2024

Prometheus-Vision: Vision-Language Model as a Judge for Fine-Grained Evaluation

Jan 12, 2024