Vernon Y. H. Toh

The Jumping Reasoning Curve? Tracking the Evolution of Reasoning Performance in GPT-[n] and o-[n] Models on Multimodal Puzzles

Feb 03, 2025
Via arXiv

Not All Votes Count! Programs as Verifiers Improve Self-Consistency of Language Models for Math Reasoning

Oct 16, 2024
Via arXiv

Ferret: Faster and Effective Automated Red Teaming with Reward-Based Scoring Technique

Aug 20, 2024
Via arXiv