Soujanya Poria

DialogXpert: Driving Intelligent and Emotion-Aware Conversations through Online Value-Based Reinforcement Learning with LLM Priors

May 23, 2025
Via arXiv

From Grounding to Manipulation: Case Studies of Foundation Model Integration in Embodied Robotic Systems

May 21, 2025
Via arXiv

PREMISE: Matching-based Prediction for Accurate Review Recommendation

May 02, 2025
Via arXiv

NORA: A Small Open-Sourced Generalist Vision Language Action Model for Embodied Tasks

Apr 28, 2025
Via arXiv

PromptDistill: Query-based Selective Token Retention in Intermediate Layers for Efficient Large Language Model Inference

Mar 30, 2025
Via arXiv

DiffPO: Diffusion-styled Preference Optimization for Efficient Inference-Time Alignment of Large Language Models

Mar 06, 2025
Via arXiv

Pixel-Level Reasoning Segmentation via Multi-turn Conversations

Feb 13, 2025
Via arXiv

The Jumping Reasoning Curve? Tracking the Evolution of Reasoning Performance in GPT-[n] and o-[n] Models on Multimodal Puzzles

Feb 03, 2025
Via arXiv

PROEMO: Prompt-Driven Text-to-Speech Synthesis Based on Emotion and Intensity Control

Jan 10, 2025
Via arXiv

TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching and Clap-Ranked Preference Optimization

Dec 30, 2024
Via arXiv