Sayak Ray Chowdhury

Provably Robust DPO: Aligning Language Models with Noisy Feedback

Mar 01, 2024
Sayak Ray Chowdhury, Anush Kini, Nagarajan Natarajan

Via arXiv

Provably Sample Efficient RLHF via Active Preference Optimization

Feb 16, 2024
Nirjhar Das, Souradip Chakraborty, Aldo Pacchiano, Sayak Ray Chowdhury

Via arXiv

GAR-meets-RAG Paradigm for Zero-Shot Information Retrieval

Oct 31, 2023
Daman Arora, Anush Kini, Sayak Ray Chowdhury, Nagarajan Natarajan, Gaurav Sinha, Amit Sharma

Via arXiv

Differentially Private Reward Estimation with Preference Feedback

Oct 30, 2023
Sayak Ray Chowdhury, Xingyu Zhou, Nagarajan Natarajan

Via arXiv

Differentially Private Episodic Reinforcement Learning with Heavy-tailed Rewards

Jun 05, 2023
Yulian Wu, Xingyu Zhou, Sayak Ray Chowdhury, Di Wang

Via arXiv

On Differentially Private Federated Linear Contextual Bandits

Feb 27, 2023
Xingyu Zhou, Sayak Ray Chowdhury

Via arXiv

Exploration in Linear Bandits with Rich Action Sets and its Implications for Inference

Jul 23, 2022
Debangshu Banerjee, Avishek Ghosh, Sayak Ray Chowdhury, Aditya Gopalan

Via arXiv

Model Selection in Reinforcement Learning with General Function Approximations

Jul 06, 2022
Avishek Ghosh, Sayak Ray Chowdhury

Via arXiv

Distributed Differential Privacy in Multi-Armed Bandits

Jun 12, 2022
Sayak Ray Chowdhury, Xingyu Zhou

Via arXiv

Shuffle Private Linear Contextual Bandits

Feb 11, 2022
Sayak Ray Chowdhury, Xingyu Zhou

Via arXiv