Rongzhi Zhang

College of Computing, Georgia Institute of Technology

LongCat-Flash-Thinking-2601 Technical Report

Jan 23, 2026

MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering

May 12, 2025

Hephaestus: Improving Fundamental Agent Capabilities of Large Language Models through Continual Pre-Training

Feb 10, 2025

LoRC: Low-Rank Compression for LLMs KV Cache with a Progressive Compression Strategy

Oct 04, 2024

PLaD: Preference-based Large Language Model Distillation with Pseudo-Preference Pairs

Jun 06, 2024

ProgGen: Generating Named Entity Recognition Datasets Step-by-step with Self-Reflexive Large Language Models

Mar 17, 2024

TPD: Enhancing Student Language Model Reasoning via Principle Discovery and Guidance

Jan 24, 2024

Local Boosting for Weakly-Supervised Learning

Jun 05, 2023

ReGen: Zero-Shot Text Classification via Training Data Generation with Progressive Dense Retrieval

May 18, 2023

Do Not Blindly Imitate the Teacher: Using Perturbed Loss for Knowledge Distillation

May 08, 2023