"Text": models, code, and papers

Towards Codable Text Watermarking for Large Language Models

Jul 29, 2023
Lean Wang, Wenkai Yang, Deli Chen, Hao Zhou, Yankai Lin, Fandong Meng, Jie Zhou, Xu Sun

Via arXiv

LAVIE: High-Quality Video Generation with Cascaded Latent Diffusion Models

Sep 27, 2023
Yaohui Wang, Xinyuan Chen, Xin Ma, Shangchen Zhou, Ziqi Huang, Yi Wang, Ceyuan Yang, Yinan He, Jiashuo Yu, Peiqing Yang, Yuwei Guo, Tianxing Wu, Chenyang Si, Yuming Jiang, Cunjian Chen, Chen Change Loy, Bo Dai, Dahua Lin, Yu Qiao, Ziwei Liu

Via arXiv

Targeted Image Data Augmentation Increases Basic Skills Captioning Robustness

Sep 27, 2023
Valentin Barriere, Felipe del Rio, Andres Carvallo De Ferari, Carlos Aspillaga, Eugenio Herrera-Berg, Cristian Buc Calderon

Via arXiv

TopRoBERTa: Topology-Aware Authorship Attribution of Deepfake Texts

Sep 22, 2023
Adaku Uchendu, Thai Le, Dongwon Lee

Via arXiv

Unsupervised Speech Recognition with N-Skipgram and Positional Unigram Matching

Oct 03, 2023
Liming Wang, Mark Hasegawa-Johnson, Chang D. Yoo

Via arXiv

Context-Aware Entity Grounding with Open-Vocabulary 3D Scene Graphs

Sep 27, 2023
Haonan Chang, Kowndinya Boyalakuntla, Shiyang Lu, Siwei Cai, Eric Jing, Shreesh Keskar, Shijie Geng, Adeeb Abbas, Lifeng Zhou, Kostas Bekris, Abdeslam Boularias

Via arXiv

FaceCLIPNeRF: Text-driven 3D Face Manipulation using Deformable Neural Radiance Fields

Aug 07, 2023
Sungwon Hwang, Junha Hyung, Daejin Kim, Min-Jung Kim, Jaegul Choo

Via arXiv

PromptASR for contextualized ASR with controllable style

Sep 20, 2023
Xiaoyu Yang, Wei Kang, Zengwei Yao, Yifan Yang, Liyong Guo, Fangjun Kuang, Long Lin, Daniel Povey

Via arXiv

From CLIP to DINO: Visual Encoders Shout in Multi-modal Large Language Models

Oct 13, 2023
Dongsheng Jiang, Yuchen Liu, Songlin Liu, Xiaopeng Zhang, Jin Li, Hongkai Xiong, Qi Tian

Via arXiv

MUTEX: Learning Unified Policies from Multimodal Task Specifications

Sep 25, 2023
Rutav Shah, Roberto Martín-Martín, Yuke Zhu

Via arXiv