Jaehoon Lee

Shammie

Training Language Models on the Knowledge Graph: Insights on Hallucinations and Their Detectability

Aug 14, 2024

Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters

Aug 06, 2024

Scaling Exponents Across Parameterizations and Optimizers

Jul 08, 2024

Training LLMs over Neurally Compressed Text

Apr 04, 2024

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Mar 08, 2024

Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models

Dec 22, 2023

Frontier Language Models are not Robust to Adversarial Arithmetic, or "What do I need to say so you agree 2+2=5?"

Nov 15, 2023

Small-scale proxies for large-scale Transformer training instabilities

Sep 25, 2023

Replacing softmax with ReLU in Vision Transformers

Sep 15, 2023

MadSGM: Multivariate Anomaly Detection with Score-based Generative Models

Aug 29, 2023