David Mimno

Princeton University

A Pretrainer's Guide to Training Data: Measuring the Effects of Data Age, Domain Coverage, Quality, & Toxicity

May 22, 2023
Via arXiv

Sensemaking About Contraceptive Methods Across Online Platforms

Jan 23, 2023
Via arXiv

Breaking BERT: Evaluating and Optimizing Sparsified Attention

Oct 07, 2022
Via arXiv

Honest Students from Untrusted Teachers: Learning an Interpretable Question-Answering Pipeline from a Pretrained Language Model

Oct 05, 2022
Via arXiv

On-the-Fly Rectification for Robust Large-Vocabulary Topic Inference

Nov 12, 2021
Via arXiv

Tecnologica cosa: Modeling Storyteller Personalities in Boccaccio's Decameron

Sep 22, 2021
Via arXiv

Comparing Text Representations: A Theory-Driven Approach

Sep 15, 2021
Via arXiv

Domain-Specific Lexical Grounding in Noisy Visual-Textual Documents

Oct 30, 2020
Via arXiv

Topic Modeling with Contextualized Word Representation Clusters

Oct 23, 2020
Via arXiv

How we do things with words: Analyzing text as social and cultural data

Jul 02, 2019
Via arXiv