Malte Ostendorff

Tokenizer Choice For LLM Training: Negligible or Crucial?

Oct 18, 2023
Mehdi Ali, Michael Fromm, Klaudia Thellmann, Richard Rutmann, Max Lübbering, Johannes Leveling, Katrin Klug, Jan Ebert, Niclas Doll, Jasper Schulze Buschhoff, Charvi Jain, Alexander Arno Weber, Lena Jurkschat, Hammam Abdelwahab, Chelsea John, Pedro Ortiz Suarez, Malte Ostendorff, Samuel Weinbach, Rafet Sifa, Stefan Kesselheim, Nicolas Flores-Herr

AspectCSE: Sentence Embeddings for Aspect-based Semantic Textual Similarity using Contrastive Learning and Structured Knowledge

Jul 22, 2023
Tim Schopf, Emanuel Gerber, Malte Ostendorff, Florian Matthes

Efficient Language Model Training through Cross-Lingual and Progressive Transfer Learning

Jan 23, 2023
Malte Ostendorff, Georg Rehm

Specialized Document Embeddings for Aspect-based Similarity of Research Papers

Mar 28, 2022
Malte Ostendorff, Till Blume, Terry Ruas, Bela Gipp, Georg Rehm

HiStruct+: Improving Extractive Text Summarization with Hierarchical Structure Information

Mar 17, 2022
Qian Ruan, Malte Ostendorff, Georg Rehm

Neighborhood Contrastive Learning for Scientific Document Representations with Citation Embeddings

Feb 14, 2022
Malte Ostendorff, Nils Rethmeier, Isabelle Augenstein, Bela Gipp, Georg Rehm

A Qualitative Evaluation of User Preference for Link-based vs. Text-based Recommendations of Wikipedia Articles

Sep 16, 2021
Malte Ostendorff, Corinna Breitinger, Bela Gipp

Evaluating Document Representations for Content-based Legal Literature Recommendations

Apr 28, 2021
Malte Ostendorff, Elliott Ash, Terry Ruas, Bela Gipp, Julian Moreno-Schneider, Georg Rehm
