Abstract:Large language models have become extremely popular recently due to their ability to achieve strong performance on a variety of tasks, such as text generation and rewriting, but their size and computation cost make them difficult to access, deploy, and secure in many settings. This paper investigates whether small, decoder-only language models can provide an efficient alternative for the tasks of grammar correction and text simplification. The experiments in this paper focus on testing small language models out of the box, fine-tuned, and run sequentially on the JFLEG and ASSET datasets using established metrics. The results show that while SLMs may learn certain behaviors well, their performance remains below strong baselines and current LLMs. The results also show that SLMs struggle with retaining meaning and hallucinations. These findings suggest that despite their efficiency advantages, current SLMs are not yet competitive enough with modern LLMs for rewriting, and further advances in training are required for SLMs to close the performance gap between them and today's LLMs.
Abstract:This study explores the impact of scaling semantic categories on the image classification performance of vision transformers (ViTs). In this specific case, the CLIP server provided by Jina AI is used for experimentation. The research hypothesizes that as the number of ground truth and artificially introduced semantically equivalent categories increases, the labeling accuracy of ViTs improves until a theoretical maximum or limit is reached. A wide variety of image datasets were chosen to test this hypothesis. These datasets were processed through a custom function in Python designed to evaluate the model's accuracy, with adjustments being made to account for format differences between datasets. By exponentially introducing new redundant categories, the experiment assessed accuracy trends until they plateaued, decreased, or fluctuated inconsistently. The findings show that while semantic scaling initially increases model performance, the benefits diminish or reverse after surpassing a critical threshold, providing insight into the limitations and possible optimization of category labeling strategies for ViTs.