Abstract: Recent text-to-image (T2I) generation models have achieved remarkable results by training on billion-scale datasets, following a "bigger is better" paradigm that prioritizes data quantity over quality. We challenge this paradigm by demonstrating that strategic data augmentation of small, well-curated datasets can match or outperform models trained on massive web-scraped collections. Using only ImageNet enhanced with well-designed text and image augmentations, we achieve a +2 improvement in overall score over SD-XL on GenEval and +5 on DPGBench while using just 1/10th the parameters and 1/1000th the training images. Our results suggest that strategic data augmentation, rather than massive datasets, could offer a more sustainable path forward for T2I generation.
Abstract: This paper presents a novel framework for designing support vector machines (SVMs) that does not require the SVM kernel to be positive-definite and allows the user to specify a memory constraint in terms of a fixed set of template vectors. This makes the framework scalable and enables its implementation in low-power, high-density, memory-constrained embedded applications. We also discuss an efficient hardware implementation of the framework, which uses a novel low-power memtransistor-based crossbar architecture and is robust to device mismatch and randomness. Using memtransistor measurement data, we show that the designed SVMs can achieve state-of-the-art classification accuracy on both synthetic and real-world benchmark datasets. This framework would benefit the design of SVM-based wake-up systems for Internet of Things (IoT) and edge devices, where memtransistors can be used to optimize the system's energy efficiency and perform in-memory matrix-vector multiplication (MVM).
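
To make the idea of a template-constrained kernel machine concrete, the following is a minimal sketch and not the framework proposed in the paper. It assumes a decision function of the form f(x) = sum_j w_j K(x, t_j) + b over a fixed set of template vectors t_j, so memory grows with the number of templates rather than the number of training samples. The sigmoid (tanh) kernel, the hand-picked templates, the hinge-loss subgradient training, and all function names are illustrative assumptions; the tanh kernel is chosen only because it is not positive-definite in general.

```python
import math
import random

def kernel(x, t, a=1.0, c=0.0):
    # Sigmoid (tanh) kernel; an assumed choice that is not positive-definite in general.
    return math.tanh(a * sum(xi * ti for xi, ti in zip(x, t)) + c)

def features(x, templates):
    # One kernel value per template: the dot products form a matrix-vector
    # product between the template matrix and x, followed by a nonlinearity.
    return [kernel(x, t) for t in templates]

def train(X, y, templates, epochs=200, lr=0.1, lam=1e-3):
    # Hinge-loss subgradient descent on the template weights (assumed training rule).
    w = [0.0] * len(templates)
    b = 0.0
    Phi = [features(x, templates) for x in X]
    for _ in range(epochs):
        for phi, yi in zip(Phi, y):
            margin = yi * (sum(wj * pj for wj, pj in zip(w, phi)) + b)
            w = [wj * (1 - lr * lam) for wj in w]  # L2 shrinkage
            if margin < 1:
                w = [wj + lr * yi * pj for wj, pj in zip(w, phi)]
                b += lr * yi
    return w, b

def predict(x, templates, w, b):
    score = sum(wj * pj for wj, pj in zip(w, features(x, templates))) + b
    return 1 if score >= 0 else -1

def make_blobs(rng, n_per_class=50):
    # Synthetic two-class data: Gaussian blobs around (2, 2) and (-2, -2).
    X, y = [], []
    for label, center in ((1, (2.0, 2.0)), (-1, (-2.0, -2.0))):
        for _ in range(n_per_class):
            X.append((rng.gauss(center[0], 0.5), rng.gauss(center[1], 0.5)))
            y.append(label)
    return X, y

rng = random.Random(0)
X, y = make_blobs(rng)
# Hypothetical fixed templates: the model stores only these four vectors.
templates = [(1.0, 1.0), (-1.0, -1.0), (1.0, -1.0), (-1.0, 1.0)]
w, b = train(X, y, templates)
accuracy = sum(predict(x, templates, w, b) == yi for x, yi in zip(X, y)) / len(y)
print(f"training accuracy: {accuracy:.2f}")
```

Because the weights are fitted directly on the kernel features rather than through the dual SVM problem, this sketch never needs a positive-definite Gram matrix, and the stored model consists only of the templates, `w`, and `b`.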