Kanan Mahammadli

Sequential Large Language Model-Based Hyper-Parameter Optimization

Oct 27, 2024

Kanan Mahammadli

Abstract: This study introduces SLLMBO, a framework that uses Large Language Models (LLMs) for hyperparameter optimization (HPO). It incorporates dynamic search space adaptability, enhanced parameter landscape exploitation, and a novel hybrid LLM-Tree-structured Parzen Estimator (LLM-TPE) sampler. By addressing limitations of recent fully LLM-based methods and of traditional Bayesian Optimization (BO), SLLMBO achieves more robust optimization. The benchmark evaluates multiple LLMs, including GPT-3.5-turbo, GPT-4o, Claude-Sonnet-3.5, and Gemini-1.5-flash, extending prior work beyond GPT-3.5 and GPT-4 and establishing SLLMBO as the first framework to benchmark a diverse set of LLMs for HPO. By combining LLMs' established strengths in parameter initialization and the exploitation abilities demonstrated in this study with TPE's exploration capabilities, the LLM-TPE sampler achieves a balanced exploration-exploitation trade-off, reduces API costs, and mitigates premature early stopping for more effective parameter searches. Across 14 tabular tasks in classification and regression, the LLM-TPE sampler outperformed fully LLM-based methods and achieved superior results over BO methods in 9 tasks. Testing early stopping in budget-constrained scenarios further demonstrated competitive performance, indicating that LLM-based methods generally benefit from extended iterations for optimal results. This work lays the foundation for future research on open-source LLMs, on the reproducibility of LLM results in HPO, and on benchmarking SLLMBO on more complex tasks, such as image classification, segmentation, and machine translation.
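
The hybrid sampler idea in the abstract can be illustrated with a minimal sketch. It is not the SLLMBO implementation: the abstract does not say how LLM and TPE suggestions are interleaved, so a simple coin flip (`p_llm`) is assumed here. `llm_propose` is a hypothetical stand-in for an LLM call that exploits the best trial so far, and `explore_propose` uses plain uniform random sampling in place of a real TPE. The objective and the one-dimensional search space are also invented for illustration.

```python
import random

# Hypothetical 1-D search space for a single hyperparameter (assumption).
LOW, HIGH = 0.0, 1.0


def objective(x):
    """Toy objective to minimize; stands in for a validation loss."""
    return (x - 0.3) ** 2


def llm_propose(history, rng):
    """Stand-in for an LLM call: exploit by perturbing the best trial so far."""
    if not history:
        # Stand-in for an LLM-suggested starting point (midpoint of the range).
        return (LOW + HIGH) / 2
    best_x, _ = min(history, key=lambda t: t[1])
    x = best_x + rng.gauss(0.0, 0.05)
    return min(HIGH, max(LOW, x))  # clip to the search space


def explore_propose(rng):
    """Stand-in for TPE exploration: uniform random sampling, NOT a real TPE."""
    return rng.uniform(LOW, HIGH)


def hybrid_search(n_trials=50, p_llm=0.5, seed=0):
    """Alternate between the two proposers with probability p_llm (assumed rule)."""
    rng = random.Random(seed)
    history = []  # list of (x, loss) pairs
    for i in range(n_trials):
        # The first trial always uses the "LLM" proposer for initialization.
        use_llm = i == 0 or rng.random() < p_llm
        x = llm_propose(history, rng) if use_llm else explore_propose(rng)
        history.append((x, objective(x)))
    return min(history, key=lambda t: t[1])


best_x, best_loss = hybrid_search()
```

Routing only a fraction of trials to the LLM proposer is also what would reduce the number of API calls in a real setup, which is one way to read the abstract's cost claim.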

Class-Specific Data Augmentation: Bridging the Imbalance in Multiclass Breast Cancer Classification

Oct 15, 2023

Kanan Mahammadli, Abdullah Burkan Bereketoglu, Ayse Gul Kabakci

Abstract: Breast cancer is the most common cancer among women, also occurs in men, and accounts for more than 1 in 10 new cancer diagnoses each year. It is also the second most common cause of cancer death among women. It therefore calls for early detection and tailored treatment. Early detection enables appropriate, patient-specific therapeutic schedules and can also identify the type of cyst. This paper employs class-level data augmentation to address the undersampled classes and raise their detection rate. The approach has two key components: class-level data augmentation built on structure-preserving stain normalization of hematoxylin and eosin-stained images, and a transformer-based ViTNet architecture trained via transfer learning for multiclass classification of breast cancer images. Together, these components categorize breast cancer images as either benign or one of four distinct malignant subtypes. Focusing on class-level augmentation and catering to the unique characteristics of each class increases classification precision on the undersampled classes, which can contribute to lower mortality rates associated with breast cancer. The paper aims to ease the workload of medical specialists by automating this multiclass classification.
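
As a rough illustration of class-level augmentation, the sketch below computes how many augmented images each class would need in order to match the largest class. Balancing to the majority class is an assumption, since the abstract does not state the target ratio, and the class names are hypothetical placeholders.

```python
from collections import Counter


def augmentation_plan(labels):
    """Return, per class, how many augmented samples are needed
    to reach the size of the largest class (assumed balancing rule)."""
    counts = Counter(labels)
    target = max(counts.values())
    return {cls: target - n for cls, n in counts.items()}


# Hypothetical label distribution: one benign class and two rarer subtypes.
labels = ["benign"] * 5 + ["subtype_a"] * 2 + ["subtype_b"] * 1
plan = augmentation_plan(labels)
```

Only the classes with a non-zero entry in `plan` would be passed to the augmentation step, which leaves the well-represented classes untouched.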
