Sachin Sharma

Pre-Hoc Predictions in AutoML: Leveraging LLMs to Enhance Model Selection and Benchmarking for Tabular datasets

Oct 02, 2025

Yannis Belkhiter, Seshu Tirupathi, Giulio Zizzo, Sachin Sharma, John D. Kelleher

Abstract: The field of AutoML has made remarkable progress in post-hoc model selection, with libraries capable of automatically identifying the best-performing models for a given dataset. Nevertheless, these methods often rely on exhaustive hyperparameter searches, in which different types of models are automatically trained and tested on the target dataset. In contrast, pre-hoc prediction emerges as a promising alternative, capable of bypassing exhaustive search through intelligent pre-selection of models. Despite its potential, pre-hoc prediction remains under-explored in the literature. This paper explores the intersection of AutoML and pre-hoc model selection by leveraging traditional models and Large Language Model (LLM) agents, which rely on dataset descriptions and statistical information to reduce the search space of AutoML libraries. Our methodology is applied to the AWS AutoGluon portfolio dataset, a state-of-the-art AutoML benchmark containing 175 tabular classification datasets available on OpenML. The proposed approach offers a shift in AutoML workflows, significantly reducing computational overhead while still selecting the best model for the given dataset.

* Oral Presentations ADAPT Annual Scientific Conference 2025
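
The pre-hoc idea described in the abstract, shortlisting candidate models from dataset statistics before any training, might be sketched as follows. This is only an illustration under stated assumptions, not the paper's method: a nearest-neighbour vote over log-scaled meta-features stands in for the traditional models and LLM agents the authors use, and the `KNOWN_DATASETS` table, the model family names, and the function names are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical meta-knowledge base: (n_rows, n_features, n_classes) mapped to
# the model family that performed best on that dataset. Values are invented.
KNOWN_DATASETS = [
    ((1_000, 10, 2), "gradient_boosting"),
    ((1_200, 12, 2), "gradient_boosting"),
    ((50_000, 300, 10), "neural_network"),
    ((60_000, 250, 10), "neural_network"),
    ((300, 5, 3), "random_forest"),
]


def meta_vector(stats):
    """Log-scale the raw counts so large datasets do not dominate distances."""
    return [math.log10(value) for value in stats]


def shortlist_models(stats, k=3, top=2):
    """Return up to `top` model families voted for by the k nearest datasets."""
    query = meta_vector(stats)
    ranked = sorted(
        KNOWN_DATASETS,
        key=lambda entry: math.dist(query, meta_vector(entry[0])),
    )
    votes = Counter(model for _, model in ranked[:k])
    return [model for model, _ in votes.most_common(top)]


# A small binary-classification dataset is matched to similar known datasets.
print(shortlist_models((900, 8, 2)))
```

An AutoML run would then train only the shortlisted families instead of the full portfolio, which is where the reduction in search space would come from.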

YouLeQD: Decoding the Cognitive Complexity of Questions and Engagement in Online Educational Videos from Learners' Perspectives

Jan 20, 2025

Nong Ming, Sachin Sharma, Jiho Noh

Abstract: Questioning is a fundamental aspect of education, as it helps assess students' understanding, promotes critical thinking, and encourages active engagement. With the rise of artificial intelligence in education, there is growing interest in developing intelligent systems that can automatically generate and answer questions and facilitate interactions in both virtual and in-person educational settings. However, to develop effective AI models for education, it is essential to have a fundamental understanding of questioning. In this study, we created the YouTube Learners' Questions on Bloom's Taxonomy Dataset (YouLeQD), which contains learner-posed questions from YouTube lecture video comments. Along with the dataset, we developed two RoBERTa-based classification models leveraging Large Language Models to detect questions and analyze their cognitive complexity using Bloom's Taxonomy. This dataset and our findings provide valuable insights into the cognitive complexity of learner-posed questions in educational videos and their relationship with interaction metrics. This can aid in the development of more effective AI models for education and improve the overall learning experience for students.

* 11 pages. Extended version, Jan 2025. A shortened version was resubmitted and published in the IEEE Conference on Semantic Computing, Feb 2025

Characterization of Frequent Online Shoppers using Statistical Learning with Sparsity

Nov 11, 2021

Rajiv Sambasivan, Mark Burgess, Jörg Schad, Arthur Keen, Christopher Woodward, Alexander Geenen, Sachin Sharma

Abstract: Developing shopping experiences that delight the customer requires businesses to understand customer taste. This work reports a method to learn the shopping preferences of frequent shoppers at an online gift store by combining ideas from retail analytics and statistical learning with sparsity. Shopping activity is represented as a bipartite graph. This graph is refined by applying sparsity-based statistical learning methods. These methods are interpretable and reveal insights about customers' preferences as well as the products driving revenue to the store.
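
The representation mentioned in the abstract, shopping activity as a bipartite graph that is then made sparser, could look roughly like the sketch below. It is an assumption-laden illustration rather than the authors' procedure: soft-thresholding of edge weights (the shrinkage step used inside lasso solvers) stands in for the sparsity-based learning methods, and the `PURCHASES` records, the threshold `lam`, and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical purchase log: (customer, product, amount spent). Invented data.
PURCHASES = [
    ("alice", "mug", 12.0),
    ("alice", "mug", 8.0),
    ("alice", "card", 1.5),
    ("bob", "candle", 20.0),
    ("bob", "card", 2.0),
]


def build_bipartite(purchases):
    """Aggregate spend per (customer, product) edge of the bipartite graph."""
    edges = defaultdict(float)
    for customer, product, amount in purchases:
        edges[(customer, product)] += amount
    return dict(edges)


def soft_threshold(edges, lam):
    """Shrink each edge weight by lam and drop edges that fall to zero or below."""
    shrunk = {edge: weight - lam for edge, weight in edges.items()}
    return {edge: weight for edge, weight in shrunk.items() if weight > 0}


graph = build_bipartite(PURCHASES)
sparse_graph = soft_threshold(graph, lam=5.0)
print(sparse_graph)
```

The edges that survive the shrinkage are the customer-product relationships carrying the most spend, which is the kind of interpretable structure the abstract refers to.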
