The existing methods for evaluating the inference abilities of Large Language Models (LLMs) have been results-centric, making it difficult to assess the inference process. We introduce a new approach using the Abstract and Reasoning Corpus (ARC) dataset to evaluate the inference and contextual understanding abilities of large language models in a process-centric manner. ARC demands rigorous logical structures for problem-solving, making it a benchmark that facilitates the comparison of model inference abilities with humans. Experimental results confirm that while large language models possess weak inference abilities, they still lag in terms of logical coherence, compositionality, and productivity. Our experiments highlight the reasoning capabilities of LLMs, proposing development paths for achieving human-level reasoning.
In the pursuit of artificial general intelligence (AGI), we tackle Abstraction and Reasoning Corpus (ARC) tasks using a novel two-pronged approach. We employ the Decision Transformer in an imitation learning paradigm to model human problem-solving, and introduce an object detection algorithm, the Push and Pull clustering method. This dual strategy enhances AI's ARC problem-solving skills and provides insights for AGI progression. Yet, our work reveals the need for advanced data collection tools, robust training datasets, and refined model structures. This study highlights potential improvements for Decision Transformers and propels future AGI research.
A mesh generation method that can generate an optimal mesh for a blade passage at a single attempt is developed using deep reinforcement learning (DRL). Unlike the conventional methods, where meshing parameters must be specified by the user or iteratively optimized from scratch for a newly given geometry, the developed method employs DRL-based multi-condition (MC) optimization to define meshing parameters for various geometries optimally. The method involves the following steps: (1) development of a base algorithm for structured mesh generation of a blade passage; (2) formulation of an MC optimization problem to optimize meshing parameters introduced while developing the base algorithm; and (3) development of a DRL-based mesh generation algorithm by solving the MC optimization problem using DRL. As a result, the developed algorithm is able to successfully generate optimal meshes at a single trial for various blades.
The Cox proportional hazards model is a canonical method in survival analysis for prediction of the life expectancy of a patient given clinical or genetic covariates -- it is a linear model in its original form. In recent years, several methods have been proposed to generalize the Cox model to neural networks, but none of these are both numerically correct and computationally efficient. We propose FastCPH, a new method that runs in linear time and supports both the standard Breslow and Efron methods for tied events. We also demonstrate the performance of FastCPH combined with LassoNet, a neural network that provides interpretability through feature sparsity, on survival datasets. The final procedure is efficient, selects useful covariates and outperforms existing CoxPH approaches.
An integrated framework of computational fluid-structural dynamics (CFD-CSD) and deep reinforcement learning (deep-RL) is developed for control of a fly-scale flexible-winged flyer in complex flow. Dynamics of the flyer in complex flow is highly unsteady and nonlinear, which makes modeling the dynamics challenging. Thus, conventional control methodologies, where the dynamics is modeled, are insufficient for regulating such complicated dynamics. Therefore, in the present study, the integrated framework, in which the whole governing equations for fluid and structure are solved, is proposed to generate a control policy for the flyer. For the deep-RL to successfully learn the control policy, accurate and ample data of the dynamics are required. However, satisfying both the quality and quantity of the data on the intricate dynamics is extremely difficult since, in general, more accurate data are more costly. In the present study, two strategies are proposed to deal with the dilemma. To obtain accurate data, the CFD-CSD is adopted for precisely predicting the dynamics. To gain ample data, a novel data reproduction method is devised, where the obtained data are replicated for various situations while conserving the dynamics. With those data, the framework learns the control policy in various flow conditions and the learned policy is shown to have remarkable performance in controlling the flyer in complex flow fields.
A multi-condition multi-objective optimization method that can find Pareto front over a defined condition space is developed for the first time using deep reinforcement learning. Unlike the conventional methods which perform optimization at a single condition, the present method learns the correlations between conditions and optimal solutions. The exclusive capability of the developed method is examined in the solutions of a novel modified Kursawe benchmark problem and an airfoil shape optimization problem which include nonlinear characteristics which are difficult to resolve using conventional optimization methods. Pareto front with high resolution over a defined condition space is successfully determined in each problem. Compared with multiple operations of a single-condition optimization method for multiple conditions, the present multi-condition optimization method based on deep reinforcement learning shows a greatly accelerated search of Pareto front by reducing the number of required function evaluations. An analysis of aerodynamics performance of airfoils with optimally designed shapes confirms that multi-condition optimization is indispensable to avoid significant degradation of target performance for varying flow conditions.
Accurate prognosis for an individual patient is a key component of precision oncology. Recent advances in machine learning have enabled the development of models using a wider range of data, including imaging. Radiomics aims to extract quantitative predictive and prognostic biomarkers from routine medical imaging, but evidence for computed tomography radiomics for prognosis remains inconclusive. We have conducted an institutional machine learning challenge to develop an accurate model for overall survival prediction in head and neck cancer using clinical data etxracted from electronic medical records and pre-treatment radiological images, as well as to evaluate the true added benefit of radiomics for head and neck cancer prognosis. Using a large, retrospective dataset of 2,552 patients and a rigorous evaluation framework, we compared 12 different submissions using imaging and clinical data, separately or in combination. The winning approach used non-linear, multitask learning on clinical data and tumour volume, achieving high prognostic accuracy for 2-year and lifetime survival prediction and outperforming models relying on clinical data only, engineered radiomics and deep learning. Combining all submissions in an ensemble model resulted in improved accuracy, with the highest gain from a image-based deep learning model. Our results show the potential of machine learning and simple, informative prognostic factors in combination with large datasets as a tool to guide personalized cancer care.
Accurate survival prediction is crucial for development of precision cancer medicine, creating the need for new sources of prognostic information. Recently, there has been significant interest in exploiting routinely collected clinical and medical imaging data to discover new prognostic markers in multiple cancer types. However, most of the previous studies focus on individual data modalities alone and do not make use of recent advances in machine learning for survival prediction. We present Deep-CR MTLR -- a novel machine learning approach for accurate cancer survival prediction from multi-modal clinical and imaging data in the presence of competing risks based on neural networks and an extension of the multi-task logistic regression framework. We demonstrate improved prognostic performance of the multi-modal approach over single modality predictors in a cohort of 2552 head and neck cancer patients, particularly for cancer specific survival, where our approach achieves 2-year AUROC of 0.774 and $C$-index of 0.788.