The increasing volume of commercially available conversational agents (CAs) on the market has resulted in users being burdened with learning and adopting multiple agents to accomplish their tasks. Though prior work has explored supporting a multitude of domains within the design of a single agent, the interaction experience suffers due to the large action space of desired capabilities. To address these problems, we introduce a new task BBAI: Black-Box Agent Integration, focusing on combining the capabilities of multiple black-box CAs at scale. We explore two techniques: question agent pairing and question response pairing aimed at resolving this task. Leveraging these techniques, we design One For All (OFA), a scalable system that provides a unified interface to interact with multiple CAs. Additionally, we introduce MARS: Multi-Agent Response Selection, a new encoder model for question response pairing that jointly encodes user question and agent response pairs. We demonstrate that OFA is able to automatically and accurately integrate an ensemble of commercially available CAs spanning disparate domains. Specifically, using the MARS encoder we achieve the highest accuracy on our BBAI task, outperforming strong baselines.
Research indicates that monotonous automated driving increases the incidence of fatigued driving. Although many prediction models based on advanced machine learning techniques were proposed to monitor driver fatigue, especially in manual driving, little is known about how these black-box machine learning models work. In this paper, we proposed a combination of eXtreme Gradient Boosting (XGBoost) and SHAP (SHapley Additive exPlanations) to predict driver fatigue with explanations due to their efficiency and accuracy. First, in order to obtain the ground truth of driver fatigue, PERCLOS (percentage of eyelid closure over the pupil over time) between 0 and 100 was used as the response variable. Second, we built a driver fatigue regression model using both physiological and behavioral measures with XGBoost and it outperformed other selected machine learning models with 3.847 root-mean-squared error (RMSE), 1.768 mean absolute error (MAE) and 0.996 adjusted $R^2$. Third, we employed SHAP to identify the most important predictor variables and uncovered the black-box XGBoost model by showing the main effects of most important predictor variables globally and explaining individual predictions locally. Such an explainable driver fatigue prediction model offered insights into how to intervene in automated driving when necessary, such as during the takeover transition period from automated driving to manual driving.