Kit Kuksenok

Toward Best Practices for Explainable B2B Machine Learning

Jun 11, 2019

Kit Kuksenok

Abstract: To design tools and data pipelines for explainable B2B machine learning (ML) systems, we need to recognize not only the immediate audience of such tools and data, but also (1) their organizational context and (2) secondary audiences. Our learnings are based on building custom ML-based chatbots for recruitment. We believe that in the B2B context, "explainable" ML means not only a system that can "explain itself" through tools and data pipelines, but also one that enables its domain-expert users to explain it to other stakeholders.

* 4 pages, 1 figure; position paper for INTERACT 2019 workshop on Humans in the Loop: Bridging AI and HCI

Evaluation and Improvement of Chatbot Text Classification Data Quality Using Plausible Negative Examples

Jun 05, 2019

Kit Kuksenok, Andriy Martyniv

Abstract: We describe and validate a metric for estimating multi-class classifier performance based on cross-validation and adapted for improvement of small, unbalanced natural-language datasets used in chatbot design. Our experiences draw upon building recruitment chatbots that mediate communication between job-seekers and recruiters by exposing the ML/NLP dataset to the recruiting team. Evaluation approaches must be understandable to various stakeholders, and useful for improving chatbot performance. The metric, nex-cv, uses negative examples in the evaluation of text classification, and fulfils three requirements. First, it is actionable: it can be used by non-developer staff. Second, it is not overly optimistic compared to human ratings, making it a fast method for comparing classifiers. Third, it allows model-agnostic comparison, making it useful for comparing systems despite implementation differences. We validate the metric based on seven recruitment-domain datasets in English and German over the course of one year.

* Included in the ACL2019 1st workshop on NLP for Conversational AI (Florence, Italy). Code available: https://github.com/jobpal/nex-cv
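
The idea of scoring a classifier with negative examples inside cross-validation can be illustrated with a small sketch. This is not the nex-cv implementation (the linked repository is the authoritative source). It is a toy version under stated assumptions: the bag-of-words classifier, the confidence threshold, the sample data, and the names `train`, `predict`, and `negative_example_cv` are all hypothetical. Classes chosen as negative are withheld from training, and their test items count as correct only when the classifier abstains.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: two real intents plus "smalltalk", which is
# treated as a source of plausible negative examples.
DATA = [
    ("salary range please", "salary"),
    ("salary range info", "salary"),
    ("how to apply", "apply"),
    ("how to apply online", "apply"),
    ("tell me a joke", "smalltalk"),
    ("how to tell a joke", "smalltalk"),
]

def train(examples):
    """Build a word-count profile per label (stand-in for a real classifier)."""
    profiles = defaultdict(Counter)
    for text, label in examples:
        profiles[label].update(text.lower().split())
    return profiles

def predict(profiles, text, threshold=0.5):
    """Return the best-matching label, or None when confidence is too low."""
    tokens = text.lower().split()
    if not tokens:
        return None
    best_label, best_score = None, 0.0
    for label, counts in profiles.items():
        # Score = fraction of the input's tokens seen in this label's training text.
        score = sum(1 for t in tokens if t in counts) / len(tokens)
        if score > best_score:
            best_label, best_score = label, score
    return best_label if best_score >= threshold else None

def negative_example_cv(data, negative_labels, k=2, threshold=0.5):
    """k-fold accuracy where negative classes are never trained on.

    A test item from a negative class is correct only if the classifier
    abstains (returns None); other items must receive their own label.
    """
    folds = [data[i::k] for i in range(k)]
    correct = total = 0
    for i, test_fold in enumerate(folds):
        train_set = [
            ex
            for j, fold in enumerate(folds) if j != i
            for ex in fold if ex[1] not in negative_labels
        ]
        profiles = train(train_set)
        for text, label in test_fold:
            expected = None if label in negative_labels else label
            correct += predict(profiles, text, threshold) == expected
            total += 1
    return correct / total

if __name__ == "__main__":
    print(negative_example_cv(DATA, {"smalltalk"}, threshold=0.5))
    print(negative_example_cv(DATA, {"smalltalk"}, threshold=0.3))
```

In this toy setup, lowering the threshold lets the held-out message "how to tell a joke" be mistaken for the `apply` intent, which lowers the score. That is the kind of over-optimism a plain cross-validation accuracy would not surface.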

Transparency in Maintenance of Recruitment Chatbots

May 09, 2019

Kit Kuksenok, Nina Praß

Abstract: We report on experiences with implementing conversational agents in the recruitment domain based on a machine learning (ML) system. Recruitment chatbots mediate communication between job-seekers and recruiters by exposing ML data to recruiter teams. Errors are difficult to understand, communicate, and resolve because they may span and combine UX, ML, and software issues. In an effort to improve organizational and technical transparency, we came to rely on a key contact role. Though effective for design and development, the centralization of this role poses challenges for transparency in sustained maintenance of this kind of ML-based mediating system.

* 4 pages, 3 figures, prepared for CHI2019 (Glasgow) workshop: Where is the Human? Bridging the Gap Between AI and HCI
