Avinesh P. V. S.

Model Stability with Continuous Data Updates

Jan 14, 2022

Huiting Liu, Avinesh P. V. S., Siddharth Patwardhan, Peter Grasch, Sachin Agarwal

Abstract: In this paper, we study the "stability" of machine learning (ML) models within the context of larger, complex NLP systems with continuous training data updates. For this study, we propose a methodology for the assessment of model stability (which we refer to as jitter) under various experimental conditions. Through experiments on four text classification tasks and two sequence labeling tasks, we find that model design choices, including network architecture and input representation, have a critical impact on stability. In classification tasks, non-RNN-based models are observed to be more stable than RNN-based ones, while the encoder-decoder model is less stable in sequence labeling tasks. Moreover, input representations based on pre-trained fastText embeddings contribute to greater stability than other choices. We also show that two learning strategies, ensemble models and incremental training, have a significant influence on stability. We recommend that ML model designers account for trade-offs between accuracy and jitter when making modeling choices.

Live Blog Corpus for Summarization

Feb 27, 2018

Avinesh P. V. S., Maxime Peyrard, Christian M. Meyer

Abstract: Live blogs are an increasingly popular news format for covering breaking news and live events in online journalism. Online news websites around the world use this medium to give their readers minute-by-minute updates on an event. Good summaries enhance the value of live blogs for a reader but are often not available. In this paper, we study a way of collecting corpora for automatic live blog summarization. In an empirical evaluation using well-known state-of-the-art summarization systems, we show that the live blog corpus poses new challenges in the field of summarization. We make our tools for reconstructing the corpus publicly available to encourage the research community to replicate our results.

* To appear in the Proceedings of LREC 2018
