Issam H. Laradji

ServiceNow Research, University of British Columbia

LLM aided semi-supervision for Extractive Dialog Summarization

Nov 19, 2023
Nishant Mishra, Gaurav Sahu, Iacer Calixto, Ameen Abu-Hanna, Issam H. Laradji

Generating high-quality summaries for chat dialogs often requires large labeled datasets. We propose a method to efficiently use unlabeled data for extractive summarization of customer-agent dialogs. In our method, we frame summarization as a question-answering problem and use state-of-the-art large language models (LLMs) to generate pseudo-labels for a dialog. We then use these pseudo-labels to fine-tune a chat summarization model, effectively transferring knowledge from the large LLM into a smaller specialized model. We demonstrate our method on the TweetSumm dataset, and show that using 10% of the original labelled data set, we can achieve 65.9/57.0/61.0 ROUGE-1/-2/-L, whereas the current state-of-the-art trained on the entire training data set obtains 65.16/55.81/64.37 ROUGE-1/-2/-L. In other words, in the worst case (i.e., ROUGE-L) we still effectively retain 94.7% of the performance while using only 10% of the data.
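
As a rough illustration of the pseudo-labeling step, here is a minimal sketch that frames extraction as utterance selection; the query_llm helper and the prompt wording are hypothetical placeholders, not the paper's released code:

    def query_llm(prompt: str) -> str:
        """Stand-in for a call to a large LLM (e.g., via an API client)."""
        raise NotImplementedError

    def pseudo_label(dialog: list[str]) -> list[int]:
        # Frame extractive summarization as a question over numbered utterances.
        numbered = "\n".join(f"{i}: {u}" for i, u in enumerate(dialog))
        prompt = (
            "Which utterances best summarize this customer-agent dialog?\n"
            f"{numbered}\n"
            "Answer with a comma-separated list of utterance indices."
        )
        answer = query_llm(prompt)
        return [int(tok) for tok in answer.split(",") if tok.strip().isdigit()]

    # The resulting pseudo-labels, together with the small labeled split,
    # are used to fine-tune a smaller extractive summarization model.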

* To be published in EMNLP Findings

Enhancing Semi-Supervised Learning for Extractive Summarization with an LLM-based pseudolabeler

Nov 16, 2023
Gaurav Sahu, Olga Vechtomova, Issam H. Laradji

This work tackles the task of extractive text summarization in a limited labeled data scenario using a semi-supervised approach. Specifically, we propose a prompt-based pseudolabel selection strategy using GPT-4. We evaluate our method on three text summarization datasets: TweetSumm, WikiHow, and ArXiv/PubMed. Our experiments show that by using an LLM to evaluate and generate pseudolabels, we can improve the ROUGE-1 by 10-20% on the different datasets, which is akin to enhancing pretrained models. We also show that such a method needs a smaller pool of unlabeled examples to perform better.
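
A minimal sketch of what the prompt-based selection step could look like, assuming a hypothetical query_llm stand-in for GPT-4 and an illustrative 1-5 rating prompt rather than the paper's exact template:

    def query_llm(prompt: str) -> str:
        """Stand-in for a GPT-4 call; stubbed here."""
        raise NotImplementedError

    def select_pseudolabels(candidates, threshold=4):
        """candidates: list of (document, pseudo_summary) pairs."""
        kept = []
        for document, pseudo_summary in candidates:
            prompt = (
                "On a scale of 1-5, how faithful and complete is this "
                f"extractive summary?\nDocument:\n{document}\n"
                f"Summary:\n{pseudo_summary}\nRating:"
            )
            score = int(query_llm(prompt).strip()[0])
            if score >= threshold:
                kept.append((document, pseudo_summary))
        # Kept pairs join the labeled set for semi-supervised training.
        return kept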

PromptMix: A Class Boundary Augmentation Method for Large Language Model Distillation

Oct 22, 2023
Gaurav Sahu, Olga Vechtomova, Dzmitry Bahdanau, Issam H. Laradji

Data augmentation is a widely used technique to address the problem of text classification when there is a limited amount of training data. Recent work often tackles this problem using large language models (LLMs) like GPT-3 that can generate new examples given already available ones. In this work, we propose a method to generate more helpful augmented data by utilizing the LLM's abilities to follow instructions and perform few-shot classifications. Our PromptMix method consists of two steps: 1) generate challenging text augmentations near class boundaries; however, generating borderline examples increases the risk of false positives in the dataset, so we 2) relabel the text augmentations using a prompting-based LLM classifier to enhance the correctness of labels in the generated data. We evaluate the proposed method in challenging 2-shot and zero-shot settings on four text classification datasets: Banking77, TREC6, Subjectivity (SUBJ), and Twitter Complaints. Our experiments show that generating and, crucially, relabeling borderline examples facilitates the transfer of knowledge of a massive LLM like GPT-3.5-turbo into smaller and cheaper classifiers like DistilBERT-base and BERT-base. Furthermore, 2-shot PromptMix outperforms multiple 5-shot data augmentation methods on the four datasets. Our code is available at https://github.com/ServiceNow/PromptMix-EMNLP-2023.
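
A compact sketch of the two steps; query_llm is a hypothetical stand-in for GPT-3.5-turbo, and the prompt wording is illustrative rather than the paper's exact template:

    def query_llm(prompt: str) -> str:
        """Stand-in for a GPT-3.5-turbo call; stubbed here."""
        raise NotImplementedError

    def promptmix_example(class_a, class_b, shots_a):
        # Step 1: generate a borderline utterance near the class boundary.
        text = query_llm(
            f"Here are examples of '{class_a}': {shots_a}\n"
            f"Write a sentence that is mostly about '{class_a}' but also "
            f"partly about '{class_b}'."
        )
        # Step 2: relabel with a few-shot LLM classifier, since borderline
        # generations raise the risk of false positives.
        label = query_llm(
            f"Classify the following into [{class_a}, {class_b}]:\n{text}\nLabel:"
        ).strip()
        return text, label  # (augmented example, corrected label)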

* Accepted to EMNLP 2023 (Long paper) 

Data Augmentation for Intent Classification with Off-the-shelf Large Language Models

Apr 05, 2022
Gaurav Sahu, Pau Rodriguez, Issam H. Laradji, Parmida Atighehchian, David Vazquez, Dzmitry Bahdanau

Data augmentation is a widely employed technique to alleviate the problem of data scarcity. In this work, we propose a prompting-based approach to generate labelled training data for intent classification with off-the-shelf language models (LMs) such as GPT-3. An advantage of this method is that no task-specific LM fine-tuning for data generation is required; hence the method requires no hyper-parameter tuning and is applicable even when the available training data is very scarce. We evaluate the proposed method in a few-shot setting on four diverse intent classification tasks. We find that GPT-generated data significantly boosts the performance of intent classifiers when the intents under consideration are sufficiently distinct from each other. In tasks with semantically close intents, we observe that the generated data is less helpful. Our analysis shows that this is because GPT often generates utterances that belong to a closely-related intent instead of the desired one. We present preliminary evidence that a prompting-based GPT classifier could be helpful in filtering the generated data to enhance its quality.
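
A minimal sketch of the prompting-based generation loop for a single intent; the prompt template and the query_llm helper are illustrative assumptions, not the paper's exact setup:

    def query_llm(prompt: str) -> str:
        """Stand-in for an off-the-shelf LM completion call (e.g., GPT-3)."""
        raise NotImplementedError

    def augment_intent(intent: str, seed_utterances: list[str], k: int = 10):
        shots = "\n".join(f"- {u}" for u in seed_utterances)
        prompt = (
            f"The following are user messages with the intent '{intent}':\n"
            f"{shots}\nWrite {k} more messages with the same intent:\n-"
        )
        lines = query_llm(prompt).splitlines()
        # Each generated line becomes a new labeled training example.
        return [(line.lstrip("- ").strip(), intent) for line in lines if line.strip()]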

* Accepted to 4th Workshop on NLP for Conversational AI, ACL 2022 

A Deep Learning Localization Method for Measuring Abdominal Muscle Dimensions in Ultrasound Images

Sep 30, 2021
Alzayat Saleh, Issam H. Laradji, Corey Lammie, David Vazquez, Carol A Flavell, Mostafa Rahimi Azghadi

Health professionals extensively use Two-Dimensional (2D) Ultrasound (US) videos and images to visualize and measure internal organs for various purposes, including evaluation of muscle architectural changes. US images can be used to measure abdominal muscle dimensions for the diagnosis and creation of customized treatment plans for patients with Low Back Pain (LBP); however, they are difficult to interpret. Due to high variability, skilled professionals with specialized training are required to take measurements to avoid low intra-observer reliability. This variability stems from the challenging nature of accurately finding the correct spatial location of measurement endpoints in abdominal US images. In this paper, we use a Deep Learning (DL) approach to automate the measurement of abdominal muscle thickness in 2D US images. By treating the problem as a localization task, we develop a modified Fully Convolutional Network (FCN) architecture to generate blobs of coordinate locations of measurement endpoints, similar to what a human operator does. We demonstrate that, using the TrA400 US image dataset, our network achieves a Mean Absolute Error (MAE) of 0.3125 on the test set, which almost matches the performance of skilled ultrasound technicians. Our approach can facilitate the next steps in automating measurements in 2D US images, while reducing inter-observer as well as intra-observer variability for more effective clinical outcomes.
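
To make the localization framing concrete, here is a minimal NumPy sketch of the kind of blob target such a network would regress: each measurement endpoint is rendered as a Gaussian peak, and predicted peaks are read back as coordinates. The sigma and shapes are illustrative assumptions, not values from the paper:

    import numpy as np

    def blob_target(shape, endpoints, sigma=5.0):
        """Render one Gaussian blob per measurement endpoint."""
        h, w = shape
        ys, xs = np.mgrid[0:h, 0:w]
        target = np.zeros(shape, dtype=np.float32)
        for ey, ex in endpoints:
            blob = np.exp(-((ys - ey) ** 2 + (xs - ex) ** 2) / (2 * sigma ** 2))
            target = np.maximum(target, blob)
        return target

    # Muscle thickness is then the distance between the two recovered
    # endpoint peaks, mirroring what a human operator measures.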

* 9 pages, 8 figures, 1 table, Accepted for Publication in the IEEE Journal of Biomedical and Health Informatics (J-BHI) 25-May-2021

A Realistic Fish-Habitat Dataset to Evaluate Algorithms for Underwater Visual Analysis

Aug 28, 2020
Alzayat Saleh, Issam H. Laradji, Dmitry A. Konovalov, Michael Bradley, David Vazquez, Marcus Sheaves

Visual analysis of complex fish habitats is an important step towards sustainable fisheries for human consumption and environmental protection. Deep Learning methods have shown great promise for scene analysis when trained on large-scale datasets. However, current datasets for fish analysis tend to focus on the classification task within constrained, plain environments which do not capture the complexity of underwater fish habitats. To address this limitation, we present DeepFish as a benchmark suite with a large-scale dataset to train and test methods for several computer vision tasks. The dataset consists of approximately 40 thousand images collected underwater from 20 habitats in the marine environments of tropical Australia. The dataset originally contained only classification labels. Thus, we collected point-level and segmentation labels to have a more comprehensive fish analysis benchmark. These labels enable models to learn to automatically monitor fish count, identify their locations, and estimate their sizes. Our experiments provide an in-depth analysis of the dataset characteristics, and the performance evaluation of several state-of-the-art approaches based on our benchmark. Although models pre-trained on ImageNet have performed well on this benchmark, there is still room for improvement. Therefore, this benchmark serves as a testbed to motivate further development in this challenging domain of underwater computer vision. Code is available at: https://github.com/alzayats/DeepFish
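
As a small illustration of how segmentation labels support the counting and size tasks, connected components of a binary fish mask yield a count and per-fish pixel areas; the toy mask below is an assumption for demonstration, not DeepFish data:

    import numpy as np
    from scipy import ndimage

    mask = np.zeros((64, 64), dtype=np.uint8)
    mask[10:20, 10:30] = 1  # toy "fish" blob
    mask[40:50, 35:55] = 1  # a second blob

    labeled, count = ndimage.label(mask)  # per-instance ids and fish count
    sizes = ndimage.sum(mask, labeled, index=range(1, count + 1))
    print(count, sizes)  # 2, pixel area per fish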

* 10 pages, 5 figures, 3 tables, Accepted for Publication in Scientific Reports (Nature) 14 August 2020 

LOOC: Localize Overlapping Objects with Count Supervision

Jul 03, 2020
Issam H. Laradji, Rafael Pardinas, Pau Rodriguez, David Vazquez

Acquiring count annotations generally requires less human effort than point-level and bounding box annotations. Thus, we propose the novel problem setup of localizing objects in dense scenes under this weaker supervision. We introduce LOOC, a method to Localize Overlapping Objects with Count supervision. We train LOOC by alternating between two stages. In the first stage, LOOC learns to generate pseudo point-level annotations in a semi-supervised manner. In the second stage, LOOC uses a fully-supervised localization method that trains on these pseudo labels. The localization method is used to progressively improve the quality of the pseudo labels. We conducted experiments on popular counting datasets. For localization, LOOC achieves a strong new baseline in the novel problem setup where only count supervision is available. For counting, LOOC outperforms current state-of-the-art methods that only use count as their supervision. Code is available at: https://github.com/ElementAI/looc.
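
A high-level sketch of the alternating scheme as the abstract describes it; all three helpers are hypothetical placeholders for the two stages, not functions from the released code:

    def propose_points(image, count):
        """Semi-supervised proposal of `count` pseudo point labels; stubbed."""
        raise NotImplementedError

    def train_localizer(images, pseudo_points):
        """Fully-supervised point localization on pseudo labels; stubbed."""
        raise NotImplementedError

    def predict_points(localizer, image, count):
        raise NotImplementedError

    def looc(images, counts, rounds=5):
        localizer = None
        for _ in range(rounds):
            pseudo = [
                predict_points(localizer, im, c) if localizer else propose_points(im, c)
                for im, c in zip(images, counts)
            ]
            # Each round retrains on the pseudo labels, progressively
            # improving their quality.
            localizer = train_localizer(images, pseudo)
        return localizer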

Where are the Masks: Instance Segmentation with Image-level Supervision

Jul 02, 2019
Issam H. Laradji, David Vazquez, Mark Schmidt

A major obstacle in instance segmentation is that existing methods often need many per-pixel labels in order to be effective. These labels require large human effort, and for certain applications such labels are not readily available. To address this limitation, we propose a novel framework that can effectively train with image-level labels, which are significantly cheaper to acquire. For instance, one can do an internet search for the term "car" and obtain many images where a car is present with minimal effort. Our framework consists of two stages: (1) train a classifier to generate pseudo masks for the objects of interest; (2) train a fully supervised Mask R-CNN on these pseudo masks. Our two main contributions are: (1) a pipeline that is simple to implement and amenable to different segmentation methods; and (2) new state-of-the-art results for this problem setup. Our results are based on evaluating our method on PASCAL VOC 2012, a standard dataset for weakly supervised methods, where we demonstrate major performance gains compared to existing methods with respect to mean average precision.
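
A skeletal sketch of the two-stage pipeline; all three training helpers are hypothetical placeholders standing in for the classifier-based mask generator and Mask R-CNN training described above:

    def train_classifier(images, image_labels):
        """Stage-1 classifier trained on image-level labels; stubbed."""
        raise NotImplementedError

    def generate_pseudo_masks(classifier, image):
        """Derive pseudo masks for objects the classifier detects; stubbed."""
        raise NotImplementedError

    def train_mask_rcnn(images, masks):
        """Stage-2 fully supervised Mask R-CNN training; stubbed."""
        raise NotImplementedError

    def weakly_supervised_instance_seg(images, image_labels):
        classifier = train_classifier(images, image_labels)            # stage 1
        pseudo_masks = [generate_pseudo_masks(classifier, im) for im in images]
        return train_mask_rcnn(images, pseudo_masks)                   # stage 2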

* Accepted at BMVC2019 

Instance Segmentation with Point Supervision

Jun 14, 2019
Issam H. Laradji, Negar Rostamzadeh, Pedro O. Pinheiro, David Vazquez, Mark Schmidt

Instance segmentation methods often require costly per-pixel labels. We propose a method that only requires point-level annotations. During training, the model has access to only a single pixel label per object, yet the task is to output full segmentation masks. To address this challenge, we construct a network with two branches: (1) a localization network (L-Net) that predicts the location of each object; and (2) an embedding network (E-Net) that learns an embedding space where pixels of the same object are close. The segmentation masks for the located objects are obtained by grouping pixels with similar embeddings. At training time, while L-Net only requires point-level annotations, E-Net uses pseudo-labels generated by a class-agnostic object proposal method. We evaluate our approach on the PASCAL VOC, COCO, KITTI and CityScapes datasets. The experiments show that our method (1) obtains competitive results compared to fully-supervised methods in certain scenarios; (2) outperforms fully- and weakly-supervised methods with a fixed annotation budget; and (3) establishes a first strong baseline for instance segmentation with point-level supervision.
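
To make the grouping step concrete, here is a minimal NumPy sketch: each pixel is assigned to the located object whose seed embedding is nearest. The shapes and the plain nearest-seed rule are illustrative assumptions, not the paper's exact grouping procedure:

    import numpy as np

    def group_pixels(embeddings, seeds):
        """embeddings: (H, W, D) from E-Net; seeds: (y, x) locations from L-Net."""
        h, w, d = embeddings.shape
        seed_vecs = np.stack([embeddings[y, x] for y, x in seeds])   # (K, D)
        flat = embeddings.reshape(-1, d)                             # (H*W, D)
        # Squared distance from every pixel embedding to every seed embedding.
        dists = ((flat[:, None, :] - seed_vecs[None, :, :]) ** 2).sum(axis=-1)
        return dists.argmin(axis=1).reshape(h, w)  # per-pixel instance id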

Efficient Deep Gaussian Process Models for Variable-Sized Input

May 16, 2019
Issam H. Laradji, Mark Schmidt, Vladimir Pavlovic, Minyoung Kim

Deep Gaussian processes (DGPs) have appealing Bayesian properties, can handle variable-sized data, and learn deep features. Their limitation is that they do not scale well with the size of the data. Existing approaches address this using a deep random feature (DRF) expansion model, which makes inference tractable by approximating DGPs. However, DRF is not suitable for variable-sized input data such as trees, graphs, and sequences. We introduce GP-DRF, a novel Bayesian model with an input layer of GPs followed by DRF layers. The key advantage is that the combination of GP and DRF leads to a tractable model that can both handle variable-sized input and learn deep long-range dependency structures of the data. We provide a novel, efficient method to simultaneously infer the posterior of the GP's latent vectors and the posterior of the DRF's internal weights and random frequencies. Our experiments show that GP-DRF outperforms the standard GP model and the DRF model across many datasets. Furthermore, they demonstrate that GP-DRF enables improved uncertainty quantification compared to GP and DRF alone, as measured by the Bhattacharyya distance. Source code is available at https://github.com/IssamLaradji/GP_DRF.
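
For intuition about the DRF layers, here is a minimal NumPy sketch of a random Fourier feature expansion, whose inner products approximate an RBF kernel; the dimensions and lengthscale are illustrative, not values from the paper:

    import numpy as np

    def rff_layer(X, n_features=256, lengthscale=1.0, seed=0):
        """Map X of shape (N, d) to random features phi(X) of shape (N, n_features)."""
        rng = np.random.default_rng(seed)
        d = X.shape[1]
        W = rng.normal(0.0, 1.0 / lengthscale, size=(d, n_features))
        b = rng.uniform(0.0, 2 * np.pi, size=n_features)
        return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

    # phi(X) @ phi(X).T approximates the RBF kernel matrix, which is what
    # keeps inference in the stacked GP-DRF model tractable.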

* Accepted in IJCNN 2019 