Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

"Topic": models, code, and papers

Efficient Hybrid Transformer: Learning Global-local Context for Urban Sence Segmentation

Sep 18, 2021
Libo Wang, Shenghui Fang, Ce Zhang, Rui Li, Chenxi Duan

Semantic segmentation of fine-resolution urban scene images plays a vital role in extensive practical applications, such as land cover mapping, urban change detection, environmental protection and economic assessment. Driven by rapid developments in deep learning technologies, convolutional neural networks (CNNs) have dominated the semantic segmentation task for many years. Convolutional neural networks adopt hierarchical feature representation and have strong local context extraction. However, the local property of the convolution layer limits the network from capturing global information that is crucial for improving fine-resolution image segmentation. Recently, Transformer comprise a hot topic in the computer vision domain. Vision Transformer demonstrates the great capability of global information modelling, boosting many vision tasks, such as image classification, object detection and especially semantic segmentation. In this paper, we propose an efficient hybrid Transformer (EHT) for semantic segmentation of urban scene images. EHT takes advantage of CNNs and Transformer, learning global-local context to strengthen the feature representation. Extensive experiments demonstrate that EHT has higher efficiency with competitive accuracy compared with state-of-the-art benchmark methods. Specifically, the proposed EHT achieves a 67.0% mIoU on the UAVid test set and outperforms other lightweight models significantly. The code will be available soon.


  Access Paper or Ask Questions

A literature survey on student feedback assessment tools and their usage in sentiment analysis

Sep 09, 2021
Himali Aryal

Online learning is becoming increasingly popular, whether for convenience, to accommodate work hours, or simply to have the freedom to study from anywhere. Especially, during the Covid-19 pandemic, it has become the only viable option for learning. The effectiveness of teaching various hard-core programming courses with a mix of theoretical content is determined by the student interaction and responses. In contrast to a digital lecture through Zoom or Teams, a lecturer may rapidly acquire such responses from students' facial expressions, behavior, and attitude in a physical session, even if the listener is largely idle and non-interactive. However, student assessment in virtual learning is a challenging task. Despite the challenges, different technologies are progressively being integrated into teaching environments to boost student engagement and motivation. In this paper, we evaluate the effectiveness of various in-class feedback assessment methods such as Kahoot!, Mentimeter, Padlet, and polling to assist a lecturer in obtaining real-time feedback from students throughout a session and adapting the teaching style accordingly. Furthermore, some of the topics covered by student suggestions include tutor suggestions, enhancing teaching style, course content, and other subjects. Any input gives the instructor valuable insight into how to improve the student's learning experience, however, manually going through all of the qualitative comments and extracting the ideas is tedious. Thus, in this paper, we propose a sentiment analysis model for extracting the explicit suggestions from the students' qualitative feedback comments.


  Access Paper or Ask Questions

Strong Optimal Classification Trees

Mar 29, 2021
Sina Aghaei, Andr茅s G贸mez, Phebe Vayanos

Decision trees are among the most popular machine learning models and are used routinely in applications ranging from revenue management and medicine to bioinformatics. In this paper, we consider the problem of learning optimal binary classification trees. Literature on the topic has burgeoned in recent years, motivated both by the empirical suboptimality of heuristic approaches and the tremendous improvements in mixed-integer optimization (MIO) technology. Yet, existing MIO-based approaches from the literature do not leverage the power of MIO to its full extent: they rely on weak formulations, resulting in slow convergence and large optimality gaps. To fill this gap in the literature, we propose an intuitive flow-based MIO formulation for learning optimal binary classification trees. Our formulation can accommodate side constraints to enable the design of interpretable and fair decision trees. Moreover, we show that our formulation has a stronger linear optimization relaxation than existing methods. We exploit the decomposable structure of our formulation and max-flow/min-cut duality to derive a Benders' decomposition method to speed-up computation. We propose a tailored procedure for solving each decomposed subproblem that provably generates facets of the feasible set of the MIO as constraints to add to the main problem. We conduct extensive computational experiments on standard benchmark datasets on which we show that our proposed approaches are 31 times faster than state-of-the art MIO-based techniques and improve out of sample performance by up to 8%.


  Access Paper or Ask Questions

On the benefits of robust models in modulation recognition

Mar 27, 2021
Javier Maroto, G茅r么me Bovet, Pascal Frossard

Given the rapid changes in telecommunication systems and their higher dependence on artificial intelligence, it is increasingly important to have models that can perform well under different, possibly adverse, conditions. Deep Neural Networks (DNNs) using convolutional layers are state-of-the-art in many tasks in communications. However, in other domains, like image classification, DNNs have been shown to be vulnerable to adversarial perturbations, which consist of imperceptible crafted noise that when added to the data fools the model into misclassification. This puts into question the security of DNNs in communication tasks, and in particular in modulation recognition. We propose a novel framework to test the robustness of current state-of-the-art models where the adversarial perturbation strength is dependent on the signal strength and measured with the "signal to perturbation ratio" (SPR). We show that current state-of-the-art models are susceptible to these perturbations. In contrast to current research on the topic of image classification, modulation recognition allows us to have easily accessible insights on the usefulness of the features learned by DNNs by looking at the constellation space. When analyzing these vulnerable models we found that adversarial perturbations do not shift the symbols towards the nearest classes in constellation space. This shows that DNNs do not base their decisions on signal statistics that are important for the Bayes-optimal modulation recognition model, but spurious correlations in the training data. Our feature analysis and proposed framework can help in the task of finding better models for communication systems.


  Access Paper or Ask Questions

Learning to Drop Points for LiDAR Scan Synthesis

Feb 23, 2021
Kazuto Nakashima, Ryo Kurazume

Generative modeling of 3D scenes is a crucial topic for aiding mobile robots to improve unreliable observations. However, despite the rapid progress in the natural image domain, building generative models is still challenging for 3D data, such as point clouds. Most existing studies on point clouds have focused on small and uniform-density data. In contrast, 3D LiDAR point clouds widely used in mobile robots are non-trivial to be handled because of the large number of points and varying-density. To circumvent this issue, 3D-to-2D projected representation such as a cylindrical depth map has been studied in existing LiDAR processing tasks but susceptible to discrete lossy pixels caused by failures of laser reflection. This paper proposes a novel framework based on generative adversarial networks to synthesize realistic LiDAR data as an improved 2D representation. Our generative architectures are designed to learn a distribution of inverse depth maps and simultaneously simulate the lossy pixels, which enables us to decompose an underlying smooth geometry and the corresponding uncertainty of laser reflection. To simulate the lossy pixels, we propose a differentiable framework to learn to produce sample-dependent binary masks using the Gumbel-Sigmoid reparametrization trick. We demonstrate the effectiveness of our approach in synthesis and reconstruction tasks on two LiDAR datasets. We further showcase potential applications by recovering various corruptions in LiDAR data.


  Access Paper or Ask Questions

Towards Meaningful Statements in IR Evaluation. Mapping Evaluation Measures to Interval Scales

Jan 07, 2021
Marco Ferrante, Nicola Ferro, Norbert Fuhr

Recently, it was shown that most popular IR measures are not interval-scaled, implying that decades of experimental IR research used potentially improper methods, which may have produced questionable results. However, it was unclear if and to what extent these findings apply to actual evaluations and this opened a debate in the community with researchers standing on opposite positions about whether this should be considered an issue (or not) and to what extent. In this paper, we first give an introduction to the representational measurement theory explaining why certain operations and significance tests are permissible only with scales of a certain level. For that, we introduce the notion of meaningfulness specifying the conditions under which the truth (or falsity) of a statement is invariant under permissible transformations of a scale. Furthermore, we show how the recall base and the length of the run may make comparison and aggregation across topics problematic. Then we propose a straightforward and powerful approach for turning an evaluation measure into an interval scale, and describe an experimental evaluation of the differences between using the original measures and the interval-scaled ones. For all the regarded measures - namely Precision, Recall, Average Precision, (Normalized) Discounted Cumulative Gain, Rank-Biased Precision and Reciprocal Rank - we observe substantial effects, both on the order of average values and on the outcome of significance tests. For the latter, previously significant differences turn out to be insignificant, while insignificant ones become significant. The effect varies remarkably between the tests considered but overall, on average, we observed a 25% change in the decision about which systems are significantly different and which are not.


  Access Paper or Ask Questions

On the Performance of One-Bit DoA Estimation via Sparse Linear Arrays

Dec 28, 2020
Saeid Sedighi, M. R. Bhavani Shankar, Mojtaba Soltanalian, Bj枚rn Ottersten

Direction of Arrival (DoA) estimation using Sparse Linear Arrays (SLAs) has recently gained considerable attention in array processing thanks to their capability to provide enhanced degrees of freedom in resolving uncorrelated source signals. Additionally, deployment of one-bit Analog-to-Digital Converters (ADCs) has emerged as an important topic in array processing, as it offers both a low-cost and a low-complexity implementation. In this paper, we study the problem of DoA estimation from one-bit measurements received by an SLA. Specifically, we first investigate the identifiability conditions for the DoA estimation problem from one-bit SLA data and establish an equivalency with the case when DoAs are estimated from infinite-bit unquantized measurements. Towards determining the performance limits of DoA estimation from one-bit quantized data, we derive a pessimistic approximation of the corresponding Cram\'{e}r-Rao Bound (CRB). This pessimistic CRB is then used as a benchmark for assessing the performance of one-bit DoA estimators. We also propose a new algorithm for estimating DoAs from one-bit quantized data. We investigate the analytical performance of the proposed method through deriving a closed-form expression for the covariance matrix of the asymptotic distribution of the DoA estimation errors and show that it outperforms the existing algorithms in the literature. Numerical simulations are provided to validate the analytical derivations and corroborate the resulting performance improvement.

* 13 pages, 7 figures 

  Access Paper or Ask Questions

Demographic Representation and Collective Storytelling in the Me Too Twitter Hashtag Activism Movement

Oct 13, 2020
Aaron Mueller, Zach Wood-Doughty, Silvio Amir, Mark Dredze, Alicia L. Nobles

The #MeToo movement on Twitter has drawn attention to the pervasive nature of sexual harassment and violence. While #MeToo has been praised for providing support for self-disclosures of harassment or violence and shifting societal response, it has also been criticized for exemplifying how women of color have been discounted for their historical contributions to and excluded from feminist movements. Through an analysis of over 600,000 tweets from over 256,000 unique users, we examine online #MeToo conversations across gender and racial/ethnic identities and the topics that each demographic emphasized. We found that tweets authored by white women were overrepresented in the movement compared to other demographics, aligning with criticism of unequal representation. We found that intersected identities contributed differing narratives to frame the movement, co-opted the movement to raise visibility in parallel ongoing movements, employed the same hashtags both critically and supportively, and revived and created new hashtags in response to pivotal moments. Notably, tweets authored by black women often expressed emotional support and were critical about differential treatment in the justice system and by police. In comparison, tweets authored by white women and men often highlighted sexual harassment and violence by public figures and weaved in more general political discussions. We discuss the implications of work for digital activism research and design including suggestions to raise visibility by those who were under-represented in this hashtag activism movement. Content warning: this article discusses issues of sexual harassment and violence.

* 27 pages (incl. 5 for references). Submitted to CSCW 2021 

  Access Paper or Ask Questions

End-to-End JPEG Decoding and Artifacts Suppression Using Heterogeneous Residual Convolutional Neural Network

Jul 01, 2020
Jun Niu

Existing deep learning models separate JPEG artifacts suppression from the decoding protocol as independent task. In this work, we take one step forward to design a true end-to-end heterogeneous residual convolutional neural network (HR-CNN) with spectrum decomposition and heterogeneous reconstruction mechanism. Benefitting from the full CNN architecture and GPU acceleration, the proposed model considerably improves the reconstruction efficiency. Numerical experiments show that the overall reconstruction speed reaches to the same magnitude of the standard CPU JPEG decoding protocol, while both decoding and artifacts suppression are completed together. We formulate the JPEG artifacts suppression task as an interactive process of decoding and image detail reconstructions. A heterogeneous, fully convolutional, mechanism is proposed to particularly address the uncorrelated nature of different spectral channels. Directly starting from the JPEG code in k-space, the network first extracts the spectral samples channel by channel, and restores the spectral snapshots with expanded throughput. These intermediate snapshots are then heterogeneously decoded and merged into the pixel space image. A cascaded residual learning segment is designed to further enhance the image details. Experiments verify that the model achieves outstanding performance in JPEG artifacts suppression, while its full convolutional operations and elegant network structure offers higher computational efficiency for practical online usage compared with other deep learning models on this topic.


  Access Paper or Ask Questions

Survey of Deep Reinforcement Learning for Motion Planning of Autonomous Vehicles

Jan 30, 2020
Szil谩rd Aradi

Academic research in the field of autonomous vehicles has reached high popularity in recent years related to several topics as sensor technologies, V2X communications, safety, security, decision making, control, and even legal and standardization rules. Besides classic control design approaches, Artificial Intelligence and Machine Learning methods are present in almost all of these fields. Another part of research focuses on different layers of Motion Planning, such as strategic decisions, trajectory planning, and control. A wide range of techniques in Machine Learning itself have been developed, and this article describes one of these fields, Deep Reinforcement Learning (DRL). The paper provides insight into the hierarchical motion planning problem and describes the basics of DRL. The main elements of designing such a system are the modeling of the environment, the modeling abstractions, the description of the state and the perception models, the appropriate rewarding, and the realization of the underlying neural network. The paper describes vehicle models, simulation possibilities and computational requirements. Strategic decisions on different layers and the observation models, e.g., continuous and discrete state representations, grid-based, and camera-based solutions are presented. The paper surveys the state-of-art solutions systematized by the different tasks and levels of autonomous driving, such as car-following, lane-keeping, trajectory following, merging, or driving in dense traffic. Finally, open questions and future challenges are discussed.


  Access Paper or Ask Questions

<<
551
552
553
554
555
556
557
558
559
560
561
562
563
>>