Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

"Topic": models, code, and papers

Resources for Turkish Dependency Parsing: Introducing the BOUN Treebank and the BoAT Annotation Tool

Feb 24, 2020
Utku Türk, Furkan Atmaca, Şaziye Betül Özateş, Gözde Berk, Seyyit Talha Bedir, Abdullatif Köksal, Balkız Öztürk Başaran, Tunga Güngör, Arzucan Özgür

In this paper, we describe our contributions and efforts to develop Turkish resources, which include a new treebank (BOUN Treebank) with novel sentences, along with the guidelines we adopted and a new annotation tool we developed (BoAT). The manual annotation process we employed was shaped and implemented by a team of four linguists and five NLP specialists. Decisions regarding the annotation of the BOUN Treebank were made in line with the Universal Dependencies framework, which originated from the works of De Marneffe et al. (2014) and Nivre et al. (2016). We took into account the recent unifying efforts based on the re-annotation of other Turkish treebanks in the UD framework (T\"urk et al., 2019). Through the BOUN Treebank, we introduced a total of 9,757 sentences from various topics including biographical texts, national newspapers, instructional texts, popular culture articles, and essays. In addition, we report the parsing results of a graph-based dependency parser obtained over each text type, the total of the BOUN Treebank, and all Turkish treebanks that we either re-annotated or introduced. We show that a state-of-the-art dependency parser has improved scores for identifying the proper head and the syntactic relationships between the heads and the dependents. In light of these results, we have observed that the unification of the Turkish annotation scheme and introducing a more comprehensive treebank improves performance with regards to dependency parsing

* 29 pages, 5 figures, 10 tables, submitted to Language Resources and Evaluation 

  Access Paper or Ask Questions

Mobility Inference on Long-Tailed Sparse Trajectory

Jan 21, 2020
Lei Shi

Analyzing the urban trajectory in cities has become an important topic in data mining. How can we model the human mobility consisting of stay and travel from the raw trajectory data? How can we infer such a mobility model from the single trajectory information? How can we further generalize the mobility inference to accommodate the real-world trajectory data that is sparsely sampled over time? In this paper, based on formal and rigid definitions of the stay/travel mobility, we propose a single trajectory inference algorithm that utilizes a generic long-tailed sparsity pattern in the large-scale trajectory data. The algorithm guarantees a 100\% precision in the stay/travel inference with a provable lower-bound in the recall. Furthermore, we introduce an encoder-decoder learning architecture that admits multiple trajectories as inputs. The architecture is optimized for the mobility inference problem through customized embedding and learning mechanism. Evaluations with three trajectory data sets of 40 million urban users validate the performance guarantees of the proposed inference algorithm and demonstrate the superiority of our deep learning model, in comparison to well-known sequence learning methods. On extremely sparse trajectories, the deep learning model achieves a 2$\times$ overall accuracy improvement from the single trajectory inference algorithm, through proven scalability and generalizability to large-scale versatile training data.

  Access Paper or Ask Questions

Graph Neural Networks: A Review of Methods and Applications

Jan 02, 2019
Jie Zhou, Ganqu Cui, Zhengyan Zhang, Cheng Yang, Zhiyuan Liu, Maosong Sun

Lots of learning tasks require dealing with graph data which contains rich relation information among elements. Modeling physics system, learning molecular fingerprints, predicting protein interface, and classifying diseases require that a model learns from graph inputs. In other domains such as learning from non-structural data like texts and images, reasoning on extracted structures, like the dependency tree of sentences and the scene graph of images, is an important research topic which also needs graph reasoning models. Graph neural networks (GNNs) are connectionist models that capture the dependence of graphs via message passing between the nodes of graphs. Unlike standard neural networks, graph neural networks retain a state that can represent information from its neighborhood with arbitrary depth. Although the primitive GNNs have been found difficult to train for a fixed point, recent advances in network architectures, optimization techniques, and parallel computation have enabled successful learning with them. In recent years, systems based on graph convolutional network (GCN) and gated graph neural network (GGNN) have demonstrated ground-breaking performance on many tasks mentioned above. In this survey, we provide a detailed review over existing graph neural network models, systematically categorize the applications, and propose four open problems for future research.

  Access Paper or Ask Questions

Multi-focus Image Fusion using dictionary learning and Low-Rank Representation

Apr 23, 2018
Hui Li, Xiao-Jun Wu

Among the representation learning, the low-rank representation (LRR) is one of the hot research topics in many fields, especially in image processing and pattern recognition. Although LRR can capture the global structure, the ability of local structure preservation is limited because LRR lacks dictionary learning. In this paper, we propose a novel multi-focus image fusion method based on dictionary learning and LRR to get a better performance in both global and local structure. Firstly, the source images are divided into several patches by sliding window technique. Then, the patches are classified according to the Histogram of Oriented Gradient (HOG) features. And the sub-dictionaries of each class are learned by K-singular value decomposition (K-SVD) algorithm. Secondly, a global dictionary is constructed by combining these sub-dictionaries. Then, we use the global dictionary in LRR to obtain the LRR coefficients vector for each patch. Finally, the l_1-norm and choose-max fuse strategy for each coefficients vector is adopted to reconstruct fused image from the fused LRR coefficients and the global dictionary. Experimental results demonstrate that the proposed method can obtain state-of-the-art performance in both qualitative and quantitative evaluations compared with serval classical methods and novel methods.The Code of our fusion method is available at

* 12 pages, 5 figures, 2 tables. The 9th International Conference on Image and Graphics (ICIG 2017, Oral) 

  Access Paper or Ask Questions

MovieGraphs: Towards Understanding Human-Centric Situations from Videos

Apr 15, 2018
Paul Vicol, Makarand Tapaswi, Lluis Castrejon, Sanja Fidler

There is growing interest in artificial intelligence to build socially intelligent robots. This requires machines to have the ability to "read" people's emotions, motivations, and other factors that affect behavior. Towards this goal, we introduce a novel dataset called MovieGraphs which provides detailed, graph-based annotations of social situations depicted in movie clips. Each graph consists of several types of nodes, to capture who is present in the clip, their emotional and physical attributes, their relationships (i.e., parent/child), and the interactions between them. Most interactions are associated with topics that provide additional details, and reasons that give motivations for actions. In addition, most interactions and many attributes are grounded in the video with time stamps. We provide a thorough analysis of our dataset, showing interesting common-sense correlations between different social aspects of scenes, as well as across scenes over time. We propose a method for querying videos and text with graphs, and show that: 1) our graphs contain rich and sufficient information to summarize and localize each scene; and 2) subgraphs allow us to describe situations at an abstract level and retrieve multiple semantically relevant situations. We also propose methods for interaction understanding via ordering, and reason understanding. MovieGraphs is the first benchmark to focus on inferred properties of human-centric situations, and opens up an exciting avenue towards socially-intelligent AI agents.

* Spotlight at CVPR 2018. Webpage: 

  Access Paper or Ask Questions

PCA-Based Missing Information Imputation for Real-Time Crash Likelihood Prediction Under Imbalanced Data

Feb 11, 2018
Jintao Ke, Shuaichao Zhang, Hai Yang, Xiqun Chen

The real-time crash likelihood prediction has been an important research topic. Various classifiers, such as support vector machine (SVM) and tree-based boosting algorithms, have been proposed in traffic safety studies. However, few research focuses on the missing data imputation in real-time crash likelihood prediction, although missing values are commonly observed due to breakdown of sensors or external interference. Besides, classifying imbalanced data is also a difficult problem in real-time crash likelihood prediction, since it is hard to distinguish crash-prone cases from non-crash cases which compose the majority of the observed samples. In this paper, principal component analysis (PCA) based approaches, including LS-PCA, PPCA, and VBPCA, are employed for imputing missing values, while two kinds of solutions are developed to solve the problem in imbalanced data. The results show that PPCA and VBPCA not only outperform LS-PCA and other imputation methods (including mean imputation and k-means clustering imputation), in terms of the root mean square error (RMSE), but also help the classifiers achieve better predictive performance. The two solutions, i.e., cost-sensitive learning and synthetic minority oversampling technique (SMOTE), help improve the sensitivity by adjusting the classifiers to pay more attention to the minority class.

  Access Paper or Ask Questions

Multimodal Image Captioning for Marketing Analysis

Feb 06, 2018
Philipp Harzig, Stephan Brehm, Rainer Lienhart, Carolin Kaiser, René Schallner

Automatically captioning images with natural language sentences is an important research topic. State of the art models are able to produce human-like sentences. These models typically describe the depicted scene as a whole and do not target specific objects of interest or emotional relationships between these objects in the image. However, marketing companies require to describe these important attributes of a given scene. In our case, objects of interest are consumer goods, which are usually identifiable by a product logo and are associated with certain brands. From a marketing point of view, it is desirable to also evaluate the emotional context of a trademarked product, i.e., whether it appears in a positive or a negative connotation. We address the problem of finding brands in images and deriving corresponding captions by introducing a modified image captioning network. We also add a third output modality, which simultaneously produces real-valued image ratings. Our network is trained using a classification-aware loss function in order to stimulate the generation of sentences with an emphasis on words identifying the brand of a product. We evaluate our model on a dataset of images depicting interactions between humans and branded products. The introduced network improves mean class accuracy by 24.5 percent. Thanks to adding the third output modality, it also considerably improves the quality of generated captions for images depicting branded products.

* 4 pages, 1 figure, accepted at MIPR2018 

  Access Paper or Ask Questions

Private False Discovery Rate Control

Nov 12, 2015
Cynthia Dwork, Weijie Su, Li Zhang

We provide the first differentially private algorithms for controlling the false discovery rate (FDR) in multiple hypothesis testing, with essentially no loss in power under certain conditions. Our general approach is to adapt a well-known variant of the Benjamini-Hochberg procedure (BHq), making each step differentially private. This destroys the classical proof of FDR control. To prove FDR control of our method, (a) we develop a new proof of the original (non-private) BHq algorithm and its robust variants -- a proof requiring only the assumption that the true null test statistics are independent, allowing for arbitrary correlations between the true nulls and false nulls. This assumption is fairly weak compared to those previously shown in the vast literature on this topic, and explains in part the empirical robustness of BHq. Then (b) we relate the FDR control properties of the differentially private version to the control properties of the non-private version. \end{enumerate} We also present a low-distortion "one-shot" differentially private primitive for "top $k$" problems, e.g., "Which are the $k$ most popular hobbies?" (which we apply to: "Which hypotheses have the $k$ most significant $p$-values?"), and use it to get a faster privacy-preserving instantiation of our general approach at little cost in accuracy. The proof of privacy for the one-shot top~$k$ algorithm introduces a new technique of independent interest.

  Access Paper or Ask Questions

Exploration and Exploitation in Federated Learning to Exclude Clients with Poisoned Data

Apr 29, 2022
Shadha Tabatabai, Ihab Mohammed, Basheer Qolomany, Abdullatif Albasser, Kashif Ahmad, Mohamed Abdallah, Ala Al-Fuqaha

Federated Learning (FL) is one of the hot research topics, and it utilizes Machine Learning (ML) in a distributed manner without directly accessing private data on clients. However, FL faces many challenges, including the difficulty to obtain high accuracy, high communication cost between clients and the server, and security attacks related to adversarial ML. To tackle these three challenges, we propose an FL algorithm inspired by evolutionary techniques. The proposed algorithm groups clients randomly in many clusters, each with a model selected randomly to explore the performance of different models. The clusters are then trained in a repetitive process where the worst performing cluster is removed in each iteration until one cluster remains. In each iteration, some clients are expelled from clusters either due to using poisoned data or low performance. The surviving clients are exploited in the next iteration. The remaining cluster with surviving clients is then used for training the best FL model (i.e., remaining FL model). Communication cost is reduced since fewer clients are used in the final training of the FL model. To evaluate the performance of the proposed algorithm, we conduct a number of experiments using FEMNIST dataset and compare the result against the random FL algorithm. The experimental results show that the proposed algorithm outperforms the baseline algorithm in terms of accuracy, communication cost, and security.

* Accepted at 2022 IWCMC 

  Access Paper or Ask Questions

A Novel Mask R-CNN Model to Segment Heterogeneous Brain Tumors through Image Subtraction

Apr 04, 2022
Sanskriti Singh

The segmentation of diseases is a popular topic explored by researchers in the field of machine learning. Brain tumors are extremely dangerous and require the utmost precision to segment for a successful surgery. Patients with tumors usually take 4 MRI scans, T1, T1gd, T2, and FLAIR, which are then sent to radiologists to segment and analyze for possible future surgery. To create a second segmentation, it would be beneficial to both radiologists and patients in being more confident in their conclusions. We propose using a method performed by radiologists called image segmentation and applying it to machine learning models to prove a better segmentation. Using Mask R-CNN, its ResNet backbone being pre-trained on the RSNA pneumonia detection challenge dataset, we can train a model on the Brats2020 Brain Tumor dataset. Center for Biomedical Image Computing & Analytics provides MRI data on patients with and without brain tumors and the corresponding segmentations. We can see how well the method of image subtraction works by comparing it to models without image subtraction through DICE coefficient (F1 score), recall, and precision on the untouched test set. Our model performed with a DICE coefficient of 0.75 in comparison to 0.69 without image subtraction. To further emphasize the usefulness of image subtraction, we compare our final model to current state-of-the-art models to segment tumors from MRI scans.

* 8 pages, 7 figures 

  Access Paper or Ask Questions