Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

"Topic": models, code, and papers

The Open World of Micro-Videos

Apr 01, 2016
Phuc Xuan Nguyen, Gregory Rogez, Charless Fowlkes, Deva Ramanan

Micro-videos are six-second videos popular on social media networks with several unique properties. Firstly, because of the authoring process, they contain significantly more diversity and narrative structure than existing collections of video "snippets". Secondly, because they are often captured by hand-held mobile cameras, they contain specialized viewpoints including third-person, egocentric, and self-facing views seldom seen in traditional produced video. Thirdly, due to to their continuous production and publication on social networks, aggregate micro-video content contains interesting open-world dynamics that reflects the temporal evolution of tag topics. These aspects make micro-videos an appealing well of visual data for developing large-scale models for video understanding. We analyze a novel dataset of micro-videos labeled with 58 thousand tags. To analyze this data, we introduce viewpoint-specific and temporally-evolving models for video understanding, defined over state-of-the-art motion and deep visual features. We conclude that our dataset opens up new research opportunities for large-scale video analysis, novel viewpoints, and open-world dynamics.

  Access Paper or Ask Questions

Mapping Out Narrative Structures and Dynamics Using Networks and Textual Information

Mar 24, 2016
Semi Min, Juyong Park

Human communication is often executed in the form of a narrative, an account of connected events composed of characters, actions, and settings. A coherent narrative structure is therefore a requisite for a well-formulated narrative -- be it fictional or nonfictional -- for informative and effective communication, opening up the possibility of a deeper understanding of a narrative by studying its structural properties. In this paper we present a network-based framework for modeling and analyzing the structure of a narrative, which is further expanded by incorporating methods from computational linguistics to utilize the narrative text. Modeling a narrative as a dynamically unfolding system, we characterize its progression via the growth patterns of the character network, and use sentiment analysis and topic modeling to represent the actual content of the narrative in the form of interaction maps between characters with associated sentiment values and keywords. This is a network framework advanced beyond the simple occurrence-based one most often used until now, allowing one to utilize the unique characteristics of a given narrative to a high degree. Given the ubiquity and importance of narratives, such advanced network-based representation and analysis framework may lead to a more systematic modeling and understanding of narratives for social interactions, expression of human sentiments, and communication.

* 17 pages, 10 figures 

  Access Paper or Ask Questions

Robust Statistical Ranking: Theory and Algorithms

Aug 15, 2014
Qianqian Xu, Jiechao Xiong, Qingming Huang, Yuan Yao

Deeply rooted in classical social choice and voting theory, statistical ranking with paired comparison data experienced its renaissance with the wide spread of crowdsourcing technique. As the data quality might be significantly damaged in an uncontrolled crowdsourcing environment, outlier detection and robust ranking have become a hot topic in such data analysis. In this paper, we propose a robust ranking framework based on the principle of Huber's robust statistics, which formulates outlier detection as a LASSO problem to find sparse approximations of the cyclic ranking projection in Hodge decomposition. Moreover, simple yet scalable algorithms are developed based on Linearized Bregman Iteration to achieve an even less biased estimator than LASSO. Statistical consistency of outlier detection is established in both cases which states that when the outliers are strong enough and in Erdos-Renyi random graph sampling settings, outliers can be faithfully detected. Our studies are supported by experiments with both simulated examples and real-world data. The proposed framework provides us a promising tool for robust ranking with large scale crowdsourcing data arising from computer vision, multimedia, machine learning, sociology, etc.

* 16 pages, 8 figures 

  Access Paper or Ask Questions

Indoor Activity Detection and Recognition for Sport Games Analysis

Apr 25, 2014
Georg Waltner, Thomas Mauthner, Horst Bischof

Activity recognition in sport is an attractive field for computer vision research. Game, player and team analysis are of great interest and research topics within this field emerge with the goal of automated analysis. The very specific underlying rules of sports can be used as prior knowledge for the recognition task and present a constrained environment for evaluation. This paper describes recognition of single player activities in sport with special emphasis on volleyball. Starting from a per-frame player-centered activity recognition, we incorporate geometry and contextual information via an activity context descriptor that collects information about all player's activities over a certain timespan relative to the investigated player. The benefit of this context information on single player activity recognition is evaluated on our new real-life dataset presenting a total amount of almost 36k annotated frames containing 7 activity classes within 6 videos of professional volleyball games. Our incorporation of the contextual information improves the average player-centered classification performance of 77.56% by up to 18.35% on specific classes, proving that spatio-temporal context is an important clue for activity recognition.

* Part of the OAGM 2014 proceedings (arXiv:1404.3538) 

  Access Paper or Ask Questions

Topological characterizations to three types of covering approximation operators

Sep 29, 2012
Aiping Huang, William Zhu

Covering-based rough set theory is a useful tool to deal with inexact, uncertain or vague knowledge in information systems. Topology, one of the most important subjects in mathematics, provides mathematical tools and interesting topics in studying information systems and rough sets. In this paper, we present the topological characterizations to three types of covering approximation operators. First, we study the properties of topology induced by the sixth type of covering lower approximation operator. Second, some topological characterizations to the covering lower approximation operator to be an interior operator are established. We find that the topologies induced by this operator and by the sixth type of covering lower approximation operator are the same. Third, we study the conditions which make the first type of covering upper approximation operator be a closure operator, and find that the topology induced by the operator is the same as the topology induced by the fifth type of covering upper approximation operator. Forth, the conditions of the second type of covering upper approximation operator to be a closure operator and the properties of topology induced by it are established. Finally, these three topologies space are compared. In a word, topology provides a useful method to study the covering-based rough sets.

  Access Paper or Ask Questions

Transforming Graph Representations for Statistical Relational Learning

Mar 30, 2012
Ryan A. Rossi, Luke K. McDowell, David W. Aha, Jennifer Neville

Relational data representations have become an increasingly important topic due to the recent proliferation of network datasets (e.g., social, biological, information networks) and a corresponding increase in the application of statistical relational learning (SRL) algorithms to these domains. In this article, we examine a range of representation issues for graph-based relational data. Since the choice of relational data representation for the nodes, links, and features can dramatically affect the capabilities of SRL algorithms, we survey approaches and opportunities for relational representation transformation designed to improve the performance of these algorithms. This leads us to introduce an intuitive taxonomy for data representation transformations in relational domains that incorporates link transformation and node transformation as symmetric representation tasks. In particular, the transformation tasks for both nodes and links include (i) predicting their existence, (ii) predicting their label or type, (iii) estimating their weight or importance, and (iv) systematically constructing their relevant features. We motivate our taxonomy through detailed examples and use it to survey and compare competing approaches for each of these tasks. We also discuss general conditions for transforming links, nodes, and features. Finally, we highlight challenges that remain to be addressed.

  Access Paper or Ask Questions

DeepGraviLens: a Multi-Modal Architecture for Classifying Gravitational Lensing Data

May 03, 2022
Nicolò Oreste Pinciroli Vago, Piero Fraternali

Gravitational lensing is the relativistic effect generated by massive bodies, which bend the space-time surrounding them. It is a deeply investigated topic in astrophysics and allows validating theoretical relativistic results and studying faint astrophysical objects that would not be visible otherwise. In recent years Machine Learning methods have been applied to support the analysis of the gravitational lensing phenomena by detecting lensing effects in data sets consisting of images associated with brightness variation time series. However, the state-of-art approaches either consider only images and neglect time-series data or achieve relatively low accuracy on the most difficult data sets. This paper introduces DeepGraviLens, a novel multi-modal network that classifies spatio-temporal data belonging to one non-lensed system type and three lensed system types. It surpasses the current state of the art accuracy results by $\approx$ 19% to $\approx$ 43%, depending on the considered data set. Such an improvement will enable the acceleration of the analysis of lensed objects in upcoming astrophysical surveys, which will exploit the petabytes of data collected, e.g., from the Vera C. Rubin Observatory.

  Access Paper or Ask Questions

Cross-Modality High-Frequency Transformer for MR Image Super-Resolution

Mar 29, 2022
Chaowei Fang, Dingwen Zhang, Liang Wang, Yulun Zhang, Lechao Cheng, Junwei Han

Improving the resolution of magnetic resonance (MR) image data is critical to computer-aided diagnosis and brain function analysis. Higher resolution helps to capture more detailed content, but typically induces to lower signal-to-noise ratio and longer scanning time. To this end, MR image super-resolution has become a widely-interested topic in recent times. Existing works establish extensive deep models with the conventional architectures based on convolutional neural networks (CNN). In this work, to further advance this research field, we make an early effort to build a Transformer-based MR image super-resolution framework, with careful designs on exploring valuable domain prior knowledge. Specifically, we consider two-fold domain priors including the high-frequency structure prior and the inter-modality context prior, and establish a novel Transformer architecture, called Cross-modality high-frequency Transformer (Cohf-T), to introduce such priors into super-resolving the low-resolution (LR) MR images. Comprehensive experiments on two datasets indicate that Cohf-T achieves new state-of-the-art performance.

  Access Paper or Ask Questions

IAM: A Comprehensive and Large-Scale Dataset for Integrated Argument Mining Tasks

Mar 24, 2022
Liying Cheng, Lidong Bing, Ruidan He, Qian Yu, Yan Zhang, Luo Si

Traditionally, a debate usually requires a manual preparation process, including reading plenty of articles, selecting the claims, identifying the stances of the claims, seeking the evidence for the claims, etc. As the AI debate attracts more attention these years, it is worth exploring the methods to automate the tedious process involved in the debating system. In this work, we introduce a comprehensive and large dataset named IAM, which can be applied to a series of argument mining tasks, including claim extraction, stance classification, evidence extraction, etc. Our dataset is collected from over 1k articles related to 123 topics. Near 70k sentences in the dataset are fully annotated based on their argument properties (e.g., claims, stances, evidence, etc.). We further propose two new integrated argument mining tasks associated with the debate preparation process: (1) claim extraction with stance classification (CESC) and (2) claim-evidence pair extraction (CEPE). We adopt a pipeline approach and an end-to-end method for each integrated task separately. Promising experimental results are reported to show the values and challenges of our proposed tasks, and motivate future research on argument mining.

* 11 pages, 3 figures, accepted by ACL 2022 

  Access Paper or Ask Questions

Unsupervised Learning on 3D Point Clouds by Clustering and Contrasting

Feb 14, 2022
Guofeng Mei, Litao Yu, Qiang Wu, Jian Zhang, Mohammed Bennamoun

Learning from unlabeled or partially labeled data to alleviate human labeling remains a challenging research topic in 3D modeling. Along this line, unsupervised representation learning is a promising direction to auto-extract features without human intervention. This paper proposes a general unsupervised approach, named \textbf{ConClu}, to perform the learning of point-wise and global features by jointly leveraging point-level clustering and instance-level contrasting. Specifically, for one thing, we design an Expectation-Maximization (EM) like soft clustering algorithm that provides local supervision to extract discriminating local features based on optimal transport. We show that this criterion extends standard cross-entropy minimization to an optimal transport problem, which we solve efficiently using a fast variant of the Sinkhorn-Knopp algorithm. For another, we provide an instance-level contrasting method to learn the global geometry, which is formulated by maximizing the similarity between two augmentations of one point cloud. Experimental evaluations on downstream applications such as 3D object classification and semantic segmentation demonstrate the effectiveness of our framework and show that it can outperform state-of-the-art techniques.

  Access Paper or Ask Questions