Lu Fang

LapEPI-Net: A Laplacian Pyramid EPI structure for Learning-based Dense Light Field Reconstruction

Feb 17, 2019

Gaochang Wu, Yebin Liu, Lu Fang, Tianyou Chai

Abstract: For the densely sampled light field (LF) reconstruction problem, existing approaches focus on a depth-free framework to achieve non-Lambertian performance. However, they are trapped in the "either aliasing or blurring" trade-off: pre-filtering the aliasing components (caused by the angular sparsity of the input LF) always leads to a blurry result. In this paper, we aim to solve this challenge by introducing an elaborately designed epipolar plane image (EPI) structure within a learning-based framework. Specifically, we start by analytically showing that decreasing the spatial scale of an EPI addresses the aliasing problem more efficiently than simply adopting pre-filtering. Accordingly, we design a Laplacian Pyramid EPI (LapEPI) structure that contains both a low-spatial-scale EPI (for aliasing) and high-frequency residuals (for blurring) to solve the trade-off problem. We then propose a novel network architecture for the LapEPI structure, termed LapEPI-net. To ensure non-Lambertian performance, we adopt a transfer-learning strategy, first pre-training the network with natural images and then fine-tuning it with unstructured LFs. Extensive experiments demonstrate the high performance and robustness of the proposed approach in tackling the aliasing-or-blurring problem as well as non-Lambertian reconstruction.

* 10 pages, 8 figures, 4 tables
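
The decomposition the abstract describes (a low-spatial-scale EPI plus high-frequency residuals) can be illustrated with a small sketch. This is not the authors' LapEPI-net: it only shows a Laplacian pyramid taken along the spatial axis of an EPI. The function names, the pair-averaging downsampler, and the nearest-neighbour upsampler are all assumptions chosen for brevity.

```python
# Illustrative sketch only: a Laplacian pyramid along the spatial axis of an EPI.
# An EPI is stored as a list of rows: one row per view (angular axis),
# one column per pixel (spatial axis). Filters are assumed, not from the paper.

def downsample_row(row):
    """Halve the spatial resolution by averaging neighbouring pixel pairs."""
    return [(row[i] + row[i + 1]) / 2.0 for i in range(0, len(row) - 1, 2)]


def upsample_row(row, length):
    """Nearest-neighbour upsampling back to `length` pixels."""
    return [row[min(i // 2, len(row) - 1)] for i in range(length)]


def build_lap_epi(epi, levels):
    """Return (low_scale_epi, residuals), residuals ordered fine to coarse."""
    residuals = []
    current = epi
    for _ in range(levels):
        low = [downsample_row(row) for row in current]
        up = [upsample_row(l, len(row)) for l, row in zip(low, current)]
        # High-frequency detail lost by the spatial downsampling.
        residuals.append([[a - b for a, b in zip(row, u)]
                          for row, u in zip(current, up)])
        current = low
    return current, residuals


def reconstruct(low, residuals):
    """Invert build_lap_epi: upsample and add residuals, coarse to fine."""
    current = low
    for level in reversed(residuals):
        current = [[u + d for u, d in zip(upsample_row(row, len(res_row)), res_row)]
                   for row, res_row in zip(current, level)]
    return current


# A toy EPI: a bright 2-pixel feature shifting by one pixel per view.
epi = [[0, 0, 8, 8, 0, 0, 0, 0],
       [0, 0, 0, 8, 8, 0, 0, 0],
       [0, 0, 0, 0, 8, 8, 0, 0]]
low, residuals = build_lap_epi(epi, levels=2)
restored = reconstruct(low, residuals)
```

Only the spatial axis is downsampled here, so the number of views is unchanged while the per-view shift (disparity) shrinks at the coarse level, which is the property the abstract relies on for handling aliasing.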

SPI-Optimizer: an integral-Separated PI Controller for Stochastic Optimization

Dec 29, 2018

Dan Wang, Mengqi Ji, Yong Wang, Haoqian Wang, Lu Fang

Abstract: To overcome the oscillation problem in the classical momentum-based optimizer, recent work associates it with the proportional-integral (PI) controller and artificially adds a D term, producing a PID controller. This suppresses oscillation at the cost of introducing an extra hyperparameter. In this paper, we start by asking why momentum-based methods oscillate about the optimal point, and answer that the fluctuation problem relates to the lag effect of the integral (I) term. Inspired by the conditional integration idea from the classical control community, we propose SPI-Optimizer, an integral-Separated PI controller based optimizer that does not introduce any extra hyperparameter. It separates the momentum term adaptively when the current and historical gradient directions are inconsistent. Extensive experiments demonstrate that SPI-Optimizer generalizes well on popular network architectures to eliminate the oscillation, and achieves competitive performance with faster convergence (up to 40% epoch reduction ratio) and more accurate classification results on MNIST, CIFAR10, and CIFAR100 (up to 27.5% error reduction ratio) than state-of-the-art methods.
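
A rough sketch of the integral-separation idea follows. It is not the paper's exact update rule: the abstract only says the momentum term is separated when current and historical gradient directions disagree. Using a per-parameter sign test between the current gradient and the accumulated gradient is an assumption, as are the function name `spi_step` and the default values.

```python
# Illustrative sketch only: momentum SGD with a per-parameter "separation" test.
# Assumption: history is dropped when the current gradient and the accumulated
# gradient have different signs; the paper's exact condition may differ.

def spi_step(theta, grads, accum, lr=0.1, momentum=0.9):
    """One update over parallel lists of parameters, gradients, accumulators."""
    new_theta, new_accum = [], []
    for p, g, m in zip(theta, grads, accum):
        if g * m > 0:
            # Directions agree: keep the integral (momentum) term.
            m = momentum * m + g
        else:
            # Directions disagree: separate the integral term, keep only P.
            m = g
        new_accum.append(m)
        new_theta.append(p - lr * m)
    return new_theta, new_accum


# Toy usage: minimise f(x) = x^2, whose gradient is 2x.
x, m = [5.0], [0.0]
for _ in range(100):
    x, m = spi_step(x, [2.0 * x[0]], m)
```

The test adds no tunable threshold, which matches the abstract's claim that no extra hyperparameter is introduced.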

RegNet: Learning the Optimization of Direct Image-to-Image Pose Registration

Dec 26, 2018

Lei Han, Mengqi Ji, Lu Fang, Matthias Nießner

Abstract: Direct image-to-image alignment that relies on the optimization of photometric error metrics suffers from a limited convergence range and sensitivity to lighting conditions. Deep learning approaches have been applied to address this problem by learning better feature representations using convolutional neural networks, yet they still require a good initialization. In this paper, we demonstrate that the inaccurate numerical Jacobian limits the convergence range, which could be improved greatly using learned approaches. Based on this observation, we propose a novel end-to-end network, RegNet, to learn the optimization of image-to-image pose registration. By jointly learning a feature representation for each pixel and partial derivatives that replace handcrafted ones (e.g., numerical differentiation) in the optimization step, the neural network facilitates end-to-end optimization. The energy landscape is constrained on both the feature representation and the learned Jacobian, hence providing more flexibility for the optimization and, as a consequence, leading to more robust and faster convergence. In a series of experiments, including a broad ablation study, we demonstrate that RegNet is able to converge for large-baseline image pairs with fewer iterations.

* 8 pages, 6 figures
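
For context, the handcrafted baseline that the abstract contrasts against can be sketched as a Gauss-Newton loop whose Jacobian comes from numerical differentiation. This is not RegNet itself. The sketch aligns two 1D signals by a single shift parameter. The function names, the central-difference step `eps`, and the 1D setting are assumptions made to keep the example small.

```python
# Illustrative sketch only: Gauss-Newton alignment with a numerical Jacobian.
# This is the kind of handcrafted derivative the abstract says RegNet replaces
# with learned ones; the 1D shift model is an assumption for illustration.
import math


def residuals(shift, xs, template, signal):
    """Photometric-style error between the shifted signal and the template."""
    return [signal(x + shift) - template(x) for x in xs]


def gauss_newton_shift(xs, template, signal, shift=0.0, iters=10, eps=1e-4):
    """Estimate the shift that minimises the sum of squared residuals."""
    for _ in range(iters):
        r = residuals(shift, xs, template, signal)
        r_plus = residuals(shift + eps, xs, template, signal)
        r_minus = residuals(shift - eps, xs, template, signal)
        # Central-difference Jacobian of each residual w.r.t. the shift.
        jac = [(p - m) / (2.0 * eps) for p, m in zip(r_plus, r_minus)]
        jtj = sum(j * j for j in jac)
        if jtj == 0.0:
            break
        # Gauss-Newton step for a single parameter: (J^T J)^-1 J^T r.
        shift -= sum(j * ri for j, ri in zip(jac, r)) / jtj
    return shift


# Toy usage: the template is the signal displaced by 0.3.
xs = [0.1 * i for i in range(60)]
estimate = gauss_newton_shift(xs, lambda x: math.sin(x + 0.3), math.sin)
```

With a small displacement the loop converges quickly. With a large one, the local derivative points the wrong way, which is the limited convergence range the abstract refers to.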

CrossNet: An End-to-end Reference-based Super Resolution Network using Cross-scale Warping

Jul 27, 2018

Haitian Zheng, Mengqi Ji, Haoqian Wang, Yebin Liu, Lu Fang

Abstract: Reference-based super-resolution (RefSR) super-resolves a low-resolution (LR) image given an external high-resolution (HR) reference image, where the reference image and the LR image share a similar viewpoint but have a significant resolution gap (8x). Existing RefSR methods work in a cascaded way, such as patch matching followed by a synthesis pipeline with two independently defined objective functions, leading to inter-patch misalignment, grid effects, and inefficient optimization. To resolve these issues, we present CrossNet, an end-to-end and fully convolutional deep neural network using cross-scale warping. Our network contains image encoders, cross-scale warping layers, and a fusion decoder: the encoders extract multi-scale features from both the LR and the reference images; the cross-scale warping layers spatially align the reference feature map with the LR feature map; the decoder finally aggregates feature maps from both domains to synthesize the HR output. Using cross-scale warping, our network is able to perform spatial alignment at the pixel level in an end-to-end fashion, which improves on existing schemes both in precision (around 2dB-4dB) and efficiency (more than 100 times faster).

* To appear in ECCV 2018

Beyond SIFT using Binary features for Loop Closure Detection

Sep 18, 2017

Lei Han, Guyue Zhou, Lan Xu, Lu Fang

Abstract: In this paper, a binary-feature-based Loop Closure Detection (LCD) method is proposed, which for the first time achieves higher precision-recall (PR) performance than state-of-the-art SIFT-feature-based approaches. The proposed system originates from our previous work, Multi-Index hashing for Loop closure Detection (MILD), which employs Multi-Index Hashing (MIH) (Greene et al., 1994) for Approximate Nearest Neighbor (ANN) search of binary features. As the accuracy of MILD is limited by repeating textures and inaccurate image similarity measurement, burstiness handling is introduced to solve this problem and achieves a considerable accuracy improvement. Additionally, a comprehensive theoretical analysis of the MIH used in MILD is conducted to further explore the potential of hashing methods for ANN search of binary features from a probabilistic perspective. This analysis provides more freedom in choosing the best MIH parameters for different application scenarios. Experiments on popular public datasets show that the proposed approach achieves the highest accuracy compared with the state of the art while running at 30Hz for databases containing thousands of images.

* IROS 2017 paper for loop closure detection

SurfaceNet: An End-to-end 3D Neural Network for Multiview Stereopsis

Aug 05, 2017

Mengqi Ji, Juergen Gall, Haitian Zheng, Yebin Liu, Lu Fang

Abstract: This paper proposes an end-to-end learning framework for multiview stereopsis. We term the network SurfaceNet. It takes a set of images and their corresponding camera parameters as input and directly infers the 3D model. The key advantage of the framework is that both photo-consistency as well as geometric relations of the surface structure can be directly learned for the purpose of multiview stereopsis in an end-to-end fashion. SurfaceNet is a fully 3D convolutional network, which is achieved by encoding the camera parameters together with the images in a 3D voxel representation. We evaluate SurfaceNet on the large-scale DTU benchmark.

* ICCV 2017 poster

MILD: Multi-Index hashing for Loop closure Detection

Feb 28, 2017

Lei Han, Lu Fang

Abstract: Loop Closure Detection (LCD) has proved to be extremely useful in globally consistent visual Simultaneous Localization and Mapping (SLAM) and appearance-based robot relocalization. Methods exploiting binary features in a bag-of-words representation have recently gained a lot of popularity for their efficiency, but suffer from low recall due to the inherent drawback that high-dimensional binary feature descriptors lack well-defined centroids. In this paper, we propose a real-time LCD approach called MILD (Multi-Index Hashing for Loop closure Detection), in which image similarity is measured by feature matching directly to achieve high recall without introducing extra computational complexity, with the aid of Multi-Index Hashing (MIH). A theoretical analysis of the approximate image similarity measurement using MIH is presented, which reveals the trade-off between efficiency and accuracy from a probabilistic perspective. Extensive comparisons with state-of-the-art LCD methods demonstrate the superiority of MILD in both efficiency and accuracy.

* 6 pages, 5 figures; accepted by IEEE ICME 2017
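
The Multi-Index Hashing lookup mentioned in the abstract can be sketched as follows. This is a generic, simplified MIH index and not the MILD implementation: the class name, the 32-bit descriptor width, and the restriction to exact substring matches are assumptions. By the pigeonhole principle, a descriptor within Hamming distance r < m of the query must match it exactly in at least one of the m substrings, so exact-match probing finds every neighbour only for such small radii.

```python
# Illustrative sketch only: simplified Multi-Index Hashing for binary features.
# Descriptors are non-negative ints of `bits` bits, split into `m` substrings,
# with one hash table per substring. Only exact substring matches are probed.
from collections import defaultdict


def split(desc, bits, m):
    """Cut a descriptor into m equal-width substrings (lowest bits first)."""
    width = bits // m
    mask = (1 << width) - 1
    return [(desc >> (i * width)) & mask for i in range(m)]


class MultiIndexHash:
    def __init__(self, bits=32, m=4):
        self.bits, self.m = bits, m
        self.tables = [defaultdict(list) for _ in range(m)]
        self.descs = []

    def add(self, desc):
        """Insert a descriptor into every substring table; return its index."""
        idx = len(self.descs)
        self.descs.append(desc)
        for table, key in zip(self.tables, split(desc, self.bits, self.m)):
            table[key].append(idx)
        return idx

    def query(self, desc, radius):
        """Indices of stored descriptors within `radius` among the candidates."""
        candidates = set()
        for table, key in zip(self.tables, split(desc, self.bits, self.m)):
            candidates.update(table.get(key, ()))
        # Verify candidates with the full Hamming distance.
        return sorted(i for i in candidates
                      if bin(self.descs[i] ^ desc).count("1") <= radius)


index = MultiIndexHash(bits=32, m=4)
a = index.add(0xDEADBEEF)
b = index.add(0xDEADBEEF ^ 0b101)   # two bits flipped in the lowest byte
c = index.add(0x12345678)
near = index.query(0xDEADBEEF ^ 0x80000000, radius=3)
```

Shorter substrings produce more candidates (higher recall, more verification work), while longer ones produce fewer. This is one way to see the efficiency-accuracy trade-off the abstract analyses.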

Utilizing High-level Visual Feature for Indoor Shopping Mall Navigation

Feb 18, 2017

Ziwei Xu, Haitian Zheng, Minjian Pang, Yangchun Zhu, Xiongfei Su, Guyue Zhou, Lu Fang

Abstract: Towards robust and convenient indoor shopping mall navigation, we propose a novel learning-based scheme that utilizes the high-level visual information in storefront images captured by users' personal devices. Specifically, we decompose the visual navigation problem into localization and map generation. Given a storefront input image, a novel feature fusion scheme (denoted as FusionNet) fuses the distinguishing DNN-based appearance feature and text feature for robust recognition of store brands, which enables accurate localization. For map generation, we convert the user-captured indicator map of the shopping mall into a topological map by parsing the stores and their connectivity. Experiments conducted in real shopping malls demonstrate that the proposed system achieves robust localization and precise map generation, enabling accurate navigation.

FlyCap: Markerless Motion Capture Using Multiple Autonomous Flying Cameras

Nov 29, 2016

Lan Xu, Lu Fang, Wei Cheng, Kaiwen Guo, Guyue Zhou, Qionghai Dai, Yebin Liu

Abstract: Aiming at automatic, convenient and non-intrusive motion capture, this paper presents a new-generation markerless motion capture technique, the FlyCap system, to capture surface motions of moving characters using multiple autonomous flying cameras (autonomous unmanned aerial vehicles (UAVs), each integrated with an RGBD video camera). During data capture, three cooperative flying cameras automatically track and follow the moving target, who performs large-scale motions in a wide space. We propose a novel non-rigid surface registration method to track and fuse the depth of the three flying cameras for surface motion tracking of the moving target, and simultaneously calculate the pose of each flying camera. We leverage the visual-odometry information provided by the UAV platform, and formulate the surface tracking problem as a non-linear objective function that can be linearized and effectively minimized through a Gauss-Newton method. Quantitative and qualitative experimental results demonstrate competent and plausible surface and motion reconstruction results.

* This paper has been withdrawn by the author due to a crucial sign error

Deep Learning for Surface Material Classification Using Haptic And Visual Information

May 01, 2016

Haitian Zheng, Lu Fang, Mengqi Ji, Matti Strese, Yigitcan Ozer, Eckehard Steinbach

Abstract: When a user scratches a hand-held rigid tool across an object surface, an acceleration signal can be captured, which carries relevant information about the surface. More importantly, such a haptic signal is complementary to the visual appearance of the surface, which suggests combining both modalities to recognize the surface material. In this paper, we present a novel deep learning method for the surface material classification problem based on a Fully Convolutional Network (FCN), which takes as input the aforementioned acceleration signal and a corresponding image of the surface texture. Compared to previous surface material classification solutions, which rely on careful design of hand-crafted domain-specific features, our method automatically extracts discriminative features using advanced deep learning methodologies. Experiments performed on the TUM surface material database demonstrate that our method achieves state-of-the-art classification accuracy robustly and efficiently.

* 8 pages, under review as a paper at Transactions on Multimedia
