 |
FlashFusion: Real-time Globally Consistent Dense 3D Reconstruction using CPU Computing
Robotics Science and Systems, 2018 Project Page
Aiming at the practical use of dense 3D reconstruction on portable devices, we propose FlashFusion, a Fast LArge-Scale High-resolution (sub-centimeter level) 3D reconstruction system that does not rely on GPU computing. It enables globally consistent localization through a robust yet fast global bundle adjustment scheme, and realizes spatial-hashing-based volumetric fusion running at 300Hz and rendering at 25Hz. Extensive experiments on both real-world and synthetic datasets demonstrate that FlashFusion achieves real-time, globally consistent, high-resolution (5mm), and large-scale dense 3D reconstruction under highly constrained computation, i.e., CPU computing on a portable device.
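As an illustration of the general idea behind spatial-hashing-based volumetric fusion, the following is a minimal sketch and not the FlashFusion implementation. It stores truncated signed distance (TSDF) values in fixed-size voxel blocks that are allocated on demand in a hash map, and fuses observations by a running weighted average. The 5mm voxel size comes from the abstract; `BLOCK_DIM`, `TRUNCATION`, and the class `HashedTsdfVolume` are hypothetical choices made for this example.

```python
import math

VOXEL_SIZE = 0.005   # metres; the 5mm resolution quoted in the abstract
BLOCK_DIM = 8        # assumption: voxels per block edge
TRUNCATION = 0.02    # assumption: truncation distance in metres


class HashedTsdfVolume:
    """Sparse TSDF volume: voxel blocks live in a dict keyed by block coordinates."""

    def __init__(self):
        self.blocks = {}  # (bx, by, bz) -> list of [tsdf, weight] per voxel

    def _locate(self, point):
        # Integer voxel coordinates, then block key and offset inside the block.
        voxel = [math.floor(c / VOXEL_SIZE) for c in point]
        key = tuple(v // BLOCK_DIM for v in voxel)
        lx, ly, lz = (v % BLOCK_DIM for v in voxel)
        return key, (lx * BLOCK_DIM + ly) * BLOCK_DIM + lz

    def integrate(self, point, sdf, weight=1.0):
        """Fuse one signed-distance observation into the voxel containing point."""
        key, offset = self._locate(point)
        block = self.blocks.get(key)
        if block is None:
            # Allocate a block only when a measurement touches it.
            block = self.blocks[key] = [[0.0, 0.0] for _ in range(BLOCK_DIM ** 3)]
        tsdf = max(-1.0, min(1.0, sdf / TRUNCATION))
        old_tsdf, old_weight = block[offset]
        new_weight = old_weight + weight
        block[offset] = [(old_tsdf * old_weight + tsdf * weight) / new_weight, new_weight]

    def lookup(self, point):
        """Return (tsdf, weight) for the voxel containing point, or None if unallocated."""
        key, offset = self._locate(point)
        block = self.blocks.get(key)
        return None if block is None else tuple(block[offset])
```

Because blocks are created only where measurements fall, memory grows with the observed surface rather than with the bounding volume, which is the property that makes hashing attractive for large-scale scenes.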
|
 |
MILD: Multi-Index hashing for appearance based Loop closure Detection
IEEE Int. Conf. on Multimedia & Expo. (ICME) 2017 Best Student Paper Award
Beyond SIFT using Binary Features in Loop Closure Detection
IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS) 2017
Code ICME-PDF IROS-PDF
We propose a binary-feature-based loop closure detection method that achieves higher precision-recall performance than state-of-the-art SIFT-feature-based approaches. The proposed scheme employs Multi-Index Hashing for approximate nearest neighbor search of binary features, and it runs at 30Hz for databases containing thousands of images.
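The core idea of multi-index hashing can be sketched as follows. This is a generic illustration under stated assumptions, not the MILD code: each binary descriptor is split into `m` substrings, each substring is indexed in its own hash table, and a query probes every table with its own substrings before verifying candidates by full Hamming distance. By the pigeonhole principle, any descriptor within Hamming distance `m - 1` of the query shares at least one identical substring with it, so exact probes retrieve all such neighbors. The class `MultiIndexHash` and the 32-bit descriptor width are hypothetical choices for the example.

```python
from collections import defaultdict


def split_code(code, n_bits, m):
    """Split an n_bits-wide integer code into m equal-width substrings."""
    width = n_bits // m
    mask = (1 << width) - 1
    return [(code >> (i * width)) & mask for i in range(m)]


class MultiIndexHash:
    """Index binary codes in m hash tables, one per substring."""

    def __init__(self, n_bits=32, m=4):
        assert n_bits % m == 0, "code width must be divisible by m"
        self.n_bits, self.m = n_bits, m
        self.tables = [defaultdict(list) for _ in range(m)]
        self.codes = []

    def add(self, code):
        """Insert a code and return its integer id."""
        idx = len(self.codes)
        self.codes.append(code)
        for table, sub in zip(self.tables, split_code(code, self.n_bits, self.m)):
            table[sub].append(idx)
        return idx

    def query(self, code, max_dist):
        """Return sorted (distance, id) pairs with Hamming distance <= max_dist.

        Exact substring probes guarantee full recall only for max_dist < m.
        """
        candidates = set()
        for table, sub in zip(self.tables, split_code(code, self.n_bits, self.m)):
            candidates.update(table.get(sub, ()))
        hits = ((bin(self.codes[i] ^ code).count("1"), i) for i in candidates)
        return sorted((d, i) for d, i in hits if d <= max_dist)
```

Only descriptors that collide with the query in at least one table are compared in full, which keeps the search cost well below a linear scan when the database is large.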
|
 |
SurfaceNet: An End-to-end 3D Neural Network for Multiview Stereopsis
Int. Conf. on Computer Vision (ICCV) 2017 Project Page
We propose SurfaceNet, an end-to-end learning framework for multiview stereopsis. It takes a set of images and their corresponding camera parameters as input and directly infers the 3D model. The key advantage of the framework is that both photo-consistency and the geometric relations of the surface structure can be learned directly for the purpose of multiview stereopsis in an end-to-end fashion. SurfaceNet is a fully 3D convolutional network, which is achieved by encoding the camera parameters together with the images in a 3D voxel representation. |

|
Multiscale Gigapixel Video: A Cross Resolution Image Matching and Warping Approach
Int. Conf. on Computational Photography (ICCP) 2017 Project Page
We present a multi-scale camera array to capture and synthesize gigapixel videos in an efficient way.
Our acquisition setup contains a reference camera with a short-focus lens to get a large field-of-view video and a number of unstructured long-focus cameras to capture local-view details.
|
|
FlyCap: Markerless Motion Capture Using Multiple Autonomous Flying Cameras
IEEE Trans. on Visualization and Computer Graphics, 2017 Project Page
We present a new end-to-end markerless motion capture system that uses multiple autonomous flying cameras to capture moving targets in a wide space, without extra constraints such as a fixed capture volume.
|
|
Monocular Long-term Target Following on UAVs
CVPR workshop 2016

We investigate the challenging long-term visual tracking problem and its implementation on Unmanned Aerial Vehicles (UAVs).
|

|
Learning High-level Prior with Convolutional Neural Networks for Semantic Segmentation
under review, IEEE Trans. on Image Processing

This paper proposes a convolutional neural network that can fuse high-level priors for semantic image segmentation.
|
|
Deep Learning for Surface Material Classification Using Haptic and Visual Information
IEEE Trans. on Multimedia, 2016

This paper addresses the surface material classification problem with a Fully Convolutional Network (FCN) that takes an acceleration signal and a corresponding image of the surface texture as inputs.
|

|
Computation and Memory Efficient Image Segmentation
IEEE Trans. on CSVT, 2016

This paper addresses the segmentation problem under limited computation and memory resources. Given a segmentation algorithm, we propose a framework that can reduce its computation time and memory requirement simultaneously, while preserving its accuracy.
|

|
Magic Glasses: From 2D to 3D
IEEE Trans. on CSVT, 2016
 
This paper proposes a virtual 3D eyeglasses try-on system driven by a 2D Internet image of a human face wearing a pair of eyeglasses.
|
|
Estimation of Virtual View Synthesis Distortion in 3D Video
IEEE Trans. on Image Processing, 2016 & IEEE TIP 2014.
This paper proposes an analytical model to estimate the synthesized view quality in 3D video. The model relates errors in the depth images to the synthesis quality, taking into account texture image characteristics, texture image quality, the virtual view position, and the rendering process.
|
|
Stereo Matching with Optimal Local Adaptive Radiometric Compensation
IEEE Signal Processing Letters, 2014

We propose a radiometrically invariant stereo matching method by approximating the spatially varying Pixel Value Correspondence Function (PVCF) between a corresponding pixel pair as a locally consistent polynomial within an optimal local adaptive window.
|
|
Robust Blur Kernel Estimation for License Plate Images from Fast Moving Vehicles
IEEE Trans. on Image Processing, 2016

This paper proposes a sparse representation scheme to handle snapshots of speeding vehicles captured by surveillance cameras, which are frequently blurred due to fast motion.
|
|
Deblurring Saturated Night Images with Function-form Kernel
IEEE Trans. on Image Processing, 2015

This paper deals with the deblurring of night images that suffer from low contrast, heavy noise, and saturated regions. The key idea is to deduce blur kernels from saturated regions via a novel kernel representation and advanced algorithms.
|

|
Separable Kernel for Image Deblurring
IEEE Computer Vision and Pattern Recognition (CVPR), 2014

This paper approaches the image deblurring problem from a completely new perspective by proposing a separable kernel, composed of trajectory, intensity, and defocus components, to represent the inherent properties of the camera and scene system.
|
|
Adaptive Multispectral Demosaicking Based on Frequency Domain Analysis of Spectral Correlation
IEEE Trans. on Image Processing, 2017
This paper deals with multispectral demosaicking, where each band is significantly undersampled due to the increased number of bands.
|

|
Multichannel Non-Local Means Fusion for Color Image Denoising
IEEE Trans. on CSVT, 2013

This paper proposes a multichannel nonlocal means fusion (MNLF) scheme based on the inherent strong interchannel correlation feature of color images.
|
|
Subpixel Rendering: From Font Rendering to Image Subsampling
IEEE Signal Processing Magazine, 2012

This paper introduces subpixel arrangements in color displays, explains how subpixel rendering works, and presents several practical subpixel rendering applications in font rendering and image subsampling.
|