The spatial relationships among objects provide rich clues to object context for visual recognition. In this paper, we propose to learn a Semantic Feature Map (SFM) with deep neural networks to model spatial object context for better understanding of image and video content. Specifically, we first extract high-level semantic object features from the input image with convolutional neural networks for every object proposal, and organize them into the designed SFM so that spatial information among objects is preserved...
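The abstract does not specify how per-proposal features are laid out in the SFM, so the following is only a minimal sketch of one plausible arrangement, not the authors' implementation. It assumes each proposal comes with a pixel bounding box and a non-negative feature vector (for example, class scores), assigns each proposal to a grid cell by its box center, and merges proposals that share a cell with an element-wise maximum. The function name `build_semantic_feature_map`, the grid size, and the max-merge rule are all hypothetical choices made for illustration.

```python
import numpy as np


def build_semantic_feature_map(boxes, features, image_size, grid_size=(8, 8)):
    """Arrange per-proposal feature vectors on a spatial grid.

    boxes:      (N, 4) array of [x1, y1, x2, y2] in pixels (assumed format).
    features:   (N, D) array of non-negative per-proposal features (assumed).
    image_size: (width, height) of the input image in pixels.
    grid_size:  (rows, cols) of the output map (hypothetical default).
    Returns an array of shape (rows, cols, D).
    """
    boxes = np.asarray(boxes, dtype=np.float64)
    features = np.asarray(features, dtype=np.float64)
    width, height = image_size
    rows, cols = grid_size

    # Empty cells stay at zero, which is why features are assumed non-negative.
    sfm = np.zeros((rows, cols, features.shape[1]))

    # Locate each proposal by the center of its bounding box.
    center_x = (boxes[:, 0] + boxes[:, 2]) / 2.0
    center_y = (boxes[:, 1] + boxes[:, 3]) / 2.0

    # Map centers to grid indices, clipping centers that fall on the border.
    col_idx = np.clip((center_x / width * cols).astype(int), 0, cols - 1)
    row_idx = np.clip((center_y / height * rows).astype(int), 0, rows - 1)

    # Proposals landing in the same cell are merged by element-wise maximum.
    for r, c, feat in zip(row_idx, col_idx, features):
        sfm[r, c] = np.maximum(sfm[r, c], feat)
    return sfm
```

Because the output keeps a rows × cols layout, a map built this way could be passed to a convolutional network in the same manner as an image, which is one way the spatial arrangement of objects could remain visible to later layers.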
Authors
Rui-wei Zhao
Zuxuan Wu
Yu-Gang Jiang