Research

ARCam: A User-Defined Camera for AR Photographic Art Creation [IEEE VR 2023 Demo]

  • Authors: Xinyi LUO, Zihao ZHU, Yuyang Wang, Pan HUI
  • Abstract: Photography in augmented reality can be challenging because of the restrictions of pre-defined settings. However, adjustable photography settings and real-time previews are essential for AR photographic creation, as creators must adjust multiple camera properties to produce unique visual effects. In this work, we designed an AR camera (ARCam) with various adjustable properties that gives users a high degree of freedom for photographic art creation with real-time preview. (A sketch of such an adjustable parameter set follows this entry.)
  • Download Paper
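
A minimal sketch, assuming a small illustrative property set (exposure compensation, white-balance temperature, saturation; the paper's actual property list is not given here), of how user-adjusted settings could be applied per frame to drive a live preview:

```python
# Illustrative sketch only: the property names below are assumptions,
# not ARCam's documented parameter set.
from dataclasses import dataclass

import numpy as np


@dataclass
class CameraSettings:
    exposure_ev: float = 0.0         # exposure compensation, in EV stops
    white_balance_k: float = 6500.0  # color temperature, in Kelvin
    saturation: float = 1.0          # 1.0 = unchanged


def apply_preview(frame: np.ndarray, s: CameraSettings) -> np.ndarray:
    """Apply user-adjustable settings to one RGB frame for live preview."""
    out = frame.astype(np.float32) / 255.0
    # Exposure: scale linear intensity by 2^EV.
    out = out * (2.0 ** s.exposure_ev)
    # Crude white-balance shift: warm below 6500 K, cool above it.
    warmth = (6500.0 - s.white_balance_k) / 6500.0
    out[..., 0] *= 1.0 + 0.3 * warmth  # red channel
    out[..., 2] *= 1.0 - 0.3 * warmth  # blue channel
    # Saturation: blend each pixel with its luminance.
    luma = out @ np.array([0.299, 0.587, 0.114], dtype=np.float32)
    out = luma[..., None] + s.saturation * (out - luma[..., None])
    return (np.clip(out, 0.0, 1.0) * 255.0).astype(np.uint8)
```

Applying the adjustments in linear intensity, before re-quantizing to 8-bit, keeps the exposure control physically plausible for preview purposes.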

GraphMSE: Efficient Meta-path Selection in Semantically Aligned Feature Space for Graph Neural Networks [AAAI 2021]

  • Authors: Yi LI, Yilun JIN, Guojie SONG, Zihao ZHU, Chuan SHI, Yiming WANG
  • Abstract: Heterogeneous information networks (HINs) are ideal for describing real-world data with different types of entities and relationships. To carry out machine learning on HINs, meta-paths are widely utilized to extract semantics with pre-defined patterns, and models such as graph convolutional networks (GCNs) are thus enabled. However, previous works generally assume a fixed set of meta-paths, which is unrealistic as real-world data are overwhelmingly diverse. It is therefore appealing for meta-paths to be selected automatically given an HIN, yet existing works aiming at this problem have drawbacks, such as poor efficiency and ignoring feature heterogeneity. To address these drawbacks, we propose GraphMSE, an efficient heterogeneous GCN combined with automatic meta-path selection. Specifically, we design highly efficient meta-path sampling techniques and injectively project sampled meta-path instances to vectors. We then design a novel semantic feature space alignment, aiming to align the meta-path instance vectors and hence facilitate meta-path selection. Extensive experiments on real-world datasets demonstrate that GraphMSE outperforms state-of-the-art counterparts, identifies important meta-paths, and is dramatically (e.g., 200 times) more efficient. (A sketch of the instance-projection and alignment ideas follows this entry.)
  • Download Paper
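
A hedged PyTorch sketch of the two ideas the abstract names: injectively projecting sampled meta-path instances to vectors, then aligning those vectors in a shared semantic space. The layer shapes, mean aggregation, and squared-error alignment loss are illustrative assumptions, not the paper's exact architecture:

```python
# Illustrative sketch only: shapes, the mean aggregation, and the
# squared-error alignment loss are assumptions, not the paper's design.
import torch
import torch.nn as nn


class MetaPathEncoder(nn.Module):
    """One encoder per meta-path: instance features -> shared semantic space."""

    def __init__(self, instance_dim: int, shared_dim: int):
        super().__init__()
        # Linear projection; injective when the weight has full column rank.
        self.proj = nn.Linear(instance_dim, shared_dim)

    def forward(self, instances: torch.Tensor) -> torch.Tensor:
        # instances: (num_instances, instance_dim); each row is one sampled
        # meta-path instance (node features along the path, concatenated).
        vecs = self.proj(instances)  # (num_instances, shared_dim)
        return vecs.mean(dim=0)      # aggregate into one semantic vector


def alignment_loss(semantic_vecs: dict[str, torch.Tensor],
                   anchor: torch.Tensor) -> torch.Tensor:
    """Pull each meta-path's semantic vector toward a shared anchor so the
    vectors become comparable and uninformative meta-paths can be pruned."""
    return torch.stack([((v - anchor) ** 2).mean()
                        for v in semantic_vecs.values()]).sum()
```

Once all meta-path vectors live in one aligned space, their contributions become directly comparable, which is what makes automatic selection (pruning uninformative meta-paths) feasible.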

Real-Time Semantic Segmentation of Aerial Videos Based on Bilateral Segmentation Network [IGARSS 2021]

  • Authors: Yihao ZUO, Junli YANG, Zihao ZHU, Ruizhe LI, Yuhan ZHOU, Yutong ZHENG
  • Abstract: In recent years, deep learning algorithms have been widely used in semantic segmentation of aerial images. However, most current research in this field focuses on images rather than videos. In this paper, we address the problem of real-time aerial video semantic segmentation with BiSeNet [1]. Since BiSeNet was originally proposed for semantic segmentation of natural city-scene images, a corresponding dataset is needed to ensure effective transfer learning when applying it to aerial video segmentation. We therefore build a UAV streetscape sequence dataset (USSD) to fill the dataset gap in this field and facilitate our research. Evaluation on USSD shows that BiSeNet outperforms other state-of-the-art methods: it achieves 79.26% mIoU and 93.37% OA at 148.7 FPS on an NVIDIA Tesla V100 for 1920×1080 input frames, satisfying the demands of aerial video semantic segmentation with a competitive balance of accuracy and speed. The aerial video semantic segmentation results are provided at Our Repository. (A sketch of the reported metrics, mIoU and OA, follows this entry.)
  • Download Paper
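
For reference, a minimal NumPy sketch of the two reported metrics, mean intersection-over-union (mIoU) and overall accuracy (OA), computed from a confusion matrix over predicted and ground-truth label maps; the class count is a caller-supplied assumption:

```python
# Standard metric definitions; num_classes is a caller-supplied assumption.
import numpy as np


def confusion_matrix(pred: np.ndarray, gt: np.ndarray,
                     num_classes: int) -> np.ndarray:
    """Accumulate a (num_classes x num_classes) confusion matrix
    from integer label maps of identical shape."""
    mask = (gt >= 0) & (gt < num_classes)
    idx = num_classes * gt[mask].astype(np.int64) + pred[mask].astype(np.int64)
    return np.bincount(idx, minlength=num_classes ** 2).reshape(num_classes,
                                                                num_classes)


def miou_and_oa(cm: np.ndarray) -> tuple[float, float]:
    """mIoU = mean per-class TP / (TP + FP + FN); OA = total TP / all pixels."""
    tp = np.diag(cm).astype(np.float64)
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp
    iou = tp / np.maximum(union, 1)  # guard against empty classes
    return float(iou.mean()), float(tp.sum() / max(cm.sum(), 1))
```

For video, per-frame confusion matrices can simply be summed over the sequence before computing the final scores.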