Welcome to the SIAT Video Team (SVT), composed of members from the High Performance Computing Center (HPCC), Shenzhen Institute of Advanced Technology (SIAT), Chinese Academy of Sciences (CAS). We have long been engaged in multimedia communications and visual signal processing for 2D/3D and VR/AR videos, including video coding, visual signal pre-/post-processing, and computational visual perception. We also pursue challenging problems in emerging areas such as VR/AR and AI.
Join Us
Deep Learning Based Just Noticeable Difference and Perceptual Quality Prediction Models for Compressed Video
IEEE Transactions on Circuits and Systems for Video Technology (IEEE T-CSVT), 2021. Yun Zhang, Huanhua Liu, You Yang, Xiaoping Fan, Sam Kwong, C. C. Jay Kuo Full-Text
Highly Efficient Multiview Depth Coding Based on Histogram Projection and Allowable Depth Distortion
IEEE Transactions on Image Processing (IEEE T-IP), 2021. Yun Zhang*, Linwei Zhu, Raouf Hamzaoui, Sam Kwong, Yo-Sung Ho Full-Text
Projection Invariant Feature and Visual Saliency-Based Stereoscopic Omnidirectional Image Quality Assessment
IEEE Transactions on Broadcasting (IEEE T-BC), 2021. Xuemei Zhou, Yun Zhang*, Na Li, Xu Wang, Yang Zhou, and Yo-Sung Ho Full-Text
Learning-based Satisfied User Ratio Prediction for Symmetrically and Asymmetrically Compressed Stereoscopic Images
IEEE Transactions on Multimedia (IEEE T-MM), 2021. Chunling Fan, Yun Zhang*, Raouf Hamzaoui, Qingshan Jiang, Djemel Ziou Full-Text
Deep Learning-Based Chroma Prediction for Intra Versatile Video Coding
IEEE Transactions on Circuits and Systems for Video Technology (IEEE T-CSVT), 2020. Linwei Zhu, Yun Zhang*, Shiqi Wang, Sam Kwong, Xin Jin, and Yu Qiao Full-Text
Machine Learning Based Video Coding Optimizations: A Survey
Information Sciences (INS), Elsevier, 2020. Yun Zhang, Sam Kwong*, Shiqi Wang Full-Text
Deep Learning Based Picture-Wise Just Noticeable Distortion Prediction Model for Image Compression
IEEE Transactions on Image Processing (IEEE T-IP), 2020. Huanhua Liu, Yun Zhang*, Huan Zhang, Chunling Fan, Sam Kwong, C. C. Jay Kuo, and Xiaoping Fan Full-Text
Sparse Representation based Video Quality Assessment for Synthesized 3D Videos
IEEE Transactions on Image Processing (IEEE T-IP), 2020. Yun Zhang*, Huan Zhang, Mei Yu, Sam Kwong, and Yo-Sung Ho Full-Text
Generative Adversarial Network Based Intra Prediction for Video Coding
IEEE Transactions on Multimedia (IEEE T-MM), 2020. Linwei Zhu, Sam Kwong, Yun Zhang, Shiqi Wang, and Xu Wang Full-Text
SIAT Synthesized Video Quality Database Project Page
We developed a synthesized video quality database that includes ten different MVD sequences and 140 synthesized videos at 1024×768 and 1920×1088 resolution. For each sequence, 14 different texture/depth quantization combinations were used to generate texture/depth view pairs with compression distortion. A total of 56 subjects participated in the experiment, and each synthesized sequence was rated by 40 subjects using the single stimulus paradigm with a continuous score. The Difference Mean Opinion Scores (DMOS) are provided.
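For reference, the sketch below illustrates how DMOS values are commonly derived from raw subjective scores. It shows the general convention only and is not necessarily the exact processing pipeline used for this database.

```python
import numpy as np

def compute_dmos(ref_scores, dist_scores):
    """Illustrative DMOS computation (a common convention, not necessarily
    the exact pipeline used for the SIAT database).

    ref_scores, dist_scores: arrays of shape (num_subjects,), holding the
    raw continuous scores each subject gave to the reference sequence and
    to the distorted (synthesized) sequence.
    """
    ref_scores = np.asarray(ref_scores, dtype=float)
    dist_scores = np.asarray(dist_scores, dtype=float)
    # Difference score per subject: distortion relative to the reference.
    diff = ref_scores - dist_scores
    # DMOS is the mean difference score over subjects; its standard
    # deviation is often reported alongside it.
    return diff.mean(), diff.std(ddof=1)

# Example: 40 subjects rated one synthesized sequence on a 0-100 scale.
rng = np.random.default_rng(0)
ref = rng.uniform(80, 95, size=40)
dist = rng.uniform(55, 75, size=40)
dmos, dmos_std = compute_dmos(ref, dist)
print(f"DMOS = {dmos:.2f} (std = {dmos_std:.2f})")
```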
SIAT Depth Quality Database Project Page
We developed a stereoscopic video depth quality database that includes ten different stereoscopic sequences and 160 distorted stereo videos at 1920×1080 resolution. The ten sequences come from the Nantes-Madrid-3D-Stereoscopic-V1 (NAMA3DS1) database, which covers four categories of impairment: H.264 coding, JPEG2000 coding, down-sampling, and sharpening. However, NAMA3DS1 considers only symmetric distortions. Since both symmetric and asymmetric distortions need to be studied, we generated additional stereoscopic videos with asymmetric distortions. The database contains 90 symmetrically distorted and 70 asymmetrically distorted video pairs. Thirty subjects (24 male, 6 female) participated in the symmetric distortion experiment, and 24 subjects (19 male, 5 female) participated in the asymmetric distortion experiment.
Picture-level JND Database (Symmetric & Asymmetric) Project Page
We study the Picture-level Just Noticeable Difference (PJND) of symmetrically and asymmetrically compressed stereoscopic images for JPEG2000 and H.265 intra coding. We conducted interactive subjective quality assessment tests to determine the PJND point using both a pristine image and a distorted image as references. We generated two PJND-based stereo image datasets: the Shenzhen Institutes of Advanced Technology picture-level Just noticeable difference-based Symmetric Stereo Image dataset (SIAT-JSSI) and the Shenzhen Institutes of Advanced Technology picture-level Just noticeable difference-based Asymmetric Stereo Image dataset (SIAT-JASI). Each dataset includes ten source images. Both PJNDPRI and PJNDDRI are provided: PJNDPRI gives the minimum perceivable distortion relative to a pristine reference image, while PJNDDRI gives the minimum perceivable distortion relative to a distorted reference image.
Ultra-High Definition 3D Video Live System Project Page
The ultra-high definition 3D video live and on-demand system aims to solve problems in processing, storage, transmission, and quality evaluation, providing a realistic and immersive viewing experience. The system can be widely applied in film and television production, digital games, remote control, cultural relic protection, military simulation, and other fields.
VR/360° Video Projection Conversion Software Project Page
Projection conversion is one of the essential procedures in virtual reality / 360-degree panoramic video technology. The projection format of a panoramic video determines its quality and compression efficiency. Selecting an appropriate projection format for each application scenario can effectively reduce transmission bandwidth and storage requirements while providing customers with a high-quality virtual reality experience.
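As a hypothetical illustration of the kind of operation projection conversion involves (not code from the released software), the sketch below maps a viewing direction to pixel coordinates in an equirectangular (ERP) panorama, which is the basic resampling step behind converting between projection formats.

```python
import math

def direction_to_erp(dx, dy, dz, width, height):
    """Map a viewing direction (dx, dy, dz) to equirectangular (ERP) pixel
    coordinates. This is a minimal sketch of the core resampling step used
    when converting between panoramic projection formats; conventions for
    axes and pixel origins vary between systems.
    """
    # Longitude in [-pi, pi], latitude in [-pi/2, pi/2].
    lon = math.atan2(dx, dz)
    lat = math.asin(dy / math.sqrt(dx * dx + dy * dy + dz * dz))
    # Normalize to [0, 1] and scale to pixel coordinates.
    u = (lon / (2.0 * math.pi) + 0.5) * width
    v = (0.5 - lat / math.pi) * height
    return u, v

# Example: the direction straight ahead (+z) lands at the image center.
print(direction_to_erp(0.0, 0.0, 1.0, 3840, 1920))  # ~ (1920.0, 960.0)
```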
VR Video Live System Project Page
The immersive virtual reality video live system allows customers to watch 4K ultra-high-definition videos on demand and in live broadcast, and provides a 360-degree, high-quality, realistic, and interactive visual experience.
JND Prediction Software Project Page
The distortion perceptron is a deep learning software platform built on the PW-JND (Picture-Wise Just Noticeable Difference) prediction model. It targets VR image and video compression, where bandwidth and storage are serious bottlenecks, and maximizes compression efficiency while keeping distortion below the JND threshold of the human eye.
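As a rough sketch of the idea only (not the team's actual software), the example below searches for the strongest compression whose distortion a predictor still judges imperceptible. The predictor here is a crude placeholder standing in for a PW-JND-style model, and JPEG via Pillow is used purely as a placeholder codec.

```python
from io import BytesIO

import numpy as np
from PIL import Image  # Pillow; JPEG serves only as an illustrative codec


def distortion_noticeable(original, compressed):
    """Placeholder for a PW-JND-style predictor: returns True if the
    compression distortion is predicted to be visible. The real system uses
    a trained deep model; this stand-in compares a crude pixel-difference
    statistic against an arbitrary threshold."""
    diff = np.abs(np.asarray(original, dtype=float) - np.asarray(compressed, dtype=float))
    return diff.mean() > 2.0  # arbitrary illustrative threshold


def encode_jpeg(image, quality):
    """Encode with JPEG at the given quality; return (bytes, decoded image)."""
    buf = BytesIO()
    image.save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return buf.getvalue(), Image.open(buf).convert("RGB")


def jnd_guided_compress(path_in):
    """Find the strongest compression whose distortion is still predicted to
    be unnoticeable, i.e. operate near the picture-wise JND point."""
    original = Image.open(path_in).convert("RGB")
    best = None
    for quality in range(95, 4, -5):  # from light to heavy compression
        data, decoded = encode_jpeg(original, quality)
        if distortion_noticeable(original, decoded):
            break  # crossed the JND point; keep the previous setting
        best = (quality, data)
    return best

# Example usage (the path is a placeholder):
# quality, jpeg_bytes = jnd_guided_compress("frame.png")
```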
Note: These resources must not be used for commercial purposes. If you have any questions, please contact us at yun.zhang@siat.ac.cn.