Bandwidth constraints in live streaming require video codecs to balance compression strength and frame rate, yet the perceptual consequences of this trade-off remain underexplored. We present the high frame rate live streaming (HFR-LS) dataset, comprising 384 subject-rated 1080p videos encoded at multiple target bitrates by systematically varying compression strength and frame rate. A single-stimulus, hidden-reference subjective study shows that frame rate has a noticeable effect on perceived quality, and interacts with both bitrate and source content. The HFR-LS dataset is available at https://github.com/real-hjq/HFR-LS to facilitate research on bitrate-constrained live streaming.
Thief of Truth is a first-person Virtual Reality (VR) comic that explores the relationship between humans and artificial intelligence (AI). The work tells the story of a mind-uploaded human reborn as a new subject while interacting with an AI that is searching for the meaning of life. To experiment with the expandability of VR comics, the work addresses three problems. First, the comic is designed around VR's viewing-control effect. Second, VR controller-based interaction deepens the player's immersion in the work. Third, a method for making VR comics more accessible was devised. This work aims to serve as an example of experimentation in VR comics.
Traditional error detection and correction codes focus on bit-level fidelity, which is insufficient for emerging technologies like eXtended Reality (XR) and holographic communications that require high-data-rate, low-latency systems. Bit-level metrics cannot comprehensively evaluate Quality-of-Service (QoS) in these scenarios. This letter proposes TopoCode, which leverages Topological Data Analysis (TDA) and persistent homology to encode topological information for message-level error detection and correction. It introduces minimal redundancy while enabling effective data reconstruction, especially in low Signal-to-Noise Ratio (SNR) conditions. TopoCode offers a promising approach to meet the demands of next-generation communication systems that prioritize semantic accuracy and message-level integrity.
Pedro Martin, António Rodrigues, João Ascenso
et al.
This short paper proposes a new database, NeRF-QA, containing 48 videos synthesized with seven NeRF-based methods, along with perceived quality scores obtained from subjective assessment tests; both real and synthetic 360-degree scenes were considered for the video selection. The database will make it possible to evaluate how well existing objective quality metrics suit NeRF-based synthesized views, and to support the development of new quality metrics specific to this case.
To date, sonification apps are rare. Music apps, on the other hand, are widely used, and smartphone users like to play with music. In this manuscript, we present Mixing Levels, a spirit-level sonification based on music mixing. Tilting the smartphone adjusts the volumes of five musical instruments in a rock music loop; only when the device is perfectly level are all instruments in the mix clearly audible. The app is meant to be both useful and fun. Because it behaves like a music mixing console, people enjoy interacting with Mixing Levels, so learning the sonification becomes a playful experience.
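The tilt-to-volume mapping could be sketched as follows. This is a minimal illustration of the stated idea (tilt controls per-instrument volume, with full audibility only at level); the instrument count comes from the abstract, but the specific gain curve and sensitivities are assumptions, not the app's actual implementation.

```python
import math

def mix_gains(pitch_deg, roll_deg, n_instruments=5, spread_deg=10.0):
    """Map device tilt to per-instrument gains (illustrative sketch only).
    Each instrument fades out at a different rate as the phone tilts, so
    only at level (0, 0) are all instruments at full volume."""
    tilt = math.hypot(pitch_deg, roll_deg)  # overall deviation from level
    gains = []
    for i in range(n_instruments):
        # hypothetical per-instrument sensitivity: later instruments fade faster
        sensitivity = (i + 1) / spread_deg
        gains.append(max(0.0, 1.0 - sensitivity * tilt))
    return gains
```

A real app would read `pitch_deg`/`roll_deg` from the device accelerometer and apply the gains to looping audio tracks.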
Luca Rossetto, Klaus Schoeffmann, Abraham Bernstein
For research results to be comparable, it is important to have common datasets for experimentation and evaluation. The size of such datasets, however, can be an obstacle to their use. The Vimeo Creative Commons Collection (V3C) is a video dataset designed to be representative of video content found on the web, containing roughly 3800 hours of video in total, split into three shards. In this paper, we present insights on the second of these shards (V3C2) and discuss their implications for research areas, such as video retrieval, for which the dataset might be particularly useful. We also provide all the extracted data in order to simplify the use of the dataset.
This paper is a brief report for the MuSe 2020 challenge. We present our solution for the MuSe-Wild sub-challenge, whose aim is to investigate sentiment analysis methods in real-world conditions. Our solution achieves best CCC performance of 0.4670 for arousal and 0.3571 for valence on the challenge validation set, outperforming the baseline system with corresponding CCCs of 0.3078 and 0.1506.
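The concordance correlation coefficient (CCC) used as the challenge metric is standard (Lin's CCC) and can be computed as follows; this is the textbook formula, not code from the paper.

```python
def ccc(x, y):
    """Lin's concordance correlation coefficient:
    CCC = 2*cov(x, y) / (var(x) + var(y) + (mean_x - mean_y)^2).
    Population (1/n) moments are used, as is conventional for CCC."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2 * cov / (vx + vy + (mx - my) ** 2)
```

Unlike Pearson correlation, CCC also penalizes shifts in mean and scale, which is why it is preferred for continuous emotion prediction.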
We present the ari package for automatically generating technology-focused educational videos. The goal of the package is to create reproducible videos, with the ability to change and update video content seamlessly. We present several examples of generating videos including using R Markdown slide decks, PowerPoint slides, or simple images as source material. We also discuss how ari can help instructors reach new audiences through programmatically translating materials into other languages.
In this paper, a new steganographic method is presented that minimizes distortion in the stego image. The proposed encoding algorithm exploits the DCT rounding error, optimizing it so as to reduce distortion while allowing a higher embedding capacity; it produces less distortion than existing methods such as the F5 algorithm.
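The rounding-error idea can be sketched as follows: a real-valued DCT coefficient divided by its quantization step lies between two consecutive integers of opposite parity, so rounding toward the neighbor whose parity matches the payload bit embeds one bit at a cost never exceeding one quantization step. This is a hedged sketch of the general principle, not the paper's exact algorithm.

```python
import math

def embed_bit(coeff, q, bit):
    """Embed one bit in a quantized DCT coefficient (illustrative sketch).
    coeff: unquantized DCT coefficient; q: quantization step; bit: 0 or 1.
    Returns the quantized value whose LSB carries the bit."""
    v = coeff / q
    lo = math.floor(v)
    hi = lo + 1
    # lo and hi are consecutive integers, so exactly one has the wanted parity
    candidates = [c for c in (lo, hi) if (c & 1) == bit]
    return min(candidates, key=lambda c: abs(v - c))
```

Extraction is just the LSB of the quantized coefficient, and the added distortion is the distance between `v` and the chosen integer, which the embedder can use to prefer coefficients where that distance is smallest.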
This paper proposes a robust watermarking method for uncompressed video data against the H.264/AVC and H.265/HEVC compression standards. We embed the watermark data in the mid-range transform coefficients of blocks that are least similar to their corresponding blocks in the previous and next frames. This makes the watermark robust against compression standards that use the inter prediction technique. Since H.264/AVC and H.265/HEVC, like earlier video compression standards, use inter prediction for motion compensation, the proposed method is well suited to them. Simulation results show adequate robustness and transparency of our watermarking scheme.
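The block-selection step could look like the following sketch: score each block of the current frame by its dissimilarity to the co-located blocks in the previous and next frames, and keep the least similar ones for embedding. The block size, similarity measure (mean absolute difference), and number of selected blocks are assumptions for illustration, not the paper's exact choices.

```python
def block_dissimilarity(a, b, y, x, bs):
    """Mean absolute difference between co-located bs-by-bs blocks of frames
    a and b (frames are nested lists of pixel values)."""
    return sum(abs(a[y + i][x + j] - b[y + i][x + j])
               for i in range(bs) for j in range(bs)) / (bs * bs)

def select_blocks(prev_f, cur_f, next_f, bs=8, k=10):
    """Pick the k blocks of cur_f least similar to their co-located blocks
    in the previous AND next frames (illustrative sketch). Low temporal
    similarity means inter prediction is unlikely to discard the residual
    where the watermark would live."""
    h, w = len(cur_f), len(cur_f[0])
    scored = []
    for y in range(0, h - bs + 1, bs):
        for x in range(0, w - bs + 1, bs):
            d = min(block_dissimilarity(cur_f, prev_f, y, x, bs),
                    block_dissimilarity(cur_f, next_f, y, x, bs))
            scored.append((d, (y, x)))
    scored.sort(key=lambda s: -s[0])  # most dissimilar first
    return [pos for _, pos in scored[:k]]
```

The watermark would then be added to mid-frequency transform coefficients of the returned blocks.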
Krzysztof Wegner, Tomasz Grajek, Jakub Stankowski
et al.
HEVC (MPEG-H Part 2 and H.265) is a new coding technology expected to be deployed on the market along with new video services in the near future. HEVC is the successor to the currently widely used AVC (MPEG-4 Part 10 and H.264). In this paper, we report the quality coding gains obtained with a Cascaded Pixel Domain Transcoder converting AVC-coded material to the HEVC standard. Extensive experiments showed that transcoding with bitrate reduction achieves better rate-distortion performance than compressing the original video sequence with AVC at the same (reduced) bitrate.
Nematollah Zarmehi, Morteza Banagar, Mohammad Ali Akhaee
In this paper, we investigate an additive video watermarking method for the H.264 standard in the presence of Laplacian noise. In some applications, where some pixels or a region of a frame are lost, a Laplacian noise model is more appropriate than a Gaussian one. The embedding is performed in the transform domain, and both an optimum and a sub-optimum decoder are derived for the proposed Laplacian model. Simulation results show that the proposed watermarking scheme performs well while maintaining the transparency required for watermarking applications.
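Under a Laplacian noise model, the maximum-likelihood bit decision for an additive watermark replaces the Gaussian correlator's squared (L2) distance with an absolute (L1) distance, since the Laplacian log-likelihood is proportional to the negative L1 norm of the residual. The sketch below illustrates this standard derivation; it is not the paper's exact decoder.

```python
def ml_decode(y, w):
    """Optimum (ML) decision for bit m in y = x + m*w under Laplacian noise:
    choose the sign m that minimizes the L1 residual sum |y_i - m*w_i|."""
    d_plus = sum(abs(yi - wi) for yi, wi in zip(y, w))
    d_minus = sum(abs(yi + wi) for yi, wi in zip(y, w))
    return 1 if d_plus <= d_minus else -1

def correlator_decode(y, w):
    """Sub-optimum linear correlator, which would be ML under Gaussian noise."""
    return 1 if sum(yi * wi for yi, wi in zip(y, w)) >= 0 else -1
```

The gap between the two decoders is exactly what such papers quantify: the L1 detector is more robust to the heavy tails of Laplacian noise.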
In this paper, we present image compression methods based on the eigenvalue decomposition of normal matrices. The proposed methods are convenient and self-explanatory, requiring fewer and simpler computations than some existing methods. The image is first transformed into the space of normal matrices; the properties of the spectral decomposition are then exploited to obtain compressed images. Experimental results illustrate the validity of the methods.
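The general idea can be sketched as a low-rank spectral approximation. One well-known way to map an arbitrary (possibly non-square) image A into a normal matrix is the symmetric embedding [[0, A], [Aᵀ, 0]], whose eigenpairs encode A's singular structure; keeping only the largest-magnitude eigenvalues compresses the image. This embedding is a standard construction assumed for illustration, not necessarily the paper's transform.

```python
import numpy as np

def compress_symmetric(img, k):
    """Rank-limited reconstruction via eigendecomposition of a symmetric
    (hence normal) matrix built from the image (illustrative sketch).
    Keeps the k largest-magnitude eigenpairs; storage drops from m*n values
    to k eigenpairs."""
    A = np.asarray(img, dtype=float)
    m, n = A.shape
    S = np.zeros((m + n, m + n))
    S[:m, m:] = A          # symmetric embedding [[0, A], [A^T, 0]]
    S[m:, :m] = A.T
    vals, vecs = np.linalg.eigh(S)              # symmetric => real spectrum
    idx = np.argsort(np.abs(vals))[::-1][:k]    # k dominant eigenvalues
    S_k = (vecs[:, idx] * vals[idx]) @ vecs[:, idx].T
    return S_k[:m, m:]                          # approximated image block
```

Since the embedding's nonzero eigenvalues come in ± pairs equal to A's singular values, keeping 2r eigenpairs reproduces the rank-r SVD approximation of A.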
Image width is important for image understanding. We propose a novel method to estimate the width of a JPEG image when it is not available. The key idea is that the distance between two decoded MCUs (Minimum Coded Units) that are vertically adjacent is usually small; we measure it as the average Euclidean distance between the pixels in the bottom row of the top MCU and the top row of the bottom MCU. Experimental results on the PASCAL VOC 2010 challenge dataset and the USC-SIPI image database show the high performance of the proposed approach.
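The idea can be sketched as a seam-cost search: for a candidate width of W MCUs per row, the MCUs at flat indices i and i+W would be vertically adjacent, so the correct width minimizes the average discontinuity across those seams. The sketch below uses 8x8 luma MCUs and a mean absolute difference (the paper uses Euclidean distance); both are simplifications for illustration.

```python
def seam_cost(mcus, mcus_per_row):
    """Average boundary discontinuity if the flat MCU sequence is laid out
    with `mcus_per_row` MCUs per row. mcus: list of 8x8 pixel blocks in
    decode order."""
    total, count = 0.0, 0
    for i in range(len(mcus) - mcus_per_row):
        top, bottom = mcus[i], mcus[i + mcus_per_row]
        # compare bottom row of the top MCU with top row of the bottom MCU
        total += sum(abs(top[7][j] - bottom[0][j]) for j in range(8))
        count += 8
    if count == 0:
        return float("inf")
    return total / count

def estimate_width(mcus, max_mcus_per_row):
    """Return the candidate width (in MCUs) with the smallest seam cost."""
    return min(range(1, max_mcus_per_row + 1),
               key=lambda w: seam_cost(mcus, w))
```

For the true width, vertically adjacent MCUs come from neighboring image rows and meet at a smooth seam, so their boundary cost is minimal.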
Pawel Kopiczko, Wojciech Mazurczyk, Krzysztof Szczypiorski
The paper proposes StegTorrent, a new network steganographic method for the popular P2P file-transfer service BitTorrent. It is based on modifying the order of data packets in the peer-to-peer data exchange protocol. Unlike other existing steganographic methods that modify packet order, it does not require any synchronization. Experimental results from a prototype implementation show that it provides a high steganographic bandwidth of up to 270 b/s while introducing little transmission distortion and remaining difficult to detect.
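Packet-order steganography in general encodes a payload in the permutation of n packets, giving log2(n!) bits per group. The classic way to do this is the factorial number system (Lehmer code), sketched below; this illustrates the generic technique, not StegTorrent's specific protocol.

```python
from math import factorial

def encode_order(payload, n):
    """Hide an integer payload (0 <= payload < n!) in the send order of n
    packets, using the factorial number system (illustrative sketch)."""
    assert 0 <= payload < factorial(n)
    remaining = list(range(n))  # packet indices not yet scheduled
    order = []
    for i in range(n, 0, -1):
        digit, payload = divmod(payload, factorial(i - 1))
        order.append(remaining.pop(digit))
    return order

def decode_order(order):
    """Recover the payload from the observed packet order."""
    remaining = sorted(order)
    payload = 0
    for i, pkt in enumerate(order):
        digit = remaining.index(pkt)
        payload += digit * factorial(len(order) - 1 - i)
        remaining.pop(digit)
    return payload
```

For example, 4 packets carry log2(24) ≈ 4.58 bits per group; the covert bandwidth then depends on how many packet groups per second the carrier traffic provides.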
This paper develops a new video compression approach based on underdetermined blind source separation, which can efficiently increase the compression ratio and is combined here with various off-the-shelf codecs. Combined with MPEG-2, the compression ratio improves by slightly more than 33%; combined with H.264, a 4x-12x higher compression ratio can be achieved with acceptable PSNR, depending on the video sequence.
In this paper, we suggest a general model for fixed-valued impulse noise and propose a two-stage method for high-density noise suppression that preserves image details. In the first stage, we apply an iterative impulse detector that exploits the image entropy to identify corrupted pixels, and then employ an Adaptive Iterative Mean filter to restore them. The filter is adaptive in the number of iterations, which differs for each noisy pixel according to its Euclidean distance from the nearest uncorrupted pixel. Experimental results show that the proposed filter is fast and outperforms the best existing techniques in both objective and subjective performance measures.
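The two-stage idea can be sketched as follows. Detection is simplified here to flagging pixels equal to the fixed noise values (the paper's detector uses image entropy), and restoration iteratively replaces each flagged pixel with the mean of its already-clean 3x3 neighbors, so pixels farther from clean ones take more iterations, mirroring the distance-adaptive iteration count.

```python
def restore(img, noise_values=(0, 255)):
    """Simplified sketch of a two-stage impulse-noise filter (not the paper's
    exact algorithm). img: nested list of pixel values. Flagged pixels are
    restored outward from clean regions, one 'ring' per iteration."""
    h, w = len(img), len(img[0])
    noisy = {(i, j) for i in range(h) for j in range(w)
             if img[i][j] in noise_values}
    out = [row[:] for row in img]
    while noisy:
        restored = set()
        for i, j in noisy:
            # mean of clean (non-noisy) 3x3 neighbours, if any exist yet
            vals = [out[i + di][j + dj]
                    for di in (-1, 0, 1) for dj in (-1, 0, 1)
                    if (di, dj) != (0, 0)
                    and 0 <= i + di < h and 0 <= j + dj < w
                    and (i + di, j + dj) not in noisy]
            if vals:
                out[i][j] = sum(vals) / len(vals)
                restored.add((i, j))
        if not restored:  # no clean neighbours anywhere: give up
            break
        noisy -= restored
    return out
```

Each pass restores the noisy pixels bordering clean ones, so the number of iterations a pixel waits grows with its distance from the nearest uncorrupted pixel, which is the adaptivity the abstract describes.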