Source author record

Siyuan Zhou

Siyuan Zhou appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Computer Vision, eess.AS, Information Theory, Machine Learning, math.IT, Networking and Internet Architecture, Robotics, Software Engineering, Sound

Catalog footprint

What is connected

6 works

9 topics

4 close collaborators

Preprint, 2022, arXiv

Finding Fallen Objects Via Asynchronous Audio-Visual Integration

How an object looks and how it sounds provide complementary reflections of its physical properties. In many settings, cues from vision and audition arrive asynchronously but must be integrated, as when we hear an object dropped on the floor and then must find it. In this paper, we introduce a setting in which to study multi-modal object localization in 3D virtual environments. An object is dropped somewhere in a room. An embodied robot agent, equipped with a camera and microphone, must determine what object has been dropped, and where, by combining audio and visual signals with knowledge of the underlying physics. To study this problem, we have generated a large-scale dataset, the Fallen Objects dataset, that includes 8000 instances of 30 physical object categories in 64 rooms. The dataset uses the ThreeDWorld platform, which can simulate physics-based impact sounds and complex physical interactions between objects in a photorealistic setting. As a first step toward addressing this challenge, we develop a set of embodied agent baselines, based on imitation learning, reinforcement learning, and modular planning, and perform an in-depth analysis of the challenge of this new task.

Preprint, 2022, arXiv

From Pixel to Patch: Synthesize Context-aware Features for Zero-shot Semantic Segmentation

Zero-shot learning has been actively studied for the image classification task to relieve the burden of annotating image labels. Interestingly, the semantic segmentation task requires more labor-intensive pixel-wise annotation, but zero-shot semantic segmentation has attracted only limited research interest. Thus, we focus on zero-shot semantic segmentation, which aims to segment unseen objects with only category-level semantic representations provided for unseen categories. In this paper, we propose a novel Context-aware feature Generation Network (CaGNet), which can synthesize context-aware pixel-wise visual features for unseen categories based on category-level semantic representations and pixel-wise contextual information. The synthesized features are used to finetune the classifier to enable segmenting unseen objects. Furthermore, we extend pixel-wise feature generation and finetuning to patch-wise feature generation and finetuning, which additionally considers inter-pixel relationships. Experimental results on Pascal-VOC, Pascal-Context, and COCO-stuff show that our method significantly outperforms existing zero-shot semantic segmentation methods. Code is available at https://github.com/bcmi/CaGNetv2-Zero-Shot-Semantic-Segmentation.

Preprint, 2020, arXiv

Context-aware Feature Generation for Zero-shot Semantic Segmentation

Existing semantic segmentation models rely heavily on dense pixel-wise annotations. To reduce the annotation burden, we focus on a challenging task named zero-shot semantic segmentation, which aims to segment unseen objects with zero annotations. This task can be accomplished by transferring knowledge across categories via semantic word embeddings. In this paper, we propose a novel context-aware feature generation method for zero-shot segmentation named CaGNet. In particular, observing that a pixel-wise feature depends strongly on its contextual information, we insert a contextual module in a segmentation network to capture the pixel-wise contextual information, which guides the process of generating more diverse and context-aware features from semantic word embeddings. Our method achieves state-of-the-art results on three benchmark datasets for zero-shot segmentation. Code is available at: https://github.com/bcmi/CaGNet-Zero-Shot-Semantic-Segmentation.

Preprint, 2020, arXiv

MLD: An Intelligent Memory Leak Detection Scheme Based on Defect Modes in Smart Grids

With the growing scale and complexity of smart grid software, the detection of smart grid software defects has become a research hotspot. Because the existing smart grid code base is large, existing defect detection algorithms for smart grids have low efficiency and accuracy. We propose MLD, an intelligent memory leak detection scheme for smart grids based on defect modes. Based on an analysis of existing memory leak defect modes, we summarize memory operation behaviors (allocation, release and transfer) and present a state machine model. We employ a fuzzy matching algorithm based on regular expressions to determine the memory operation behaviors and then analyze the changes in the state machine to assess vulnerabilities in the source code. To improve the efficiency of detection and solve the problem of repeated detection at function call points, we propose a function summary method for memory operation behaviors. The experimental results demonstrate that the proposed method has high detection speed and accuracy. The proposed algorithm can identify defects in smart grid operation software and ensure the safe operation of the grid.
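
The idea in this abstract, matching memory operations with regular expressions and tracking them in a state machine, can be illustrated with a minimal sketch. This is not the authors' implementation: the patterns, the state names, and the `find_leaks` function are all hypothetical, and the sketch handles only simple single-function C snippets with one operation per line.

```python
import re
from enum import Enum


class State(Enum):
    """Hypothetical states for one pointer variable (not taken from the paper)."""
    ALLOCATED = "allocated"
    TRANSFERRED = "transferred"
    FREED = "freed"


# Hypothetical patterns for the three behaviors named in the abstract.
ALLOC = re.compile(r"\b(\w+)\s*=\s*(?:\([^)]*\)\s*)?(?:malloc|calloc|realloc)\s*\(")
FREE = re.compile(r"\bfree\s*\(\s*(\w+)\s*\)")
TRANSFER = re.compile(r"\b(\w+)\s*=\s*(\w+)\s*;")
RETURN = re.compile(r"\breturn\s+(\w+)\s*;")


def find_leaks(lines):
    """Return the sorted names of variables still ALLOCATED after the last line."""
    state = {}  # variable name -> State
    for line in lines:
        m = ALLOC.search(line)
        if m:
            # Allocation: the assigned variable now owns a block.
            state[m.group(1)] = State.ALLOCATED
            continue
        m = FREE.search(line)
        if m and m.group(1) in state:
            # Release: the block owned by this variable is freed.
            state[m.group(1)] = State.FREED
            continue
        m = TRANSFER.search(line)
        if m and state.get(m.group(2)) is State.ALLOCATED:
            # Transfer: ownership moves from the right-hand to the left-hand variable.
            state[m.group(1)] = State.ALLOCATED
            state[m.group(2)] = State.TRANSFERRED
            continue
        m = RETURN.search(line)
        if m and state.get(m.group(1)) is State.ALLOCATED:
            # Returning the pointer hands ownership to the caller.
            state[m.group(1)] = State.TRANSFERRED
    return sorted(name for name, s in state.items() if s is State.ALLOCATED)


snippet = [
    "char *buf = malloc(64);",
    "char *tmp = malloc(16);",
    "char *p = buf;",
    "free(p);",
    "return 0;",
]
print(find_leaks(snippet))
```

In this snippet `buf` is transferred to `p` and then freed, while `tmp` is never released, so only `tmp` is reported. A real detector would also need control-flow handling and the function summaries the abstract mentions, which this sketch omits.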

Preprint, 2014, arXiv

Closed-form Output Statistics of MIMO Block-Fading Channels

The information that can be transmitted through a wireless channel, with multiple-antenna equipped transmitter and receiver, is crucially influenced by the channel behavior as well as by the structure of the input signal. We characterize in closed form the probability density function (pdf) of the output of MIMO block-fading channels, for an arbitrary SNR value. Our results provide compact expressions for such output statistics, paving the way to a more detailed analytical information-theoretic exploration of communications in the presence of block fading. The analysis is carried out assuming two different structures for the input signal: the i.i.d. Gaussian distribution and a product form that has been proved to be optimal for non-coherent communication, i.e., in the absence of any channel state information. When the channel is fed by an i.i.d. Gaussian input, we assume the Gramian of the channel matrix to be unitarily invariant and derive the output statistics in both the noise-limited and the interference-limited scenarios, considering different fading distributions. When the product-form input is adopted, we provide the expressions of the output pdf as the relationship between the overall number of antennas and the fading coherence length varies. We also highlight the relation between our newly derived expressions and the results already available in the literature, and, for some cases, we numerically compute the mutual information, based on the proposed expression of the output statistics.

Preprint, 2014, arXiv

Real-Time Scheduling for Content Broadcasting in LTE

Broadcasting capabilities are one of the most promising features of upcoming LTE-Advanced networks. However, the task of scheduling broadcasting sessions is far from trivial, since it affects the available resources of several contiguous cells as well as the amount of resources that can be devoted to unicast traffic. In this paper, we present a compact, convenient model for broadcasting in LTE, as well as a set of efficient algorithms to define broadcasting areas and to actually perform content scheduling. We study the performance of our algorithms in a realistic scenario, deriving interesting insights on the possible trade-offs between effectiveness and computational efficiency.