Source author record

Yu Hao

Yu Hao appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Computer Vision, Computation and Language, cond-mat.mes-hall, Machine Learning, quant-ph, cond-mat.mtrl-sci, cond-mat.supr-con, eess.IV, eess.SP, physics.optics, Social and Information Networks

Catalog footprint

What is connected

11 works

11 topics

4 close collaborators

preprint · 2022 · arXiv

1st Place Solution to ECCV 2022 Challenge on Out of Vocabulary Scene Text Understanding: End-to-End Recognition of Out of Vocabulary Words

Scene text recognition has attracted increasing interest in recent years due to its wide range of applications in multilingual translation, autonomous driving, etc. In this report, we describe our solution to the Out of Vocabulary Scene Text Understanding (OOV-ST) Challenge, which aims to extract out-of-vocabulary (OOV) words from natural scene images. Our oCLIP-based model achieves 28.59% in h-mean, which ranks 1st in the end-to-end OOV word recognition track of the OOV Challenge at the ECCV 2022 TiE Workshop.
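For readers unfamiliar with the metric: in scene text benchmarks, h-mean usually denotes the harmonic mean of precision and recall. The sketch below assumes that definition. The function name and the sample counts are hypothetical and are not taken from the challenge's evaluation code.

```python
def h_mean(true_positives: int, num_predictions: int, num_ground_truth: int) -> float:
    """Harmonic mean of precision and recall (assumed definition of h-mean)."""
    if num_predictions == 0 or num_ground_truth == 0:
        return 0.0
    precision = true_positives / num_predictions
    recall = true_positives / num_ground_truth
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only (not challenge data).
score = h_mean(true_positives=30, num_predictions=60, num_ground_truth=100)
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a system cannot reach a high h-mean by trading recall for precision or the reverse.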

preprint · 2022 · arXiv

Detect and Approach: Close-Range Navigation Support for People with Blindness and Low Vision

People with blindness and low vision (pBLV) experience significant challenges when locating final destinations or targeting specific objects in unfamiliar environments. Furthermore, besides initially locating and orienting oneself to a target object, approaching the final target from one's present position is often frustrating and challenging, especially when one drifts away from the initially planned path to avoid obstacles. In this paper, we develop a novel wearable navigation solution that provides real-time guidance for a user to approach a target object of interest efficiently and effectively in unfamiliar environments. Our system contains two key visual computing functions: initial target object localization in 3D and continuous estimation of the user's trajectory, both based on the 2D video captured by a low-cost monocular camera mounted in front of the user's chest. These functions enable the system to suggest an initial navigation path, continuously update the path as the user moves, and offer timely recommendations for correcting the user's path. Our experiments demonstrate that our system is able to operate with an error of less than 0.5 meters both outdoors and indoors. The system is entirely vision-based and does not need other sensors for navigation, and the computation can be run on the Jetson processor in the wearable system to facilitate real-time navigation assistance.

preprint · 2022 · arXiv

Network-Aware 5G Edge Computing for Object Detection: Augmenting Wearables to "See" More, Farther and Faster

Advanced wearable devices are increasingly incorporating high-resolution multi-camera systems. As state-of-the-art neural networks for processing the resulting image data are computationally demanding, there has been growing interest in leveraging fifth generation (5G) wireless connectivity and mobile edge computing for offloading this processing to the cloud. To assess this possibility, this paper presents a detailed simulation and evaluation of 5G wireless offloading for object detection within a powerful, new smart wearable called VIS4ION, for the Blind-and-Visually Impaired (BVI). The current VIS4ION system is an instrumented book-bag with high-resolution cameras, vision processing and haptic and audio feedback. The paper considers uploading the camera data to a mobile edge cloud to perform real-time object detection and transmitting the detection results back to the wearable. To determine the video requirements, the paper evaluates the impact of video bit rate and resolution on object detection accuracy and range. A new street scene dataset with labeled objects relevant to BVI navigation is leveraged for analysis. The vision evaluation is combined with a detailed full-stack wireless network simulation to determine the distribution of throughputs and delays with real navigation paths and ray-tracing from new high-resolution 3D models in an urban environment. For comparison, the wireless simulation considers both a standard 4G-Long Term Evolution (LTE) carrier and high-rate 5G millimeter-wave (mmWave) carrier. The work thus provides a thorough and realistic assessment of edge computing with mmWave connectivity in an application with both high bandwidth and low latency requirements.

preprint · 2022 · arXiv

Runner-Up Solution to ECCV 2022 Challenge on Out of Vocabulary Scene Text Understanding: Cropped Word Recognition

This report presents our 2nd place solution to the ECCV 2022 challenge on Out-of-Vocabulary Scene Text Understanding (OOV-ST): Cropped Word Recognition. This challenge is held in the context of the ECCV 2022 workshop on Text in Everything (TiE) and aims to extract out-of-vocabulary words from natural scene images. In the competition, we first pre-train SCATTER on synthetic datasets, then fine-tune the model on the training set with data augmentations. Meanwhile, two additional models are trained specifically for long and vertical texts. Finally, we combine the outputs of models with different layers, different backbones, and different seeds to produce the final results. Our solution achieves a word accuracy of 59.45% when considering out-of-vocabulary words only.

preprint · 2020 · arXiv

Correlation-aware Unsupervised Change-point Detection via Graph Neural Networks

Change-point detection (CPD) aims to detect abrupt changes over time series data. Intuitively, effective CPD over multivariate time series should require explicit modeling of the dependencies across input variables. However, existing CPD methods either ignore the dependency structures entirely or rely on the (unrealistic) assumption that the correlation structures are static over time. In this paper, we propose a Correlation-aware Dynamics Model for CPD, which explicitly models the correlation structure and dynamics of variables by incorporating graph neural networks into an encoder-decoder framework. Extensive experiments on synthetic and real-world datasets demonstrate the advantageous performance of the proposed model on CPD tasks over strong baselines, as well as its ability to classify the change-points as correlation changes or independent changes. Keywords: Multivariate Time Series, Change-point Detection, Graph Neural Networks

preprint · 2020 · arXiv

Inductive Link Prediction for Nodes Having Only Attribute Information

Predicting the link between two nodes is a fundamental problem for graph data analytics. In attributed graphs, both the structure and attribute information can be utilized for link prediction. Most existing studies focus on transductive link prediction where both nodes are already in the graph. However, many real-world applications require inductive prediction for new nodes having only attribute information. It is more challenging since the new nodes do not have structure information and cannot be seen during the model training. To solve this problem, we propose a model called DEAL, which consists of three components: two node embedding encoders and one alignment mechanism. The two encoders aim to output the attribute-oriented node embedding and the structure-oriented node embedding, and the alignment mechanism aligns the two types of embeddings to build the connections between the attributes and links. Our model DEAL is versatile in the sense that it works for both inductive and transductive link prediction. Extensive experiments on several benchmark datasets show that our proposed model significantly outperforms existing inductive link prediction methods, and also outperforms the state-of-the-art methods on transductive link prediction.

preprint · 2015 · arXiv

Knowlege Graph Embedding by Flexible Translation

Knowledge graph embedding refers to projecting the entities and relations of a knowledge graph into continuous vector spaces. State-of-the-art methods, such as TransE, TransH, and TransR, build embeddings by treating a relation as a translation from the head entity to the tail entity. However, previous models either cannot handle reflexive/one-to-many/many-to-one/many-to-many relations properly or lack scalability and efficiency. Thus, we propose a novel method, flexible translation, named TransF, to address the above issues. TransF regards a relation as a translation between the head entity vector and the tail entity vector with flexible magnitude. To evaluate the proposed model, we conduct link prediction and triple classification on benchmark datasets. Experimental results show that our method remarkably improves performance compared with several state-of-the-art baselines.
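The translation idea in this abstract can be illustrated with a small sketch. The first function is the familiar TransE-style distance between head + relation and tail. The second is a hypothetical direction-only score, included only to show what "flexible magnitude" could mean in practice; it is an assumption made for illustration and is not presented as the paper's scoring function. All names and vectors are made up.

```python
import math
from typing import Sequence


def transe_distance(head: Sequence[float], relation: Sequence[float],
                    tail: Sequence[float]) -> float:
    """TransE-style score: Euclidean distance between head + relation and tail.
    Lower means the triple fits better."""
    return math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))


def direction_score(head: Sequence[float], relation: Sequence[float],
                    tail: Sequence[float]) -> float:
    """Hypothetical flexible-magnitude score: cosine similarity between
    head + relation and tail. Higher means the triple fits better."""
    translated = [h + r for h, r in zip(head, relation)]
    dot = sum(a * b for a, b in zip(translated, tail))
    norm = math.sqrt(sum(a * a for a in translated)) * math.sqrt(sum(b * b for b in tail))
    return dot / norm if norm else 0.0


# Toy 2-D embeddings, for illustration only.
head = [1.0, 0.0]
relation = [0.0, 1.0]
exact_tail = [1.0, 1.0]    # equals head + relation
scaled_tail = [2.0, 2.0]   # same direction as head + relation, twice the length

strict_exact = transe_distance(head, relation, exact_tail)
strict_scaled = transe_distance(head, relation, scaled_tail)
flexible_exact = direction_score(head, relation, exact_tail)
flexible_scaled = direction_score(head, relation, scaled_tail)
```

The strict distance penalizes the scaled tail, while the direction-only score treats both tails as equally good. A score of this kind lets one head and relation accept several tails along the same direction, which is the situation one-to-many relations create.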

preprint · 2015 · arXiv

Superconducting Circuitry for Quantum Electromechanical Systems

Superconducting systems have a long history of use in experiments that push the frontiers of mechanical sensing. This includes both applied and fundamental research, which at present day ranges from quantum computing research and efforts to explore Planck-scale physics to fundamental studies on the nature of motion and the quantum limits on our ability to measure it. In this paper, we first provide a short history of the role of superconducting circuitry and devices in mechanical sensing, focusing primarily on efforts in the last decade to push the study of quantum mechanics to include motion on the scale of human-made structures. This background sets the stage for the remainder of the paper, which focuses on the development of quantum electromechanical systems (QEMS) that incorporate superconducting quantum bits (qubits), superconducting transmission line resonators and flexural nanomechanical elements. In addition to providing the motivation and relevant background on the physical behavior of these systems, we discuss our recent efforts to develop a particular type of QEMS that is based upon the Cooper-pair box (CPB) and superconducting coplanar waveguide (CPW) cavities, a system which has the potential to serve as a testbed for studying the quantum properties of motion in engineered systems.

preprint · 2015 · arXiv

TransA: An Adaptive Approach for Knowledge Graph Embedding

Knowledge representation is a major topic in AI, and many studies attempt to represent the entities and relations of a knowledge base in a continuous vector space. Among these attempts, translation-based methods build entity and relation vectors by minimizing the translation loss from a head entity to a tail one. Despite their success, translation-based methods suffer from an oversimplified loss metric and are not competitive enough to model the varied and complex entities and relations in knowledge bases. To address this issue, we propose TransA, an adaptive metric approach for embedding that uses ideas from metric learning to provide a more flexible embedding method. Experiments are conducted on benchmark datasets, and our proposed method makes significant and consistent improvements over the state-of-the-art baselines.
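As a rough illustration of what an adaptive metric adds over a plain translation loss, the sketch below scores a triple with a weighted quadratic form over the absolute translation residual. It assumes one weight matrix per relation. The function name, the matrices, and the vectors are hypothetical, and the abstract does not state the exact loss, so this is not presented as the paper's formulation.

```python
from typing import Sequence


def adaptive_score(head: Sequence[float], relation: Sequence[float],
                   tail: Sequence[float],
                   weights: Sequence[Sequence[float]]) -> float:
    """Weighted quadratic form over the residual |head + relation - tail|.
    Lower means the triple fits better. `weights` is an assumed
    relation-specific matrix of non-negative entries."""
    residual = [abs(h + r - t) for h, r, t in zip(head, relation, tail)]
    n = len(residual)
    return sum(residual[i] * weights[i][j] * residual[j]
               for i in range(n) for j in range(n))


# Toy 2-D example, for illustration only.
head = [1.0, 0.0]
relation = [0.0, 1.0]
tail = [0.0, 0.0]                      # residual is [1.0, 1.0]

identity = [[1.0, 0.0], [0.0, 1.0]]    # reduces to squared Euclidean distance
stretched = [[4.0, 0.0], [0.0, 1.0]]   # errors along the first axis cost 4x more

plain = adaptive_score(head, relation, tail, identity)
weighted = adaptive_score(head, relation, tail, stretched)
```

With the identity matrix the score is the squared Euclidean distance, so every dimension counts equally. A learned matrix can instead weight some dimensions more heavily for a given relation, which is the flexibility the abstract attributes to metric learning.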

preprint · 2014 · arXiv

Development of a broadband reflective T-filter for voltage biasing high-Q superconducting microwave cavities

We present the design of a reflective stop-band filter based on quasi-lumped elements that can be utilized to introduce large dc and low-frequency voltage biases into a low-loss superconducting coplanar waveguide (CPW) cavity. Transmission measurements of the filter are seen to be in good agreement with simulations and demonstrate insertion losses greater than 20 dB in the range of 3 to 10 GHz. Moreover, transmission measurements of the CPW's fundamental mode demonstrate that loaded quality factors exceeding 10^5 can be achieved with this design for dc voltages as large as 20 V and for the cavity operated in the single-photon regime. This makes the design suitable for use in a number of applications including qubit-coupled mechanical systems and circuit QED.

preprint · 2009 · arXiv

Infrared carpet cloak designed with uniform silicon grating structure

Through a specifically chosen coordinate transformation, we propose an optical carpet cloak that requires only a homogeneous anisotropic dielectric material. The proposed cloak could be easily imitated and realized by alternating layers of isotropic dielectrics. To demonstrate the cloaking performance, we have designed a two-dimensional version in which a uniform silicon grating structure fabricated on a silicon-on-insulator wafer works as an infrared carpet cloak. The cloak has been validated through full-wave electromagnetic simulations, and its non-resonant nature also enables broadband cloaking for wavelengths ranging from 1372 to 2000 nm.
