Source author record

Jin Yu

Jin Yu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Machine Learning, Computer Vision, cond-mat.mtrl-sci, cond-mat.mes-hall, Biological Physics, cond-mat.soft, cond-mat.stat-mech, Databases, eess.AS, eess.IV, Information Retrieval, math.OC, Molecular Networks, Multimedia, physics.app-ph, physics.comp-ph, physics.optics, Social and Information Networks, Sound

Catalog footprint

What is connected

20 works

19 topics

4 close collaborators

preprint · 2022 · arXiv

HQANN: Efficient and Robust Similarity Search for Hybrid Queries with Structured and Unstructured Constraints

The in-memory approximate nearest neighbor search (ANNS) algorithms have achieved great success for fast high-recall query processing, but are extremely inefficient when handling hybrid queries with unstructured (i.e., feature vectors) and structured (i.e., related attributes) constraints. In this paper, we present HQANN, a simple yet highly efficient hybrid query processing framework which can be easily embedded into existing proximity graph-based ANNS algorithms. We guarantee both low latency and high recall by leveraging navigation sense among attributes and fusing vector similarity search with attribute filtering. Experimental results on both public and in-house datasets demonstrate that HQANN is 10x faster than the state-of-the-art hybrid ANNS solutions to reach the same recall quality and its performance is hardly affected by the complexity of attributes. It can reach 99% recall@10 in just around 50 microseconds on GLOVE-1.2M with thousands of attribute constraints.
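To make the notion of a hybrid query concrete, the following is a minimal sketch of an exact, brute-force baseline: filter by the structured attributes first, then rank the survivors by vector distance. It is not HQANN and uses no proximity graph; the function name `hybrid_search`, the item layout, and the toy data are assumptions made for illustration only.

```python
import math


def hybrid_search(query_vec, query_attrs, items, k=10):
    """Exact hybrid search (hypothetical baseline, not the HQANN method).

    items: iterable of (item_id, vector, attribute_dict) tuples.
    Keeps only items whose attributes match every structured constraint,
    then returns the ids of the k nearest by Euclidean distance.
    """
    matches = [
        (math.dist(query_vec, vec), item_id)
        for item_id, vec, attrs in items
        if all(attrs.get(name) == value for name, value in query_attrs.items())
    ]
    matches.sort()  # ascending distance, ties broken by item id
    return [item_id for _, item_id in matches[:k]]


# Toy data: 2-D vectors with two structured attributes each.
items = [
    ("a", (0.0, 0.0), {"color": "red", "size": "S"}),
    ("b", (0.1, 0.0), {"color": "blue", "size": "S"}),
    ("c", (1.0, 1.0), {"color": "red", "size": "M"}),
    ("d", (0.2, 0.1), {"color": "red", "size": "S"}),
]

result = hybrid_search((0.0, 0.0), {"color": "red"}, items, k=2)
print(result)
```

A scan like this costs time linear in the collection size for every query, which is the kind of inefficiency that graph-based approaches aim to avoid.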

preprint · 2022 · arXiv

Vocalsound: A Dataset for Improving Human Vocal Sounds Recognition

Recognizing human non-speech vocalizations is an important task and has broad applications such as automatic sound transcription and health condition monitoring. However, existing datasets have a relatively small number of vocal sound samples or noisy labels. As a consequence, state-of-the-art audio event classification models may not perform well in detecting human vocal sounds. To support research on building robust and accurate vocal sound recognition, we have created the VocalSound dataset, consisting of over 21,000 crowdsourced recordings of laughter, sighs, coughs, throat clearing, sneezes, and sniffs from 3,365 unique subjects. Experiments show that the vocal sound recognition performance of a model can be significantly improved by 41.9% by adding the VocalSound dataset to an existing dataset as training material. In addition, unlike previous datasets, the VocalSound dataset contains meta information such as speaker age, gender, native language, country, and health condition.

preprint · 2021 · arXiv

Deep Learning for Distinguishing Normal versus Abnormal Chest Radiographs and Generalization to Unseen Diseases

Chest radiography (CXR) is the most widely-used thoracic clinical imaging modality and is crucial for guiding the management of cardiothoracic conditions. The detection of specific CXR findings has been the main focus of several artificial intelligence (AI) systems. However, the wide range of possible CXR abnormalities makes it impractical to build specific systems to detect every possible condition. In this work, we developed and evaluated an AI system to classify CXRs as normal or abnormal. For development, we used a de-identified dataset of 248,445 patients from a multi-city hospital network in India. To assess generalizability, we evaluated our system using 6 international datasets from India, China, and the United States. Of these datasets, 4 focused on diseases that the AI was not trained to detect: 2 datasets with tuberculosis and 2 datasets with coronavirus disease 2019. Our results suggest that the AI system generalizes to new patient populations and abnormalities. In a simulated workflow where the AI system prioritized abnormal cases, the turnaround time for abnormal cases reduced by 7-28%. These results represent an important step towards evaluating whether AI can be safely used to flag cases in a general setting where previously unseen abnormalities exist.

preprint · 2021 · arXiv

Distribution of ripples in graphene membrane

Intrinsic ripples with various configurations and sizes were reported to affect the physical and chemical properties of 2D materials. By performing molecular dynamics simulations and theoretical analysis, we use two geometric models of the ripple shape to explore numerically the distribution of ripples in graphene membrane. We focus on the ratio of ripple height to its diameter (t/D) which was recently shown to be the most relevant for chemical activity of graphene membranes. Our result demonstrates that the ripple density decreases as the coefficient t/D increases, in a qualitative agreement with the Boltzmann distribution derived analytically from the bending energy of the membrane. Our theoretical study provides also specific quantitative information on the ripple distribution in graphene and gives new insights applicable to other 2D materials.

preprint · 2021 · arXiv

Electronic and Optical properties of transition metal dichalcogenides under symmetric and asymmetric field-effect doping

Doping via electrostatic gating is a powerful and widely used technique to tune the electron densities in layered materials. The microscopic details of how these setups affect the layered material are, however, subtle and call for careful theoretical treatments. Using semiconducting monolayers of transition metal dichalcogenides (TMDs) as prototypical systems affected by electrostatic gating, we show that the electronic and optical properties indeed change dramatically when the gating geometry is properly taken into account. This effect is implemented by a self-consistent calculation of the Coulomb interaction between the charges in different sub-layers within the tight-binding approximation. Thereby we consider both single- and double-sided gating. Our results show that, at low doping levels of $10^{13}$ cm$^{-2}$, the electronic bands of monolayer TMDs shift rigidly for both types of gating, and subsequently undergo a Lifshitz transition. When approaching the doping level of $10^{14}$ cm$^{-2}$, the band structure changes dramatically, especially in the case of single-sided gating where we find that monolayer MoS$_2$ and WS$_2$ become indirect gap semiconductors. The optical conductivities calculated within linear response theory also show clear signatures of these doping-induced band structure renormalizations. Our numerical results based on light-weighted tight-binding models indicate the importance of electronic screening in doped layered structures, and pave the way for further understanding gated super-lattice structures formed by multilayers with extended Moiré pattern.

preprint · 2021 · arXiv

From Machine Learning to Transfer Learning in Laser-Induced Breakdown Spectroscopy: the Case of Rock Analysis for Mars Exploration

With the ChemCam instrument, laser-induced breakdown spectroscopy (LIBS) has successively contributed to Mars exploration by determining elemental compositions of the soil, crust and rocks. Two newly launched missions, Chinese Tianwen 1 and American Perseverance, will further increase the number of LIBS instruments on Mars after the planned landings in spring 2021. Such an unprecedented situation requires a reinforced research effort on the methods of LIBS spectral data treatment. Although the matrix effects correspond to a general issue in LIBS, they become accentuated in the case of rock analysis for Mars exploration, because of the large variation of rock composition leading to the chemical matrix effect, and the difference in morphology between laboratory standard samples (in pressed pellet, glass or ceramics) used to establish calibration models and natural rocks encountered on Mars, leading to the physical matrix effect. The chemical matrix effect has been tackled in the ChemCam project with large sets of laboratory standard samples offering a good representation of various compositions of Mars rocks. The present work deals with the physical matrix effect, which is still awaiting a satisfactory solution. The approach consists in introducing transfer learning in LIBS data treatment. For the specific case of total alkali-silica (TAS) classification of natural rocks, the results show a significant improvement of the prediction capacity of pellet sample-based models when trained together with suitable information from rocks in a procedure of transfer learning. The correct classification rate of rocks increases from 33.3% with a machine learning model to 83.3% with a transfer learning model.

preprint · 2021 · arXiv

Two-Phase Dynamics of DNA Supercoiling based on DNA Polymer Physics

DNA supercoils are generated in genome regulation processes such as transcription and replication, and provide mechanical feedback to such processes. Under tension, DNA supercoil can present a coexistence state of plectonemic (P) and stretched (S) phases. Experiments have revealed the dynamic behaviors of plectoneme, e.g. diffusion, nucleation and hopping. To represent these dynamics with computational changes, we demonstrated first the fast dynamics on the DNA to reach torque equilibrium within the P and S phases, and then identified the two-phase boundaries as collective slow variables to describe the essential dynamics. According to the time scale separation demonstrated here, we developed a two-phase model on the dynamics of DNA supercoiling, which can capture physiologically relevant events across time scales of several orders of magnitude. In this model, we systematically characterized the slow dynamics between the two phases, and compared the numerical results with those from the DNA polymer physics-based worm-like chain model. The supercoiling dynamics, including the nucleation, diffusion, and hopping of plectoneme, have been well represented and reproduced, using the two-phase dynamic model, at trivial computational costs. Our current developments, therefore, can be implemented to explore multi-scale physical mechanisms of the DNA supercoiling-dependent physiological processes.

preprint · 2020 · arXiv

A Multi-Semantic Metapath Model for Large Scale Heterogeneous Network Representation Learning

Network Embedding has been widely studied to model and manage data in a variety of real-world applications. However, most existing works focus on networks with single-typed nodes or edges, with limited consideration of unbalanced distributions of nodes and edges. In real-world applications, networks usually consist of billions of various types of nodes and edges with abundant attributes. To tackle these challenges, in this paper we propose a multi-semantic metapath (MSM) model for large scale heterogeneous representation learning. Specifically, we generate multi-semantic metapath-based random walks to construct the heterogeneous neighborhood to handle the unbalanced distributions and propose a unified framework for the embedding learning. We conduct systematic evaluations of the proposed framework on two challenging datasets: Amazon and Alibaba. The results empirically demonstrate that MSM can achieve relatively significant gains over previous state-of-the-art methods on link prediction.

preprint · 2020 · arXiv

Comprehensive Information Integration Modeling Framework for Video Titling

In e-commerce, consumer-generated videos, which in general deliver consumers' individual preferences for the different aspects of certain products, are massive in volume. To recommend these videos to potential consumers more effectively, diverse and catchy video titles are critical. However, consumer-generated videos seldom come with appropriate titles. To bridge this gap, we integrate comprehensive sources of information, including the content of consumer-generated videos, the narrative comment sentences supplied by consumers, and the product attributes, in an end-to-end modeling framework. Although automatic video titling is very useful and demanding, it is much less addressed than video captioning. The latter focuses on generating sentences that describe videos as a whole while our task requires the product-aware multi-grained video analysis. To tackle this issue, the proposed method consists of two processes, i.e., granular-level interaction modeling and abstraction-level story-line summarization. Specifically, the granular-level interaction modeling first utilizes temporal-spatial landmark cues, descriptive words, and abstractive attributes to build three individual graphs and recognizes the intra-actions in each graph through Graph Neural Networks (GNN). Then the global-local aggregation module is proposed to model inter-actions across graphs and aggregate heterogeneous graphs into a holistic graph representation. The abstraction-level story-line summarization further considers both frame-level video features and the holistic graph to utilize the interactions between products and backgrounds, and generate the story-line topic of the video. We collect a large-scale dataset accordingly from real-world data in Taobao, a world-leading e-commerce platform, and will make the desensitized version publicly available to nourish further development of the research community...

preprint · 2020 · arXiv

Grounded and Controllable Image Completion by Incorporating Lexical Semantics

In this paper, we present an approach, namely Lexical Semantic Image Completion (LSIC), that may have potential applications in art, design, and heritage conservation, among several others. Existing image completion procedures are highly subjective because they consider only visual context, which may trigger unpredictable results that are plausible but not faithful to grounded knowledge. To permit a completion process that is both grounded and controllable, we advocate generating results faithful to both visual and lexical semantic context, i.e., the description of leaving holes or blank regions in the image (e.g., hole description). One major challenge for LSIC comes from modeling and aligning the structure of visual-semantic context and translating across different modalities. We term this process structure completion, which is realized by multi-grained reasoning blocks in our model. Another challenge relates to the unimodal biases, which occur when the model generates plausible results without using the textual description. This can be true since the annotated captions for an image are often semantically equivalent in existing datasets, and thus there is only one paired text for a masked image in training. We devise an unsupervised unpaired-creation learning path besides the over-explored paired-reconstruction path, as well as a multi-stage training strategy to mitigate the insufficiency of labeled data. We conduct extensive quantitative and qualitative experiments as well as ablation studies, which reveal the efficacy of our proposed LSIC.

preprint · 2020 · arXiv

Poet: Product-oriented Video Captioner for E-commerce

In e-commerce, a growing number of user-generated videos are used for product promotion. How to generate video descriptions that narrate the user-preferred product characteristics depicted in the video is vital for successful promotion. Traditional video captioning methods, which focus on routinely describing what exists and happens in a video, are not amenable to product-oriented video captioning. To address this problem, we propose a product-oriented video captioner framework, abbreviated as Poet. Poet first represents the videos as product-oriented spatial-temporal graphs. Then, based on the aspects of the video-associated product, we perform knowledge-enhanced spatial-temporal inference on those graphs for capturing the dynamic change of fine-grained product-part characteristics. The knowledge leveraging module in Poet differs from the traditional design by performing knowledge filtering and dynamic memory modeling. We show that Poet achieves consistent performance improvement over previous methods concerning generation quality, product aspects capturing, and lexical diversity. Experiments are performed on two product-oriented video captioning datasets, buyer-generated fashion video dataset (BFVD) and fan-generated fashion video dataset (FFVD), collected from Mobile Taobao. We will release the desensitized datasets to promote further investigations on both video captioning and general video analysis problems.

preprint · 2020 · arXiv

Strain-induced semiconductor to metal transition in MA2Z4 bilayers

Very recently, a new type of two-dimensional layered material MoSi2N4 has been fabricated, which is semiconducting with weak interlayer interaction, high strength, and excellent stability. We systematically investigate theoretically the effect of vertical strain on the electronic structure of MA2Z4 (M=Ti/Cr/Mo, A=Si, Z=N/P) bilayers. Taking bilayer MoSi2N4 as an example, our first-principles calculations show that its indirect band gap decreases monotonically as the vertical compressive strain increases. Under a critical strain around 22%, it undergoes a transition from semiconductor to metal. We attribute this to the opposite energy shift of states in different layers, which originates from the built-in electric field induced by the asymmetric charge transfer between two inner sublayers near the interface. Similar semiconductor to metal transitions are observed in other strained MA2Z4 bilayers, and the estimated critical pressures to realize such transitions are within the same order as semiconducting transition metal dichalcogenides. The semiconductor to metal transitions observed in the family of MA2Z4 bilayers present interesting possibilities for strain-induced engineering of their electronic properties.

preprint · 2019 · arXiv

Tunable magneto-optical properties of single-layer tin diselenide: From GW approximation to large-scale tight-binding calculations

A parameterized tight-binding (TB) model based on the first-principles GW calculations is developed for single layer tin diselenide (SnSe$_2$) and used to study its electronic and optical properties under external magnetic field. The truncated model is derived from six maximally localized Wannier orbitals on Se site, which accurately describes the quasi-particle electronic states of single layer SnSe$_2$ in a wide energy range. The quasi-particle electronic states are dominated by the hoppings between nearest Wannier orbitals ($t_1$-$t_6$). Our numerical calculation shows that, due to the electron-hole asymmetry, two sets of Landau level spectra are obtained when a perpendicular magnetic field is applied. The Landau level spectrum follows linear dependence on the level index and magnetic field, exhibiting properties of two-dimensional electron gas in traditional semiconductors. The optical conductivity calculation shows that the optical gap is very close to the GW value, and can be tuned by external magnetic field. Our proposed TB model can be used for further exploring the electronic, optical, and transport properties of SnSe$_2$, especially in the presence of external magnetic fields.
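For orientation, the linear dependence on level index and magnetic field mentioned in this abstract matches the textbook Landau level formula for a conventional two-dimensional electron gas with parabolic bands. The effective mass symbol $m^*$ is an assumption introduced here; the abstract gives no fitted values, and this is not the paper's tight-binding result.

```latex
% Textbook Landau levels of a parabolic-band 2D electron gas
% (n = 0, 1, 2, ... is the level index, B the perpendicular field,
%  m^* an assumed effective mass).
E_n = \hbar \omega_c \left( n + \tfrac{1}{2} \right),
\qquad
\omega_c = \frac{e B}{m^*}
```

In this picture the energy is linear in both $n$ and $B$. If the electron and hole band edges have different effective masses, each would produce its own ladder, which is one plausible reading of the "two sets" of spectra described above.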

preprint · 2015 · arXiv

Computational modeling to elucidate molecular mechanisms of epigenetic memory

How do mammalian cells that share the same genome exist in notably distinct phenotypes, exhibiting differences in morphology, gene expression patterns, and epigenetic chromatin statuses? Furthermore, how do cells of different phenotypes differentiate reproducibly from a single fertilized egg? These are fundamental problems in developmental biology. Epigenetic histone modifications play an important role in the maintenance of different cell phenotypes. The exact molecular mechanism for inheritance of the modification patterns over cell generations remains elusive. The complexity comes partly from the number of molecular species and the broad time scales involved. In recent years mathematical modeling has made significant contributions to elucidating the molecular mechanisms of DNA methylation and histone covalent modification inheritance. We will pedagogically introduce the typical procedure and some technical details of performing a mathematical modeling study, and discuss future developments.

preprint · 2014 · arXiv

Magnetic Proximity Effect and Interlayer Exchange Coupling of Ferromagnetic/Topological Insulator/Ferromagnetic Trilayer

Magnetic proximity effect between topological insulator (TI) and ferromagnetic insulator (FMI) is considered to have great potential in spintronics. However, a complete determination of interfacial magnetic structure has been highly challenging. We theoretically investigate the interlayer exchange coupling of two FMIs separated by a TI thin film, and show that the particular electronic states of the TI contributing to the proximity effect can be directly identified through the coupling behavior between two FMIs, together with a tunability of coupling constant. Such FMI/TI/FMI structure not only serves as a platform to clarify the magnetic structure of FMI/TI interface, but also provides insights into designing the magnetic storage devices with ultrafast response.

preprint · 2013 · arXiv

A New Paradigm to Half-Metallicity in Graphene Nanoribbons

In contrast to the well recognized transverse-electric-field-induced half-metallicity in zigzag graphene nanoribbons, here we demonstrate by first-principles calculations that zigzag graphene nanoribbons sandwiched between hexagonal boron nitride nanoribbons or sheets can be tuned into half-metal simply by a bias voltage or a moderate compressive strain. The half-metallicity is attributed to an enhanced coupling effect of spontaneous polarization and asymmetrical exchange correlation along the ribbon width. The findings should open a viable route for efficient spin-resolved band engineering in graphene based devices that are compatible with the current technology of semiconductor industry.

preprint · 2013 · arXiv

Effective Absorption Enhancement in Small Molecule Organic Solar Cells by Employing Trapezoid Gratings

We demonstrate that optical absorption is enhanced in small molecule organic solar cells by employing a trapezoid grating structure. The enhanced absorption is mainly attributed to both waveguide modes and surface plasmon modes, which have been simulated using the finite-difference time-domain method. The simulated results show that the surface plasmon along the semitransparent metallic Ag anode is excited by introducing the periodic trapezoid gratings, which induce a high-intensity field increment in the donor layer. Meanwhile, the waveguide modes result in a high-intensity field in the acceptor layer. The field increment significantly improves the absorption of the organic solar cells, which has been demonstrated by simulating the electrical properties. The simulated results show a 31% increase in the short-circuit current in the optimized device, which is supported by experimental measurement. The power conversion efficiency of the grating sample obtained in experiment is enhanced by 7.7%.

preprint · 2012 · arXiv

The Entire Quantile Path of a Risk-Agnostic SVM Classifier

A quantile binary classifier uses the rule: Classify x as +1 if P(Y = 1|X = x) >= t, and as -1 otherwise, for a fixed quantile parameter t ∈ [0, 1]. It has been shown that Support Vector Machines (SVMs) in the limit are quantile classifiers with t = 1/2. In this paper, we show that by using asymmetric cost of misclassification SVMs can be appropriately extended to recover, in the limit, the quantile binary classifier for any t. We then present a principled algorithm to solve the extended SVM classifier for all values of t simultaneously. This has two implications: First, one can recover the entire conditional distribution P(Y = 1|X = x) = t for t ∈ [0, 1]. Second, we can build a risk-agnostic SVM classifier where the cost of misclassification need not be known a priori. Preliminary numerical experiments show the effectiveness of the proposed algorithm.
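The decision rule stated in this abstract can be sketched directly, assuming the conditional probability P(Y = 1|X = x) is already known. The function names `quantile_classify` and `quantile_path` are hypothetical, and this sketch does not implement the paper's SVM path algorithm; it only illustrates the rule and how labels change as t is swept.

```python
def quantile_classify(p_positive, t):
    """Return +1 if P(Y = 1 | X = x) >= t, and -1 otherwise.

    p_positive stands in for a known conditional probability (an assumption
    of this sketch); t is the quantile parameter in [0, 1].
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    return 1 if p_positive >= t else -1


def quantile_path(p_positive, thresholds):
    """Labels assigned to one point across several values of t."""
    return [quantile_classify(p_positive, t) for t in thresholds]


# A point with P(Y = 1 | X = x) = 0.3, classified at four thresholds.
labels = quantile_path(0.3, [0.1, 0.3, 0.5, 0.9])
print(labels)  # [1, 1, -1, -1]
```

The value of t at which the label flips from +1 to -1 brackets the conditional probability at that point, which is the sense in which solving for all t gives access to the whole conditional distribution.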

preprint · 2011 · arXiv

Incremental Top-k List Comparison Approach to Robust Multi-Structure Model Fitting

Random hypothesis sampling lies at the core of many popular robust fitting techniques such as RANSAC. In this paper, we propose a novel hypothesis sampling scheme based on incremental computation of distances between partial rankings (top-$k$ lists) derived from residual sorting information. Our method simultaneously (1) guides the sampling such that hypotheses corresponding to all true structures can be quickly retrieved and (2) filters the hypotheses such that only a small but very promising subset remains. This permits the usage of simple agglomerative clustering on the surviving hypotheses for accurate model selection. The outcome is a highly efficient multi-structure robust estimation technique. Experiments on synthetic and real data show the superior performance of our approach over previous methods.

preprint · 2010 · arXiv

A Quasi-Newton Approach to Nonsmooth Convex Optimization Problems in Machine Learning

We extend the well-known BFGS quasi-Newton method and its memory-limited variant LBFGS to the optimization of nonsmooth convex objectives. This is done in a rigorous fashion by generalizing three components of BFGS to subdifferentials: the local quadratic model, the identification of a descent direction, and the Wolfe line search conditions. We prove that under some technical conditions, the resulting subBFGS algorithm is globally convergent in objective function value. We apply its memory-limited variant (subLBFGS) to L_2-regularized risk minimization with the binary hinge loss. To extend our algorithm to the multiclass and multilabel settings, we develop a new, efficient, exact line search algorithm. We prove its worst-case time complexity bounds, and show that our line search can also be used to extend a recently developed bundle method to the multiclass and multilabel settings. We also apply the direction-finding component of our algorithm to L_1-regularized risk minimization with logistic loss. In all these contexts our methods perform comparably to or better than specialized state-of-the-art solvers on a number of publicly available datasets. An open source implementation of our algorithms is freely available.
