Source author record

Yan Yang

Yan Yang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Catalog footprint

What is connected

19 works

21 topics

4 close collaborators


Preprint, 2026, arXiv

Benchmarking and Improving GUI Agents in High-Dynamic Environments

Recent advancements in Graphical User Interface (GUI) agents have predominantly focused on training paradigms like supervised fine-tuning (SFT) and reinforcement learning (RL). However, the challenge of high-dynamic GUI environments remains largely underexplored. Existing agents typically rely on a single screenshot after each action for decision-making, leading to a partially observable (or even unobservable) Markov decision process, where the key GUI state, including information important for actions, is often inadequately captured. To systematically explore this challenge, we introduce DynamicGUIBench, a comprehensive online GUI benchmark spanning ten applications and diverse interaction scenarios characterized by important interface changes between actions. Furthermore, we present DynamicUI, an agent designed for dynamic interfaces, which takes screen-recording videos of the interaction process as input and consists of three components: a dynamic perceiver, a refinement strategy, and a reflection module. Specifically, the dynamic perceiver clusters frames of the GUI video, generates captions for the centroids, and iteratively selects the most informative frames as the salient dynamic context. Because there may be inconsistencies and noise between the selected frames and the textual context of the agent, the refinement strategy employs action-conditioned filtering to refine thoughts and mitigate thought-action inconsistency and redundancy. Based on the refined agent trajectories, the reflection module provides effective and accurate guidance for further actions. Experiments on DynamicGUIBench demonstrate that DynamicUI significantly improves performance in dynamic GUI environments while maintaining competitive performance on other public benchmarks.
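The frame-clustering step described for the dynamic perceiver can be illustrated with a minimal sketch. It assumes each video frame has already been turned into a feature vector and uses plain k-means, which the abstract does not name; the function name `select_representative_frames` and its parameters are hypothetical, and captioning and iterative selection are omitted.

```python
import numpy as np

def select_representative_frames(features, k, iters=20, seed=0):
    """Cluster per-frame feature vectors with k-means and return, for each
    cluster, the index of the frame closest to its centroid.

    Assumption: k-means stands in for whatever clustering the agent uses.
    """
    features = np.asarray(features, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct frames (fancy indexing copies).
    centroids = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(iters):
        # Distance of every frame to every centroid, shape (n_frames, k).
        dists = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = features[labels == j]
            if len(members):  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    dists = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
    # One representative frame index per centroid, de-duplicated and sorted.
    return sorted({int(dists[:, j].argmin()) for j in range(k)})

# Toy usage: six "frames" forming two visually distinct groups.
frames = [[0, 0], [1, 0], [0, 1], [50, 50], [51, 50], [50, 51]]
print(select_representative_frames(frames, k=2))
```

In a real pipeline the features would come from an image encoder, and the selected frames would then be captioned and filtered further.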

Preprint, 2022, arXiv

DPST: De Novo Peptide Sequencing with Amino-Acid-Aware Transformers

De novo peptide sequencing aims to recover the amino acid sequence of a peptide from tandem mass spectrometry (MS) data. Existing approaches for de novo analysis enumerate MS evidence for all amino acid classes during inference. This leads to over-trimming of the receptive fields of MS data and restricts the MS evidence associated with subsequent undecoded amino acids. Our approach, DPST, circumvents these limitations with two key components: (1) a confidence value aggregation encoder to sketch spectrum representations according to amino-acid-based connectivity among MS; (2) a global-local fusion decoder to progressively assimilate contextualized spectrum representations with a predefined preconception of localized MS evidence and amino acid priors. Our components originate from a closed-form solution and selectively attend to informative amino-acid-aware MS representations. Through extensive empirical studies, we demonstrate the superiority of DPST, showing that it outperforms state-of-the-art approaches by a margin of 12% to 19% in peptide accuracy.

Preprint, 2022, arXiv

Hardness prediction of age-hardening aluminum alloy based on ensemble learning

With the rapid development of artificial intelligence, the combination of materials databases and machine learning has driven progress in materials informatics. Because aluminum alloys are widely used in many fields, predicting their properties is important. In this work, data on Al-Cu-Mg-X (X: Zn, Zr, etc.) alloys are used, with composition and aging conditions (time and temperature) as inputs, to predict hardness. Two approaches are proposed: an ensemble learning solution based on automated machine learning, and an attention mechanism introduced into a deep-neural-network secondary learner. The experimental results show that selecting an appropriate secondary learner can further improve the prediction accuracy of the model. Introducing the attention mechanism into the deep-neural-network secondary learner yields a fusion model with better performance. The R-squared of the best model is 0.9697 and the MAE is 3.4518 HV.

Preprint, 2022, arXiv

K-textures, a self-supervised hard clustering deep learning algorithm for satellite image segmentation

Self-supervised deep learning algorithms that can segment an image into a fixed number of hard labels, as the k-means algorithm does, while relying only on deep learning techniques are still lacking. Here, we introduce the k-textures algorithm, which provides self-supervised segmentation of a 4-band image (RGB-NIR) into $k$ classes. An example of its application to high-resolution Planet satellite imagery is given. Our algorithm shows that discrete search is feasible using convolutional neural networks (CNN) and gradient descent. The model detects $k$ hard clustering classes, represented in the model as $k$ discrete binary masks and their associated $k$ independently generated textures, which, combined, form a simulation of the original image. The similarity loss is the mean squared error between the features of the original and the simulated image, both extracted from the penultimate convolutional block of the Keras 'imagenet' pretrained VGG-16 model and from a custom feature extractor made with Planet data. The main advances of the k-textures model are as follows. First, the $k$ discrete binary masks are obtained inside the model using gradient descent, through a novel method based on a hard sigmoid activation function. Second, it provides hard clustering classes -- each pixel has only one class. Finally, in comparison to k-means, where each pixel is considered independently, contextual information is also considered here, and each class is associated not only with similar values in the color channels but also with a texture. Our approach is designed to ease the production of training samples for satellite image segmentation, and the k-textures architecture could be adapted to support different numbers of bands and more complex tasks, such as object self-segmentation. The model codes and weights are available at https://doi.org/10.5281/zenodo.6359859

Preprint, 2022, arXiv

Molecular distance matrix prediction based on graph convolutional networks

Molecular structure has important applications in many fields. For example, some studies show that molecular spatial information can be used to achieve better results when predicting molecular properties. However, traditional molecular geometry calculations, such as density functional theory (DFT), are time-consuming. In view of this, we propose a model based on graph convolutional networks to predict the pairwise distances between atoms, also called distance matrix prediction of the molecule (DMGCN). To assess the DMGCN model, it is compared with the DeeperGCN-DAGNN model and with the RDKit method for calculating molecular conformations. Results show that the MAE of DMGCN is smaller than that of DeeperGCN-DAGNN and RDKit. In addition, the distances predicted by the DMGCN model and the distances computed from the QM9 dataset are both used to predict molecular properties, which shows the effectiveness of the distances predicted by the DMGCN model.
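To make the evaluation target concrete, the sketch below builds a pairwise distance matrix from 3D atomic coordinates and computes the mean absolute error between a predicted and a reference matrix. Averaging over unique atom pairs only is an assumption, since the abstract does not say how the MAE is aggregated, and the function names are hypothetical.

```python
import numpy as np

def distance_matrix(coords):
    """Pairwise Euclidean distances between atoms, shape (n_atoms, n_atoms)."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def distance_mae(pred, ref):
    """Mean absolute error over unique atom pairs (upper triangle, no diagonal).

    Assumption: the aggregation used in the paper may differ.
    """
    pred = np.asarray(pred, dtype=float)
    ref = np.asarray(ref, dtype=float)
    iu = np.triu_indices(ref.shape[0], k=1)
    return float(np.abs(pred[iu] - ref[iu]).mean())

# Toy usage: three atoms, with a "prediction" that is off by 0.1 everywhere.
reference = distance_matrix([[0, 0, 0], [3, 4, 0], [0, 0, 1]])
print(distance_mae(reference + 0.1, reference))
```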

Preprint, 2022, arXiv

Pressure-Strain Interaction as the Energy Dissipation Estimate in Collisionless Plasma

The dissipative mechanism in weakly collisional plasma is a topic that pervades decades of studies without a consensus solution. We compare several energy dissipation estimates based on energy transfer processes in plasma turbulence and provide justification for the pressure-strain interaction as a direct estimate of the energy dissipation rate. The global and scale-by-scale energy balances are examined in 2.5D and 3D kinetic simulations. We show that the global internal energy increase and the temperature enhancement of each species are directly tracked by the pressure-strain interaction. The incompressive part of the pressure-strain interaction dominates over its compressive part in all simulations considered. The scale-by-scale energy balance is quantified by scale filtered Vlasov-Maxwell equations, a kinetic plasma approach, and the lag dependent von Kármán-Howarth equation, an approach based on fluid models. We find that the energy balance is exactly satisfied across all scales, but the lack of a well-defined inertial range influences the distribution of the energy budget among different terms in the inertial range. Therefore, the widespread use of the Yaglom relation to estimate the dissipation rate is questionable in some cases, especially when the scale separation in the system is not clearly defined. In contrast, the pressure-strain interaction balances exactly the dissipation rate at kinetic scales regardless of the scale separation.

Preprint, 2021, arXiv

SLAKE: A Semantically-Labeled Knowledge-Enhanced Dataset for Medical Visual Question Answering

Medical visual question answering (Med-VQA) has tremendous potential in healthcare. However, the development of this technology is hindered by the lack of publicly available, high-quality labeled datasets for training and evaluation. In this paper, we present a large bilingual dataset, SLAKE, with comprehensive semantic labels annotated by experienced physicians and a new structural medical knowledge base for Med-VQA. In addition, SLAKE includes richer modalities and covers more human body parts than the currently available dataset. We show that SLAKE can be used to facilitate the development and evaluation of Med-VQA systems. The dataset can be downloaded from http://www.med-vqa.com/slake.

Preprint, 2020, arXiv

A novel semi-supervised multi-view clustering framework for screening Parkinson's disease

In recent years, many studies have addressed the diagnosis of Parkinson's disease (PD) from brain magnetic resonance imaging (MRI) using traditional unsupervised machine learning methods and supervised deep learning models. However, unsupervised learning methods are not good at extracting accurate features from MRIs, and it is difficult to collect enough PD data to train deep learning models. Moreover, most existing studies are based on single-view MRI data, whose characteristics are not sufficient. To tackle these drawbacks, we propose a novel semi-supervised learning framework called Semi-supervised Multi-view learning Clustering architecture technology (SMC). The model first introduces a sliding window method to capture different features, and then uses Linear Discriminant Analysis (LDA) for dimensionality reduction on the data with different features. Finally, traditional single-view and multi-view clustering methods are applied to the multiple feature views to obtain the results. Experiments show that the proposed method is superior to state-of-the-art unsupervised learning models in clustering performance. Our work could therefore contribute to identifying PD more effectively from previously labeled and subsequently unlabeled medical MRI data in a realistic medical environment.

Preprint, 2020, arXiv

DeepSTCL: A Deep Spatio-temporal ConvLSTM for Travel Demand Prediction

Urban resource scheduling is an important part of the development of a smart city, and transportation resources are the main components of urban resources. Currently, a series of problems with transportation resources, such as unbalanced distribution and road congestion, disrupt the scheduling discipline. Therefore, it is important to predict travel demand for urban resource dispatching. Previously, traditional time series models such as AR and ARIMA were used to forecast travel demand. However, the prediction efficiency of these methods is poor and the training time is too long. To improve performance, deep learning is used to assist prediction, but most deep learning methods utilize only the temporal dependence or only the spatial dependence of the data in the forecasting process. To address these limitations, a novel deep learning traffic demand forecasting framework based on Deep Spatio-Temporal ConvLSTM is proposed in this paper. To evaluate the performance of the framework, an end-to-end deep learning system is designed and a real dataset is used. The proposed method can capture temporal dependence and spatial dependence simultaneously. The closeness, period and trend components of spatio-temporal data are used in three prediction branches. These branches have the same network structure but do not share weights. A linear fusion method is then used to get the final result. Finally, the experimental results on the DIDI order dataset of Chengdu demonstrate that our method outperforms traditional models in accuracy and speed.

Preprint, 2020, arXiv

High order mixed finite elements with mass lumping for elasticity on triangular grids

A family of conforming mixed finite elements with mass lumping on triangular grids is presented for linear elasticity. The stress field is approximated by symmetric $H({\rm div})-P_k (k\geq 3)$ polynomial tensors enriched with higher order bubbles so as to allow mass lumping, which can be viewed as the Hu-Zhang elements enriched with higher order interior bubble functions. The displacement field is approximated by $C^{-1}-P_{k-1}$ polynomial vectors enriched with higher order terms to ensure the stability condition. For both the proposed mixed elements and their mass lumping schemes, optimal error estimates are derived for the stress in the $H({\rm div})$ norm and the displacement in the $L^2$ norm. Numerical results confirm the theoretical analysis.

Preprint, 2020, arXiv

In situ Measurement of Curvature of Magnetic Field in Turbulent Space Plasmas: A Statistical Study

Using in situ data, accumulated in the turbulent magnetosheath by the Magnetospheric Multiscale (MMS) Mission, we report a statistical study of magnetic field curvature and discuss its role in the turbulent space plasmas. Consistent with previous simulation results, the Probability Distribution Function (PDF) of the curvature is shown to have distinct power-law tails for both high and low value limits. We find that the magnetic-field-line curvature is intermittently distributed in space. High curvature values reside near weak magnetic-field regions, while low curvature values are correlated with small magnitude of the force acting normal to the field lines. A simple statistical treatment provides an explanation for the observed curvature distribution. This novel statistical characterization of magnetic curvature in space plasma provides a starting point for assessing, in a turbulence context, the applicability and impact of particle energization processes, such as curvature drift, that rely on this fundamental quantity.

Preprint, 2020, arXiv

Statistics of Kinetic Dissipation in Earth's Magnetosheath -- MMS Observations

A familiar problem in space and astrophysical plasmas is to understand how dissipation and heating occur. These effects are often attributed to the cascade of broadband turbulence which transports energy from large scale reservoirs to small scale kinetic degrees of freedom. When collisions are infrequent, local thermodynamic equilibrium is not established. In this case the final stage of energy conversion becomes more complex than in the fluid case, and both pressure-dilatation and pressure-strain interactions (Pi-D $\equiv -\Pi_{ij} D_{ij}$) become relevant and potentially important. Pi-D in plasma turbulence has been studied so far primarily using simulations. The present study provides a statistical analysis of Pi-D in the Earth's magnetosheath using the unique measurement capabilities of the Magnetospheric Multiscale (MMS) mission. We find that the statistics of Pi-D in this naturally occurring plasma environment exhibit strong resemblance to previously established fully kinetic simulation results. The conversion of energy is concentrated in space and occurs near intense current sheets, but not within them. This supports recent suggestions that the chain of energy transfer channels involves regional, rather than pointwise, correlations.
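The split into pressure dilatation and Pi-D can be illustrated numerically at a single point. The sketch below assumes the decomposition commonly used in this literature, which the abstract does not spell out: scalar pressure p = P_ii / 3, deviatoric pressure Π_ij = P_ij − p δ_ij, dilatation θ = ∇·u, and traceless strain rate D_ij = (∂_i u_j + ∂_j u_i)/2 − θ δ_ij / 3. The function name and the sample numbers are hypothetical.

```python
import numpy as np

def pressure_strain_terms(P, grad_u):
    """Return (pressure dilatation, Pi-D) at one point.

    P      : symmetric 3x3 pressure tensor
    grad_u : 3x3 velocity gradient with grad_u[i, j] = d u_j / d x_i
    Assumed definitions: see the lead-in text; they are not from the abstract.
    """
    P = np.asarray(P, dtype=float)
    grad_u = np.asarray(grad_u, dtype=float)
    p = np.trace(P) / 3.0                      # scalar pressure
    Pi = P - p * np.eye(3)                     # deviatoric (traceless) pressure
    theta = np.trace(grad_u)                   # dilatation, div(u)
    D = 0.5 * (grad_u + grad_u.T) - theta / 3.0 * np.eye(3)  # traceless strain
    p_theta = -p * theta                       # pressure dilatation, -p * theta
    pi_d = -np.sum(Pi * D)                     # Pi-D = -Pi_ij D_ij
    return p_theta, pi_d

# Toy usage: the two parts add up to the full pressure-strain term -P_ij d_i u_j.
P = np.array([[2.0, 0.1, 0.0], [0.1, 1.0, 0.0], [0.0, 0.0, 3.0]])
G = np.array([[0.5, 0.2, 0.0], [0.0, -0.1, 0.0], [0.3, 0.0, 0.2]])
p_theta, pi_d = pressure_strain_terms(P, G)
print(p_theta + pi_d, -np.sum(P * G))
```

For an isotropic pressure tensor the deviatoric part vanishes, so Pi-D is zero and only the pressure-dilatation term remains.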

Preprint, 2016, arXiv

Edge Detection Methods Based on Differential Phase Congruency of Monogenic Image

Edge detection has been widely used in medical image processing and automatic diagnosis. Some novel edge detection algorithms, based on the monogenic scale-space, are proposed by detecting points of local extrema in local amplitude and local attenuation and by modified differential phase congruency methods. The monogenic scale-space is obtained from a known image by Poisson and conjugate Poisson filtering. In mathematics, it is the Hardy space in the upper half-space. The boundary value of the monogenic scale-space representation is a monogenic image. In the monogenic scale-space, definitions involving scale, such as local amplitude, local attenuation, local phase angle, local phase vector and local frequency (phase derivatives), are proposed. Using Clifford analysis, the relations between the local attenuation and the local phase vector are obtained. This study will improve the understanding of image analysis in higher-dimensional spaces. Experimental results on some typical images are shown.

Preprint, 2015, arXiv

Sharp inequalities of homogeneous expansions for quasi-convex mappings of type B and almost starlike mappings of order alpha

In this paper, we first obtain several sharp inequalities of homogeneous expansion for both the subclass of all normalized biholomorphic quasi-convex mappings of type B and order $\alpha$ and the subclass of all normalized biholomorphic almost starlike mappings of order $\alpha$ defined on the unit ball $B$ of a complex Banach space $X$. Then, with these sharp inequalities, we derive the sharp estimates of the third and fourth homogeneous expansions for the above mappings defined on the unit polydisk $D^n$ in $\mathbb{C}^n$.

Preprint, 2014, arXiv

Computation of Maximum Likelihood Estimates for Multiresponse Generalized Linear Mixed Models with Non-nested, Correlated Random Effects

Estimation of generalized linear mixed models (GLMMs) with non-nested random effects structures requires approximation of high-dimensional integrals. Many existing methods are tailored to the low-dimensional integrals produced by nested designs. We explore the modifications that are required in order to adapt an EM algorithm with first-order and fully exponential Laplace approximations to a non-nested, multiple response model. The equations in the estimation routine are expressed as functions of the first four derivatives of the conditional likelihood of an arbitrary GLMM, providing a template for future applications. We apply the method to a joint Poisson-binary model for ranking sporting teams, and discuss the estimation of a correlated random effects model designed to evaluate the sensitivity of value-added models for teacher evaluation to assumptions about the missing data process. Source code in R is provided in the online supplementary material.
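As background for the Laplace approximations mentioned above, here is a one-dimensional sketch of the first-order version: the integral of exp(h(u)) is approximated by exp(h(û)) · sqrt(2π / −h''(û)), where û is the mode of h. This is a textbook illustration under simplifying assumptions (scalar u, derivatives supplied by the caller, Newton's method for the mode); it is not the multiresponse GLMM routine from the paper, and the function name is hypothetical.

```python
import math

def laplace_approx_1d(h, dh, d2h, u0=0.0, tol=1e-10, max_iter=100):
    """First-order Laplace approximation to the integral of exp(h(u)) du.

    h, dh, d2h : the log-integrand and its first two derivatives
    u0         : starting point for the Newton search for the mode of h
    """
    u = u0
    for _ in range(max_iter):
        step = dh(u) / d2h(u)   # Newton step towards dh(u) = 0
        u -= step
        if abs(step) < tol:
            break
    # Gaussian approximation around the mode; requires d2h(u) < 0 there.
    return math.exp(h(u)) * math.sqrt(2.0 * math.pi / -d2h(u))

# Toy usage: for a Gaussian log-integrand the approximation is exact.
value = laplace_approx_1d(
    h=lambda u: -0.5 * (u - 1.0) ** 2,
    dh=lambda u: -(u - 1.0),
    d2h=lambda u: -1.0,
)
print(value, math.sqrt(2.0 * math.pi))
```

With non-nested random effects the integral is high-dimensional, so the scalar second derivative becomes a Hessian determinant; that is the setting the abstract addresses.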

Preprint, 2014, arXiv

Efficient Maximum Likelihood Estimation of Multiple Membership Linear Mixed Models, with an Application to Educational Value-Added Assessments

The generalized persistence (GP) model, developed in the context of estimating "value added" by individual teachers to their students' current and future test scores, is one of the most flexible value-added models in the literature. Although developed in the educational setting, the GP model can potentially be applied to any structure where each sequential response of a lower-level unit may be associated with a different higher-level unit, and the effects of the higher-level units may persist over time. The flexibility of the GP model, however, and its multiple membership random effects structure lead to computational challenges that have limited the model's availability. We develop an EM algorithm to compute maximum likelihood estimates efficiently for the GP model, making use of the sparse structure of the random effects and error covariance matrices. The algorithm is implemented in the package GPvam in R statistical software. We give examples of the computations and illustrate the gains in computational efficiency achieved by our estimation procedure.

Preprint, 2014, arXiv

Multi-Shot Person Re-Identification via Relational Stein Divergence

Person re-identification is particularly challenging due to significant appearance changes across separate camera views. In order to re-identify people, a representative human signature should effectively handle differences in illumination, pose and camera parameters. While general appearance-based methods are modelled in Euclidean spaces, it has been argued that some applications in image and video analysis are better modelled via non-Euclidean manifold geometry. To this end, recent approaches represent images as covariance matrices, and interpret such matrices as points on Riemannian manifolds. As direct classification on such manifolds can be difficult, in this paper we propose to represent each manifold point as a vector of similarities to class representers, via a recently introduced form of Bregman matrix divergence known as the Stein divergence. This is followed by using a discriminative mapping of similarity vectors for final classification. The use of similarity vectors is in contrast to the traditional approach of embedding manifolds into tangent spaces, which can suffer from representing the manifold structure inaccurately. Comparative evaluations on benchmark ETHZ and iLIDS datasets for the person re-identification task show that the proposed approach obtains better performance than recent techniques such as Histogram Plus Epitome, Partial Least Squares, and Symmetry-Driven Accumulation of Local Features.
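The Stein divergence between two symmetric positive definite matrices A and B is S(A, B) = log det((A + B) / 2) − ½ log det(A B). The sketch below computes it and stacks the divergences from one covariance matrix to a set of class representers. How the paper turns these divergences into similarities and which discriminative mapping it applies are not given in the abstract, so that part is left out; the function names are hypothetical.

```python
import numpy as np

def stein_divergence(A, B):
    """Stein divergence between symmetric positive definite matrices.

    S(A, B) = log det((A + B) / 2) - 0.5 * (log det A + log det B)
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    _, logdet_mid = np.linalg.slogdet((A + B) / 2.0)
    _, logdet_a = np.linalg.slogdet(A)
    _, logdet_b = np.linalg.slogdet(B)
    return logdet_mid - 0.5 * (logdet_a + logdet_b)

def divergence_vector(X, representers):
    """Describe X by its divergences to each class representer.

    Assumption: the paper maps such values to similarities; the exact
    mapping is not stated in the abstract, so raw divergences are returned.
    """
    return np.array([stein_divergence(X, R) for R in representers])

# Toy usage with 2x2 covariance-like matrices.
X = np.array([[2.0, 0.0], [0.0, 1.0]])
reps = [np.eye(2), X, np.array([[4.0, 1.0], [1.0, 3.0]])]
print(divergence_vector(X, reps))
```

The divergence is symmetric and equals zero when the two matrices coincide, which is why the second entry of the toy output is zero up to rounding.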

Preprint, 2012, arXiv

The thickness of amalgamations of graphs

The thickness $\theta(G)$ of a graph $G$ is the minimum number of planar spanning subgraphs into which the graph $G$ can be decomposed. As a topological invariant of a graph, it is a measurement of the closeness to planarity of a graph, and it also has important applications to VLSI design. In this paper, we obtain the thickness of graphs formed by vertex-amalgamation and by bar-amalgamation of any two graphs whose thicknesses are known. We also derive lower and upper bounds for the thickness of graphs formed by edge-amalgamation and by 2-vertex-amalgamation of any two such graphs.

Preprint, 2012, arXiv

The thickness of cartesian product $K_n \Box P_m$

The thickness $\theta(G)$ of a graph $G$ is the minimum number of planar spanning subgraphs into which the graph $G$ can be decomposed. It is a topological invariant of a graph, which was defined by W.T. Tutte in 1963 and also has important applications to VLSI design. Compared with other topological invariants, such as genus and crossing number, however, few results on the thickness of graphs are known. The only types of graphs whose thicknesses have been obtained are complete graphs, complete bipartite graphs and hypercubes. In this paper, by operations on graphs, the thickness of the Cartesian product $K_n \Box P_m$, $n,m \geq 2$, is obtained.
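Some quick arithmetic helps place these quantities. The sketch below uses two standard facts that are not taken from the paper: a simple planar graph on v ≥ 3 vertices has at most 3v − 6 edges, which gives the lower bound θ(G) ≥ ⌈|E| / (3|V| − 6)⌉, and the thickness of the complete graph K_n is ⌊(n + 7) / 6⌋ except for n = 9 and n = 10, where it is 3. It does not reproduce the paper's result for $K_n \Box P_m$; the function names are hypothetical.

```python
def euler_thickness_lower_bound(num_vertices, num_edges):
    """Lower bound ceil(E / (3V - 6)) on thickness, valid for V >= 3."""
    if num_vertices < 3:
        raise ValueError("bound requires at least 3 vertices")
    max_planar_edges = 3 * num_vertices - 6
    return -(-num_edges // max_planar_edges)  # integer ceiling division

def complete_graph_thickness(n):
    """Known thickness of K_n for n >= 2 (standard result, not from the paper)."""
    if n in (9, 10):
        return 3
    return (n + 7) // 6

def product_kn_pm_counts(n, m):
    """Vertex and edge counts of the Cartesian product K_n x P_m."""
    vertices = n * m
    # m copies of K_n, plus n copies of the path P_m with m - 1 edges each.
    edges = m * n * (n - 1) // 2 + n * (m - 1)
    return vertices, edges

# Toy usage: the edge-count bound is not always tight, as K_9 shows.
print(euler_thickness_lower_bound(9, 36), complete_graph_thickness(9))
v, e = product_kn_pm_counts(5, 2)
print(v, e, euler_thickness_lower_bound(v, e))
```

Because $K_n \Box P_m$ contains $K_n$ as a subgraph, the thickness of $K_n$ is also a lower bound for the product.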
