Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
10 works
0 followers
20 topics
4 close collaborators

Published work

10 published item(s)

preprint 2026 arXiv

Global Context Compression with Interleaved Vision-Text Transformation

Recent achievements of vision-language models in end-to-end OCR point to a new avenue for low-loss compression of textual information. This has motivated earlier works that render the Transformer's input into images for prefilling, which effectively reduces the number of tokens through visual encoding and thereby alleviates the quadratically growing attention computation. However, this partial compression fails to save computational or memory costs at token-by-token inference. In this paper, we investigate global context compression, which saves tokens at both the prefilling and inference stages. To this end, we propose VIST2, a novel Transformer that interleaves input text chunks with their visual encodings, while depending exclusively on visual tokens in the pre-context to predict the next text token distribution. Around this idea, we render text chunks into sketch images and train VIST2 in multiple stages, starting from curriculum-scheduled pretraining for optical language modeling, followed by modal-interleaved instruction tuning. We conduct extensive experiments using VIST2 families scaled from 0.6B to 8B to explore the training recipe and hyperparameters. With a 4$\times$ compression ratio, the resulting models demonstrate significant superiority over baselines on long writing tasks, achieving, on average, a 3$\times$ speedup in first-token generation, a 77% reduction in memory usage, and a 74% reduction in FLOPs. Our code and datasets will be made public to support further studies.

preprint 2024 arXiv

A Multi-Modal Contrastive Diffusion Model for Therapeutic Peptide Generation

Therapeutic peptides represent a unique class of pharmaceutical agents crucial for the treatment of human diseases. Recently, deep generative models have exhibited remarkable potential for generating therapeutic peptides, but they utilize either sequence or structure information alone, which hinders generation performance. In this study, we propose a Multi-Modal Contrastive Diffusion model (MMCD), fusing both sequence and structure modalities in a diffusion framework to co-generate novel peptide sequences and structures. Specifically, MMCD constructs separate sequence-modal and structure-modal diffusion models and devises a multi-modal contrastive learning strategy with inter-contrastive and intra-contrastive components at each diffusion timestep, aiming to capture the consistency between the two modalities and boost model performance. The inter-contrastive component aligns sequences and structures of peptides by maximizing the agreement of their embeddings, while the intra-contrastive component differentiates therapeutic and non-therapeutic peptides by maximizing the disagreement of their sequence/structure embeddings simultaneously. Extensive experiments demonstrate that MMCD performs better than other state-of-the-art deep generative methods in generating therapeutic peptides across various metrics, including antimicrobial/anticancer score, diversity, and peptide-docking.

preprint 2022 arXiv

Dark matter admixed neutron star properties in the light of X-ray pulse profile observations

The distribution of the dark matter (DM) in DM-admixed neutron stars (DANSs) is supposed to be either a dense dark core or an extended dark halo, depending on the DM fraction of the DANS ($f_χ$) and the DM properties, such as the mass ($m_χ$) and the strength of the self-interaction ($y$). In this paper, we perform an in-depth analysis of the formation criterion for a dark core or dark halo and point out that the relative distribution of these two components is essentially determined by the ratio of the central enthalpy of the DM component to that of the baryonic matter component inside DANSs. For the critical case where the radii of DM and baryonic matter are the same, we further derive an analytical formula to describe the dependence of $f^{\rm crit}_χ$ on $m_χ$ and $y$ for a given DANS mass. The relative distribution of the two components in DANSs can lead to different observational effects. We here focus on the modification of the pulsar pulse profile due to the extra light-bending effect when a dark halo exists and conduct the first investigation of the dark-halo effects on the pulse profile. We find that the peak flux deviation is strongly dependent on the ratio of the halo mass to the radius of the DM component. Lastly, we perform Bayesian parameter estimation on the DM particle properties based on the recent X-ray observations of PSR J0030+0451 and PSR J0740+6620 by the Neutron Star Interior Composition Explorer.

preprint 2022 arXiv

First passage of a diffusing particle under stochastic resetting in bounded domains with spherical symmetry

We investigate the first passage properties of a Brownian particle diffusing freely inside a $d$-dimensional sphere with an absorbing spherical surface, subject to stochastic resetting. We derive the mean time to absorption (MTA) as a function of the resetting rate $γ$ and the initial distance $r$ of the particle to the center of the sphere. We find that when $r>r_c$ there exists a nonzero optimal resetting rate $γ_{\rm opt}$ at which the MTA is a minimum, where $r_c=\sqrt {d/\left( {d + 4} \right)} R$ and $R$ is the radius of the sphere. As $r$ increases, $γ_{\rm opt}$ exhibits a continuous transition from zero to nonzero at $r=r_c$. Furthermore, we consider the case where the particle lies between two concentric spheres in two or three dimensions, and obtain the domain in which resetting expedites the MTA, which is $(R_1, r_{c_1}) \cup (r_{c_2},R_2)$, with $R_1$ and $R_2$ being the radii of the inner and outer spheres, respectively. Interestingly, when $R_1/R_2$ is less than a critical value, $γ_{\rm opt}$ exhibits a discontinuous transition at $r=r_{c_1}$; otherwise, such a transition is continuous. However, at $r=r_{c_2}$, $γ_{\rm opt}$ always shows a continuous transition.
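
The threshold r_c quoted in this abstract can be checked numerically with a minimal sketch. It rests on two assumptions that are not stated in the abstract: the reset-free moments of the absorption time are obtained from the standard backward equations D∇²T1 = -1 and D∇²T2 = -2·T1 with absorbing boundary at R, and resetting is taken to lower the mean time at small rates exactly when the second moment exceeds twice the squared mean (coefficient of variation above one). The function names and the diffusion constant D are hypothetical, chosen for illustration only.

```python
import math

def critical_radius(d, R=1.0):
    """r_c = sqrt(d / (d + 4)) * R, as quoted in the abstract."""
    return math.sqrt(d / (d + 4)) * R

def moments_no_reset(r, d, R=1.0, D=1.0):
    """First two moments of the absorption time without resetting.

    Assumption: closed forms from solving D*lap(T1) = -1 and
    D*lap(T2) = -2*T1 with T1(R) = T2(R) = 0 for a start at distance r.
    """
    t1 = (R**2 - r**2) / (2 * d * D)
    t2 = (R**2 * (R**2 - r**2) / (2 * d**2)
          - (R**4 - r**4) / (4 * d * (d + 2))) / D**2
    return t1, t2

def resetting_helps(r, d, R=1.0, D=1.0):
    """Assumed small-rate criterion: resetting helps if <t^2> > 2 <t>^2."""
    t1, t2 = moments_no_reset(r, d, R, D)
    return t2 > 2 * t1**2

if __name__ == "__main__":
    for d in (1, 2, 3):
        rc = critical_radius(d)
        # Probe just below and just above the quoted threshold.
        print(d, rc, resetting_helps(0.95 * rc, d), resetting_helps(1.05 * rc, d))
```

Under these assumptions the criterion changes sign at r²/R² = d/(d+4), which matches the quoted r_c; the sketch says nothing about the value of the optimal rate itself.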

preprint 2022 arXiv

Robust optimal policies for team Markov games

In stochastic dynamic environments, team Markov games have emerged as a versatile paradigm for studying sequential decision-making problems of fully cooperative multi-agent systems. However, the optimality of the derived policies is usually sensitive to model parameters, which are typically unknown and must be estimated from noisy data in practice. To mitigate the sensitivity of optimal policies to these uncertain parameters, we propose a robust model of team Markov games in this paper, where agents utilize robust optimization approaches to update strategies. This model extends team Markov games to the scenario of incomplete information and provides an alternative solution concept of robust team optimality. To seek such a solution, we develop a robust iterative learning algorithm for team policies and prove its convergence. Compared with robust dynamic programming, this algorithm not only possesses a faster convergence rate, but also allows for approximate calculations to alleviate the curse of dimensionality. Moreover, numerical simulations are presented to demonstrate the effectiveness of the algorithm by generalizing the game model of sequential social dilemmas to uncertain scenarios.

preprint 2022 arXiv

Simulation of the FDA Nozzle Benchmark: A Lattice Boltzmann Study

Background and objective: Contrary to flows in small intracranial vessels, many blood flow configurations such as those found in aortic vessels and aneurysms involve larger Reynolds numbers and, therefore, transitional or turbulent conditions. Dealing with such systems requires both robust and efficient numerical methods. Methods: We assess here the performance of a lattice Boltzmann solver with full Hermite expansion of the equilibrium and a central Hermite moments collision operator at higher Reynolds numbers, especially for under-resolved simulations. To that end, the Food and Drug Administration's benchmark nozzle is considered at three different Reynolds numbers covering all regimes: 1) laminar at a Reynolds number of 500, 2) transitional at a Reynolds number of 3500, and 3) low-level turbulence at a Reynolds number of 6500. Results: The lattice Boltzmann results are compared with previously published inter-laboratory experimental data obtained by particle image velocimetry. Our results show good agreement with the experimental measurements throughout the nozzle, demonstrating the good performance of the solver even in under-resolved simulations. Conclusion: In this manner, fast but sufficiently accurate numerical predictions can be achieved for flow configurations of practical interest regarding medical applications.

preprint 2021 arXiv

First passage in discrete-time absorbing Markov chains under stochastic resetting

First passage of stochastic processes under resetting has recently been an active research topic in the field of statistical physics. However, most previous studies have focused on systems with continuous time and space. In this paper, we study the effect of stochastic resetting on first passage properties of discrete-time absorbing Markov chains, described by a transition matrix $\mathbf{Q}$ between transient states and a transition matrix $\mathbf{R}$ from transient states to absorbing states. Using a renewal approach, we exactly derive the unconditional mean first passage time (MFPT) to either of the absorbing states, the splitting probability, and the conditional MFPT to each absorbing state. All the quantities can be expressed in terms of a deformed fundamental matrix $\mathbf{Z}_γ=\left[\mathbf{I}-(1-γ) \mathbf{Q} \right]^{-1}$ and $\mathbf{R}$, where $\mathbf{I}$ is the identity matrix, and $γ$ is the resetting probability at each time step. We further show a sufficient condition under which the unconditional MFPT can be optimized by stochastic resetting. Finally, we apply our results to two concrete examples: symmetric random walks on one-dimensional lattices with absorbing boundaries and the voter model on complete graphs.
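
To illustrate how a deformed fundamental matrix Z_γ = [I - (1-γ)Q]^{-1} can enter the MFPT, here is a minimal sketch for the random-walk example. The abstract does not give the closed-form expressions, so the model and formula below are assumptions: at each step the walker resets to its start state with probability γ and otherwise moves according to Q and R, which gives T = z / (1 - γz) with z the start-state entry of Z_γ·1. The sketch cross-checks this against a direct solve of the chain with resetting built in. All function names are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(M[i][c]))
        M[c], M[p] = M[p], M[c]
        for i in range(c + 1, n):
            f = M[i][c] / M[c][c]
            for j in range(c, n + 1):
                M[i][j] -= f * M[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def random_walk_Q(N):
    """Transient block Q of a symmetric walk on sites 0..N (0 and N absorb).

    Index i of Q corresponds to lattice site i + 1.
    """
    n = N - 1
    Q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        if i > 0:
            Q[i][i - 1] = 0.5
        if i < n - 1:
            Q[i][i + 1] = 0.5
    return Q

def mfpt_via_Z(Q, gamma, start):
    """Assumed formula: T = z / (1 - gamma*z), z = (Z_gamma 1)[start]."""
    n = len(Q)
    A = [[(1.0 if i == j else 0.0) - (1 - gamma) * Q[i][j]
          for j in range(n)] for i in range(n)]
    z = solve(A, [1.0] * n)[start]  # solves (I - (1-gamma) Q) y = 1
    return z / (1 - gamma * z)

def mfpt_direct(Q, gamma, start):
    """Cross-check: solve T = 1 + (1-gamma) Q T + gamma T[start] 1 directly."""
    n = len(Q)
    A = [[(1.0 if i == j else 0.0) - (1 - gamma) * Q[i][j]
          - (gamma if j == start else 0.0)
          for j in range(n)] for i in range(n)]
    return solve(A, [1.0] * n)[start]

if __name__ == "__main__":
    Q = random_walk_Q(4)
    print(mfpt_via_Z(Q, 0.2, 0), mfpt_direct(Q, 0.2, 0))
```

With γ = 0 the sketch reduces to the ordinary fundamental matrix, for which the mean absorption time of a symmetric walk started at site k on 0..N is k(N - k).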

preprint 2020 arXiv

A Guaranteed Convergence Analysis for the Projected Fast Iterative Soft-Thresholding Algorithm in Parallel MRI

The boom in non-uniform sampling and compressed sensing techniques has dramatically alleviated the lengthy data acquisition problem of magnetic resonance imaging. Sparse reconstruction, thanks to its fast computation and promising performance, has attracted considerable research effort and has been adopted in commercial scanners. To perform sparse reconstruction, choosing a proper algorithm is essential for obtaining satisfactory results and saving time in tuning parameters. The pFISTA, a simple and efficient algorithm for sparse reconstruction, has been successfully extended to parallel imaging. However, its convergence criterion is still an open question, and the existing convergence criterion of single-coil pFISTA cannot be applied to the parallel imaging pFISTA, which causes confusion and difficulty for users in determining its only parameter, the step size. In this work, we provide a guaranteed convergence analysis of the parallel imaging version of pFISTA for solving the two well-known parallel imaging reconstruction models, SENSE and SPIRiT. Along with the convergence analysis, we provide recommended step size values for SENSE and SPIRiT reconstructions to obtain fast and promising reconstructions. Experiments on in vivo brain images demonstrate the validity of the convergence criterion. In addition, experimental results show that, compared to using backtracking and power iteration to determine the step size, our recommended step size achieves more than five times acceleration in reconstruction time in most tested cases.

preprint 2020 arXiv

Event Arguments Extraction via Dilate Gated Convolutional Neural Network with Enhanced Local Features

Event extraction plays an important role in information extraction to understand the world. Event extraction can be split into two subtasks: event trigger extraction and event arguments extraction. However, the F-score of event arguments extraction is much lower than that of event trigger extraction; e.g., in the most recent work, event trigger extraction achieves 80.7%, while event arguments extraction achieves only 58%. In pipelined structures, the difficulty of event arguments extraction lies in its lack of classification features and its much higher computational cost. In this work, we propose a novel event extraction approach based on a multi-layer Dilate Gated Convolutional Neural Network (EE-DGCNN), which has fewer parameters. In addition, enhanced local information is incorporated into word features to assign event argument roles for triggers predicted by the first subtask. Numerical experiments demonstrate significant performance improvement over state-of-the-art event extraction approaches on real-world datasets. Further analysis of the extraction procedure is presented, and experiments are conducted to analyze the factors that contribute to the performance improvement.

preprint 2019 arXiv

pISTA-SENSE-ResNet for Parallel MRI Reconstruction

Magnetic resonance imaging has been widely applied in clinical diagnosis but is limited by its long data acquisition time. Although imaging can be accelerated by sparse sampling and parallel imaging, achieving promising reconstructed images with a fast reconstruction speed remains a challenge. Recently, deep learning approaches have attracted a lot of attention for their encouraging reconstruction results, but they lack proper interpretability. In this letter, to enable high-quality image reconstruction for parallel magnetic resonance imaging, we design the network structure from the perspective of sparse iterative reconstruction and enhance it with a residual structure. Experimental results on a public knee dataset show that, compared with the optimization-based method and the latest deep learning parallel imaging methods, the proposed network has lower reconstruction error and is more stable under different acceleration factors.