Source author record

Yilong Yang

Yilong Yang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Cryptography and Security · Computer Vision · math.GR · Software Engineering · Databases · Machine Learning · math.CO · math.DS

Catalog footprint

What is connected

8 works

8 topics

4 close collaborators

preprint · 2026 · arXiv

A Cross-Modal Prompt Injection Attack against Large Vision-Language Models with Image-Only Perturbation

Large vision-language models (LVLMs) have emerged as a powerful paradigm for multimodal intelligence, but their growing deployment also expands the attack surface of prompt injection. Despite this growing concern, existing attacks still suffer from a critical limitation: the injected prompt for one modality only steers the model's interpretation of that single input. Alternatively, these attacks remain multimodal but fail to achieve cross-modal prompt perturbation. To bridge this gap, we introduce a novel cross-modal prompt injection attack, CrossMPI, which can steer the model's interpretation of both textual and visual inputs via image-only prompt injection. Our design is underpinned by the following key breakthroughs. First, we shift the focus of the injected prompt perturbation optimization from the visual embedding space (typically with only $10^5$ parameters) to the model hidden state space (for multimodal information integration and with $10^7$ parameters). Then, two strategies are adopted to mitigate the optimization challenges posed by the larger parameter space. To constrain the optimized model parameter space, we introduce a layer selection strategy that identifies the layers most critical to multimodal integration. Interestingly, contrary to past experience, our analysis reveals that the optimal layers for LVLM prompt perturbation reside in the middle of the model rather than at the end. To constrain the image perturbation space, we propose a new distance-decremental perturbation budget assignment strategy that allocates budgets decrementally as the pixel distance to semantic-critical regions increases. Extensive experiments across multiple LVLMs and datasets show that our method significantly outperforms baseline approaches.
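The distance-decremental budget idea in this abstract can be illustrated with a small sketch. The abstract does not give the actual schedule, so the linear decay, the Euclidean pixel distance, and every name and default below (`budget_map`, `eps_max`, `eps_min`, `decay`) are assumptions made for illustration only, not the CrossMPI implementation.

```python
import math

def distance_to_region(mask):
    """Euclidean distance from each pixel to the nearest critical (True) pixel.

    Brute force, intended only for small illustrative grids.
    """
    critical = [(r, c) for r, row in enumerate(mask)
                for c, flag in enumerate(row) if flag]
    if not critical:
        raise ValueError("mask must mark at least one critical pixel")
    return [[min(math.hypot(r - cr, c - cc) for cr, cc in critical)
             for c in range(len(row))]
            for r, row in enumerate(mask)]

def budget_map(mask, eps_max=8.0, eps_min=1.0, decay=2.0):
    """Per-pixel perturbation budget that shrinks with distance.

    Assumption: a linear decay clipped below at eps_min. The paper's
    actual schedule is not stated in the abstract.
    """
    dist = distance_to_region(mask)
    return [[max(eps_min, eps_max - decay * d) for d in row] for row in dist]

def clip_perturbation(delta, budgets):
    """Clip each pixel's perturbation to [-budget, +budget]."""
    return [[max(-b, min(b, d)) for d, b in zip(d_row, b_row)]
            for d_row, b_row in zip(delta, budgets)]

# A 1x5 image whose leftmost pixel is the semantic-critical region.
mask = [[True, False, False, False, False]]
budgets = budget_map(mask)
clipped = clip_perturbation([[10.0, -10.0, 3.0, -3.0, 0.5]], budgets)
```

Pixels next to the critical region keep the largest budget, and far-away pixels fall back to the floor `eps_min`, which is one simple way to read "allocates budgets decrementally".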

preprint · 2026 · arXiv

On the Generation and Mitigation of Harmful Geometry in Image-to-3D Models

Recent advances in image-to-3D models have significantly improved the fidelity and accessibility of 3D content creation. The same powerful reconstruction capability that enables creative design can also be misused by an adversary to generate harmful geometries, which can then be fabricated with 3D printers and pose real-world risks. However, such risks are largely underexplored: it remains unclear how well current image-to-3D models can produce these harmful geometries, and whether existing safeguards can reliably prevent such generation. To fill this gap, we conduct a systematic measurement study of harmful geometry generation and mitigation. We first describe this risk through three unsafe categories: direct-use physical hazards, risky templates or components, and deceptive replicas. Each category is instantiated with representative objects. We evaluate both open-source and commercial image-to-3D models under original, degraded, viewpoint-shifted, and semantically camouflaged inputs. We consider different evaluation metrics, including geometric validity, multi-view VLM-based semantic scoring, targeted human validation, and controlled physical fabrication. The results reveal a concerning reality: current image-to-3D models can effectively reconstruct the harmful geometries, while fewer than 0.3% of such geometries trigger commercial moderation flags. As a first step toward mitigation, we evaluate three representative safeguard families: input moderation, model-level benign alignment, and output-level filtering. We find that existing safeguards have distinct weaknesses. We further develop a stacked defense that can reduce harmful retention to <1%, but still at an 11% overall false-positive cost. Taken together, our findings demonstrate the risk in current systems and encourage better geometry-aware safeguards for moderation.

preprint · 2022 · arXiv

Automated Enterprise Applications Generation from Requirements Model

Enterprise applications can be automatically generated from a sophisticated OO design model based on a model-driven approach. The design model contains information about how to decompose the system into components, how to encapsulate the system operations into classes, and how the objects of classes collaborate to fulfill the functionality of the system operations. However, the effort required to build the design model from a validated requirements model is not proportional to the return. In practice, it is very desirable to have an approach that can automatically generate standardized enterprise applications directly from the validated requirements models. In this paper, we propose an approach named RM2EA, which can reach this goal based on the contract-based requirements model. We demonstrate the proposed approach through 13 case studies. The evaluation result shows that the quality and efficiency of the generated applications are almost equal to those of applications implemented by developers: firstly, we demonstrate that a popular type of enterprise application (i.e., a Jakarta EE application) can be successfully generated by customizing and improving the set of rules; secondly, RM2EA can generate more readable or maintainable code; thirdly, the enterprise applications generated by RM2EA achieve similar performance in test results. Overall, the result is satisfactory, and the implementation of the proposed approach can be further enhanced and applied to software development in the industry.

preprint · 2022 · arXiv

Reduced Power Graphs of $\mathrm{PGL}_n(\mathbb{F}_q)$

Given a group $G$, let us connect two non-identity elements by an edge if and only if one is a power of the other. This gives a graph structure on $G$ minus the identity, called the reduced power graph. It is conjectured by Akbari and Ashrafi that if a non-abelian finite simple group has a connected reduced power graph, then it must be an alternating group. In this paper, we shall give a complete description of when the reduced power graphs of $\mathrm{PGL}_n(\mathbb{F}_q)$ are connected for all $q$ and all $n\geq 3$. In particular, the conjecture of Akbari and Ashrafi is false. We shall also provide an upper bound on their diameters, and in case of disconnection, provide a description of all connected components.
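The definition of the reduced power graph in this abstract is concrete enough to try on tiny groups. The sketch below is illustrative only: it uses direct products of cyclic groups written additively (so "power" means "multiple"), not $\mathrm{PGL}_n(\mathbb{F}_q)$, and all function names are hypothetical helpers introduced here.

```python
from itertools import product

def cyclic_subgroup(g, op, identity):
    """Return the set of all powers of g, including the identity."""
    powers, x = {identity}, g
    while x != identity:
        powers.add(x)
        x = op(x, g)
    return powers

def reduced_power_graph(elements, op, identity):
    """Adjacency sets on non-identity elements.

    a and b are joined iff one of them is a power of the other.
    """
    vertices = [g for g in elements if g != identity]
    generated = {g: cyclic_subgroup(g, op, identity) for g in vertices}
    adj = {g: set() for g in vertices}
    for a in vertices:
        for b in vertices:
            if a != b and (a in generated[b] or b in generated[a]):
                adj[a].add(b)
    return adj

def connected_components(adj):
    """Connected components of the graph, found by depth-first search."""
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            v = stack.pop()
            if v in comp:
                continue
            comp.add(v)
            stack.extend(adj[v] - comp)
        seen |= comp
        components.append(comp)
    return components

def direct_product_of_cyclic(moduli):
    """Elements, operation and identity of Z_m1 x ... x Z_mk (additive)."""
    elements = list(product(*[range(m) for m in moduli]))
    op = lambda x, y: tuple((a + b) % m for a, b, m in zip(x, y, moduli))
    identity = tuple(0 for _ in moduli)
    return elements, op, identity

def component_count(moduli):
    elements, op, identity = direct_product_of_cyclic(moduli)
    return len(connected_components(reduced_power_graph(elements, op, identity)))
```

For the cyclic group $\mathbb{Z}_6$ a generator is adjacent to every other vertex, so `component_count((6,))` is 1. For the Klein four-group $\mathbb{Z}_2 \times \mathbb{Z}_2$ the three involutions are pairwise non-adjacent, so `component_count((2, 2))` is 3.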

preprint · 2020 · arXiv

Cloud-based Federated Boosting for Mobile Crowdsensing

The application of federated extreme gradient boosting to mobile crowdsensing apps brings several benefits, in particular high efficiency and classification performance. However, it also brings a new challenge for data and model privacy protection. Besides the vulnerability to Generative Adversarial Network (GAN) based user data reconstruction attacks, no existing architecture considers how to preserve model privacy. In this paper, we propose a secret sharing based federated learning architecture, FedXGB, to achieve privacy-preserving extreme gradient boosting for mobile crowdsensing. Specifically, we first build a secure classification and regression tree (CART) of XGBoost using secret sharing. Then, we propose a secure prediction protocol to protect the model privacy of XGBoost in mobile crowdsensing. We conduct a comprehensive theoretical analysis and extensive experiments to evaluate the security, effectiveness, and efficiency of FedXGB. The results indicate that FedXGB is secure against honest-but-curious adversaries and attains less than 1% accuracy loss compared with the original XGBoost model.
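The abstract says only that FedXGB relies on secret sharing, without naming a scheme. As a generic illustration of the idea, the sketch below uses plain additive secret sharing over integers modulo a prime to aggregate per-user values, so that only the total is ever reconstructed. The scheme, the modulus, and all function names are assumptions for illustration, not the FedXGB protocol.

```python
import secrets

# Illustrative modulus (the Mersenne prime 2^61 - 1); an assumption, not from the paper.
PRIME = 2**61 - 1

def share(value, n_parties, prime=PRIME):
    """Split value into n_parties additive shares that sum to value mod prime."""
    shares = [secrets.randbelow(prime) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % prime)
    return shares

def reconstruct(shares, prime=PRIME):
    """Recover the secret by summing all shares mod prime."""
    return sum(shares) % prime

def secure_sum(values, n_parties, prime=PRIME):
    """Aggregate user values without reconstructing any single one.

    Each user splits its value into shares, party j accumulates only the
    j-th share of every user, and only the grand total is reconstructed.
    """
    per_party = [0] * n_parties
    for v in values:
        for j, s in enumerate(share(v, n_parties, prime)):
            per_party[j] = (per_party[j] + s) % prime
    return reconstruct(per_party, prime)
```

In this toy setting any proper subset of the parties sees only uniformly random residues, which is the property a gradient-aggregation step would rely on. Real values such as gradients would first need a fixed-point encoding into the field.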

preprint · 2016 · arXiv

Billiards in near rectangles

For every quadrilateral sufficiently close to a rectangle, we shall show that it possesses a periodic billiard path. This is an REU work done at ICERM in Summer 2012.

preprint · 2016 · arXiv

The Ultraproducts of Quasirandom Groups

In this paper, we shall prove that an ultraproduct of non-abelian finite simple groups is either finite simple, or has no finite-dimensional unitary representation other than the trivial one. Then we shall generalize this result to other kinds of quasirandom groups. A group is called D-quasirandom if all of its nontrivial representations over the complex numbers have dimension at least D. We shall study the question of whether a non-principal ultraproduct of a given sequence of quasirandom groups remains quasirandom, and whether an ultraproduct of increasingly quasirandom groups becomes minimally almost periodic (i.e. has no non-trivial finite-dimensional unitary representation at all). We answer this question in the affirmative when the groups in question are simple, quasisimple, semisimple, or when the groups in question have a bounded number of conjugacy classes in their cosocles (the intersection of all maximal normal subgroups), or when the groups are arbitrary products (not necessarily finite) of the groups just listed. We shall also present an ultraproduct of increasingly quasirandom groups with a non-trivial one-dimensional representation. We also obtain some results in the case of semi-direct products and short exact sequences of quasirandom groups. Finally, two applications of our results are given, one in triangle patterns of quasirandom groups and one in self-Bohrifying groups. Our main tools are some variations of the covering number for groups, different kinds of length functions on groups, and the classification of finite simple groups.

preprint · 2014 · arXiv

Anonymously Analyzing Clinical Datasets

This paper takes on the problem of automatically identifying clinically relevant patterns in medical datasets without compromising patient privacy. To achieve this goal, we treat datasets as a black box for both internal and external users of data, which lets us handle clinical data queries directly and far more efficiently. The novelty of the approach lies in avoiding the data de-identification process often used as a means of preserving patient privacy. The implemented toolkit combines software engineering technologies such as Java EE and RESTful web services to allow exchanging medical data in an unidentifiable XML format as well as restricting users to the need-to-know principle. Our technique also inhibits retrospective processing of data, such as attacks by an adversary on a medical dataset using advanced computational methods to reveal Protected Health Information (PHI). The approach is validated on an endoscopic reporting application based on openEHR and MST standards. From the usability perspective, the approach can be used to query datasets by clinical researchers and by governmental or non-governmental organizations monitoring health care services to improve quality of care.

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.

Topic footprint