Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
24 works
0 followers
20 topics
4 close collaborators

Published work

24 published items

preprint 2026 arXiv

ARMOR: An Agentic Framework for Reaction Feasibility Prediction via Adaptive Utility-aware Multi-tool Reasoning

Reaction feasibility prediction, as a fundamental problem in computational chemistry, has benefited from diverse tools enabled by recent advances in artificial intelligence, particularly large language models. However, the performance of individual tools varies substantially across reactions, making it difficult for any single tool to consistently perform well across all cases. This raises a critical challenge: how to effectively leverage multiple tools to obtain more accurate feasibility predictions. To address this, we propose ARMOR, an agentic framework that explicitly models tool-specific utilities, adaptively prioritizes tools, and further resolves the potential tool conflicts to produce the final prediction for each reaction. Unlike existing approaches that rely on simple aggregation or heuristic assignment over various tools, ARMOR organizes tools into a hierarchy that prioritizes top-performing tools and defers others when needed, characterizes their strengths through tool-specific patterns, and resolves conflicts via memory-augmented reasoning. Extensive experiments on a public dataset demonstrate that ARMOR consistently outperforms strong baselines, including single-tool methods as well as various tool aggregation and tool selection approaches. Further analysis shows that the improvements are particularly significant on reactions with conflicting tool predictions, highlighting the effectiveness of ARMOR in leveraging the complementary strengths of multiple tools. The code is available via https://anonymous.4open.science/r/ARMOR-E13F.

preprint 2026 arXiv

Layer-Order Inversion: Rethinking Latent Multi-Hop Reasoning in Large Language Models

Large language models (LLMs) perform well on multi-hop reasoning, yet how they internally compose multiple facts remains unclear. Recent work proposes the hop-aligned circuit hypothesis, suggesting that bridge entities are computed sequentially across layers before later-hop answers. Through systematic analyses on real-world multi-hop queries, we show that this hop-aligned assumption does not generalize: later-hop answer entities can become decodable earlier than bridge entities, a phenomenon we call layer-order inversion, which strengthens with total hops. To explain this behavior, we propose a probabilistic recall-and-extract framework that models multi-hop reasoning as broad probabilistic recall in shallow MLP layers followed by selective extraction in deeper attention layers. This framework is empirically validated through systematic probing analyses, reinterpreting prior layer-wise decoding evidence, explaining chain-of-thought gains, and providing a mechanistic diagnosis of multi-hop failures despite correct single-hop knowledge. Code is available at https://github.com/laquabe/Layer-Order-Inversion.

preprint 2024 arXiv

Automated Invariant Generation for Solidity Smart Contracts

Smart contracts are computer programs running on blockchains to automate the transaction execution between users. The absence of contract specifications poses a real challenge to the correctness verification of smart contracts. Program invariants are properties that are always preserved throughout the execution, which characterize an important aspect of the program behaviors. In this paper, we propose a novel invariant generation framework, INVCON+, for Solidity smart contracts. INVCON+ extends the existing invariant detector, InvCon, to automatically produce verified contract invariants based on both dynamic inference and static verification. Unlike INVCON+, InvCon only produces likely invariants, which have a high probability to hold, yet are still not verified against the contract code. Particularly, INVCON+ is able to infer more expressive invariants that capture richer semantic relations of contract code. We evaluate INVCON+ on 361 ERC20 and 10 ERC721 real-world contracts, as well as common ERC20 vulnerability benchmarks. The experimental results indicate that INVCON+ efficiently produces high-quality invariant specifications, which can be used to secure smart contracts from common vulnerabilities.

preprint 2024 arXiv

HPE: Answering Complex Questions over Text by Hybrid Question Parsing and Execution

The dominant paradigm of textual question answering systems is based on end-to-end neural networks, which excel at answering natural language questions but fall short on complex ones. This stands in contrast to the broad adoption of semantic parsing approaches over structured data sources (e.g., relational databases, knowledge graphs), which convert natural language questions to logical forms and execute them with query engines. Towards combining the strengths of neural and symbolic methods, we propose a framework of question parsing and execution on textual QA. It comprises two central pillars: (1) We parse the question of varying complexity into an intermediate representation, named H-expression, which is composed of simple questions as the primitives and symbolic operations representing the relationships among them; (2) To execute the resulting H-expressions, we design a hybrid executor, which integrates the deterministic rules to translate the symbolic operations with a drop-in neural reader network to answer each decomposed simple question. Hence, the proposed framework can be viewed as a top-down question parsing followed by a bottom-up answer backtracking. The resulting H-expressions closely guide the execution process, offering higher precision besides better interpretability while still preserving the advantages of the neural readers for resolving its primitive elements. Our extensive experiments on MuSiQue, 2WikiQA, HotpotQA, and NQ show that the proposed parsing and hybrid execution framework outperforms existing approaches in supervised, few-shot, and zero-shot settings, while also effectively exposing its underlying reasoning process.

preprint 2024 arXiv

VKIE: The Application of Key Information Extraction on Video Text

Extracting structured information from videos is critical for numerous downstream applications in the industry. In this paper, we define a significant task of extracting hierarchical key information from visual texts on videos. To fulfill this task, we decouple it into four subtasks and introduce two implementation solutions called PipVKIE and UniVKIE. PipVKIE sequentially completes the four subtasks in continuous stages, while UniVKIE is improved by unifying all the subtasks into one backbone. Both PipVKIE and UniVKIE leverage multimodal information from vision, text, and coordinates for feature representation. Extensive experiments on one well-defined dataset demonstrate that our solutions can achieve remarkable performance and efficient inference speed.

preprint 2022 arXiv

Are Pretrained Transformers Robust in Intent Classification? A Missing Ingredient in Evaluation of Out-of-Scope Intent Detection

Pre-trained Transformer-based models were reported to be robust in intent classification. In this work, we first point out the importance of in-domain out-of-scope detection in few-shot intent recognition tasks and then illustrate the vulnerability of pre-trained Transformer-based models against samples that are in-domain but out-of-scope (ID-OOS). We construct two new datasets, and empirically show that pre-trained models do not perform well on both ID-OOS examples and general out-of-scope examples, especially on fine-grained few-shot intent detection tasks. To figure out how the models mistakenly classify ID-OOS intents as in-scope intents, we further conduct analysis on confidence scores and the overlapping keywords, as well as point out several prospective directions for future work. Resources are available on https://github.com/jianguoz/Few-Shot-Intent-Detection.

preprint 2022 arXiv

Attend, Memorize and Generate: Towards Faithful Table-to-Text Generation in Few Shots

Few-shot table-to-text generation is a task of composing fluent and faithful sentences to convey table content using limited data. Despite many efforts having been made towards generating impressive fluent sentences by fine-tuning powerful pre-trained language models, the faithfulness of generated content still needs to be improved. To this end, this paper proposes a novel approach Attend, Memorize and Generate (called AMG), inspired by the text generation process of humans. In particular, AMG (1) attends over the multi-granularity of context using a novel strategy based on table slot level and traditional token-by-token level attention to exploit both the table structure and natural linguistic information; (2) dynamically memorizes the table slot allocation states; and (3) generates faithful sentences according to both the context and memory allocation states. Comprehensive experiments with human evaluation on three domains (i.e., humans, songs, and books) of the Wiki dataset show that our model can generate higher qualified texts when compared with several state-of-the-art baselines, in both fluency and faithfulness.

preprint 2022 arXiv

ConsNet: Learning Consistency Graph for Zero-Shot Human-Object Interaction Detection

We consider the problem of Human-Object Interaction (HOI) Detection, which aims to locate and recognize HOI instances in the form of <human, action, object> in images. Most existing works treat HOIs as individual interaction categories, thus can not handle the problem of long-tail distribution and polysemy of action labels. We argue that multi-level consistencies among objects, actions and interactions are strong cues for generating semantic representations of rare or previously unseen HOIs. Leveraging the compositional and relational peculiarities of HOI labels, we propose ConsNet, a knowledge-aware framework that explicitly encodes the relations among objects, actions and interactions into an undirected graph called consistency graph, and exploits Graph Attention Networks (GATs) to propagate knowledge among HOI categories as well as their constituents. Our model takes visual features of candidate human-object pairs and word embeddings of HOI labels as inputs, maps them into visual-semantic joint embedding space and obtains detection results by measuring their similarities. We extensively evaluate our model on the challenging V-COCO and HICO-DET datasets, and results validate that our approach outperforms state-of-the-arts under both fully-supervised and zero-shot settings. Code is available at https://github.com/yeliudev/ConsNet.

preprint 2022 arXiv

Contrastive Graph Multimodal Model for Text Classification in Videos

The extraction of text information in videos serves as a critical step towards semantic understanding of videos. It usually involves two steps: (1) text recognition and (2) text classification. To localize texts in videos, we can resort to large numbers of text recognition methods based on OCR technology. However, to our knowledge, there is no existing work focused on the second step of video text classification, which will limit the guidance to downstream tasks such as video indexing and browsing. In this paper, we are the first to address this new task of video text classification by fusing multimodal information to deal with the challenging scenario where different types of video texts may be confused with various colors, unknown fonts and complex layouts. In addition, we tailor a specific module called CorrelationNet to reinforce feature representation by explicitly extracting layout information. Furthermore, contrastive learning is utilized to explore inherent connections between samples using plentiful unlabeled videos. Finally, we construct a new well-defined industrial dataset from the news domain, called TI-News, which is dedicated to building and evaluating video text recognition and classification applications. Extensive experiments on TI-News demonstrate the effectiveness of our method.

preprint 2022 arXiv

Incentive Mechanism Design for Emergency Frequency Control in Multi-Infeed Hybrid AC-DC System

In multi-infeed hybrid AC-DC (MIDC) systems, the emergency frequency control (EFC) with LCC-HVDC systems participating is of vital importance for system frequency stability. Nevertheless, when regional power systems are operated by different decision-makers, the LCC-HVDC systems and their connected AC systems might be unwilling to participate in the EFC due to the costs and losses. In this paper, to incentivize the LCC-HVDC systems and their connected adjacent AC systems to participate in the droop-based EFC, a novel control-parameter-based incentive mechanism is proposed, which can deal with various possible emergency frequency faults. Then, a non-cooperative-based incentive game model is formulated to implement the incentive mechanism in the MIDC system. An algorithm for seeking the Nash equilibrium is designed, and the uniqueness of Nash equilibrium is proven. Moreover, the individual rationality, incentive compatibility and social optimality of the proposed mechanism are analyzed and proven. The effectiveness of the proposed incentive mechanism is verified through a case study.

preprint 2022 arXiv

OS-MSL: One Stage Multimodal Sequential Link Framework for Scene Segmentation and Classification

Scene segmentation and classification (SSC) serve as a critical step towards the field of video structuring analysis. Intuitively, joint learning of these two tasks can promote each other by sharing common information. However, scene segmentation concerns more on the local difference between adjacent shots while classification needs the global representation of scene segments, which probably leads to the model dominated by one of the two tasks in the training phase. In this paper, from an alternate perspective to overcome the above challenges, we unite these two tasks into one task by a new form of predicting shots link: a link connects two adjacent shots, indicating that they belong to the same scene or category. To this end, we propose a general One Stage Multimodal Sequential Link Framework (OS-MSL) to both distinguish and leverage the two-fold semantics by reforming the two learning tasks into a unified one. Furthermore, we tailor a specific module called DiffCorrNet to explicitly extract the information of differences and correlations among shots. Extensive experiments are conducted on a brand-new large-scale dataset collected from real-world applications and on MovieScenes. Both sets of results demonstrate the effectiveness of our proposed method against strong baselines.

preprint 2022 arXiv

Practical Evaluation of Adversarial Robustness via Adaptive Auto Attack

Defense models against adversarial attacks have grown significantly, but the lack of practical evaluation methods has hindered progress. Evaluation can be defined as looking for defense models' lower bound of robustness given a budget number of iterations and a test dataset. A practical evaluation method should be convenient (i.e., parameter-free), efficient (i.e., fewer iterations) and reliable (i.e., approaching the lower bound of robustness). Towards this target, we propose a parameter-free Adaptive Auto Attack (A$^3$) evaluation method which addresses the efficiency and reliability in a test-time-training fashion. Specifically, by observing that adversarial examples to a specific defense model follow some regularities in their starting points, we design an Adaptive Direction Initialization strategy to speed up the evaluation. Furthermore, to approach the lower bound of robustness under the budget number of iterations, we propose an online statistics-based discarding strategy that automatically identifies and abandons hard-to-attack images. Extensive experiments demonstrate the effectiveness of our A$^3$. Particularly, we apply A$^3$ to nearly 50 widely-used defense models. By consuming much fewer iterations than existing methods, i.e., $1/10$ on average (10$\times$ speed up), we achieve lower robust accuracy in all cases. Notably, we won first place out of 1681 teams in CVPR 2021 White-box Adversarial Attacks on Defense Models competitions with this method. Code is available at: https://github.com/liuye6666/adaptive_auto_attack

preprint 2022 arXiv

The Large High Altitude Air Shower Observatory (LHAASO) Science Book (2021 Edition)

Since the science white paper of the Large High Altitude Air Shower Observatory (LHAASO) was published on arXiv in 2019 [e-Print: 1905.02773 (astro-ph.HE)], LHAASO has completed the transition from a project to an operational gamma-ray astronomical observatory. LHAASO is a new generation multi-component facility located in Daocheng, Sichuan province of China, at an altitude of 4410 meters. It aims at measuring with unprecedented sensitivity the spectrum, composition, and anisotropy of cosmic rays in the energy range between $10^{12}$ and $10^{18}$ eV, and acting simultaneously as a wide aperture (one steradian) continuously operating gamma-ray telescope in the energy range between $10^{11}$ and $10^{15}$ eV with the designed sensitivity of 1.3% of the Crab Unit (CU) above 100 TeV. LHAASO's capability of measuring simultaneously different shower components (electrons, muons, and Cherenkov/fluorescence light) will allow it to investigate the origin, acceleration, and propagation of CR through measurement of the energy spectrum, elemental composition, and anisotropy with unprecedented resolution. The remarkable sensitivity of LHAASO will play a key role in CR physics and gamma-ray astronomy for a general and comprehensive exploration of the high energy universe and will allow important studies of fundamental physics (such as indirect dark matter search, Lorentz invariance violation, quantum gravity) and solar and heliospheric physics. The LHAASO Collaboration organized an editorial working group and finished all editorial work of this science book, to summarize the instrumental features and outline the prospects of scientific research with the LHAASO experiment.

preprint 2022 arXiv

UMT: Unified Multi-modal Transformers for Joint Video Moment Retrieval and Highlight Detection

Finding relevant moments and highlights in videos according to natural language queries is a natural and highly valuable common need in the current video content explosion era. Nevertheless, jointly conducting moment retrieval and highlight detection is an emerging research topic, even though its component problems and some related tasks have already been studied for a while. In this paper, we present the first unified framework, named Unified Multi-modal Transformers (UMT), capable of realizing such joint optimization while also being easily degenerated to solve the individual problems. As far as we are aware, this is the first scheme to integrate multi-modal (visual-audio) learning for either joint optimization or the individual moment retrieval task, and it tackles moment retrieval as a keypoint detection problem using a novel query generator and query decoder. Extensive comparisons with existing methods and ablation studies on QVHighlights, Charades-STA, YouTube Highlights, and TVSum datasets demonstrate the effectiveness, superiority, and flexibility of the proposed method under various settings. Source code and pre-trained models are available at https://github.com/TencentARC/UMT.

preprint 2021 arXiv

Enriching Non-Autoregressive Transformer with Syntactic and Semantic Structures for Neural Machine Translation

The non-autoregressive models have boosted the efficiency of neural machine translation through parallelized decoding at the cost of effectiveness when comparing with the autoregressive counterparts. In this paper, we claim that the syntactic and semantic structures among natural language are critical for non-autoregressive machine translation and can further improve the performance. However, these structures are rarely considered in the existing non-autoregressive models. Inspired by this intuition, we propose to incorporate the explicit syntactic and semantic structures of languages into a non-autoregressive Transformer, for the task of neural machine translation. Moreover, we also consider the intermediate latent alignment within target sentences to better learn the long-term token dependencies. Experimental results on two real-world datasets (i.e., WMT14 En-De and WMT16 En-Ro) show that our model achieves a significantly faster speed, as well as keeps the translation quality when compared with several state-of-the-art non-autoregressive models.

preprint 2021 arXiv

KG-BART: Knowledge Graph-Augmented BART for Generative Commonsense Reasoning

Generative commonsense reasoning, which aims to empower machines to generate sentences with the capacity of reasoning over a set of concepts, is a critical bottleneck for text generation. Even the state-of-the-art pre-trained language generation models struggle at this task and often produce implausible and anomalous sentences. One reason is that they rarely consider incorporating the knowledge graph, which can provide rich relational information among the commonsense concepts. To promote the ability of commonsense reasoning for text generation, we propose a novel knowledge graph augmented pre-trained language generation model KG-BART, which encompasses the complex relations of concepts through the knowledge graph and produces more logical and natural sentences as output. Moreover, KG-BART can leverage the graph attention to aggregate the rich concept semantics that enhances the model generalization on unseen concept sets. Experiments on the benchmark CommonGen dataset verify the effectiveness of our proposed approach by comparing with several strong pre-trained language generation models; in particular, KG-BART outperforms BART by 5.80 and 4.60 in terms of BLEU-3 and BLEU-4, respectively. Moreover, we also show that the generated context by our model can work as background scenarios to benefit downstream commonsense QA tasks.

preprint 2020 arXiv

A General Initialization Scheme for Electromagnetic Transient Simulation: Towards Large-Scale Hybrid AC-DC Grids

With the large-scale hybrid AC-DC grids coming into being, electromagnetic transient (EMT) simulation is required to accurately describe the dynamics of systems. However, the EMT steady-state initialization for hybrid AC-DC system is difficult and time-consuming when the system scale is huge. In order to provide a stable snapshot for EMT simulation with nonlinear components and black-box components, this paper proposes a general initialization scheme for EMT simulation (EMT-GIS) which can be implemented in the electromagnetic transient program (EMTP)-type simulators. First, an integrated power flow (IPF) algorithm is introduced to provide the steady-state results. Then, an initialized snapshot calculation-and-splicing mechanism is designed for EMT-GIS. The proposed EMT-GIS is tested using a hybrid AC-DC system in China on the CloudPSS simulation platform. Test results verify the effectiveness of the proposed EMT-GIS.

preprint 2020 arXiv

Commonsense Evidence Generation and Injection in Reading Comprehension

Humans tackle reading comprehension not only based on the given context itself but often rely on the commonsense beyond. To empower the machine with commonsense reasoning, in this paper, we propose a Commonsense Evidence Generation and Injection framework in reading comprehension, named CEGI. The framework injects two kinds of auxiliary commonsense evidence into comprehensive reading to equip the machine with the ability of rational thinking. Specifically, we build two evidence generators: the first generator aims to generate textual evidence via a language model; the other generator aims to extract factual evidence (automatically aligned text-triples) from a commonsense knowledge graph after graph completion. This evidence incorporates contextual commonsense and serves as additional input to the model. Thereafter, we propose a deep contextual encoder to extract semantic relationships among the paragraph, question, option, and evidence. Finally, we employ a capsule network to extract different linguistic units (word and phrase) from the relations, and dynamically predict the optimal option based on the extracted units. Experiments on the CosmosQA dataset demonstrate that the proposed CEGI model outperforms the current state-of-the-art approaches and achieves an accuracy of 83.6% on the leaderboard.

preprint 2020 arXiv

Fine-Grained Urban Flow Inference

The ubiquitous deployment of monitoring devices in urban flow monitoring systems induces a significant cost for maintenance and operation. A technique is required to reduce the number of deployed devices, while preventing the degeneration of data accuracy and granularity. In this paper, we present an approach for inferring the real-time and fine-grained crowd flows throughout a city based on coarse-grained observations. This task exhibits two challenges: the spatial correlations between coarse- and fine-grained urban flows, and the complexities of external impacts. To tackle these issues, we develop a model entitled UrbanFM which consists of two major parts: 1) an inference network to generate fine-grained flow distributions from coarse-grained inputs that uses a feature extraction module and a novel distributional upsampling module; 2) a general fusion subnet to further boost the performance by considering the influence of different external factors. This structure provides outstanding effectiveness and efficiency for small scale upsampling. However, the single-pass upsampling used by UrbanFM is insufficient at higher upscaling rates. Therefore, we further present UrbanPy, a cascading model for progressive inference of fine-grained urban flows by decomposing the original tasks into multiple subtasks. Compared to UrbanFM, such an enhanced structure demonstrates favorable performance for larger-scale inference tasks.

preprint 2020 arXiv

Optimal Emergency Frequency Control Based on Coordinated Droop in Multi-Infeed Hybrid AC-DC System

In multi-infeed hybrid AC-DC (MIDC) systems, the asynchronous interconnection between regional grids, the complicated system dynamics and possible cascading failures have an enormous effect on the frequency stability. In order to deal with the frequency instability problems in emergency situations, this paper proposes a decentralized emergency frequency control strategy based on coordinated droop for the MIDC system. First, a P-f droop control for LCC-HVDC systems is introduced and the coordinated droop mechanism among LCC-HVDC systems and generators is designed. Then, to reasonably allocate the power imbalance among LCC-HVDC systems and generators, an optimal emergency frequency control (OEFC) problem is formulated, and the optimal droop coefficients are selected in a decentralized approach, which can deal with various control objectives. A Lyapunov stability analysis shows that the closed-loop equilibrium is locally asymptotically stable considering the LCC-HVDC dynamics. The effectiveness of the proposed emergency control strategy is verified through simulations.

preprint 2020 arXiv

Revisiting Convolutional Neural Networks for Citywide Crowd Flow Analytics

Citywide crowd flow analytics is of great importance to smart city efforts. It aims to model the crowd flow (e.g., inflow and outflow) of each region in a city based on historical observations. Nowadays, Convolutional Neural Networks (CNNs) have been widely adopted in raster-based crowd flow analytics by virtue of their capability in capturing spatial dependencies. After revisiting CNN-based methods for different analytics tasks, we expose two common critical drawbacks in the existing uses: 1) inefficiency in learning global spatial dependencies, and 2) overlooking latent region functions. To tackle these challenges, in this paper we present a novel framework entitled DeepLGR that can be easily generalized to address various citywide crowd flow analytics problems. This framework consists of three parts: 1) a local feature extraction module to learn representations for each region; 2) a global context module to extract global contextual priors and upsample them to generate the global features; and 3) a region-specific predictor based on tensor decomposition to provide customized predictions for each region, which is very parameter-efficient compared to previous methods. Extensive experiments on two typical crowd flow analytics tasks demonstrate the effectiveness, stability, and generality of our framework.

preprint 2020 arXiv

The Impact of the Physical Layer on the Performance of Concurrent Transmissions

The popularity of concurrent transmissions (CT) has soared after recent studies have shown their feasibility on the four physical layers specified by BLE 5, hence providing an alternative to the use of IEEE 802.15.4 for the design of reliable and efficient low-power wireless protocols. However, to date, the extent to which physical layer properties affect the performance of CT has not yet been investigated in detail. This paper fills this gap and provides the first extensive study on the impact of the physical layer on CT-based solutions using IEEE 802.15.4 and BLE 5. We first highlight through simulation how the impact of errors induced by de-synchronization and beating on the performance of CT highly depends on the choice of the underlying physical layer. We then confirm these observations experimentally on real hardware through an analysis of the bit error distribution across received packets, unveiling possible techniques to effectively handle these errors. We further study the performance of CT-based flooding protocols in the presence of radio interference on a large-scale, and derive important insights on how the used physical layer affects their dependability.

preprint 2019 arXiv

Bas-relief Generation from Point Clouds Based on Normal Space Compression with Real-time Adjustment on CPU

Bas-relief generation based on 3D models is a hot topic in computer graphics. State-of-the-art algorithms take a mesh surface as input, but real-time interaction via CPU cannot be realized. In this paper, a bas-relief generation algorithm that takes a scattered point cloud as input is proposed. The algorithm takes normal vectors as the operation object and the variation of the local surface as the compression criterion. By constructing and solving linear equations of bas-relief vertices, the closed-form solution can be obtained. Since there is no need to compute discrete gradients on a point cloud lacking topology information, it is easier to implement and more intuitive than gradient domain methods. The algorithm provides parameters to adjust the bas-relief height, saturation and detail richness. At the same time, through the solution strategy based on the subspace, it realizes the real-time adjustment of the bas-relief effect based on the computing power of a consumer CPU. In addition, an iterative solution to generate a bas-relief model of a specified height is presented to meet specific application requirements. Experiments show that our algorithm provides a unified solution for various types of bas-relief creation and can generate bas-reliefs with good saturation and rich details.

preprint 2019 arXiv

Binary-Tree Encoding for Uniform Binary Sources in Index Modulation Systems

The problem of designing bit-to-pattern mappings and power allocation schemes for orthogonal frequency-division multiplexing (OFDM) systems that employ subcarrier index modulation (IM) is considered. We assume the binary source conveys a stream of independent, uniformly distributed bits to the pattern mapper, which introduces a constraint on the pattern transmission probability distribution that can be quantified using a binary tree formalism. Under this constraint, we undertake the task of maximizing the achievable rate subject to the availability of channel knowledge at the transmitter. The optimization variables are the pattern probability distribution (i.e., the bit-to-pattern mapping) and the transmit powers allocated to active subcarriers. To solve the problem, we first consider the relaxed problem where pattern probabilities are allowed to take any values in the interval [0,1] subject to a sum probability constraint. We develop (approximately) optimal solutions to the relaxed problem by using new bounds and asymptotic results, and then use a novel heuristic algorithm to project the relaxed solution onto a point in the feasible set of the constrained problem. Numerical analysis shows that this approach is capable of achieving the maximum mutual information for the relaxed problem in low and high-SNR regimes and offers noticeable benefits in terms of achievable rate relative to a conventional OFDM-IM benchmark.