Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
192 works
0 followers
55 topics
4 close collaborators

Published work

192 published item(s)

preprint · 2026 · arXiv

A Real-time Scale-robust Network for Glottis Segmentation in Nasal Transnasal Intubation

Nasotracheal intubation (NTI) is a critical clinical procedure for establishing and maintaining patient airway patency. Machine-assisted NTI has emerged as a pivotal approach for optimizing procedural efficiency and minimizing manual intervention. However, visual detection algorithms employed for NTI navigation encounter significant challenges, including complex anatomical environments and suboptimal illumination conditions surrounding the glottis. Additionally, the glottis presents considerable scale variability throughout the procedure, initially appearing as a small, difficult-to-capture structure before expanding to occupy nearly the entire field of view. Moreover, traditional visual detection methods often have high computational costs, making real-time, high-precision detection on portable devices challenging. To enhance NTI efficacy and address these challenges, this paper proposes a novel glottis segmentation framework optimized for vision-assisted NTI applications. First, we designed a lightweight, multi-receptive field feature extraction module to reduce intra-class differences, achieving robustness to scale variations of the glottis. This module was then stacked to form the backbone and neck of our network. Subsequently, we developed an advanced label assignment method and redefined the number of samples to further reduce intra-class differences and enhance accuracy in the complex NTI environment. Experiments on three distinct datasets demonstrate that our network surpasses state-of-the-art algorithms, achieving a segmentation mDice of 92.9% with a compact model size of 19 MB and an inference speed exceeding 170 frames per second. Our code and datasets are available at https://github.com/HBUT-CV/GlottisNet.
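For readers unfamiliar with the mDice figure quoted above, here is a minimal sketch of how a mean Dice score over binary masks is commonly computed. It is not taken from the GlottisNet code; the function names `dice` and `mean_dice`, the nested-list mask format, and the convention that two empty masks score 1.0 are all assumptions made for illustration.

```python
def dice(pred, target):
    """Dice coefficient between two binary masks given as nested lists of 0/1."""
    flat_p = [v for row in pred for v in row]
    flat_t = [v for row in target for v in row]
    # Overlap: pixels that are foreground in both masks.
    inter = sum(p * t for p, t in zip(flat_p, flat_t))
    total = sum(flat_p) + sum(flat_t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * inter / total


def mean_dice(pairs):
    """Average Dice over (prediction, ground truth) mask pairs."""
    scores = [dice(p, t) for p, t in pairs]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    pred = [[1, 1], [0, 0]]
    truth = [[1, 0], [0, 0]]
    print(round(dice(pred, truth), 4))  # 2*1 / (2+1) -> 0.6667
```

A score of 1.0 means the predicted and reference masks coincide exactly; the abstract's 92.9% would correspond to 0.929 on this scale, averaged over the evaluation images.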

preprint · 2026 · arXiv

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

General reasoning represents a long-standing and formidable challenge in artificial intelligence. Recent breakthroughs, exemplified by large language models (LLMs) and chain-of-thought prompting, have achieved considerable success on foundational reasoning tasks. However, this success is heavily contingent upon extensive human-annotated demonstrations, and models' capabilities are still insufficient for more complex problems. Here we show that the reasoning abilities of LLMs can be incentivized through pure reinforcement learning (RL), obviating the need for human-labeled reasoning trajectories. The proposed RL framework facilitates the emergent development of advanced reasoning patterns, such as self-reflection, verification, and dynamic strategy adaptation. Consequently, the trained model achieves superior performance on verifiable tasks such as mathematics, coding competitions, and STEM fields, surpassing its counterparts trained via conventional supervised learning on human demonstrations. Moreover, the emergent reasoning patterns exhibited by these large-scale models can be systematically harnessed to guide and enhance the reasoning capabilities of smaller models.

preprint · 2026 · arXiv

Don't Start Over: A Cost-Effective Framework for Migrating Personalized Prompts Between LLMs

Personalization in Large Language Models (LLMs) often relies on user-specific soft prompts. However, these prompts become obsolete when the foundation model is upgraded, necessitating costly, full-scale retraining. To overcome this limitation, we propose the Prompt-level User Migration Adapter (PUMA), a lightweight framework to efficiently migrate personalized prompts across incompatible models. PUMA utilizes a parameter-efficient adapter to bridge the semantic gap, combined with a group-based user selection strategy to significantly reduce training costs. Experiments on three large-scale datasets show our method matches or even surpasses the performance of retraining from scratch, reducing computational cost by up to 98%. The framework demonstrates strong generalization across diverse model architectures and robustness in advanced scenarios like chained and aggregated migrations, offering a practical path for the sustainable evolution of personalized AI by decoupling user assets from the underlying models.

preprint · 2026 · arXiv

FinDeepForecast: A Live Multi-Agent System for Benchmarking Deep Research Agents in Financial Forecasting

Deep Research (DR) Agents powered by advanced Large Language Models (LLMs) have fundamentally shifted the paradigm for completing complex research tasks. Yet, a comprehensive and live evaluation of their forecasting performance on real-world, research-oriented tasks in high-stakes domains (e.g., finance) remains underexplored. We introduce FinDeepForecast, the first live, end-to-end multi-agent system for automatically evaluating DR agents by continuously generating research-oriented financial forecasting tasks. This system is equipped with a dual-track taxonomy, enabling the dynamic generation of recurrent and non-recurrent forecasting tasks at both corporate and macro levels. With this system, we generate FinDeepForecastBench, a weekly evaluation benchmark over a ten-week horizon, encompassing 8 global economies and 1,314 listed companies, and evaluate 13 representative methods. Extensive experiments show that, while DR agents consistently outperform strong baselines, their performance still falls short of genuine forward-looking financial reasoning. We expect the proposed FinDeepForecast system to consistently facilitate future advancements of DR agents in research-oriented financial forecasting tasks. The benchmark and leaderboard are publicly available on the OpenFinArena Platform.

preprint · 2026 · arXiv

FinDeepResearch: Evaluating Deep Research Agents in Rigorous Financial Analysis

Deep Research (DR) agents, powered by advanced Large Language Models (LLMs), have recently garnered increasing attention for their capability in conducting complex research tasks. However, existing literature lacks a rigorous and systematic evaluation of DR agents' capabilities in critical research analysis. To address this gap, we first propose HisRubric, a novel evaluation framework with a hierarchical analytical structure and a fine-grained grading rubric for rigorously assessing DR agents' capabilities in corporate financial analysis. This framework mirrors the professional analyst's workflow, progressing from data recognition to metric calculation, and finally to strategic summarization and interpretation. Built on this framework, we construct a FinDeepResearch benchmark that comprises 64 listed companies from 8 financial markets across 4 languages, encompassing a total of 15,808 grading items. We further conduct extensive experiments on FinDeepResearch using 16 representative methods, including 6 DR agents, 5 LLMs equipped with both deep reasoning and search capabilities, and 5 LLMs with deep reasoning capabilities only. The results reveal the strengths and limitations of these approaches across diverse capabilities, financial markets, and languages, offering valuable insights for future research and development. The benchmark and evaluation code are publicly available at https://OpenFinArena.com/.

preprint · 2026 · arXiv

FlowCompile: An Optimizing Compiler for Structured LLM Workflows

Structured LLM workflows, where specialized LLM sub-agents execute according to a predefined graph, have become a powerful abstraction for solving complex tasks. Optimizing such workflows, i.e., selecting configurations for each sub-agent to balance accuracy and latency, is challenging due to the combinatorial design space over model choices, reasoning budgets, and workflow structures. Existing cost-aware methods largely treat workflow optimization as a routing problem, selecting a configuration at inference time for each query according to the accuracy-latency objective used during training. We argue that structured LLM workflows can also be optimized from a compilation perspective: before deployment, the system can globally explore the workflow design space and construct a reusable set of workflow-level configurations spanning diverse accuracy-latency trade-offs. Drawing inspiration from machine learning compilers, we introduce FlowCompile, a structured LLM workflow compiler that performs compile-time design space exploration to identify a high-quality, reusable trade-off set. FlowCompile decomposes a workflow into sub-agents, profiles each sub-agent under diverse configurations, and composes these measurements through a structure-aware proxy to estimate workflow-level accuracy and latency. It then identifies diverse high-quality configurations in a single compile-time pass, without retraining or online adaptation. Experiments across diverse workflows and challenging benchmarks show that FlowCompile consistently outperforms heuristically optimized workflow configurations and routing-based baselines, delivering up to 6.4x speedup. The compiled configuration set further serves as a reusable optimization artifact, enabling flexible deployment under varying runtime preferences and supporting downstream selection or routing.
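The idea of a reusable accuracy-latency trade-off set can be illustrated with a small sketch. This is not FlowCompile's algorithm or API: the `Config` type, the functions `tradeoff_set` and `pick`, and the sample numbers are hypothetical, and the sketch assumes workflow-level accuracy and latency estimates are already available. It only shows how a non-dominated set, once built, can be queried under different runtime preferences.

```python
from typing import List, NamedTuple, Optional


class Config(NamedTuple):
    name: str
    accuracy: float   # higher is better
    latency_s: float  # lower is better


def tradeoff_set(configs: List[Config]) -> List[Config]:
    """Keep configurations that no other configuration beats on both axes."""
    kept = []
    best_acc = float("-inf")
    # Sweep from fastest to slowest; on latency ties, the most accurate comes first.
    for c in sorted(configs, key=lambda c: (c.latency_s, -c.accuracy)):
        if c.accuracy > best_acc:
            kept.append(c)
            best_acc = c.accuracy
    return kept


def pick(configs: List[Config], latency_budget_s: float) -> Optional[Config]:
    """Most accurate non-dominated configuration within the latency budget."""
    feasible = [c for c in tradeoff_set(configs) if c.latency_s <= latency_budget_s]
    return max(feasible, key=lambda c: c.accuracy) if feasible else None


if __name__ == "__main__":
    candidates = [
        Config("A", 0.70, 1.0),
        Config("B", 0.80, 2.0),
        Config("C", 0.75, 3.0),  # dominated by B: slower and less accurate
        Config("D", 0.90, 5.0),
    ]
    print([c.name for c in tradeoff_set(candidates)])  # ['A', 'B', 'D']
    print(pick(candidates, latency_budget_s=3.0).name)  # B
```

Because the set is computed once, changing the latency budget at deployment time only requires another call to `pick`, which mirrors the "reusable optimization artifact" framing in the abstract.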

preprint · 2026 · arXiv

GIFT: Games as Informal Training for Generalizable LLMs

While Large Language Models (LLMs) have achieved remarkable success in formal learning tasks such as mathematics and code generation, they still struggle with the "practical wisdom" and generalizable intelligence, such as strategic creativity and social reasoning, that characterize human cognition. This gap arises from a lack of informal learning, which thrives on interactive feedback rather than goal-oriented instruction. In this paper, we propose treating Games as a primary environment for LLM informal learning, leveraging their intrinsic reward signals and abstracted complexity to cultivate diverse competencies. To address the performance degradation observed in multi-task learning, we introduce a Nested Training Framework. Unlike naive task mixing optimizing an implicit "OR" objective, our framework employs sequential task composition to enforce an explicit "AND" objective, compelling the model to master multiple abilities simultaneously to achieve maximal rewards. Using GRPO-based reinforcement learning across Matrix Games, TicTacToe, and Who's the Spy games, we demonstrate that integrating game-based informal learning not only prevents task interference but also significantly bolsters the model's generalization across broad ability-oriented benchmarks. The framework and implementation are publicly available.

preprint · 2026 · arXiv

GPS-Synchronized Monitoring of Core-collapse Supernova Bursts with PandaX-4T via Coherent Elastic Neutrino Nuclear Scattering

The landmark detection of neutrinos from SN1987A marked the dawn of neutrino astrophysics. The neutrino burst provided essential insights into fundamental properties of neutrinos, and served as key probes of stellar evolution and supernova dynamics. The recent advancement in coherent elastic neutrino-nucleus scattering enables the detection of core-collapse supernova burst neutrinos using tonne-scale liquid xenon detectors originally designed for dark matter direct detection. Leveraging this capability, we developed and deployed an online supernova monitoring system for the PandaX-4T experiment. This system features a GPS module with millisecond-level timing precision, a low false-alarm rate, and high sensitivity to galactic core-collapse supernova explosion events. The methodology is robust, directly scalable, and planned for implementation in the next-generation PandaX-20T experiment.

preprint · 2026 · arXiv

Guardians of the Hair: Rescuing Soft Boundaries in Depth, Stereo, and Novel Views

Soft boundaries, like thin hairs, are commonly observed in natural and computer-generated imagery, but they remain challenging for 3D vision due to the ambiguous mixing of foreground and background cues. This paper introduces Guardians of the Hair (HairGuard), a framework designed to recover fine-grained soft boundary details in 3D vision tasks. Specifically, we first propose a novel data curation pipeline that leverages image matting datasets for training and design a depth fixer network to automatically identify soft boundary regions. With a gated residual module, the depth fixer refines depth precisely around soft boundaries while maintaining global depth quality, allowing plug-and-play integration with state-of-the-art depth models. For view synthesis, we perform depth-based forward warping to retain high-fidelity textures, followed by a generative scene painter that fills disoccluded regions and eliminates redundant background artifacts within soft boundaries. Finally, a color fuser adaptively combines warped and inpainted results to produce novel views with consistent geometry and fine-grained details. Extensive experiments demonstrate that HairGuard achieves state-of-the-art performance across monocular depth estimation, stereo image/video conversion, and novel view synthesis, with significant improvements in soft boundary regions.

preprint · 2026 · arXiv

Large Language Models as Amortized Pareto-Front Generators for Constrained Bi-Objective Convex Optimization

Generating feasible Pareto fronts for constrained bi-objective continuous optimization is central to multi-criteria decision-making. Existing methods usually rely on iterative scalarization, evolutionary search, or problem-specific solvers, requiring repeated optimization for each instance. We introduce DIPS, an end-to-end framework that fine-tunes large language models as amortized Pareto-front generators for constrained bi-objective convex optimization. Given a textual problem description, DIPS directly outputs an ordered set of feasible continuous decision vectors approximating the Pareto front. To make continuous optimization compatible with autoregressive language modeling, DIPS combines a compact discretization scheme, Numerically Grounded Token Initialization for new numerical tokens, and Three-Phase Curriculum Optimization, which progressively aligns structural validity, feasibility, and Pareto-front quality. Across five families of constrained bi-objective convex problems, a fine-tuned 7B-parameter model achieves normalized hypervolume ratios of 95.29% to 98.18% relative to reference fronts. With vLLM-accelerated inference, DIPS solves one instance in as little as 0.16 seconds and outperforms general-purpose and reasoning LLM baselines under the evaluated setting. These results suggest that LLMs can serve as effective amortized generators for continuous Pareto-front approximation.
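The hypervolume ratio reported above can be made concrete with a short sketch. It assumes both objectives are minimized, that a reference (nadir-like) point is supplied, and that the "normalized hypervolume ratio" is the hypervolume of the generated front divided by that of the reference front; the abstract does not spell out the exact normalization, so treat this as one plausible reading. The function names are hypothetical and unrelated to the DIPS code.

```python
def pareto_filter(points):
    """Keep non-dominated points when both objectives are minimized."""
    front = []
    best_f2 = float("inf")
    # After sorting by f1, a point survives only if it improves on f2.
    for f1, f2 in sorted(set(points)):
        if f2 < best_f2:
            front.append((f1, f2))
            best_f2 = f2
    return front


def hypervolume_2d(points, ref):
    """Area dominated by the front and bounded by the reference point `ref`."""
    inside = [p for p in points if p[0] < ref[0] and p[1] < ref[1]]
    hv = 0.0
    prev_f2 = ref[1]
    # Front is ordered by increasing f1 and decreasing f2; add one strip per point.
    for f1, f2 in pareto_filter(inside):
        hv += (ref[0] - f1) * (prev_f2 - f2)
        prev_f2 = f2
    return hv


def hypervolume_ratio(generated, reference_front, ref):
    """Hypervolume of the generated front relative to the reference front."""
    return hypervolume_2d(generated, ref) / hypervolume_2d(reference_front, ref)


if __name__ == "__main__":
    reference_front = [(1, 3), (2, 2), (3, 1)]
    generated = [(1, 3), (3, 1)]
    print(hypervolume_2d(reference_front, (4, 4)))  # 6.0
    print(hypervolume_ratio(generated, reference_front, (4, 4)))
```

A ratio close to 1 means the generated points cover nearly the same dominated region as the reference front; in the toy example the missing middle point costs one unit of area out of six.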

preprint · 2026 · arXiv

LLM-ReSum: A Framework for LLM Reflective Summarization through Self-Evaluation

Reliable evaluation of large language model (LLM)-generated summaries remains an open challenge, particularly across heterogeneous domains and document lengths. We conduct a comprehensive meta-evaluation of 14 automatic summarization metrics and LLM-based evaluators across seven datasets spanning five domains, covering documents from short news articles to long scientific, governmental, and legal texts (2K-27K words) with over 1,500 human-annotated summaries. Our results show that traditional lexical overlap metrics (e.g., ROUGE, BLEU) exhibit weak or negative correlation with human judgments, while task-specific neural metrics and LLM-based evaluators achieve substantially higher alignment, especially for linguistic quality assessment. Leveraging these findings, we propose LLM-ReSum, a self-reflective summarization framework that integrates LLM-based evaluation and generation in a closed feedback loop without model finetuning. Across three domains, LLM-ReSum improves low-quality summaries by up to 33% in factual accuracy and 39% in coverage, with human evaluators preferring refined summaries in 89% of cases. We additionally introduce PatentSumEval, a new human-annotated benchmark for legal document summarization comprising 180 expert-evaluated summaries. All code and datasets will be released on GitHub.

preprint · 2026 · arXiv

MASH: A Multiplatform and Multimodal Annotated Dataset for Societal Impact of Hurricane

Natural disasters pose multidimensional threats to human societies, and hurricanes are among the most disruptive: they not only cause severe physical damage but also spark widespread discussion on social media platforms. Existing datasets for studying societal impacts of hurricanes often focus on outdated hurricanes and are limited to a single social media platform, failing to capture the broader societal impact in today's diverse social media environment. Moreover, existing datasets annotate visual and textual content of the post separately, failing to account for the multimodal nature of social media posts. To address these gaps, we present a Multiplatform and Multimodal Annotated Dataset for Societal Impact of Hurricane (MASH) that includes 59,607 relevant social media posts from Reddit, TikTok, and YouTube. In addition, all relevant social media posts are annotated in a multimodal approach that considers both textual and visual content on three dimensions: Humanitarian Classes, Bias Classes, and Information Integrity Classes. To the best of our knowledge, MASH is the first large-scale, multi-platform, multimodal, and multi-dimensionally annotated dataset centered on hurricane disasters. In addition, we introduce an online platform that supports interactive data exploration, provides preliminary analytical results, and allows users to share their insights regarding the societal impacts of hurricanes. We envision that MASH can contribute to the study of hurricanes' impact on society, such as disaster response, disaster severity classification, public sentiment analysis, disaster policy making, and bias identification. The dataset is publicly available at https://huggingface.co/datasets/YRC10/MASH under the Creative Commons Attribution 4.0 (CC BY 4.0) license.

preprint · 2026 · arXiv

Multimodal Cultural Heritage Knowledge Graph Extension with Language and Vision Models

The preservation and interpretation of cultural heritage increasingly rely on digital technologies, among which Knowledge Graphs (KGs) stand out for their ability to structure vast amounts of data. However, the construction and expansion of these KGs often face challenges due to the diverse and complex nature of cultural heritage information. In this paper, we propose a novel approach for extending KG resources in the domain of cultural heritage, which we applied to French data. First, we introduce a new knowledge graph in the domain of French cultural heritage, WJoconde, which is distinguished by its multimodality as it integrates both textual and image information of the entities. We further introduce three variants of WJoconde to facilitate downstream research, such as Knowledge Graph Completion (KGC). We also built a comprehensive benchmark for KGC methods on our dataset. Second, we propose a new framework for extending cultural heritage KGs using multi-modal approaches leveraging Large Language Models (LLMs) and Vision-Language Models (VLMs), which includes automated data extraction from unstructured resources combined with a special validation pipeline for grounding the output of both models, to further extend WJoconde. Our results show that by integrating the rich text and image information in cultural heritage data, we can efficiently enhance KGs with high reliability. We open-source all code and benchmark datasets with text and images, as well as the original data with an interactive access point.

preprint · 2026 · arXiv

OCR-Memory: Optical Context Retrieval for Long-Horizon Agent Memory

Autonomous LLM agents increasingly operate in long-horizon, interactive settings where success depends on reusing experience accumulated over extended histories. However, existing agent memory systems are fundamentally constrained by text-context budgets: storing or revisiting raw trajectories is prohibitively token-expensive, while summarization and text-only retrieval trade token savings for information loss and fragmented evidence. To address this limitation, we propose Optical Context Retrieval Memory (OCR-Memory), a memory framework that leverages the visual modality as a high-density representation of agent experience, enabling retention of arbitrarily long histories with minimal prompt overhead at retrieval time. Specifically, OCR-Memory renders historical trajectories into images annotated with unique visual identifiers. OCR-Memory retrieves stored experience via a locate-and-transcribe paradigm that selects relevant regions through visual anchors and retrieves the corresponding verbatim text, avoiding free-form generation and reducing hallucination. Experiments on long-horizon agent benchmarks show consistent gains under strict context limits, demonstrating that optical encoding increases effective memory capacity while preserving faithful evidence recovery.

preprint · 2026 · arXiv

PERM: Psychology-grounded Empathetic Reward Modeling for Large Language Models

Large Language Models (LLMs) are increasingly deployed in human-centric applications, yet they often fail to provide substantive emotional support. While Reinforcement Learning (RL) has been utilized to enhance empathy of LLMs, existing reward models typically evaluate empathy from a single perspective, overlooking the inherently bidirectional interaction nature of empathy between the supporter and seeker as defined by Empathy Cycle theory. To address this limitation, we propose Psychology-grounded Empathetic Reward Modeling (PERM). PERM operationalizes empathy evaluation through a bidirectional decomposition: 1) Supporter perspective, assessing internal resonation and communicative expression; 2) Seeker perspective, evaluating emotional reception. Additionally, it incorporates a bystander perspective to monitor overall interaction quality. Extensive experiments on a widely-used emotional intelligence benchmark and an industrial daily conversation dataset demonstrate that PERM outperforms state-of-the-art baselines by over 10%. Furthermore, a blinded user study reveals a 70% preference for our approach, highlighting its efficacy in generating more empathetic responses. Our code, dataset, and models are available at https://github.com/ZhengWwwq/PERM.

preprint · 2026 · arXiv

PRTS: A Primitive Reasoning and Tasking System via Contrastive Representations

Vision-Language-Action (VLA) models advance robotic control via strong visual-linguistic priors. However, existing VLAs predominantly frame pretraining as supervised behavior cloning, overlooking the fundamental nature of robot learning as a goal-reaching process that requires understanding temporal task progress. We present PRTS (Primitive Reasoning and Tasking System), a VLA foundation model that reformulates pretraining through Goal-Conditioned Reinforcement Learning. By treating language instructions as goals and employing contrastive reinforcement learning, PRTS learns a unified embedding space where the inner product of state-action and goal embeddings approximates the log-discounted goal occupancy, the probability of reaching the language-specified goal from the current state-action, quantitatively assessing physical feasibility beyond static semantic matching. PRTS draws this dense goal-reachability supervision directly from offline trajectories without reward annotations, and folds it into the VLM backbone via a role-aware causal mask, incurring negligible overhead over vanilla behavior cloning. This paradigm endows the high-level reasoning system with intrinsic goal reachability awareness, bridging semantic reasoning and temporal task progress, and further benefits goal-conditioned action prediction. Pretrained on 167B tokens of diverse manipulation and embodied-reasoning data, PRTS reaches state-of-the-art performance on LIBERO, LIBERO-Pro, LIBERO-Plus, SimplerEnv, and a real-world suite of 14 complex tasks, with particularly substantial gains on long-horizon, contact-rich, and zero-shot novel-instruction settings, confirming that injecting goal-reachability awareness significantly improves both execution success and long-horizon planning of general-purpose robotic foundation policies.

preprint · 2026 · arXiv

Rethinking the Text-Vision Reasoning Imbalance in MLLMs through the Lens of Training Recipes

Multimodal large language models (MLLMs) have demonstrated strong capabilities on vision-and-language tasks. However, recent findings reveal an imbalance in their reasoning capabilities across visual and textual modalities. Specifically, current MLLMs often over-rely on textual cues while under-attending to visual content, resulting in suboptimal performance on tasks that require genuine visual reasoning. We refer to this phenomenon as the modality gap, defined as the performance disparity between text-centric and vision-centric inputs. In this paper, we analyze the modality gap through the lens of training recipes. We first show that existing training recipes tend to amplify this gap. Then, we systematically explore strategies to bridge it from two complementary perspectives: data and loss design. Our findings provide insights into developing training recipes that mitigate the modality gap and promote more balanced multimodal reasoning. Our code is publicly available at https://github.com/UCSB-NLP-Chang/Bridging-Modality-Gap.

preprint · 2026 · arXiv

Safactory: A Scalable Agentic Infrastructure for Training Trustworthy Autonomous Intelligence

As large models evolve from conversational assistants into autonomous agents, challenges increasingly arise from long-horizon decision making, tool use, and real environment interaction. Existing agentic infrastructure remains fragmented across evaluation, data management, and agent evolution, making it difficult to discover risks systematically and improve models in a continuous closed loop. In this report, we present Safactory, a scalable agent factory for trustworthy autonomous intelligence. Safactory integrates three tightly coupled platforms: a Parallel Simulation Platform for trajectory generation, a Trustworthy Data Platform for trajectory storage and experience extraction, and an Autonomous Evolution Platform for asynchronous reinforcement learning and on-policy distillation. As far as we know, Safactory is the first framework to propose a unified evolutionary pipeline for next-generation trustworthy autonomous intelligence.

preprint · 2026 · arXiv

Song Aesthetics Evaluation with Multi-Stem Attention and Hierarchical Uncertainty Modeling

Music generative artificial intelligence (AI) is rapidly expanding music content, necessitating automated song aesthetics evaluation. However, existing studies largely focus on speech, audio or singing quality, leaving song aesthetics underexplored. Moreover, conventional approaches often predict a precise Mean Opinion Score (MOS) value directly, which struggles to capture the nuances of human perception in song aesthetics evaluation. This paper proposes a song-oriented aesthetics evaluation framework, featuring two novel modules: 1) Multi-Stem Attention Fusion (MSAF) builds bidirectional cross-attention between mixture-vocal and mixture-accompaniment pairs, fusing them to capture complex musical features; 2) Hierarchical Granularity-Aware Interval Aggregation (HiGIA) learns multi-granularity score probability distributions, aggregates them into a score interval, and applies a regression within the interval to produce the final score. We evaluated on two datasets of full-length songs: SongEval dataset (AI-generated) and an internal aesthetics dataset (human-created), and compared with two state-of-the-art (SOTA) models. Results show that the proposed method achieves stronger performance for multi-dimensional song aesthetics evaluation.

preprint · 2026 · arXiv

Task-Aware Scanning Parameter Configuration for Robotic Inspection Using Vision Language Embeddings and Hyperdimensional Computing

Robotic laser profiling is widely used for dimensional verification and surface inspection, yet measurement fidelity is often dominated by sensor configuration rather than robot motion. Industrial profilers expose multiple coupled parameters, including sampling frequency, measurement range, exposure time, receiver dynamic range, and illumination, that are still tuned by trial-and-error; mismatches can cause saturation, clipping, or missing returns that cannot be recovered downstream. We formulate instruction-conditioned sensing parameter recommendation: given a pre-scan RGB observation and a natural-language inspection instruction, infer a discrete configuration over key parameters of a robot-mounted profiler. To benchmark this problem, we develop Instruct-Obs2Param, a real-world multimodal dataset linking inspection intents and multi-view pose and illumination variation across 16 objects to canonical parameter regimes. We then propose ScanHD, a hyperdimensional computing framework that binds instruction and observation into a task-aware code and performs parameter-wise associative reasoning with compact memories, matching discrete scanner regimes while yielding stable, interpretable, low-latency decisions. On Instruct-Obs2Param, ScanHD achieves 92.7% average exact accuracy and 98.1% average Win@1 accuracy across the five parameters, with strong cross-split generalization and low-latency inference suitable for deployment, outperforming rule-based heuristics, conventional multimodal models, and multimodal large language models. This work enables autonomous, instruction-conditioned sensing configuration from task intent and scene context, eliminating manual tuning and elevating sensor configuration from a static setting to an adaptive decision variable.

preprint · 2026 · arXiv

UniFixer: A Universal Reference-Guided Fixer for Diffusion-Based View Synthesis

With the recent surge of generative models, diffusion-based approaches have become mainstream for view synthesis tasks, either in an explicit depth-warp-inpaint or in an implicit end-to-end manner. Despite their success, both paradigms often suffer from noticeable quality degradation, e.g., blurred details and distorted structures, caused by pixel-to-latent compression and diffusion hallucination. In this paper, we investigate diffusion degradation from three key dimensions (i.e., spatial, temporal, and backbone-related) and propose UniFixer, a universal reference-guided framework that fixes diverse degradation artifacts via a coarse-to-fine strategy. Specifically, a reference pre-alignment module is first designed to perform coarse alignment between the reference view and the degraded novel view. A global structure anchoring mechanism then rectifies geometric distortions to ensure structural fidelity, followed by a local detail injection module that recovers fine-grained texture details for high-quality view synthesis. Our UniFixer serves as a plug-and-play refiner that achieves zero-shot fixing across different types of diffusion degradation, and extensive experiments verify our state-of-the-art performance on novel view synthesis and stereo conversion.

preprint2026arXiv

WOW-Seg: A Word-free Open World Segmentation Model

Open world image segmentation aims to achieve precise segmentation and semantic understanding of targets within images by addressing the infinitely open set of object categories encountered in the real world. However, traditional closed-set segmentation approaches struggle to adapt to complex open world scenarios, while foundation segmentation models such as SAM exhibit notable discrepancies between their strong segmentation capabilities and relatively weaker semantic understanding. To bridge these discrepancies, we propose WOW-Seg, a Word-free Open World Segmentation model for segmenting and recognizing objects from open-set categories. Specifically, WOW-Seg introduces a novel visual prompt module, Mask2Token, which transforms image masks into visual tokens and ensures their alignment with the VLLM feature space. Moreover, we introduce the Cascade Attention Mask to decouple information across different instances. This approach mitigates inter-instance interference, leading to a significant improvement in model performance. We further construct an open world region recognition test benchmark: the Region Recognition Dataset (RR-7K). With 7,662 classes, it represents the most extensive category-rich region recognition dataset to date. WOW-Seg attains strong results on the LVIS dataset, achieving a semantic similarity of 89.7 and a semantic IoU of 82.4. This performance surpasses the previous SOTA while using only one-eighth the parameter count. These results underscore the strong open world generalization capabilities of WOW-Seg. The code and related resources are available at https://github.com/AAwcAA/WOW-Seg-Meta.

preprint2025arXiv

Chiral superconductivity from spin polarized Chern band in twisted MoTe$_2$

Superconductivity has been observed in twisted MoTe$_2$ within the anomalous Hall metal parent state. Key signatures (a fully spin/valley-polarized normal state, anomalous Hall resistivity hysteresis, a superconducting phase adjacent to the fractional Chern insulating state, and a narrow superconducting dome at zero gating field) collectively indicate chiral superconductivity driven by intravalley pairing of electrons. Within the Kohn-Luttinger mechanism, we compute the superconducting phase diagram via the random phase approximation, incorporating Coulomb repulsion in a realistic continuum model. Our results identify a dominant intravalley pairing with a narrow superconducting dome of p+ip type at zero gate field. This chiral phase contrasts sharply with the much weaker time-reversal-symmetric intervalley pairing at finite gating field. Our work highlights the role of band topology in achieving robust topological superconductivity, and supports the chiral and topological nature of the superconductivity observed in twisted MoTe$_2$.

preprint2024arXiv

From Beginner to Expert: Modeling Medical Knowledge into General LLMs

Recently, large language model (LLM) based artificial intelligence (AI) systems have demonstrated remarkable capabilities in natural language understanding and generation. However, these models face a significant challenge when it comes to sensitive applications, such as reasoning over medical knowledge and answering medical questions in a physician-like manner. Prior studies attempted to overcome this challenge by increasing the model size (>100B) to learn more general medical knowledge, while there is still room for improvement in LLMs with smaller-scale model sizes (<100B). In this work, we start from a pre-trained general LLM (AntGLM-10B) and fine-tune it from a medical beginner towards a medical expert (called AntGLM-Med-10B), which leverages a 3-stage optimization procedure, i.e., general medical knowledge injection, medical domain instruction tuning, and specific medical task adaptation. Our contributions are threefold: (1) We specifically investigate how to adapt a pre-trained general LLM to the medical domain, especially for a specific medical task. (2) We collect and construct large-scale medical datasets for each stage of the optimization process. These datasets encompass various data types and tasks, such as question-answering, medical reasoning, multi-choice questions, and medical conversations. (3) Specifically for multi-choice questions in the medical domain, we propose a novel Verification-of-Choice approach for prompt engineering, which significantly enhances the reasoning ability of LLMs. Remarkably, by combining the above approaches, our AntGLM-Med-10B model can outperform most LLMs on PubMedQA, including both general and medical LLMs, even when these LLMs have larger model sizes.

preprint2024arXiv

Multiple Chern bands in twisted MoTe$_2$ and possible non-Abelian states

We investigate the moiré band structures and a possible even-denominator fractional quantum Hall state in small-angle twisted bilayer MoTe$_2$, combining large-scale local-basis density functional theory calculations with continuum-model exact diagonalization. Via large-scale first-principles calculations at $θ=1.89^{\circ}$, we find a sequence of $C=1$ (Chern number in the K valley) moiré Chern bands, in analogy to Landau levels. By constructing the continuum model with multiple Chern bands, we undertake band-projected exact diagonalization using unscreened Coulomb repulsion to identify possible non-Abelian states near twist angle $θ=1.89^{\circ}$ at half filling of the second moiré band.

preprint2023arXiv

A Bi-Step Grounding Paradigm for Large Language Models in Recommendation Systems

As the focus on Large Language Models (LLMs) in the field of recommendation intensifies, the optimization of LLMs for recommendation purposes (referred to as LLM4Rec) assumes a crucial role in augmenting their effectiveness in providing recommendations. However, existing approaches for LLM4Rec often assess performance using restricted sets of candidates, which may not accurately reflect the models' overall ranking capabilities. In this paper, our objective is to investigate the comprehensive ranking capacity of LLMs and propose a two-step grounding framework known as BIGRec (Bi-step Grounding Paradigm for Recommendation). It initially grounds LLMs to the recommendation space by fine-tuning them to generate meaningful tokens for items and subsequently identifies appropriate actual items that correspond to the generated tokens. By conducting extensive experiments on two datasets, we substantiate the superior performance, capacity for handling few-shot scenarios, and versatility across multiple domains exhibited by BIGRec. Furthermore, we observe that the marginal benefits derived from increasing the quantity of training samples are modest for BIGRec, implying that LLMs possess limited capability to assimilate statistical information, such as popularity and collaborative filtering, due to their robust semantic priors. These findings also underline the efficacy of integrating diverse statistical information into the LLM4Rec framework, thereby pointing towards a potential avenue for future research. Our code and data are available at https://github.com/SAI990323/Grounding4Rec.

preprint2023arXiv

Backdoor Attacks Against Dataset Distillation

Dataset distillation has emerged as a prominent technique to improve data efficiency when training machine learning models. It encapsulates the knowledge from a large dataset into a smaller synthetic dataset. A model trained on this smaller distilled dataset can attain comparable performance to a model trained on the original training dataset. However, the existing dataset distillation techniques mainly aim at achieving the best trade-off between resource usage efficiency and model utility. The security risks stemming from them have not been explored. This study performs the first backdoor attack against the models trained on the data distilled by dataset distillation models in the image domain. Concretely, we inject triggers into the synthetic data during the distillation procedure rather than during the model training stage, where all previous attacks are performed. We propose two types of backdoor attacks, namely NAIVEATTACK and DOORPING. NAIVEATTACK simply adds triggers to the raw data at the initial distillation phase, while DOORPING iteratively updates the triggers during the entire distillation procedure. We conduct extensive evaluations on multiple datasets, architectures, and dataset distillation techniques. Empirical evaluation shows that NAIVEATTACK achieves decent attack success rate (ASR) scores in some cases, while DOORPING reaches higher ASR scores (close to 1.0) in all cases. Furthermore, we conduct a comprehensive ablation study to analyze the factors that may affect the attack performance. Finally, we evaluate multiple defense mechanisms against our backdoor attacks and show that our attacks can practically circumvent these defense mechanisms.

preprint2023arXiv

DE-FAKE: Detection and Attribution of Fake Images Generated by Text-to-Image Generation Models

Text-to-image generation models that generate images based on prompt descriptions have attracted an increasing amount of attention during the past few months. Despite their encouraging performance, these models raise concerns about the misuse of their generated fake images. To tackle this problem, we pioneer a systematic study on the detection and attribution of fake images generated by text-to-image generation models. Concretely, we first build a machine learning classifier to detect the fake images generated by various text-to-image generation models. We then attribute these fake images to their source models, such that model owners can be held responsible for their models' misuse. We further investigate how prompts that generate fake images affect detection and attribution. We conduct extensive experiments on four popular text-to-image generation models, including DALL$\cdot$E 2, Stable Diffusion, GLIDE, and Latent Diffusion, and two benchmark prompt-image datasets. Empirical results show that (1) fake images generated by various models can be distinguished from real ones, as there exists a common artifact shared by fake images from different models; (2) fake images can be effectively attributed to their source models, as different models leave unique fingerprints in their generated images; (3) prompts with the "person" topic or a length between 25 and 75 enable models to generate fake images with higher authenticity. All findings contribute to the community's insight into the threats caused by text-to-image generation models. We call on the community to consider countermeasures, like ours, against rapidly evolving fake image generation.

preprint2023arXiv

Implications of Nano-Hertz Gravitational Waves on Electroweak Phase Transition in the Singlet Dark Matter Model

Inspired by the recent evidence of nano-Hertz stochastic gravitational waves observed by the pulsar timing array collaborations, we explore the supercooled electroweak phase transition they imply in the singlet extension of the Standard Model. Our findings reveal that by adjusting the model parameter at the per-mille level, the corresponding percolation temperature can be continuously lowered to 1 GeV. With such a low percolation temperature, the singlet dark matter may freeze out before the electroweak phase transition, and, consequently, the entropy generated during the transition can significantly affect the dark matter relic density. It alleviates the tension between the requirement of a strong electroweak phase transition and the constraints imposed by dark matter direct detection, and can be tested in future experiments.

preprint2022arXiv

A first look at the function space for planar two-loop six-particle Feynman integrals

Two-loop corrections to scattering amplitudes are crucial theoretical input for collider physics. Recent years have seen tremendous advances in computing Feynman integrals, scattering amplitudes, and cross sections for five-particle processes. In this paper, we initiate the study of the function space for planar two-loop six-particle processes. We study all genuine six-particle Feynman integrals, and derive the differential equations they satisfy on maximal cuts. Performing a leading singularity analysis in momentum space, and in Baikov representation, we find an integral basis that puts the differential equations into canonical form. The corresponding differential equation in the eight independent kinematic variables is derived with the finite-field reconstruction method and the symbol letters are identified. We identify the dual conformally invariant hexagon alphabet known from maximally supersymmetric Yang-Mills theory as a subset of our alphabet. This paper constitutes an important step in the analytic calculation of planar two-loop six-particle Feynman integrals.

preprint2022arXiv

A joint explanation of W-mass and muon g-2 in 2HDM

Since both $W$-mass and muon $g-2$ can be affected by the mass splittings among extra Higgs bosons $(H,~A,~H^\pm)$ in a 2HDM, we take a model with $μ$-$τ$ LFV interactions to examine the two anomalies reported respectively by CDF II and FNAL. We obtain the following observations: (i) Combined with theoretical constraints, the CDF $W$-mass measurement disfavors $H$ or $A$ being degenerate in mass with $H^\pm$, but allows $H$ and $A$ to be degenerate. The mass splitting between $H^\pm$ and $H/A$ is required to be larger than 10 GeV. The masses $m_{H^\pm}$ and $m_{A}$ are favored to be smaller than 650 GeV for $m_H<120$ GeV, and are allowed to take larger values as $m_H$ increases. (ii) After imposing other relevant experimental constraints, there are parameter spaces that simultaneously satisfy (at the $2σ$ level) the CDF $W$-mass, the FNAL muon $g-2$ and the data of lepton universality in $τ$ decays, but the mass splittings among extra Higgs bosons are strictly constrained.

preprint2022arXiv

A Lightweight NMS-free Framework for Real-time Visual Fault Detection System of Freight Trains

Real-time vision-based system of fault detection (RVBS-FD) for freight trains is an essential part of ensuring railway transportation safety. Most existing vision-based methods still have high computational costs based on convolutional neural networks. The computational cost is mainly reflected in the backbone, neck, and post-processing, i.e., non-maximum suppression (NMS). In this paper, we propose a lightweight NMS-free framework to achieve real-time detection and high accuracy simultaneously. First, we use a lightweight backbone for feature extraction and design a fault detection pyramid to process features. This fault detection pyramid includes three novel individual modules using attention mechanism, bottleneck, and dilated convolution for feature enhancement and computation reduction. Instead of using NMS, we calculate different loss functions, including classification and location costs in the detection head, to further reduce computation. Experimental results show that our framework achieves over 83 frames per second speed with a smaller model size and higher accuracy than the state-of-the-art detectors. Meanwhile, the hardware resource requirements of our method are low during the training and testing process.

preprint2022arXiv

Addressing Confounding Feature Issue for Causal Recommendation

In recommender systems, some features directly affect whether an interaction happens, so that observed interactions do not necessarily indicate user preference. For instance, short videos are objectively easier to finish even when the user does not like them. We term such a feature a confounding feature; video length is a confounding feature in video recommendation. If we fit a model on such interaction data, as most data-driven recommender systems do, the model will be biased towards recommending short videos and will deviate from users' actual requirements. This work formulates and addresses the problem from the causal perspective. Assuming there are some factors affecting both the confounding feature and other item features, e.g., the video creator, we find that the confounding feature opens a backdoor path behind user-item matching and introduces spurious correlation. To remove the effect of the backdoor path, we propose a framework named Deconfounding Causal Recommendation (DCR), which performs intervened inference with do-calculus. Nevertheless, evaluating the do-calculus requires summing the predictions over all possible values of the confounding feature, significantly increasing the time cost. To address the efficiency challenge, we further propose a mixture-of-experts (MoE) model architecture, modeling each value of the confounding feature with a separate expert module. In this way, we retain the model expressiveness with few additional costs. We demonstrate DCR on the backbone model of neural factorization machine (NFM), showing that DCR leads to more accurate prediction of user preference with a small inference time cost.
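
The sum over confounder values that this abstract describes is the standard backdoor adjustment. The sketch below shows that generic formula only, not the DCR model or its mixture-of-experts architecture; `toy_predict`, the prior, and the item names are hypothetical values chosen for illustration.

```python
def deconfounded_score(predict, user, item, confounder_prior):
    """Backdoor adjustment: average the prediction over all confounder values z,
    weighted by the marginal P(z) instead of the value observed with the item."""
    return sum(p_z * predict(user, item, z) for z, p_z in confounder_prior.items())

def toy_predict(user, item, z):
    """Hypothetical predictor: a base preference plus a bonus that short videos
    receive simply because they are easier to finish."""
    base_preference = {"video_a": 0.4, "video_b": 0.6}[item]
    return base_preference + (0.3 if z == "short" else 0.0)

# Hypothetical marginal distribution of the confounding feature (video length).
prior = {"short": 0.5, "long": 0.5}

score_a = deconfounded_score(toy_predict, "user_1", "video_a", prior)
score_b = deconfounded_score(toy_predict, "user_1", "video_b", prior)
```

Because every item is scored under the same length distribution, the ranking between `video_a` and `video_b` depends only on the base preference. The cost is one predictor call per confounder value, which is the overhead the abstract's MoE design is meant to reduce.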

preprint2022arXiv

Adversarial Support Alignment

We study the problem of aligning the supports of distributions. Compared to the existing work on distribution alignment, support alignment does not require the densities to be matched. We propose symmetric support difference as a divergence measure to quantify the mismatch between supports. We show that select discriminators (e.g. discriminator trained for Jensen-Shannon divergence) are able to map support differences as support differences in their one-dimensional output space. Following this result, our method aligns supports by minimizing a symmetrized relaxed optimal transport cost in the discriminator 1D space via an adversarial process. Furthermore, we show that our approach can be viewed as a limit of existing notions of alignment by increasing transportation assignment tolerance. We quantitatively evaluate the method across domain adaptation tasks with shifts in label distributions. Our experiments show that the proposed method is more robust against these shifts than other alignment-based baselines.

preprint2022arXiv

AI for Global Climate Cooperation: Modeling Global Climate Negotiations, Agreements, and Long-Term Cooperation in RICE-N

Comprehensive global cooperation is essential to limit global temperature increases while continuing economic development, e.g., reducing severe inequality or achieving long-term economic growth. Achieving long-term cooperation on climate change mitigation with n strategic agents poses a complex game-theoretic problem. For example, agents may negotiate and reach climate agreements, but there is no central authority to enforce adherence to those agreements. Hence, it is critical to design negotiation and agreement frameworks that foster cooperation, allow all agents to meet their individual policy objectives, and incentivize long-term adherence. This is an interdisciplinary challenge that calls for collaboration between researchers in machine learning, economics, climate science, law, policy, ethics, and other fields. In particular, we argue that machine learning is a critical tool to address the complexity of this domain. To facilitate this research, here we introduce RICE-N, a multi-region integrated assessment model that simulates the global climate and economy, and which can be used to design and evaluate the strategic outcomes for different negotiation and agreement frameworks. We also describe how to use multi-agent reinforcement learning to train rational agents using RICE-N. This framework underpins AI for Global Climate Cooperation, a working group collaboration and competition on climate negotiation and agreement design. Here, we invite the scientific community to design and evaluate their solutions using RICE-N, machine learning, economic intuition, and other domain knowledge. More information can be found on www.ai4climatecoop.org.

preprint2022arXiv

Algorithms of Real-Time Navigation and Control of Autonomous Unmanned Vehicles

The rapid development of robotics has benefited from the growing attention it attracts. As demand grows for robots that carry out tasks in place of humans, how to control robots better is becoming a hot topic. For obstacle avoidance, we propose algorithms for both 2D planar environments and 3D environments. The example cases we raise are ones that need to be addressed but have largely been ignored. In addition, we study trajectory planning for robots in two scenarios: self-driving cars on the road, and reconnaissance and surveillance by drones. Several directions remain for future work. One worthy topic is how to combine traditional navigation algorithms with high-tech algorithms so that tasks are fulfilled well while the computational cost stays low. Another is extending the obstacle avoidance algorithms to more competitive situations. Cooperation among multiple robots also deserves researchers' attention. All in all, there is still a long way to go in the development of navigation and control of mobile robots. Despite this, we believe we will not need to wait long to see a revolution in robotics.

preprint2022arXiv

Analytical Equation of Three-point Correlation Function of Galaxies: to Third Order of Density Perturbation

Applying functional differentiation to the density field with Newtonian gravity, we obtain the static, nonlinear equation of the three-point correlation function $ζ$ of galaxies, to third order in density perturbations. We make the equation closed and perform renormalization of the mass and the Jeans wavenumber. Using the boundary condition inferred from observations, we obtain the third order solution $ζ(r, u, θ)$ at fixed $u=2$, which is positive, exhibits a $U$-shape along the angle $θ$, and decreases monotonically along the radial $r$ up to the range $r \leq 30\, h^{-1}$Mpc in our computation. The corresponding reduced $Q(r, u, θ)$ deviates from 1 of the Gaussian case, has a deeper $U$-shape along $θ$, and varies non-monotonically along $r$. The third order solution agrees with the SDSS data of galaxies and is quite close to the previous second order solution, especially at large scales. This indicates that the equations of correlation functions with increasing orders of density perturbation provide a stable description of the nonlinear galaxy system.

preprint2022arXiv

Analyzing the Effects of Handling Data Imbalance on Learned Features from Medical Images by Looking Into the Models

One challenging property lurking in medical datasets is the imbalanced data distribution, where the frequency of the samples between the different classes is not balanced. Training a model on an imbalanced dataset can introduce unique challenges to the learning problem where a model is biased towards the highly frequent class. Many methods are proposed to tackle the distributional differences and the imbalanced problem. However, the impact of these approaches on the learned features is not well studied. In this paper, we look deeper into the internal units of neural networks to observe how handling data imbalance affects the learned features. We study several popular cost-sensitive approaches for handling data imbalance and analyze the feature maps of the convolutional neural networks from multiple perspectives: analyzing the alignment of salient features with pathologies and analyzing the pathology-related concepts encoded by the networks. Our study reveals differences and insights regarding the trained models that are not reflected by quantitative metrics such as AUROC and AP and show up only by looking at the models through a lens.

preprint2022arXiv

Applications of Multi-Agent Reinforcement Learning in Future Internet: A Comprehensive Survey

Future Internet involves several emerging technologies such as 5G and beyond 5G networks, vehicular networks, unmanned aerial vehicle (UAV) networks, and Internet of Things (IoTs). Moreover, future Internet becomes heterogeneous and decentralized with a large number of involved network entities. Each entity may need to make its local decision to improve the network performance under dynamic and uncertain network environments. Standard learning algorithms such as single-agent Reinforcement Learning (RL) or Deep Reinforcement Learning (DRL) have been recently used to enable each network entity as an agent to learn an optimal decision-making policy adaptively through interacting with the unknown environments. However, such algorithms fail to model the cooperation or competition among network entities, and simply treat other entities as a part of the environment, which may result in the non-stationarity issue. Multi-agent Reinforcement Learning (MARL) allows each network entity to learn its optimal policy by observing not only the environments, but also other entities' policies. As a result, MARL can significantly improve the learning efficiency of the network entities, and it has been recently used to solve various issues in the emerging networks. In this paper, we thus review the applications of MARL in the emerging networks. In particular, we provide a tutorial of MARL and a comprehensive survey of applications of MARL in next generation Internet. Specifically, we first introduce single-agent RL and MARL. Then, we review a number of applications of MARL to solve emerging issues in future Internet. The issues consist of network access, transmit power control, computation offloading, content caching, packet routing, trajectory design for UAV-aided networks, and network security issues.

preprint2022arXiv

Attention-based Dual Supervised Decoder for RGBD Semantic Segmentation

Encoder-decoder models have been widely used in RGBD semantic segmentation, and most of them are designed via a two-stream network. In general, jointly reasoning the color and geometric information from RGBD is beneficial for semantic segmentation. However, most existing approaches fail to comprehensively utilize multimodal information in both the encoder and decoder. In this paper, we propose a novel attention-based dual supervised decoder for RGBD semantic segmentation. In the encoder, we design a simple yet effective attention-based multimodal fusion module to extract and fuse deeply multi-level paired complementary information. To learn more robust deep representations and rich multi-modal information, we introduce a dual-branch decoder to effectively leverage the correlations and complementary cues of different tasks. Extensive experiments on NYUDv2 and SUN-RGBD datasets demonstrate that our method achieves superior performance against the state-of-the-art methods.

preprint2022arXiv

Auditing Membership Leakages of Multi-Exit Networks

Relying on the fact that not all inputs require the same amount of computation to yield a confident prediction, multi-exit networks are gaining attention as a prominent approach for pushing the limits of efficient deployment. Multi-exit networks endow a backbone model with early exits, allowing predictions to be obtained at intermediate layers of the model and thus saving computation time and/or energy. However, current designs of multi-exit networks only aim at the best trade-off between resource usage efficiency and prediction accuracy; the privacy risks stemming from them have never been explored. This prompts the need for a comprehensive investigation of privacy risks in multi-exit networks. In this paper, we perform the first privacy analysis of multi-exit networks through the lens of membership leakages. In particular, we first leverage the existing attack methodologies to quantify the multi-exit networks' vulnerability to membership leakages. Our experimental results show that multi-exit networks are less vulnerable to membership leakages and the exit (number and depth) attached to the backbone model is highly correlated with the attack performance. Furthermore, we propose a hybrid attack that exploits the exit information to improve the performance of existing attacks. We evaluate the membership leakage threat caused by our hybrid attack under three different adversarial setups, ultimately arriving at a model-free and data-free adversary. These results clearly demonstrate that our hybrid attacks are very broadly applicable, and the corresponding risks are therefore much more severe than shown by existing membership inference attacks. We further present a defense mechanism called TimeGuard specifically for multi-exit networks and show that TimeGuard mitigates the newly proposed attacks perfectly.

preprint2022arXiv

ContentVec: An Improved Self-Supervised Speech Representation by Disentangling Speakers

Self-supervised learning in speech involves training a speech representation network on a large-scale unannotated speech corpus, and then applying the learned representations to downstream tasks. Since the majority of the downstream tasks of SSL learning in speech largely focus on the content information in speech, the most desirable speech representations should be able to disentangle unwanted variations, such as speaker variations, from the content. However, disentangling speakers is very challenging, because removing the speaker information could easily result in a loss of content as well, and the damage of the latter usually far outweighs the benefit of the former. In this paper, we propose a new SSL method that can achieve speaker disentanglement without severe loss of content. Our approach is adapted from the HuBERT framework, and incorporates disentangling mechanisms to regularize both the teacher labels and the learned representations. We evaluate the benefit of speaker disentanglement on a set of content-related downstream tasks, and observe a consistent and notable performance advantage of our speaker-disentangled representations.

preprint2022arXiv

Crystal growth engineering and origin of the weak ferromagnetism in antiferromagnetic matrix of orthochromates from $t$-$e$ orbital hybridization

We report a combined experimental and theoretical study on intriguing magnetic properties of quasiferroelectric orthochromates. Large single crystals of the family of RECrO$_3$ (RE = Y, Eu, Gd, Tb, Dy, Ho, Er, Tm, Yb, and Lu) compounds were successfully grown. Neutron Laue study indicates a good quality of the obtained single crystals. Applied magnetic-field and temperature dependent magnetization measurements reveal their intrinsic magnetic properties, especially the antiferromagnetic (AFM) transition temperatures. Density functional theory studies of the electronic structures were carried out using the Perdew-Burke-Ernzerhof functional plus Hubbard $U$ method. Crystallographic information and magnetism were theoretically optimized systematically. When RE$^{3+}$ cations vary from Y$^{3+}$ and Eu$^{3+}$ to Lu$^{3+}$ ions, the calculated $t$-$e$ orbital hybridization degree and Néel temperature behave similarly to the experimentally-determined AFM transition temperature with variation in cationic radius. We found that the $t$-$e$ hybridization is anisotropic, causing a magnetic anisotropy of Cr$^{3+}$ sublattices. This was evaluated with the nearest-neighbour $J_1$-$J_2$ model. Our research provides a picture of the electronic structures during the $t$-$e$ hybridization process while changing RE ions and sheds light on the nature of the weak ferromagnetism coexisting with predominant antiferromagnetism. The available large RECrO$_3$ single crystals build a platform for further studies of orthochromates.

preprint2022arXiv

Data-Efficient Double-Win Lottery Tickets from Robust Pre-training

Pre-training serves as a broadly adopted starting point for transfer learning on various downstream tasks. Recent investigations of lottery tickets hypothesis (LTH) demonstrate such enormous pre-trained models can be replaced by extremely sparse subnetworks (a.k.a. matching subnetworks) without sacrificing transferability. However, practical security-crucial applications usually pose more challenging requirements beyond standard transfer, which also demand these subnetworks to overcome adversarial vulnerability. In this paper, we formulate a more rigorous concept, Double-Win Lottery Tickets, in which a located subnetwork from a pre-trained model can be independently transferred on diverse downstream tasks, to reach BOTH the same standard and robust generalization, under BOTH standard and adversarial training regimes, as the full pre-trained model can do. We comprehensively examine various pre-training mechanisms and find that robust pre-training tends to craft sparser double-win lottery tickets with superior performance over the standard counterparts. For example, on downstream CIFAR-10/100 datasets, we identify double-win matching subnetworks with the standard, fast adversarial, and adversarial pre-training from ImageNet, at 89.26%/73.79%, 89.26%/79.03%, and 91.41%/83.22% sparsity, respectively. Furthermore, we observe the obtained double-win lottery tickets can be more data-efficient to transfer, under practical data-limited (e.g., 1% and 10%) downstream schemes. Our results show that the benefits from robust pre-training are amplified by the lottery ticket scheme, as well as the data-limited transfer setting. Codes are available at https://github.com/VITA-Group/Double-Win-LTH.

preprint2022arXiv

Decoupled Pyramid Correlation Network for Liver Tumor Segmentation from CT images

Purpose: Automated liver tumor segmentation from Computed Tomography (CT) images is a necessary prerequisite in the interventions of hepatic abnormalities and surgery planning. However, accurate liver tumor segmentation remains challenging due to the large variability of tumor sizes and inhomogeneous texture. Recent advances based on Fully Convolutional Network (FCN) for medical image segmentation drew on the success of learning discriminative pyramid features. In this paper, we propose a Decoupled Pyramid Correlation Network (DPC-Net) that exploits attention mechanisms to fully leverage both low- and high-level features embedded in FCN to segment liver tumors. Methods: We first design a powerful Pyramid Feature Encoder (PFE) to extract multi-level features from input images. Then we decouple the characteristics of features concerning spatial dimension (i.e., height, width, depth) and semantic dimension (i.e., channel). On top of that, we present two types of attention modules, Spatial Correlation (SpaCor) and Semantic Correlation (SemCor) modules, to recursively measure the correlation of multi-level features. The former selectively emphasizes global semantic information in low-level features with the guidance of high-level ones. The latter adaptively enhances spatial details in high-level features with the guidance of low-level ones. Results: We evaluate the DPC-Net on the MICCAI 2017 Liver Tumor Segmentation (LiTS) challenge dataset. Dice Similarity Coefficient (DSC) and Average Symmetric Surface Distance (ASSD) are employed for evaluation. The proposed method obtains a DSC of 76.4% and an ASSD of 0.838 mm for liver tumor segmentation, outperforming the state-of-the-art methods. It also achieves competitive results with a DSC of 96.0% and an ASSD of 1.636 mm for liver segmentation.
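For readers unfamiliar with the Dice Similarity Coefficient quoted above, the following is a minimal sketch of the standard definition, DSC = 2|A ∩ B| / (|A| + |B|), for two binary masks. It is not the authors' evaluation code; the function name `dice_coefficient`, the flat 0/1 list representation, and the convention that two empty masks score 1.0 are all assumptions made for illustration.

```python
def dice_coefficient(pred, target):
    """Dice Similarity Coefficient between two binary masks.

    Both masks are flat sequences of 0/1 values of equal length
    (a hypothetical representation chosen for this sketch).
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    # Count positions where both masks are foreground.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    print(dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0]))
```

A DSC of 1.0 means the predicted and reference masks coincide, and 0.0 means they share no foreground element.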

preprint2022arXiv

Delayed Impact of Interdisciplinary Research

Interdisciplinary research increasingly fuels innovation, and is considered to be a key to tomorrow's breakthroughs. Yet little is known about whether interdisciplinary research manifests delayed impact. Here, we use the time to reach the citation peak to quantify the highest impact time and citation dynamics, and examine its relationship with interdisciplinarity. Using large-scale publication datasets, our results suggest that interdisciplinary papers show significant delayed impact both microscopically per paper and macroscopically collectively, as it takes a longer time for interdisciplinary papers to reach their citation peak. Furthermore, we study the underlying forces of such delayed impact, finding that the effect goes beyond the Matthew effect (i.e., the rich-get-richer effect). Finally, we find that team size and content conventionality only partly account for this effect. Overall, our results suggest that governments, research administrators, and funding agencies should be aware of this general feature of interdisciplinary science, which may have broad policy implications.
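The "time to reach the citation peak" measure can be illustrated with a small sketch. The abstract does not give the exact definition, so this assumes a paper is described by a list of yearly citation counts starting at its publication year, and that ties are resolved in favour of the earliest peak year. The name `time_to_citation_peak` is hypothetical.

```python
def time_to_citation_peak(yearly_citations):
    """Years after publication at which yearly citations first peak.

    yearly_citations[0] is the count in the publication year
    (an assumed convention, not taken from the paper).
    """
    if not yearly_citations:
        raise ValueError("need at least one year of citation counts")
    peak = max(yearly_citations)
    # list.index returns the first occurrence, i.e. the earliest peak year.
    return yearly_citations.index(peak)


if __name__ == "__main__":
    print(time_to_citation_peak([1, 4, 9, 7, 3]))
```

Under this definition, a larger return value corresponds to a more delayed impact.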

preprint2022arXiv

DiffCSE: Difference-based Contrastive Learning for Sentence Embeddings

We propose DiffCSE, an unsupervised contrastive learning framework for learning sentence embeddings. DiffCSE learns sentence embeddings that are sensitive to the difference between the original sentence and an edited sentence, where the edited sentence is obtained by stochastically masking out the original sentence and then sampling from a masked language model. We show that DiffCSE is an instance of equivariant contrastive learning (Dangovski et al., 2021), which generalizes contrastive learning and learns representations that are insensitive to certain types of augmentations and sensitive to other "harmful" types of augmentations. Our experiments show that DiffCSE achieves state-of-the-art results among unsupervised sentence representation learning methods, outperforming unsupervised SimCSE by 2.3 absolute points on semantic textual similarity tasks.

preprint2022arXiv

Direct observation of moiré flat-band breakdown at the edge of magic-angle twisted bilayer graphene

Low-energy moiré flat bands in magic-angle twisted bilayer graphene (tBG) have demonstrated incredible potential to exhibit rich exotic quantum phenomena. Theoretically, the moiré flat bands of tBG are based on the extended structures, i.e., the moiré patterns with periodic boundary conditions. However, a fundamental question of whether the flat bands can exist in the graphene moiré patterns with a reduced structure symmetry, such as sample edges, remains unanswered. Here, via scanning tunneling microscopy and spectroscopy, we study the local electronic properties of a magic-angle tBG near the sample terminated edge and report a direct observation of breakdown of the moiré flat bands. We show that the moiré electronic structures, including the low-energy flat bands, can sufficiently exist in a complete moiré spot, i.e., a moiré supercell, right at the edge even though the translational symmetry of the moiré patterns is broken in one direction. However, the flat-band characteristic is obviously absent in the incomplete moiré spots that are partly terminated by the edge. Our results indicate that a whole moiré spot is sufficient and indispensable for the generation of the effective moiré flat bands in tBG.

preprint2022arXiv

Domain Randomization and Pyramid Consistency: Simulation-to-Real Generalization without Accessing Target Domain Data

We propose to harness the potential of simulation for the semantic segmentation of real-world self-driving scenes in a domain generalization fashion. The segmentation network is trained without any data of target domains and tested on the unseen target domains. To this end, we propose a new approach of domain randomization and pyramid consistency to learn a model with high generalizability. First, we propose to randomize the synthetic images with the styles of real images in terms of visual appearances using auxiliary datasets, in order to effectively learn domain-invariant representations. Second, we further enforce pyramid consistency across different "stylized" images and within an image, in order to learn domain-invariant and scale-invariant features, respectively. Extensive experiments are conducted on the generalization from GTA and SYNTHIA to Cityscapes, BDDS and Mapillary; and our method achieves superior results over the state-of-the-art techniques. Remarkably, our generalization results are on par with or even better than those obtained by state-of-the-art simulation-to-real domain adaptation methods, which access the target domain data at training time.

preprint2022arXiv

Dynamic Backdoor Attacks Against Machine Learning Models

Machine learning (ML) has made tremendous progress during the past decade and is being adopted in various critical real-world applications. However, recent research has shown that ML models are vulnerable to multiple security and privacy attacks. In particular, backdoor attacks against ML models have recently raised a lot of awareness. A successful backdoor attack can cause severe consequences, such as allowing an adversary to bypass critical authentication systems. Current backdooring techniques rely on adding static triggers (with fixed patterns and locations) on ML model inputs, which are prone to detection by the current backdoor detection mechanisms. In this paper, we propose the first class of dynamic backdooring techniques against deep neural networks (DNN), namely Random Backdoor, Backdoor Generating Network (BaN), and conditional Backdoor Generating Network (c-BaN). Triggers generated by our techniques can have random patterns and locations, which reduce the efficacy of the current backdoor detection mechanisms. In particular, BaN and c-BaN based on a novel generative network are the first two schemes that algorithmically generate triggers. Moreover, c-BaN is the first conditional backdooring technique that, given a target label, can generate a target-specific trigger. Both BaN and c-BaN are essentially a general framework which gives the adversary the flexibility to further customize backdoor attacks. We extensively evaluate our techniques on three benchmark datasets: MNIST, CelebA, and CIFAR-10. Our techniques achieve almost perfect attack performance on backdoored data with a negligible utility loss. We further show that our techniques can bypass current state-of-the-art defense mechanisms against backdoor attacks, including ABS, Februus, MNTD, Neural Cleanse, and STRIP.

preprint2022arXiv

Effective Tensor Completion via Element-wise Weighted Low-rank Tensor Train with Overlapping Ket Augmentation

In recent years, there have been an increasing number of applications of tensor completion based on the tensor train (TT) format because of its efficiency and effectiveness in dealing with higher-order tensor data. However, existing tensor completion methods using TT decomposition have two obvious drawbacks. One is that they only consider mode weights according to the degree of mode balance, even though some elements are recovered better in an unbalanced mode. The other is that serious blocking artifacts appear when the missing element rate is relatively large. To remedy these two issues, in this work, we propose a novel tensor completion approach via the element-wise weighted technique. Accordingly, a novel formulation for tensor completion and an effective optimization algorithm, called tensor completion by parallel weighted matrix factorization via tensor train (TWMac-TT), are proposed. In addition, we specifically consider the recovery quality of edge elements from adjacent blocks. Different from traditional reshaping and ket augmentation, we utilize a new tensor augmentation technique called overlapping ket augmentation, which can further avoid blocking artifacts. We then conduct extensive performance evaluations on synthetic data and several real image data sets. Our experimental results demonstrate that the proposed algorithm TWMac-TT outperforms several other competing tensor completion methods.

preprint2022arXiv

Electronic structure, magnetic properties and pairing tendencies of the copper-based honeycomb lattice Na$_2$Cu$_2$TeO$_6$

Spin-$1/2$ chains with alternating antiferromagnetic and ferromagnetic couplings have attracted considerable interest due to the topological character of their spin excitations. Here, using density functional theory and density matrix renormalization group methods, we have systematically studied the dimerized chain system Na$_2$Cu$_2$TeO$_6$. Near the Fermi level, the dominant states are mainly contributed by the Cu $3d_{x^2-y^2}$ orbitals highly hybridized with the O $2p$ orbitals in the nonmagnetic phase, leading to an "effective" single-orbital low-energy model. Furthermore, the bandwidth of the Cu $3d_{x^2-y^2}$ states is small ($\sim 0.8$ eV), suggesting that electronic correlations will strongly affect this system. By introducing such electronic correlations, we found this system is a Mott insulator. Moreover, by calculating the magnetic exchange interactions ($J_1$, $J_2$ and $J_3$), we explained the size and sign of the exchange interactions in Na$_2$Cu$_2$TeO$_6$, in agreement with neutron experiments. In addition, we constructed a single-orbital Hubbard model for this dimerized chain system, where the quantum fluctuations are taken into account. Both AFM and FM coupling ($\uparrow$-$\downarrow$-$\downarrow$-$\uparrow$) along the chain were found in our DMRG and Lanczos calculations, in agreement with DFT and neutron results. We also calculated the hole pairing binding energy $ΔE$ which becomes negative at Hubbard $U \sim 11$ eV, indicating incipient pairing tendencies. Finally, we also looked at various cases of hole doping that always exhibit tight pairs. Thus, we believe our results for Na$_2$Cu$_2$TeO$_6$ could provide guidance to experimentalists and theorists working on this dimerized chain system, such as short-range magnetic coupling, doping effects, and possible pairing tendencies.

preprint2022arXiv

Evolution of barchan dune interactions investigated by a downscaled water tunnel experiment: the temporal characteristics and a soliton-like behavior

This paper reports a downscaled water tunnel experiment to study the temporal characteristics of a double dune interaction system and the new pattern of dune interaction when the initial mass ratio of the two dunes is large. These topics are useful for a comprehensive understanding of the dune interaction system but were rarely covered before. The turnover time scale under dune interaction is defined, and its time averaged value is found to have a nonmonotonic relationship with the initial mass ratio. A nonmonotonic relationship is also found between the convexity of the downstream dune tip and the initial mass ratio. The stationary points of the two nonmonotonic curves above correspond to the same dune interaction pattern named 'exchange-chasing', which is considered indispensable in the classification map of dune interactions. The upstream dune acts as an energy transmitter between fluid flow and the downstream dune. A soliton-like behavior occurs when the downstream dune enlarges, where a small dune is detached from the downstream dune tip and gets passed by the upstream dune approximately without mass exchange. The activity of such temporary soliton is found to be negatively related with the initial dune spacing and positively related with the initial mass ratio.

preprint2022arXiv

Exploring Timbre Disentanglement in Non-Autoregressive Cross-Lingual Text-to-Speech

In this paper, we study the disentanglement of speaker and language representations in non-autoregressive cross-lingual TTS models from various aspects. We propose a phoneme length regulator that solves the length mismatch problem between the IPA input sequence and monolingual alignment results. Using the phoneme length regulator, we present a FastPitch-based cross-lingual model with IPA symbols as input representations. Our experiments show that language-independent input representations (e.g. IPA symbols), an increasing number of training speakers, and explicit modeling of speech variance information all encourage non-autoregressive cross-lingual TTS models to disentangle speaker and language representations. The subjective evaluation shows that our proposed model can achieve decent naturalness and speaker similarity in cross-language voice cloning.

preprint2022arXiv

Finding MNEMON: Reviving Memories of Node Embeddings

Previous security research efforts orbiting around graphs have focused exclusively on either (de-)anonymizing the graphs or understanding the security and privacy issues of graph neural networks. Little attention has been paid to understanding the privacy risks of integrating the output from graph embedding models (e.g., node embeddings) with complex downstream machine learning pipelines. In this paper, we fill this gap and propose a novel model-agnostic graph recovery attack that exploits the implicit graph structural information preserved in the embeddings of graph nodes. We show that an adversary can recover edges with decent accuracy by only gaining access to the node embedding matrix of the original graph without interactions with the node embedding models. We demonstrate the effectiveness and applicability of our graph recovery attack through extensive experiments.

preprint2022arXiv

Freedom to Choose: Understanding Input Modality Preferences of People with Upper-body Motor Impairments for Activities of Daily Living

Many people with upper-body motor impairments encounter challenges while performing Activities of Daily Living (ADLs) and Instrumental Activities of Daily Living (IADLs), such as toileting, grooming, and managing finances, which have impacts on their Quality of Life (QOL). Although existing assistive technologies enable people with upper-body motor impairments to use different input modalities to interact with computing devices independently (e.g., using voice to interact with a computer), many people still require Personal Care Assistants (PCAs) to perform ADLs. Multimodal input has the potential to enable users to perform ADLs without human assistance. We conducted 12 semi-structured interviews with people who have upper-body motor impairments to capture their existing practices and challenges of performing ADLs, identify opportunities to expand the input possibilities for assistive devices, and understand user preferences for multimodal interaction during everyday tasks. Finally, we discuss implications for the design and use of multimodal input solutions to support user independence and collaborative experiences when performing daily living tasks.

preprint2022arXiv

Geometric Algebra and Algebraic Geometry of Loop and Potts Models

We uncover a connection between two seemingly separate subjects in integrable models: the representation theory of the affine Temperley-Lieb algebra, and the algebraic structure of solutions to the Bethe equations of the XXZ spin chain. We study the solution of Bethe equations analytically by computational algebraic geometry, and find that the solution space encodes rich information about the representation theory of Temperley-Lieb algebra. Using these connections, we compute the partition function of the completely-packed loop model and of the closely related random-cluster Potts model, on medium-size lattices with toroidal boundary conditions, by two quite different methods. We consider the partial thermodynamic limit of infinitely long tori and analyze the corresponding condensation curves of the zeros of the partition functions. Two components of these curves are obtained analytically in the full thermodynamic limit.

preprint2022arXiv

Global fits of SUSY at future Higgs factories

In this work, we study the impact of electroweak and Higgs precision measurements at future electron-positron colliders on several typical supersymmetric models, including the Constrained Minimal Supersymmetric Standard Model (CMSSM), Non-Universal Higgs Mass generalisations (NUHM1, NUHM2), and the 7-dimensional Minimal Supersymmetric Standard Model (MSSM7). Using publicly-available data from the \textsf{GAMBIT} community, we post-process previous SUSY global fits with additional likelihoods to explore the discovery potential of Higgs factories, such as the Circular Electron Positron Collider (CEPC), the Future Circular Collider (FCC) and the International Linear Collider (ILC). We show that the currently allowed parameter space of these models will be further tested by future precision measurements. In particular, dark matter annihilation mechanisms may be distinguished by precise measurements of Higgs observables.

preprint2022arXiv

High-throughput study of the anomalous Hall effect

Despite being known for a long time, the anomalous Hall effect still attracts attention because of its complex origins, its connection to topology and because it serves as a useful probe of the magnetic order. Here we study the anomalous Hall effect using an automatic high-throughput calculation scheme. We calculate the intrinsic anomalous Hall effect in 2871 ferromagnetic materials. We use these results to study general properties of the anomalous Hall effect such as its dependence on the strength of the spin-orbit coupling or magnetization. We also examine the origin of the anomalous Hall effect in the materials with the largest effect and show that the origin of the large anomalous Hall effect is usually associated with symmetry protected band degeneracies in the non-relativistic electronic structure, typically mirror symmetry protected nodal lines. Additionally, we study the dependence of the anomalous Hall effect on the magnetization direction, showing that in many materials it differs significantly from the commonly assumed expression $\mathbf{j}^\text{AHE} \sim \mathbf{M} \times \mathbf{E}$.

preprint2022arXiv

Learning Rich Features for Gait Recognition by Integrating Skeletons and Silhouettes

Gait recognition captures gait patterns from the walking sequence of an individual for identification. Most existing gait recognition methods learn features from silhouettes or skeletons for the robustness to clothing, carrying, and other exterior factors. The combination of the two data modalities, however, is not fully exploited. Previous multimodal gait recognition methods mainly employ the skeleton to assist the local feature extraction where the intrinsic discrimination of the skeleton data is ignored. This paper proposes a simple yet effective Bimodal Fusion (BiFusion) network which mines discriminative gait patterns in skeletons and integrates with silhouette representations to learn rich features for identification. Particularly, the inherent hierarchical semantics of body joints in a skeleton is leveraged to design a novel Multi-Scale Gait Graph (MSGG) network for the feature extraction of skeletons. Extensive experiments on CASIA-B and OUMVLP demonstrate both the superiority of the proposed MSGG network in modeling skeletons and the effectiveness of the bimodal fusion for gait recognition. Under the most challenging condition of walking in different clothes on CASIA-B, our method achieves the rank-1 accuracy of 92.1%.

preprint2022arXiv

Linking Emergent and Natural Languages via Corpus Transfer

The study of language emergence aims to understand how human languages are shaped by perceptual grounding and communicative intent. Computational approaches to emergent communication (EC) predominantly consider referential games in limited domains and analyze the learned protocol within the game framework. As a result, it remains unclear how the emergent languages from these settings connect to natural languages or provide benefits in real-world language processing tasks, where statistical models trained on large text corpora dominate. In this work, we propose a novel way to establish such a link by corpus transfer, i.e. pretraining on a corpus of emergent language for downstream natural language tasks, which is in contrast to prior work that directly transfers speaker and listener parameters. Our approach showcases non-trivial transfer benefits for two different tasks -- language modeling and image captioning. For example, in a low-resource setup (modeling 2 million natural language tokens), pre-training on an emergent language corpus with just 2 million tokens reduces model perplexity by $24.6\%$ on average across ten natural languages. We also introduce a novel metric to predict the transferability of an emergent language by translating emergent messages to natural language captions grounded on the same images. We find that our translation-based metric highly correlates with the downstream performance on modeling natural languages (for instance $ρ=0.83$ on Hebrew), while topographic similarity, a popular metric in previous work, shows surprisingly low correlation ($ρ=0.003$), hinting that simple properties like attribute disentanglement from synthetic domains might not capture the full complexities of natural language. Our findings also indicate potential benefits of moving language emergence forward with natural language resources and models.

preprint2022arXiv

Low energy supersymmetry confronted with current experiments: an overview

This study provides a brief overview of low-energy supersymmetry (SUSY) in light of current experimental constraints, such as collider searches, dark matter searches, and muon $g-2$ measurements. In addition, we survey a variety of low energy supersymmetric models: the phenomenological minimal supersymmetric model (MSSM); the supersymmetric models with cut-off-scale boundary conditions, i.e., the minimal supergravity (mSUGRA) or the constrained MSSM (CMSSM), the gauge mediation of SUSY breaking (GMSB), and the anomaly mediation of SUSY breaking (AMSB), as well as their extensions. The conclusion is that the low energy SUSY can survive all current experimental constraints and remains compelling, albeit suffering from a little fine-tuning problem. The fancy models like mSUGRA, GMSB, and AMSB need to be extended if the muon $g-2$ anomaly comes from new physics.

preprint2022arXiv

Low energy SUSY confronted with new measurements of W-boson mass and muon g-2

The new CDF II measurement of $W$-boson mass shows a 7$σ$ deviation from the Standard Model (SM) prediction, while the recent FNAL measurement of the muon $g-2$ shows a 4.2$σ$ deviation (combined with the BNL result) from the SM. Both of them strongly indicate new physics beyond the SM. In this work we study the implication of both measurements on low energy supersymmetry. With an extensive exploration of the parameter space of the minimal supersymmetric standard model (MSSM), we find that in the parameter space allowed by current experimental constraints from colliders and dark matter detections, the MSSM can simultaneously explain both measurements on the edge of $2σ$ level, taking theoretical uncertainties into consideration. The favored parameter space, characterized by a compressed spectrum between bino, wino and stau, with the stop being around 1 TeV, may be covered in the near future LHC searches.

preprint2022arXiv

m-Order Time Optimal Control Synthesis Function of Discrete System

In this paper, we first introduce the basic concepts of generating functions in combinatorics and some combinatorial identities. Second, to facilitate the understanding of the m-order time optimal control synthesis function of a discrete system (referred to as the m-order synthesis function), we introduce the derivation process and control ideas of the 2nd-order synthesis function, and then derive the m-order synthesis function in detail by means of generating functions. Using the m-order tracking-form synthesis function with a filter factor, we present methods for signal extraction and its predictive compensation, and verify their immunity and effectiveness by numerical simulation.

preprint2022arXiv

Maxwell field with gauge fixing term in de Sitter space: exact solution and stress tensor

The Maxwell field with a general gauge fixing (GF) term is nontrivial: not only are the longitudinal and temporal modes mixed up in the field equations, but unwanted consequences might also arise from the GF term. We derive the complete set of solutions in de Sitter space, and implement the covariant canonical quantization which restricts the residual gauge transformation down to a quantum residual gauge transformation. Then, in the Gupta-Bleuler (GB) physical state, we calculate the stress tensor which is amazingly independent of the gauge fixing constant and is also invariant under the quantum residual gauge transformation. The transverse components are simply the same as those in the Minkowski spacetime, and the transverse vacuum stress tensor has only one UV divergent term ($\propto k^4$), which becomes zero by the 0th-order adiabatic regularization. The longitudinal-temporal stress tensor in the GB state is zero due to a cancelation between the longitudinal and temporal parts. More interesting is the stress tensor of the GF term. Its particle contribution is zero due to the cancelation in the GB state, and its vacuum contribution is twice that of a minimally coupled massless scalar field, containing $k^4$ and $k^2$ divergences. After the 2nd-order adiabatic regularization, the GF vacuum stress tensor becomes zero too, so that there is no need to introduce a ghost field, and the zero GF vacuum stress tensor cannot be a possible candidate for the cosmological constant. Thus, all the physics predicted by the Maxwell field with the GF term will be the same as that without the GF term. We also carry out an analogous calculation in the Minkowski spacetime, and the stress tensor is similar to, but simpler than, that in de Sitter space.

preprint2022arXiv

Membership Inference Attacks by Exploiting Loss Trajectory

Machine learning models are vulnerable to membership inference attacks in which an adversary aims to predict whether or not a particular sample was contained in the target model's training dataset. Existing attack methods have commonly exploited the output information (mostly, losses) solely from the given target model. As a result, in practical scenarios where both the member and non-member samples yield similarly small losses, these methods are naturally unable to differentiate between them. To address this limitation, in this paper, we propose a new attack method, called \system, which can exploit the membership information from the whole training process of the target model for improving the attack performance. To mount the attack in the common black-box setting, we leverage knowledge distillation, and represent the membership information by the losses evaluated on a sequence of intermediate models at different distillation epochs, namely the distilled loss trajectory, together with the loss from the given target model. Experimental results over different datasets and model architectures demonstrate the great advantage of our attack in terms of different metrics. For example, on CINIC-10, our attack achieves at least 6$\times$ higher true-positive rate at a low false-positive rate of 0.1% than existing methods. Further analysis demonstrates the general effectiveness of our attack in more strict scenarios.
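The metric quoted above, true-positive rate at a fixed low false-positive rate, can be sketched as follows. This is an illustrative implementation of the general metric, not the paper's attack or its evaluation code. It assumes each sample has a scalar attack score where a higher score means "more likely a member"; the function name `tpr_at_fpr` and its arguments are hypothetical.

```python
import math


def tpr_at_fpr(member_scores, nonmember_scores, max_fpr):
    """Best true-positive rate over thresholds whose FPR is at most max_fpr.

    A sample is predicted "member" when its score is >= the threshold.
    Higher score = more likely member (an assumption of this sketch).
    """
    if not member_scores or not nonmember_scores:
        raise ValueError("both score lists must be non-empty")
    # Every observed score is a candidate threshold; +inf predicts no members.
    thresholds = sorted(set(member_scores) | set(nonmember_scores))
    thresholds.append(math.inf)
    best_tpr = 0.0
    for thr in thresholds:
        false_pos = sum(1 for s in nonmember_scores if s >= thr)
        fpr = false_pos / len(nonmember_scores)
        if fpr <= max_fpr:
            true_pos = sum(1 for s in member_scores if s >= thr)
            best_tpr = max(best_tpr, true_pos / len(member_scores))
    return best_tpr


if __name__ == "__main__":
    members = [0.9, 0.8, 0.4, 0.3]
    nonmembers = [0.7, 0.2, 0.1, 0.05]
    print(tpr_at_fpr(members, nonmembers, max_fpr=0.0))
```

Reporting the true-positive rate at a small false-positive rate focuses on how many members an attack identifies while making almost no false accusations, which an average accuracy figure can hide.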

preprint2022arXiv

Membership-Doctor: Comprehensive Assessment of Membership Inference Against Machine Learning Models

Machine learning models are prone to memorizing sensitive data, making them vulnerable to membership inference attacks in which an adversary aims to infer whether an input sample was used to train the model. Over the past few years, researchers have produced many membership inference attacks and defenses. However, these attacks and defenses employ a variety of strategies and are conducted on different models and datasets. The lack of a comprehensive benchmark means we do not understand the strengths and weaknesses of existing attacks and defenses. We fill this gap by presenting a large-scale measurement of different membership inference attacks and defenses. We systematize membership inference through the study of nine attacks and six defenses and measure the performance of different attacks and defenses in the holistic evaluation. We then quantify the impact of the threat model on the results of these attacks. We find that some assumptions of the threat model, such as same-architecture and same-distribution between shadow and target models, are unnecessary. We are also the first to execute attacks on real-world data collected from the Internet, instead of laboratory datasets. We further investigate what determines the performance of membership inference attacks and reveal that the commonly believed overfitting level is not sufficient for the success of the attacks. Instead, the Jensen-Shannon distance of entropy/cross-entropy between member and non-member samples correlates with attack performance much better. This gives us a new way to accurately predict membership inference risks without running the attack. Finally, we find that data augmentation degrades the performance of existing attacks to a larger extent, and we propose an adaptive attack using augmentation to train shadow and attack models that improve attack performance.

preprint2022arXiv

mmFormer: Multimodal Medical Transformer for Incomplete Multimodal Learning of Brain Tumor Segmentation

Accurate brain tumor segmentation from Magnetic Resonance Imaging (MRI) benefits from joint learning of multimodal images. However, in clinical practice, it is not always possible to acquire a complete set of MRIs, and the problem of missing modalities causes severe performance degradation in existing multimodal segmentation methods. In this work, we present the first attempt to exploit the Transformer for multimodal brain tumor segmentation that is robust to any combinatorial subset of available modalities. Concretely, we propose a novel multimodal Medical Transformer (mmFormer) for incomplete multimodal learning with three main components: the hybrid modality-specific encoders that bridge a convolutional encoder and an intra-modal Transformer for both local and global context modeling within each modality; an inter-modal Transformer to build and align the long-range correlations across modalities for modality-invariant features with global semantics corresponding to tumor region; a decoder that performs a progressive up-sampling and fusion with the modality-invariant features to generate robust segmentation. Besides, auxiliary regularizers are introduced in both encoder and decoder to further enhance the model's robustness to incomplete modalities. We conduct extensive experiments on the public BraTS 2018 dataset for brain tumor segmentation. The results demonstrate that the proposed mmFormer outperforms the state-of-the-art methods for incomplete multimodal brain tumor segmentation on almost all subsets of incomplete modalities, especially by an average 19.07% improvement of Dice on tumor segmentation with only one available modality. The code is available at https://github.com/YaoZhang93/mmFormer.

preprint2022arXiv

On Xing Tian and the Perseverance of Anti-China Sentiment Online

Sinophobia, anti-Chinese sentiment, has existed on the Web for a long time. The outbreak of COVID-19 and the extended quarantine has further amplified it. However, we lack a quantitative understanding of the cause of Sinophobia as well as how it evolves over time. In this paper, we conduct a large-scale longitudinal measurement of Sinophobia, between 2016 and 2021, on two mainstream and fringe Web communities. By analyzing 8B posts from Reddit and 206M posts from 4chan's /pol/, we investigate the origins, evolution, and content of Sinophobia. We find that anti-Chinese content may be evoked by political events not directly related to China, e.g., the U.S. withdrawal from the Paris Agreement. During the COVID-19 pandemic, daily usage of Sinophobic slurs has significantly increased even with the hate-speech ban policy. We also show that the semantic meanings of the words "China" and "Chinese" are shifting towards Sinophobic slurs with the rise of COVID-19 and remain the same in the pandemic period. We further use topic modeling to show that the topics of Sinophobic discussion are diverse and broad. We find that both Web communities share some common Sinophobic topics like ethnicity, economics and commerce, weapons and military, foreign relations, etc. However, compared to 4chan's /pol/, more daily life-related topics including food, games, and stocks are found on Reddit. Our findings also reveal that the topics related to COVID-19 and blaming the Chinese government are more prevalent in the pandemic period. To the best of our knowledge, this paper is the longest quantitative measurement of Sinophobia.

preprint2022arXiv

Point-splitting regularization of the stress tensor of a coupling scalar field in de Sitter space

We perform the point-splitting regularization on the vacuum stress tensor of a coupling scalar field in de Sitter space under the guidance from the adiabatically regularized Green's function. For the massive scalar field with the minimal coupling $ξ=0$, the 2nd order point-splitting regularization yields a finite vacuum stress tensor with a positive, constant energy density, which can be identified as the cosmological constant that drives de Sitter inflation. For the coupling $ξ\ne 0$, we find that, even if the regularized Green's function is continuous, UV and IR convergent, the point-splitting regularization does not automatically lead to an appropriate stress tensor. The coupling $ξR$ causes log divergent terms, as well as higher-order finite terms which depend upon the path of the coincidence limit. After removing these unwanted terms by extra treatments, the 2nd-order regularization for small couplings $ξ\in(0,\frac{1}{7.04})$, and respectively the 0th-order regularization for the conformal coupling $ξ=\frac16$, yield a finite, constant vacuum stress tensor, in analogy to the case $ξ=0$. For the massless field with $ξ=0$ or $ξ=\frac16$, the point-splitting regularization yields a vanishing vacuum stress tensor, and there is no conformal trace anomaly for $ξ=\frac16$. If the 4th-order regularization were taken, the regularized energy density for general $ξ$ would be negative, which is inconsistent with the de Sitter inflation, and the regularized Green's function would be singular at the zero mass, which is unphysical. In all these cases, the stress tensor from the point-splitting regularization is equal to that from the adiabatic one.

preprint2022arXiv

Pro-UIGAN: Progressive Face Hallucination from Occluded Thumbnails

In this paper, we study the task of hallucinating an authentic high-resolution (HR) face from an occluded thumbnail. We propose a multi-stage Progressive Upsampling and Inpainting Generative Adversarial Network, dubbed Pro-UIGAN, which exploits facial geometry priors to replenish and upsample (8×) the occluded and tiny faces (16×16 pixels). Pro-UIGAN iteratively (1) estimates facial geometry priors for low-resolution (LR) faces and (2) acquires non-occluded HR face images under the guidance of the estimated priors. Our multi-stage hallucination network super-resolves and inpaints occluded LR faces in a coarse-to-fine manner, thus reducing unwanted blurriness and artifacts significantly. Specifically, we design a novel cross-modal transformer module for facial priors estimation, in which an input face and its landmark features are formulated as queries and keys, respectively. Such a design encourages joint feature learning across the input facial and landmark features, and deep feature correspondences will be discovered by attention. Thus, facial appearance features and facial geometry priors are learned in a mutual promotion manner. Extensive experiments demonstrate that our Pro-UIGAN achieves visually pleasing HR faces, reaching superior performance in downstream tasks, i.e., face alignment, face parsing, face recognition and expression classification, compared with other state-of-the-art (SotA) methods.

preprint2022arXiv

Pseudogap metal and magnetization plateau from doping moiré Mott insulator

The problem of doping Mott insulators is of fundamental importance and long-standing interest in the study of strongly correlated electron systems. The advent of semiconductor-based moiré materials opens new ground for simulating the Hubbard model on the triangular lattice and exploring the rich phase diagram of doped Mott insulators as a function of doping and external magnetic field. Based on our recent identification of the spin polaron quasiparticle in the Mott insulator, in this work we predict that a new metallic state emerges at small doping and intermediate field range: a pseudogap metal that exhibits a single-particle gap and a doping-dependent magnetization plateau.

preprint2022arXiv

Semi-Leak: Membership Inference Attacks Against Semi-supervised Learning

Semi-supervised learning (SSL) leverages both labeled and unlabeled data to train machine learning (ML) models. State-of-the-art SSL methods can achieve comparable performance to supervised learning by leveraging far less labeled data. However, most existing works focus on improving the performance of SSL. In this work, we take a different angle by studying the training data privacy of SSL. Specifically, we propose the first data augmentation-based membership inference attacks against ML models trained by SSL. Given a data sample and black-box access to a model, the goal of a membership inference attack is to determine whether the data sample belongs to the training dataset of the model. Our evaluation shows that the proposed attack can consistently outperform existing membership inference attacks and achieves the best performance against models trained by SSL. Moreover, we uncover that the reason for membership leakage in SSL is different from the commonly believed one in supervised learning, i.e., overfitting (the gap between training and testing accuracy). We observe that the SSL model generalizes well to the testing data (with almost 0 overfitting) but "memorizes" the training data by giving a more confident prediction regardless of its correctness. We also explore early stopping as a countermeasure to prevent membership inference attacks against SSL. The results show that early stopping can mitigate the membership inference attack, but at the cost of degraded model utility.

preprint2022arXiv

Semi-supervised Cardiac Image Segmentation via Label Propagation and Style Transfer

Accurate segmentation of cardiac structures can help doctors diagnose diseases and improve treatment planning, which is in high demand in clinical practice. However, the shortage of annotation and the variance of the data among different vendors and medical centers restrict the performance of advanced deep learning methods. In this work, we present a fully automatic method to segment cardiac structures including the left (LV) and right ventricle (RV) blood pools, as well as the left ventricular myocardium (MYO), in MRI volumes. Specifically, we design a semi-supervised learning method to leverage unlabelled MRI sequence timeframes by label propagation. Then we exploit style transfer to reduce the variance among different centers and vendors for more robust cardiac image segmentation. We evaluate our method in the M&Ms challenge, ranking 2nd among 14 competitive teams.

preprint2022arXiv

Shallow Fusion of Weighted Finite-State Transducer and Language Model for Text Normalization

Text normalization (TN) systems in production are largely rule-based using weighted finite-state transducers (WFST). However, WFST-based systems struggle with ambiguous input when the normalized form is context-dependent. On the other hand, neural text normalization systems can take context into account but they suffer from unrecoverable errors and require labeled normalization datasets, which are hard to collect. We propose a new hybrid approach that combines the benefits of rule-based and neural systems. First, a non-deterministic WFST outputs all normalization candidates, and then a neural language model picks the best one -- similar to shallow fusion for automatic speech recognition. While the WFST prevents unrecoverable errors, the language model resolves contextual ambiguity. The approach is easy to extend and we show it is effective. It achieves comparable or better results than existing state-of-the-art TN models.
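The two-step idea in this abstract (a WFST proposes every candidate, a language model picks one) can be sketched as a simple rescoring step. This is only an illustration under stated assumptions: the candidate lists, the toy bigram table, the `wfst_weight` parameter, and all function names are hypothetical and stand in for a real WFST grammar and a real neural language model.

```python
import math

# Hypothetical toy bigram probabilities standing in for a neural LM.
TOY_BIGRAMS = {
    ("on", "january"): 0.2,
    ("january", "second"): 0.3,
    ("add", "one"): 0.2,
    ("one", "half"): 0.3,
}

def toy_lm_logprob(sentence, floor=1e-4):
    """Sum of log bigram probabilities; unseen bigrams get a small floor."""
    words = sentence.lower().split()
    return sum(math.log(TOY_BIGRAMS.get(pair, floor))
               for pair in zip(words, words[1:]))

def pick_normalization(candidates, lm_logprob, wfst_weight=0.0):
    """Choose among (text, wfst_cost) candidates by fused score.

    The fused score is the LM log-probability minus a weighted WFST cost;
    with wfst_weight=0 the LM alone decides among the WFST's candidates.
    """
    def fused(item):
        text, cost = item
        return lm_logprob(text) - wfst_weight * cost
    return max(candidates, key=fused)[0]

# The same written token "1/2" yields different candidates' rankings by context.
date_context = [("meet on january second", 1.0), ("meet on one half", 1.0)]
recipe_context = [("add january second cup", 1.0), ("add one half cup", 1.0)]

print(pick_normalization(date_context, toy_lm_logprob))
print(pick_normalization(recipe_context, toy_lm_logprob))
```

Because the language model only ranks candidates that the WFST produced, it cannot emit a form outside the grammar, which mirrors the abstract's point about preventing unrecoverable errors.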

preprint2022arXiv

Simple and statistically sound recommendations for analysing physical theories

Physical theories that depend on many parameters or are tested against data from many different experiments pose unique challenges to statistical inference. Many models in particle physics, astrophysics and cosmology fall into one or both of these categories. These issues are often sidestepped with statistically unsound ad hoc methods, involving intersection of parameter intervals estimated by multiple experiments, and random or grid sampling of model parameters. Whilst these methods are easy to apply, they exhibit pathologies even in low-dimensional parameter spaces, and quickly become problematic to use and interpret in higher dimensions. In this article we give clear guidance for going beyond these procedures, suggesting where possible simple methods for performing statistically sound inference, and recommendations of readily-available software tools and standards that can assist in doing so. Our aim is to provide any physicists lacking comprehensive statistical training with recommendations for reaching correct scientific conclusions, with only a modest increase in analysis burden. Our examples can be reproduced with the code publicly available at https://doi.org/10.5281/zenodo.4322283.

preprint2022arXiv

SpeechSplit 2.0: Unsupervised speech disentanglement for voice conversion Without tuning autoencoder Bottlenecks

SpeechSplit can perform aspect-specific voice conversion by disentangling speech into content, rhythm, pitch, and timbre using multiple autoencoders in an unsupervised manner. However, SpeechSplit requires careful tuning of the autoencoder bottlenecks, which can be time-consuming and less robust. This paper proposes SpeechSplit 2.0, which constrains the information flow of the speech component to be disentangled on the autoencoder input using efficient signal processing methods instead of bottleneck tuning. Evaluation results show that SpeechSplit 2.0 achieves comparable performance to SpeechSplit in speech disentanglement and superior robustness to the bottleneck size variations. Our code is available at https://github.com/biggytruck/SpeechSplit2.

preprint2022arXiv

SSLGuard: A Watermarking Scheme for Self-supervised Learning Pre-trained Encoders

Self-supervised learning is an emerging machine learning paradigm. Compared to supervised learning, which leverages high-quality labeled datasets, self-supervised learning relies on unlabeled datasets to pre-train powerful encoders which can then be treated as feature extractors for various downstream tasks. The huge amount of data and computational resources consumed makes the encoders themselves valuable intellectual property of the model owner. Recent research has shown that the copyright of machine learning models is threatened by model stealing attacks, which aim to train a surrogate model to mimic the behavior of a given model. We empirically show that pre-trained encoders are highly vulnerable to model stealing attacks. However, most current copyright protection algorithms, such as watermarking, concentrate on classifiers. Meanwhile, the intrinsic challenges of copyright protection for pre-trained encoders remain largely unstudied. We fill the gap by proposing SSLGuard, the first watermarking scheme for pre-trained encoders. Given a clean pre-trained encoder, SSLGuard injects a watermark into it and outputs a watermarked version. The shadow training technique is also applied to preserve the watermark under potential model stealing attacks. Our extensive evaluation shows that SSLGuard is effective in watermark injection and verification, and it is robust against model stealing and other watermark removal attacks such as input noising, output perturbing, overwriting, model pruning, and fine-tuning.

preprint2022arXiv

Stability and low-energy orientations of interphase boundaries in multiaxial ferroelectrics: Phase-field simulations

The coexistence of different ferroelectric phases enables the tunability of the macroscopic properties and extensive applications from piezoelectric transducers to nonvolatile memories. Here we develop a thermodynamic model to predict the stability and low-energy orientations of boundaries between different phases in ferroelectrics. Taking lead zirconate titanate and bismuth ferrite as two examples, we demonstrate that the low-energy orientations of interphase boundaries are largely determined by minimizing the electrostatic and elastic energies. Phase-field simulations are employed to analyze the competition between the interfacial energy and the electrostatic and elastic energies. Our simulation results demonstrate that the lowering of crystal symmetry could occur due to the electrical and mechanical incompatibilities between the two phases, which can be used to explain the experimentally observed low-symmetry phases near morphotropic phase boundaries. Our work provides theoretical foundations for understanding and controlling the interphase boundaries in ferroelectric materials for multifunctional applications.

preprint2022arXiv

Strongly anisotropic electronic and magnetic structures in oxide dichlorides RuOCl$_2$ and OsOCl$_2$

Here, using density functional theory and density matrix renormalization group methods, we investigate the electronic and magnetic properties of RuOCl$_2$ and OsOCl$_2$ with $d^4$ electronic configurations. Different from a previous study using VOI$_2$ with $d^1$ configuration, these systems with $4d^4$ or $5d^4$ do not exhibit a ferroelectric instability along the $a$-axis. Due to the fully-occupied $d_{xy}$ orbital in RuOCl$_2$ and OsOCl$_2$, the Peierls instability distortion disappears along the $b$-axis, leading to an undistorted $Immm$ phase (No. 71). Furthermore, we observe strongly anisotropic electronic and magnetic structures along the $a$-axis. The large crystal-field splitting energy (between $d_{xz/yz}$ and $d_{xy}$ orbitals) and large hopping between nearest-neighbor Ru and Os atoms suppress the spin-orbital effect in $M$OCl$_2$ ($M$ = Ru or Os) with electronic density $n = 4$, resulting in a spin-1 system instead of a $J = 0$ singlet ground state. Moreover, we find staggered antiferromagnetic order with $π$ wavevector along the $M$-O chain direction ($a$-axis) while the magnetic coupling along the $b$-axis is weak. Based on Wannier functions from first-principles calculations, we calculated the relevant hopping amplitudes and crystal-field splitting energies of the $t_{2g}$ orbitals for the Os atoms to construct a multi-orbital Hubbard model for the $M$-O chains. Staggered AFM with $\uparrow$-$\downarrow$-$\uparrow$-$\downarrow$ spin structure dominates in our DMRG calculations, in agreement with DFT calculations.

preprint2022arXiv

Tackling Spoofing-Aware Speaker Verification with Multi-Model Fusion

Recent years have witnessed the extraordinary development of automatic speaker verification (ASV). However, previous works show that state-of-the-art ASV models are seriously vulnerable to voice spoofing attacks, and the recently proposed high-performance spoofing countermeasure (CM) models focus solely on the standalone anti-spoofing tasks and ignore the subsequent speaker verification process. How to integrate the CM and ASV together remains an open question. A spoofing aware speaker verification (SASV) challenge has recently taken place with the argument that better performance can be delivered when both CM and ASV subsystems are optimized jointly. Under the challenge's scenario, the integrated systems proposed by the participants are required to reject both impostor speakers and spoofing attacks from target speakers, which intuitively and effectively matches the expectation of a reliable, spoofing-robust ASV system. This work focuses on fusion-based SASV solutions and proposes a multi-model fusion framework to leverage the power of multiple state-of-the-art ASV and CM models. The proposed framework vastly improves the SASV-EER from 8.75% to 1.17%, which is an 86% relative improvement compared to the best baseline system in the SASV challenge.
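The relative improvement quoted in this abstract follows from the two error rates it reports. The short check below reproduces that arithmetic; the helper name is made up for this example, and the only inputs are the 8.75% and 1.17% SASV-EER values stated above.

```python
def relative_improvement(baseline, improved):
    """Fraction by which an error rate drops relative to the baseline."""
    return (baseline - improved) / baseline

# SASV-EER values taken from the abstract, in percent.
gain = relative_improvement(8.75, 1.17)
print(f"{gain:.1%}")
```

The result is roughly 86.6%, consistent with the 86% figure in the abstract.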

preprint2022arXiv

Teacher Model Fingerprinting Attacks Against Transfer Learning

Transfer learning has become a common solution to address training data scarcity in practice. It trains a specified student model by reusing or fine-tuning early layers of a well-trained teacher model that is usually publicly available. However, besides utility improvement, the transferred public knowledge also brings potential threats to model confidentiality, and even further raises other security and privacy issues. In this paper, we present the first comprehensive investigation of the teacher model exposure threat in the transfer learning context, aiming to gain a deeper insight into the tension between public knowledge and model confidentiality. To this end, we propose a teacher model fingerprinting attack to infer the origin of a student model, i.e., the teacher model it transfers from. Specifically, we propose a novel optimization-based method to carefully generate queries to probe the student model to realize our attack. Unlike existing model reverse engineering approaches, our proposed fingerprinting method neither relies on fine-grained model outputs, e.g., posteriors, nor auxiliary information of the model architecture or training dataset. We systematically evaluate the effectiveness of our proposed attack. The empirical results demonstrate that our attack can accurately identify the model origin with few probing queries. Moreover, we show that the proposed attack can serve as a stepping stone to facilitating other attacks against machine learning models, such as model stealing.

preprint2022arXiv

Theoretical study of the crystal and electronic properties of $α$-RuI$_3$

The material $α$-RuCl$_3$, with a two-dimensional Ru-honeycomb sublattice, has attracted considerable attention because it may be a realization of the Kitaev quantum spin liquid (QSL). Recently, a new honeycomb material, $α$-RuI$_3$, was prepared under moderate high-pressure and it is stable under ambient conditions. However, different from $α$-RuCl$_3$, $α$-RuI$_3$ was reported to be a paramagnetic metal without long-range magnetic order down to $0.35$ K. Here, the structural and electronic properties of the quasi-two-dimensional $α$-RuI$_3$ are theoretically studied. First, based on first-principles density functional theory (DFT) calculations, the ABC stacking honeycomb-layer $R\overline{3}$ (No. 148) structure is found to be the most likely stacking order for $α$-RuI$_3$ along the $c$-axis. Furthermore, both $R\overline{3}$ and $P\overline{3}1c$ are dynamically stable because no imaginary frequency modes were obtained in the phononic dispersion spectrum. Moreover, the different physical behavior of $α$-RuI$_3$ compared to $α$-RuCl$_3$ can be understood naturally. The strong hybridization between Ru $4d$ and I $5p$ orbitals decreases the effective atomic Hubbard repulsion $U$, leading the electrons of RuI$_3$ to be less localized than in RuCl$_3$. As a consequence, the effective repulsion $U$ is reduced from Cl to I, leading to the metallic nature of $α$-RuI$_3$. Based on the DFT+$U$ ($U_{\rm eff} = 2$ eV), plus spin-orbital coupling (SOC), we obtained a spin-orbit Mott insulating behavior for $α$-RuCl$_3$ and, by the same procedure, a metallic behavior for $α$-RuI$_3$, in good agreement with experimental results. Furthermore, when introducing a large (unrealistic) $U_{\rm eff} = 6$ eV, the spin-orbit Mott gap opens in $α$-RuI$_3$ as well, supporting the physical picture we are proposing.

preprint2022arXiv

Thermodynamics of the Reissner-Nordström-de Sitter Spacetime with Quintessence

For anti-de Sitter (AdS) black holes, the isochoric heat capacity of the system vanishes, while the isobaric heat capacity does not. However, this does not hold for de Sitter (dS) black holes. In this work, by introducing the interaction between the black hole horizon and the cosmological horizon of the Reissner-Nordström-de Sitter (RNdS) spacetime with quintessence, we discuss the phase transition of this system. The results show that the spacetime not only has phase transition behavior similar to that of a Van der Waals (VdW) system, but also has a non-vanishing isochoric heat capacity, as expected of a full thermodynamic system. Through the discussion of the entropic force between the two horizons, we find out the role of entropic force in the evolution of spacetime. In addition, we also study the influence of various parameters on the phase transition and entropic force, which will provide a new method for exploring the interaction among black hole molecules from a micro perspective.

preprint2022arXiv

Towards Realistic Visual Dubbing with Heterogeneous Sources

The task of few-shot visual dubbing focuses on synchronizing the lip movements with arbitrary speech input for any talking head video. Despite moderate improvements, current approaches commonly require high-quality homologous video and audio sources and thus fail to make sufficient use of heterogeneous data. In practice, it may be intractable to collect perfectly homologous data in some cases, for example, audio-corrupted or picture-blurry videos. To explore this kind of data and support high-fidelity few-shot visual dubbing, in this paper we propose a simple yet efficient two-stage framework with greater flexibility for mining heterogeneous data. Specifically, our two-stage paradigm employs facial landmarks as an intermediate prior of latent representations and disentangles lip movement prediction from the core task of realistic talking head generation. In this way, our method makes it possible to train the two sub-networks independently on more readily available heterogeneous data. Besides, thanks to the disentanglement, our framework allows further fine-tuning for a given talking head, thereby better preserving speaker identity in the final synthesized results. Moreover, the proposed method can also transfer appearance features from others to the target speaker. Extensive experimental results demonstrate the superiority of our proposed method over the state-of-the-art in generating highly realistic videos synchronized with the speech.

preprint2022arXiv

Transcranial photoacoustic computed tomography of human brain function

Herein we report the first in-human transcranial imaging of brain function using photoacoustic computed tomography. Functional responses to benchmark motor tasks were imaged on both the skull-less and the skull-intact hemispheres of a hemicraniectomy patient. The observed brain responses in these preliminary results demonstrate the potential of photoacoustic computed tomography for achieving transcranial functional imaging.

preprint2022arXiv

Trust It or Not: Confidence-Guided Automatic Radiology Report Generation

Medical imaging plays a pivotal role in diagnosis and treatment in clinical practice. Inspired by the significant progress in automatic image captioning, various deep learning (DL)-based methods have been proposed to generate radiology reports for medical images. Despite promising results, previous works overlook the uncertainties of their models and are thus unable to provide clinicians with the reliability/confidence of the generated radiology reports to assist their decision-making. In this paper, we propose a novel method to explicitly quantify both the visual uncertainty and the textual uncertainty for DL-based radiology report generation. Such multi-modal uncertainties can sufficiently capture the model confidence degree at both the report level and the sentence level, and thus they are further leveraged to weight the losses for more comprehensive model optimization. Experimental results have demonstrated that the proposed method for model uncertainty characterization and estimation can produce more reliable confidence scores for radiology report generation, and the modified loss function, which takes into account the uncertainties, leads to better model performance on two public radiology report datasets. In addition, the quality of the automatically generated reports was manually evaluated by human raters and the results also indicate that the proposed uncertainties can reflect the variance of clinical diagnosis.

preprint2022arXiv

Unsupervised Text-to-Speech Synthesis by Unsupervised Automatic Speech Recognition

An unsupervised text-to-speech synthesis (TTS) system learns to generate speech waveforms corresponding to any written sentence in a language by observing: 1) a collection of untranscribed speech waveforms in that language; 2) a collection of texts written in that language without access to any transcribed speech. Developing such a system can significantly improve the availability of speech technology to languages without a large amount of parallel speech and text data. This paper proposes an unsupervised TTS system based on an alignment module that outputs pseudo-text and another synthesis module that uses pseudo-text for training and real text for inference. Our unsupervised system can achieve comparable performance to the supervised system in seven languages with about 10-20 hours of speech each. A careful study on the effect of text units and vocoders has also been conducted to better understand what factors may affect unsupervised TTS performance. The samples generated by our models can be found at https://cactuswiththoughts.github.io/UnsupTTS-Demo, and our code can be found at https://github.com/lwang114/UnsupTTS.

preprint2022arXiv

WAVPROMPT: Towards Few-Shot Spoken Language Understanding with Frozen Language Models

Large-scale auto-regressive language models pretrained on massive text have demonstrated their impressive ability to perform new natural language tasks with only a few text examples, without the need for fine-tuning. Recent studies further show that such a few-shot learning ability can be extended to the text-image setting by training an encoder to encode the images into embeddings functioning like the text embeddings of the language model. Interested in exploring the possibility of transferring the few-shot learning ability to the audio-text setting, we propose a novel speech understanding framework, WavPrompt, where we finetune a wav2vec model to generate a sequence of audio embeddings understood by the language model. We show that WavPrompt is a few-shot learner that can perform speech understanding tasks better than a naive text baseline. We conduct detailed ablation studies on different components and hyperparameters to empirically identify the best model configuration. In addition, we conduct a non-speech understanding experiment to show WavPrompt can extract more information than just the transcriptions. Code is available at https://github.com/Hertin/WavPrompt

preprint2022arXiv

Why So Toxic? Measuring and Triggering Toxic Behavior in Open-Domain Chatbots

Chatbots are used in many applications, e.g., automated agents, smart home assistants, interactive characters in online games, etc. Therefore, it is crucial to ensure they do not behave in undesired manners, providing offensive or toxic responses to users. This is not a trivial task as state-of-the-art chatbot models are trained on large, public datasets openly collected from the Internet. This paper presents a first-of-its-kind, large-scale measurement of toxicity in chatbots. We show that publicly available chatbots are prone to providing toxic responses when fed toxic queries. Even more worryingly, some non-toxic queries can trigger toxic responses too. We then set out to design and experiment with an attack, ToxicBuddy, which relies on fine-tuning GPT-2 to generate non-toxic queries that make chatbots respond in a toxic manner. Our extensive experimental evaluation demonstrates that our attack is effective against public chatbot models and outperforms manually-crafted malicious queries proposed by previous work. We also evaluate three defense mechanisms against ToxicBuddy, showing that they either reduce the attack performance at the cost of affecting the chatbot's utility or are only effective at mitigating a portion of the attack. This highlights the need for more research from the computer security and online safety communities to ensure that chatbot models do not hurt their users. Overall, we are confident that ToxicBuddy can be used as an auditing tool and that our work will pave the way toward designing more effective defenses for chatbot safety.

preprint2022arXiv

Winning solutions and post-challenge analyses of the ChaLearn AutoDL challenge 2019

This paper reports the results and post-challenge analyses of ChaLearn's AutoDL challenge series, which helped sort out a profusion of AutoML solutions for Deep Learning (DL) that had been introduced in a variety of settings, but lacked fair comparisons. All input data modalities (time series, images, videos, text, tabular) were formatted as tensors and all tasks were multi-label classification problems. Code submissions were executed on hidden tasks, with limited time and computational resources, pushing solutions that get results quickly. In this setting, DL methods dominated, though popular Neural Architecture Search (NAS) was impractical. Solutions relied on fine-tuned pre-trained networks, with architectures matching data modality. Post-challenge tests did not reveal improvements beyond the imposed time limit. While no component is particularly original or novel, a high level modular organization emerged featuring a "meta-learner", "data ingestor", "model selector", "model/learner", and "evaluator". This modularity enabled ablation studies, which revealed the importance of (off-platform) meta-learning, ensembling, and efficient data management. Experiments on heterogeneous module combinations further confirm the (local) optimality of the winning solutions. Our challenge legacy includes an ever-lasting benchmark (http://autodl.chalearn.org), the open-sourced code of the winners, and a free "AutoDL self-service".

preprint2021arXiv

"Go eat a bat, Chang!": On the Emergence of Sinophobic Behavior on Web Communities in the Face of COVID-19

The outbreak of the COVID-19 pandemic has changed our lives in unprecedented ways. In the face of the projected catastrophic consequences, many countries have enacted social distancing measures in an attempt to limit the spread of the virus. Under these conditions, the Web has become an indispensable medium for information acquisition, communication, and entertainment. At the same time, unfortunately, the Web is being exploited for the dissemination of potentially harmful and disturbing content, such as the spread of conspiracy theories and hateful speech towards specific ethnic groups, in particular towards Chinese people since COVID-19 is believed to have originated from China. In this paper, we make a first attempt to study the emergence of Sinophobic behavior on the Web during the outbreak of the COVID-19 pandemic. We collect two large-scale datasets from Twitter and 4chan's Politically Incorrect board (/pol/) over a time period of approximately five months and analyze them to investigate whether there is a rise or important differences with regard to the dissemination of Sinophobic content. We find that COVID-19 indeed drives the rise of Sinophobia on the Web and that the dissemination of Sinophobic content is a cross-platform phenomenon: it exists on fringe Web communities like /pol/, and to a lesser extent on mainstream ones like Twitter. Also, using word embeddings over time, we characterize the evolution and emergence of new Sinophobic slurs on both Twitter and /pol/. Finally, we find interesting differences in the context in which words related to Chinese people are used on the Web before and after the COVID-19 outbreak: on Twitter we observe a shift towards blaming China for the situation, while on /pol/ we find a shift towards using more (and new) Sinophobic slurs.

preprint2021arXiv

$t$-$k$-means: A Robust and Stable $k$-means Variant

The $k$-means algorithm is one of the most classical clustering methods and has been widely and successfully used in signal processing. However, due to the thin-tailed property of the Gaussian distribution, the $k$-means algorithm performs relatively poorly on datasets containing heavy-tailed data or outliers. Besides, the standard $k$-means algorithm also has relatively weak stability, i.e., its results have a large variance, which reduces its credibility. In this paper, we propose a robust and stable $k$-means variant, dubbed $t$-$k$-means, as well as its fast version, to alleviate those problems. Theoretically, we derive the $t$-$k$-means and analyze its robustness and stability from the aspect of the loss function and the expression of the clustering center, respectively. Extensive experiments are also conducted, which verify the effectiveness and efficiency of the proposed method. The code for reproducing the main results is available at https://github.com/THUYimingLi/t-k-means.

preprint2021arXiv

A Unified Light Framework for Real-time Fault Detection of Freight Train Images

Real-time fault detection for freight trains plays a vital role in guaranteeing the security and optimal operation of railway transportation under stringent resource requirements. Despite the promising results of deep learning based approaches, the performance of these fault detectors on freight train images is far from satisfactory in both accuracy and efficiency. This paper proposes a unified light framework to improve detection accuracy while supporting real-time operation with a low resource requirement. We first design a novel lightweight backbone (RFDNet) to improve the accuracy and reduce computational cost. Then, we propose a multi region proposal network using multi-scale feature maps generated from RFDNet to improve the detection performance. Finally, we present multi level position-sensitive score maps and region of interest pooling to further improve accuracy with few redundant computations. Extensive experimental results on public benchmark datasets suggest that our RFDNet can significantly improve the performance of the baseline network with higher accuracy and efficiency. Experiments on six fault datasets show that our method is capable of real-time detection at over 38 frames per second and achieves competitive accuracy and lower computation than the state-of-the-art detectors.

preprint2021arXiv

Affinity Fusion Graph-based Framework for Natural Image Segmentation

This paper proposes an affinity fusion graph framework to effectively connect different graphs with highly discriminating power and nonlinearity for natural image segmentation. The proposed framework combines adjacency-graphs and kernel spectral clustering based graphs (KSC-graphs) according to a new definition named affinity nodes of multi-scale superpixels. These affinity nodes are selected based on a better affiliation of superpixels, namely subspace-preserving representation which is generated by sparse subspace clustering based on subspace pursuit. Then a KSC-graph is built via a novel kernel spectral clustering to explore the nonlinear relationships among these affinity nodes. Moreover, an adjacency-graph at each scale is constructed, which is further used to update the proposed KSC-graph at affinity nodes. The fusion graph is built across different scales, and it is partitioned to obtain final segmentation result. Experimental results on the Berkeley segmentation dataset and Microsoft Research Cambridge dataset show the superiority of our framework in comparison with the state-of-the-art methods. The code is available at https://github.com/Yangzhangcst/AF-graph.

preprint2021arXiv

AT-BERT: Adversarial Training BERT for Acronym Identification Winning Solution for SDU@AAAI-21

Acronym identification focuses on finding the acronyms and the phrases that have been abbreviated, which is crucial for scientific document understanding tasks. However, the limited size of manually annotated datasets hinders further improvement for the problem. Recent breakthroughs of language models pre-trained on large corpora clearly show that unsupervised pre-training can vastly improve the performance of downstream tasks. In this paper, we present an Adversarial Training BERT method named AT-BERT, our winning solution to acronym identification task for Scientific Document Understanding (SDU) Challenge of AAAI 2021. Specifically, the pre-trained BERT is adopted to capture better semantic representation. Then we incorporate the FGM adversarial training strategy into the fine-tuning of BERT, which makes the model more robust and generalized. Furthermore, an ensemble mechanism is devised to involve the representations learned from multiple BERT variants. Assembling all these components together, the experimental results on the SciAI dataset show that our proposed approach outperforms all other competitive state-of-the-art methods.

preprint2021arXiv

ByteSing: A Chinese Singing Voice Synthesis System Using Duration Allocated Encoder-Decoder Acoustic Models and WaveRNN Vocoders

This paper presents ByteSing, a Chinese singing voice synthesis (SVS) system based on duration allocated Tacotron-like acoustic models and WaveRNN neural vocoders. Different from the conventional SVS models, the proposed ByteSing employs Tacotron-like encoder-decoder structures as the acoustic models, in which the CBHG models and recurrent neural networks (RNNs) are explored as encoders and decoders respectively. Meanwhile, an auxiliary phoneme duration prediction model is utilized to expand the input sequence, which can enhance the model's controllability, stability and tempo prediction accuracy. WaveRNN neural vocoders are also adopted to further improve the voice quality of synthesized songs. Both objective and subjective experimental results show that the SVS method proposed in this paper can produce quite natural, expressive and high-fidelity songs by improving the pitch and spectrogram prediction accuracy, and that the models using an attention mechanism achieve the best performance.

preprint2021arXiv

Discovery of carbon-based strongest and hardest amorphous material

Carbon is likely the most fascinating element of the periodic table because of the diversity of its allotropes stemming from its variable (sp, sp2, and sp3) bonding motifs. Exploration of new forms of carbon has been an eternal theme of contemporary scientific research. Here we report on novel amorphous carbon phases containing a high fraction of sp3 bonded atoms recovered after compressing fullerene C60 to previously unexplored high pressure and temperature. The synthesized carbons are the hardest and strongest amorphous materials known to date, capable of scratching diamond crystal and approaching its strength, as evidenced by complementary mechanical tests. Photoluminescence and absorption spectra of the materials demonstrate they are semiconductors with tunable bandgaps in the range of 1.5-2.2 eV, comparable to that of amorphous silicon. A remarkable combination of the outstanding mechanical and electronic properties makes this class of amorphous carbons an excellent candidate for photovoltaic applications demanding ultrahigh strength and wear resistance.

preprint2021arXiv

Does Non-COVID19 Lung Lesion Help? Investigating Transferability in COVID-19 CT Image Segmentation

Coronavirus disease 2019 (COVID-19) is a highly contagious disease spreading all around the world. Deep learning has been adopted as an effective technique to aid COVID-19 detection and segmentation from computed tomography (CT) images. The major challenge lies in the inadequate public COVID-19 datasets. Recently, transfer learning has become a widely used technique that leverages the knowledge gained while solving one problem and applies it to a different but related problem. However, it remains unclear whether various non-COVID19 lung lesions could contribute to segmenting COVID-19 infection areas and how to better conduct this transfer procedure. This paper provides a way to understand the transferability of non-COVID19 lung lesions. Based on a publicly available COVID-19 CT dataset and three public non-COVID19 datasets, we evaluate four transfer learning methods using 3D U-Net as a standard encoder-decoder method. The results reveal the benefits of transferring knowledge from non-COVID19 lung lesions, and learning from multiple lung lesion datasets can extract more general features, leading to accurate and robust pre-trained models. We further show the capability of the encoder to learn feature representations of lung lesions, which improves segmentation accuracy and facilitates training convergence. In addition, our proposed Hybrid-encoder learning method incorporates transferred lung lesion features from non-COVID19 datasets effectively and achieves significant improvement. These findings promote new insights into transfer learning for COVID-19 CT image segmentation, which can also be further generalized to other medical tasks.

preprint2021arXiv

Dye-Encapsulated Zeolitic Imidazolate Framework (ZIF-71) for Fluorochromic Sensing of Pressure, Temperature, and Volatile Solvents

Luminescent metal-organic frameworks (MOFs) offer a multifunctional platform for creating non-invasive sensors and tuneable optoelectronics. However, fluorochromic materials that are photophysically resilient and show high sensitivity towards different physical and chemical stimuli are scarce. We report a facile host-guest nanoconfinement strategy to construct a fluorescent hybrid material with multiple sensing capabilities. We design and fabricate a new Guest@MOF material comprising a zeolitic MOF (ZIF-71) as a nanoporous host for encapsulating rhodamine B (RhB dye) guest molecules, resulting in the RhB@ZIF-71 system with mechanochromic, thermochromic, and solvatochromic sensing response. The fluorochromic sensing properties stem from the nanoconfinement effect that ZIF-71 imposes on RhB monomers, yielding the H- or J-type aggregates with tuneable photophysical and photochemical properties. For mechanochromism, the external pressure causes an emission red shift in a linear fashion, switching RhB guests from H-type to J-type aggregates through a shear deformation. For thermochromism, we demonstrate a linear scaling as a function of temperature due to the spatial restriction imposed on J-type aggregates incarcerated in ZIF-71 pores. Harnessing the solvatochromism of RhB@ZIF-71, we identified three diverse groups of volatile organic compounds. The multimodal sensing response paves the way to smart applications like photonic pressure sensors, non-invasive thermometers, and ultrasensitive chemosensors.

preprint2021arXiv

Evidence for $Z_{c}^{\pm}$ decays into the $\rho^{\pm} \eta_{c}$ final state

We study $e^{+}e^{-}$ collisions with a $\pi^{+}\pi^{-}\pi^{0}\eta_{c}$ final state using data samples collected with the BESIII detector at center-of-mass energies $\sqrt{s}=4.226$, $4.258$, $4.358$, $4.416$, and $4.600$ GeV. Evidence for the decay $Z_{c}(3900)^{\pm}\to\rho^{\pm}\eta_{c}$ is reported with a statistical significance of $3.9\sigma$ with various systematic uncertainties taken into account at $\sqrt{s} = 4.226$ GeV, and the Born cross section times branching fraction $\sigma^{B}(e^{+}e^{-}\to \pi^{\mp}Z_{c}(3900)^{\pm})\times \mathcal{B}(Z_{c}(3900)^{\pm}\to\rho^{\pm}\eta_{c})$ is measured to be $(48 \pm 11 \pm 11)\,\mathrm{pb}$. The $Z_{c}(3900)^{\pm}\to \rho^{\pm}\eta_{c}$ signal is not significant at the other center-of-mass energies and the corresponding upper limits are determined. In addition, no significant signal is observed in a search for $Z_{c}(4020)^{\pm}\to \rho^{\pm}\eta_{c}$ with the same data samples. The ratios $R_{Z_{c}(3900)}=\mathcal{B}(Z_{c}(3900)^{\pm}\to \rho^{\pm} \eta_{c})/\mathcal{B}(Z_{c}(3900)^{\pm}\to \pi^{\pm} J/\psi)$ and $R_{Z_{c}(4020)}=\mathcal{B}(Z_{c}(4020)^{\pm}\to \rho^{\pm} \eta_{c})/\mathcal{B}(Z_{c}(4020)^{\pm}\to \pi^{\pm} h_{c})$ are obtained and used to discriminate between different theoretical interpretations of the $Z_{c}(3900)^{\pm}$ and $Z_{c}(4020)^{\pm}$.

preprint2021arXiv

Explicit generators and relations for the centre of the quantum group

For the standard Drinfeld-Jimbo quantum group ${\rm U}_q(\mathfrak{g})$ associated with a simple Lie algebra $\mathfrak{g}$, we construct explicit generators of the centre $Z({\rm U}_q(\mathfrak{g}))$, and determine the relations satisfied by the generators. For $\mathfrak{g}$ of type $A_n(n\geq 2)$, $D_{2k+1}(k\geq 2)$ or $E_6$, the centre $Z({\rm U}_q(\mathfrak{g}))$ is isomorphic to a quotient of a polynomial algebra in multiple variables, which is described in a uniform manner for all cases. For $\mathfrak{g}$ of any other type, $Z({\rm U}_q(\mathfrak{g}))$ is generated by $n=$rank$(\mathfrak{g})$ algebraically independent elements.

preprint2021arXiv

Fuzzing Based on Function Importance by Interprocedural Control Flow Graph

Coverage-based graybox fuzzers (CGFs), such as AFL, have gained great success in vulnerability detection thanks to their ease of use and bug-finding power. Since some code fragments, such as memory allocation, are more vulnerable than others, various improvements have been proposed to explore the more vulnerable areas by collecting extra information from the program under test or its executions. However, these improvements only consider limited types of information sources and ignore the fact that the priority of a seed input to be fuzzed may be influenced by all the code it covers. Based on the above observations, we propose a fuzzing method based on the importance of functions. First, a data structure called Attributed Interprocedural Control Flow Graph (AICFG) is devised to combine different features of code fragments. Second, the importance of each node in the AICFG is calculated based on an improved PageRank algorithm, which also models the influence between connected nodes. During the fuzzing process, the node importance is updated periodically by a propagation algorithm. Then the seed selection and energy scheduling of a seed input are determined by the importance of its execution trace. We implement this approach on top of AFL in a tool named FunAFL and conduct an evaluation on 14 real-world programs against AFL and two of its improvements. FunAFL, with 17% higher branch coverage than the others on average, finds 13 bugs after 72 hours, 3 of which are confirmed by CVE.

preprint2021arXiv

Gluino-SUGRA scenarios in light of FNAL muon g-2 anomaly

Gluino-SUGRA ($\tilde{g}$SUGRA), which is an economical extension of the predictive mSUGRA, adopts a much heavier gluino mass parameter than the other gaugino mass parameters and the universal scalar mass parameter at the unification scale. It can elegantly reconcile the experimental results on the Higgs boson mass, the muon $g-2$, the null results in searches for supersymmetry at the LHC and the results from B-physics. In this work, we propose several new ways to generate a large gaugino hierarchy (i.e. $M_3\gg M_1,M_2$) for $\tilde{g}$SUGRA model building and then discuss in detail the implications of the new muon $g-2$ results with the updated LHC constraints on such $\tilde{g}$SUGRA scenarios. We obtain the following observations: (i) For the most interesting $M_1=M_2$ case at the GUT scale with a viable bino-like dark matter, the $\tilde{g}$SUGRA can explain the muon $g-2$ anomaly at $1\sigma$ level and be consistent with the updated LHC constraints for $6\leq M_3/M_1 \leq 9$ at the GUT scale; (ii) For $M_1:M_2=5:1$ at the GUT scale with wino-like dark matter, the $\tilde{g}$SUGRA model can explain the muon $g-2$ anomaly at $2\sigma$ level and be consistent with the updated LHC constraints for $3\leq M_3/M_1 \leq 3.2$ at the GUT scale; (iii) For $M_1:M_2=3:2$ at the GUT scale with mixed bino-wino dark matter, the $\tilde{g}$SUGRA model can explain the muon $g-2$ anomaly at $1\sigma$ level and be consistent with the updated LHC constraints for $6.9\leq M_3/M_1 \leq 7.5$ at the GUT scale. Although the choice of a heavy gluino will always increase the fine-tuning (FT) involved, some of the surviving $1\sigma/2\sigma$ points of $\Delta a_\mu^{combine}$ can still allow a low electroweak fine-tuning (EWFT) of order several hundred and be fairly natural. Constraints from (dimension-five operator induced) proton decay are also discussed.

preprint2021arXiv

Interfacial ferroelectricity in rhombohedral-stacked bilayer transition metal dichalcogenides

Van der Waals (vdW) materials have greatly expanded our design space of heterostructures by allowing individual layers to be stacked at non-equilibrium configurations, for example via control of the twist angle. Such heterostructures not only combine characteristics of the individual building blocks, but can also exhibit emergent physical properties absent in the parent compounds through interlayer interactions. Here we report on a new family of emergent, nanometer-thick, semiconductor 2D ferroelectrics, where the individual constituents are well-studied non-ferroelectric monolayer transition metal dichalcogenides (TMDs), namely WSe2, MoSe2, WS2, and MoS2. By stacking two identical monolayer TMDs in parallel, we obtain electrically switchable rhombohedral-stacking configurations, with out-of-plane polarization that is flipped by in-plane sliding motion. Fabricating nearly-parallel stacked bilayers enables the visualization of moiré ferroelectric domains as well as electric-field-induced domain wall motion with piezoelectric force microscopy (PFM). Furthermore, by using a nearby graphene electronic sensor in a ferroelectric field transistor geometry, we quantify the ferroelectric built-in interlayer potential, in good agreement with first-principles calculations. The novel semiconducting ferroelectric properties of these four new TMDs open up the possibility of studying the interplay between ferroelectricity and their rich electric and optical properties.

preprint2021arXiv

Node-Level Membership Inference Attacks Against Graph Neural Networks

Much real-world data comes in the form of graphs, such as social networks and protein structures. To fully utilize the information contained in graph data, a new family of machine learning (ML) models, namely graph neural networks (GNNs), has been introduced. Previous studies have shown that machine learning models are vulnerable to privacy attacks. However, most of the current efforts concentrate on ML models trained on data from the Euclidean space, like images and texts. On the other hand, privacy risks stemming from GNNs remain largely unstudied. In this paper, we fill the gap by performing the first comprehensive analysis of node-level membership inference attacks against GNNs. We systematically define the threat models and propose three node-level membership inference attacks based on an adversary's background knowledge. Our evaluation on three GNN structures and four benchmark datasets shows that GNNs are vulnerable to node-level membership inference even when the adversary has minimal background knowledge. Besides, we show that graph density and feature similarity have a major impact on the attack's success. We further investigate two defense mechanisms and the empirical results indicate that these defenses can reduce the attack performance but with moderate utility loss.

preprint2021arXiv

Orbital ordering in the layered perovskite material CsVF$_4$

In strongly correlated electronic systems, several novel physical properties are induced by the orbital degree of freedom. In particular, orbital degeneracy near the Fermi level leads to spontaneous symmetry breaking, such as the nematic state in FeSe and the orbital ordering in several perovskite systems. Here, the novel layered perovskite material CsVF$_4$, with a $3d^2$ electronic configuration, was systematically studied using density functional theory and a multiorbital Hubbard model within the Hartree-Fock approximation. Our results show that CsVF$_4$ should be magnetic, with a G-type antiferromagnetic arrangement in the $ab$ plane and weak antiferromagnetic exchange along the $c$-axis, in agreement with experimental results. Driven by the Jahn-Teller distortion in the VF$_6$ octahedra that shortens the $c$-axis, the system displays an interesting electron occupancy $d_{xy}^1(d_{xz}d_{yz})^1$ corresponding to the lower nondegenerate $d_{xy}$ orbital being half-filled and the other two degenerate $d_{yz}$ and $d_{xz}$ orbitals sharing one electron per site. We show that this degeneracy is broken, and a novel $d_{yz}$/$d_{xz}$ staggered orbital pattern is predicted by both the first-principles and Hubbard model calculations. This orbital ordering is driven by the electronic instability associated with degeneracy removal to lower the energy.

preprint2021arXiv

Q-dependent Collective Relaxation Dynamics of Glass-Forming Liquid Ca0.4K0.6(NO3)1.4 Investigated by Wide-Angle Neutron Spin-Echo

Employing wide-angle neutron spin echo spectroscopy, we measured the Q-dependent coherent intermediate scattering function of the prototypical ionic glass former Ca0.4K0.6(NO3)1.4, in the equilibrium and supercooled liquid states beyond the hydrodynamic regime. The data reveal a clear two-step relaxation: an exponential fast process, and a stretched exponential slow alpha process. De Gennes narrowing is observed in all characteristic variables of the alpha process: the relaxation time, amplitude, and stretching exponent. At all length scales probed, the relative amplitude of the alpha-relaxation decreases with increasing temperature and levels off in the normal liquid state. The temperature dependence of the stretching exponent and the relaxation time at different Q's indicates that modifications of the relaxation mechanisms at the local length scales, manifested as temperature independent dynamic heterogeneity and smaller deviations from Arrhenius behavior, have occurred even above the alpha-beta (Johari-Goldstein) bifurcation temperature.

preprint2021arXiv

Quantum versus Classical Regime in Circuit Quantum Acoustodynamics

We experimentally study a circuit quantum acoustodynamics system, which consists of a superconducting artificial atom, coupled to both a two-dimensional surface acoustic wave resonator and a one-dimensional microwave transmission line. The strong coupling between the artificial atom and the acoustic wave resonator is confirmed by the observation of the vacuum Rabi splitting at the base temperature of the dilution refrigerator. We show that the propagation of microwave photons in the microwave transmission line can be controlled by a few phonons in the acoustic wave resonator. Furthermore, we demonstrate the temperature effect on the measurements of the Rabi splitting and temperature induced transitions from highly excited dressed states. We find that the two-peak spectrum structure of the Rabi splitting evolves into a structure of several peaks, and gradually disappears as the environmental temperature $T$ increases. The quantum-to-classical transition is observed around the crossover temperature $T_{c}$, which is determined via the thermal fluctuation energy $k_{B}T$ and the characteristic energy level spacing of the coupled system. Experimental results agree well with the theoretical simulations via the master equation of the coupled system at different effective temperatures.

preprint2020arXiv

A cryogenic-helium pipe flow facility with unique double-line molecular tagging velocimetry capability

Cryogenic helium-4 has extremely small kinematic viscosity, which makes it a promising material for high Reynolds number ($Re$) turbulence research in compact laboratory apparatuses. In its superfluid phase (He II), helium has an extraordinary heat transfer capability and has been utilized in various scientific and engineering applications. In order to unlock the full potential of helium in turbulence research and to improve our understanding of the heat transfer mechanism in He II, a flow facility that allows quantitative study of helium heat-and-mass transfer processes is needed. Here we report our work in assembling and testing a unique helium pipe flow facility that incorporates a novel double-line molecular tagging velocimetry (DL-MTV) system. This flow facility allows us to generate turbulent pipe flows with $Re$ above $10^7$, and it can also be adapted to produce heat-induced counterflow in He II. The DL-MTV system, which is based on the generation and tracking of two parallel thin He$^*_2$ molecular tracer lines with an adjustable separation distance, allows us to measure not only the velocity profile but also both the transverse and longitudinal spatial velocity structure functions. We have also installed a differential pressure sensor on the flow pipe for pressure drop measurement. The testing results of the flow facility and the measurement devices are presented. We discuss how this facility will allow us to solve some outstanding problems in the helium heat-and-mass transfer topic area.

preprint2020arXiv

A deep-learning based generalized reduced-order model of glottal flow during normal phonation

This paper proposes a deep-learning based generalized reduced-order model (ROM) that can provide a fast and accurate prediction of the glottal flow during normal phonation. The approach is based on the assumption that the vibration of the vocal folds can be represented by a universal kinematics equation (UKE), which is used to generate a glottal shape library. For each shape in the library, the ground truth values of the flow rate and pressure distribution are obtained from the high-fidelity Navier-Stokes (N-S) solution. A fully-connected deep neural network (DNN) is then trained to build the empirical mapping between the shapes and the flow rate and pressure distributions. The obtained DNN based reduced-order flow solver is coupled with a finite-element method (FEM) based solid dynamics solver for fluid-structure interaction (FSI) simulation of phonation. The reduced-order model is evaluated by comparing it to the Navier-Stokes solutions for both static glottal shapes and FSI simulations. The results demonstrate good prediction performance in accuracy and efficiency.

preprint2020arXiv

A Fast and Robust BERT-based Dialogue State Tracker for Schema-Guided Dialogue Dataset

Dialog State Tracking (DST) is one of the most crucial modules for goal-oriented dialogue systems. In this paper, we introduce FastSGT (Fast Schema Guided Tracker), a fast and robust BERT-based model for state tracking in goal-oriented dialogue systems. The proposed model is designed for the Schema-Guided Dialogue (SGD) dataset which contains natural language descriptions for all the entities including user intents, services, and slots. The model incorporates two carry-over procedures for handling the extraction of the values not explicitly mentioned in the current user utterance. It also uses multi-head attention projections in some of the decoders to have a better modelling of the encoder outputs. In the conducted experiments we compared FastSGT to the baseline model for the SGD dataset. Our model keeps the efficiency in terms of computational and memory consumption while improving the accuracy significantly. Additionally, we present ablation studies measuring the impact of different parts of the model on its performance. We also show the effectiveness of data augmentation for improving the accuracy without increasing the amount of computational resources.

preprint2020arXiv

A hybrid text normalization system using multi-head self-attention for Mandarin

In this paper, we propose a hybrid text normalization system using multi-head self-attention. The system combines the advantages of a rule-based model and a neural model for text preprocessing tasks. Previous studies in Mandarin text normalization usually use a set of hand-written rules, which are hard to improve on general cases. The idea of our proposed system is motivated by the neural models from recent studies and has a better performance on our internal news corpus. This paper also includes different attempts to deal with imbalanced pattern distribution of the dataset. Overall, the performance of the system is improved by over 1.5% on sentence-level and it has a potential to improve further.

preprint2020arXiv

A massless scalar field in Robertson-Walker spacetimes: Adiabatic regularization and Green's function

We study adiabatic regularization of a coupling massless scalar field in general spatially flat Robertson-Walker (RW) spacetimes. For conformal coupling, the 0th-order regularized power spectrum and 0th-order regularized stress tensor are zero, and no trace anomaly exists in general RW spacetimes. This is a new result which extends those found in de Sitter space. For minimal coupling, the regularized spectra are also zero in the radiation-dominant stage, the matter-dominant stage, and de Sitter space as well. The vanishing of these adiabatically regularized spectra is also confirmed by direct regularization of the Green's functions. For a general coupling and general RW spacetimes, the regularized spectra can be negative under the conventional prescription. By going to higher order of regularization, the spectra will generally become positive, but will also acquire an IR divergence which is inevitable for a massless field. To avoid the IR divergence, the inside-horizon regularization is applied. By these procedures, one will eventually achieve a nonnegative, UV- and IR-convergent power spectrum and spectral energy density.

preprint2020arXiv

A New Data Normalization Method to Improve Dialogue Generation by Minimizing Long Tail Effect

Recent neural models have shown significant progress in dialogue generation. Most generation models are based on language models. However, due to the Long Tail Phenomenon in linguistics, the trained models tend to generate words that appear frequently in training datasets, leading to a monotonous issue. To address this issue, we analyze a large corpus from Wikipedia and propose three frequency-based data normalization methods. We conduct extensive experiments based on transformers and three datasets respectively collected from social media, subtitles, and the industrial application. Experimental results demonstrate significant improvements in diversity and informativeness (defined as the numbers of nouns and verbs) of generated responses. More specifically, the unigram and bigram diversity are increased by 2.6%-12.6% and 2.2%-18.9% on the three datasets, respectively. Moreover, the informativeness, i.e. the numbers of nouns and verbs, are increased by 4.0%-7.0% and 1.4%-12.1%, respectively. Additionally, the simplicity and effectiveness enable our methods to be adapted to different generation models without much extra computational cost.

preprint2020arXiv

A Novel Software-based Multi-path RDMA Solution for Data Center Networks

In this paper we propose Virtuoso, a purely software-based multi-path RDMA solution for data center networks (DCNs) to effectively utilize the rich multi-path topology for load balancing and reliability. As a "middleware" library operating in user space, Virtuoso employs three innovative mechanisms to achieve its goal. In contrast to existing hardware-based MP-RDMA solutions, Virtuoso can be readily deployed in DCNs with existing RDMA NICs. It also decouples path selection and load balancing mechanisms from hardware features, allowing DCN operators and applications to make flexible decisions by employing the best mechanisms (as "plug-in" software library modules) as needed. Our experiments show that Virtuoso is capable of fully utilizing multiple paths with negligible CPU overheads.

preprint2020arXiv

A Theoretical Explanation for Perplexing Behaviors of Backpropagation-based Visualizations

Backpropagation-based visualizations have been proposed to interpret convolutional neural networks (CNNs); however, a theory is missing to justify their behaviors: guided backpropagation (GBP) and deconvolutional network (DeconvNet) generate more human-interpretable but less class-sensitive visualizations than saliency maps. Motivated by this, we develop a theoretical explanation revealing that GBP and DeconvNet are essentially doing (partial) image recovery which is unrelated to the network decisions. Specifically, our analysis shows that the backward ReLU introduced by GBP and DeconvNet, and the local connections in CNNs are the two main causes of compelling visualizations. Extensive experiments are provided that support the theoretical analysis.

preprint2020arXiv

Adaptive Multiscale Illumination-Invariant Feature Representation for Undersampled Face Recognition

This paper presents a novel illumination-invariant feature representation approach used to eliminate the effect of varying illumination in undersampled face recognition. Firstly, a new illumination level classification technique based on Singular Value Decomposition (SVD) is proposed to judge the illumination level of the input image. Secondly, we construct the logarithm edgemaps feature (LEF) based on the Lambertian model and the local near neighbor feature of the face image, applied to local regions within multiple scales. Then, the illumination level is referenced to construct the high-performance LEF as well as to realize adaptive fusion of the multiple-scale LEFs for the face image, yielding the JLEF-feature. In addition, a constraint operation is used to remove the useless high-frequency interference, disentangling useful facial feature edges and constructing the AJLEF-face. Finally, the effects of our methods and other state-of-the-art algorithms, including deep learning methods, are tested on Extended Yale B, CMU PIE, AR as well as our Self-build Driver database (SDB). The experimental results demonstrate that the JLEF-feature and AJLEF-face outperform other related approaches for undersampled face recognition under varying illumination.

preprint2020arXiv

Adiabatic regularization and Green's function of a scalar field in de Sitter space: Positive energy spectrum and no trace anomaly

In the conventional adiabatic regularization the vacuum ultraviolet divergences of a quantum field in curved spacetime are removed by subtracting the $k$-mode of the stress tensor to the 4th-order. For a scalar field in de Sitter space, we find that the 4th-order regularized spectral energy density is negative. Moreover, the 2nd-order regularization for minimal coupling ($ξ=0$) and the 0th-order regularization for conformal coupling ($ξ=\frac16$) yield a positive and UV-convergent spectral energy density and power spectrum. The regularized stress tensor in the vacuum is maximally symmetric and can drive inflation, while its $k$-modes representing the primordial fluctuations are nonuniformly distributed. Conventional regularization of a Green's function in position space is generally plagued by a log IR divergence. Only in the massless case with $ξ=0$ or $\frac16$, we can directly regularize the Green's functions and obtain vanishing results that agree with the adiabatic regularization results. In this case, the regularized power spectrum and stress tensor are both zero, and no trace anomaly exists. To overcome the log IR divergence problem in the massive cases with $ξ=0$ and $\frac16$, we perform Fourier transformation of the regularized power spectra and obtain the regularized analytical Green's functions which are IR- and UV-convergent.

preprint2020arXiv

Analysis of the decay $D^0\rightarrow K_{S}^{0} K^{+} K^{-}$

Using a data sample of $2.93~fb^{-1}$ of $e^+e^-$ collisions collected at $\sqrt{s}=3.773 GeV$ in the BESIII experiment, we perform an analysis of the decay $D^0\rightarrow K_{S}^{0} K^{+} K^{-}$. The Dalitz plot is analyzed using $1856\pm 45$ flavor-tagged signal decays. We find that the Dalitz plot is well described by a set of six resonances: $a_0(980)^0$, $a_0(980)^+$, $ϕ(1020)$, $a_2(1320)^+$, $a_2(1320)^-$ and $a_0(1450)^-$. Their magnitudes, phases and fit fractions are determined as well as the coupling of $a_0(980)$ to $K\bar{K}$, $g_{K\bar{K}}=3.77\pm 0.24\text{(stat.)}\pm0.35\text{(sys.)} GeV$. The branching fraction of the decay $D^0\rightarrow K_{S}^{0} K^{+} K^{-}$ is measured using $11660\pm 118$ untagged signal decays to be $(4.51\pm 0.05\text{(stat.)}\pm 0.16\text{(sys.)})\times 10^{-3}$. Both measurements are limited by their systematic uncertainties.

preprint2020arXiv

Anisotropic electrical and thermal magnetotransport in the magnetic semimetal GdPtBi

The half-Heusler rare-earth intermetallic GdPtBi has recently gained attention due to peculiar magnetotransport phenomena that have been associated with the possible existence of Weyl fermions, thought to arise from the crossings of spin-split conduction and valence bands. On the other hand, similar magnetotransport phenomena observed in other rare-earth intermetallics have often been attributed to the interaction of itinerant carriers with localized magnetic moments stemming from the $4f$-shell of the rare-earth element. In order to address the origin of the magnetotransport phenomena in GdPtBi, we performed a comprehensive study of the magnetization, electrical and thermal magnetoresistivity on two single-crystalline GdPtBi samples. In addition, we performed an analysis of the Fermi surface via Shubnikov-de Haas oscillations in one of the samples and compared the results to \emph{ab initio} band structure calculations. Our findings indicate that the electrical and thermal magnetotransport in GdPtBi cannot be solely explained by Weyl physics and is strongly influenced by the interaction of both itinerant charge carriers and phonons with localized magnetic Gd-ions and possibly also paramagnetic impurities.

preprint2020arXiv

Anomalous Hall effect in Weyl semimetal half Heusler compounds RPtBi (R = Gd and Nd)

Topological materials ranging from topological insulators to Weyl and Dirac semimetals form one of the most exciting current fields in condensed-matter research. Many half-Heusler compounds, RPtBi (R = rare earth), have been theoretically predicted to be topological semimetals. Among various topological attributes envisaged in RPtBi, topological surface states, chiral anomaly and planar Hall effect have been observed experimentally. Here, we report on an unusual intrinsic anomalous Hall effect (AHE) in the antiferromagnetic Heusler Weyl semimetal compounds GdPtBi and NdPtBi that is observed over a wide temperature range. In particular, GdPtBi exhibits an anomalous Hall conductivity of up to 60 $Ω^{-1}$cm$^{-1}$ and an anomalous Hall angle as large as 23%. Muon spin resonance (mu-SR) studies of GdPtBi indicate a sharp antiferromagnetic transition ($T_N$) at 9 K without any noticeable magnetic correlations above $T_N$. Our studies indicate that Weyl points in these half-Heuslers are induced by a magnetic field via exchange-splitting of the electronic bands at or near the Fermi energy, which is the source of the chiral anomaly and the AHE.

preprint2020arXiv

Antiferromagnetism of Double Molybdate LiFe(MoO$_4$)$_2$

The magnetic properties of the spin-5/2 double molybdate LiFe(MoO$_4$)$_2$ have been characterized by heat capacity, magnetic susceptibility, and neutron powder diffraction techniques. Unlike the multiferroic system LiFe(WO$_4$)$_2$ which exhibits two successive magnetic transitions, LiFe(MoO$_4$)$_2$ undergoes only one antiferromagnetic transition at $T_N$ ~ 23.8 K. Its antiferromagnetic structure with the commensurate propagation vector k = (0, 0.5, 0) has been determined. Density functional theory calculations confirm the antiferromagnetic ground state and provide a numerical estimate of the relevant exchange coupling constants.

preprint2020arXiv

Borophosphene as a promising Dirac anode with large capacity and high-rate capability for sodium-ion batteries

Sodium-ion batteries (SIBs) have attracted a great deal of attention as potential low-cost energy storage alternatives to Lithium-ion batteries (LIBs) due to the intrinsic safety and great abundance of sodium on Earth. For developing competitive SIBs, highly efficient anode materials with large capacity and rapid ion diffusion are indispensable. In this study, a two-dimensional (2D) Dirac monolayer, that is borophosphene, is proposed to be a promising anode material for high performance SIBs on the basis of density functional theory calculations. The performances of Na adsorption and diffusion, maximum specific capacity, open circuit voltage, cyclical stability and electronic properties combined with Bader charge analysis are explored. It is found that the borophosphene can spontaneously adsorb Na atom with binding energy of -0.838 eV. A low diffusion energy barrier of 0.221 eV suggests rapid ion conductivity. More intriguingly, a maximum specific capacity of 1282 mAh/g can be achieved in borophosphene, which is one of the largest values reported in 2D anode materials for SIBs. A low average voltage of 0.367 V is estimated, implying a suitable voltage of the anode material. Metallic properties, tiny surface expansion, and good kinetic stability of sodiated borophosphene give rise to high electrical conductivity and favorable cyclability. These advantages above suggest the borophosphene can be used as a Dirac anode material for SIBs with excellent performances of large specific capacity, high-rate capability, and favorable cyclability.

preprint2020arXiv

CCA: Exploring the Possibility of Contextual Camouflage Attack on Object Detection

Deep neural network based object detection has become the cornerstone of many real-world applications. Along with this success come concerns about its vulnerability to malicious attacks. To gain more insight into this issue, we propose a contextual camouflage attack (CCA for short) algorithm to influence the performance of object detectors. In this paper, we use an evolutionary search strategy and adversarial machine learning in interactions with a photo-realistic simulated environment to find camouflage patterns that are effective over a huge variety of object locations, camera poses, and lighting conditions. The proposed camouflages are validated as effective against most of the state-of-the-art object detectors.

preprint2020arXiv

Citation Recommendations Considering Content and Structural Context Embedding

The number of academic papers being published has been increasing exponentially in recent years, and recommending adequate citations to assist researchers in writing papers is a non-trivial task. Conventional approaches may not be optimal, as the recommended papers may already be known to the users, or be solely relevant to the surrounding context but not other ideas discussed in the manuscript. In this work, we propose a novel embedding algorithm DocCit2Vec, along with the new concept of "structural context", to tackle the aforementioned issues. The proposed approach demonstrates superior performance to baseline models in extensive experiments designed to simulate practical usage scenarios.

preprint2020arXiv

Cognitive Radio Network Throughput Maximization with Deep Reinforcement Learning

Radio Frequency powered Cognitive Radio Networks (RF-CRN) are likely to be the eyes and ears of upcoming modern networks such as Internet of Things (IoT), requiring increased decentralization and autonomous operation. To be considered autonomous, the RF-powered network entities need to make decisions locally to maximize the network throughput under the uncertainty of any network environment. However, in complex and large-scale networks, the state and action spaces are usually large, and existing tabular reinforcement learning techniques are unable to find the optimal state-action policy quickly. In this paper, deep reinforcement learning is proposed to overcome these shortcomings and allow a wireless gateway to derive an optimal policy to maximize network throughput. When benchmarked against advanced DQN techniques, our proposed DQN configuration offers a performance speedup of up to 1.8x with good overall performance.

preprint2020arXiv

Comprehensive scan for nonmagnetic Weyl semimetals with nonlinear optical response

With the development of topological band theory, comprehensive databases of nonmagnetic topological materials protected by time-reversal and crystalline symmetries have recently been developed via first-principles calculations. However, owing to the low symmetry requirement of Weyl points, the symmetry-based topological indicator cannot be applied to Weyl semimetals (WSMs). Hitherto, WSMs with Weyl points at arbitrary positions are still absent from the well-known databases. In this work, we develop an efficient algorithm to search for Weyl points automatically and establish a database of nonmagnetic WSMs with Weyl points near the Fermi level, based on all experimental noncentrosymmetric crystal structures in the Inorganic Crystal Structure Database (ICSD). In total, 46 Weyl semimetals were found to have a nearly clean Fermi surface and Weyl points within 300 meV of the Fermi level, and 9 of them are chiral structures which may host the quantized circular photogalvanic effect. In addition, the nonlinear optical response is studied and the giant shift current is explored. Besides nonmagnetic WSMs, our tools can also be used in the discovery of magnetic topological materials.

preprint2020arXiv

Copy and Paste GAN: Face Hallucination from Shaded Thumbnails

Existing face hallucination methods based on convolutional neural networks (CNN) have achieved impressive performance on low-resolution (LR) faces in a normal illumination condition. However, their performance degrades dramatically when LR faces are captured in low or non-uniform illumination conditions. This paper proposes a Copy and Paste Generative Adversarial Network (CPGAN) to recover authentic high-resolution (HR) face images while compensating for low and non-uniform illumination. To this end, we develop two key components in our CPGAN: internal and external Copy and Paste nets (CPnets). Specifically, our internal CPnet exploits facial information residing in the input image to enhance facial details; while our external CPnet leverages an external HR face for illumination compensation. A new illumination compensation loss is thus developed to capture illumination from the external guided face image effectively. Furthermore, our method offsets illumination and upsamples facial details alternately in a coarse-to-fine fashion, thus alleviating the correspondence ambiguity between LR inputs and external HR inputs. Extensive experiments demonstrate that our method manifests authentic HR face images in a uniform illumination condition and outperforms state-of-the-art methods qualitatively and quantitatively.

preprint2020arXiv

Cross section measurement of $e^+e^- \rightarrow η'J/ψ$ from $\sqrt{s} = 4.178$ to $4.600$ GeV

The cross section of the process $e^+e^- \rightarrow η'J/ψ$ is measured at center-of-mass energies from $\sqrt{s} =$ 4.178 to 4.600 GeV using data samples corresponding to a total integrated luminosity of 11 fb$^{-1}$ collected with the BESIII detector operating at the BEPCII storage ring. The dependence of the cross section on $\sqrt{s}$ shows an enhancement around $4.2$ GeV. While the shape of the cross section cannot be fully explained with a single $ψ(4160)$ or $ψ(4260)$ state, a coherent sum of the two states does provide a reasonable description of the data.

preprint2020arXiv

Cylinder partition function of the 6-vertex model from algebraic geometry

We compute the exact partition function of the isotropic 6-vertex model on a cylinder geometry with free boundary conditions, for lattices of intermediate size, using Bethe ansatz and algebraic geometry. We perform the computations in both the open and closed channels. We also consider the partial thermodynamic limits, whereby in the open (closed) channel, the open (closed) direction is kept small while the other direction becomes large. We compute the zeros of the partition function in the two partial thermodynamic limits, and compare with the condensation curves.

preprint2020arXiv

Dark matter, electroweak phase transition and gravitational wave in the type-II two-Higgs-doublet model with a singlet scalar field

In the framework of the type-II two-Higgs-doublet model with a singlet scalar dark matter $S$, we study the dark matter observables, the electroweak phase transition, and the gravitational wave signals generated by such a strongly first-order phase transition after imposing the constraints of the LHC Higgs data. We take the heavy CP-even Higgs $H$ as the only portal between the dark matter and SM sectors, and find the LHC Higgs data and dark matter observables require $m_S$ and $m_H$ to be larger than 130 GeV and 360 GeV for $m_A=600$ GeV in the case of the 125 GeV Higgs with the SM-like coupling. Next, we carve out some parameter space where a strongly first-order electroweak phase transition can be achieved, and find benchmark points for which the amplitudes of gravitational wave spectra reach the sensitivities of the future gravitational wave detectors.

preprint2020arXiv

Deep Anomaly Detection for Time-series Data in Industrial IoT: A Communication-Efficient On-device Federated Learning Approach

Since edge device failures (i.e., anomalies) seriously affect the production of industrial products in Industrial IoT (IIoT), detecting anomalies accurately and in a timely manner is becoming increasingly important. Furthermore, data collected by the edge device may contain the user's private data, which challenges current detection approaches as user privacy has attracted growing public concern in recent years. With this focus, this paper proposes a new communication-efficient on-device federated learning (FL)-based deep anomaly detection framework for sensing time-series data in IIoT. Specifically, we first introduce an FL framework to enable decentralized edge devices to collaboratively train an anomaly detection model, which can improve its generalization ability. Second, we propose an Attention Mechanism-based Convolutional Neural Network-Long Short Term Memory (AMCNN-LSTM) model to accurately detect anomalies. The AMCNN-LSTM model uses attention mechanism-based CNN units to capture important fine-grained features, thereby preventing memory loss and gradient dispersion problems. Furthermore, this model retains the advantages of the LSTM unit in predicting time series data. Third, to adapt the proposed framework to the timeliness of industrial anomaly detection, we propose a gradient compression mechanism based on Top-\textit{k} selection to improve communication efficiency. Extensive experimental studies on four real-world datasets demonstrate that the proposed framework can detect anomalies accurately and in a timely manner and also reduce the communication overhead by 50\% compared to the federated learning framework that does not use a gradient compression scheme.

preprint2020arXiv

Deep Attentive Generative Adversarial Network for Photo-Realistic Image De-Quantization

Most current display devices have a bit-depth of eight or higher. However, most multimedia tools cannot achieve this bit-depth standard for the images they generate. De-quantization can improve the visual quality of a low bit-depth image for display on a high bit-depth screen. This paper proposes the DAGAN algorithm to perform super-resolution on image intensity resolution, which is orthogonal to the spatial resolution, realizing photo-realistic de-quantization via an end-to-end learning pattern. To date, this is the first attempt to apply the Generative Adversarial Network (GAN) framework to image de-quantization. Specifically, we propose the Dense Residual Self-attention (DenseResAtt) module, which consists of dense residual blocks equipped with a self-attention mechanism, to pay more attention to high-frequency information. Moreover, the series connection of sequential DenseResAtt modules forms a deep attentive network with superior discriminative learning ability in image de-quantization, modeling representative feature maps to recover as much useful information as possible. In addition, since the adversarial learning framework can reliably produce high-quality natural images, the specified content loss as well as the adversarial loss are back-propagated to optimize the training of the model. Above all, DAGAN is able to generate photo-realistic high bit-depth images without banding artifacts. Experimental results on several public benchmarks show that the DAGAN algorithm achieves excellent visual effects and satisfactory quantitative performance.

preprint2020arXiv

Density functional approach to correlated moire states: itinerant magnetism

Two-dimensional moire superlattices have recently emerged as a fertile ground for creating novel electronic phases of matter with unprecedented control. Despite intensive efforts, theoretical investigation of correlated moire systems has been challenged by the large number of atoms in a superlattice unit cell and the inherent difficulty of treating electron correlation. The physics of correlated moire systems is governed by low-energy electrons in a coarse-grained long-wavelength potential, unlike the singular Coulomb potential of atomically-spaced ions in natural solids. Motivated by the separation between moire and atomic length scales, in this work we apply density functional theory to study directly the continuum model of interacting electrons in the periodic moire potential. Using this quantitatively accurate method, we predict itinerant spin-valley ferromagnetism in transition metal dichalcogenide heterobilayers, which originates from the constructive interplay between moire potential and Coulomb interaction in a two-dimensional electron system.

preprint2020arXiv

Design Space Exploration of Power Delivery For Advanced Packaging Technologies

In this paper, a design space exploration of power delivery networks is performed for multi-chip 2.5-D and 3-D IC technologies. The focus of the paper is the effective placement of the voltage regulator modules (VRMs) for power supply noise (PSN) suppression. Multiple on-package VRM configurations have been analyzed and compared. Additionally, 3D IC chip-on-VRM and backside-of-the-package VRM configurations are studied. From the PSN perspective, the 3D IC chip-on-VRM case suppresses the PSN the most even with high current density hotspots. The paper also studies the impact of different parameters such as VRM-chip distance on the package, on-chip decoupling capacitor density, etc. on the PSN.

preprint2020arXiv

Determination of strong-phase parameters in $D\rightarrow K^0_{S,L}π^+π^-$

We report the most precise measurements to date of the strong-phase parameters between $D^0$ and $\bar{D}^0$ decays to $K^0_{S,L}π^+π^-$ using a sample of 2.93 fb$^{-1}$ of $e^+e^-$ annihilation data collected at a center-of-mass energy of 3.773 GeV with the BESIII detector at the BEPCII collider. Our results provide the key inputs for a binned model-independent determination of the Cabibbo-Kobayashi-Maskawa angle $γ/ϕ_3$ with $B$ decays. Using our results, the decay model sensitivity to the $γ/ϕ_3$ measurement is expected to be between 0.7$^{\circ}$ and 1.2$^{\circ}$, approximately a factor of three smaller than that achievable with previous measurements. The improved precision of this work ensures that measurements of $γ/ϕ_3$ will not be limited by knowledge of strong phases for the next decade. Furthermore, our results provide critical input for other flavor-physics investigations, including charm mixing, other measurements of $CP$ violation, and the measurement of strong-phase parameters for other $D$-decay modes.

preprint2020arXiv

Direct Visualization of Irreducible Ferrielectricity in Crystals

In solids, charge polarity can phenomenologically correspond one-to-one to spin polarity, e.g. ferroelectricity/ferromagnetism, antiferroelectricity/antiferromagnetism, and even dipole-vortex/magnetic-vortex, but ferrielectricity/ferrimagnetism tell a disparate story at the microscopic level. Since the definition of a charge dipole involves more than one ion, there may be multiple choices for a dipole unit, which makes most ferrielectric orders equivalent to ferroelectric ones, i.e. this ferrielectricity is not necessarily a real independent branch of polarity. In this work, by using a spherical aberration-corrected scanning transmission electron microscope, we visualize a nontrivial ferrielectric structural evolution in BaFe2Se3, in which the development of two polar sub-lattices is out-of-sync, which we term irreducible ferrielectricity. Such irreducible ferrielectricity leads to a non-monotonic behavior of the temperature-dependent polarization, and even a compensation point in the ordered state. Our finding unambiguously distinguishes ferrielectrics from ferroelectrics in solids.

preprint2020arXiv

Everything About You: A Multimodal Approach towards Friendship Inference in Online Social Networks

Most previous works on privacy in Online Social Networks (OSN) focus on a restricted scenario of using one type of information to infer another type of information or using only static profile data such as username, profile picture or home location. However, the multimedia footprints of users have become extremely diverse nowadays. In reality, an adversary would exploit all types of information obtainable over time to achieve its goal. In this paper, we analyse OSN privacy by jointly exploiting long-term multimodal information. We focus in particular on inference of social relationships. We consider five popular components of posts shared by users, namely images, hashtags, captions, geo-locations and published friendships. Large-scale evaluation on a real-world OSN dataset shows that while our monomodal attacks achieve strong predictions, our multimodal attack leads to a stronger performance with AUC (area under the ROC curve) above 0.9. Our results highlight the need for multimodal obfuscation approaches towards protecting privacy in an era where multimedia footprints of users get increasingly diverse.

preprint2020arXiv

Existence and axial symmetry of minimal action odd solutions for 2-D Schrödinger-Newton equation

We consider the following 2-D Schrödinger-Newton equation \begin{eqnarray*} \begin{cases} -Δu+u=w|u|^{p-1}u \\ -Δw=2 π|u|^p \end{cases}\text{in} \; \mathbb{R}^2 \end{eqnarray*} for $ p \geq 2 $. Using a variational method with the Cerami compactness property, we prove the existence of minimal action odd solutions. Also, by carefully applying the method of moving planes to a similar but more complex equation on the upper half space, we prove these solutions are in fact axially symmetric. Our results can partially be seen as the counterpart of the results in paper \cite{GS} for the 2-D case, or the extension of the results in \cite{CW} to the odd solution case.

preprint2020arXiv

Explaining The XENON1T Excess With Light Goldstini Dark Matter

In the scenario with a multiplicity of sectors which independently break supersymmetry, a multiplicity of goldstini is predicted. We propose a new interpretation of the electron recoil excess at 2-7 keV observed in the XENON1T experiment with very long-lived goldstini DM elastically scattering off the electrons. The goldstini DM can be boosted by the late decay of the other nearly degenerate (long-lived) goldstini DM, with their tiny mass difference being converted into kinetic energy of the lighter goldstini DM and neutrinos. We show that viable parameter space can be found which can explain the excess of electron recoil events around 2-3 keV recently reported by the XENON1T experiment.

preprint2020arXiv

FA-GANs: Facial Attractiveness Enhancement with Generative Adversarial Networks on Frontal Faces

Facial attractiveness enhancement has been an interesting application in Computer Vision and Graphics in recent years. It aims to generate a more attractive face via manipulations on image and geometry structure while preserving face identity. In this paper, we propose the first Generative Adversarial Networks (GANs) for enhancing facial attractiveness in both geometry and appearance aspects, which we call "FA-GANs". FA-GANs contain two branches and enhance facial attractiveness from two perspectives: facial geometry and facial appearance. Each branch consists of individual GANs, with the appearance branch adjusting the facial image and the geometry branch adjusting the facial landmarks in appearance and geometry aspects, respectively. Unlike traditional facial manipulations that learn from paired faces, which are infeasible to collect before and after enhancement of the same individual, we achieve this by learning the features of attractive faces through unsupervised adversarial learning. The proposed FA-GANs are able to extract attractiveness features and impose them on the enhancement results. To better enhance faces, both the geometry and appearance networks are considered to refine the facial attractiveness by adjusting the geometry layout of faces and the appearance of faces independently. To the best of our knowledge, we are the first to enhance facial attractiveness with GANs in both geometry and appearance aspects. The experimental results suggest that our FA-GANs can generate compelling perceptual results in both geometry structure and facial appearance and outperform current state-of-the-art methods.

preprint2020arXiv

Face Hallucination with Finishing Touches

Obtaining a high-quality frontal face image from a low-resolution (LR) non-frontal face image is of primary importance for many facial analysis applications. However, mainstream methods either focus on super-resolving near-frontal LR faces or frontalizing non-frontal high-resolution (HR) faces. It is desirable to perform both tasks seamlessly for daily-life unconstrained face images. In this paper, we present a novel Vivid Face Hallucination Generative Adversarial Network (VividGAN) for simultaneously super-resolving and frontalizing tiny non-frontal face images. VividGAN consists of coarse-level and fine-level Face Hallucination Networks (FHnet) and two discriminators, i.e., Coarse-D and Fine-D. The coarse-level FHnet generates a frontal coarse HR face and then the fine-level FHnet makes use of the facial component appearance prior, i.e., fine-grained facial components, to attain a frontal HR face image with authentic details. In the fine-level FHnet, we also design a facial component-aware module that adopts the facial geometry guidance as clues to accurately align and merge the frontal coarse HR face and prior information. Meanwhile, two-level discriminators are designed to capture both the global outline of a face image and detailed facial characteristics. The Coarse-D enforces the coarsely hallucinated faces to be upright and complete while the Fine-D focuses on the fine hallucinated ones for sharper details. Extensive experiments demonstrate that our VividGAN achieves photo-realistic frontal HR faces, reaching superior performance in downstream tasks, i.e., face recognition and expression classification, compared with other state-of-the-art methods.

preprint2020arXiv

First principles calculation of shift current in chalcopyrite semiconductor ZnSnP$_2$

The bulk photovoltaic effect generates intrinsic photocurrents in materials without inversion symmetry. Shift current is one of the bulk photovoltaic phenomena related to the Berry phase of the constituting electronic bands: photo-excited carriers coherently shift in real space due to the difference in the Berry connection between the valence and conduction bands. Ferroelectric semiconductors and Weyl semimetals are known to exhibit such nonlinear optical phenomena. Here we consider chalcopyrite semiconductor ZnSnP$_2$ which lacks inversion symmetry and calculate the shift current conductivity. We find that the magnitude of the shift current is comparable to the recently measured values on other ferroelectric semiconductors and an order of magnitude larger than bismuth ferrite. The peak response for both optical and shift current conductivity, which mainly comes from P-3$p$ and Sn-5$p$ orbitals, is several eV above the bandgap.

preprint2020arXiv

First-principles study of the low-temperature charge density wave phase in the quasi-one-dimensional Weyl chiral compound (TaSe$_4$)$_2$I

Using {\it ab initio} density functional theory, we study the lattice phase transition of quasi-one-dimensional (TaSe$_4$)$_2$I. In the undistorted state, the strongly anisotropic semimetal band structure presents two non-equivalent Weyl points. In previous efforts, two possible Ta-tetramerization patterns were proposed to be associated with the low-temperature structure. Our phonon calculations indicate that the orthorhombic $F222$ CDW-I phase is the most likely ground state for this quasi-one-dimensional system. In addition, the monoclinic $C2$ CDW-II phase may also be stable according to the phonon dispersion spectrum. Since these two phases have very similar energies in our DFT calculations, both these Ta-tetramerization distortions likely compete or coexist at low temperatures. The semimetal to insulator transition is induced by a Fermi-surface-driven instability that supports the Peierls scenario, which affects the Weyl physics developed above $T_{\rm CDW}$. Furthermore, the spin-orbit coupling generates Rashba-like band splittings in the insulating CDW phases.

preprint2020arXiv

Future Physics Programme of BESIII

There has recently been a dramatic renewal of interest in the subjects of hadron spectroscopy and charm physics. This renaissance has been driven in part by the discovery of a plethora of charmonium-like $XYZ$ states at BESIII and $B$ factories, and the observation of an intriguing proton-antiproton threshold enhancement and the possibly related $X(1835)$ meson state at BESIII, as well as the threshold measurements of charm mesons and charm baryons. We present a detailed survey of the important topics in tau-charm physics and hadron physics that can be further explored at BESIII over the remaining lifetime of BEPCII operation. This survey will help in the optimization of the data-taking plan over the coming years, and provides physics motivation for the possible upgrade of BEPCII to higher luminosity.

preprint2020arXiv

Giant anomalous Hall and Nernst effect in magnetic cubic Heusler compounds

The interplay of magnetism and topology opens up the possibility for exotic linear response effects, such as the anomalous Hall effect and the anomalous Nernst effect, which can be strongly enhanced by designing a strong Berry curvature in the electronic structure. It is even possible to utilize this to create a quantum anomalous Hall state at high temperatures by reducing the dimensionality. Magnetic Heusler compounds are a promising class of materials for this purpose because they grow in thin films, have a high Curie temperature, and their electronic structure hosts strong topological features. Here, we provide a comprehensive study of the intrinsic anomalous transport for magnetic cubic full Heusler compounds and we illustrate that several Heusler compounds outperform the best so far reported materials. The results reveal the importance of symmetries, especially mirror planes, in combination with magnetism for giant anomalous Hall and Nernst effects, which should be valid in general for linear responses (spin Hall effect, spin orbital torque, etc.) dominated by intrinsic contributions.

preprint2020arXiv

Gradient-Leaks: Understanding and Controlling Deanonymization in Federated Learning

Federated Learning (FL) systems are gaining popularity as a solution to training Machine Learning (ML) models from large-scale user data collected on personal devices (e.g., smartphones) without their raw data leaving the device. At the core of FL is a network of anonymous user devices sharing training information (model parameter updates) computed locally on personal data. However, the type and degree to which user-specific information is encoded in the model updates is poorly understood. In this paper, we show that model updates encode subtle variations in how users capture and generate data. The variations provide a strong statistical signal, allowing an adversary to effectively deanonymize participating devices using a limited set of auxiliary data. We analyze the resulting deanonymization attacks on diverse tasks on real-world (anonymized) user-generated data across a range of closed- and open-world scenarios. We study various strategies to mitigate the risks of deanonymization. As random perturbation methods do not offer convincing operating points, we propose data-augmentation strategies that introduce adversarial biases in device data and thereby offer substantial protection against deanonymization threats with little effect on utility.

preprint2020arXiv

How to Retrain Recommender System? A Sequential Meta-Learning Method

Practical recommender systems need to be periodically retrained to refresh the model with new interaction data. To pursue high model fidelity, it is usually desirable to retrain the model on both historical and new data, since it can then account for both long-term and short-term user preference. However, a full model retraining could be very time-consuming and memory-costly, especially when the scale of historical data is large. In this work, we study the model retraining mechanism for recommender systems, a topic of high practical value that has been relatively little explored in the research community. Our first belief is that retraining the model on historical data is unnecessary, since the model has been trained on it before. Nevertheless, normal training on new data only may easily cause overfitting and forgetting issues, since the new data is of a smaller scale and contains less information on long-term user preference. To address this dilemma, we propose a new training method, aiming to abandon the historical data during retraining through learning to transfer the past training experience. Specifically, we design a neural network-based transfer component, which transforms the old model to a new model that is tailored for future recommendations. To learn the transfer component well, we optimize the "future performance" -- i.e., the recommendation accuracy evaluated in the next time period. Our Sequential Meta-Learning (SML) method offers a general training paradigm that is applicable to any differentiable model. We demonstrate SML on matrix factorization and conduct experiments on two real-world datasets. Empirical results show that SML not only achieves significant speed-up, but also outperforms the full model retraining in recommendation accuracy, validating the effectiveness of our proposals. We release our codes at: https://github.com/zyang1580/SML.

preprint2020arXiv

Hybrid deep neural network based prediction method for unsteady flows with moving boundaries

A novel hybrid deep neural network architecture is designed to capture the spatial-temporal features of unsteady flows around moving boundaries directly from high-dimensional unsteady flow field data. The hybrid deep neural network consists of a convolutional neural network (CNN), an improved convolutional Long Short-Term Memory neural network (ConvLSTM), and a deconvolutional neural network (DeCNN). The network predicts the flow field at a future time step from the flow fields at previous time steps and the boundary positions at those steps. Unsteady wake flows around a cylinder in forced oscillation with various amplitudes are calculated to establish the training datasets for the hybrid deep neural networks. The trained networks are then tested by predicting the unsteady flow fields around a cylinder in forced oscillation with a new amplitude. The effect of neural network structure parameters on prediction accuracy is analyzed. The hybrid deep neural network built with the best parameter combination is used to predict the flow fields at future times. The predicted flow fields are in good agreement with those calculated directly by a computational fluid dynamics solver, which means that this kind of deep neural network can capture accurate spatial-temporal information from the spatial-temporal series of unsteady flows around moving boundaries. The results show the potential of this kind of novel hybrid deep neural network for flow control of a vibrating cylinder, where fast calculation of high-dimensional nonlinear unsteady flow around moving boundaries is needed.

preprint2020arXiv

Improved YOLOv3 Object Classification in Intelligent Transportation System

Vehicle and driver detection in Intelligent Transportation Systems (ITS) has been a hot topic in recent years. In particular, driver detection, which is conducive to supervising traffic order and maintaining public safety, is still a challenging problem. In this paper, an algorithm based on YOLOv3 is proposed to detect and classify vehicles, drivers, and people on the highway, so as to distinguish drivers from passengers and form a one-to-one correspondence between vehicles and drivers. The proposed model and the comparison experiments are evaluated on our self-built traffic driver's face database. The effectiveness of the proposed algorithm is validated by extensive experiments and verified under various complex highway conditions. Compared with other advanced vehicle and driver detection technologies, the model performs well and is robust to road blocking, different attitudes, and extreme lighting.

preprint2020arXiv

Integrated Arbitrary Filter with Spiral Gratings: Design and Characterization

We report the design and characterization of a high-performance integrated arbitrary filter from 1450 nm to 1640 nm. The filter's target spectrum is chosen to suppress the night-sky OH emission lines, which is critical for ground-based astronomical telescopes. This type of filter is characterized by its large spectral range, high rejection ratio and narrow notch width. Traditionally it has only been accomplished successfully with fiber Bragg gratings. The technique we demonstrate here is proven to be very efficient for on-chip platforms, which can bring many benefits for device footprint, performance and cost. For the design part, two inverse scattering algorithms are compared, the frequency domain discrete layer-peeling (f-DLP) and the time domain discrete layer-peeling (t-DLP). f-DLP is found to be superior for the grating reconstruction in terms of accuracy and robustness. A method is proposed to resolve the non-uniformity issue caused by the non-zero layer size in the DLP algorithm. The designed 55-notch filter is 50 mm long and implemented on a compact Si3N4/SiO2 spiral waveguide with a total length of 63 mm. Experimentally, we demonstrate that the device has an insertion loss as low as 2.5 dB, and that the waveguide propagation loss is as low as 0.10 dB/cm. We are also able to achieve uniform notch depths and 3-dB widths of about 28 dB and 0.22 nm, respectively.

preprint2020arXiv

Invariant Rationalization

Selective rationalization improves neural network interpretability by identifying a small subset of input features -- the rationale -- that best explains or supports the prediction. A typical rationalization criterion, i.e. maximum mutual information (MMI), finds the rationale that maximizes the prediction performance based only on the rationale. However, MMI can be problematic because it picks up spurious correlations between the input features and the output. Instead, we introduce a game-theoretic invariant rationalization criterion where the rationales are constrained to enable the same predictor to be optimal across different environments. We show both theoretically and empirically that the proposed rationales can rule out spurious correlations, generalize better to different test scenarios, and align better with human judgments. Our data and code are available.

preprint2020arXiv

Learning Differential Diagnosis of Skin Conditions with Co-occurrence Supervision using Graph Convolutional Networks

Skin conditions are reported to be the 4th leading cause of nonfatal disease burden worldwide. However, given the colossal spectrum of skin disorders defined clinically and the shortage of dermatology expertise, diagnosing skin conditions in a timely and accurate manner remains a challenging task. Deep learning systems based on computer vision have proven effective at assisting clinicians with image diagnostics in radiology, ophthalmology and more. In this paper, we propose a deep learning system (DLS) that may predict differential diagnosis of skin conditions using clinical images. Our DLS formulates the differential diagnostics as a multi-label classification task over 80 conditions when only incomplete image labels are available. We tackle the label incompleteness problem by combining a classification network with a Graph Convolutional Network (GCN) that characterizes label co-occurrence and effectively regularizes it towards a sparse representation. Our approach is demonstrated on 136,462 clinical images, and we conclude that the classification accuracy greatly benefits from the co-occurrence supervision. Our DLS achieves 93.6% top-5 accuracy on 12,378 test images and consistently outperforms the baseline classification network.

preprint2020arXiv

Measurement of $J/ψ\toΞ(1530)^{-}\barΞ^{+}$ and evidence for the radiative decay $Ξ(1530)^{-}\toγΞ^-$

The SU(3)-flavor violating decay $J/ψ\toΞ(1530)^{-}\barΞ^{+}+c.c.$ is studied using $(1310.6\pm7.0)\times 10^{6} ~J/ψ$ events collected with the BESIII detector at BEPCII and the branching fraction is measured to be ${\cal{B}}(J/ψ\toΞ(1530)^{-}\barΞ^{+}+c.c.)=(3.17\pm0.02_{\rm stat.}\pm0.08_{\rm syst.})\times10^{-4}$. This is consistent with previous measurements with an improved precision. The angular parameter for this decay is measured for the first time and is found to be $α=-0.21\pm0.04_{\rm stat.}\pm0.06_{\rm syst.}$. In addition, we report evidence for the radiative decay $Ξ(1530)^{-}\toγΞ^- $ with a significance of 3.9$σ$, including the systematic uncertainties. The 90\% confidence level upper limit on the branching fraction is determined to be $\mathcal{B}(Ξ(1530)^{-}\toγΞ^- )\leq3.7$\%.

preprint2020arXiv

Measurement of proton electromagnetic form factors in $e^+e^- \to p\bar{p}$ in the energy region 2.00-3.08 GeV

The process of $e^+e^- \rightarrow p\bar{p}$ is studied at 22 center-of-mass energy points ($\sqrt{s}$) from 2.00 to 3.08 GeV, exploiting 688.5~pb$^{-1}$ of data collected with the BESIII detector operating at the BEPCII collider. The Born cross section~($σ_{p\bar{p}}$) of $e^+e^- \rightarrow p\bar{p}$ is measured with the energy-scan technique and it is found to be consistent with previously published data, but with much improved accuracy. In addition, the electromagnetic form-factor ratio ($|G_{E}/G_{M}|$) and the value of the effective ($|G_{\rm{eff}}|$), electric ($|G_E|$) and magnetic ($|G_M|$) form factors are measured by studying the helicity angle of the proton at 16 center-of-mass energy points. $|G_{E}/G_{M}|$ and $|G_M|$ are determined with high accuracy, providing uncertainties comparable to data in the space-like region, and $|G_E|$ is measured for the first time. We reach unprecedented accuracy, and precision results in the time-like region provide information to improve our understanding of the proton inner structure and to test theoretical models which depend on non-perturbative Quantum Chromodynamics.

preprint2020arXiv

Measurement of the cross section for $e^{+}e^{-}\rightarrowΞ^{-}\barΞ^{+}$ and observation of an excited $Ξ$ baryon

Using a total of 11.0 fb$^{-1}$ of $e^{+}e^{-}$ collision data with center-of-mass energies between 4.009 GeV and 4.6 GeV and collected with the BESIII detector at BEPCII, we measure fifteen exclusive cross sections and effective form factors for the process $e^{+}e^{-}\rightarrowΞ^{-}\barΞ^{+}$ by means of a single baryon-tag method. After performing a fit to the dressed cross section of $e^{+}e^{-}\rightarrowΞ^{-}\barΞ^{+}$, no significant $ψ(4230)$ or $ψ(4260)$ resonance is observed in the $Ξ^{-}\barΞ^{+}$ final states, and upper limits at the 90\% confidence level on $Γ_{ee}\mathcal{B}$ for the processes $ψ(4230)$/$ψ(4260)\rightarrowΞ^{-}\barΞ^{+}$ are determined. In addition, an excited $Ξ$ baryon at 1820 MeV/$c^{2}$ is observed with a statistical significance of 6.2 $\sim$ 6.5$σ$ by including the systematic uncertainty, and the mass and width are measured to be $M = (1825.5 \pm 4.7 \pm 4.7)$~MeV/$c^{2}$ and $Γ= (17.0 \pm 15.0 \pm 7.9)$~MeV, which confirms the existence of the $J^{P}=\frac{3}{2}^{-}$ state $Ξ(1820)$.

preprint2020arXiv

Modality-Agnostic Attention Fusion for visual search with text feedback

Image retrieval with natural language feedback offers the promise of catalog search based on fine-grained visual features that go beyond objects and binary attributes, facilitating real-world applications such as e-commerce. Our Modality-Agnostic Attention Fusion (MAAF) model combines image and text features and outperforms existing approaches on two visual search with modifying phrase datasets, Fashion IQ and CSS, and performs competitively on a dataset with only single-word modifications, Fashion200k. We also introduce two new challenging benchmarks adapted from Birds-to-Words and Spot-the-Diff, which provide new settings with rich language inputs, and we show that our approach without modification outperforms strong baselines. To better understand our model, we conduct detailed ablations on Fashion IQ and provide visualizations of the surprising phenomenon of words avoiding "attending" to the image region they refer to.

preprint2020arXiv

Modality-Pairing Learning for Brain Tumor Segmentation

Automatic brain tumor segmentation from multi-modality Magnetic Resonance Images (MRI) using deep learning methods plays an important role in assisting the diagnosis and treatment of brain tumors. However, previous methods mostly ignore the latent relationship among different modalities. In this work, we propose a novel end-to-end Modality-Pairing learning method for brain tumor segmentation. Parallel branches are designed to exploit different modality features, and a series of layer connections is used to capture complex relationships and abundant information among modalities. We also use a consistency loss to minimize the prediction variance between the two branches. In addition, a learning rate warmup strategy is adopted to address training instability and early over-fitting. Lastly, we use an average ensemble of multiple models and some post-processing techniques to get the final results. Our method is tested on the BraTS 2020 online testing dataset, obtaining promising segmentation performance, with average Dice scores of 0.891, 0.842, 0.816 for the whole tumor, tumor core and enhancing tumor, respectively. We won second place in the BraTS 2020 Challenge for the tumor segmentation task.
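
The Dice scores quoted above measure the overlap between a predicted mask and a reference mask. The following is a minimal sketch of the standard Dice coefficient for binary masks, not the BraTS evaluation code; the function name `dice_score`, the flat 0/1 list representation, and the convention of returning 1.0 for two empty masks are assumptions made for illustration.

```python
def dice_score(pred, target):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two binary masks.

    Both masks are flat sequences of 0/1 values of equal length
    (a simplifying assumption; real volumes would be flattened first).
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of elements")
    # Overlap: positions where both masks are 1.
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    if total == 0:
        # Both masks empty: treated as perfect agreement (assumed convention).
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [1, 1, 0, 0]
    reference = [1, 0, 0, 0]
    print(dice_score(predicted, reference))
```

A score of 1.0 means the two masks coincide, and 0.0 means they share no foreground element.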

preprint2020arXiv

Model-independent determination of the relative strong-phase difference between $D^0$ and $\bar{D}^0\rightarrow K^0_{S,L}π^+π^-$ and its impact on the measurement of the CKM angle $γ/ϕ_3$

Crucial inputs for a variety of $CP$-violation studies can be determined through the analysis of pairs of quantum-entangled neutral $D$ mesons, which are produced in the decay of the $ψ(3770)$ resonance. The relative strong-phase parameters between $D^0$ and $\bar{D}^0$ in the decays $D^0\rightarrow K^0_{S,L}π^+π^-$ are studied using 2.93~${\rm fb}^{-1}$ of $e^+e^-$ annihilation data delivered by the BEPCII collider and collected by the BESIII detector at a center-of-mass energy of 3.773 GeV. Results are presented in regions of the phase space of the decay. These are the most precise measurements to date of the strong-phase parameters in $D \to K_{S,L}^0π^+π^-$ decays. Using these parameters, the associated uncertainty on the Cabibbo-Kobayashi-Maskawa angle $γ/ϕ_3$ is expected to be between $0.7^\circ$ and $1.2^\circ$, for an analysis using the decay $B^{\pm}\rightarrow DK^{\pm}$, $D\rightarrow K^0_Sπ^+π^-$, where $D$ represents a superposition of $D^0$ and $\bar{D^0}$ states. This is a factor of three smaller than that achievable with previous measurements. Furthermore, these results provide valuable input for charm-mixing studies, other measurements of $CP$ violation, and the measurement of strong-phase parameters for other $D$-decay modes.

preprint2020arXiv

Origin of the Magnetic and Orbital ordering in $α$-Sr$_2$CrO$_4$

Motivated by recent experimental progress in transition metal oxides with the K$_2$NiF$_4$ structure, we investigate the magnetic and orbital ordering in $α$-Sr$_2$CrO$_4$. Using first-principles calculations, we first derive a three-orbital Hubbard model, which reproduces the ab initio band structure near the Fermi level. The unique reverse splitting of $t_{2g}$ orbitals in $α$-Sr$_2$CrO$_4$, with the $3d^2$ electronic configuration for the Cr$^{4+}$ oxidation state, opens up the possibility of orbital ordering in this material. Using real-space Hartree-Fock for multi-orbital systems, we construct the ground-state phase diagram for the two-dimensional compound $α$-Sr$_2$CrO$_4$. We find stable ferromagnetic, antiferromagnetic, antiferro-orbital, and staggered orbital stripe ordering in robust regions of the phase diagram. Furthermore, using the density matrix renormalization group method for two-leg ladders with the realistic hopping parameters of $α$-Sr$_2$CrO$_4$, we explore magnetic and orbital ordering for experimentally relevant interaction parameters. Again, we find a clear signature of antiferromagnetic spin ordering along with antiferro-orbital ordering at moderate to large Hubbard interaction strength. We also explore the orbital-resolved density of states with Lanczos, predicting insulating behavior for the compound $α$-Sr$_2$CrO$_4$, in agreement with experiments. Finally, an intuitive understanding of the results is provided based on a hierarchy between orbitals, with $d_{xy}$ driving the spin order, while electronic repulsion and the effective one-dimensionality of the movement within the $d_{xz}$ and $d_{yz}$ orbitals drive the orbital order.

preprint2020arXiv

Overview of the CCKS 2019 Knowledge Graph Evaluation Track: Entity, Relation, Event and QA

A knowledge graph models world knowledge as concepts, entities, and the relationships between them, and has been widely used in many real-world tasks. CCKS 2019 held an evaluation track with 6 tasks and attracted more than 1,600 teams. In this paper, we give an overview of the knowledge graph evaluation track at CCKS 2019. By reviewing the task definition, successful methods, useful resources, good strategies and research challenges associated with each task in CCKS 2019, this paper can provide a helpful reference for developing knowledge graph applications and conducting future knowledge graph research.

preprint2020arXiv

Partial wave analysis of $ψ(3686)\rightarrow K^{+}K^{-}η$

Using a sample of $(448.1\pm2.9)\times10^6$ $ψ(3686)$ events collected with the BESIII detector, we perform the first partial wave analysis of $ψ(3686)\rightarrow K^+K^-η$. In addition to the well established states, $ϕ(1020)$, $ϕ(1680)$, and $K_3^*(1780)$, contributions from $X(1750)$, $ρ(2150)$, $ρ_3(2250)$, and $K^*_2(1980)$ are also observed. The $X(1750)$ state is determined to be a $1^{--}$ resonance. The simultaneous observation of the $ϕ(1680)$ and $X(1750)$ indicates that the $X(1750)$, with previous observations in photoproduction, is distinct from the $ϕ(1680)$. The masses, widths, branching fractions of $ψ(3686)\rightarrow K^+K^-η$ and the intermediate resonances are also measured.

preprint2020arXiv

Phase transition and entropic force of de Sitter black hole in massive gravity

It is well known that de Sitter (dS) black holes generally have a black hole horizon and a cosmological horizon, both of which emit Hawking radiation. However, the radiation temperatures of the two horizons are generally different, so dS black holes do not meet the requirements of thermal equilibrium stability, which complicates the study of their thermodynamic characteristics. In this paper, the dS black hole is regarded as a thermodynamic system, and the effective thermodynamic quantities of the system are obtained. The influence of various state parameters on the effective thermodynamic quantities in the massive gravity space-time is discussed. The condition for the phase transition of the de Sitter black hole in massive gravity space-time is given. We consider the total entropy of the dS black hole to be the sum of the entropies of the two horizons plus an extra term from the correlation between the two horizons. By comparing the entropic force of interaction between the black hole horizon and the cosmological horizon with the Lennard-Jones force between two particles, we find that the entropic force varies in a way that is surprisingly similar in the two systems. The research will help us to explore the real reason for the accelerating expansion of the universe.
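
For reference, the Lennard-Jones force mentioned in the comparison above follows from the standard 12-6 potential between two particles. The symbols $\epsilon$ (well depth), $\sigma$ (separation at which the potential vanishes) and $r$ (particle separation) are the usual textbook ones and are not taken from the abstract; the entropic force of the paper itself is not reproduced here.

$$
V_{\rm LJ}(r) = 4\epsilon\left[\left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6}\right],
\qquad
F_{\rm LJ}(r) = -\frac{dV_{\rm LJ}}{dr} = \frac{24\epsilon}{r}\left[2\left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6}\right].
$$

The force is repulsive at short range, vanishes at $r = 2^{1/6}\sigma$, and is attractive beyond that separation.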

preprint2020arXiv

Phase transition and entropy force between two horizons in (n+2)-dimensional de Sitter space

In this paper, the effect of the space-time dimension on effective thermodynamic quantities in (n+2)-dimensional Reissner-Nordström-de Sitter space is studied. Based on the derived effective thermodynamic quantities, conditions for the phase transition are obtained. The result shows that the accelerating cosmic expansion can be attained by the entropy force arising from the interaction between the horizons of black holes and our universe, which provides a possible way to explain the physical mechanism for the accelerating cosmic expansion.

preprint2020arXiv

PhaseTracer: tracing cosmological phases and calculating transition properties

We present a C++ software package called PhaseTracer for mapping out cosmological phases, and potential transitions between them, for Standard Model extensions with any number of scalar fields. PhaseTracer traces the minima of the effective potential as the temperature changes, and then calculates the critical temperatures, at which the minima are degenerate. PhaseTracer is constructed with modularity, flexibility and practicality in mind. It is fast and stable, and can receive potentials provided by other packages such as FlexibleSUSY. PhaseTracer can be useful for analysing cosmological phase transitions, which played an important role in the very early evolution of the Universe. If they were first order, they could generate detectable gravitational waves and/or trigger electroweak baryogenesis to generate the observed matter-antimatter asymmetry of the Universe. The code can be obtained from https://github.com/PhaseTracer/PhaseTracer.

preprint2020arXiv

PolarNet: An Improved Grid Representation for Online LiDAR Point Clouds Semantic Segmentation

The need for fine-grained perception in autonomous driving systems has resulted in recently increased research on online semantic segmentation of single-scan LiDAR. Despite the emerging datasets and technological advancements, it remains challenging due to three reasons: (1) the need for near-real-time latency with limited hardware; (2) uneven or even long-tailed distribution of LiDAR points across space; and (3) an increasing number of extremely fine-grained semantic classes. In an attempt to jointly tackle all the aforementioned challenges, we propose a new LiDAR-specific, nearest-neighbor-free segmentation algorithm - PolarNet. Instead of using common spherical or bird's-eye-view projection, our polar bird's-eye-view representation balances the points across grid cells in a polar coordinate system, indirectly aligning a segmentation network's attention with the long-tailed distribution of the points along the radial axis. We find that our encoding scheme greatly increases the mIoU in three drastically different segmentation datasets of real urban LiDAR single scans while retaining near real-time throughput.
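
The polar bird's-eye-view idea described above can be illustrated by assigning each point to a (radius, angle) cell instead of an (x, y) cell. This is a minimal sketch under stated assumptions, not the PolarNet implementation: the function name `polar_cell`, the parameters `r_max`, `n_r` and `n_theta`, and the uniform spacing of the radial bins are all hypothetical choices.

```python
import math


def polar_cell(x, y, r_max=50.0, n_r=480, n_theta=360):
    """Map a point (x, y) in the sensor frame to a polar grid cell.

    Returns (radial_index, angular_index), or None when the point lies
    at or beyond r_max. Grid sizes are illustrative defaults only.
    """
    r = math.hypot(x, y)
    if r >= r_max:
        return None
    theta = math.atan2(y, x)  # angle in [-pi, pi]
    radial_index = int(r / r_max * n_r)
    # Shift the angle to [0, 2*pi] before binning; the modulo folds the
    # theta == pi edge case back into the first angular bin.
    angular_index = int((theta + math.pi) / (2.0 * math.pi) * n_theta) % n_theta
    return radial_index, angular_index


if __name__ == "__main__":
    points = [(1.0, 1.0), (-1.0, -1.0), (20.0, 0.0)]
    for px, py in points:
        print((px, py), polar_cell(px, py, r_max=10.0, n_r=10, n_theta=4))
```

Because the area of a polar cell grows with radius, distant cells cover more ground than nearby ones, which is one way sparse far-range points can end up more evenly spread across cells than on a Cartesian grid.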

preprint2020arXiv

PP-YOLO: An Effective and Efficient Implementation of Object Detector

Object detection is one of the most important areas in computer vision and plays a key role in various practical scenarios. Due to hardware limitations, it is often necessary to sacrifice accuracy to ensure the inference speed of the detector in practice. Therefore, the balance between effectiveness and efficiency of an object detector must be considered. The goal of this paper is to implement an object detector with relatively balanced effectiveness and efficiency that can be directly applied in actual application scenarios, rather than propose a novel detection model. Considering that YOLOv3 has been widely used in practice, we develop a new object detector based on YOLOv3. We mainly try to combine various existing tricks that barely increase the number of model parameters and FLOPs, to achieve the goal of improving the accuracy of the detector as much as possible while ensuring that the speed is almost unchanged. Since all experiments in this paper are conducted based on PaddlePaddle, we call it PP-YOLO. By combining multiple tricks, PP-YOLO can achieve a better balance between effectiveness (45.2% mAP) and efficiency (72.9 FPS), surpassing the existing state-of-the-art detectors such as EfficientDet and YOLOv4. Source code is at https://github.com/PaddlePaddle/PaddleDetection.

preprint2020arXiv

Privacy Analysis of Deep Learning in the Wild: Membership Inference Attacks against Transfer Learning

While being deployed in many critical applications as core components, machine learning (ML) models are vulnerable to various security and privacy attacks. One major privacy attack in this domain is membership inference, where an adversary aims to determine whether a target data sample is part of the training set of a target ML model. So far, most of the current membership inference attacks are evaluated against ML models trained from scratch. However, real-world ML models are typically trained following the transfer learning paradigm, where a model owner takes a pretrained model learned from a different dataset, namely teacher model, and trains her own student model by fine-tuning the teacher model with her own data. In this paper, we perform the first systematic evaluation of membership inference attacks against transfer learning models. We adopt the strategy of shadow model training to derive the data for training our membership inference classifier. Extensive experiments on four real-world image datasets show that membership inference can achieve effective performance. For instance, on the CIFAR100 classifier transferred from ResNet20 (pretrained with Caltech101), our membership inference achieves $95\%$ attack AUC. Moreover, we show that membership inference is still effective when the architecture of target model is unknown. Our results shed light on the severity of membership risks stemming from machine learning models in practice.

preprint2020arXiv

PrivSyn: Differentially Private Data Synthesis

In differential privacy (DP), a challenging problem is to generate synthetic datasets that efficiently capture the useful information in the private data. The synthetic dataset enables any task to be done without privacy concern and modification to existing algorithms. In this paper, we present PrivSyn, the first automatic synthetic data generation method that can handle general tabular datasets (with 100 attributes and domain size $>2^{500}$). PrivSyn is composed of a new method to automatically and privately identify correlations in the data, and a novel method to generate sample data from a dense graphic model. We extensively evaluate different methods on multiple datasets to demonstrate the performance of our method.

preprint2020arXiv

Protein structure and sequence re-analysis of 2019-nCoV genome does not indicate snakes as its intermediate host or the unique similarity between its spike protein insertions and HIV-1

As the infection of 2019-nCoV coronavirus is quickly developing into a global pneumonia epidemic, careful analysis of its transmission and cellular mechanisms is sorely needed. In this report, we re-analyzed the computational approaches and findings presented in two recent manuscripts by Ji et al. (https://doi.org/10.1002/jmv.25682) and by Pradhan et al. (https://doi.org/10.1101/2020.01.30.927871), which concluded that snakes are the intermediate hosts of 2019-nCoV and that the 2019-nCoV spike protein insertions shared a unique similarity to HIV-1. Results from our re-implementation of the analyses, built on larger-scale datasets using state-of-the-art bioinformatics methods and databases, do not support the conclusions proposed by these manuscripts. Based on our analyses and existing data of coronaviruses, we concluded that the intermediate hosts of 2019-nCoV are more likely to be mammals and birds than snakes, and that the "novel insertions" observed in the spike protein are naturally evolved from bat coronaviruses.

preprint2020arXiv

Reinterpretation of LHC Results for New Physics: Status and Recommendations after Run 2

We report on the status of efforts to improve the reinterpretation of searches and measurements at the LHC in terms of models for new physics, in the context of the LHC Reinterpretation Forum. We detail current experimental offerings in direct searches for new particles, measurements, technical implementations and Open Data, and provide a set of recommendations for further improving the presentation of LHC results in order to better enable reinterpretation in the future. We also provide a brief description of existing software reinterpretation frameworks and recent global analyses of new physics that make use of the current data.

preprint2020arXiv

Scalable and Communication-efficient Decentralized Federated Edge Learning with Multi-blockchain Framework

The emerging Federated Edge Learning (FEL) technique has drawn considerable attention, as it not only ensures good machine learning performance but also solves "data island" problems caused by data privacy concerns. However, large-scale FEL still faces the following crucial challenges: (i) it lacks a secure and communication-efficient model training scheme; (ii) there is no scalable and flexible FEL framework for updating local models and managing global model sharing (trading). To bridge these gaps, we first propose a blockchain-empowered secure FEL system with a hierarchical blockchain framework consisting of a main chain and subchains. This framework can achieve scalable and flexible decentralized FEL by individually managing local model updates or model sharing records for performance isolation. A Proof-of-Verifying consensus scheme is then designed to remove low-quality model updates and manage qualified model updates in a decentralized and secure manner, thereby achieving secure FEL. To improve the communication efficiency of the blockchain-empowered FEL, a gradient compression scheme is designed to generate sparse but important gradients, which reduces communication overhead without compromising accuracy and further strengthens privacy preservation of the training data. The security analysis and numerical results indicate that the proposed schemes can achieve secure, scalable, and communication-efficient decentralized FEL.
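The abstract above does not specify how its gradient compression scheme selects "sparse but important" gradients. As a rough illustration only, the sketch below uses top-k magnitude selection with residual accumulation, a common sparsification technique that is assumed here and is not confirmed to be the authors' method. The function name `sparsify_top_k` and its parameters are hypothetical.

```python
def sparsify_top_k(grad, residual, k):
    """Keep the k largest-magnitude entries of (grad + residual).

    Assumption: top-k selection with residual accumulation stands in for
    the paper's unspecified compression scheme.
    Returns (sparse, new_residual): `sparse` maps index -> value for the
    entries that would be transmitted, and `new_residual` holds what was
    left behind so it can be added to the next round's gradient.
    """
    # Add back the gradient mass that was not sent in earlier rounds.
    acc = [g + r for g, r in zip(grad, residual)]
    # Indices of the k entries with the largest absolute value.
    top = sorted(range(len(acc)), key=lambda i: abs(acc[i]), reverse=True)[:k]
    sparse = {i: acc[i] for i in top}
    # Entries that were sent are cleared; the rest carry over.
    new_residual = [0.0 if i in sparse else a for i, a in enumerate(acc)]
    return sparse, new_residual


grad = [0.1, -2.0, 0.5, 3.0]
sparse, residual = sparsify_top_k(grad, [0.0] * len(grad), k=2)
print(sparse)    # entries selected for transmission
print(residual)  # entries deferred to the next round
```

Only the `k` selected index-value pairs need to be communicated per round, which is where the bandwidth saving in this kind of scheme comes from.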

preprint2020arXiv

Search for baryon and lepton number violating decays $D^+\to\barΛ(\barΣ^0)e^+$ and $D^+\toΛ(Σ^0)e^+$

Using a 2.93 fb$^{-1}$ data sample of electron-positron collisions taken with the BESIII detector at a center-of-mass energy of 3.773 GeV, which corresponds to $(8296\pm31\pm64)\times10^3 D^+D^-$ pairs, we search for the baryon and lepton number violating decays $D^+\to\barΛ(\barΣ^0)e^+$ and $D^+\toΛ(Σ^0)e^+$. No obvious signals are found with the current statistics and upper limits on the branching fractions of these four decays are set at the level of $10^{-6}$ at 90% confidence level.

preprint2020arXiv

Search for the $D^*\bar{D}^*$ molecular state $Z_c(4000)$ in the reaction $B^{-} \rightarrow J/ψρ^0 K^{-}$

Based on the prediction of a $D^*\bar{D}^*$ molecular state $Z_c(4000)$ with isospin $I=1$ in the coupled channel approach, we suggest searching for this state in the reaction $B^- \to J/ψρ^0 K^-$. By taking into account the final state interactions of $J/ψρ$ and $D^{*0}\bar{D}^{*0}$, and the contribution from the $K_1(1270)$ resonance, we find that the $J/ψρ$ mass distribution shows a peak around 4000~MeV, which could be associated with the $D^*\bar{D}^*$ molecular state $Z_c(4000)$. Searching for the $Z_c(4000)$ in the reaction $B^- \to J/ψρ^0 K^-$ is crucial for understanding the internal structure of exotic hadrons, and our predictions can be tested by Belle II and LHCb in the future.

preprint2020arXiv

Spin-orbitronic materials with record spin-charge conversion from high-throughput ab initio calculations

The spin Hall effect (SHE) is an important spintronics phenomenon, which allows transforming a charge current into a spin current and vice versa without the use of magnetic materials or magnetic fields. To gain new insight into the physics of the SHE and to identify materials with a substantial spin Hall conductivity (SHC), we performed high-precision, high-throughput ab initio electronic structure calculations of the intrinsic SHC for over 20,000 non-magnetic crystals. The calculations reveal a strong and unexpected relation between the magnitude of the SHC and the crystalline symmetry, which we show exists because a large SHC is typically associated with mirror-symmetry-protected nodal lines in the band structure. From the newly developed database, we identify new promising materials. These include eleven materials with an SHC comparable to or even larger than that of the current record holder, Pt, as well as materials with different types of spin currents, which could allow for new types of spin-orbitronics devices.

preprint2020arXiv

Stereoscopic molecular tagging for superconducting accelerator-cavity quench spot detection

Superconducting radio-frequency (SRF) cavities cooled by superfluid helium-4 (He II) are building blocks of many modern particle accelerators due to their high quality factor. However, Joule heating from sub-millimeter surface defects on cavities can lead to cavity quenching, which limits the maximum acceleration gradient of the accelerators. Developing a non-contacting detection technology to accurately locate these surface defects is the key to improving the performance of SRF cavities and hence the accelerators. In a recent proof-of-concept experiment (Phys. Rev. Applied, 11, 044003 (2019)), we demonstrated that a molecular tagging velocimetry (MTV) technique based on the tracking of a He$_2^*$ molecular tracer line created near a surface hot spot in He II can be utilized to locate the hot spot. In order to make this technique practically useful, here we describe our further development of a stereoscopic MTV setup for tracking the tracer line's motion in three-dimensional (3D) space. We simulate a quench spot by applying a transient voltage pulse to a small heater mounted on a substrate plate. Images of the drifted tracer line, taken with two cameras from orthogonal directions, are used to reconstruct the line profile in 3D space. A new algorithm for analyzing the 3D line profile is developed, which incorporates the finite size effect of the heater. We show that the center location of the heater can be reproduced on the substrate surface with an uncertainty of only a few hundred microns, thereby proving the practicability of this method.

preprint2020arXiv

Stochastic Gradient Methods with Layer-wise Adaptive Moments for Training of Deep Networks

We propose NovoGrad, an adaptive stochastic gradient descent method with layer-wise gradient normalization and decoupled weight decay. In our experiments on neural networks for image classification, speech recognition, machine translation, and language modeling, it performs on par with or better than well-tuned SGD with momentum and Adam or AdamW. Additionally, NovoGrad (1) is robust to the choice of learning rate and weight initialization, (2) works well in a large batch setting, and (3) has half the memory footprint of Adam.
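To make "layer-wise gradient normalization and decoupled weight decay" concrete, here is a minimal pure-Python sketch of one update step. The abstract does not give the update rule, so the exact form below (a scalar second moment per layer, first-step initialization, and the default hyperparameters) is an assumption based on a common reading of the method, and the function name `novograd_step` is hypothetical.

```python
import math


def novograd_step(weights, grads, state, lr=0.01, beta1=0.95, beta2=0.98,
                  eps=1e-8, weight_decay=0.0):
    """One NovoGrad-style update (sketch; details are assumptions).

    weights, grads: list of layers, each a list of floats.
    state: dict that persists between calls; pass {} on the first step.
    """
    first = not state
    if first:
        state["m"] = [[0.0] * len(w) for w in weights]  # first moment, per weight
        state["v"] = [0.0] * len(weights)               # second moment, one scalar per layer
    new_weights = []
    for l, (w, g) in enumerate(zip(weights, grads)):
        g_sq = sum(x * x for x in g)  # squared L2 norm of this layer's gradient
        if first:
            state["v"][l] = g_sq
        else:
            state["v"][l] = beta2 * state["v"][l] + (1.0 - beta2) * g_sq
        denom = math.sqrt(state["v"][l]) + eps
        m = state["m"][l]
        for i in range(len(w)):
            # Normalize the gradient by the layer-wise norm estimate, then add
            # weight decay outside the normalization (decoupled).
            m[i] = beta1 * m[i] + (g[i] / denom + weight_decay * w[i])
        new_weights.append([w[i] - lr * m[i] for i in range(len(w))])
    return new_weights


state = {}
weights = novograd_step([[1.0, 2.0]], [[3.0, 4.0]], state, lr=0.1)
print(weights)
```

In this sketch the second moment is stored as one number per layer rather than one per weight, which is consistent with the abstract's claim of a smaller memory footprint than Adam.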

preprint2020arXiv

Study of $e^{+}e^{-} \to D^{+} D^{-} π^{+} π^{-} $ at center-of-mass energies from 4.36 to 4.60 GeV

We report a study of the $e^{+}e^{-} \to D^{+} D^{-} π^{+} π^{-}$ process using $e^{+}e^{-}$ collision data samples with an integrated luminosity of $2.5\,\rm{fb}^{-1}$ at center-of-mass energies from 4.36 to $4.60 \rm{GeV}$, collected with the BESIII detector at the BEPCII storage ring. The $D_{1}(2420)^+$ is observed in the $D^{+} π^{+} π^{-}$ mass spectrum. The mass and width of the $D_{1}(2420)^+$ are measured to be $(2427.2\pm 1.0_{\rm stat.}\pm 1.2_{\rm syst.}) \rm{MeV}/c^2$ and $(23.2\pm 2.3_{\rm stat.} \pm2.3_{\rm syst.}) \rm{MeV}$, respectively. The first errors are statistical and the second ones are systematic. In addition, the Born cross sections of the $e^{+}e^{-} \to D_{1}(2420)^+D^- + c.c. \to D^{+} D^{-} π^{+} π^{-}$ and $e^{+}e^{-} \to ψ(3770) π^{+} π^{-} \to D^{+} D^{-} π^{+} π^{-}$ processes are measured as a function of the center-of-mass energy.

preprint2020arXiv

The Tensor Rank Problem over the Quaternions

We provide a nontrivial bound on the rank of any tensor $T$ over the quaternions $\mathbb{H}$ in the $n_1\times n_2\times n_3$ cases where $2\leq n_i\leq 3$. We describe a decomposition of $T$ into $3$ simple tensors in the $2\times 2\times 2$ case. We also show that the upper bound is the best possible for some of the cases, and we provide various partial results involving tensor decompositions over $\mathbb{C}$ and $\mathbb{H}$.

preprint2020arXiv

Thermodynamic properties of higher-dimensional dS black holes in dRGT massive gravity

On the basis of the state parameter of de Sitter space-time satisfying the first law of thermodynamics, we can derive some effective thermodynamic quantities. When the temperature of the black hole horizon is equal to that of the cosmological horizon, we argue that the effective temperature of the space-time should have the same value. Using this condition, we obtain a differential equation for the entropy of the de Sitter black hole in higher-dimensional de Rham, Gabadadze and Tolley (dRGT) massive gravity. Solving the differential equation, we obtain the corrected entropy and effective thermodynamic quantities of the de Sitter black hole. The results show that for multiparameter black holes, the differential equation satisfied by the entropy does not change with different independent state parameters. Therefore, the entropy of higher-dimensional dS black holes in dRGT massive gravity is only a function of the position of the black hole horizon, and is independent of other state parameters. It is consistent with the corresponding entropy of the black hole horizon and the cosmological horizon. The thermodynamic quantities of self-consistent de Sitter space-time are given theoretically, and the equivalent thermodynamic quantities exhibit a second-order phase transition similar to that of the AdS black hole; unlike the AdS black hole, however, the equivalent temperature of de Sitter space-time has a maximum value. By satisfying the requirement of thermodynamic equilibrium and stability of space-time, the conditions for the existence of dS black holes in the universe are obtained.

preprint2020arXiv

Unified Mandarin TTS Front-end Based on Distilled BERT Model

The front-end module in a typical Mandarin text-to-speech (TTS) system is composed of a long pipeline of text processing components, which requires extensive effort to build and is prone to large accumulated model size and cascading errors. In this paper, a pre-trained language model (PLM) based model is proposed to simultaneously tackle the two most important tasks in the TTS front-end, i.e., prosodic structure prediction (PSP) and grapheme-to-phoneme (G2P) conversion. We use a pre-trained Chinese BERT[1] as the text encoder and employ a multi-task learning technique to adapt it to the two TTS front-end tasks. Then, the BERT encoder is distilled into a smaller model by employing a knowledge distillation technique called TinyBERT[2], making the whole model size 25% of that of benchmark pipeline models while maintaining competitive performance on both tasks. With the proposed methods, we are able to run the whole TTS front-end module in a light and unified manner, which is friendlier to deployment on mobile devices.

preprint2020arXiv

Unscented Kalman filter (UKF) based nonlinear parameter estimation for a turbulent boundary layer: a data assimilation framework

A turbulent boundary layer is an essential flow case of fundamental and applied fluid mechanics. However, accurate measurements of turbulent boundary layer parameters (e.g., friction velocity $u_τ$ and wall shear $τ_w$) are challenging, especially for high speed flows (Smits et al., 2011). Many direct and/or indirect diagnostic techniques have been developed to measure wall shear stress (Vinuesa et al., 2017). However, being based on different principles, these techniques usually give different results with different uncertainties. The current study introduces a nonlinear data assimilation framework based on the Unscented Kalman Filter that can fuse information from i) noisy and gappy measurements from Stereo Particle Image Velocimetry, a Preston tube, and a MEMS shear stress sensor, as well as ii) the uncertainties of the measurements to estimate the parameters of a turbulent boundary layer. A direct numerical simulation of a fully developed turbulent boundary layer flow at Mach 0.3 is used first to validate the data assimilation algorithm. The algorithm is then applied to experimental data of a flow at Mach 0.3, which are obtained in a blowdown wind tunnel facility. The UKF-based data assimilation algorithm is robust to uncertain and gappy experimental data and is able to provide accurate estimates of turbulent boundary layer parameters.

preprint2020arXiv

Zero-field Nernst effect in a ferromagnetic kagome-lattice Weyl-semimetal Co3Sn2S2

The discovery of magnetic topological semimetals has recently attracted significant attention in the fields of topology and thermoelectrics. In a thermoelectric device based on the Nernst geometry, an external magnet is required as an integral part. We report a zero-field Nernst effect in a newly discovered hard-ferromagnetic kagome-lattice Weyl-semimetal Co3Sn2S2. A maximum Nernst thermopower of 3 μV/K at 80 K in zero field is achieved in this magnetic Weyl-semimetal. Our results demonstrate the possibility of applying topological hard magnetic semimetals in low-power thermoelectric devices based on the Nernst effect and are thus valuable for a comprehensive understanding of transport properties in this class of materials.

preprint2019arXiv

Iron telluride ladder compounds: Predicting the structural and magnetic properties of BaFe$_2$Te$_3$

Since the discovery of pressure-induced superconductivity in the two-leg ladder system BaFe$_2X_3$ ($X$=S, Se), with the 3$d$ iron electronic density $n = 6$, the quasi-one-dimensional iron-based ladders have attracted considerable attention. Here, we use Density Functional Theory (DFT) to predict that the novel $n = 6$ iron ladder BaFe$_2$Te$_3$ could be stable with a crystal structure similar to that of BaFe$_2$Se$_3$. Our results also indicate that BaFe$_2$Te$_3$ will display the complex 2$\times$2 Block-type magnetic order. Due to the magnetic striction effects of this Block order, BaFe$_2$Te$_3$ should be a magnetic noncollinear ferrielectric system with a net polarization of $0.31$ $μ$C/cm$^2$. Compared with the S- or Se-based iron ladders, the electrons of the Te-based ladders are more localized, implying that the degree of electronic correlation is enhanced for the Te case, which may induce additional interesting properties. The physical and structural similarity with BaFe$_2$Se$_3$ also suggests that BaFe$_2$Te$_3$ could become superconducting under high pressure.

preprint2019arXiv

Phase transitions and entropy force of charged de Sitter black holes with cloud of string and quintessence

In this paper, we investigate the combined effects of the cloud of strings and quintessence on the thermodynamics of a Reissner-Nordström-de Sitter black hole. Based on the equivalent thermodynamic quantities that account for the correlation between the black hole horizon and the cosmological horizon, we extensively discuss the phase transitions of the space-time. Our analysis shows that, similar to the case in AdS space-time, second-order phase transitions can take place under certain conditions, while first-order phase transitions are absent in charged de Sitter black holes with a cloud of strings and quintessence. The effects of different thermodynamic quantities on the phase transitions are also quantitatively discussed, which provides a new approach to studying the thermodynamic properties of unstable dS space-time. Focusing on the entropy force generated by the interaction between the black hole horizon and the cosmological horizon, as well as the Lennard-Jones force between two particles, our results demonstrate a strong degeneracy between the entropy force of the two horizons and the ratio of the horizon positions, which follows a law surprisingly similar to the relation between the Lennard-Jones force and the ratio of two particle positions. Therefore, the study of the entropy force between two horizons is not only beneficial to the deep exploration of the three modes of cosmic evolution, but also helpful for understanding the correlation between the microstates of particles in black holes and those in ordinary thermodynamic systems.

preprint2019arXiv

Visually Constructing the Chemical Structure of a Single Molecule by Scanning Raman Picoscopy

The strong spatial confinement of a nanocavity plasmonic field has made it possible to visualize the inner structure of a single molecule and even to distinguish its vibrational modes in real space. With such ever-improved spatial resolution, it is anticipated that full vibrational imaging of a molecule could be achieved to reveal molecular structural details. Here we demonstrate full Raman images of individual vibrational modes on the Ångström level for a single Mg-porphine molecule, revealing distinct characteristics of each vibrational mode in real space. Furthermore, by exploiting the underlying interference effect and Raman fingerprint database, we propose a new methodology for structural determination, coined as scanning Raman picoscopy, to show how such ultrahigh-resolution spectromicroscopic vibrational images can be used to visually assemble the chemical structure of a single molecule through a simple Lego-like building process.