Source author record

Bo Ren

Bo Ren appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher: unclaimed source record

Computer Vision, nlin.SI, hep-ex, hep-ph, Computation and Language, astro-ph.CO, nucl-ex, nucl-th, Artificial Intelligence, Networking and Internet Architecture

Catalog footprint

What is connected

33 works

10 topics

4 close collaborators

preprint, 2025, arXiv

Analyzing Communication Predictability in LLM Training

Effective communication is essential in distributed training, with predictability being one of its most significant characteristics. However, existing studies primarily focus on exploiting predictability through online profiling for runtime optimization, without a systematic understanding of it. In this work, we aim to systematically formulate communication predictability in distributed training, particularly in Large Language Models (LLMs) that utilize hybrid parallelism. Our analysis focuses on both traffic patterns and communication overhead. Specifically, we investigate predictable traffic patterns in typical LLMs and evaluate how various factors influence GPU utilization and effective bandwidth (two critical variables affecting communication overhead). Furthermore, we develop an analytical formulation to estimate communication overhead in LLM training, which is validated with high accuracy against empirical data. Leveraging this formulation, we propose a configuration tuning tool, ConfigTuner, to optimize training performance. Compared to Megatron-LM, the training configurations optimized by ConfigTuner demonstrate up to a 1.36$\times$ increase in throughput. Compared to Alpa, ConfigTuner generates the same configuration suggestion while significantly reducing the search complexity.
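
As a rough illustration of what an analytical communication-overhead estimate can look like, the sketch below uses the textbook latency-bandwidth cost of a ring all-reduce. It is not the formulation from the paper: the function name, its parameters, and the choice of a ring algorithm are all assumptions made for this example.

```python
def ring_allreduce_seconds(message_bytes: float,
                           num_gpus: int,
                           effective_bandwidth: float,
                           latency_s: float = 0.0) -> float:
    """Textbook cost of a ring all-reduce (illustrative, not the paper's model).

    message_bytes:       size of the tensor being reduced, in bytes
    num_gpus:            number of ranks in the ring
    effective_bandwidth: achieved per-link bandwidth, in bytes per second
    latency_s:           fixed per-step latency, in seconds
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    if effective_bandwidth <= 0:
        raise ValueError("effective_bandwidth must be positive")
    if num_gpus == 1:
        return 0.0  # nothing to communicate with a single rank

    # Reduce-scatter takes (N - 1) steps and all-gather takes another (N - 1);
    # every step moves one chunk of size message_bytes / N over each link.
    steps = 2 * (num_gpus - 1)
    chunk_bytes = message_bytes / num_gpus
    return steps * (latency_s + chunk_bytes / effective_bandwidth)


if __name__ == "__main__":
    # Hypothetical numbers: 8 GB of gradients, 8 GPUs, 100 GB/s effective bandwidth.
    estimate = ring_allreduce_seconds(8e9, 8, 100e9)
    print(f"estimated all-reduce time: {estimate:.3f} s")
```

Because the traffic volume is fixed by the parallelism configuration, an estimate of this kind can be computed before training starts, which is the sense in which such communication is predictable.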

preprint, 2022, arXiv

Contrastive Graph Multimodal Model for Text Classification in Videos

The extraction of text information in videos serves as a critical step towards semantic understanding of videos. It usually involves two steps: (1) text recognition and (2) text classification. To localize texts in videos, we can resort to large numbers of text recognition methods based on OCR technology. However, to our knowledge, there is no existing work focused on the second step of video text classification, which limits the guidance available to downstream tasks such as video indexing and browsing. In this paper, we are the first to address this new task of video text classification by fusing multimodal information to deal with the challenging scenario where different types of video texts may be confused with various colors, unknown fonts and complex layouts. In addition, we tailor a specific module called CorrelationNet to reinforce feature representation by explicitly extracting layout information. Furthermore, contrastive learning is utilized to explore inherent connections between samples using plentiful unlabeled videos. Finally, we construct a new well-defined industrial dataset from the news domain, called TI-News, which is dedicated to building and evaluating video text recognition and classification applications. Extensive experiments on TI-News demonstrate the effectiveness of our method.

preprint, 2022, arXiv

EDN: Salient Object Detection via Extremely-Downsampled Network

Recent progress on salient object detection (SOD) mainly benefits from multi-scale learning, where the high-level and low-level features collaborate in locating salient objects and discovering fine details, respectively. However, most efforts are devoted to low-level feature learning by fusing multi-scale features or enhancing boundary representations. High-level features, although long proven effective for many other tasks, have barely been studied for SOD. In this paper, we tap into this gap and show that enhancing high-level features is essential for SOD as well. To this end, we introduce an Extremely-Downsampled Network (EDN), which employs an extreme downsampling technique to effectively learn a global view of the whole image, leading to accurate salient object localization. To accomplish better multi-level feature fusion, we construct the Scale-Correlated Pyramid Convolution (SCPC) to build an elegant decoder for recovering object details from the above extreme downsampling. Extensive experiments demonstrate that EDN achieves state-of-the-art performance with real-time speed. Our efficient EDN-Lite also achieves competitive performance with a speed of 316fps. Hence, this work is expected to spark some new thinking in SOD. Code is available at https://github.com/yuhuan-wu/EDN.

preprint, 2022, arXiv

GMN: Generative Multi-modal Network for Practical Document Information Extraction

Document Information Extraction (DIE) has attracted increasing attention due to its various advanced applications in the real world. Although recent literature has already achieved competitive results, these approaches usually fail when dealing with complex documents with noisy OCR results or mutative layouts. This paper proposes Generative Multi-modal Network (GMN) for real-world scenarios to address these problems, which is a robust multi-modal generation method without predefined label categories. With the carefully designed spatial encoder and modal-aware mask module, GMN can deal with complex documents that are hard to serialize into sequential order. Moreover, GMN tolerates errors in OCR results and requires no character-level annotation, which is vital because fine-grained annotation of numerous documents is laborious and even requires annotators with specialized domain knowledge. Extensive experiments show that GMN achieves new state-of-the-art performance on several public DIE datasets and surpasses other methods by a large margin, especially in realistic scenes.

preprint, 2022, arXiv

Interactive Style Transfer: All is Your Palette

Neural style transfer (NST) can create impressive artworks by transferring a reference style to a content image. Current image-to-image NST methods lack fine-grained controls, which are often demanded by artistic editing. To mitigate this limitation, we propose a drawing-like interactive style transfer (IST) method, by which users can interactively create a harmonious-style image. Our IST method can serve as a brush, dip style from anywhere, and then paint to any region of the target content image. To determine the action scope, we formulate a fluid simulation algorithm, which takes styles as pigments around the position of brush interaction and diffuses them in the style or content images according to the similarity maps. Our IST method expands the creative dimension of NST. By dipping and painting, even employing one style image can produce thousands of eye-catching works. The demo video is available in supplementary files or at http://mmcheng.net/ist.

preprint, 2022, arXiv

Knowledge Mining with Scene Text for Fine-Grained Recognition

Recently, the semantics of scene text has been proven to be essential in fine-grained image classification. However, the existing methods mainly exploit the literal meaning of scene text for fine-grained recognition, which might be irrelevant when it is not significantly related to objects/scenes. We propose an end-to-end trainable network that mines implicit contextual knowledge behind scene text images and enhances the semantics and correlation to fine-tune the image representation. Unlike the existing methods, our model integrates three modalities: visual feature extraction, text semantics extraction, and correlating background knowledge to fine-grained image classification. Specifically, we employ KnowBert to retrieve relevant knowledge for semantic representation and combine it with image features for fine-grained classification. Experiments on two benchmark datasets, Con-Text and Drink Bottle, show that our method outperforms the state-of-the-art by 3.72% mAP and 5.39% mAP, respectively. To further validate the effectiveness of the proposed method, we create a new dataset on crowd activity recognition for the evaluation. The source code and new dataset of this work are available at https://github.com/lanfeng4659/KnowledgeMiningWithSceneText.

preprint, 2022, arXiv

Neural Collaborative Graph Machines for Table Structure Recognition

Recently, table structure recognition has achieved impressive progress with the help of deep graph models. Most of them exploit single visual cues of tabular elements or simply combine visual cues with other modalities via early fusion to reason their graph relationships. However, neither early fusion nor individual reasoning over multiple modalities is appropriate for all varieties of table structures, given their great diversity. Instead, different modalities are expected to collaborate with each other in different patterns for different table cases. In the community, the importance of intra-inter modality interactions for table structure reasoning is still unexplored. In this paper, we define it as the heterogeneous table structure recognition (Hetero-TSR) problem. With the aim of filling this gap, we present novel Neural Collaborative Graph Machines (NCGM) equipped with stacked collaborative blocks, which alternately extract intra-modality context and model inter-modality interactions in a hierarchical way. NCGM can represent the intra-inter modality relationships of tabular elements more robustly, which significantly improves the recognition performance. We also show that the proposed NCGM can modulate the collaborative pattern of different modalities conditioned on the context of intra-modality cues, which is vital for diversified table cases. Experimental results on benchmarks demonstrate that our proposed NCGM achieves state-of-the-art performance and beats other contemporary methods by a large margin, especially under challenging scenarios.

preprint, 2022, arXiv

NomMer: Nominate Synergistic Context in Vision Transformer for Visual Recognition

Recently, Vision Transformers (ViT), with self-attention (SA) as the de facto ingredient, have demonstrated great potential in the computer vision community. To trade off efficiency against performance, a group of works perform the SA operation only within local patches, abandoning the global contextual information that is indispensable for visual recognition tasks. To solve the issue, the subsequent global-local ViTs attempt to marry local SA with global SA in a parallel or alternating way in the model. Nevertheless, the exhaustively combined local and global context may contain redundancy for various visual data, and the receptive field within each layer is fixed. A more graceful alternative is for global and local context to contribute adaptively to accommodate different visual data. To achieve this goal, we propose a novel ViT architecture, termed NomMer, which can dynamically Nominate the synergistic global-local context in vision transforMer. By investigating the working pattern of our proposed NomMer, we further explore what context information is focused on. Benefiting from this "dynamic nomination" mechanism, without bells and whistles, NomMer can not only achieve 84.5% Top-1 classification accuracy on ImageNet with only 73M parameters, but also show promising performance on dense prediction tasks, i.e., object detection and semantic segmentation. The code and models will be made publicly available at https://github.com/TencentYoutuResearch/VisualRecognition-NomMer

preprint, 2022, arXiv

OS-MSL: One Stage Multimodal Sequential Link Framework for Scene Segmentation and Classification

Scene segmentation and classification (SSC) serve as a critical step towards video structuring analysis. Intuitively, joint learning of these two tasks lets them promote each other by sharing common information. However, scene segmentation is concerned more with the local difference between adjacent shots, while classification needs the global representation of scene segments, which can lead to the model being dominated by one of the two tasks in the training phase. In this paper, to overcome the above challenges from an alternate perspective, we unite these two tasks into one task by a new form of predicting shot links: a link connects two adjacent shots, indicating that they belong to the same scene or category. To this end, we propose a general One Stage Multimodal Sequential Link Framework (OS-MSL) to both distinguish and leverage the two-fold semantics by reforming the two learning tasks into a unified one. Furthermore, we tailor a specific module called DiffCorrNet to explicitly extract the information of differences and correlations among shots. Extensive experiments are conducted on a brand-new large-scale dataset collected from real-world applications and on MovieScenes. Both sets of results demonstrate the effectiveness of our proposed method against strong baselines.

preprint, 2022, arXiv

RAAT: Relation-Augmented Attention Transformer for Relation Modeling in Document-Level Event Extraction

In the document-level event extraction (DEE) task, event arguments always scatter across sentences (across-sentence issue) and multiple events may lie in one document (multi-event issue). In this paper, we argue that the relation information of event arguments is of great significance for addressing the above two issues, and propose a new DEE framework which can model the relation dependencies, called Relation-augmented Document-level Event Extraction (ReDEE). More specifically, this framework features a novel and tailored transformer, named Relation-augmented Attention Transformer (RAAT). RAAT is scalable to capture multi-scale and multi-amount argument relations. To further leverage relation information, we introduce a separate event relation prediction task and adopt a multi-task learning method to explicitly enhance event extraction performance. Extensive experiments demonstrate the effectiveness of the proposed method, which can achieve state-of-the-art performance on two public datasets. Our code is available at https://github.com/TencentYoutuResearch/RAAT.

preprint, 2022, arXiv

Relational Representation Learning in Visually-Rich Documents

Relational understanding is critical for a number of visually-rich document (VRD) understanding tasks. Through multi-modal pre-training, recent studies provide comprehensive contextual representations and exploit them as prior knowledge for downstream tasks. In spite of their impressive results, we observe that the widespread relational hints (e.g., relation of key/value fields on receipts) built upon contextual knowledge have not been excavated yet. To mitigate this gap, we propose DocReL, a Document Relational Representation Learning framework. The major challenge of DocReL is rooted in the variety of relations. From the simplest pairwise relation to the complex global structure, it is infeasible to conduct supervised training because the definition of relation varies and even conflicts across different tasks. To deal with the unpredictable definition of relations, we propose a novel contrastive learning task named Relational Consistency Modeling (RCM), which harnesses the fact that existing relations should be consistent in differently augmented positive views. RCM provides relational representations which are more compatible with the urgent needs of downstream tasks, even without any knowledge about the exact definition of relation. DocReL achieves better performance on a wide variety of VRD relational understanding tasks, including table structure recognition, key information extraction and reading order detection.

preprint, 2022, arXiv

Scene Consistency Representation Learning for Video Scene Segmentation

A long-term video, such as a movie or TV show, is composed of various scenes, each of which represents a series of shots sharing the same semantic story. Spotting the correct scene boundary from the long-term video is a challenging task, since a model must understand the storyline of the video to figure out where a scene starts and ends. To this end, we propose an effective Self-Supervised Learning (SSL) framework to learn better shot representations from unlabeled long-term videos. More specifically, we present an SSL scheme to achieve scene consistency, while exploring considerable data augmentation and shuffling methods to boost the model generalizability. Instead of explicitly learning the scene boundary features as in the previous methods, we introduce a vanilla temporal model with less inductive bias to verify the quality of the shot features. Our method achieves the state-of-the-art performance on the task of Video Scene Segmentation. Additionally, we suggest a more fair and reasonable benchmark to evaluate the performance of Video Scene Segmentation methods. The code is made available.

preprint, 2022, arXiv

See Finer, See More: Implicit Modality Alignment for Text-based Person Retrieval

Text-based person retrieval aims to find the query person based on a textual description. The key is to learn a common latent space mapping between visual-textual modalities. To achieve this goal, existing works employ segmentation to obtain explicit cross-modal alignments or utilize attention to explore salient alignments. These methods have two shortcomings: 1) Labeling cross-modal alignments is time-consuming. 2) Attention methods can explore salient cross-modal alignments but may ignore some subtle and valuable pairs. To relieve these issues, we introduce an Implicit Visual-Textual (IVT) framework for text-based person retrieval. Different from previous models, IVT utilizes a single network to learn representation for both modalities, which contributes to the visual-textual interaction. To explore the fine-grained alignment, we further propose two implicit semantic alignment paradigms: multi-level alignment (MLA) and bidirectional mask modeling (BMM). The MLA module explores finer matching at sentence, phrase, and word levels, while the BMM module aims to mine more semantic alignments between visual and textual modalities. Extensive experiments are carried out to evaluate the proposed IVT on public datasets, i.e., CUHK-PEDES, RSTPReID, and ICFG-PEDES. Even without explicit body part alignment, our approach still achieves state-of-the-art performance. Code is available at: https://github.com/TencentYoutuResearch/PersonRetrieval-IVT.

preprint, 2022, arXiv

Sequence-to-Action: Grammatical Error Correction with Action Guided Sequence Generation

The task of Grammatical Error Correction (GEC) has received remarkable attention with wide applications in Natural Language Processing (NLP) in recent years. While one of the key principles of GEC is to keep the correct parts unchanged and avoid over-correction, previous sequence-to-sequence (seq2seq) models generate results from scratch, which are not guaranteed to follow the original sentence structure and may suffer from the over-correction problem. In the meantime, the recently proposed sequence tagging models can overcome the over-correction problem by only generating edit operations, but are conditioned on human designed language-specific tagging labels. In this paper, we combine the pros and alleviate the cons of both models by proposing a novel Sequence-to-Action (S2A) module. The S2A module jointly takes the source and target sentences as input, and is able to automatically generate a token-level action sequence before predicting each token, where each action is generated from three choices named SKIP, COPY and GENerate. Then the actions are fused with the basic seq2seq framework to provide final predictions. We conduct experiments on the benchmark datasets of both English and Chinese GEC tasks. Our model consistently outperforms the seq2seq baselines, while being able to significantly alleviate the over-correction problem as well as holding better generality and diversity in the generation results compared to the sequence tagging models.

preprint, 2022, arXiv

TaCo: Textual Attribute Recognition via Contrastive Learning

As textual attributes like font are core design elements of document format and page style, automatic attribute recognition favors comprehensive practical applications. Existing approaches already yield satisfactory performance in differentiating disparate attributes, but they still struggle to distinguish similar attributes with only subtle differences. Moreover, their performance drops severely in real-world scenarios where unexpected and obvious imaging distortions appear. In this paper, we aim to tackle these problems by proposing TaCo, a contrastive framework for textual attribute recognition tailored toward the most common document scenes. Specifically, TaCo leverages contrastive learning to dispel the ambiguity trap arising from vague and open-ended attributes. To realize this goal, we design the learning paradigm from three perspectives: 1) generating attribute views, 2) extracting subtle but crucial details, and 3) exploiting valued view pairs for learning, to fully unlock the pre-training potential. Extensive experiments show that TaCo surpasses the supervised counterparts and advances the state-of-the-art remarkably on multiple attribute recognition tasks. Online services of TaCo will be made available.

preprint, 2022, arXiv

The Devil is in the Frequency: Geminated Gestalt Autoencoder for Self-Supervised Visual Pre-Training

The self-supervised Masked Image Modeling (MIM) schema, following the "mask-and-reconstruct" pipeline of recovering contents from a masked image, has recently captured increasing interest in the multimedia community, owing to its excellent ability to learn visual representations from unlabeled data. Aiming at learning representations with highly abstracted semantics, one group of works attempts to reconstruct non-semantic pixels with a large-ratio masking strategy, which may suffer from the "over-smoothing" problem, while others directly infuse semantics into targets in an off-line way that requires extra data. Different from them, we shift the perspective to the Fourier domain, which naturally has a global perspective, and present a new Masked Image Modeling (MIM) method, termed Geminated Gestalt Autoencoder (Ge$^2$-AE), for visual pre-training. Specifically, we equip our model with geminated decoders in charge of reconstructing image contents from both pixel and frequency space, where each serves the other as both a complement and a reciprocal constraint. In this way, more robust representations can be learned in the pre-trained encoders, the effectiveness of which is confirmed by the experimental results on downstream recognition tasks. We also conduct several quantitative and qualitative experiments to investigate the learning behavior of our method. To the best of our knowledge, this is the first MIM work to solve visual pre-training through the lens of the frequency domain.

preprint, 2022, arXiv

VLMAE: Vision-Language Masked Autoencoder

Image and language modeling is of crucial importance for vision-language pre-training (VLP), which aims to learn multi-modal representations from large-scale paired image-text data. However, we observe that most existing VLP methods focus on modeling the interactions between image and text features while neglecting the information disparity between image and text, thus suffering from focal bias. To address this problem, we propose a vision-language masked autoencoder framework (VLMAE). VLMAE employs visual generative learning, facilitating the model to acquire fine-grained and unbiased features. Unlike the previous works, VLMAE pays attention to almost all critical patches in an image, providing more comprehensive understanding. Extensive experiments demonstrate that VLMAE achieves better performance in various vision-language downstream tasks, including visual question answering, image-text retrieval and visual grounding, even with up to 20% pre-training speedup.

preprint, 2020, arXiv

PuzzleNet: Scene Text Detection by Segment Context Graph Learning

Recently, a series of decomposition-based scene text detection methods has achieved impressive progress by decomposing challenging text regions into pieces and linking them in a bottom-up manner. However, most of them merely focus on linking independent text pieces while the context information is underestimated. In a puzzle game, the solver often puts pieces together in a logical way according to the contextual information of each piece, in order to arrive at the correct solution. Inspired by this, we propose a novel decomposition-based method, termed Puzzle Networks (PuzzleNet), to address the challenging scene text detection task in this work. PuzzleNet consists of the Segment Proposal Network (SPN) that predicts the candidate text segments fitting arbitrary shapes of text regions, and the two-branch Multiple-Similarity Graph Convolutional Network (MSGCN) that models both appearance and geometry correlations between each segment and its contextual ones. By building segments as context graphs, MSGCN effectively employs segment context to predict combinations of segments. Final detections of polygon shape are produced by merging segments according to the predicted combinations. Evaluations on three benchmark datasets, ICDAR15, MSRA-TD500 and SCUT-CTW1500, have demonstrated that our method can achieve performance better than or comparable to the current state of the art, which benefits from the exploitation of the segment context graph.

preprint, 2016, arXiv

Interaction solutions for supersymmetric mKdV-B equation

The ${\cal N} =1$ supersymmetric mKdV-B system is transformed to a system of coupled bosonic equations by using the bosonization approach. The bosonized supersymmetric mKdV-B (BSmKdV-B) equation can be solved by the usual mKdV equation together with a linear differential equation without fermionic variables. The bosonization approach can thus effectively avoid difficulties caused by anticommutative fermionic fields of the supersymmetric systems. The consistent tanh expansion (CTE) method is applied to the BSmKdV-B equation. An auto-Bäcklund transformation (BT) theorem is obtained by using the CTE method. The interaction solutions among solitons and other complicated waves, including Painlevé waves and periodic cnoidal waves, are given through the auto-BT theorem. For the soliton-cnoidal interaction solution, two concrete cases are investigated in both analytical and graphical ways by combining the mapping and deformation method.

preprint, 2015, arXiv

Interaction solutions for mKP equation with nonlocal symmetry reductions and CTE method

The nonlocal symmetries for the modified Kadomtsev-Petviashvili (mKP) equation are obtained with the truncated Painlevé method. The nonlocal symmetries can be localized to the Lie point symmetries by introducing auxiliary dependent variables. The finite symmetry transformations and similarity reductions related to the nonlocal symmetries are computed. The multi-solitary wave solution and interaction solutions among a soliton and the cnoidal waves of the mKP equation are presented. Meanwhile, the consistent tanh expansion (CTE) method is applied to the mKP equation. The explicit interaction solutions among a soliton and other types of nonlinear waves, such as the cnoidal periodic waves and multiple resonant soliton solutions, are given.

preprint, 2014, arXiv

Backlund transformations for Burgers Equation via localization of residual symmetries

In this paper, we obtained the non-local residual symmetry related to truncated Painlevé expansion of Burgers equation. In order to localize the residual symmetry, we introduced new variables to prolong the original Burgers equation into a new system. By using Lie's first theorem, we got the finite transformation for the localized residual symmetry. More importantly, we also localized the linear superposition of multiple residual symmetries to find the corresponding finite transformations. It is interesting to find that the nth Backlund transformation for Burgers equation can be expressed by determinants in a compact way.
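
For orientation, the truncated Painlevé expansion mentioned above can be written out in one common normalization of the Burgers equation. The normalization, the viscosity symbol $\nu$, and the names $\phi$ and $u_1$ are assumptions made for this sketch; the paper may use a different but equivalent form.

```latex
% Burgers equation in an assumed normalization with viscosity \nu
u_t + u\,u_x = \nu\,u_{xx}

% Truncated Painlev\'e expansion about the singularity manifold \phi = 0
u = -2\nu\,\frac{\phi_x}{\phi} + u_1

% Substituting shows that u solves the Burgers equation whenever
u_{1,t} + u_1\,u_{1,x} = \nu\,u_{1,xx},
\qquad
\phi_t + u_1\,\phi_x = \nu\,\phi_{xx}

% Special case u_1 = 0: the Cole-Hopf transformation to the heat equation
u = -2\nu\,(\ln\phi)_x, \qquad \phi_t = \nu\,\phi_{xx}
```

Read this way, the expansion is itself a Bäcklund transformation: it maps one solution $u_1$ of the Burgers equation to another solution $u$.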

preprint, 2014, arXiv

Bosonization, Painleve property, exact solutions for N=1 supersymmetric mKdV equation

The N=1 supersymmetric modified Korteweg-de Vries (SmKdV) system is transformed to a system of coupled bosonic equations with the bosonization approach. The bosonized SmKdV (BSmKdV) passes the Painlevé test and allows a set of Bäcklund transformations (BT) by truncating the series expansions of the solutions about the singularity manifold. The traveling wave solutions of the BSmKdV system are obtained using the mapping and deformation method. Some special types of exact solutions for the BSmKdV system are found with the solutions and symmetries of the usual mKdV equation. Meanwhile, the similarity reduction solutions of the system are investigated by using the Lie point symmetry theory. The generalized tanh function expansion method for the BSmKdV system leads to a nonauto-BT theorem. Using the nonauto-BT theorem, novel exact explicit solutions of the BSmKdV system can be obtained. All these solutions obtained via the bosonization procedure are different from those obtained via other methods.

preprint, 2014, arXiv

Dark parameterization approach to Ito equation

The novel coupling Ito systems are obtained with the dark parameterization approach. By solving the coupling equations, the traveling wave solutions are constructed with the mapping and deformation method. Some novel types of exact solutions are constructed with the solutions and symmetries of the usual Ito equation. In the meanwhile, the similarity reduction solutions of the model are also studied with the Lie point symmetry theory.

preprint, 2014, arXiv

New interaction solutions from Lax pair related symmetry of the Generalized fifth order KdV equation

The nonlocal symmetry of the generalized fifth order KdV equation (FOKdV) is first obtained by using the related Lax pair and then localized in a new enlarged system by introducing some new variables. On this basis, a new Bäcklund transformation is obtained through Lie's first theorem. Furthermore, the general form of the Lie point symmetry for the enlarged FOKdV system is found, and new interaction solutions for the FOKdV equation are explored by using the classical symmetry reduction method.

preprint, 2013, arXiv

Lepton number violation and $h\to γγ$ in a radiative inverse seesaw dark matter model

We study phenomenological implications of a radiative inverse seesaw dark matter model. In this model, because neutrino masses are generated at two-loop level with inverse seesaw, the new physics mass scale can be as low as a few hundred GeV, and the model also naturally contains a dark matter candidate. The Yukawa couplings linking the SM leptons and new particles can be large. This can lead to large lepton flavor violating effects. We find that future experimental data on $μ\to e γ$ and $μ- e$ conversion can further test the model. The new charged particles can significantly affect the $h \to γγ$ branching ratio in the SM. The model is able to explain the deviation between the SM prediction and the LHC data. We also study some LHC signatures of the new particles in the model.

preprint, 2013, arXiv

New interaction solutions of Kadomtsev-Petviashvili equation

The residual symmetry coming from the truncated Painlevé expansion of the KP equation is nonlocal; it is localized in this paper by introducing multiple new dependent variables. By using the standard Lie group approach, the symmetry reduction solutions for the KP equation are obtained based on the general form of the Lie point symmetry for the prolonged system. In this way, the interaction solutions between solitons and background waves are obtained, which are hard to study by other traditional methods.

preprint2013arXiv

New symmetry reductions related with the residual symmetry of Boussinesq equation

The symmetry related to the Bäcklund transformation is nonlocal, which makes it hard to apply in constructing solutions of nonlinear equations. In this paper, we first localize the nonlocal residual symmetry to a Lie point symmetry by introducing multiple new variables and obtain a new Bäcklund transformation. Then, by solving for the general form of the localized residual symmetry, we reduce the enlarged system by the classical symmetry approach and obtain the corresponding reduction solutions as well as the related reduction equations. The localization procedure provides a new way to investigate interaction solutions between different waves.

preprint2013arXiv

Residual Symmetry Reductions and Interaction Solutions of (2+1)-Dimensional Burgers Equation

The (2+1)-dimensional Burgers equation is investigated first from the perspective of symmetry, by localizing the nonlocal residual symmetries, and then by a simple generalized tanh expansion method. New symmetry reduction solutions are obtained by using the standard Lie point symmetry group approach. A new Bäcklund transformation (BT) for the Burgers equation is given by the generalized tanh expansion method. From this BT, interaction solutions among different nonlinear excitations, which are hard to obtain by other methods, are also obtained easily.

preprint2012arXiv

Hints of Standard Model Higgs Boson at the LHC and Light Dark Matter Searches

The most recent results of searches at the LHC for the Higgs boson h have turned up possible hints of such a particle with mass m_h about 125 GeV consistent with standard model (SM) expectations. This has many potential implications for the SM and beyond. We consider some of them in the contexts of a simple Higgs-portal dark matter (DM) model, the SM plus a real gauge-singlet scalar field D as the DM candidate, and a couple of its variations. In the simplest model with one Higgs doublet and three or four generations of fermions, for D mass m_D < m_h/2 the invisible decay h -> DD tends to have a substantial branching ratio. If future LHC data confirm the preliminary Higgs indications, m_D will have to exceed m_h/2. To keep the DM lighter than m_h/2, one will need to extend the model and also satisfy constraints from DM direct searches. The latter can be accommodated if the model provides sizable isospin violation in the DM-nucleon interactions. We explore this in a two-Higgs-doublet model combined with the scalar field D. This model can offer a 125-GeV SM-like Higgs and a light DM candidate having isospin-violating interactions with nucleons at roughly the required level, albeit with some degree of fine-tuning.

preprint2011arXiv

A Higgs Quadruplet for Type III Seesaw and Implications for $μ\to eγ$ and $μ- e$ Conversion

In the Type III seesaw model the heavy neutrinos are contained in leptonic triplet representations. The Yukawa couplings of the triplet fermion and the left-handed neutrinos with the doublet Higgs field produce the Dirac mass terms. Together with the Majorana masses for the leptonic triplets, the light neutrinos obtain non-zero seesaw masses. We point out that it is also possible to have a quadruplet Higgs field produce the Dirac mass terms and facilitate the seesaw mechanism. The vacuum expectation value of the quadruplet Higgs is constrained to be small by electroweak precision data. Therefore the Yukawa couplings of a quadruplet can be much larger than those for a doublet. We also find that, unlike the usual Type III seesaw model where at least two copies of leptonic triplets are needed, with both doublet and quadruplet Higgs representations just one leptonic triplet suffices for a phenomenologically acceptable model, because light neutrino masses can receive sizable contributions at both tree and one-loop levels. Large Yukawa couplings of the quadruplet can induce observable effects for the lepton flavor violating processes $μ\to e γ$ and $μ- e$ conversion. Implications of the recent $μ\to eγ$ limit from MEG and of the limit on $μ- e $ conversion on Au are also given. Some interesting collider signatures for the doubly charged Higgs boson in the quadruplet are discussed.

preprint2011arXiv

LHC Evidence Of A 126 GeV Higgs Boson From $H \to γγ$ With Three And Four Generations

Searches for the Higgs boson at the LHC have excluded a standard model (SM) Higgs boson mass in the range from 127 GeV to 600 GeV. With a fourth generation, the excluded range is wider. To close the window between 114 GeV and 127 GeV, the mode $H \to γγ$ plays an important role. There is evidence from LHC data that the Higgs boson mass is about 126 GeV. $H\to γγ$ can occur at one-loop level in the SM. In the SM with three generations (SM3), the dominant contribution is from the W boson, with some cancellation from the top quark in the loop. In the SM with four generations (SM4), the large masses of the fourth generation quarks and charged lepton cancel the W boson contribution significantly, and the decay width is suppressed by a factor in the range of 0.25 $\sim$ 0.55 for a fourth generation mass in the range of 500 to 1000 GeV. This reduction factor makes $σ(pp\to H X)Br(H\to γγ)$ for SM4 comparable to that for SM3 for a Higgs boson mass in the allowed window mentioned earlier. Using $H \to γγ$ alone, therefore, it is difficult at present to distinguish whether the Higgs boson is from SM3 or SM4. We also comment on some other detection channels.

preprint2011arXiv

Low Mass Dark Matter and Invisible Higgs Width In Darkon Models

The Standard Model (SM) plus a real gauge-singlet scalar field dubbed darkon (SM+D) is the simplest model possessing a weakly interacting massive particle (WIMP) dark-matter candidate. In this model, the parameters are constrained by the dark matter relic density and direct searches. The fact that the interaction between the darkon and SM particles is mediated only by Higgs boson exchange may lead to significant modifications of the Higgs boson properties. If the dark matter mass is smaller than half of the Higgs boson mass, the Higgs boson can decay into a pair of darkons, resulting in a large invisible branching ratio. The Higgs boson will be searched for at the LHC and may well be discovered in the near future. If a Higgs boson with a small invisible decay width is found, the SM+D model with small dark matter mass will be in trouble. We find that by extending the SM+D to a two-Higgs-doublet model plus a darkon (THDM+D) it is possible to have a Higgs boson with a small invisible branching ratio and, at the same time, a dark matter candidate with a low mass. We also comment on other implications of this model.

preprint2010arXiv

Large Dimuon Asymmetry In Bs-bar B_s Mixing From Unparticle Induced Gamma^{12}_s

Exchange of unparticle stuff of dimension $d_\U$ with FCNC interaction can induce $M^{12,u}$ and $Γ^{12,u}$, causing meson and anti-meson mixing with the relation $Γ^{12,u}/M^{12,u} = 2 \tan(πd_\U)$. We show that this type of unparticle contribution can provide the much needed large $Γ^{12}_s$ to explain the recently observed anomalously large dimuon asymmetry in the $B_s -\bar B_s$ system reported by the D0 collaboration. The same interaction can also accommodate the large mixing-induced CP violation in $B_s \to J/ψϕ$ indicated by CDF and D0 data. Experimental data can provide constraints on the unparticle dimension and scale.

33 published item(s)

Analyzing Communication Predictability in LLM Training

Contrastive Graph Multimodal Model for Text Classification in Videos

EDN: Salient Object Detection via Extremely-Downsampled Network

GMN: Generative Multi-modal Network for Practical Document Information Extraction

Interactive Style Transfer: All is Your Palette

Knowledge Mining with Scene Text for Fine-Grained Recognition

Neural Collaborative Graph Machines for Table Structure Recognition

NomMer: Nominate Synergistic Context in Vision Transformer for Visual Recognition

OS-MSL: One Stage Multimodal Sequential Link Framework for Scene Segmentation and Classification

RAAT: Relation-Augmented Attention Transformer for Relation Modeling in Document-Level Event Extraction

Relational Representation Learning in Visually-Rich Documents

Scene Consistency Representation Learning for Video Scene Segmentation

See Finer, See More: Implicit Modality Alignment for Text-based Person Retrieval

Sequence-to-Action: Grammatical Error Correction with Action Guided Sequence Generation

TaCo: Textual Attribute Recognition via Contrastive Learning

The Devil is in the Frequency: Geminated Gestalt Autoencoder for Self-Supervised Visual Pre-Training

VLMAE: Vision-Language Masked Autoencoder

PuzzleNet: Scene Text Detection by Segment Context Graph Learning

Interaction solutions for supersymmetric mKdV-B equation

Interaction solutions for mKP equation with nonlocal symmetry reductions and CTE method

Backlund transformations for Burgers Equation via localization of residual symmetries

Bosonization, Painleve property, exact solutions for N=1 supersymmetric mKdV equation

Dark parameterization approach to Ito equation

New interaction solutions from Lax pair related symmetry of the Generalized fifth order KdV equation

Lepton number violation and $h\to γγ$ in a radiative inverse seesaw dark matter model

New interaction solutions of Kadomtsev-Petviashvili equation

New symmetry reductions related with the residual symmetry of Boussinesq equation

Residual Symmetry Reductions and Interaction Solutions of (2+1)-Dimensional Burgers Equation

Hints of Standard Model Higgs Boson at the LHC and Light Dark Matter Searches

A Higgs Quadruplet for Type III Seesaw and Implications for $μ\to eγ$ and $μ- e$ Conversion

LHC Evidence Of A 126 GeV Higgs Boson From $H \to γγ$ With Three And Four Generations

Low Mass Dark Matter and Invisible Higgs Width In Darkon Models

Large Dimuon Asymmetry In Bs-bar B_s Mixing From Unparticle Induced Gamma^{12}_s
