Researcher profile

Xin Cong

Xin Cong contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
8 works
0 followers
8 topics
4 close collaborators

Published work

8 published items

preprint | 2024 | arXiv

Exploring Format Consistency for Instruction Tuning

Instruction tuning has emerged as a promising approach to enhancing large language models in following human instructions. It is shown that increasing the diversity and number of instructions in the training data can consistently enhance generalization performance, which facilitates a recent endeavor to collect various instructions and integrate existing instruction tuning datasets into larger collections. However, different users have their unique ways of expressing instructions, and there often exist variations across different datasets in the instruction styles and formats, i.e., format inconsistency. In this work, we propose a framework named Unified Instruction Tuning (UIT), which calls OpenAI APIs for automatic format transfer among different instruction tuning datasets such as PromptSource, FLAN and CrossFit. With the framework, we (1) demonstrate the necessity of maintaining format consistency in instruction tuning; (2) improve the generalization performance on unseen instructions on T5-LM-xl; (3) provide a novel perplexity-based denoising method to reduce the noise of automatic format transfer to make the UIT framework more practical and a smaller offline model based on GPT-J that achieves comparable format transfer capability to OpenAI APIs to reduce costs in practice. Further analysis regarding variations of targeted formats and other effects is intended.

preprint | 2022 | arXiv

Cross-Domain Recommendation to Cold-Start Users via Variational Information Bottleneck

Recommender systems have been widely deployed in many real-world applications, but usually suffer from the long-standing user cold-start problem. As a promising remedy, Cross-Domain Recommendation (CDR) has attracted a surge of interest; it aims to transfer the user preferences observed in the source domain to make recommendations in the target domain. Previous CDR approaches mostly follow the Embedding and Mapping (EMCDR) idea, which attempts to learn a mapping function that transfers pre-trained user representations (embeddings) from the source domain into the target domain. However, they pre-train the user/item representations independently for each domain and fail to consider the interactions of both domains simultaneously. As a result, the biased pre-trained representations inevitably contain domain-specific information, which may negatively affect the transfer of information across domains. In this work, we consider a key question of the CDR task: what information needs to be shared across domains? To answer it, this paper uses the information bottleneck (IB) principle and proposes a novel approach, termed CDRIB, that forces the representations to encode domain-shared information. To derive unbiased representations, we devise two IB regularizers that model the cross-domain and in-domain user-item interactions simultaneously, so that CDRIB considers the interactions of both domains jointly for de-biasing.

preprint | 2022 | arXiv

Document-Level Event Extraction via Human-Like Reading Process

Document-level Event Extraction (DEE) is particularly tricky due to the two challenges it poses: scattering-arguments and multi-events. The first challenge means that arguments of one event record could reside in different sentences in the document, while the second reflects that one document may simultaneously contain multiple such event records. Motivated by the cognitive process humans follow when reading to extract information of interest, in this paper we propose a method called HRE (Human Reading inspired Extractor for Document Events), where DEE is decomposed into two iterative stages: rough reading and elaborate reading. Specifically, the first stage browses the document to detect the occurrence of events, and the second stage serves to extract specific event arguments. For each concrete event role, elaborate reading works hierarchically from sentences to characters to locate arguments across sentences, and thus the scattering-arguments problem is tackled. Meanwhile, rough reading is explored in a multi-round manner to discover undetected events, and thus the multi-events problem is handled. Experimental results show the superiority of HRE over prior competitive methods.

preprint | 2020 | arXiv

Electronic Raman Scattering in Suspended Semiconducting Carbon Nanotubes

The electronic Raman scattering (ERS) features of single-walled carbon nanotubes (SWNTs) can reveal a wealth of information about their electronic structures, but have previously been thought to appear exclusively in metallic (M-) but not in semiconducting (S-) SWNTs. We report the experimental observation of the ERS features with an accuracy of 1 meV in suspended S-SWNTs, the processes of which are accomplished via the available high-energy electron-hole pairs. The ERS features can facilitate further systematic studies on the properties of SWNTs, both metallic and semiconducting, with defined chirality.

preprint | 2020 | arXiv

Inductive Unsupervised Domain Adaptation for Few-Shot Classification via Clustering

Few-shot classification tends to struggle when it needs to adapt to diverse domains. Due to the non-overlapping label space between domains, the performance of conventional domain adaptation is limited. Previous work tackles the problem in a transductive manner, by assuming access to the full set of test data, which is too restrictive for many real-world applications. In this paper, we set out to tackle this issue by introducing an inductive framework, DaFeC, to improve Domain adaptation performance for Few-shot classification via Clustering. We first build a representation extractor to derive features for unlabeled data from the target domain (no test data is necessary) and then group them with a cluster miner. The generated pseudo-labeled data and the labeled source-domain data are used as supervision to update the parameters of the few-shot classifier. In order to derive high-quality pseudo labels, we propose a Clustering Promotion Mechanism, to learn better features for the target domain via Similarity Entropy Minimization and Adversarial Distribution Alignment, which are combined with a Cosine Annealing Strategy. Experiments are performed on the FewRel 2.0 dataset. Our approach outperforms previous work with absolute gains (in classification accuracy) of 4.95%, 9.55%, 3.99% and 11.62%, respectively, under four few-shot settings.

preprint | 2020 | arXiv

Label Enhanced Event Detection with Heterogeneous Graph Attention Networks

Event Detection (ED) aims to recognize instances of specified types of event triggers in text. Unlike English ED, Chinese ED suffers from the problem of word-trigger mismatch due to uncertain word boundaries. Existing approaches that inject word information into character-level models have made promising progress in alleviating this problem, but they are limited by two issues. First, the interaction between characters and lexicon words is not fully exploited. Second, they ignore the semantic information provided by event labels. We thus propose a novel architecture named Label enhanced Heterogeneous Graph Attention Networks (L-HGAT). Specifically, we transform each sentence into a graph, where character nodes and word nodes are connected with different types of edges, so that the interaction between words and characters is fully preserved. A heterogeneous graph attention network is then introduced to propagate relational messages and enrich information interaction. Furthermore, we convert each label into a trigger-prototype-based embedding, and design a margin loss to guide the model to distinguish confusing event labels. Experiments on two benchmark datasets show that our model achieves significant improvement over a range of competitive baseline methods.

preprint | 2020 | arXiv

Understanding angle-resolved polarized Raman scattering from black phosphorus at normal and oblique laser incidences

The selection rule for angle-resolved polarized Raman (ARPR) intensity of phonons from the standard group-theoretical method in isotropic materials would break down in anisotropic layered materials (ALMs) due to birefringence and linear dichroism effects. The two effects result in depth-dependent polarization and intensity of the incident laser and scattered signal inside ALMs and thus make it challenging to predict ARPR intensity at any laser incidence direction. Herein, taking in-plane anisotropic black phosphorus as a prototype, we developed a so-called birefringence-linear-dichroism (BLD) model to quantitatively understand its ARPR intensity at both normal and oblique laser incidences by the same set of real Raman tensors for a certain laser excitation. No fitting parameter is needed once the birefringence and linear dichroism effects are considered with the complex refractive indexes. An approach was proposed to experimentally determine the real Raman tensor and complex refractive indexes, respectively, from the relative Raman intensity along its principal axes and incident-angle resolved reflectivity by Fresnel's law. The results suggest that the previously reported ARPR intensity of ultrathin ALM flakes deposited on a multilayered substrate at normal laser incidence can also be understood based on the BLD model by considering the depth-dependent polarization and intensity of the incident laser and scattered Raman signal induced by both birefringence and linear dichroism effects within ALM flakes and the interference effects in the multilayered structures, which are dependent on the excitation wavelength, thickness of ALM flakes and dielectric layers of the substrate. This work can be generally applicable to any opaque anisotropic crystals, offering a promising route to predict and manipulate the polarized behaviors of related phonons.

preprint | 2019 | arXiv

Edge-Epitaxial Growth of InSe Nanowires toward High-Performance Photodetectors

Semiconducting nanowires offer many opportunities for electronic and optoelectronic device applications due to their special geometries and unique physical properties. However, it has been challenging to synthesize semiconducting nanowires directly on SiO2/Si substrates due to lattice mismatch. Here, we developed a catalysis-free approach to achieve direct synthesis of long and straight InSe nanowires on SiO2/Si substrates through edge-homoepitaxial growth. We further achieved parallel InSe nanowires on SiO2/Si substrates by controlling growth conditions. We attributed the underlying growth mechanism to a selenium self-driven vapor-liquid-solid process, which is distinct from the conventional metal-catalytic vapor-liquid-solid method widely used for growing Si and III-V nanowires. Furthermore, we demonstrated that the as-grown InSe nanowire-based visible light photodetector simultaneously possesses an extraordinary photoresponsivity of 271 A/W, an ultrahigh detectivity of 1.57 × 10^14 Jones and a fast response on the microsecond scale. The excellent performance of the photodetector indicates that as-grown InSe nanowires are promising for future optoelectronic applications. More importantly, the proposed edge-homoepitaxial approach may open up a novel avenue for direct synthesis of semiconducting nanowire arrays on SiO2/Si substrates.