Source author record

Chen Zheng

Chen Zheng appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

astro-ph.CO, Computation and Language, astro-ph.SR, Computer Vision, cond-mat.mtrl-sci, Databases, Performance

Catalog footprint

What is connected

9 works

7 topics

4 close collaborators


preprint, 2024, arXiv

ICE-GRT: Instruction Context Enhancement by Generative Reinforcement based Transformers

Large Language Models (LLMs) such as ChatGPT and LLaMA encounter limitations in domain-specific tasks: they often lack depth and accuracy in specialized areas and show a decrease in general capabilities when fine-tuned, particularly in the analysis ability of small-sized models. To address these gaps, we introduce ICE-GRT, utilizing Reinforcement Learning from Human Feedback (RLHF) grounded in Proximal Policy Optimization (PPO), demonstrating remarkable ability in in-domain scenarios without compromising general task performance. Our exploration of ICE-GRT highlights its understanding and reasoning ability to not only generate robust answers but also to provide detailed analyses of the reasons behind the answer. This capability marks a significant progression beyond the scope of Supervised Fine-Tuning models. The success of ICE-GRT depends on several crucial factors, including Appropriate Data, Reward Size Scaling, KL-Control, and Advantage Normalization, among others. The ICE-GRT model exhibits state-of-the-art performance in domain-specific tasks and across 12 general Language tasks against equivalent size and even larger size LLMs, highlighting the effectiveness of our approach. We provide a comprehensive analysis of ICE-GRT, underscoring the significant advancements it brings to the field of LLMs.
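The abstract names KL-Control and Advantage Normalization among the factors behind ICE-GRT but gives no formulas. As a rough illustration only, the sketch below shows the forms these two steps commonly take in PPO-based RLHF. It is not taken from the paper: the function names, the `kl_coef` value, and the choice to add the scalar reward at the final token are all assumptions.

```python
import math


def normalize_advantages(advantages, eps=1e-8):
    """Shift and scale a batch of advantages to zero mean and unit variance."""
    n = len(advantages)
    mean = sum(advantages) / n
    var = sum((a - mean) ** 2 for a in advantages) / n
    std = math.sqrt(var)
    # eps guards against division by zero when all advantages are equal.
    return [(a - mean) / (std + eps) for a in advantages]


def kl_penalized_rewards(reward, logp_policy, logp_ref, kl_coef=0.1):
    """Per-token rewards with a KL penalty against a reference model.

    Each token is penalized by kl_coef * (log pi(token) - log pi_ref(token)).
    The scalar reward-model score is added at the last token (an assumption).
    """
    rewards = [-kl_coef * (p - r) for p, r in zip(logp_policy, logp_ref)]
    rewards[-1] += reward
    return rewards


# Hypothetical values, for illustration only.
advantages = normalize_advantages([1.0, 2.0, 3.0])
token_rewards = kl_penalized_rewards(1.0, [-1.0, -2.0], [-1.5, -2.0])
```

Normalizing advantages keeps the policy-gradient scale stable across batches, while the KL term discourages the policy from drifting far from the reference model.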

preprint, 2022, arXiv

Relevant CommonSense Subgraphs for "What if..." Procedural Reasoning

We study the challenge of learning causal reasoning over procedural text to answer "What if..." questions when external commonsense knowledge is required. We propose a novel multi-hop graph reasoning model to 1) efficiently extract a commonsense subgraph with the most relevant information from a large knowledge graph; 2) predict the causal answer by reasoning over the representations obtained from the commonsense subgraph and the contextual interactions between the questions and context. We evaluate our model on the WIQA benchmark and achieve state-of-the-art performance compared to recent models.

preprint, 2020, arXiv

AIBench: An Agile Domain-specific Benchmarking Methodology and an AI Benchmark Suite

Domain-specific software and hardware co-design is encouraging as it is much easier to achieve efficiency for fewer tasks. Agile domain-specific benchmarking speeds up the process as it provides not only relevant design inputs but also relevant metrics and tools. Unfortunately, modern workloads like Big data, AI, and Internet services dwarf traditional ones in terms of code size, deployment scale, and execution path, and hence raise serious benchmarking challenges. This paper proposes an agile domain-specific benchmarking methodology. Together with seventeen industry partners, we identify ten important end-to-end application scenarios, among which sixteen representative AI tasks are distilled as the AI component benchmarks. We propose the permutations of essential AI and non-AI component benchmarks as end-to-end benchmarks. An end-to-end benchmark is a distillation of the essential attributes of an industry-scale application. We design and implement a highly extensible, configurable, and flexible benchmark framework, on the basis of which we propose the guideline for building end-to-end benchmarks and present the first end-to-end Internet service AI benchmark. The preliminary evaluation shows the value of our benchmark suite, AIBench, against MLPerf and TailBench for hardware and software designers, micro-architectural researchers, and code developers. The specifications, source code, testbed, and results are publicly available from the web site http://www.benchcouncil.org/AIBench/index.html.

preprint, 2020, arXiv

Cross-Modality Relevance for Reasoning on Language and Vision

This work deals with the challenge of learning and reasoning over language and vision data for related downstream tasks such as visual question answering (VQA) and natural language for visual reasoning (NLVR). We design a novel cross-modality relevance module that is used in an end-to-end framework to learn the relevance representation between components of various input modalities under the supervision of a target task, which is more generalizable to unobserved data compared to merely reshaping the original representation space. In addition to modeling the relevance between the textual entities and visual entities, we model the higher-order relevance between entity relations in the text and object relations in the image. Our proposed approach shows competitive performance on two different language and vision tasks using public benchmarks and improves the state-of-the-art published results. The alignments of input spaces and their relevance representations learned on the NLVR task boost the training efficiency of the VQA task.

preprint, 2014, arXiv

BigDataBench: a Big Data Benchmark Suite from Internet Services

As architecture, systems, and data management communities pay greater attention to innovative big data systems and architectures, the pressure of benchmarking and evaluating these systems rises. Considering the broad use of big data systems, big data benchmarks must include a diversity of data and workloads. Most of the state-of-the-art big data benchmarking efforts target evaluating specific types of applications or system software stacks, and hence they are not qualified for serving the purposes mentioned above. This paper presents our joint research efforts on this issue with several industrial partners. Our big data benchmark suite BigDataBench not only covers broad application scenarios, but also includes diverse and representative data sets. BigDataBench is publicly available from http://prof.ict.ac.cn/BigDataBench. Also, we comprehensively characterize 19 big data workloads included in BigDataBench with varying data inputs. On a typical state-of-practice processor, Intel Xeon E5645, we have the following observations: First, in comparison with the traditional benchmarks, including PARSEC, HPCC, and SPECCPU, big data applications have very low operation intensity; Second, the volume of data input has non-negligible impact on micro-architecture characteristics, which may impose challenges for simulation-based big data architecture research; Last but not least, corroborating the observations in CloudSuite and DCBench (which use smaller data inputs), we find that the numbers of L1 instruction cache misses per 1000 instructions of the big data applications are higher than in the traditional benchmarks; also, we find that L3 caches are effective for the big data applications, corroborating the observation in DCBench.
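The cache metric quoted in this abstract, misses per 1000 instructions (often abbreviated MPKI), is simple arithmetic over two hardware counter values. A minimal sketch follows; the function name and the counter values are hypothetical and are not measurements from the paper.

```python
def mpki(misses: int, instructions: int) -> float:
    """Cache misses per 1000 retired instructions."""
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    return 1000.0 * misses / instructions


# Hypothetical counter readings, for illustration only.
l1i_mpki = mpki(misses=2_300_000, instructions=100_000_000)
```

Normalizing by instruction count lets workloads of very different lengths be compared on the same scale.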

preprint, 2014, arXiv

Hubble Space Telescope and Ground-Based Observations of the Type Iax Supernovae SN 2005hk and SN 2008A

We present Hubble Space Telescope (HST) and ground-based optical and near-infrared observations of SN 2005hk and SN 2008A, typical members of the Type Iax class of supernovae (SNe). Here we focus on late-time observations, where these objects deviate most dramatically from all other SN types. Instead of the dominant nebular emission lines that are observed in other SNe at late phases, spectra of SNe 2005hk and 2008A show lines of Fe II, Ca II, and Fe I more than a year past maximum light, along with narrow [Fe II] and [Ca II] emission. We use spectral features to constrain the temperature and density of the ejecta, and find high densities at late times, with n_e >~ 10^9 cm^-3. Such high densities should yield enhanced cooling of the ejecta, making these objects good candidates to observe the expected "infrared catastrophe," a generic feature of SN Ia models. However, our HST photometry of SN 2008A does not match the predictions of an infrared catastrophe. Moreover, our HST observations rule out a "complete deflagration" that fully disrupts the white dwarf for these peculiar SNe, showing no evidence for unburned material at late times. Deflagration explosion models that leave behind a bound remnant can match some of the observed properties of SNe Iax, but no published model is consistent with all of our observations of SNe 2005hk and 2008A.

preprint, 2014, arXiv

Thermodynamics of Surface Defects at the Aspirin/Water Interface

We present a simulation scheme to calculate defect formation free energies at a molecular crystal/water interface based on force-field molecular dynamics (MD) simulations. To this end we adopt and modify existing approaches to calculate binding free energies of biological ligand/receptor complexes to be applicable to common surface defects, such as step edges and kink sites. We obtain statistically accurate and reliable free energy values for the aspirin/water interface, which can be applied to estimate the distribution of defects using well-established thermodynamic relations. As a showcase we calculate the free energy upon dissolving molecules from kink sites at the interface. This free energy can be related to the solubility concentration and we obtain solubility values in excellent agreement with experimental results.

preprint, 2009, arXiv

First-year Sloan Digital Sky Survey-II (SDSS-II) Supernova Results: Hubble Diagram and Cosmological Parameters

We present measurements of the Hubble diagram for 103 Type Ia supernovae (SNe) with redshifts 0.04 < z < 0.42, discovered during the first season (Fall 2005) of the Sloan Digital Sky Survey-II (SDSS-II) Supernova Survey. These data fill in the redshift "desert" between low- and high-redshift SN Ia surveys. We combine the SDSS-II measurements with new distance estimates for published SN data from the ESSENCE survey, the Supernova Legacy Survey, the Hubble Space Telescope, and a compilation of nearby SN Ia measurements. Combining the SN Hubble diagram with measurements of Baryon Acoustic Oscillations from the SDSS Luminous Red Galaxy sample and with CMB temperature anisotropy measurements from WMAP, we estimate the cosmological parameters w and Omega_M, assuming a spatially flat cosmological model (FwCDM) with constant dark energy equation of state parameter, w. For the FwCDM model and the combined sample of 288 SNe Ia, we find w = -0.76 ± 0.07(stat) ± 0.11(syst), Omega_M = 0.306 ± 0.019(stat) ± 0.023(syst) using MLCS2k2 and w = -0.96 ± 0.06(stat) ± 0.12(syst), Omega_M = 0.265 ± 0.016(stat) ± 0.025(syst) using the SALT-II fitter. We trace the discrepancy between these results to a difference in the rest-frame UV model combined with a different luminosity correction from color variations; these differences mostly affect the distance estimates for the SNLS and HST supernovae. We present detailed discussions of systematic errors for both light-curve methods and find that they both show data-model discrepancies in rest-frame U-band. For the SALT-II approach, we also see strong evidence for redshift-dependence of the color-luminosity parameter (beta). Restricting the analysis to the 136 SNe Ia in the Nearby+SDSS-II samples, we find much better agreement between the two analysis methods but with larger uncertainties.
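To see what the two quoted (w, Omega_M) pairs imply on a Hubble diagram, the sketch below evaluates the distance modulus in a spatially flat model with constant w. It uses the standard textbook expressions and is not the paper's fitting pipeline: radiation is neglected, the Hubble constant `h0=70.0` km/s/Mpc is an assumed value that only shifts the result by a constant, and all function names are hypothetical.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def e_of_z(z, omega_m, w):
    """Dimensionless expansion rate H(z)/H0 for a flat model with constant w."""
    omega_de = 1.0 - omega_m  # flatness fixes the dark energy density
    return math.sqrt(omega_m * (1 + z) ** 3
                     + omega_de * (1 + z) ** (3 * (1 + w)))


def luminosity_distance_mpc(z, omega_m, w, h0=70.0, steps=1000):
    """d_L = (1+z) (c/H0) * integral_0^z dz'/E(z'), via Simpson's rule."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    h = z / steps
    total = 1.0 / e_of_z(0.0, omega_m, w) + 1.0 / e_of_z(z, omega_m, w)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight / e_of_z(i * h, omega_m, w)
    integral = total * h / 3.0
    return (1 + z) * (C_KM_S / h0) * integral


def distance_modulus(z, omega_m, w, h0=70.0):
    """mu = 5 log10(d_L / 10 pc)."""
    d_pc = luminosity_distance_mpc(z, omega_m, w, h0) * 1e6
    return 5.0 * math.log10(d_pc / 10.0)


# Central values quoted in the abstract, evaluated at an illustrative z = 0.3.
mu_mlcs = distance_modulus(0.3, omega_m=0.306, w=-0.76)
mu_salt = distance_modulus(0.3, omega_m=0.265, w=-0.96)
```

At a given redshift the SALT-II central values give a larger distance modulus than the MLCS2k2 values, since a more negative w and a lower Omega_M both slow the growth of H(z).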

preprint, 2009, arXiv

The Sloan Digital Sky Survey-II: Photometry and Supernova Ia Light Curves from the 2005 data

We present ugriz light curves for 146 spectroscopically confirmed or spectroscopically probable Type Ia supernovae from the 2005 season of the SDSS-II Supernova survey. The light curves have been constructed using a photometric technique that we call scene modelling, which is described in detail here; the major feature is that supernova brightnesses are extracted from a stack of images without spatial resampling or convolution of the image data. This procedure produces accurate photometry along with accurate estimates of the statistical uncertainty, and can be used to derive photometry taken with multiple telescopes. We discuss various tests of this technique that demonstrate its capabilities. We also describe the methodology used for the calibration of the photometry, and present calibrated magnitudes and fluxes for all of the spectroscopic SNe Ia from the 2005 season.
