Source author record

Bin Ma

Bin Ma appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher: unclaimed source record

Catalog footprint

What is connected

24 works

22 topics

4 close collaborators

preprint, 2026, arXiv

A Highly Magnetic Ultra Massive White Dwarf with a 23-minute Rotation Period

We present a physical characterization of TMTS J00063798+3104160 (J0006), a rapidly rotating, ultra-massive white dwarf (WD) identified in high-cadence light curves from the Tsinghua University-Ma Huateng Telescope for Survey (TMTS). A coherent 23-minute periodicity is detected in TMTS, TESS, and ZTF photometry. A time series of low-resolution spectra with the Keck-I 10 m telescope reveals broad, shallow hydrogen absorption features indicative of an extreme magnetic field and shows no evidence for radial-velocity variations. Atmospheric modeling yields a magnetic field strength of $\sim$ 250 MG, while Gaia astrometry and photometry imply a mass of 1.06 $\pm$ 0.01 M$_{\odot}$. A significant infrared excess is detected in the WISE W1 band and is well fitted by a 550 K blackbody, likely arising from residual material of a merger. We interpret the 23-minute photometric modulation as the rotation period of an isolated, massive WD likely formed through the merger of a double WD binary. With one of the shortest rotation periods known among candidate merger remnants and with constraints from a deep Einstein Probe X-ray nondetection, J0006 provides a rare and important observational window into the poorly explored intermediate stages of post-merger evolution.

preprint, 2022, arXiv

Amino Acid Classification in 2D NMR Spectra via Acoustic Signal Embeddings

Nuclear Magnetic Resonance (NMR) is used in structural biology to experimentally determine the structure of proteins, which is used in many areas of biology and is an important part of drug development. Unfortunately, NMR data can cost thousands of dollars per sample to collect and it can take a specialist weeks to assign the observed resonances to specific chemical groups. There has thus been growing interest in the NMR community in using deep learning to automate NMR data annotation. Due to similarities between NMR and audio data, we propose that methods used in acoustic signal processing can be applied to NMR as well. Using a simulated amino acid dataset, we show that by swapping out filter banks with a trainable convolutional encoder, acoustic signal embeddings from speaker verification models can be used for amino acid classification in 2D NMR spectra by treating each amino acid as a unique speaker. On an NMR dataset comparable in size to 46 hours of audio, we achieve a classification performance of 97.7% on a 20-class problem. We also achieve a 23% relative improvement by using an acoustic embedding model compared to an existing NMR-based model.

preprint, 2022, arXiv

End-to-End Complex-Valued Multidilated Convolutional Neural Network for Joint Acoustic Echo Cancellation and Noise Suppression

Echo and noise suppression is an integral part of a full-duplex communication system. Many recent acoustic echo cancellation (AEC) systems rely on a separate adaptive filtering module for linear echo suppression and a neural module for residual echo suppression. However, not only do adaptive filtering modules require convergence and remain susceptible to changes in acoustic environments, but this two-stage framework also often introduces unnecessary delays to the AEC system when neural modules are already capable of both linear and nonlinear echo suppression. In this paper, we exploit the offset-compensating ability of complex time-frequency masks and propose an end-to-end complex-valued neural network architecture. The building block of the proposed model is a pseudocomplex extension based on the densely-connected multidilated DenseNet (D3Net) building block, resulting in a very small network of only 354K parameters. The architecture utilized the multi-resolution nature of the D3Net building blocks to eliminate the need for pooling, allowing the network to extract features using large receptive fields without any loss of output resolution. We also propose a dual-mask technique for joint echo and noise suppression with simultaneous speech enhancement. Evaluation on both synthetic and real test sets demonstrated promising results across multiple energy-based metrics and perceptual proxies.

preprint, 2022, arXiv

I2CR: Improving Noise Robustness on Keyword Spotting Using Inter-Intra Contrastive Regularization

Noise robustness in keyword spotting remains a challenge, as many models fail to overcome the heavy influence of noise, which degrades the quality of feature embeddings. We propose a contrastive regularization method called Inter-Intra Contrastive Regularization (I2CR) to improve the feature representations by guiding the model to learn the fundamental speech information specific to the cluster. This involves maximizing the similarity across Intra and Inter samples of the same class. As a result, it pulls the instances closer to more generalized representations that form more prominent clusters and reduces the adverse impact of noise. We show that our method provides consistent improvements in accuracy over different backbone model architectures under different noise environments. We also demonstrate that our proposed framework improves accuracy on unseen out-of-domain noises and unseen noise SNRs. This indicates that our work yields an overall improvement in noise robustness.

preprint, 2022, arXiv

Learning Disentangled Representations for Counterfactual Regression via Mutual Information Minimization

Learning individual-level treatment effect is a fundamental problem in causal inference and has received increasing attention in many areas, especially in the user growth area which concerns many internet companies. Recently, disentangled representation learning methods that decompose covariates into three latent factors, including instrumental, confounding and adjustment factors, have witnessed great success in treatment effect estimation. However, it remains an open problem how to learn the underlying disentangled factors precisely. Specifically, previous methods fail to obtain independent disentangled factors, which is a necessary condition for identifying treatment effect. In this paper, we propose Disentangled Representations for Counterfactual Regression via Mutual Information Minimization (MIM-DRCFR), which uses a multi-task learning framework to share information when learning the latent factors and incorporates MI minimization learning criteria to ensure the independence of these factors. Extensive experiments including public benchmarks and real-world industrial user growth datasets demonstrate that our method performs much better than state-of-the-art methods.

preprint, 2022, arXiv

M2MeT: The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Challenge

Recent development of speech processing, such as speech recognition, speaker diarization, etc., has inspired numerous applications of speech technologies. The meeting scenario is one of the most valuable and, at the same time, most challenging scenarios for the deployment of speech technologies. Specifically, two typical tasks, speaker diarization and multi-speaker automatic speech recognition, have attracted much attention recently. However, the lack of large public meeting data has been a major obstacle for the advancement of the field. Therefore, we make available the AliMeeting corpus, which consists of 120 hours of recorded Mandarin meeting data, including far-field data collected by an 8-channel microphone array as well as near-field data collected by headset microphones. Each meeting session is composed of 2-4 speakers with different speaker overlap ratios, recorded in rooms of different sizes. Along with the dataset, we launch the ICASSP 2022 Multi-channel Multi-party Meeting Transcription Challenge (M2MeT) with two tracks, namely speaker diarization and multi-speaker ASR, aiming to provide a common testbed for meeting rich transcription and promote reproducible research in this field. In this paper we provide a detailed introduction of the AliMeeting dataset, challenge rules, evaluation methods and baseline systems.

preprint, 2022, arXiv

Summary On The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Grand Challenge

The ICASSP 2022 Multi-channel Multi-party Meeting Transcription Grand Challenge (M2MeT) focuses on one of the most valuable and most challenging scenarios of speech technologies. The M2MeT challenge has two tracks: speaker diarization (track 1) and multi-speaker automatic speech recognition (ASR) (track 2). Along with the challenge, we released 120 hours of real-recorded Mandarin meeting speech data with manual annotation, including far-field data collected by an 8-channel microphone array as well as near-field data collected by each participant's headset microphone. We briefly describe the released dataset, track setups and baselines, and summarize the challenge results and major techniques used in the submissions.

preprint, 2021, arXiv

Cloud Cover and Aurora Contamination at Dome A in 2017 from KLCAM

Dome A in Antarctica has many characteristics that make it an excellent site for astronomical observations, from the optical to the terahertz. Quantitative site testing is still needed to confirm the site's properties. In this paper, we present a statistical analysis of cloud cover and aurora contamination from the Kunlun Cloud and Aurora Monitor (KLCAM). KLCAM is an automatic, unattended all-sky camera aiming for long-term monitoring of the usable observing time and optical sky background at Dome A. It was installed at Dome A in January 2017, worked through the austral winter, and collected over 47,000 images over 490 days. A semi-quantitative visual data analysis of cloud cover and auroral contamination was carried out by five individuals. The analysis shows that the night sky was free of cloud for 83 per cent of the time, which ranks Dome A highly in a comparison with other observatory sites. Although aurorae were detected somewhere on an image for nearly 45 per cent of the time, the strongest auroral emission lines can be filtered out with customized filters.

preprint, 2021, arXiv

Sampling Subgraph Network with Application to Graph Classification

Graphs are naturally used to describe the structures of various real-world systems in biology, society, computer science etc., where subgraphs or motifs as basic blocks play an important role in function expression and information processing. However, existing research focuses on the basic statistics of certain motifs, largely ignoring the connection patterns among them. Recently, a subgraph network (SGN) model was proposed to study the potential structure among motifs, and it was found that the integration of SGN can enhance a series of graph classification methods. However, the SGN model lacks diversity and has quite high time complexity, making it difficult to apply widely in practice. In this paper, we introduce sampling strategies into SGN, and design a novel sampling subgraph network model, which is scale-controllable and of higher diversity. We also present a hierarchical feature fusion framework to integrate the structural features of diverse sampling SGNs, so as to improve the performance of graph classification. Extensive experiments demonstrate that, compared with the SGN model, our new model indeed has much lower time complexity (reduced by two orders of magnitude) and can better enhance a series of graph classification methods (doubling the performance enhancement).

preprint, 2021, arXiv

Towards Natural and Controllable Cross-Lingual Voice Conversion Based on Neural TTS Model and Phonetic Posteriorgram

Cross-lingual voice conversion (VC) is an important and challenging problem due to significant mismatches of the phonetic set and the speech prosody of different languages. In this paper, we build upon the neural text-to-speech (TTS) model, i.e., FastSpeech, and LPCNet neural vocoder to design a new cross-lingual VC framework named FastSpeech-VC. We address the mismatches of the phonetic set and the speech prosody by applying Phonetic PosteriorGrams (PPGs), which have been shown to bridge across speaker and language boundaries. Moreover, we add normalized logarithm-scale fundamental frequency (Log-F0) to further compensate for the prosodic mismatches and significantly improve naturalness. Our experiments on English and Mandarin languages demonstrate that with only a monolingual corpus, the proposed FastSpeech-VC can achieve high-quality converted speech with mean opinion score (MOS) close to the professional recordings while maintaining good speaker similarity. Compared to the baselines using Tacotron2 and Transformer TTS models, the FastSpeech-VC can achieve controllable converted speech rate and much faster inference speed. More importantly, the FastSpeech-VC can easily be adapted to a speaker with limited training utterances.

preprint, 2020, arXiv

AstroCatR: a Mechanism and Tool for Efficient Time Series Reconstruction of Large-Scale Astronomical Catalogues

Time series data of celestial objects are commonly used to study valuable and unexpected objects such as extrasolar planets and supernovae in time domain astronomy. Due to the rapid growth of data volume, traditional manual methods are becoming extremely hard and infeasible for continuously analyzing accumulated observation data. To meet such demands, we designed and implemented a special tool named AstroCatR that can efficiently and flexibly reconstruct time series data from large-scale astronomical catalogues. AstroCatR can load original catalogue data from Flexible Image Transport System (FITS) files or databases, match each item to determine which object it belongs to, and finally produce time series datasets. To support the high-performance parallel processing of large-scale datasets, AstroCatR uses the extract-transform-load (ETL) preprocessing module to create sky zone files and balance the workload. The matching module uses the overlapped indexing method and an in-memory reference table to improve accuracy and performance. The output of AstroCatR can be stored in CSV files or be transformed into other formats as needed. In addition, the module-based software architecture ensures the flexibility and scalability of AstroCatR. We evaluated AstroCatR with actual observation data from the three Antarctic Survey Telescopes (AST3). The experiments demonstrate that AstroCatR can efficiently and flexibly reconstruct all time series data by setting relevant parameters and configuration files. Furthermore, the tool is approximately 3X faster than methods using relational database management systems at matching massive catalogues.

preprint, 2020, arXiv

Automation of the AST3 optical sky survey from Dome A, Antarctica

The 0.5 m Antarctic Survey Telescopes (AST3) were designed for time-domain optical/infrared astronomy. They are located in Dome A, Antarctica, where they can take advantage of the continuous dark time during winter. Since the site is unattended in winter, everything for the operation, from observing to data reduction, had to be fully automated. Here, we present a brief overview of the AST3 project and some of its unique characteristics due to its location in Antarctica. We summarise the various components of the survey, including the customized hardware and software, that make complete automation possible.

preprint, 2020, arXiv

Leveraging Text Data Using Hybrid Transformer-LSTM Based End-to-End ASR in Transfer Learning

In this work, we study leveraging extra text data to improve low-resource end-to-end ASR under a cross-lingual transfer learning setting. To this end, we extend our prior work [1], and propose a hybrid Transformer-LSTM based architecture. This architecture not only takes advantage of the highly effective encoding capacity of the Transformer network but also benefits from extra text data due to the LSTM-based independent language model network. We conduct experiments on our in-house Malay corpus which contains limited labeled data and a large amount of extra text. Results show that the proposed architecture outperforms the previous LSTM-based architecture [1] by 24.2% relative word error rate (WER) when both are trained using limited labeled data. Starting from this, we obtain a further 25.4% relative WER reduction by transfer learning from another resource-rich language. Moreover, we obtain an additional 13.6% relative WER reduction by boosting the LSTM decoder of the transferred model with the extra text data. Overall, our best model outperforms the vanilla Transformer ASR by 11.9% relative WER. Last but not least, the proposed hybrid architecture offers much faster inference compared to both LSTM and Transformer architectures.

preprint, 2020, arXiv

Night-time measurements of astronomical seeing at Dome A in Antarctica

Seeing, the angular size of stellar images blurred by atmospheric turbulence, is a critical parameter used to assess the quality of astronomical sites. Median values at the best mid-latitude sites are generally in the range of 0.6–0.8 arcsec. Sites on the Antarctic plateau are characterized by comparatively weak turbulence in the free atmosphere above a strong but thin boundary layer. The median seeing at Dome C is estimated to be 0.23–0.36 arcsec above a boundary layer that has a typical height of 30 m. At Dome A and F, the only previous seeing measurements were made during daytime. Here we report the first direct measurements of night-time seeing at Dome A, using a Differential Image Motion Monitor. Located at a height of just 8 m, it recorded seeing as low as 0.13 arcsec, and provided seeing statistics that are comparable to those for a 20 m height at Dome C. It indicates that the boundary layer was below 8 m 31% of the time. At such times the median seeing was 0.31 arcsec, consistent with free-atmosphere seeing. The seeing and boundary layer thickness are found to be strongly correlated with the near-surface temperature gradient. The correlation confirms a median thickness of approximately 14 m for the boundary layer at Dome A, as found from a sonic radar. The thinner boundary layer makes it less challenging to locate a telescope above it, thereby giving greater access to the free atmosphere.

preprint, 2020, arXiv

Optical turbulence at Ali, China -- Results from the first year of lunar scintillometer observations

The location of an astronomical observatory is a key factor that affects its scientific productivity. The best astronomical sites are generally those found at high altitudes. Several such sites in western China and the Tibetan plateau are presently under development for astronomy. One of these is Ali, which at over 5000 m is one of the highest astronomical sites in the world. In order to further investigate the astronomical potential of Ali, we have installed a lunar scintillometer, for the primary purpose of profiling atmospheric turbulence. This paper describes the instrument and technique, and reports results from the first year of observations. We find that ground layer (GL) turbulence at Ali is remarkably weak and relatively thin. The median seeing from turbulence in the range 11–500 m above ground is 0.34 arcsec, with seeing better than 0.26 arcsec occurring 25 per cent of the time. Under median conditions, half of the GL turbulence lies below a height of 62 m. These initial results, and the high altitude and relatively low temperatures, suggest that Ali could prove to be an outstanding site for ground-based astronomy.

preprint, 2019, arXiv

Constrained Output Embeddings for End-to-End Code-Switching Speech Recognition with Only Monolingual Data

The lack of code-switch training data is one of the major concerns in the development of end-to-end code-switching automatic speech recognition (ASR) models. In this work, we propose a method to train an improved end-to-end code-switching ASR using only monolingual data. Our method encourages the distributions of output token embeddings of monolingual languages to be similar, and hence helps the ASR model to code-switch easily between languages. Specifically, we propose to use Jensen-Shannon divergence and cosine distance based constraints. The former enforces output embeddings of monolingual languages to possess similar distributions, while the latter simply brings the centroids of the two distributions close to each other. Experimental results demonstrate high effectiveness of the proposed method, yielding up to 4.5% absolute mixed error rate improvement on a Mandarin-English code-switching ASR task.

preprint, 2016, arXiv

Fantastic 4 system for NIST 2015 Language Recognition Evaluation

This article describes the systems jointly submitted by the Institute for Infocomm Research (I$^2$R), the Laboratoire d'Informatique de l'Université du Maine (LIUM), Nanyang Technological University (NTU) and the University of Eastern Finland (UEF) for the 2015 NIST Language Recognition Evaluation (LRE). The submitted system is a fusion of nine sub-systems based on i-vectors extracted from different types of features. Given the i-vectors, several classifiers are adopted for the language detection task including support vector machines (SVM), multi-class logistic regression (MCLR), Probabilistic Linear Discriminant Analysis (PLDA) and Deep Neural Networks (DNN).

preprint, 2014, arXiv

A new method of CCD dark current correction via extracting the dark information from scientific images

We have developed a new method to correct dark current at relatively high temperatures for Charge-Coupled Device (CCD) images when dark frames cannot be obtained on the telescope. For images taken with the Antarctic Survey Telescopes (AST3) in 2012, due to the low cooling efficiency, the median CCD temperature was -46$^\circ$C, resulting in a high dark current level of about 3$e^-$/pix/sec, even comparable to the sky brightness (10$e^-$/pix/sec). If not corrected, the nonuniformity of the dark current could even outweigh the photon noise of the sky background. However, dark frames could not be obtained during the observing season because the camera was operated in frame-transfer mode without a shutter, and the telescope was unattended in winter. Here we present an alternative, simple and effective method to derive the dark current frame from the scientific images. We can then scale this dark frame to the temperature at which the scientific images were taken, and apply the dark frame corrections to the scientific images. We have applied this method to the AST3 data, and demonstrated that it can reduce the noise to a level roughly as low as the photon noise of the sky brightness, solving the high noise problem and improving the photometric precision. This method will also be helpful for other projects that suffer from similar issues.

preprint, 2014, arXiv

Meteorological data for the astronomical site at Dome A, Antarctica

We present an analysis of the meteorological data collected at Dome A, Antarctica by the Kunlun Automated Weather Station, including temperatures and wind speeds at eight elevations above the snow surface between 0 m and 14.5 m. The average temperatures at 2 m and 14.5 m are $-54^{\circ}$C and $-46^{\circ}$C, respectively. We find that a strong temperature inversion existed at all heights for more than 70% of the time, and the temperature inversion typically lasts longer than 25 hours, indicating an extremely stable atmosphere. The temperature gradient is larger at lower elevations than at higher elevations. The average wind speed was 1.5 m/s at 4 m elevation. We find that the temperature inversion is stronger when the wind speed is lower and the temperature gradient decreases sharply at a specific wind speed for each elevation. The strong temperature inversion and low wind speed result in a shallow and stable boundary layer with weak atmospheric turbulence above it, suggesting that Dome A should be an excellent site for astronomical observations. All the data from the weather station are available for download.

preprint, 2014, arXiv

Problems with twilight/supersky flat-field for wide-field robotic telescopes and the solution

Twilight/night sky images are often used for flat-fielding CCD images, but the brightness gradient in the twilight/night sky causes problems for accurate flat-field correction in astronomical images for wide-field telescopes. Using data from the Antarctic Survey Telescope (AST3), we found that when the sky brightness gradient is minimum and stable, there is still a gradient of 1% across AST3's field-of-view of 4.3 square degrees. We tested various approaches to remove the varying gradients in individual flat-field images. Our final optimal method can reduce the spatially dependent errors caused by the gradient to a negligible level. We also suggest a guideline for flat-fielding using twilight/night sky images for wide-field robotic autonomous telescopes.

preprint, 2014, arXiv

The nonlinear photon transfer curve of CCDs and its effects on photometry

The photon transfer curve (PTC, variance vs. signal level) is a commonly used and effective tool in characterizing CCD performance. It is theoretically linear in the range where photon shot noise dominates, and its slope is utilized to derive the gain of the CCD. However, recent studies of different CCDs have revealed that the variance progressively drops at high signal levels, while the linearity shown by signal versus exposure time is still excellent and unaffected. On the other hand, bright stars are found to exhibit a fatter point spread function (PSF). Both the nonlinear PTC and the brighter-fatter effect are regarded as the result of spreading of charges between pixels, an interaction process that increases with signal level. In this work we investigate the nonlinear PTC based on images taken with a STA1600FT CCD camera, whose PTC starts to become nonlinear at about 1/3 full well. To explain the phenomenon, we present a model to characterize the charge-sharing PSF. This signal-dependent PSF can be derived from flat-field frames, and allows us to quantify the effects on photometry and measured shape of stars. This effect is especially critical for projects requiring accurate photometry and shape parameters.
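The statement that the PTC slope gives the gain can be illustrated with a minimal sketch. It assumes the standard shot-noise relation, variance (ADU²) = signal (ADU) / gain + constant, with gain in e⁻/ADU, so the gain is the reciprocal of the fitted slope. The function names and the sample numbers are hypothetical and are not taken from the paper. The sketch covers only the linear regime and does not model the nonlinear behaviour the abstract investigates.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def gain_from_ptc(signal_adu, variance_adu2):
    """Estimate the gain (e-/ADU) from the linear part of a PTC.

    Assumes variance_ADU = signal_ADU / gain + const in the
    shot-noise regime, so gain = 1 / slope.
    """
    slope, _ = fit_line(signal_adu, variance_adu2)
    return 1.0 / slope


# Hypothetical, noise-free data: gain 1.6 e-/ADU, read-noise variance 9 ADU^2.
true_gain = 1.6
signal = [1000.0, 2000.0, 5000.0, 10000.0, 15000.0]
variance = [s / true_gain + 9.0 for s in signal]
estimated_gain = gain_from_ptc(signal, variance)
```

With real flat-field pairs, the fit would have to be restricted to signal levels below the point where the variance starts to fall off. The abstract places that point at about 1/3 full well for the STA1600FT camera.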

preprint, 2011, arXiv

On the fairness of the main galaxy sample of SDSS

Flux-limited and volume-limited galaxy samples are constructed from SDSS data releases DR4, DR6 and DR7 for statistical analysis. The two-point correlation functions $ξ(s)$, monopole of three-point correlation functions $ζ_0$, projected two-point correlation function $w_p$ and pairwise velocity dispersion $σ_{12}$ are measured to test if galaxy samples are fair for these statistics. We find that as the sky coverage of SDSS increases, $ξ(s)$ of the flux-limited sample is extremely robust and insensitive to local structures at low redshift. But for volume-limited samples fainter than $L^*$ at large scales $s \gtrsim 10\,h^{-1}\,\mathrm{Mpc}$, the deviation of $ξ(s)$ and $ζ_0$ of DR7 from those of DR4 and DR6 increases with larger absolute magnitude. In the weakly nonlinear regime, there is no agreement between $ζ_0$ of different data releases in all luminosity bins. Furthermore, $w_p$ of volume-limited samples of DR7 in luminosity bins fainter than $-M_{r,0.1}=[18.5,19.5]$ is significantly larger, and $σ_{12}$ of the two faintest volume-limited samples of DR7 displays a very different scale dependence from the results of DR4 and DR6. Our findings call for caution in interpreting clustering analysis results of SDSS faint galaxy samples, and higher order statistics of SDSS volume-limited samples in the weakly nonlinear regime. The first zero-crossing points of $ξ(s)$ of volume-limited samples are also investigated and discussed.

preprint, 2009, arXiv

Clustering of K-band selected local galaxies

We present a detailed clustering analysis of a large K-band selected local galaxy sample, which is constructed from the 2MASS and the SDSS and consists of $82,486$ galaxies with $10 < K < 13.5$ and $0.01 < z < 0.1$. The two-point correlation function of the magnitude-limited sample in real space at small scales is well described by a power law $ξ(r)=(r/6.44\pm0.23)^{-1.81\pm0.02}$. The pairwise velocity dispersion is derived from the anisotropic two-point correlation function, and we find $σ_{12}=685\pm 17\,\mathrm{km\,s^{-1}}$ if its scale invariance is assumed, which is larger than values measured in galaxy samples selected in optical bands. We further investigate the dependence of the two-point correlation function and $σ_{12}$ on the $g-r$ color and the $K$-band luminosity, and obtain results similar to previous works in optical bands. Comparing a mock galaxy sample with our real data indicates that the semi-analytical model cannot reproduce the observed $σ_{12}$, although it can approximate the two-point correlation function within measurement uncertainties.
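The power-law fit quoted above can be evaluated directly. The sketch below uses only the central values from the abstract (correlation length 6.44, slope 1.81) and ignores the quoted uncertainties. The abstract as reproduced here does not state the length unit, so treating it as h⁻¹ Mpc is an assumption. The function name is hypothetical.

```python
R0 = 6.44     # correlation length from the abstract (unit assumed to be h^-1 Mpc)
GAMMA = 1.81  # power-law slope from the abstract


def xi_power_law(r, r0=R0, gamma=GAMMA):
    """Evaluate the power-law model xi(r) = (r / r0) ** (-gamma)."""
    if r <= 0:
        raise ValueError("separation r must be positive")
    return (r / r0) ** (-gamma)


# By construction the model equals 1 at r = r0 and falls off at larger r.
xi_at_r0 = xi_power_law(R0)
xi_small = xi_power_law(1.0)
xi_large = xi_power_law(10.0)
```

The abstract states that the power law describes the measurement at small scales, so values at large separations should not be read as a prediction of the measured correlation function.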

preprint, 2004, arXiv

The similarity metric

A new class of distances appropriate for measuring similarity relations between sequences, say one type of similarity per distance, is studied. We propose a new "normalized information distance", based on the noncomputable notion of Kolmogorov complexity, and show that it is in this class and it minorizes every computable distance in the class (that is, it is universal in that it discovers all computable similarities). We demonstrate that it is a metric and call it the similarity metric. This theory forms the foundation for a new practical tool. To demonstrate generality and robustness we give two distinctive applications in widely divergent areas using standard compression programs like gzip and GenCompress. First, we compare whole mitochondrial genomes and infer their evolutionary history. This results in a first completely automatically computed whole mitochondrial phylogeny tree. Secondly, we fully automatically compute the language tree of 52 different languages.
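Because Kolmogorov complexity is noncomputable, practical use replaces it with the output length of a real compressor. The sketch below shows the widely used compression-based approximation, often called the normalized compression distance: NCD(x, y) = (C(xy) − min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed size. This formula is not quoted in the abstract. Python's `zlib` (DEFLATE, the algorithm behind gzip) stands in for the compressors the abstract names. The sample inputs are made up for illustration.

```python
import hashlib
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (level 9)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Compression-based approximation of the normalized information distance."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Two near-identical texts and one block of deterministic pseudo-random bytes.
text_a = b"the quick brown fox jumps over the lazy dog " * 20
text_b = b"the quick brown fox jumps over the lazy cat " * 20
noise = b"".join(hashlib.sha256(bytes([i])).digest() for i in range(30))

d_similar = ncd(text_a, text_b)
d_dissimilar = ncd(text_a, noise)
```

Similar inputs compress well together, so `d_similar` comes out smaller than `d_dissimilar`. The exact values depend on the compressor, and zlib's 32 KB window limits this sketch to short inputs.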
