Source author record

Jun Suzuki

Jun Suzuki appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

quant-ph, Computation and Language, Machine Learning, cond-mat.quant-gas, gr-qc

Catalog footprint

What is connected

29 works

5 topics

4 close collaborators

preprint · 2022 · arXiv

Balancing Cost and Quality: An Exploration of Human-in-the-loop Frameworks for Automated Short Answer Scoring

Short answer scoring (SAS) is the task of grading short text written by a learner. In recent years, deep-learning-based approaches have substantially improved the performance of SAS models, but how to guarantee high-quality predictions still remains a critical issue when applying such models to the education field. Towards guaranteeing high-quality predictions, we present the first study of exploring the use of human-in-the-loop framework for minimizing the grading cost while guaranteeing the grading quality by allowing a SAS model to share the grading task with a human grader. Specifically, by introducing a confidence estimation method for indicating the reliability of the model predictions, one can guarantee the scoring quality by utilizing only predictions with high reliability for the scoring results and casting predictions with low reliability to human graders. In our experiments, we investigate the feasibility of the proposed framework using multiple confidence estimation methods and multiple SAS datasets. We find that our human-in-the-loop framework allows automatic scoring models and human graders to achieve the target scoring quality.
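The confidence-based routing described in this abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `route_predictions`, the threshold value, and the sample data are all hypothetical.

```python
from typing import List, Tuple

def route_predictions(
    predictions: List[Tuple[str, int, float]],
    threshold: float,
) -> Tuple[List[Tuple[str, int]], List[str]]:
    """Split (answer_id, predicted_score, confidence) triples by confidence.

    Predictions at or above the threshold are accepted automatically;
    the rest are handed to a human grader. All names are illustrative.
    """
    auto_scored = []  # (answer_id, score) pairs accepted from the model
    to_human = []     # answer ids that a human grader should score
    for answer_id, score, confidence in predictions:
        if confidence >= threshold:
            auto_scored.append((answer_id, score))
        else:
            to_human.append(answer_id)
    return auto_scored, to_human

# Hypothetical model outputs: (answer id, predicted score, confidence).
sample = [("a1", 3, 0.95), ("a2", 1, 0.40), ("a3", 2, 0.80)]
auto_scored, to_human = route_predictions(sample, threshold=0.8)
```

Raising the threshold sends more answers to human graders, which trades higher grading cost for higher scoring quality.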

preprint · 2022 · arXiv

Diverse Lottery Tickets Boost Ensemble from a Single Pretrained Model

Ensembling is a popular method used to improve performance as a last resort. However, ensembling multiple models finetuned from a single pretrained model has not been very effective; this could be due to the lack of diversity among ensemble members. This paper proposes Multi-Ticket Ensemble, which finetunes different subnetworks of a single pretrained model and ensembles them. We empirically demonstrated that winning-ticket subnetworks produced more diverse predictions than dense networks, and their ensemble outperformed the standard ensemble on some tasks.

preprint · 2022 · arXiv

JParaCrawl v3.0: A Large-scale English-Japanese Parallel Corpus

Most current machine translation models are mainly trained with parallel corpora, and their translation accuracy largely depends on the quality and quantity of the corpora. Although there are billions of parallel sentences for a few language pairs, effectively dealing with most language pairs is difficult due to a lack of publicly available parallel corpora. This paper creates a large parallel corpus for English-Japanese, a language pair for which only limited resources are available, compared to such resource-rich languages as English-German. It introduces a new web-based English-Japanese parallel corpus named JParaCrawl v3.0. Our new corpus contains more than 21 million unique parallel sentence pairs, which is more than twice as many as the previous JParaCrawl v2.0 corpus. Through experiments, we empirically show how our new corpus boosts the accuracy of machine translation models on various domains. The JParaCrawl v3.0 corpus will eventually be publicly available online for research purposes.

preprint · 2022 · arXiv

N-best Response-based Analysis of Contradiction-awareness in Neural Response Generation Models

Avoiding the generation of responses that contradict the preceding context is a significant challenge in dialogue response generation. One feasible method is post-processing, such as filtering out contradicting responses from a resulting n-best response list. In this scenario, the quality of the n-best list considerably affects the occurrence of contradictions because the final response is chosen from this n-best list. This study quantitatively analyzes the contextual contradiction-awareness of neural response generation models using the consistency of the n-best lists. Particularly, we used polar questions as stimulus inputs for concise and quantitative analyses. Our tests illustrate the contradiction-awareness of recent neural response generation models and methodologies, followed by a discussion of their properties and limitations.

preprint · 2022 · arXiv

Towards Automated Document Revision: Grammatical Error Correction, Fluency Edits, and Beyond

Natural language processing technology has rapidly improved automated grammatical error correction tasks, and the community has begun to explore document-level revision as one of the next challenges. To go beyond sentence-level automated grammatical error correction to an NLP-based document-level revision assistant, there are two major obstacles: (1) there are few public corpora with document-level revisions annotated by professional editors, and (2) it is not feasible to elicit all possible references and evaluate the quality of revision with such references because there are infinite possibilities of revision. This paper tackles these challenges. First, we introduce a new document-revision corpus, TETRA, where professional editors revised academic papers sampled from the ACL anthology which contain few trivial grammatical errors, enabling us to focus more on document- and paragraph-level edits such as coherence and consistency. Second, we explore reference-less and interpretable methods for meta-evaluation that can detect quality improvements by document revision. We show the uniqueness of TETRA compared with existing document revision corpora and demonstrate that a fine-tuned pre-trained language model can discriminate the quality of documents after revision even when the difference is subtle. This promising result will encourage the community to further explore automated document revision models and metrics in the future.

preprint · 2021 · arXiv

An Investigation Between Schema Linking and Text-to-SQL Performance

Text-to-SQL is a crucial task toward developing methods for understanding natural language by computers. Recent neural approaches deliver excellent performance; however, models that are difficult to interpret inhibit future developments. Hence, this study aims to provide a better approach toward the interpretation of neural models. We hypothesize that the internal behavior of models at hand becomes much easier to analyze if we identify the detailed performance of schema linking simultaneously as the additional information of the text-to-SQL performance. We provide the ground-truth annotation of schema linking information onto the Spider dataset. We demonstrate the usefulness of the annotated data and how to analyze the current state-of-the-art neural models.

preprint · 2020 · arXiv

Direct estimation of minimum gate fidelity

With the current interest in building quantum computers, there is a strong need for accurate and efficient characterization of the noise in quantum gate implementations. A key measure of the performance of a quantum gate is the minimum gate fidelity, i.e., the fidelity of the gate, minimized over all input states. Conventionally, the minimum fidelity is estimated by first accurately reconstructing the full gate process matrix using the experimental procedure of quantum process tomography (QPT). Then, a numerical minimization is carried out to find the minimum fidelity. QPT is, however, well known to be costly, and it might appear that we can do better, if the goal is only to estimate one single number. In this work, we propose a hybrid numerical-experimental scheme that employs a numerical gradient-free minimization (GFM) and an experimental target-fidelity estimation procedure to directly estimate the minimum fidelity without reconstructing the process matrix. We compare this to an alternative scheme, referred to as QPT fidelity estimation, that does use QPT, but directly employs the minimum gate fidelity as the termination criterion. Both approaches can thus be considered as direct estimation schemes. General theoretical bounds suggest a significant resource savings for the GFM scheme over QPT fidelity estimation; numerical simulations for specific classes of noise, however, show that both schemes have similar performance, reminding us of the need for caution when using general bounds for specific examples. The GFM scheme, however, presents potential for future improvements in resource cost, with the development of even more efficient GFM algorithms.

preprint · 2020 · arXiv

Embeddings of Label Components for Sequence Labeling: A Case Study of Fine-grained Named Entity Recognition

In general, the labels used in sequence labeling consist of different types of elements. For example, IOB-format entity labels, such as B-Person and I-Person, can be decomposed into span (B and I) and type information (Person). However, while most sequence labeling models do not consider such label components, the shared components across labels, such as Person, can be beneficial for label prediction. In this work, we propose to integrate label component information as embeddings into models. Through experiments on English and Japanese fine-grained named entity recognition, we demonstrate that the proposed method improves performance, especially for instances with low-frequency labels.
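The label decomposition mentioned in this abstract (B-Person into span B and type Person) can be illustrated with a short sketch. The helper name `split_label` and the handling of the outside label `O` are assumptions made for illustration; the paper's embedding model itself is not reproduced here.

```python
from typing import Optional, Tuple

def split_label(label: str) -> Tuple[str, Optional[str]]:
    """Decompose an IOB label into its (span, type) components.

    "B-Person" -> ("B", "Person"); the outside label "O" carries no type.
    Only the first hyphen is treated as the separator, so types that
    themselves contain hyphens stay intact.
    """
    if label == "O":
        return "O", None
    span, _, entity_type = label.partition("-")
    return span, entity_type

# Labels that share the type component "Person" but differ in span.
components = [split_label(tag) for tag in ["B-Person", "I-Person", "O"]]
```

Here `B-Person` and `I-Person` share the `Person` component, which is the kind of overlap the abstract suggests can help with low-frequency labels.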

preprint · 2020 · arXiv

Encoder-Decoder Models Can Benefit from Pre-trained Masked Language Models in Grammatical Error Correction

This paper investigates how to effectively incorporate a pre-trained masked language model (MLM), such as BERT, into an encoder-decoder (EncDec) model for grammatical error correction (GEC). The answer to this question is not as straightforward as one might expect because the previous common methods for incorporating a MLM into an EncDec model have potential drawbacks when applied to GEC. For example, the distribution of the inputs to a GEC model can be considerably different (erroneous, clumsy, etc.) from that of the corpora used for pre-training MLMs; however, this issue is not addressed in the previous methods. Our experiments show that our proposed method, where we first fine-tune a MLM with a given GEC corpus and then use the output of the fine-tuned MLM as additional features in the GEC model, maximizes the benefit of the MLM. The best-performing model achieves state-of-the-art performances on the BEA-2019 and CoNLL-2014 benchmarks. Our code is publicly available at: https://github.com/kanekomasahiro/bert-gec.

preprint · 2020 · arXiv

Evaluating Dialogue Generation Systems via Response Selection

Existing automatic evaluation metrics for open-domain dialogue response generation systems correlate poorly with human evaluation. We focus on evaluating response generation systems via response selection. To evaluate systems properly via response selection, we propose a method to construct response selection test sets with well-chosen false candidates. Specifically, we propose to construct test sets by filtering out some types of false candidates: (i) those unrelated to the ground-truth response and (ii) those acceptable as appropriate responses. Through experiments, we demonstrate that evaluating systems via response selection with the test sets developed by our method correlates more strongly with human evaluation, compared with widely used automatic evaluation metrics such as BLEU.

preprint · 2020 · arXiv

Instance-Based Learning of Span Representations: A Case Study through Named Entity Recognition

Interpretable rationales for model predictions play a critical role in practical applications. In this study, we develop models with an interpretable inference process for structured prediction. Specifically, we present a method of instance-based learning that learns similarities between spans. At inference time, each span is assigned a class label based on its similar spans in the training set, where it is easy to understand how much each training instance contributes to the predictions. Through empirical analysis on named entity recognition, we demonstrate that our method enables building models that have high interpretability without sacrificing performance.

preprint · 2020 · arXiv

JParaCrawl: A Large Scale Web-Based English-Japanese Parallel Corpus

Recent machine translation algorithms mainly rely on parallel corpora. However, since the availability of parallel corpora remains limited, only some resource-rich language pairs can benefit from them. We constructed a parallel corpus for English-Japanese, for which the amount of publicly available parallel corpora is still limited. We constructed the parallel corpus by broadly crawling the web and automatically aligning parallel sentences. Our collected corpus, called JParaCrawl, amassed over 8.7 million sentence pairs. We show how it includes a broader range of domains and how a neural machine translation model trained with it works as a good pre-trained model for fine-tuning specific domains. The pre-training and fine-tuning approaches achieved or surpassed performance comparable to model training from the initial state and reduced the training time. Additionally, we trained the model with an in-domain dataset and JParaCrawl to show how we achieved the best performance with them. JParaCrawl and the pre-trained models are freely available online for research purposes.

preprint · 2020 · arXiv

Language Models as an Alternative Evaluator of Word Order Hypotheses: A Case Study in Japanese

We examine a methodology using neural language models (LMs) for analyzing the word order of language. This LM-based method has the potential to overcome the difficulties existing methods face, such as the propagation of preprocessor errors in count-based methods. In this study, we explore whether the LM-based method is valid for analyzing the word order. As a case study, this study focuses on Japanese due to its complex and flexible word order. To validate the LM-based method, we test (i) parallels between LMs and human word order preference, and (ii) consistency of the results obtained using the LM-based method with previous linguistic studies. Through our experiments, we tentatively conclude that LMs display sufficient word order knowledge for usage as an analysis tool. Finally, using the LM-based method, we demonstrate the relationship between the canonical word order and topicalization, which had yet to be analyzed by large-scale experiments.

preprint · 2020 · arXiv

Robust phase estimation of Gaussian states in the presence of outlier quantum states

In this paper, we investigate the problem of estimating the phase of a coherent state in the presence of unavoidable noisy quantum states. These unwarranted quantum states are represented by outlier quantum states in this study. We first present a statistical framework of robust statistics in a quantum system to handle outlier quantum states. We then apply the method of M-estimators to suppress untrusted measurement outcomes due to outlier quantum states. Our proposal has the advantage over the classical methods in being systematic, easy to implement, and robust against occurrence of noisy states.

preprint · 2020 · arXiv

Single Model Ensemble using Pseudo-Tags and Distinct Vectors

Model ensemble techniques often increase task performance in neural networks; however, they require increased time, memory, and management effort. In this study, we propose a novel method that replicates the effects of a model ensemble with a single model. Our approach creates K-virtual models within a single parameter space using K-distinct pseudo-tags and K-distinct vectors. Experiments on text classification and sequence labeling tasks on several datasets demonstrate that our method emulates or outperforms a traditional model ensemble with 1/K-times fewer parameters.

preprint · 2020 · arXiv

Uncertainty relation for the position of an electron in a uniform magnetic field from quantum estimation theory

We investigate the uncertainty relation for estimating the position of one electron in a uniform magnetic field in the framework of the quantum estimation theory. Two kinds of momenta, canonical one and mechanical one, are used to generate a shift in the position of the electron. We first consider pure state models whose wave function is in the ground state with zero angular momentum. The model generated by the two-commuting canonical momenta becomes the quasi-classical model, in which the symmetric logarithmic derivative quantum Cramér-Rao bound is achievable. The model generated by the two non-commuting mechanical momenta, on the other hand, turns out to be a Gaussian model, where the generalized right logarithmic derivative quantum Cramér-Rao bound is achievable. We next consider mixed-state models by taking into account the effects of thermal noise. The model with the canonical momenta now becomes genuine quantum mechanical, although its generators commute with each other. The derived uncertainty relationship is in general determined by two different quantum Cramér-Rao bounds in a non-trivial manner. The model with the mechanical momenta is identified with the well-known Gaussian shift model, and the uncertainty relation is governed by the right logarithmic derivative Cramér-Rao bound.

preprint · 2020 · arXiv

User-specified random sampling of quantum channels and its applications

Random samples of quantum channels have many applications in quantum information processing tasks. Due to the Choi-Jamiołkowski isomorphism, there is a well-known correspondence between channels and states, and one can imagine adapting state sampling methods to sample quantum channels. Here, we discuss such an adaptation, using the Hamiltonian Monte Carlo method, a well-known classical method capable of producing high quality samples from arbitrary, user-specified distributions. Its implementation requires an exact parameterization of the space of quantum channels, with no superfluous parameters and no constraints. We construct such a parameterization, and demonstrate its use in three common channel sampling applications.

preprint · 2019 · arXiv

Nuisance parameter problem in quantum estimation theory: General formulation and qubit examples

In this paper, we analyze quantum-state estimation problems when some of the parameters are of no interest to be estimated. In classical statistics, these irrelevant parameters are called nuisance parameters and this problem is of great importance in many practical applications of statistics. However, little is known regarding the effects of nuisance parameters in quantum-state estimation problems. The main contribution of this paper is first to formulate the nuisance parameter problem for the quantum-state estimation theory, then to propose a method of how to eliminate the nuisance parameters to obtain an estimation error bound for the parameters of interest. We also develop useful methods of dealing with the nuisance parameter problem in the quantum case and reveal the significant difference from the classical case. In particular, we clarify an intrinsic tradeoff relation to estimate the nuisance parameters and parameters of interest. The general qubit model is examined in detail to emphasize that we cannot ignore the effects of the nuisance parameters in general. Several examples in qubit systems are worked out to illustrate our findings.

preprint · 2016 · arXiv

Explicit formula for the Holevo bound for two-parameter qubit estimation problem

The main contribution of this paper is to derive an explicit expression for the fundamental precision bound, the Holevo bound, for estimating any two-parameter family of qubit mixed-states in terms of quantum versions of Fisher information. The obtained formula depends solely on the symmetric logarithmic derivative (SLD), the right logarithmic derivative (RLD) Fisher information, and a given weight matrix. This result immediately provides necessary and sufficient conditions for two important classes of quantum statistical models: those in which the Holevo bound coincides with the SLD Cramér-Rao bound and those in which it coincides with the RLD Cramér-Rao bound. One of the important results of this paper is that a general model other than these two special cases exhibits an unexpected property: the structure of the Holevo bound changes smoothly when the weight matrix varies. In particular, it always coincides with the RLD Cramér-Rao bound for a certain choice of the weight matrix. Several examples illustrate these findings.

preprint · 2015 · arXiv

Entanglement detection from channel parameter estimation problem

We derive a general criterion to detect entangled states in multi-partite systems based on the symmetric logarithmic derivative quantum Fisher information. This criterion is a direct consequence of the fact that separable states do not improve the accuracy upon estimating one-parameter family of quantum channels. Our result is a generalization of the previously known criterion for one-parameter unitary channel to any one-parameter quantum channel. We discuss several examples to illustrate our criterion. The proposed criterion is extended to the case of open quantum systems and we briefly discuss how to detect entangled states in the presence of decoherence.

preprint · 2015 · arXiv

Parameter estimation of qubit states with unknown phase parameter

We discuss a problem of parameter estimation for a quantum two-level system (a qubit system) in the presence of an unknown phase parameter. We analyze trade-off relations for mean-square errors when estimating relevant parameters with separable measurements based on known precision bounds: the symmetric logarithmic derivative Cramér-Rao bound and the Hayashi-Gill-Massar (HGM) bound. We investigate the optimal measurement which attains the HGM bound and discuss its properties. We show that the HGM bound for relevant parameters can be attained asymptotically by using some fraction of the given n quantum states to estimate the phase parameter. We also discuss the Holevo bound, which can be attained asymptotically by a collective measurement.

preprint · 2014 · arXiv

Qubit subalgebra and tensor product in Weyl algebra of angular momentum system

We analyze the Weyl algebra of a quantum angular momentum system and construct a qubit subalgebra out of it. We show that the commutant of this qubit subalgebra is isomorphic to the original algebra and prove the tensor product structure between the qubit subalgebra and its commutant. This construction can be iterated to construct an arbitrary number of qubit subalgebras from a single quantum system. We show a simple experimental realization of the proposed scheme using the orbital angular momentum of single photons. We briefly discuss the construction of qudit subalgebras and the generalization to other infinite-dimensional systems.

preprint · 2013 · arXiv

Creation of excitations from a uniform impurity motion in the condensate

We investigate the creation of excitations in a homogeneous Bose-Einstein condensate due to an impurity moving with a constant velocity. A simple model is considered to take into account dynamical effects due to the motion of the impurity. Based on this model, we show that a finite number of excitations can be created even if the velocity of the impurity is below Landau's critical velocity. We also show that the total number of excitations scales differently at large times across the speed of sound. Thus, our result dictates the critical behavior across Landau's critical velocity and validates Landau's intuition about the problem. We discuss how Landau's critical velocity emerges and its validity within our model.

preprint · 2012 · arXiv

Symmetric coupling of four spin-1/2 systems

We address the non-binary coupling of identical angular momenta based upon the representation theory for the symmetric group. A correspondence is pointed out between the complete set of commuting operators and the reference-frame-free subsystems. We provide a detailed analysis of the coupling of three and four spin-1/2 systems and discuss a symmetric coupling of four spin-1/2 systems.

preprint · 2010 · arXiv

Encoding many qubits in a rotor

We propose a scheme for encoding many qubits in a single rotor, that is, a continuous and periodic degree of freedom. A key feature of this scheme is its ability to manipulate and entangle the encoded qubits with a single operation on the system. We also show, using quantum error-correcting codes, how to protect the qubits against small errors in angular position and momentum which may affect the rotor. We then discuss the feasibility of this scheme and suggest several candidates for its implementation. The proposed scheme is immediately generalizable to qudits of any finite dimension.

preprint · 2010 · arXiv

Entanglement detection from interference fringes in atom-photon systems

A measurement scheme of atomic qubits pinned at given positions is studied by analyzing the interference pattern obtained when they emit photons spontaneously. In the case of two qubits, a well-known relation is revisited, in which the interference visibility is equal to the concurrence of the state in the infinite spatial separation limit of the qubits. By taking into account the super-radiant and sub-radiant effects, it is shown that a state tomography is possible when the qubit spatial separation is comparable to the wavelength of the atomic transition. In the case of three qubits, the relations between various entanglement measures and the interference visibility are studied, where the visibility is defined from the two-qubit case. A qualitative correspondence among these entanglement relations is discussed. In particular, it is shown that the interference visibility is directly related to the maximal bipartite negativity.

preprint · 2010 · arXiv

Radiation from accelerated impurities in Bose-Einstein condensate

We investigate radiation spectra arising from accelerated point-like impurities in the homogeneous Bose-Einstein condensate. A general formula for the radiation spectrum is obtained in the integral form as a function of given impurity trajectory. The Planckian spectrum is obtained for a special accelerated motion, which is shown to be unphysical. Non-Planckian spectrum is found in the case of a uniformly accelerated impurity. We compare our result with similar settings as discussed in other quantum many-body systems.

preprint · 2006 · arXiv

Raw-data attacks in quantum cryptography with partial tomography

We consider a variant of the BB84 protocol for quantum cryptography, the prototype of tomographically incomplete protocols, where the key is generated by one-way communication rather than the usual two-way communication. Our analysis, backed by numerical evidence, establishes thresholds for eavesdropping attacks on the raw data and on the generated key at quantum bit error rates of 10% and 6.15%, respectively. Both thresholds are lower than the threshold for unconditional security in the standard BB84 protocol.

preprint · 2005 · arXiv

Motion of Classical Impurities in the Homogeneous Bose-Einstein Condensate

Motion of classical point-like impurities in the homogeneous Einstein condensate of bosons is studied in the framework of the second quantization method. A toy model is proposed and its general solution within the Bogoliubov approximation is obtained. The effective Minkowski space-time structure arises naturally in this non-relativistic quantum many-body system in the low energy regime. This is shown to be true in this model. Several examples are discussed in order to illustrate our model. The homogeneous condensate produces an effective Yukawa type attractive force between impurities sitting in the condensate. Landau's criterion is naturally derived in the case of linear motion of an impurity. The analytic expressions for spectra of Bogoliubov excitations produced by the accelerated motions of impurities are obtained. A quick look at the analytic expression reveals that the spectrum of gapless excitations emitted by the linearly accelerated impurity is not thermal. If the homogeneous condensate is the physically correct model for Minkowski space-time then it follows that the apparent thermal response of the simple linearly accelerated detector models may be the result of improper regularization.
