Source author record

Takuya Higuchi

Takuya Higuchi appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

cond-mat.mtrl-sci, physics.optics, eess.AS, Sound, cond-mat.mes-hall, Machine Learning, Artificial Intelligence, physics.app-ph, physics.atom-ph

Catalog footprint

What is connected

9 works

9 topics

4 close collaborators

preprint 2022 arXiv

Improving Voice Trigger Detection with Metric Learning

Voice trigger detection is an important task, which enables activating a voice assistant when a target user speaks a keyword phrase. A detector is typically trained on speech data independent of speaker information and used for the voice trigger detection task. However, such a speaker-independent voice trigger detector typically suffers from performance degradation on speech from underrepresented groups, such as accented speakers. In this work, we propose a novel voice trigger detector that can use a small number of utterances from a target speaker to improve detection accuracy. Our proposed model employs an encoder-decoder architecture. While the encoder performs speaker-independent voice trigger detection, similar to the conventional detector, the decoder predicts a personalized embedding for each utterance. A personalized voice trigger score is then obtained as a similarity score between the embeddings of enrollment utterances and a test utterance. The personalized embedding allows adapting to the target speaker's speech when computing the voice trigger score, hence improving voice trigger detection accuracy. Experimental results show that the proposed approach achieves a 38% relative reduction in false rejection rate (FRR) compared to a baseline speaker-independent voice trigger model.
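The scoring step described in this abstract (a similarity between enrollment embeddings and a test embedding) can be illustrated with a minimal sketch. The abstract does not say which similarity measure or aggregation the authors use, so cosine similarity and a plain average over enrollment utterances are assumptions here, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / (norm_a * norm_b)


def personalized_score(enrollment_embeddings, test_embedding):
    """Average similarity of a test embedding to all enrollment embeddings.

    Assumption: the mean is used to aggregate; the source only says that a
    similarity score between enrollment and test embeddings is computed.
    """
    if not enrollment_embeddings:
        raise ValueError("at least one enrollment embedding is required")
    sims = [cosine_similarity(e, test_embedding) for e in enrollment_embeddings]
    return sum(sims) / len(sims)
```

A test utterance whose embedding points in the same direction as the enrollment embeddings scores close to 1, while an orthogonal embedding scores close to 0. A detector would compare this score against a threshold.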

preprint 2021 arXiv

Dynamic curriculum learning via data parameters for noise robust keyword spotting

We propose dynamic curriculum learning via data parameters for noise robust keyword spotting. Data parameter learning has recently been introduced for image processing, where weight parameters, so-called data parameters, for target classes and instances are introduced and optimized along with model parameters. The data parameters scale logits and control importance over classes and instances during training, which enables automatic curriculum learning without additional annotations for training data. Similarly, in this paper, we propose using this curriculum learning approach for acoustic modeling, and train an acoustic model on clean and noisy utterances with the data parameters. The proposed approach automatically learns the difficulty of the classes and instances, e.g., due to a low signal-to-noise ratio (SNR), in the gradient descent optimization and performs curriculum learning. This curriculum learning leads to overall improvement of the accuracy of the acoustic model. We evaluate the effectiveness of the proposed approach on a keyword spotting task. Experimental results show a 7.7% relative reduction in false reject ratio with the data parameters compared to a baseline model which is simply trained on the multiconditioned dataset.
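The idea that data parameters scale logits can be sketched as a temperature applied before the softmax cross-entropy. This is only an illustration: dividing the logits by the sum of a class parameter and an instance parameter is an assumption based on the general data-parameter formulation, not a detail given in this abstract, and the function name is hypothetical.

```python
import math


def scaled_cross_entropy(logits, target, class_param, instance_param):
    """Cross-entropy loss with logits divided by a learned temperature.

    Assumption: the temperature is class_param + instance_param and must be
    positive. In training, both parameters would be updated by gradient
    descent together with the model weights.
    """
    temperature = class_param + instance_param
    if temperature <= 0.0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    # Numerically stable log-sum-exp for the softmax normalizer.
    m = max(scaled)
    log_norm = m + math.log(sum(math.exp(z - m) for z in scaled))
    return log_norm - scaled[target]
```

For a misclassified (hard) sample, a larger temperature flattens the softmax and lowers the loss, so the optimizer can reduce the influence of that sample early in training. This is one way to read the automatic curriculum described above.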

preprint 2020 arXiv

Attosecond-fast internal photoemission

The photoelectric effect has a sister process relevant in optoelectronics called internal photoemission. Here, an electron is photoemitted from a metal into a semiconductor. While the photoelectric effect takes place within less than 100 attoseconds, the attosecond time scale has so far not been measured for internal photoemission. Based on the new method CHArge transfer time MEasurement via Laser pulse duration-dependent saturation fluEnce determinatiON, CHAMELEON, we show that the atomically thin semimetal graphene coupled to bulk silicon carbide, forming a Schottky junction, allows charge transfer times as fast as (300 $\pm$ 200) attoseconds. These results are supported by a simple quantum mechanical model simulation. With the obtained cut-off bandwidth of 3.3 PHz for the charge transfer rate, this semimetal-semiconductor interface represents the first functional solid-state interface offering the speed and design space required for future light-wave signal processing.

preprint 2020 arXiv

Stacked 1D convolutional networks for end-to-end small footprint voice trigger detection

We propose a stacked 1D convolutional neural network (S1DCNN) for end-to-end small footprint voice trigger detection in a streaming scenario. Voice trigger detection is an important speech application, with which users can activate their devices by simply saying a keyword or phrase. Due to privacy and latency reasons, a voice trigger detection system should run on an always-on processor on device. Therefore, having small memory and compute cost is crucial for a voice trigger detection system. Recently, singular value decomposition filters (SVDFs) have been used for end-to-end voice trigger detection. The SVDFs approximate a fully-connected layer with a low rank approximation, which reduces the number of model parameters. In this work, we propose S1DCNN as an alternative approach for end-to-end small-footprint voice trigger detection. An S1DCNN layer consists of a 1D convolution layer followed by a depth-wise 1D convolution layer. We show that the SVDF can be expressed as a special case of the S1DCNN layer. Experimental results show that the S1DCNN achieves 19.0% relative false reject ratio (FRR) reduction with a similar model size and a similar time delay compared to the SVDF. By using longer time delays, the S1DCNN further improves the FRR up to 12.2% relative.
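The layer structure named in this abstract (a 1D convolution followed by a depth-wise 1D convolution) can be sketched in plain Python. This is an illustrative sketch only: biases, nonlinearities, padding and streaming state are omitted, the weight layouts and function names are assumptions, and it is not the authors' implementation.

```python
def conv1d(frames, weights):
    """Standard 1D convolution over time that mixes all input channels.

    frames:  list of T frames, each a list of C_in features
    weights: nested list indexed as weights[k][c_in][c_out]
    """
    kernel = len(weights)
    c_out = len(weights[0][0])
    out = []
    for t in range(len(frames) - kernel + 1):
        frame = [0.0] * c_out
        for k in range(kernel):
            for i, value in enumerate(frames[t + k]):
                for o in range(c_out):
                    frame[o] += value * weights[k][i][o]
        out.append(frame)
    return out


def depthwise_conv1d(frames, weights):
    """Depth-wise 1D convolution: each channel is filtered over time separately.

    weights: nested list indexed as weights[k][channel]
    """
    kernel = len(weights)
    channels = len(weights[0])
    out = []
    for t in range(len(frames) - kernel + 1):
        out.append([
            sum(frames[t + k][c] * weights[k][c] for k in range(kernel))
            for c in range(channels)
        ])
    return out


def s1dcnn_layer(frames, conv_weights, depthwise_weights):
    """One stacked layer: 1D convolution, then depth-wise 1D convolution."""
    return depthwise_conv1d(conv1d(frames, conv_weights), depthwise_weights)
```

With a first-stage kernel size of 1, the first convolution acts only across features and the depth-wise stage acts only across time. That matches the usual feature-filter and time-filter reading of an SVDF node, which is one plausible way to understand the special-case statement above. The abstract itself does not spell out the construction.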

preprint 2020 arXiv

Sub-cycle temporal evolution of light-induced electron dynamics in hexagonal 2D materials

Two-dimensional materials with hexagonal symmetry such as graphene and transition metal dichalcogenides are unique materials to study light-field-controlled electron dynamics inside of a solid. Around the $K$-point, the dispersion relation represents an ideal system to study intricately coupled intraband motion and interband (Landau-Zener) transitions driven by the optical field of phase-controlled few-cycle laser pulses. Based on the coupled nature of the intraband and interband processes, we have recently observed in graphene repeated coherent Landau-Zener transitions between valence and conduction band separated by around half an optical period of ~1.3 fs [Higuchi et al., Nature 550, 224 (2017)]. Due to the low temporal symmetry of the applied laser pulse, a residual current density and a net electron polarization are formed. Here we show extended numerical data on the temporal evolution of the conduction band population of 2D materials with hexagonal symmetry during the light-matter interaction, yielding deep insights into attosecond-fast electron dynamics. In addition, we show that a residual ballistic current density is formed, which strongly increases when a band gap is introduced. Both the sub-cycle electron dynamics and the resulting residual current are relevant for the fundamental understanding and future applications of strongly driven electrons in two-dimensional materials, including graphene or transition metal dichalcogenide monolayers.

preprint 2014 arXiv

Strong-Field Perspective on High-Harmonic Radiation from Bulk Solids

Mechanisms of high-harmonic generation from crystals are described by treating the electric field of a laser as a quasi-static strong field. Under the quasi-static electric field, electrons in periodic potentials form dressed states, known as Wannier-Stark states. The energy differences between the dressed states determine the frequencies of the radiation. The radiation yield is determined by the magnitudes of the inter-band and intra-band current matrix elements between the dressed states. The generation of attosecond pulses from solids is predicted. Ramifications for strong-field physics are discussed.

preprint 2011 arXiv

General considerations of the electrostatic boundary conditions in oxide heterostructures

This is a book chapter that covers general considerations of the electrostatic stability of oxide surfaces and interfaces.

preprint 2011 arXiv

Vectorial Control of Magnetization by Light

Coherent light-matter interactions have recently extended their applications to the ultrafast control of magnetization in solids. An important but unrealized technique is the manipulation of magnetization vector motion to make it follow an arbitrarily designed multi-dimensional trajectory. Furthermore, for its realization, the phase and amplitude of degenerate modes need to be steered independently. A promising method is to employ Raman-type nonlinear optical processes induced by femtosecond laser pulses, where magnetic oscillations are induced impulsively with a controlled initial phase and an azimuthal angle that follows well-defined selection rules determined by the materials' symmetries. Here, we emphasize the fact that temporal variation of the polarization angle of the laser pulses enables us to distinguish between the two degenerate modes. A full manipulation of two-dimensional magnetic oscillations is demonstrated in antiferromagnetic NiO by employing a pair of polarization-twisted optical pulses. These results have led to a new concept of vectorial control of magnetization by light.

preprint 2010 arXiv

Optical control of magnetization of micron-size domains in antiferromagnetic NiO single crystals

We propose Raman-induced collinear difference-frequency generation (DFG) as a method to manipulate dynamical magnetization. When a fundamental beam propagates along a threefold rotational axis, this coherent second-order optical process is permitted by angular momentum conservation through the rotational analogue of the Umklapp process. As a demonstration, we experimentally obtained polarization properties of collinear magnetic DFG along a [111] axis of a single crystal of antiferromagnetic NiO with micro multidomain structure, which excellently agreed with the theoretical prediction.
