Source author record

Akira Takahashi

Akira Takahashi appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Computer Vision, cond-mat.mtrl-sci, cond-mat.str-el, Sound, cond-mat.other, eess.AS, Machine Learning

Catalog footprint

What is connected

6 works

7 topics

4 close collaborators

preprint, 2026, arXiv

MMAudio-LABEL: Audio Event Labeling via Audio Generation for Silent Video

Recent advances in multimodal generation have enabled high-quality audio generation from silent videos. Practical applications, such as sound production, demand not only the generated audio but also explicit sound event labels detailing the type and timing of sounds. One straightforward approach is to apply a standard sound event detection model to the generated audio. However, this post-hoc pipeline is inherently limited, as it is prone to error accumulation. To address this limitation, we propose MMAudio-LABEL (LAtent-Based Event Labeling), an event-aware audio generation framework that uses a foundational audio generation model as its backbone and jointly generates audio and frame-aligned sound event predictions from silent videos. We evaluate our method on the Greatest Hits dataset for onset detection and 17-class material classification. Our approach improves onset-detection accuracy from 46.7% to 75.0% and material-classification accuracy from 40.6% to 61.0% over baselines. These results suggest that jointly learning audio generation and event prediction enables more interpretable and practical video-to-audio synthesis.

preprint, 2026, arXiv

MMAudioReverbs: Video-Guided Acoustic Modeling for Dereverberation and Room Impulse Response Estimation

Although recent video-to-audio (V2A) models have excelled at synthesizing semantically plausible sounds from visual inputs, they do not explicitly model room-acoustic effects such as reverberation or room impulse responses (RIRs), and thus offer limited controllability over these effects. However, we hypothesize that such V2A models implicitly hold semantic knowledge of the relationship between spatial audio and the corresponding visual cues. In this paper, we revisit a V2A model with this hypothesis in mind and propose a way to use the pretrained model as a prior for physically grounded room-acoustic processing. Building on MMAudio, one of the state-of-the-art V2A models, we propose MMAudioReverbs, a unified framework that handles i) dereverberation and ii) RIR estimation without modifying the network architecture, fine-tuned on a small dataset. Experimental results show that audio cues and visual cues each have an advantage depending on the type of physical room acoustics. This implies that foundation V2A models can be used for physically grounded room-acoustic analysis.

preprint, 2019, arXiv

A charge model as an effective model of one-dimensional Hubbard and extended Hubbard systems: its application to linear optical spectrum calculations in large systems based upon many-body Wannier functions

We propose an effective model called the "charge model" for the half-filled one-dimensional Hubbard and extended Hubbard models. In this model, spin-charge separation, which in the strict sense has been justified for infinite on-site repulsion ($U$), is compatible with charge fluctuations. Our analyses based on the many-body Wannier functions succeeded in determining the optical conductivity spectra in large systems. The obtained spectra reproduce the spectra for the original models well even in the intermediate $U$ region of $U=5-10T$, with $T$ being the nearest-neighbor electron hopping energy. These results indicate that, contrary to the usual expectation, the spin-charge separation works fairly well in this intermediate $U$ region and that the charge model is an effective model that applies to actual quasi-one-dimensional materials classified as strongly correlated electron systems.

preprint, 2015, arXiv

First-principles interatomic potentials for ten elemental metals via compressed sensing

Interatomic potentials have been widely used in atomistic simulations such as molecular dynamics. Recently, frameworks to construct accurate interatomic potentials that combine a systematic set of density functional theory (DFT) calculations with machine learning techniques have been proposed. One of these methods is to use compressed sensing to derive a sparse representation for the interatomic potential. This facilitates the control of the accuracy of interatomic potentials. In this study, we demonstrate the applicability of compressed sensing to deriving the interatomic potential of ten elemental metals, namely, Ag, Al, Au, Ca, Cu, Ga, In, K, Li and Zn. For each elemental metal, the interatomic potential is obtained from DFT calculations using elastic net regression. The interatomic potentials are found to have prediction errors of less than 3.5 meV/atom, 0.03 eV/Å and 0.15 GPa for the energy, force and the stress tensor, respectively, which enable the accurate prediction of physical properties such as lattice constants and the phonon dispersion relationship.
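For readers unfamiliar with elastic net regression, one common way to write its objective is shown here. This is a textbook form given as an assumption, not the formulation stated in the paper: the symbols $X$ (matrix of basis-function values), $\mathbf{y}$ (DFT reference values), $\mathbf{w}$ (coefficients), $\lambda$ (penalty strength) and $\alpha$ (mixing ratio) are illustrative, and the authors' normalization may differ.

```latex
\hat{\mathbf{w}} = \arg\min_{\mathbf{w}}
  \frac{1}{2n}\,\lVert \mathbf{y} - X\mathbf{w} \rVert_2^2
  + \lambda \left( \alpha\,\lVert \mathbf{w} \rVert_1
  + \frac{1-\alpha}{2}\,\lVert \mathbf{w} \rVert_2^2 \right)
```

In this form, $\alpha = 1$ reduces to LASSO and $\alpha = 0$ to ridge regression. The $\ell_1$ term is what drives coefficients to exactly zero and yields a sparse representation.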

preprint, 2014, arXiv

A sparse representation for potential energy surface

We propose a simple scheme to estimate the potential energy surface (PES) in which the accuracy can be easily controlled and improved up to the level of density functional theory (DFT) calculations. It is based on model selection within the framework of linear regression using the least absolute shrinkage and selection operator (LASSO) technique. Basis functions are selected from a large, systematic set of candidate functions. The sparsity of the PES significantly reduces the computational demands of evaluating the energy and the force in molecular dynamics simulations without losing accuracy. The usefulness of the scheme is demonstrated for the elemental metals Na and Mg.
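The basis-selection idea in this abstract can be illustrated on a toy problem. The following is a minimal sketch, not the authors' implementation: the one-dimensional target, the cosine candidate set, the penalty value `lam=0.05` and all function names are hypothetical choices made for illustration, and the solver is a plain cyclic coordinate descent for the LASSO objective.

```python
import math


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the LASSO proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=50):
    """Minimize (1/2n)*||y - Xw||^2 + lam*||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw, and w starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes term j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, lam) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_w
    return w


# Hypothetical toy data: the target is built from only two of eight candidates.
n_points = 40
grid = [(i + 0.5) / n_points for i in range(n_points)]
orders = list(range(1, 9))  # candidate basis functions cos(k*pi*x), k = 1..8
X = [[math.cos(k * math.pi * x) for k in orders] for x in grid]
y = [1.5 * math.cos(2 * math.pi * x) - 0.8 * math.cos(5 * math.pi * x) for x in grid]

weights = lasso_coordinate_descent(X, y, lam=0.05)
selected = [k for k, w in zip(orders, weights) if w != 0.0]
print("selected basis orders:", selected)
print("coefficients:", [round(w, 3) for w in weights])
```

The surviving coefficients are shrunk relative to the values used to build the target, which is the usual price of the $\ell_1$ penalty. Lowering `lam` reduces that bias but tends to admit more basis functions, so the penalty acts as the knob that trades accuracy against sparsity.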

preprint, 2010, arXiv

Purely electronic THz polarization in dimer Mott insulators

We theoretically discover purely electronic polarization modes in the THz frequency region in the dimer Mott insulators $\kappa$-(BEDT-TTF)$_2$X. The unusual low-frequency modes arise from the coupling between the oscillation of intradimer electric dipole moments and that of alternating interdimer bond orders. These collective motions play an important role in the dynamical dielectric properties of the dimer Mott insulators. Near the phase boundary of the dimer Mott transition, the ferroelectric ground state is realized by introducing electron-lattice coupling.

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.

Topic footprint