Researcher profile

Masahiro Tanaka

Masahiro Tanaka contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 15 - Unverified | Verification L1 | Unclaimed author
3 works
0 followers
4 topics
4 close collaborators

Published work

3 published items

preprint | 2026 | arXiv

AutoSP: Unlocking Long-Context LLM Training Via Compiler-Based Sequence Parallelism

Large language models (LLMs) demonstrate enormous utility in long-context tasks, which require processing prompts of tens to hundreds of thousands of tokens. However, existing LLM training libraries do not provide easy-to-use abstractions for optimizing long-context training; they focus instead on optimizations for models with large parameter counts, such as ZeRO-3/FSDP and tensor and pipeline parallelism. This forces users to rewrite LLM training libraries to incorporate compositions of complex long-context optimizations, such as sequence parallelism, into their training pipelines, a process that requires in-depth expertise and reduces developer productivity. To tackle these challenges, we introduce AutoSP: the first automated solution for optimizing LLM training for longer contexts. AutoSP compiles models and applies a targeted set of optimizations, automated sequence parallelism and long-context-aware activation checkpointing, to drastically enhance LLM trainability at negligible cost to throughput. Our evaluation demonstrates AutoSP's capability on both NVIDIA and AMD hardware, increasing training context lengths by up to 2.7× and 2.5×, respectively, over competitive hand-written baselines at negligible cost to runtime performance.

preprint | 2020 | arXiv

On the Likelihood of Local Projection Models

A local projection model is defined by a set of linear regressions that account for the associations between exogenous variables and an endogenous variable observed at different time points. While it is standard practice to estimate the individual regressions separately using the ordinary least squares estimator, some recent studies treat a local projection model as a multivariate regression with correlated errors, i.e., seemingly unrelated regressions, and propose Bayesian and non-Bayesian methods to improve the estimation accuracy. However, it is not clear how and when this treatment of local projection models is justified. The primary purpose of this paper is to fill this gap by showing that the likelihood of local projection models can be analytically derived from a stationary vector moving average process. By means of numerical experiments, we confirm that this treatment of local projections is tenable for finite samples.

preprint | 2019 | arXiv

Bayesian Inference of Local Projections with Roughness Penalty Priors

A local projection is a statistical framework that accounts for the relationship between an exogenous variable and an endogenous variable, measured at different time points. Local projections are often applied in impulse response analyses and direct forecasting. While local projections are becoming increasingly popular because of their robustness to misspecification and their flexibility, they are less statistically efficient than standard methods, such as vector autoregression. In this study, we seek to improve the statistical efficiency of local projections by developing a fully Bayesian approach that can be used to estimate local projections using roughness penalty priors. By incorporating such prior-induced smoothness, we can use information contained in successive observations to enhance the statistical efficiency of an inference. We apply the proposed approach to an analysis of monetary policy in the United States, showing that the roughness penalty priors successfully estimate the impulse response functions and improve the predictive accuracy of local projections.
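
For readers unfamiliar with the local projections discussed in the two abstracts above, the following is a minimal sketch of the idea, not the authors' method. It assumes a toy AR(1) data-generating process with a single exogenous shock, estimates one regression coefficient per horizon, and optionally adds a second-difference (roughness) penalty across horizons. The penalized estimate is a simple least-squares analogue with a fixed smoothing weight; the papers use Bayesian machinery that is not reproduced here. The function names `simulate` and `local_projections` and the parameters `rho` and `lam` are hypothetical, chosen for illustration.

```python
import numpy as np


def simulate(n=5000, rho=0.7, seed=0):
    """Toy process (an assumption): y_t = rho * y_{t-1} + x_t + e_t."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)        # exogenous shock
    e = 0.5 * rng.standard_normal(n)  # idiosyncratic noise
    y = np.zeros(n)
    for t in range(n):
        prev = y[t - 1] if t > 0 else 0.0
        y[t] = rho * prev + x[t] + e[t]
    return x, y


def local_projections(x, y, max_h, lam=0.0):
    """Estimate beta_h in y_{t+h} = beta_h * x_t + error for h = 0..max_h.

    lam = 0 gives separate no-intercept OLS regressions per horizon.
    lam > 0 adds lam * ||D beta||^2, where D takes second differences
    across horizons, pulling the impulse response toward a smooth curve.
    """
    n = len(y)
    s = np.empty(max_h + 1)  # sum of x_t^2 over the usable sample
    c = np.empty(max_h + 1)  # sum of x_t * y_{t+h}
    for h in range(max_h + 1):
        xh, yh = x[: n - h], y[h:]
        s[h] = xh @ xh
        c[h] = xh @ yh
    # Second-difference operator with shape (max_h - 1, max_h + 1).
    D = np.diff(np.eye(max_h + 1), n=2, axis=0)
    return np.linalg.solve(np.diag(s) + lam * D.T @ D, c)


x, y = simulate()
irf_ols = local_projections(x, y, max_h=8)
irf_smooth = local_projections(x, y, max_h=8, lam=5000.0)
irf_true = 0.7 ** np.arange(9)  # impulse response of the toy process
```

In this toy setup the true impulse response at horizon h is `rho ** h`, so `irf_ols` and `irf_smooth` can be compared against `irf_true` directly. A larger `lam` trades some bias for a smoother estimated response.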