Researcher profile

Tomoyuki Okuno

Tomoyuki Okuno contributes to research discovery and scholarly infrastructure.

Researcher; Affiliation not imported; Open to collaborate

Trust snapshot

Quick read

Trust 17 - Unverified; Verification L1; Unclaimed author
4 works
0 followers
2 topics
4 close collaborators

Published work

4 published items

preprint, 2026, arXiv

Proxy3D: Efficient 3D Representations for Vision-Language Models via Semantic Clustering and Alignment

Spatial intelligence in vision-language models (VLMs) attracts research interest with the practical demand to reason in the 3D world. Despite promising results, most existing methods follow the conventional 2D pipeline in VLMs and use pixel-aligned representations for the vision modality. However, correspondence-based models with implicit 3D scene understanding often fail to achieve spatial consistency, and representation-based models with 3D geometric priors lack efficiency in vision sequence serialization. To address this, we propose a Proxy3D method with compact yet comprehensive 3D proxy representations for the vision modality. Given only video frames as input, we employ semantic and geometric encoders to extract scene features and then perform their semantic-aware clustering to obtain a set of proxies in the 3D space. For representation alignment, we further curate the SpaceSpan dataset and apply multi-stage training to adopt the proposed 3D proxy representations with the VLM. When using shorter sequences for vision information, our method achieves competitive or state-of-the-art performance in 3D visual question answering, visual grounding and general spatial intelligence benchmarks.

preprint, 2022, arXiv

MTTrans: Cross-Domain Object Detection with Mean-Teacher Transformer

Recently, DEtection TRansformer (DETR), an end-to-end object detection pipeline, has achieved promising performance. However, it requires large-scale labeled data and suffers from domain shift, especially when no labeled data is available in the target domain. To solve this problem, we propose an end-to-end cross-domain detection Transformer based on the mean teacher framework, MTTrans, which can fully exploit unlabeled target domain data in object detection training and transfer knowledge between domains via pseudo labels. We further propose comprehensive multi-level feature alignment to improve the pseudo labels generated by the mean teacher framework, taking advantage of the cross-scale self-attention mechanism in Deformable DETR. Image and object features are aligned at the local, global, and instance levels with domain query-based feature alignment (DQFA), bi-level graph-based prototype alignment (BGPA), and token-wise image feature alignment (TIFA). Conversely, the unlabeled target domain data that the mean teacher framework pseudo-labels and makes available for object detection training can lead to better feature extraction and alignment. Thus, the mean teacher framework and the comprehensive multi-level feature alignment can be optimized iteratively and mutually based on the architecture of Transformers. Extensive experiments demonstrate that our proposed method achieves state-of-the-art performance in three domain adaptation scenarios; in particular, the result for the Sim10k to Cityscapes scenario improves markedly from 52.6 mAP to 57.9 mAP. Code will be released.

preprint, 2021, arXiv

Rapid Deceleration of Blast Waves Witnessed in Tycho's Supernova Remnant

In spite of their importance as standard candles in cosmology and as major sites of nucleosynthesis in the Universe, what kinds of progenitor systems lead to Type Ia supernovae (SNe) remains a subject of considerable debate in the literature. This is true even for the case of Tycho's SN, which exploded in 1572, although it has been deeply studied both observationally and theoretically. Analyzing X-ray data of Tycho's supernova remnant (SNR) obtained with Chandra in 2003, 2007, 2009, and 2015, we discover that the expansion before 2007 was substantially faster than radio measurements reported in the past decades and then rapidly decelerated during the last ~15 years. The result is well explained if the shock waves recently hit a wall of dense gas surrounding the SNR. Such a gas structure is in fact expected in the so-called single-degenerate scenario, in which the progenitor is a binary system consisting of a white dwarf and a stellar companion, whereas it is not generally predicted by a competing scenario, the double-degenerate scenario, which has a binary of two white dwarfs as the progenitor. Our result thus favors the former scenario. This work also demonstrates a novel technique to probe gas environments surrounding SNRs and thus disentangle the two progenitor scenarios for Type Ia SNe.

preprint, 2020, arXiv

Time Variability of Nonthermal X-ray Stripes in Tycho's Supernova Remnant with Chandra

Analyzing Chandra data of Tycho's supernova remnant (SNR) taken in 2000, 2003, 2007, 2009, and 2015, we search for time-variable features of synchrotron X-rays in the southwestern part of the SNR, where stripe structures of hard X-ray emission were previously found. By comparing X-ray images obtained at each epoch, we discover that a knot-like structure in the northernmost part of the stripe region became brighter, particularly in 2015. We also find that a bright filamentary structure gradually became fainter and narrower as it moved outward. Our spectral analysis reveals that not only the nonthermal X-ray flux but also the photon indices of the knot-like structure change from year to year. During the period from 2000 to 2015, the small knot shows brightening of $\sim 70\%$ and hardening of $\Delta\Gamma \sim 0.45$. The time variability can be explained if the magnetic field is amplified to $\sim 100~\mu\mathrm{G}$ and/or if magnetic turbulence significantly changes with time.