Source author record

Ke Ma

Ke Ma appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Catalog footprint

What is connected

15 works

21 topics

4 close collaborators

Preprint, 2026, arXiv

Response-G1: Explicit Scene Graph Modeling for Proactive Streaming Video Understanding

Proactive streaming video understanding requires Video-LLMs to decide when to respond as a video unfolds, a task where existing methods often fall short due to their implicit, query-agnostic modeling of visual evidence. We introduce Response-G1, a novel framework that establishes explicit, structured alignment between the accumulated video evidence and the query's expected response conditions via scene graphs. The framework operates in three fine-tuning-free stages: (1) online query-guided scene graph generation from streaming clips; (2) memory-based retrieval of the most semantically relevant historical scene graphs; and (3) retrieval-augmented trigger prompting for per-frame "silence/response" decisions. By grounding both evidence and conditions in a shared graph representation, Response-G1 achieves more interpretable and accurate response timing decisions. Experimental results on established benchmarks demonstrate the superiority of our method in both proactive and reactive tasks, validating the advantage of explicit scene graph modeling and retrieval in streaming video understanding.

Preprint, 2024, arXiv

A Robbins--Monro Sequence That Can Exploit Prior Information For Faster Convergence

We propose a new method to improve the convergence speed of the Robbins-Monro algorithm by introducing prior information about the target point into the Robbins-Monro iteration. We achieve the incorporation of prior information without the need of a -- potentially wrong -- regression model, which would also entail additional constraints. We show that this prior-information Robbins-Monro sequence is convergent for a wide range of prior distributions, even wrong ones, such as Gaussian, weighted sum of Gaussians, e.g., in a kernel density estimate, as well as bounded arbitrary distribution functions greater than zero. We furthermore analyse the sequence numerically to understand its performance and the influence of parameters. The results demonstrate that the prior-information Robbins-Monro sequence converges faster than the standard one, especially during the first steps, which are particularly important for applications where the number of function measurements is limited, and when the noise of observing the underlying function is large. We finally propose a rule to select the parameters of the sequence.
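For orientation, the baseline that the abstract improves on can be sketched as follows. This is a minimal sketch of the standard Robbins-Monro iteration only, not the prior-information variant proposed in the paper; the target function `noisy_linear`, the step-size constant `a`, and the noise level are illustrative assumptions.

```python
import random


def robbins_monro(noisy_f, x0, steps=2000, a=1.0, seed=0):
    """Standard Robbins-Monro root search: x_{n+1} = x_n - (a / n) * Y_n.

    Y_n is a noisy measurement of f(x_n). The step sizes a / n satisfy the
    classical conditions sum(a_n) = inf and sum(a_n^2) < inf.
    """
    rng = random.Random(seed)
    x = x0
    for n in range(1, steps + 1):
        y = noisy_f(x, rng)   # noisy observation of f at the current point
        x = x - (a / n) * y   # shrinking step toward the root
    return x


def noisy_linear(x, rng):
    # Hypothetical target: f(x) = x - 2 (root at x = 2) with unit Gaussian noise.
    return (x - 2.0) + rng.gauss(0.0, 1.0)


estimate = robbins_monro(noisy_linear, x0=0.0)
```

With this toy target and `a = 1`, the error after `n` steps is the average of the first `n` noise draws, which illustrates why the early iterates are noisy and why a faster start matters when few measurements are available.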

Preprint, 2023, arXiv

MAMMOTH-Subaru III. Ly$α$ Halo Extended to $\sim200$ kpc Identified by Stacking $\sim 3300$ Ly$α$ Emitters at $z=2.2-2.3$

In this paper, we present a Ly$α$ halo extended to $\sim200$ kpc identified by stacking $\sim 3300$ Ly$α$ emitters at $z=2.2-2.3$. We carry out imaging observations and data reduction with Subaru/Hyper Suprime-Cam (HSC). Our total survey area is $\sim12$ deg$^2$ and imaging depths are $25.5-27.0$ mag. Using the imaging data, we select 1,240 and 2,101 LAE candidates at $z=2.2$ and 2.3, respectively. We carry out spectroscopic observations of our LAE candidates and data reduction with Magellan/IMACS to estimate the contamination rate of our LAE candidates. We find that the contamination rate of our sample is low (8%). We stack our LAE candidates with a median stacking method to identify the Ly$α$ halo at $z=2$. We show that the Ly$α$ halo is extended to $\sim200$ kpc at a surface brightness level of $10^{-20}$ erg s$^{-1}$ cm$^{-2}$ arcsec$^{-2}$. Compared to previous studies, our Ly$α$ halo is more extended at radii of $\sim25-100$ kpc, which is likely caused not by contamination in our sample but by different redshifts and fields. To investigate how central galaxies affect surrounding Ly$α$ halos (LAHs), we divide our LAEs into subsamples based on the Ly$α$ luminosity ($L_{\rm Lyα}$), rest-frame Ly$α$ equivalent width (EW$_0$), and UV magnitude (M$_{\rm uv}$). We stack the subsamples and find that higher $L_{\rm Lyα}$, lower EW$_0$, and brighter M$_{\rm uv}$ cause more extended halos. Our results suggest that more massive LAEs generally have more extended Ly$α$ halos.

Preprint, 2023, arXiv

MAMMOTH-Subaru V. Effects of Cosmic Variance on Ly$α$ Luminosity Functions at $z=2.2-2.3$

Cosmic variance introduces significant uncertainties into galaxy number density properties when surveying the high-z Universe with a small volume; such uncertainties produce the field-to-field variance of galaxy number, $σ_{g}$, in observational astronomy. This uncertainty significantly affects the luminosity function (LF) measurement of Ly$α$ emitters (LAEs). For most previous Ly$α$ LF studies, $σ_{g}$ is adopted from predictions by cosmological simulations but rarely confirmed by observations. Measuring cosmic variance requires a huge sample over a large volume, exceeding the capabilities of most astronomical instruments. In this study, we demonstrate an observational approach for measuring the cosmic variance contribution for $z\approx2.2$ Ly$α$ LFs. The LAE candidates are observed using the narrowband and broadband filters of Subaru/Hyper Suprime-Cam (HSC) in 8 independent fields, giving a total survey area of $\simeq11.62$ deg$^2$ and a comoving volume of $\simeq8.71\times10^6$ Mpc$^3$. These eight fields are selected through the MAMMOTH project. We report a best-fit Schechter function with parameters $α=-1.75$ (fixed), $L_{Lyα}^{*}=5.18_{-0.40}^{+0.43} \times 10^{42}$ erg s$^{-1}$ and $ϕ_{Lyα}^{*}=4.87_{-0.55}^{+0.54}\times10^{-4}$ Mpc$^{-3}$ for the overall Ly$α$ LFs. After clipping out the regions that can bias the cosmic variance measurements, we calculate $σ_{g}$ by sampling LAEs within multiple pointings assigned on the field image. We investigate the relation between $σ_{g}$ and survey volume $V$, and fit a simple power law: $σ_g=k\times(\frac{V_{\rm eff}}{10^5 {\rm Mpc}^3})^β$. We find best-fit values of $-1.209_{-0.106}^{+0.106}$ for $β$ and $0.986_{-0.100}^{+0.108}$ for $k$. We compare our measurements with predictions from simulations and find that the cosmic variance of LAEs might be larger than that of general star-forming galaxies.
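The two fitted relations quoted in this abstract can be evaluated directly. The sketch below plugs in the central best-fit values and ignores their quoted uncertainties; the function names are illustrative, and the Schechter function is written in its standard per-unit-luminosity form, which is an assumption about the convention used in the paper.

```python
import math


def sigma_g(v_eff_mpc3, k=0.986, beta=-1.209):
    """Power law from the abstract: sigma_g = k * (V_eff / 1e5 Mpc^3) ** beta."""
    return k * (v_eff_mpc3 / 1e5) ** beta


def schechter(lum, l_star=5.18e42, phi_star=4.87e-4, alpha=-1.75):
    """Standard Schechter form: (phi*/L*) * (L/L*)**alpha * exp(-L/L*).

    lum and l_star are in erg/s, phi_star in Mpc^-3.
    """
    x = lum / l_star
    return (phi_star / l_star) * x ** alpha * math.exp(-x)


# At the reference volume of 1e5 Mpc^3 the power law reduces to k itself.
reference_sigma = sigma_g(1e5)
```

Because the fitted exponent is negative, `sigma_g` falls as the effective survey volume grows, which matches the abstract's point that small-volume surveys suffer most from cosmic variance.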

Preprint, 2022, arXiv

A Tale of HodgeRank and Spectral Method: Target Attack Against Rank Aggregation Is the Fixed Point of Adversarial Game

Rank aggregation with pairwise comparisons has shown promising results in elections, sports competitions, recommendations, and information retrieval. However, little attention has been paid to the security of such algorithms, in contrast to the numerous research works on their computational and statistical characteristics. Driven by huge profits, a potential adversary has strong incentives to manipulate the ranking list. Meanwhile, the intrinsic vulnerability of rank aggregation methods is not well studied in the literature. To fully understand the possible risks, in this paper we focus on the purposeful adversary who seeks to dictate the aggregated results by modifying the pairwise data. From the perspective of the dynamical system, the attack behavior with a target ranking list is a fixed point belonging to the composition of the adversary and the victim. To perform the targeted attack, we formulate the interaction between the adversary and the victim as a game-theoretic framework consisting of two continuous operators, in which a Nash equilibrium is established. Then two procedures against HodgeRank and RankCentrality are constructed to produce the modification of the original data. Furthermore, we prove that the victims will produce the target ranking list once the adversary masters the complete information. It is noteworthy that the proposed methods allow the adversary to perform the purposeful attack while holding only incomplete information or imperfect feedback. The effectiveness of the suggested target attack strategies is demonstrated by a series of toy simulations and several real-world data experiments. These experimental results show that the proposed methods could achieve the attacker's goal in the sense that the leading candidate of the perturbed ranking list is the one designated by the adversary.

Preprint, 2022, arXiv

LAS-AT: Adversarial Training with Learnable Attack Strategy

Adversarial training (AT) is always formulated as a minimax problem whose performance depends on the inner optimization that involves the generation of adversarial examples (AEs). Most previous methods adopt Projected Gradient Descent (PGD) with manually specified attack parameters for AE generation. A combination of the attack parameters can be referred to as an attack strategy. Several works have revealed that using a fixed attack strategy to generate AEs during the whole training phase limits the model robustness and propose to exploit different attack strategies at different training stages to improve robustness. But those multi-stage hand-crafted attack strategies need much domain expertise, and the robustness improvement is limited. In this paper, we propose a novel framework for adversarial training by introducing the concept of "learnable attack strategy", dubbed LAS-AT, which learns to automatically produce attack strategies to improve the model robustness. Our framework is composed of a target network that uses AEs for training to improve robustness and a strategy network that produces attack strategies to control the AE generation. Experimental evaluations on three benchmark databases demonstrate the superiority of the proposed method. The code is released at https://github.com/jiaxiaojunQAQ/LAS-AT.

Preprint, 2022, arXiv

Optimal, centralized dynamic curbside parking space zoning

In this paper we formulate a dynamic mixed integer program for optimally zoning curbside parking spaces subject to transportation policy-inspired constraints and regularization terms. First, we illustrate how, given some objective of curb zoning valuation as a function of zone type (e.g., paid parking or bus stop), dynamically rezoning involves unrolling this optimization program over a fixed time horizon. Second, we implement two different solution methods that optimize for a given curb zoning value function. In the first method, we solve long horizon dynamic zoning problems via approximate dynamic programming. In the second method, we employ Dantzig-Wolfe decomposition to break up the mixed-integer program into a master problem and several sub-problems that we solve in parallel; this decomposition accelerates the MIP solver considerably. We present simulation results and comparisons of the different employed techniques on vehicle arrival-rate data obtained for a neighborhood in downtown Seattle, Washington, USA.

Preprint, 2022, arXiv

Prior-Guided Adversarial Initialization for Fast Adversarial Training

Fast adversarial training (FAT) effectively improves the efficiency of standard adversarial training (SAT). However, initial FAT encounters catastrophic overfitting, i.e., the robust accuracy against adversarial attacks suddenly and dramatically decreases. Though several FAT variants spare no effort to prevent overfitting, they do so at a high computational cost. In this paper, we explore the difference between the training processes of SAT and FAT and observe that the attack success rate of adversarial examples (AEs) of FAT gets worse gradually in the late training stage, resulting in overfitting. The AEs are generated by the fast gradient sign method (FGSM) with a zero or random initialization. Based on the observation, we propose a prior-guided FGSM initialization method to avoid overfitting after investigating several initialization strategies, improving the quality of the AEs during the whole training process. The initialization is formed by leveraging historically generated AEs without additional calculation cost. We further provide a theoretical analysis for the proposed initialization method. We also propose a simple yet effective regularizer based on the prior-guided initialization, i.e., the currently generated perturbation should not deviate too much from the prior-guided initialization. The regularizer adopts both historical and current adversarial perturbations to guide the model learning. Evaluations on four datasets demonstrate that the proposed method can prevent catastrophic overfitting and outperform state-of-the-art FAT methods. The code is released at https://github.com/jiaxiaojunQAQ/FGSM-PGI.

Preprint, 2021, arXiv

AdaSpring: Context-adaptive and Runtime-evolutionary Deep Model Compression for Mobile Applications

There are many deep learning (e.g., DNN) powered mobile and wearable applications today continuously and unobtrusively sensing the ambient surroundings to enhance all aspects of human lives. To enable robust and private mobile sensing, DNNs tend to be deployed locally on resource-constrained mobile devices via model compression. Current practice, whether hand-crafted DNN compression techniques (i.e., optimizing DNN-related performance such as parameter size) or on-demand DNN compression methods (i.e., optimizing hardware-dependent metrics such as latency), cannot run locally online because it requires offline retraining to ensure accuracy. Also, none of these methods have correlated their efforts with runtime adaptive compression to consider the dynamic nature of the deployment context of mobile applications. To address those challenges, we present AdaSpring, a context-adaptive and self-evolutionary DNN compression framework. It enables the runtime adaptive DNN compression locally online. Specifically, it presents the ensemble training of a retraining-free and self-evolutionary network to integrate multiple alternative DNN compression configurations (i.e., compressed architectures and weights). It then introduces the runtime search strategy to quickly search for the most suitable compression configurations and evolve the corresponding weights. With evaluation on five tasks across three platforms and a real-world case study, experiment outcomes show that AdaSpring obtains up to 3.1x latency reduction and 4.2x energy efficiency improvement in DNNs, compared to hand-crafted compression techniques, while only incurring <= 6.2ms runtime-evolution latency.

Preprint, 2021, arXiv

Deep Learning Assisted mmWave Beam Prediction with Prior Low-frequency Information

The huge overhead of beam training poses a significant challenge to mmWave communications. To address this issue, beam tracking has been widely investigated, but existing methods struggle to handle serious multipath interference and non-stationary scenarios. Inspired by the spatial similarity between low-frequency and mmWave channels in non-standalone architectures, this paper proposes to utilize prior low-frequency information to predict the optimal mmWave beam, where deep learning is adopted to enhance the prediction accuracy. Specifically, periodically estimated low-frequency channel state information (CSI) is applied to track the movement of user equipment, and a timing offset indicator is proposed to indicate the instant of mmWave beam training relative to low-frequency CSI estimation. Meanwhile, dedicated models based on long short-term memory networks are designed to implement the prediction. Simulation results show that our proposed scheme can achieve higher beamforming gain than the conventional methods while requiring little overhead of mmWave beam training.

Preprint, 2021, arXiv

HAIR: Head-mounted AR Intention Recognition

Human teams exhibit both implicit and explicit intention sharing. To further the development of human-robot collaboration, intention recognition is crucial on both sides. Present approaches rely on a vast sensor suite on and around the robot to achieve intention recognition. This relegates intuitive human-robot collaboration purely to such bulky systems, which are inadequate for large-scale, real-world scenarios due to their complexity and cost. In this paper we propose an intention recognition system that is based purely on a portable head-mounted display. In addition, robot intention visualisation is also supported. We present experiments to show the quality of our human goal estimation component and some basic interactions with an industrial robot. HAIR should raise the quality of interaction between robots and humans, instead of such interactions raising the hair on the necks of the human coworkers.

Preprint, 2021, arXiv

On Stochastic Variance Reduced Gradient Method for Semidefinite Optimization

Low-rank stochastic semidefinite optimization has attracted rising attention due to its wide range of applications. The nonconvex reformulation based on the low-rank factorization significantly improves the computational efficiency but brings new challenges to the analysis. The stochastic variance reduced gradient (SVRG) method has been regarded as one of the most effective methods. SVRG in general consists of two loops, where a reference full gradient is first evaluated in the outer loop and then used to yield a variance reduced estimate of the current gradient in the inner loop. Two options have been suggested to yield the output of the inner loop, where Option I sets the output as its last iterate, and Option II yields the output via random sampling from all the iterates in the inner loop. However, there is a significant gap between the theory and practice of SVRG when adapted to stochastic semidefinite programming (SDP). SVRG practically works better with Option I, while most existing theoretical results focus on Option II. In this paper, we fill this gap by exploiting a new semi-stochastic variant of the original SVRG with Option I adapted to semidefinite optimization. Equipped with this, we establish the global linear submanifold convergence (i.e., converging exponentially fast to a submanifold of a global minimum under the orthogonal group action) of the proposed SVRG method, given a provable initialization scheme and under certain smoothness and restricted strongly convex assumptions. Our analysis includes the effects of the mini-batch size and update frequency in the inner loop as well as two practical step size strategies, the fixed and stabilized Barzilai-Borwein step sizes. Some numerical results in matrix sensing demonstrate the efficiency of the proposed SVRG method, which outperforms its Option II counterpart as well as other methods.
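The two-loop structure and the two output options described in this abstract can be sketched on a toy problem. This is plain SVRG on a scalar least-squares objective, not the semi-stochastic variant for semidefinite optimization that the paper analyzes; the data, step size, and loop lengths are illustrative assumptions.

```python
import random


def svrg(a, b, w0=0.0, step=0.05, epochs=40, inner=20, option="I", seed=0):
    """Minimize (1 / 2n) * sum_i (a[i] * w - b[i]) ** 2 with plain SVRG."""
    rng = random.Random(seed)
    n = len(a)

    def grad_i(w, i):
        # Gradient of the i-th term 0.5 * (a[i] * w - b[i]) ** 2.
        return a[i] * (a[i] * w - b[i])

    w_ref = w0
    for _ in range(epochs):
        # Outer loop: evaluate the reference full gradient once.
        full_grad = sum(grad_i(w_ref, i) for i in range(n)) / n
        w = w_ref
        iterates = []
        for _ in range(inner):
            i = rng.randrange(n)
            # Inner loop: variance-reduced estimate of the current gradient.
            g = grad_i(w, i) - grad_i(w_ref, i) + full_grad
            w -= step * g
            iterates.append(w)
        # Option I keeps the last iterate; Option II samples one at random.
        w_ref = iterates[-1] if option == "I" else rng.choice(iterates)
    return w_ref


# Hypothetical noise-free data generated with w = 2.
a_data = [1.0, 2.0, 3.0]
b_data = [2.0, 4.0, 6.0]
w_last = svrg(a_data, b_data, option="I")
w_sampled = svrg(a_data, b_data, option="II")
```

The only difference between the two options is the final line of the outer loop, which is the part of the method where, according to the abstract, theory and practice have diverged.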

Preprint, 2020, arXiv

Attribute-guided image generation from layout

Recent approaches have achieved great success in image generation from structured inputs, e.g., semantic segmentation, scene graph or layout. Although these methods allow specification of objects and their locations at image-level, they lack the fidelity and semantic control to specify visual appearance of these objects at an instance-level. To address this limitation, we propose a new image generation method that enables instance-level attribute control. Specifically, the input to our attribute-guided generative model is a tuple that contains: (1) object bounding boxes, (2) object categories and (3) an (optional) set of attributes for each object. The output is a generated image where the requested objects are in the desired locations and have prescribed attributes. Several losses work collaboratively to encourage accurate, consistent and diverse image generation. Experiments on Visual Genome dataset demonstrate our model's capacity to control object-level attributes in generated images, and validate plausibility of disentangled object-attribute representation in the image generation from layout task. Also, the generated images from our model have higher resolution, object classification accuracy and consistency, as compared to the previous state-of-the-art.

Preprint, 2014, arXiv

A description of pseudorapidity distributions in p-p collisions at center-of-mass energy from 23.6 to 900 GeV

In the context of the combined model of evolution-dominated hydrodynamics + leading particles, we discuss the pseudorapidity distributions of charged particles produced in p-p collisions. A comparison is made between the theoretical predictions and experimental measurements. The combined model works well in p-p collisions in the whole available energy region from $\sqrt{s_{NN}}=23.6$ to 900 GeV.

Preprint, 2012, arXiv

The Development of WIFIS: a Wide Integral Field Infrared Spectrograph

We present the current results from the development of a wide integral field infrared spectrograph (WIFIS). WIFIS offers an unprecedented combination of etendue and spectral resolving power for seeing-limited, integral field observations in the 0.9-1.8 um range and is most sensitive in the 0.9-1.35 um range. Its optical design consists of front-end re-imaging optics, an all-reflective image slicer-type, integral field unit (IFU) called FISICA, and a long-slit grating spectrograph back-end that is coupled with a HAWAII 2RG focal plane array. The full wavelength range is achieved by selecting between two different gratings. By virtue of its re-imaging optics, the spectrograph is quite versatile and can be used at multiple telescopes. The size of its field-of-view is unrivalled by other similar spectrographs, offering a 4.5" x 12" integral field at a 10-meter class telescope (or 20" x 50" at a 2.3-meter telescope). The use of WIFIS will be crucial in astronomical problems which require wide-field, two-dimensional spectroscopy such as the study of merging galaxies at moderate redshift and nearby star/planet-forming regions and supernova remnants. We discuss the final optical design of WIFIS, and its predicted on-sky performance on two reference telescope platforms: the 2.3-m Steward Bok telescope and the 10.4-m Gran Telescopio Canarias. We also present the results from our laboratory characterization of FISICA. IFU properties such as magnification, field-mapping, and slit width along the entire slit length were measured by our tests. The construction and testing of WIFIS is expected to be completed by early 2013. We plan to commission the instrument at the 2.3-m Steward Bok telescope at Kitt Peak, USA in Spring 2013.
