Source author record

Raymond H. Chan

Raymond H. Chan appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Computer Vision Machine Learning Applications math.NA math.OC Artificial Intelligence eess.IV Numerical Analysis physics.optics

Catalog footprint

What is connected

11works

9topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Beyond Text Prompts: Visual-to-Visual Generation as A Unified Paradigm

Humans often specify and create through visual artifacts: typography sheets, sketches, reference images, and annotated scenes. Yet modern visual generators still ask users to serialize this intent into text, a bottleneck that compresses signals like spatial structure, exact appearance, and glyph shape. We propose \textbf{\emph{visual-to-visual} (V2V)} generation, in which the user conditions a generative model with a visual specification page rather than a text prompt. The page is not an edit target, but a visual document that specifies the desired output. We introduce \textbf{V2V-Zero}, a training-free framework that exposes this interface in existing vision-language model (VLM) conditioned generators by replacing text-only conditioning with final-layer hidden states extracted from visual pages, exploiting the fact that the frozen VLM already maps both text and images into the generator's conditioning space. On GenEval, V2V-Zero reaches 0.85 with a frozen Qwen-Image backbone, closely matching its optimized text-to-image performance without fine-tuning. To evaluate the broader V2V space, we introduce \textbf{Simple-V2V Bench}, spanning seven visual-conditioning tasks and seven models, including GPT Image 2, Nano Banana 2, Seedream 5.0 Lite, open-weight baselines, and a video extension. V2V-Zero scores 32.7/100, outperforming evaluated open-weight image baselines and revealing a clear capability hierarchy: attribute binding is strong, content generation is unreliable, and structural control remains hard even for commercial systems. A HunyuanVideo-1.5 extension scores 20.2/100, showing the interface transfers beyond images. Mechanistic analysis shows the default reasoning path is primarily visually routed, with 95.0\% of conditioning-token attention mass on visual-page hidden states.

preprint2026arXiv

Extending Pretrained 10-Second ECG Foundation Models to Longer Horizons

Electrocardiogram (ECG) foundation models pretrained on typical diagnostic 10-second ECG segments, have demonstrated strong transferability across a range of clinical applications. However, many real-world applications produce recordings that are typically longer, and are varied in duration during inference time. These 10-second models have no built-in way to combine information across time. Extending them to longer horizons introduces two challenges: structural incompatibilities arising from input-length disparities, and semantic challenges that limit meaningful temporal aggregation. We propose a parameter-efficient framework that extends pretrained ECG foundation models to longer and variable-length ECGs without retraining the backbone. Guided by a frozen pretrained 10-second model, we introduce a lightweight plug-in module that extends the model in two complementary ways: (i) structurally compatible long-sequence processing and (ii) semantically informed temporal modeling. Experiments on multiple long-horizon ECG tasks, datasets, and foundation model backbones demonstrate that our method enables robust long-horizon extension from pretrained snapshot models, consistently outperforming sliding-window and pooling-based baselines with strong parameter efficiency.

preprint2022arXiv

A 3-stage Spectral-spatial Method for Hyperspectral Image Classification

Hyperspectral images often have hundreds of spectral bands of different wavelengths captured by aircraft or satellites that record land coverage. Identifying detailed classes of pixels becomes feasible due to the enhancement in spectral and spatial resolution of hyperspectral images. In this work, we propose a novel framework that utilizes both spatial and spectral information for classifying pixels in hyperspectral images. The method consists of three stages. In the first stage, the pre-processing stage, Nested Sliding Window algorithm is used to reconstruct the original data by {enhancing the consistency of neighboring pixels} and then Principal Component Analysis is used to reduce the dimension of data. In the second stage, Support Vector Machines are trained to estimate the pixel-wise probability map of each class using the spectral information from the images. Finally, a smoothed total variation model is applied to smooth the class probability vectors by {ensuring spatial connectivity} in the images. We demonstrate the superiority of our method against three state-of-the-art algorithms on six benchmark hyperspectral data sets with 10 to 50 training labels for each class. The results show that our method gives the overall best performance in accuracy. Especially, our gain in accuracy increases when the number of labeled pixels decreases and therefore our method is more advantageous to be applied to problems with small training set. Hence it is of great practical significance since expert annotations are often expensive and difficult to collect.

preprint2022arXiv

Classification of Hyperspectral Images Using SVM with Shape-adaptive Reconstruction and Smoothed Total Variation

In this work, a novel algorithm called SVM with Shape-adaptive Reconstruction and Smoothed Total Variation (SaR-SVM-STV) is introduced to classify hyperspectral images, which makes full use of spatial and spectral information. The Shape-adaptive Reconstruction (SaR) is introduced to preprocess each pixel based on the Pearson Correlation between pixels in its shape-adaptive (SA) region. Support Vector Machines (SVMs) are trained to estimate the pixel-wise probability maps of each class. Then the Smoothed Total Variation (STV) model is applied to denoise and generate the final classification map. Experiments show that SaR-SVM-STV outperforms the SVM-STV method with a few training labels, demonstrating the significance of reconstructing hyperspectral images before classification.

preprint2022arXiv

Selecting Regularization Parameters for nuclear norm type minimization problems

The reconstruction of low-rank matrix from its noisy observation finds its usage in many applications. It can be reformulated into a constrained nuclear norm minimization problem, where the bound $η$ of the constraint is explicitly given or can be estimated by the probability distribution of the noise. When the Lagrangian method is applied to find the minimizer, the solution can be obtained by the singular value thresholding operator where the thresholding parameter $λ$ is related to the Lagrangian multiplier. In this paper, we first show that the Frobenius norm of the discrepancy between the minimizer and the observed matrix is a strictly increasing function of $λ$. From that we derive a closed-form solution for $λ$ in terms of $η$. The result can be used to solve the constrained nuclear-norm-type minimization problem when $η$ is given. For the unconstrained nuclear-norm-type regularized problems, our result allows us to automatically choose a suitable regularization parameter by using the discrepancy principle. The regularization parameters obtained are comparable to (and sometimes better than) those obtained by Stein's unbiased risk estimator (SURE) approach while the cost of solving the minimization problem can be reduced by 11--18 times. Numerical experiments with both synthetic data and real MRI data are performed to validate the proposed approach.

preprint2022arXiv

Semi-supervised Change Detection of Small Water Bodies Using RGB and Multispectral Images in Peruvian Rainforests

Artisanal and Small-scale Gold Mining (ASGM) is an important source of income for many households, but it can have large social and environmental effects, especially in rainforests of developing countries. The Sentinel-2 satellites collect multispectral images that can be used for the purpose of detecting changes in water extent and quality which indicates the locations of mining sites. This work focuses on the recognition of ASGM activities in Peruvian Amazon rainforests. We tested several semi-supervised classifiers based on Support Vector Machines (SVMs) to detect the changes of water bodies from 2019 to 2021 in the Madre de Dios region, which is one of the global hotspots of ASGM activities. Experiments show that SVM-based models can achieve reasonable performance for both RGB (using Cohen's $κ$ 0.49) and 6-channel images (using Cohen's $κ$ 0.71) with very limited annotations. The efficacy of incorporating Lab color space for change detection is analyzed as well.

preprint2022arXiv

Unsupervised Spatial-spectral Hyperspectral Image Reconstruction and Clustering with Diffusion Geometry

Hyperspectral images, which store a hundred or more spectral bands of reflectance, have become an important data source in natural and social sciences. Hyperspectral images are often generated in large quantities at a relatively coarse spatial resolution. As such, unsupervised machine learning algorithms incorporating known structure in hyperspectral imagery are needed to analyze these images automatically. This work introduces the Spatial-Spectral Image Reconstruction and Clustering with Diffusion Geometry (DSIRC) algorithm for partitioning highly mixed hyperspectral images. DSIRC reduces measurement noise through a shape-adaptive reconstruction procedure. In particular, for each pixel, DSIRC locates spectrally correlated pixels within a data-adaptive spatial neighborhood and reconstructs that pixel's spectral signature using those of its neighbors. DSIRC then locates high-density, high-purity pixels far in diffusion distance (a data-dependent distance metric) from other high-density, high-purity pixels and treats these as cluster exemplars, giving each a unique label. Non-modal pixels are assigned the label of their diffusion distance-nearest neighbor of higher density and purity that is already labeled. Strong numerical results indicate that incorporating spatial information through image reconstruction substantially improves the performance of pixel-wise clustering.

preprint2020arXiv

Point Spread Function Engineering for 3D Imaging of Space Debris using a Continuous Exact l0 Penalty (CEL0) Based Algorithm

We consider three-dimensional (3D) localization and imaging of space debris from only one two-dimensional (2D) snapshot image. The technique involves an optical imager that exploits off-center image rotation to encode both the lateral and depth coordinates of point sources, with the latter being encoded in the angle of rotation of the PSF. We formulate 3D localization into a large-scale sparse 3D inverse problem in the discretized form. A recently developed penalty called continuous exact l0 (CEL0) is applied in this problem for the Gaussian noise model. Numerical experiments and comparisons illustrate the efficiency of the algorithm.

preprint2015arXiv

A FEAST Algorithm with oblique projection for generalized eigenvalue problems

The contour-integral based eigensolvers are the recent efforts for computing the eigenvalues inside a given region in the complex plane. The best-known members are the Sakurai-Sugiura (SS) method, its stable version CIRR, and the FEAST algorithm. An attractive computational advantage of these methods is that they are easily parallelizable. The FEAST algorithm was developed for the generalized Hermitian eigenvalue problems. It is stable and accurate. However, it may fail when applied to non-Hermitian problems. In this paper, we extend the FEAST algorithm to non-Hermitian problems. The approach can be summarized as follows: (i) to construct a particular contour integral to form a subspace containing the desired eigenspace, and (ii) to use the oblique projection technique to extract desired eigenpairs with appropriately chosen test subspace. The related mathematical framework is established. We also address some implementation issues such as how to choose a suitable starting matrix and design good stopping criteria. Numerical experiments are provided to illustrate that our method is stable and efficient.

preprint2015arXiv

Geometric Tight Frame based Stylometry for Art Authentication of van Gogh Paintings

This paper is about authenticating genuine van Gogh paintings from forgeries. The authentication process depends on two key steps: feature extraction and outlier detection. In this paper, a geometric tight frame and some simple statistics of the tight frame coefficients are used to extract features from the paintings. Then a forward stage-wise rank boosting is used to select a small set of features for more accurate classification so that van Gogh paintings are highly concentrated towards some center point while forgeries are spread out as outliers. Numerical results show that our method can achieve 86.08% classification accuracy under the leave-one-out cross-validation procedure. Our method also identifies five features that are much more predominant than other features. Using just these five features for classification, our method can give 88.61% classification accuracy which is the highest so far reported in literature. Evaluation of the five features is also performed on two hundred datasets generated by bootstrap sampling with replacement. The median and the mean are 88.61% and 87.77% respectively. Our results show that a small set of statistics of the tight frame coefficients along certain orientations can serve as discriminative features for van Gogh paintings. It is more important to look at the tail distributions of such directional coefficients than mean values and standard deviations. It reflects a highly consistent style in van Gogh's brushstroke movements, where many forgeries demonstrate a more diverse spread in these features.

preprint2014arXiv

Inertial primal-dual algorithms for structured convex optimization

The primal-dual algorithm recently proposed by Chambolle & Pock (abbreviated as CPA) for structured convex optimization is very efficient and popular. It was shown by Chambolle & Pock in \cite{CP11} and also by Shefi & Teboulle in \cite{ST14} that CPA and variants are closely related to preconditioned versions of the popular alternating direction method of multipliers (abbreviated as ADM). In this paper, we further clarify this connection and show that CPAs generate exactly the same sequence of points with the so-called linearized ADM (abbreviated as LADM) applied to either the primal problem or its Lagrangian dual, depending on different updating orders of the primal and the dual variables in CPAs, as long as the initial points for the LADM are properly chosen. The dependence on initial points for LADM can be relaxed by focusing on cyclically equivalent forms of the algorithms. Furthermore, by utilizing the fact that CPAs are applications of a general weighted proximal point method to the mixed variational inequality formulation of the KKT system, where the weighting matrix is positive definite under a parameter condition, we are able to propose and analyze inertial variants of CPAs. Under certain conditions, global point-convergence, nonasymptotic $O(1/k)$ and asymptotic $o(1/k)$ convergence rate of the proposed inertial CPAs can be guaranteed, where $k$ denotes the iteration index. Finally, we demonstrate the profits gained by introducing the inertial extrapolation step via experimental results on compressive image reconstruction based on total variation minimization.

Raymond H. Chan

What is connected

Connect this record

See the researcher in context

Building this map preview

11 published item(s)

Beyond Text Prompts: Visual-to-Visual Generation as A Unified Paradigm

Extending Pretrained 10-Second ECG Foundation Models to Longer Horizons

A 3-stage Spectral-spatial Method for Hyperspectral Image Classification

Classification of Hyperspectral Images Using SVM with Shape-adaptive Reconstruction and Smoothed Total Variation

Selecting Regularization Parameters for nuclear norm type minimization problems

Semi-supervised Change Detection of Small Water Bodies Using RGB and Multispectral Images in Peruvian Rainforests

Unsupervised Spatial-spectral Hyperspectral Image Reconstruction and Clustering with Diffusion Geometry

Point Spread Function Engineering for 3D Imaging of Space Debris using a Continuous Exact l0 Penalty (CEL0) Based Algorithm

A FEAST Algorithm with oblique projection for generalized eigenvalue problems

Geometric Tight Frame based Stylometry for Art Authentication of van Gogh Paintings

Inertial primal-dual algorithms for structured convex optimization