Source author record

Shaohua Li

Shaohua Li appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Data Structures and Algorithms, Machine Learning, Computation and Language, Computer Vision, Artificial Intelligence, eess.IV, Cryptography and Security, Discrete Mathematics, Information Retrieval, Robotics

Catalog footprint

14 works, 10 topics, 4 close collaborators

preprint 2022 arXiv

CRAFT: Cross-Attentional Flow Transformer for Robust Optical Flow

Optical flow estimation aims to find the 2D motion field by identifying corresponding pixels between two images. Despite the tremendous progress of deep learning-based optical flow methods, it remains a challenge to accurately estimate large displacements with motion blur. This is mainly because the correlation volume, the basis of pixel matching, is computed as the dot product of the convolutional features of the two images. The locality of convolutional features makes the computed correlations susceptible to various noises. On large displacements with motion blur, noisy correlations could cause severe errors in the estimated flow. To overcome this challenge, we propose a new architecture "CRoss-Attentional Flow Transformer" (CRAFT), aiming to revitalize the correlation volume computation. In CRAFT, a Semantic Smoothing Transformer layer transforms the features of one frame, making them more global and semantically stable. In addition, the dot-product correlations are replaced with transformer Cross-Frame Attention. This layer filters out feature noises through the Query and Key projections, and computes more accurate correlations. On Sintel (Final) and KITTI (foreground) benchmarks, CRAFT has achieved new state-of-the-art performance. Moreover, to test the robustness of different models on large motions, we designed an image shifting attack that shifts input images to generate large artificial motions. Under this attack, CRAFT performs much more robustly than two representative methods, RAFT and GMA. The code of CRAFT is available at https://github.com/askerlee/craft.
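To make the contrast in this abstract concrete, the following is a minimal pure-Python sketch of an all-pairs dot-product correlation volume next to the same computation after Query and Key projections. It is not the CRAFT implementation: the function names, the flattened list-of-vectors layout, and the projection matrices `Wq` and `Wk` are assumptions chosen for illustration, and the real model (multi-head attention, normalization, learned weights) is considerably richer.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [dot(row, x) for row in W]


def correlation_volume(feats1, feats2):
    """All-pairs correlation: entry [i][j] is <feats1[i], feats2[j]>.

    feats1 and feats2 are lists of per-pixel feature vectors
    (pixels flattened into one list; a hypothetical layout).
    """
    return [[dot(f1, f2) for f2 in feats2] for f1 in feats1]


def projected_correlation(feats1, feats2, Wq, Wk):
    """Correlation after hypothetical Query/Key projections:
    entry [i][j] is <Wq feats1[i], Wk feats2[j]>."""
    queries = [matvec(Wq, f) for f in feats1]
    keys = [matvec(Wk, f) for f in feats2]
    return correlation_volume(queries, keys)


# Two "pixels" per frame with 2-dimensional features.
frame1 = [[1.0, 0.0], [0.0, 1.0]]
frame2 = [[1.0, 0.0], [1.0, 1.0]]

identity = [[1.0, 0.0], [0.0, 1.0]]
keep_first = [[1.0, 0.0], [0.0, 0.0]]  # hypothetical projection that drops dimension 2

plain = correlation_volume(frame1, frame2)
same = projected_correlation(frame1, frame2, identity, identity)
filtered = projected_correlation(frame1, frame2, identity, keep_first)
```

With identity projections the two computations coincide; a projection that suppresses one feature dimension changes which pixel pairs correlate, which is the sense in which projections can filter what enters the correlation.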

preprint 2022 arXiv

REFUGE2 Challenge: A Treasure Trove for Multi-Dimension Analysis and Evaluation in Glaucoma Screening

With the rapid development of artificial intelligence (AI) in medical image processing, deep learning in color fundus photography (CFP) analysis is also evolving. Although there are some open-source, labeled datasets of CFPs in the ophthalmology community, large-scale datasets for screening only have labels of disease categories, and datasets with annotations of fundus structures are usually small in size. In addition, labeling standards are not uniform across datasets, and there is no clear information on the acquisition device. Here we release a multi-annotation, multi-quality, and multi-device color fundus image dataset for glaucoma analysis on an original challenge -- Retinal Fundus Glaucoma Challenge 2nd Edition (REFUGE2). The REFUGE2 dataset contains 2000 color fundus images with annotations of glaucoma classification, optic disc/cup segmentation, as well as fovea localization. Meanwhile, the REFUGE2 challenge sets three sub-tasks of automatic glaucoma diagnosis and fundus structure analysis and provides an online evaluation framework. Based on the characteristics of multi-device and multi-quality data, some methods with strong generalization ability are provided in the challenge to make the predictions more robust. This shows that REFUGE2 brings attention to the characteristics of real-world multi-domain data, bridging the gap between scientific research and clinical application.

preprint 2021 arXiv

Hardness of Metric Dimension in Graphs of Constant Treewidth

The Metric Dimension problem asks for a minimum-sized resolving set in a given (unweighted, undirected) graph $G$. Here, a set $S \subseteq V(G)$ is resolving if no two distinct vertices of $G$ have the same distance vector to $S$. The complexity of Metric Dimension in graphs of bounded treewidth remained elusive in the past years. Recently, Bonnet and Purohit [IPEC 2019] showed that the problem is W[1]-hard under treewidth parameterization. In this work, we strengthen their lower bound to show that Metric Dimension is NP-hard in graphs of treewidth 24.
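The definition of a resolving set can be illustrated with a short brute-force sketch. This is not the paper's construction (the paper proves hardness, and the search below is exponential); it only checks the definition on small graphs. The adjacency-dictionary format and the function names are assumptions, and the graph is assumed to be connected, unweighted and undirected.

```python
from collections import deque
from itertools import combinations


def bfs_distances(adj, source):
    """Shortest-path distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def is_resolving(adj, S):
    """True if no two distinct vertices share the same distance vector to S."""
    dists = [bfs_distances(adj, s) for s in S]
    vectors = {tuple(d[v] for d in dists) for v in adj}
    return len(vectors) == len(adj)


def metric_dimension(adj):
    """Smallest resolving set by exhaustive search (small connected graphs only)."""
    vertices = list(adj)
    for k in range(len(vertices) + 1):
        for S in combinations(vertices, k):
            if is_resolving(adj, S):
                return k, S
    raise ValueError("unreachable for a connected graph")


# A path on four vertices and a cycle on four vertices.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}

path_result = metric_dimension(path)
cycle_result = metric_dimension(cycle)
```

On the path, one endpoint already distinguishes every vertex by its distance; on the 4-cycle, any single vertex sees its two neighbours at the same distance, so two vertices are needed.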

preprint 2020 arXiv

An improved FPT algorithm for Independent Feedback Vertex Set

We study the Independent Feedback Vertex Set problem - a variant of the classic Feedback Vertex Set problem where, given a graph $G$ and an integer $k$, the problem is to decide whether there exists a vertex set $S\subseteq V(G)$ such that $G\setminus S$ is a forest and $S$ is an independent set of size at most $k$. We present an $O^\ast((1+φ^{2})^{k})$-time FPT algorithm for this problem, where $φ<1.619$ is the golden ratio, improving the previous fastest $O^\ast(4.1481^{k})$-time algorithm given by Agrawal et al. [IPEC 2016]. The exponential factor in our time complexity bound matches the fastest deterministic FPT algorithm for the classic Feedback Vertex Set problem. On the technical side, the main novelty is a refined measure of an input instance in a branching process, that allows for a simpler and more concise description and analysis of the algorithm.
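As a small illustration of the problem statement and of the stated running-time base, the sketch below verifies a candidate solution and evaluates $1+φ^{2}$ numerically. It is only a solution checker, not the paper's branching algorithm; the adjacency-dictionary format and function names are assumptions, and the graph is assumed to be simple and undirected.

```python
import math


def is_forest(adj, removed):
    """True if the graph minus the vertices in `removed` contains no cycle."""
    seen = set()
    for root in adj:
        if root in removed or root in seen:
            continue
        seen.add(root)
        stack = [(root, None)]
        while stack:
            u, parent = stack.pop()
            for w in adj[u]:
                if w in removed or w == parent:
                    continue
                if w in seen:
                    return False  # reached an already discovered vertex: cycle
                seen.add(w)
                stack.append((w, u))
    return True


def is_independent(adj, S):
    """True if no two vertices of S are adjacent."""
    return all(w not in S for u in S for w in adj[u])


def is_independent_fvs(adj, S, k):
    """Check the three conditions: size at most k, independent, G minus S is a forest."""
    S = set(S)
    return len(S) <= k and is_independent(adj, S) and is_forest(adj, S)


# Triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
# Complete graph on four vertices: any feedback vertex set has two adjacent vertices.
k4 = {v: [w for w in range(4) if w != v] for v in range(4)}

phi = (1 + math.sqrt(5)) / 2
base = 1 + phi ** 2  # equals 2 + phi because phi**2 == phi + 1
```

Numerically, `base` is about 3.618, which is below the earlier base 4.1481 quoted in the abstract.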

preprint 2020 arXiv

Enabling Cross-chain Transactions: A Decentralized Cryptocurrency Exchange Protocol

Inspired by Bitcoin, many different kinds of cryptocurrencies based on blockchain technology have turned up on the market. Due to the special structure of the blockchain, it has been deemed impossible to directly trade between traditional currencies and cryptocurrencies or between different types of cryptocurrencies. Generally, trading between different currencies is conducted through a centralized third-party platform. However, such a platform is a single point of failure that is vulnerable to attacks, which affects the security of the transactions. In this paper, we propose a distributed cryptocurrency trading scheme to solve the problem of centralized exchanges, which can achieve trading between different types of cryptocurrencies. Our scheme is implemented with smart contracts on the Ethereum blockchain and deployed on the Ethereum test network. We not only implement transactions between individual users, but also allow transactions between multiple users. The experimental results show that the cost of our scheme is acceptable.

preprint 2020 arXiv

Feature Lenses: Plug-and-play Neural Modules for Transformation-Invariant Visual Representations

Convolutional Neural Networks (CNNs) are known to be brittle under various image transformations, including rotations, scalings, and changes of lighting conditions. We observe that the features of a transformed image are drastically different from the ones of the original image. To make CNNs more invariant to transformations, we propose "Feature Lenses", a set of ad-hoc modules that can be easily plugged into a trained model (referred to as the "host model"). Each individual lens reconstructs the original features given the features of a transformed image under a particular transformation. These lenses jointly counteract feature distortions caused by various transformations, thus making the host model more robust without retraining. By only updating lenses, the host model is freed from iterative updating when facing new transformations absent in the training data; as feature semantics are preserved, downstream applications, such as classifiers and detectors, automatically gain robustness without retraining. Lenses are trained in a self-supervised fashion with no annotations, by minimizing a novel "Top-K Activation Contrast Loss" between lens-transformed features and original features. Evaluated on ImageNet, MNIST-rot, and CIFAR-10, Feature Lenses show clear advantages over baseline methods.

preprint 2020 arXiv

Many visits TSP revisited

We study the Many Visits TSP problem, where given a number $k(v)$ for each of $n$ cities and pairwise (possibly asymmetric) integer distances, one has to find an optimal tour that visits each city $v$ exactly $k(v)$ times. The currently fastest algorithm is due to Berger, Kozma, Mnich and Vincze [SODA 2019, TALG 2020] and runs in time and space $\mathcal{O}^*(5^n)$. They also show a polynomial space algorithm running in time $\mathcal{O}^*(16^{n+o(n)})$. In this work, we show three main results: (i) A randomized polynomial space algorithm in time $\mathcal{O}^*(2^nD)$, where $D$ is the maximum distance between two cities. By using standard methods, this results in a $(1+ε)$-approximation in time $\mathcal{O}^*(2^nε^{-1})$. Improving the constant $2$ in these results would be a major breakthrough, as it would result in improving the $\mathcal{O}^*(2^n)$-time algorithm for Directed Hamiltonian Cycle, which is a 50-year-old open problem. (ii) A tight analysis of Berger et al.'s exponential space algorithm, resulting in an $\mathcal{O}^*(4^n)$ running time bound. (iii) A new polynomial space algorithm, running in time $\mathcal{O}(7.88^n)$.

preprint 2019 arXiv

Illumination Robust Loop Closure Detection with the Constraint of Pose

Background: Loop closure detection is a crucial part of robot navigation and simultaneous localization and mapping (SLAM). Appearance-based loop closure detection still faces many challenges, such as illumination changes, perceptual aliasing and increasing computational complexity. Method: In this paper, we propose a visual loop closure detection algorithm that combines the illumination-robust descriptor DIRD with odometry information. The estimated pose and variance are calculated by visual inertial odometry (VIO), then the loop closure candidate areas are found based on the distance between images. We use a new distance combining the Euclidean distance and the Mahalanobis distance, together with a dynamic threshold, to select the loop closure candidate areas. Finally, within the candidate areas, we perform image retrieval with DIRD. Results: The proposed algorithm is evaluated on the KITTI_00 and EuRoC datasets. The results show that the loop closure areas can be correctly detected and the time consumption is effectively reduced. Compared with the SeqSLAM algorithm, the proposed algorithm achieves better performance on the PR curve.

preprint 2016 arXiv

Generative Topic Embedding: a Continuous Representation of Documents (Extended Version with Proofs)

Word embedding maps words into a low-dimensional continuous embedding space by exploiting the local word collocation patterns in a small context window. On the other hand, topic modeling maps documents onto a low-dimensional topic space, by utilizing the global word collocation patterns in the same document. These two types of patterns are complementary. In this paper, we propose a generative topic embedding model to combine the two types of patterns. In our model, topics are represented by embedding vectors, and are shared across documents. The probability of each word is influenced by both its local context and its topic. A variational inference method yields the topic embeddings as well as the topic mixing proportions for each document. Jointly they represent the document in a low-dimensional continuous space. In two document classification tasks, our method performs better than eight existing methods, with fewer features. In addition, we illustrate with an example that our method can generate coherent topics even based on only one document.

preprint 2016 arXiv

PSDVec: a Toolbox for Incremental and Scalable Word Embedding

PSDVec is a Python/Perl toolbox that learns word embeddings, i.e. the mapping of words in a natural language to continuous vectors which encode the semantic/syntactic regularities between the words. PSDVec implements a word embedding learning method based on a weighted low-rank positive semidefinite approximation. To scale up the learning process, we implement a blockwise online learning algorithm to learn the embeddings incrementally. This strategy greatly reduces the learning time of word embeddings on a large vocabulary, and can learn the embeddings of new words without re-learning the whole vocabulary. On 9 word similarity/analogy benchmark sets and 2 Natural Language Processing (NLP) tasks, PSDVec produces embeddings that have the best average performance among popular word embedding tools. PSDVec provides a new option for NLP practitioners.

preprint 2015 arXiv

A Generative Word Embedding Model and its Low Rank Positive Semidefinite Solution

Most existing word embedding methods can be categorized into Neural Embedding Models and Matrix Factorization (MF)-based methods. However, some models are opaque to probabilistic interpretation, and MF-based methods, typically solved using Singular Value Decomposition (SVD), may incur loss of corpus information. In addition, it is desirable to incorporate global latent factors, such as topics, sentiments or writing styles, into the word embedding model. Since generative models provide a principled way to incorporate latent factors, we propose a generative word embedding model, which is easy to interpret, and can serve as a basis of more sophisticated latent factor models. The model inference reduces to a low rank weighted positive semidefinite approximation problem. Its optimization is approached by eigendecomposition on a submatrix, followed by online blockwise regression, which is scalable and avoids the information loss in SVD. In experiments on 7 common benchmark datasets, our vectors are competitive with word2vec, and better than other MF-based methods.

preprint 2015 arXiv

Cascade hash tables: a series of multilevel double hashing schemes with O(1) worst case lookup time

In this paper, the author proposes a series of multilevel double hashing schemes called cascade hash tables. They use several levels of hash tables, each of which applies the common double hashing scheme. Higher-level hash tables work as fail-safes for lower-level hash tables. This strategy effectively reduces collisions during insertion, and thus achieves a constant worst-case lookup time with a relatively high load factor (70%-85%) in random experiments. Different parameters of cascade hash tables are tested.
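The idea described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's scheme: the class name, the table sizes, the probe limit per level, the two hash functions, the restriction to non-negative integer keys, and the overflow behaviour are all hypothetical, and deletion is not supported. Each level uses double hashing with a bounded number of probes, so a lookup touches at most `len(sizes) * max_probes` slots.

```python
class CascadeHashTable:
    """Multilevel double hashing sketch: higher levels back up lower ones."""

    def __init__(self, sizes=(5, 7), max_probes=2):
        # Prime sizes keep the double-hashing probe sequence free of repeats.
        self.sizes = sizes
        self.max_probes = max_probes
        self.levels = [[None] * size for size in sizes]

    def _probes(self, key, level):
        """Yield the bounded double-hashing probe sequence for one level."""
        size = self.sizes[level]
        start = key % size              # first hash (assumption)
        step = 1 + key % (size - 1)     # second hash, never zero (assumption)
        for i in range(self.max_probes):
            yield (start + i * step) % size

    def put(self, key, value):
        """Store the pair in the lowest level that has a free probed slot."""
        for level, table in enumerate(self.levels):
            for slot in self._probes(key, level):
                if table[slot] is None or table[slot][0] == key:
                    table[slot] = (key, value)
                    return level, slot
        raise OverflowError("all levels exhausted; a real table would resize")

    def locate(self, key):
        """Return (level, slot) of the key, or None if it is absent."""
        for level, table in enumerate(self.levels):
            for slot in self._probes(key, level):
                if table[slot] is None:
                    # Without deletions, insertion would have used this slot.
                    return None
                if table[slot][0] == key:
                    return level, slot
        return None

    def get(self, key):
        found = self.locate(key)
        if found is None:
            return None
        level, slot = found
        return self.levels[level][slot][1]


table = CascadeHashTable()
table.put(0, "a")    # level 0, slot 0
table.put(20, "b")   # collides at slot 0, second probe lands on slot 1
table.put(40, "c")   # both level-0 probes are taken, falls through to level 1
```

Here the key 40 exhausts its two probes in the first level and is stored in the second, which is the fail-safe behaviour the abstract describes; the lookup cost stays bounded by the fixed number of levels and probes.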

preprint 2015 arXiv

Factorized Asymptotic Bayesian Inference for Factorial Hidden Markov Models

Factorial hidden Markov models (FHMMs) are powerful tools for modeling sequential data. Learning FHMMs yields a challenging simultaneous model selection issue, i.e., selecting the number of multiple Markov chains and the dimensionality of each chain. Our main contribution is to address this model selection issue by extending Factorized Asymptotic Bayesian (FAB) inference to FHMMs. First, we offer a better approximation of marginal log-likelihood than the previous FAB inference. Our key idea is to integrate out transition probabilities, yet still apply the Laplace approximation to emission probabilities. Second, we prove that if there are two very similar hidden states in an FHMM, i.e. one is redundant, then FAB will almost surely shrink and eliminate one of them, making the model parsimonious. Experimental results show that FAB for FHMMs significantly outperforms state-of-the-art nonparametric Bayesian iFHMM and Variational FHMM in model selection accuracy, with competitive held-out perplexity.

preprint 2015 arXiv

On the Equivalence of Factorized Information Criterion Regularization and the Chinese Restaurant Process Prior

Factorized Information Criterion (FIC) is a recently developed information criterion, based on which a novel model selection methodology, namely Factorized Asymptotic Bayesian (FAB) Inference, has been developed and successfully applied to various hierarchical Bayesian models. The Dirichlet Process (DP) prior, and one of its well known representations, the Chinese Restaurant Process (CRP), derive another line of model selection methods. FIC can be viewed as a prior distribution over the latent variable configurations. Under this view, we prove that when the parameter dimensionality $D_{c}=2$, FIC is equivalent to CRP. We argue that when $D_{c}>2$, FIC avoids an inherent problem of DP/CRP, i.e. the data likelihood will dominate the impact of the prior, and thus the model selection capability will weaken as $D_{c}$ increases. However, FIC overestimates the data likelihood. As a result, FIC may be overly biased towards models with fewer components. We propose a natural generalization of FIC, which finds a middle ground between CRP and FIC, and may yield more accurate model selection results than FIC.
