Source author record

Won-Sook Lee

Won-Sook Lee appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Computer Vision, eess.IV, Machine Learning, Computation and Language, Artificial Intelligence, eess.AS, Graphics, Human-Computer Interaction, Neural and Evolutionary Computing, Robotics, Sound, Systems and Control

Catalog footprint

What is connected

7 works

12 topics

4 close collaborators


preprint, 2020, arXiv

Deep Learning for Automatic Tracking of Tongue Surface in Real-time Ultrasound Videos, Landmarks instead of Contours

One use of medical ultrasound imaging is to visualize and characterize human tongue shape and motion during real-time speech in order to study healthy or impaired speech production. Because ultrasound images are low-contrast and noisy, non-expert users may need expert help to recognize tongue gestures in applications such as visual training for a second language. Moreover, quantitative analysis of tongue motion requires the tongue dorsum contour to be extracted, tracked, and visualized. Manual tongue contour extraction is cumbersome, subjective, and error-prone, and it is not feasible for real-time applications. Deep learning has been widely applied to computer vision tasks, including ultrasound tongue contour tracking. In current methods, tongue contour extraction comprises two steps: image segmentation and post-processing. This paper presents a novel approach to automatic, real-time tongue contour tracking using deep neural networks. In the proposed method, landmarks on the tongue surface are tracked instead of following the two-step procedure. This idea enables researchers in this field to benefit from previously annotated databases to achieve highly accurate results. Our experiments demonstrated the strong generalization, performance, and accuracy of the proposed technique.

preprint, 2020, arXiv

HRINet: Alternative Supervision Network for High-resolution CT image Interpolation

Image interpolation is highly important in medical imaging because most 3D biomedical volume images are sampled such that the distance between consecutive slices is significantly greater than the in-plane pixel size, owing to radiation dose or scanning time constraints. Image interpolation creates a number of new slices between known slices in order to obtain an isotropic volume image. The results can be used for higher-quality 3D reconstruction and visualization of human body structures. Semantic interpolation on the manifold has proved very useful for smoothing image interpolation. Nevertheless, all previous methods focused on low-resolution image interpolation, and most of them work poorly on high-resolution images. We propose a novel network, High Resolution Interpolation Network (HRINet), aimed at producing high-resolution CT image interpolations. We combine the ideas of ACAI and GANs and propose an alternative supervision method that applies supervised and unsupervised training alternately to raise the accuracy of human organ structures in CT while keeping high quality. We compare an MSE-based and a perceptual-based loss optimization method for high-quality interpolation and show the tradeoff between structural correctness and sharpness. Our experiments show a large improvement on 256² and 512² images, both quantitatively and qualitatively.

preprint, 2020, arXiv

IrisNet: Deep Learning for Automatic and Real-time Tongue Contour Tracking in Ultrasound Video Data using Peripheral Vision

The progress of deep convolutional neural networks has been successfully exploited in various real-time computer vision tasks such as image classification and segmentation. Owing to advances in computational units, the availability of digital datasets, and the improved performance of deep learning models, fully automatic and accurate tracking of tongue contours in real-time ultrasound data has become practical only in recent years. Recent studies have shown that deep learning techniques perform well in tracking ultrasound tongue contours in real-time applications such as pronunciation training using multimodal ultrasound-enhanced approaches. Because ultrasound tongue datasets are highly correlated, a general model that accomplishes automatic tongue tracking for almost all datasets is feasible. In this paper, we propose a deep learning model comprising a convolutional module that mimics the peripheral vision ability of the human eye to handle real-time, accurate, and fully automatic tongue contour tracking, applicable to almost all primary ultrasound tongue datasets. Qualitative and quantitative assessment of IrisNet on different ultrasound tongue datasets and PASCAL VOC2012 revealed its outstanding generalization compared with similar techniques.

preprint, 2019, arXiv

Real-time Ultrasound-enhanced Multimodal Imaging of Tongue using 3D Printable Stabilizer System: A Deep Learning Approach

Despite renewed awareness of the importance of articulation, it remains a challenge for instructors to handle the pronunciation needs of language learners. Pedagogical tools for pronunciation teaching and learning are relatively scarce. In contrast to inefficient traditional pronunciation instruction such as listening and repeating, new approaches have employed electronic visual feedback (EVF) systems such as ultrasound technology. Recently, an ultrasound-enhanced multimodal method has been developed for visualizing the tongue movements of a language learner overlaid on the side of the speaker's head. That system was evaluated in several language courses via a blended learning paradigm at the university level. The results indicated that visualizing the articulatory system as biofeedback to language learners significantly improves the efficiency of articulation learning. Despite the successful use of multimodal techniques for pronunciation training, they still require manual work and human manipulation. In this article, we aim to contribute to this growing body of research by addressing the difficulties of previous approaches and proposing a new comprehensive, automatic, real-time multimodal pronunciation training system that benefits from powerful artificial intelligence techniques. The main objective of this research was to combine the advantages of ultrasound technology, three-dimensional printing, and deep learning algorithms to enhance the performance of previous systems. Our preliminary pedagogical evaluation of the proposed system revealed a significant improvement in flexibility, control, robustness, and autonomy.

preprint, 2016, arXiv

Multi-Objective Design of State Feedback Controllers Using Reinforced Quantum-Behaved Particle Swarm Optimization

In this paper, a novel and generic multi-objective design paradigm is proposed that uses quantum-behaved PSO (QPSO) to decide the optimal configuration of the LQR controller for a given problem, considering a set of competing objectives. The paper makes three main contributions. (1) The standard QPSO algorithm is reinforced with an informed initialization scheme based on the simulated annealing algorithm and a Gaussian neighborhood selection mechanism. (2) It is also augmented with a local search strategy that integrates the advantages of memetic algorithms into conventional QPSO. (3) An aggregated dynamic weighting criterion is introduced that dynamically combines the soft and hard constraints with control objectives to provide the designer with a set of Pareto-optimal solutions and lets her decide the target solution based on practical preferences. The proposed method is compared against a gradient-based method, seven meta-heuristics, and the trial-and-error method on two control benchmarks using sensitivity analysis and full factorial parameter selection, and the results are validated using a one-tailed t-test. The experimental results suggest that the proposed method outperforms the competing methods in terms of controller effort, measures associated with transient response, and criteria related to steady state.
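
For readers unfamiliar with the baseline this abstract builds on, the following is a minimal sketch of the conventional QPSO position update only. It is not the reinforced variant the paper proposes: the simulated-annealing initialization, the local search, and the LQR design objectives are not reproduced. The `sphere` objective, the function name `qpso`, the search bounds, and the linearly decreasing `beta` schedule are all assumptions chosen for illustration.

```python
import math
import random


def sphere(x):
    """Toy objective (an assumption for illustration): sum of squares."""
    return sum(v * v for v in x)


def qpso(objective, dim=2, n_particles=20, iters=200, lo=-5.0, hi=5.0, seed=0):
    """Plain QPSO minimizer; returns (best position, best value, initial best value)."""
    rng = random.Random(seed)
    # Uniform random initialization (the paper's informed initialization is NOT used here).
    xs = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    pbest = [x[:] for x in xs]
    pbest_f = [objective(x) for x in xs]
    g = min(range(n_particles), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g][:], pbest_f[g]
    initial_f = gbest_f

    for t in range(iters):
        # Contraction-expansion coefficient, decreasing linearly from 1.0 toward 0.5.
        beta = 1.0 - 0.5 * t / iters
        # Mean of all personal best positions.
        mbest = [sum(p[d] for p in pbest) / n_particles for d in range(dim)]
        for i in range(n_particles):
            for d in range(dim):
                phi = rng.random()
                # Local attractor between the personal best and the global best.
                attractor = phi * pbest[i][d] + (1.0 - phi) * gbest[d]
                u = 1.0 - rng.random()  # in (0, 1], so log(1/u) is finite
                step = beta * abs(mbest[d] - xs[i][d]) * math.log(1.0 / u)
                x_new = attractor + step if rng.random() < 0.5 else attractor - step
                xs[i][d] = min(hi, max(lo, x_new))  # clamp to the search box
            f = objective(xs[i])
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = xs[i][:], f
                if f < gbest_f:
                    gbest, gbest_f = xs[i][:], f
    return gbest, gbest_f, initial_f


best, best_f, initial_f = qpso(sphere)
print(best, best_f, initial_f)
```

In a controller-design setting, the toy objective would be replaced by a cost that scores a candidate set of LQR weights against the competing design objectives; that mapping is specific to the paper and is not sketched here.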

preprint, 2016, arXiv

Multilevel Thresholding Segmentation of T2 weighted Brain MRI images using Convergent Heterogeneous Particle Swarm Optimization

This paper proposes a new image thresholding segmentation approach using a heuristic method, the Convergent Heterogeneous Particle Swarm Optimization algorithm. The proposed algorithm incorporates a new strategy for searching the problem space by dividing the swarm into subswarms. The particles of each subswarm search for better solutions separately, which leads to better exploitation, while they cooperate with each other to find the best global position. This cooperation yields better exploration and convergence, and it enables the algorithm to jump from a local optimum to better regions. A practical application of this method is demonstrated for the problem of medical image thresholding segmentation. We considered two classical thresholding techniques, Otsu and Kapur, separately as the objective function for the optimization method and applied them to a set of brain MR images. Comparative experimental results reveal that the proposed method outperforms another state-of-the-art method from the literature in terms of accuracy, computation time, and stability of results.
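
As a rough illustration of what "Otsu as the objective function" can mean in multilevel thresholding, the sketch below scores a set of thresholds by the between-class variance of a gray-level histogram. The swarm optimizer from the paper is not reproduced; an exhaustive search over a tiny, made-up histogram stands in for it. The function names, the convention that a threshold `t` places levels below `t` in the lower class, and the histogram values are assumptions for illustration.

```python
from itertools import combinations


def between_class_variance(hist, thresholds):
    """Otsu-style objective: weighted squared distance of class means from the global mean.

    hist[i] is the pixel count at gray level i; a threshold t puts levels < t
    in the lower class (an assumed convention).
    """
    total = sum(hist)
    if total == 0:
        return 0.0
    mu_total = sum(i * h for i, h in enumerate(hist)) / total
    edges = [0] + sorted(thresholds) + [len(hist)]
    variance = 0.0
    for a, b in zip(edges, edges[1:]):
        count = sum(hist[a:b])
        if count == 0:
            continue  # an empty class contributes nothing
        mu = sum(i * hist[i] for i in range(a, b)) / count
        variance += (count / total) * (mu - mu_total) ** 2
    return variance


def best_thresholds(hist, k):
    """Exhaustive search over k thresholds; a stand-in for the swarm search."""
    levels = range(1, len(hist))
    return max(combinations(levels, k),
               key=lambda ts: between_class_variance(hist, ts))


# Made-up 10-level histogram with three well-separated groups of gray levels.
hist = [4, 6, 0, 0, 5, 5, 0, 0, 7, 3]
t1, t2 = best_thresholds(hist, 2)
print(t1, t2)
```

Exhaustive search grows combinatorially with the number of thresholds and gray levels, which is the usual motivation for handing this objective to a metaheuristic such as a particle swarm.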

preprint, 2016, arXiv

Visualizing Natural Language Descriptions: A Survey

A natural language interface exploits the conceptual simplicity and naturalness of language to create a high-level, user-friendly communication channel between humans and machines. One promising application of such interfaces is generating visual interpretations of the semantic content of a given natural language description, which can then be visualized either as a static scene or as a dynamic animation. This survey discusses the requirements and challenges of developing such systems, reports on 26 graphical systems that exploit natural language interfaces, and addresses both artificial intelligence and visualization aspects. This work serves as a frame of reference for researchers and aims to enable further advances in the field.
