Source author record

Thanh Tran

Thanh Tran appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

math.NA, Information Retrieval, math.AP, Artificial Intelligence, Computation and Language, eess.AS, Social and Information Networks, Sound, cs.CY, Human-Computer Interaction, Machine Learning, math-ph, math.MP, Neural and Evolutionary Computing

Catalog footprint

What is connected

17 works

14 topics

4 close collaborators

preprint · 2024 · arXiv

CAPTAIN at COLIEE 2023: Efficient Methods for Legal Information Retrieval and Entailment Tasks

The Competition on Legal Information Extraction/Entailment (COLIEE) is held annually to encourage advancements in the automatic processing of legal texts. Processing legal documents is challenging due to the intricate structure and meaning of legal language. In this paper, we outline our strategies for tackling Task 2, Task 3, and Task 4 in the COLIEE 2023 competition. Our approach involved utilizing appropriate state-of-the-art deep learning methods, designing methods based on domain characteristics observation, and applying meticulous engineering practices and methodologies to the competition. As a result, our performance in these tasks has been outstanding, with first places in Task 2 and Task 3, and promising results in Task 4. Our source code is available at https://github.com/Nguyen2015/CAPTAIN-COLIEE2023/tree/coliee2023.

preprint · 2022 · arXiv

A neural prosody encoder for end-to-end dialogue act classification

Dialogue act classification (DAC) is a critical task for spoken language understanding in dialogue systems. Prosodic features such as energy and pitch have been shown to be useful for DAC. Despite their importance, little research has explored neural approaches to integrate prosodic features into end-to-end (E2E) DAC models which infer dialogue acts directly from audio signals. In this work, we propose an E2E neural architecture that takes into account the need for characterizing prosodic phenomena co-occurring at different levels inside an utterance. A novel part of this architecture is a learnable gating mechanism that assesses the importance of prosodic features and selectively retains core information necessary for E2E DAC. Our proposed model improves DAC accuracy by 1.07% absolute across three publicly available benchmark datasets.

preprint · 2022 · arXiv

Denoising Induction Motor Sounds Using an Autoencoder

Denoising is the process of removing noise from sound signals while improving the quality and adequacy of the sound signals. Denoising sound has many applications in speech processing, sound event classification, and machine failure detection systems. This paper describes a method for creating an autoencoder to map noisy machine sounds to clean sounds for denoising purposes. There are several types of noise in sounds, for example, environmental noise and generated frequency-dependent noise from signal processing methods. Noise generated by environmental activities is environmental noise. In the factory, environmental noise can be created by vehicles, drilling, people working or talking in the survey area, wind, and flowing water. Those noises appear as spikes in the sound record. In the scope of this paper, we demonstrate the removal of generated noise with Gaussian distribution and the environmental noise with a specific example of the water sink faucet noise from the induction motor sounds. The proposed method was trained and verified on 49 normal function sounds and 197 horizontal misalignment fault sounds from the Machinery Fault Database (MAFAULDA). The mean square error (MSE) was used as the assessment criterion to evaluate the similarity between denoised sounds using the proposed autoencoder and the original sounds in the test set. The MSE is below or equal to 0.14 when denoising both types of noise on 15 testing sounds of the normal function category. The MSE is below or equal to 0.15 when denoising 60 testing sounds on the horizontal misalignment fault category. The low MSE shows that both the generated Gaussian noise and the environmental noise were almost removed from the original sounds with the proposed trained autoencoder.
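For readers unfamiliar with the evaluation criterion named in this abstract, the following is a minimal sketch of a mean square error between a reference signal and a denoised signal. It assumes both signals are equal-length sequences of samples; the function name `mean_squared_error` and the toy values are hypothetical illustrations, not taken from the paper or from MAFAULDA.

```python
from typing import Sequence


def mean_squared_error(reference: Sequence[float], estimate: Sequence[float]) -> float:
    """Average of squared sample-wise differences between two signals."""
    if len(reference) != len(estimate):
        raise ValueError("signals must have the same number of samples")
    if len(reference) == 0:
        raise ValueError("signals must not be empty")
    squared_errors = [(r - e) ** 2 for r, e in zip(reference, estimate)]
    return sum(squared_errors) / len(squared_errors)


# Toy example (made-up values): a short clean signal and an imperfect reconstruction.
clean = [0.0, 1.0, 0.0, -1.0]
denoised = [0.0, 0.5, 0.0, -0.5]
error = mean_squared_error(clean, denoised)
```

A value near zero indicates that the reconstruction closely matches the reference, which is how the abstract interprets its reported thresholds.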

preprint · 2021 · arXiv

Remarks on Sobolev norms of fractional orders

When a function belonging to a fractional-order Sobolev space is supported in a proper subset of the Lipschitz domain on which the Sobolev space is defined, how does its Sobolev norm as a function on the smaller set compare with its norm on the whole domain? On what do the comparison constants depend? Do different norms behave differently? This article addresses these issues. We prove some inequalities and disprove some misconceptions by counter-examples.

preprint · 2021 · arXiv

What's in a Name? -- Gender Classification of Names with Character Based Machine Learning Models

Gender information is no longer a mandatory input when registering for an account at many leading Internet companies. However, prediction of demographic information such as gender and age remains an important task, especially for mitigating unintentional gender/age bias in recommender systems. Therefore it is necessary to infer the gender of those users who did not provide this information during registration. We consider the problem of predicting the gender of registered users based on their declared name. By analyzing the first names of 100M+ users, we found that genders can be very effectively classified using the composition of the name strings. We propose a number of character based machine learning models, and demonstrate that our models are able to infer the gender of users with much higher accuracy than baseline models. Moreover, we show that using the last names in addition to the first names improves classification performance further.

preprint · 2020 · arXiv

Quaternion-Based Self-Attentive Long Short-Term User Preference Encoding for Recommendation

Quaternion space has brought several benefits over the traditional Euclidean space: Quaternions (i) consist of a real and three imaginary components, encouraging richer representations; (ii) utilize Hamilton product which better encodes the inter-latent interactions across multiple Quaternion components; and (iii) result in a model with smaller degrees of freedom and less prone to overfitting. Unfortunately, most of the current recommender systems rely on real-valued representations in Euclidean space to model either user's long-term or short-term interests. In this paper, we fully utilize Quaternion space to model both user's long-term and short-term preferences. We first propose a QUaternion-based self-Attentive Long term user Encoding (QUALE) to study the user's long-term intents. Then, we propose a QUaternion-based self-Attentive Short term user Encoding (QUASE) to learn the user's short-term interests. To enhance our models' capability, we propose to fuse QUALE and QUASE into one model, namely QUALSE, by using a Quaternion-based gating mechanism. We further develop Quaternion-based Adversarial learning along with the Bayesian Personalized Ranking (QABPR) to improve our model's robustness. Extensive experiments on six real-world datasets show that our fused QUALSE model outperformed 11 state-of-the-art baselines, improving 8.43% at HIT@1 and 10.27% at NDCG@1 on average compared with the best baseline.
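The Hamilton product mentioned in this abstract is the standard multiplication rule for quaternions. The sketch below shows that rule for quaternions stored as `(real, i, j, k)` tuples; the function name `hamilton_product` and the tuple layout are assumptions made for illustration and do not reflect how the QUALSE models are implemented.

```python
from typing import Tuple

Quaternion = Tuple[float, float, float, float]  # (real, i, j, k) components


def hamilton_product(p: Quaternion, q: Quaternion) -> Quaternion:
    """Multiply two quaternions; note the product is not commutative."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i part
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j part
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k part
    )


# Basis elements: i * j gives k, while j * i gives -k.
unit_i: Quaternion = (0.0, 1.0, 0.0, 0.0)
unit_j: Quaternion = (0.0, 0.0, 1.0, 0.0)
ij = hamilton_product(unit_i, unit_j)
ji = hamilton_product(unit_j, unit_i)
```

Every output component mixes all four components of both inputs, which is the cross-component interaction the abstract refers to.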

preprint · 2016 · arXiv

Existence of arbitrarily smooth solutions of the LLG equation in 3D with natural boundary conditions

We prove that the Landau-Lifshitz-Gilbert equation in three space dimensions with homogeneous Neumann boundary conditions admits arbitrarily smooth solutions, given that the initial data is sufficiently close to a constant function.

preprint · 2016 · arXiv

How to Succeed in Crowdfunding: a Long-Term Study in Kickstarter

Crowdfunding platforms have become important sites where people can create projects to seek funds toward turning their ideas into products, and back someone else's projects. As news media have reported successfully funded projects (e.g., Pebble Time, Coolest Cooler), more people have joined crowdfunding platforms and launched projects. But in spite of rapid growth in the number of users and projects, the overall project success rate has been decreasing because projects are launched without enough preparation and experience. Little is known about how project creators reacted (e.g., giving up or making the failed projects better) when projects failed, and what types of successful projects we can find. To solve these problems, in this manuscript we (i) collect the largest datasets from Kickstarter, consisting of all project profiles, corresponding user profiles, projects' temporal data and users' social media information; (ii) analyze characteristics of successful projects, behaviors of users and understand dynamics of the crowdfunding platform; (iii) propose novel statistical approaches to predict whether a project will be successful and a range of expected pledged money of the project; (iv) develop predictive models and evaluate performance of the models; (v) analyze how project creators reacted when projects failed, and if they did not give up, how they made the failed projects successful; and (vi) cluster successful projects by their evolutional patterns of pledged money toward understanding what efforts project creators should make in order to get more pledged money. Our experimental results show that the predictive models can effectively predict project success and a range of expected pledged money.

preprint · 2016 · arXiv

Reconstruction of the electric field of the Helmholtz equation in 3D

In this paper, we rigorously investigate the truncation method for the Cauchy problem of Helmholtz equations which is widely used to model propagation phenomena in physical applications. The method is a well-known approach to the regularization of several types of ill-posed problems, including the model postulated by Regińska and Regiński [RR06]. Under certain specific assumptions, we examine the ill-posedness of the non-homogeneous problem by exploring the representation of solutions based on Fourier mode. Then the so-called regularized solution is established with respect to a frequency bounded by an appropriate regularization parameter. Furthermore, we provide a short analysis of the nonlinear forcing term. The main results show the stability as well as the strong convergence confirmed by the error estimates in $L^2$-norm of such regularized solutions. Besides, the regularization parameters are formulated properly. Finally, some illustrative examples are provided to corroborate our qualitative analysis.

preprint · 2016 · arXiv

The Eddy Current-LLG Equations-Part I: FEM-BEM Coupling

We analyse a numerical method for the coupled system of the eddy current equations in $\mathbb{R}^3$ with the Landau-Lifshitz-Gilbert equation in a bounded domain. The unbounded domain is discretised by means of finite-element/boundary-element coupling. Even though the considered problem is strongly nonlinear, the numerical approach is constructed such that only two linear systems per time step have to be solved. In this first part of the paper, we prove unconditional weak convergence (of a subsequence) of the finite-element solutions towards a weak solution. A priori error estimates will be presented in the second part.

preprint · 2016 · arXiv

Understanding Citizen Reactions and Ebola-Related Information Propagation on Social Media

In severe outbreaks such as Ebola, bird flu and SARS, people share news, and their thoughts and responses regarding the outbreaks on social media. Understanding how people perceive the severe outbreaks, what their responses are, and what factors affect these responses becomes important. In this paper, we conduct a comprehensive study of understanding and mining the spread of Ebola-related information on social media. In particular, we (i) conduct a large-scale data-driven analysis of geotagged social media messages to understand citizen reactions regarding Ebola; (ii) build information propagation models which measure locality of information; and (iii) analyze spatial, temporal and social properties of Ebola-related information. Our work provides new insights into the Ebola outbreak by understanding citizen reactions and topic-based information propagation, as well as providing a foundation for analysis and response of future public health crises.

preprint · 2015 · arXiv

A finite element approximation for the stochastic Landau-Lifshitz-Gilbert equation

The stochastic Landau-Lifshitz-Gilbert (LLG) equation describes the behaviour of the magnetization under the influence of the effective field consisting of random fluctuations. We first reformulate the equation into an equation the unknown of which is differentiable with respect to the time variable. We then propose a convergent $\theta$-linear scheme for the numerical solution of the reformulated equation. As a consequence, we show the existence of weak martingale solutions to the stochastic LLG equation. A salient feature of this scheme is that it does not involve a nonlinear system, and that no condition on time and space steps is required when $\theta\in(\frac{1}{2},1]$. Numerical results are presented to show the applicability of the method.

preprint · 2014 · arXiv

A mixed discontinuous Galerkin method for the time harmonic elasticity problem with reduced symmetry

The aim of this paper is to analyze a mixed discontinuous Galerkin discretization of the time-harmonic elasticity problem. The symmetry of the Cauchy stress tensor is imposed weakly, as in the traditional dual-mixed setting. We show that the discontinuous Galerkin scheme is well-posed and uniformly stable with respect to the mesh parameter $h$ and the Lamé coefficient $\lambda$. We also derive optimal a-priori error bounds in the energy norm. Several numerical tests are presented in order to illustrate the performance of the method and confirm the theoretical results.

preprint · 2014 · arXiv

A shape calculus based method for a transmission problem with random interface

The present work is devoted to approximation of the statistical moments of the unknown solution of a class of elliptic transmission problems in $\mathbb{R}^3$ with randomly perturbed interfaces. Within this model, the diffusion coefficient has a jump discontinuity across the random transmission interface which models linear diffusion in two different media separated by an uncertain surface. We apply the shape calculus approach to approximate the solution's perturbation by the so-called shape derivative; correspondingly, statistical moments of the solution's perturbation are approximated by the moments of the shape derivative. We characterize the shape derivative as a solution of a related homogeneous transmission problem with nonzero jump conditions which can be solved with the aid of boundary integral equations. We develop a rigorous theoretical framework for this method, particularly i) extending the method to the case of unbounded domains and ii) closing the gaps and clarifying and adapting results in the existing literature. The theoretical findings are supported by and illustrated in two particular examples.

preprint · 2013 · arXiv

A mixed method for Dirichlet problems with radial basis functions

We present a simple discretization by radial basis functions for the Poisson equation with Dirichlet boundary condition. A Lagrangian multiplier using piecewise polynomials is used to accommodate the boundary condition. This simplifies previous attempts to use radial basis functions in the interior domain to approximate the solution and on the boundary to approximate the multiplier, which technically requires that the mesh norm in the interior domain is significantly smaller than that on the boundary. Numerical experiments confirm theoretical results.

preprint · 2013 · arXiv

On a decoupled linear FEM integrator for Eddy-current-LLG

We propose a numerical integrator for the coupled system of the eddy-current equation with the nonlinear Landau-Lifshitz-Gilbert equation. The considered effective field contains a general field contribution, and we particularly cover exchange, anisotropy, applied field, and magnetic field (stemming from the eddy-current equation). Even though the considered problem is nonlinear, our scheme requires only the solution of two linear systems per time-step. Moreover, our algorithm decouples both equations so that in each time-step, one linear system is solved for the magnetization, and afterwards one linear system is solved for the magnetic field. Unconditional convergence -- at least of a subsequence -- towards a weak solution is proved, and our analysis even provides existence of such weak solutions. Numerical experiments with a micromagnetic benchmark problem underline the performance of the proposed algorithm.

preprint · 2012 · arXiv

Radial basis functions for the solution of hypersingular operators on open surfaces

We analyze the approximation by radial basis functions of a hypersingular integral equation on an open surface. In order to accommodate the homogeneous essential boundary condition along the surface boundary, scaled radial basis functions on an extended surface and Lagrangian multipliers on the extension are used. We prove that our method converges quasi-optimally. Approximation results for scaled radial basis functions indicate that, for highly regular radial basis functions, the achieved convergence rates are close to the one of low-order conforming boundary element schemes. Numerical experiments confirm our conclusions.