Researcher profile

Jónathan Heras

Jónathan Heras contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
19works
0followers
11topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

19 published item(s)

preprint2022arXiv

A Topological Approach for Semi-Supervised Learning

Nowadays, Machine Learning and Deep Learning methods have become the state-of-the-art approach to solve data classification tasks. In order to use those methods, it is necessary to acquire and label a considerable amount of data; however, this is not straightforward in some fields, since data annotation is time consuming and might require expert knowledge. This challenge can be tackled by means of semi-supervised learning methods that take advantage of both labelled and unlabelled data. In this work, we present new semi-supervised learning methods based on techniques from Topological Data Analysis (TDA), a field that is gaining importance for analysing large amounts of data with high variety and dimensionality. In particular, we have created two semi-supervised learning methods following two different topological approaches. In the former, we have used a homological approach that consists in studying the persistence diagrams associated with the data using the Bottleneck and Wasserstein distances. In the latter, we have taken into account the connectivity of the data. In addition, we have carried out a thorough analysis of the developed methods using 3 synthetic datasets, 5 structured datasets, and 2 datasets of images. The results show that the semi-supervised methods developed in this work outperform both the results obtained with models trained with only manually labelled data, and those obtained with classical semi-supervised learning methods, reaching improvements of up to a 16%.

preprint2022arXiv

Neural Style Transfer and Unpaired Image-to-Image Translation to deal with the Domain Shift Problem on Spheroid Segmentation

Background and objectives. Domain shift is a generalisation problem of machine learning models that occurs when the data distribution of the training set is different to the data distribution encountered by the model when it is deployed. This is common in the context of biomedical image segmentation due to the variance of experimental conditions, equipment, and capturing settings. In this work, we address this challenge by studying both neural style transfer algorithms and unpaired image-to-image translation methods in the context of the segmentation of tumour spheroids. Methods. We have illustrated the domain shift problem in the context of spheroid segmentation with 4 deep learning segmentation models that achieved an IoU over 97% when tested with images following the training distribution, but whose performance decreased up to an 84\% when applied to images captured under different conditions. In order to deal with this problem, we have explored 3 style transfer algorithms (NST, deep image analogy, and STROTSS), and 6 unpaired image-to-image translations algorithms (CycleGAN, DualGAN, ForkGAN, GANILLA, CUT, and FastCUT). These algorithms have been integrated into a high-level API that facilitates their application to other contexts where the domain-shift problem occurs. Results. We have considerably improved the performance of the 4 segmentation models when applied to images captured under different conditions by using both style transfer and image-to-image translation algorithms. In particular, there are 2 style transfer algorithms (NST and deep image analogy) and 1 unpaired image-to-image translations algorithm (CycleGAN) that improve the IoU of the models in a range from 0.24 to 76.07. Therefore, reaching a similar performance to the one obtained with the models are applied to images following the training distribution.

preprint2022arXiv

Semi-Supervised Learning for Image Classification using Compact Networks in the BioMedical Context

The development of mobile and on the edge applications that embed deep convolutional neural models has the potential to revolutionise biomedicine. However, most deep learning models require computational resources that are not available in smartphones or edge devices; an issue that can be faced by means of compact models. The problem with such models is that they are, at least usually, less accurate than bigger models. In this work, we study how this limitation can be addressed with the application of semi-supervised learning techniques. We conduct several statistical analyses to compare performance of deep compact architectures when trained using semi-supervised learning methods for tackling image classification tasks in the biomedical context. In particular, we explore three families of compact networks, and two families of semi-supervised learning techniques for 10 biomedical tasks. By combining semi-supervised learning methods with compact networks, it is possible to obtain a similar performance to standard size networks. In general, the best results are obtained when combining data distillation with MixNet, and plain distillation with ResNet-18. Also, in general, NAS networks obtain better results than manually designed networks and quantized networks. The work presented in this paper shows the benefits of apply semi-supervised methods to compact networks; this allow us to create compact models that are not only as accurate as standard size models, but also faster and lighter. Finally, we have developed a library that simplifies the construction of compact models using semi-supervised learning methods.

preprint2014arXiv

(Co)recursion in Logic Programming: Lazy vs Eager

CoAlgebraic Logic Programming (CoALP) is a dialect of Logic Programming designed to bring a more precise compile-time and run-time analysis of termination and productivity for recursive and corecursive functions in Logic Programming. Its second goal is to introduce guarded lazy (co)recursion akin to functional theorem provers into logic programming. In this paper, we explain lazy features of CoALP, and compare them with the loop-analysis and eager execution in Coinductive Logic Programming (CoLP). We conclude by outlining the future directions in developing the guarded (co)recursion in logic programming.

preprint2014arXiv

ACL2(ml): Machine-Learning for ACL2

ACL2(ml) is an extension for the Emacs interface of ACL2. This tool uses machine-learning to help the ACL2 user during the proof-development. Namely, ACL2(ml) gives hints to the user in the form of families of similar theorems, and generates auxiliary lemmas automatically. In this paper, we present the two most recent extensions for ACL2(ml). First, ACL2(ml) can suggest now families of similar function definitions, in addition to the families of similar theorems. Second, the lemma generation tool implemented in ACL2(ml) has been improved with a method to generate preconditions using the guard mechanism of ACL2. The user of ACL2(ml) can also invoke directly the latter extension to obtain preconditions for his own conjectures.

preprint2014arXiv

Defining and computing persistent Z-homology in the general case

By general case we mean methods able to process simplicial sets and chain complexes not of finite type. A filtration of the object to be studied is the heart of both subjects persistent homology and spectral sequences. In this paper we present the complete relation between them, both from theoretical and computational points of view. One of the main contributions of this paper is the observation that a slight modification of our previous programs computing spectral sequences is enough to compute also persistent homology. By inheritance from our spectral sequence programs, we obtain for free persistent homology programs applicable to spaces not of finite type (provided they are spaces with effective homology) and with Z-coefficients (significantly generalizing the usual presentation of persistent homology over a field). As an illustration, we compute some persistent homology groups (and the corresponding integer barcodes) in the case of a Postnikov tower.

preprint2014arXiv

HoTT formalisation in Coq: Dependency Graphs \& ML4PG

This note is a response to Bas Spitter's email of 28 February 2014 about ML4PG: "We (Jason actually) are adding dependency graphs to our HoTT library: https://github.com/HoTT/HoTT/wiki I seem to recall that finding the dependency graph was a main obstacle preventing machine learning to be used in Coq. However, you seem to have made progress on this. What tool are you using? https://anne.pacalet.fr/dev/dpdgraph/ ? Or another tool? Would it be easy to use your ML4PG on the HoTT library?" This note gives explanation of how ML4PG can be used in the HoTT library and how ML4PG relates to the two kinds of Dependency graphs available in Coq.

preprint2014arXiv

Procesamiento topo-geométrico de imágenes neuronales

Fruit of the relationship of our research group with the team coordinated by the biologist Miguel Morales (http://spineup.es), we have applied different topo-geometric techniques for neuronal image processing. The images, captured with a powerful confocal microscope, allow to study the evolution of synaptic density under the influence of various substances, with the aim of studying neurodegenerative diseases like Alzheimer. In the paper we make a brief review of the techniques that appear in our bioinformatic problems, including the calculation of ordinary and persistent homology (for which one can use the program Kenzo for symbolic computation in algebraic topology ) and classical problems of digital topology as skeleton location and path tracking. We focus on some particular cases of recent application, with which we will illustrate the previous techniques.

preprint2014arXiv

Proof Pattern Search in Coq/SSReflect

ML4PG is an extension of the Proof General interface, allowing the user to invoke machine-learning algorithms and find proof similarities in Coq/SSReect libraries. In this paper, we present three new improvements to ML4PG. First, a new method of "recurrent clustering" is introduced to collect statistical features from Coq terms. Now the user can receive suggestions about similar definitions, types and lemma statements, in addition to proof strategies. Second, Coq proofs are split into patches to capture proof strategies that could arise at different stages of a proof. Finally, we improve ML4PG's output introducing an automaton-shape representation for proof patterns.

preprint2014arXiv

Recycling Proof Patterns in Coq: Case Studies

Development of Interactive Theorem Provers has led to the creation of big libraries and varied infrastructures for formal proofs. However, despite (or perhaps due to) their sophistication, the re-use of libraries by non-experts or across domains is a challenge. In this paper, we provide detailed case studies and evaluate the machine-learning tool ML4PG built to interactively data-mine the electronic libraries of proofs, and to provide user guidance on the basis of proof patterns found in the existing libraries.

preprint2013arXiv

Exploiting Parallelism in Coalgebraic Logic Programming

We present a parallel implementation of Coalgebraic Logic Programming (CoALP) in the programming language Go. CoALP was initially introduced to reflect coalgebraic semantics of logic programming, with coalgebraic derivation algorithm featuring both corecursion and parallelism. Here, we discuss how the coalgebraic semantics influenced our parallel implementation of logic programming.

preprint2013arXiv

Machine Learning in Proof General: Interfacing Interfaces

We present ML4PG - a machine learning extension for Proof General. It allows users to gather proof statistics related to shapes of goals, sequences of applied tactics, and proof tree structures from the libraries of interactive higher-order proofs written in Coq and SSReflect. The gathered data is clustered using the state-of-the-art machine learning algorithms available in MATLAB and Weka. ML4PG provides automated interfacing between Proof General and MATLAB/Weka. The results of clustering are used by ML4PG to provide proof hints in the process of interactive proof development.

preprint2013arXiv

ML4PG in Computer Algebra verification

ML4PG is a machine-learning extension that provides statistical proof hints during the process of Coq/SSReflect proof development. In this paper, we use ML4PG to find proof patterns in the CoqEAL library -- a library that was devised to verify the correctness of Computer Algebra algorithms. In particular, we use ML4PG to help us in the formalisation of an efficient algorithm to compute the inverse of triangular matrices.

preprint2013arXiv

Proof-Pattern Recognition and Lemma Discovery in ACL2

We present a novel technique for combining statistical machine learning for proof-pattern recognition with symbolic methods for lemma discovery. The resulting tool, ACL2(ml), gathers proof statistics and uses statistical pattern-recognition to pre-processes data from libraries, and then suggests auxiliary lemmas in new proofs by analogy with already seen examples. This paper presents the implementation of ACL2(ml) alongside theoretical descriptions of the proof-pattern recognition and lemma discovery methods involved in it.

preprint2013arXiv

Verifying a platform for digital imaging: a multi-tool strategy

Fiji is a Java platform widely used by biologists and other experimental scientists to process digital images. In particular, in our research - made together with a biologists team; we use Fiji in some pre-processing steps before undertaking a homological digital processing of images. In a previous work, we have formalised the correctness of the programs which use homological techniques to analyse digital images. However, the verification of Fiji's pre-processing step was missed. In this paper, we present a multi-tool approach filling this gap, based on the combination of Why/Krakatoa, Coq and ACL2.

preprint2012arXiv

Computing Persistent Homology within Coq/SSReflect

Persistent homology is one of the most active branches of Computational Algebraic Topology with applications in several contexts such as optical character recognition or analysis of point cloud data. In this paper, we report on the formal development of certified programs to compute persistent Betti numbers, an instrumental tool of persistent homology, using the Coq proof assistant together with the SSReflect extension. To this aim it has been necessary to formalize the underlying mathematical theory of these algorithms. This is another example showing that interactive theorem provers have reached a point where they are mature enough to tackle the formalization of nontrivial mathematical theories.

preprint2012arXiv

Verifying an algorithm computing Discrete Vector Fields for digital imaging

In this paper, we present a formalization of an algorithm to construct admissible discrete vector fields in the Coq theorem prover taking advantage of the SSReflect library. Discrete vector fields are a tool which has been welcomed in the homological analysis of digital images since it provides a procedure to reduce the amount of information but preserving the homological properties. In particular, thanks to discrete vector fields, we are able to compute, inside Coq, homological properties of biomedical images which otherwise are out of the reach of this system.