Researcher profile

Ron Meir

Ron Meir contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
15works
0followers
9topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

15 published item(s)

preprint2026arXiv

CtD: Composition through Decomposition in Emergent Communication

Compositionality is a cognitive mechanism that allows humans to systematically combine known concepts in novel ways. This study demonstrates how artificial neural agents acquire and utilize compositional generalization to describe previously unseen images. Our method, termed "Composition through Decomposition", involves two sequential training steps. In the 'Decompose' step, the agents learn to decompose an image into basic concepts using a codebook acquired during interaction in a multi-target coordination game. Subsequently, in the 'Compose' step, the agents employ this codebook to describe novel images by composing basic concepts into complex phrases. Remarkably, we observe cases where generalization in the `Compose' step is achieved zero-shot, without the need for additional training.

preprint2023arXiv

Emergent Quantized Communication

The field of emergent communication aims to understand the characteristics of communication as it emerges from artificial agents solving tasks that require information exchange. Communication with discrete messages is considered a desired characteristic, for both scientific and applied reasons. However, training a multi-agent system with discrete communication is not straightforward, requiring either reinforcement learning algorithms or relaxing the discreteness requirement via a continuous approximation such as the Gumbel-softmax. Both these solutions result in poor performance compared to fully continuous communication. In this work, we propose an alternative approach to achieve discrete communication -- quantization of communicated messages. Using message quantization allows us to train the model end-to-end, achieving superior performance in multiple setups. Moreover, quantization is a natural framework that runs the gamut from continuous to discrete communication. Thus, it sets the ground for a broader view of multi-agent communication in the deep learning era.

preprint2022arXiv

Enhancing Causal Estimation through Unlabeled Offline Data

Consider a situation where a new patient arrives in the Intensive Care Unit (ICU) and is monitored by multiple sensors. We wish to assess relevant unmeasured physiological variables (e.g., cardiac contractility and output and vascular resistance) that have a strong effect on the patients diagnosis and treatment. We do not have any information about this specific patient, but, extensive offline information is available about previous patients, that may only be partially related to the present patient (a case of dataset shift). This information constitutes our prior knowledge, and is both partial and approximate. The basic question is how to best use this prior knowledge, combined with online patient data, to assist in diagnosing the current patient most effectively. Our proposed approach consists of three stages: (i) Use the abundant offline data in order to create both a non-causal and a causal estimator for the relevant unmeasured physiological variables. (ii) Based on the non-causal estimator constructed, and a set of measurements from a new group of patients, we construct a causal filter that provides higher accuracy in the prediction of the hidden physiological variables for this new set of patients. (iii) For any new patient arriving in the ICU, we use the constructed filter in order to predict relevant internal variables. Overall, this strategy allows us to make use of the abundantly available offline data in order to enhance causal estimation for newly arriving patients. We demonstrate the effectiveness of this methodology on a (non-medical) real-world task, in situations where the offline data is only partially related to the new observations. We provide a mathematical analysis of the merits of the approach in a linear setting of Kalman filtering and smoothing, demonstrating its utility.

preprint2022arXiv

Integral Probability Metrics PAC-Bayes Bounds

We present a PAC-Bayes-style generalization bound which enables the replacement of the KL-divergence with a variety of Integral Probability Metrics (IPM). We provide instances of this bound with the IPM being the total variation metric and the Wasserstein distance. A notable feature of the obtained bounds is that they naturally interpolate between classical uniform convergence bounds in the worst case (when the prior and posterior are far away from each other), and improved bounds in favorable cases (when the posterior and prior are close). This illustrates the possibility of reinforcing classical generalization bounds with algorithm- and data-dependent components, thus making them more suitable to analyze algorithms that use a large hypothesis space.

preprint2022arXiv

Metalearning Linear Bandits by Prior Update

Fully Bayesian approaches to sequential decision-making assume that problem parameters are generated from a known prior. In practice, such information is often lacking. This problem is exacerbated in setups with partial information, where a misspecified prior may lead to poor exploration and performance. In this work we prove, in the context of stochastic linear bandits and Gaussian priors, that as long as the prior is sufficiently close to the true prior, the performance of the applied algorithm is close to that of the algorithm that uses the true prior. Furthermore, we address the task of learning the prior through metalearning, where a learner updates her estimate of the prior across multiple task instances in order to improve performance on future tasks. We provide an algorithm and regret bounds, demonstrate its effectiveness in comparison to an algorithm that knows the correct prior, and support our theoretical results empirically. Our theoretical results hold for a broad class of algorithms, including Thompson Sampling and Information Directed Sampling.

preprint2022arXiv

Online Meta-Learning in Adversarial Multi-Armed Bandits

We study meta-learning for adversarial multi-armed bandits. We consider the online-within-online setup, in which a player (learner) encounters a sequence of multi-armed bandit episodes. The player's performance is measured as regret against the best arm in each episode, according to the losses generated by an adversary. The difficulty of the problem depends on the empirical distribution of the per-episode best arm chosen by the adversary. We present an algorithm that can leverage the non-uniformity in this empirical distribution, and derive problem-dependent regret bounds. This solution comprises an inner learner that plays each episode separately, and an outer learner that updates the hyper-parameters of the inner algorithm between the episodes. In the case where the best arm distribution is far from uniform, it improves upon the best bound that can be achieved by any online algorithm executed on each episode individually without meta-learning.

preprint2020arXiv

Discount Factor as a Regularizer in Reinforcement Learning

Specifying a Reinforcement Learning (RL) task involves choosing a suitable planning horizon, which is typically modeled by a discount factor. It is known that applying RL algorithms with a lower discount factor can act as a regularizer, improving performance in the limited data regime. Yet the exact nature of this regularizer has not been investigated. In this work, we fill in this gap. For several Temporal-Difference (TD) learning methods, we show an explicit equivalence between using a reduced discount factor and adding an explicit regularization term to the algorithm's loss. Motivated by the equivalence, we empirically study this technique compared to standard $L_2$ regularization by extensive experiments in discrete and continuous domains, using tabular and functional representations. Our experiments suggest the regularization effectiveness is strongly related to properties of the available data, such as size, distribution, and mixing rate.

preprint2020arXiv

Option Discovery in the Absence of Rewards with Manifold Analysis

Options have been shown to be an effective tool in reinforcement learning, facilitating improved exploration and learning. In this paper, we present an approach based on spectral graph theory and derive an algorithm that systematically discovers options without access to a specific reward or task assignment. As opposed to the common practice used in previous methods, our algorithm makes full use of the spectrum of the graph Laplacian. Incorporating modes associated with higher graph frequencies unravels domain subtleties, which are shown to be useful for option discovery. Using geometric and manifold-based analysis, we present a theoretical justification for the algorithm. In addition, we showcase its performance in several domains, demonstrating clear improvements compared to competing methods.

preprint2013arXiv

Mean Field Bayes Backpropagation: scalable training of multilayer neural networks with binary weights

Significant success has been reported recently using deep neural networks for classification. Such large networks can be computationally intensive, even after training is over. Implementing these trained networks in hardware chips with a limited precision of synaptic weights may improve their speed and energy efficiency by several orders of magnitude, thus enabling their integration into small and low-power electronic devices. With this motivation, we develop a computationally efficient learning algorithm for multilayer neural networks with binary weights, assuming all the hidden neurons have a fan-out of one. This algorithm, derived within a Bayesian probabilistic online setting, is shown to work well for both synthetic and real-world problems, performing comparably to algorithms with real-valued weights, while retaining computational tractability.

preprint2013arXiv

Slow dynamics of neuronal excitability under pulse stimulation

Neurons fire irregularly on multiple timescales when stimulated with a periodic pulse train. This raises two questions: Does this irregularity imply significant intrinsic stochasticity? Can existing neuron models be readily extended to describe behavior at long timescales? We show here that for commonly studied neuronal models, dynamics is not chaotic and can only produce stable and periodic firing patterns. This is done by transforming the neuron model to an analytically tractable piecewise linear discrete map. Thus we answer "yes" and "no" to the above questions, respectively.

preprint2012arXiv

An exact reduction of the master equation to a strictly stable system with an explicit expression for the stationary distribution

The evolution of a continuous time Markov process with a finite number of states is usually calculated by the Master equation - a linear differential equations with a singular generator matrix. We derive a general method for reducing the dimensionality of the Master equation by one by using the probability normalization constraint, thus obtaining a affine differential equation with a (non-singular) stable generator matrix. Additionally, the reduced form yields a simple explicit expression for the stationary probability distribution, which is usually derived implicitly. Finally, we discuss the application of this method to stochastic differential equations.

preprint2012arXiv

Dynamic State Estimation Based on Poisson Spike Trains: Towards a Theory of Optimal Encoding

Neurons in the nervous system convey information to higher brain regions by the generation of spike trains. An important question in the field of computational neuroscience is how these sensory neurons encode environmental information in a way which may be simply analyzed by subsequent systems. Many aspects of the form and function of the nervous system have been understood using the concepts of optimal population coding. Most studies, however, have neglected the aspect of temporal coding. Here we address this shortcoming through a filtering theory of inhomogeneous Poisson processes. We derive exact relations for the minimal mean squared error of the optimal Bayesian filter and by optimizing the encoder, obtain optimal codes for populations of neurons. We also show that a class of non-Markovian, smooth stimuli are amenable to the same treatment, and provide results for the filtering and prediction error which hold for a general class of stochastic processes. This sets a sound mathematical framework for a population coding theory that takes temporal aspects into account. It also formalizes a number of studies which discussed temporal aspects of coding using time-window paradigms, by stating them in terms of correlation times and firing rates. We propose that this kind of analysis allows for a systematic study of temporal coding and will bring further insights into the nature of the neural code.

preprint2010arXiv

History dependent dynamics in a generic model of ion channels - an analytic study

Recent experiments have demonstrated that the timescale of adaptation of single neurons and ion channel populations to stimuli slows down as the length of stimulation increases; in fact, no upper bound on temporal time-scales seems to exist in such systems. Furthermore, patch clamp experiments on single ion channels have hinted at the existence of large, mostly unobservable, inactivation state spaces within a single ion channel. This raises the question of the relation between this multitude of inactivation states and the observed behavior. In this work we propose a minimal model for ion channel dynamics which does not assume any specific structure of the inactivation state space. The model is simple enough to render an analytical study possible. This leads to a clear and concise explanation of the experimentally observed exponential history-dependent relaxation in sodium channels in a voltage clamp setting, and shows that their recovery rate from slow inactivation must be voltage dependent. Furthermore, we predict that history-dependent relaxation cannot be created by overly sparse spiking activity. While the model was created with ion channel populations in mind, its simplicity and genericalness render it a good starting point for modeling similar effects in other systems, and for scaling up to higher levels such as single neurons which are also known to exhibit multiple time scales.

preprint2010arXiv

MSE-based analysis of optimal tuning functions predicts phenomena observed in sensory neurons

Biological systems display impressive capabilities in effectively responding to environmental signals in real time. There is increasing evidence that organisms may indeed be employing near optimal Bayesian calculations in their decision-making. An intriguing question relates to the properties of optimal encoding methods, namely determining the properties of neural populations in sensory layers that optimize performance, subject to physiological constraints. Within an ecological theory of neural encoding/decoding, we show that optimal Bayesian performance requires neural adaptation which reflects environmental changes. Specifically, we predict that neuronal tuning functions possess an optimal width, which increases with prior uncertainty and environmental noise, and decreases with the decoding time window. Furthermore, even for static stimuli, we demonstrate that dynamic sensory tuning functions, acting at relatively short time scales, lead to improved performance. Interestingly, the narrowing of tuning functions as a function of time was recently observed in several biological systems. Such results set the stage for a functional theory which may explain the high reliability of sensory systems, and the utility of neuronal adaptation occurring at multiple time scales.

preprint2010arXiv

Neuronal Response Clamp

Since the first recordings made of evoked action potentials it has become apparent that the responses of individual neurons to ongoing physiologically relevant input, are highly variable. This variability is manifested in non-stationary behavior of practically every observable neuronal response feature. Here we introduce the Neuronal Response Clamp, a closed-loop technique enabling full control over two important single neuron activity variables: response probability and stimulus-spike latency. The technique is applicable over extended durations (up to several hours), and is effective even on the background of ongoing neuronal network activity. The Response Clamp technique is a powerful tool, extending the voltage-clamp and dynamic-clamp approaches to the neuron's functional level, namely - its spiking behavior.