Researcher profile

Janne V. Kujala

Janne V. Kujala contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
27 works
0 followers
15 topics
4 close collaborators

Published work

27 published item(s)

preprint, 2021, arXiv

Contextuality and dichotomizations of random variables

The Contextuality-by-Default approach to determining and measuring the (non)contextuality of a system of random variables requires that every random variable in the system be represented by an equivalent set of dichotomous random variables. In this paper we present general principles that justify the use of dichotomizations and determine their choice. The main idea in choosing dichotomizations is that if the set of possible values of a random variable is endowed with a pre-topology (V-space), then the allowable dichotomizations split the space of possible values into two linked subsets ("linkedness" being a weak form of pre-topological connectedness). We primarily focus on two types of random variables most often encountered in practice: categorical and real-valued ones (including continuous random variables, greatly underrepresented in the contextuality literature). A categorical variable (one with a finite number of unordered values) is represented by all of its possible dichotomizations. If the values of a random variable are real numbers, then they are dichotomized by intervals above and below a variable cut point.
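The two cases named in the abstract above can be illustrated with a minimal Python sketch. The function names are hypothetical, and the convention that a value equal to the cut point falls in the lower interval is an assumption made here for illustration, not something the abstract specifies.

```python
from itertools import combinations


def categorical_dichotomizations(values):
    """List every split of `values` into two nonempty complementary subsets.

    The first value is always placed in the left subset so that each
    unordered split is produced exactly once.
    """
    values = list(values)
    anchor, rest = values[0], values[1:]
    splits = []
    # `size` counts how many of the remaining values join the anchor;
    # taking all of them would leave the right subset empty, so stop short.
    for size in range(len(rest)):
        for extra in combinations(rest, size):
            left = {anchor, *extra}
            right = set(values) - left
            splits.append((left, right))
    return splits


def cut_point_dichotomization(x, cut):
    """Dichotomize a real value by a cut point (assumed convention: x <= cut maps to 0)."""
    return 1 if x > cut else 0
```

A categorical variable with k values has 2^(k-1) - 1 such splits, so the number of dichotomous variables grows quickly with k; for a real-valued variable one would apply `cut_point_dichotomization` for each cut point of interest.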

preprint, 2016, arXiv

Asymptotic optimality of myopic information-based strategies for Bayesian adaptive estimation

This paper presents a general asymptotic theory of sequential Bayesian estimation giving results for the strongest, almost sure convergence. We show that under certain smoothness conditions on the probability model, the greedy information gain maximization algorithm for adaptive Bayesian estimation is asymptotically optimal in the sense that the determinant of the posterior covariance in a certain neighborhood of the true parameter value is asymptotically minimal. Using this result, we also obtain an asymptotic expression for the posterior entropy based on a novel definition of almost sure convergence on "most trials" (meaning that the convergence holds on a fraction of trials that converges to one). Then, we extend the results to a recently published framework, which generalizes the usual adaptive estimation setting by allowing different trial placements to be associated with different, random costs of observation. For this setting, the author has proposed the heuristic of maximizing the expected information gain divided by the expected cost of that placement. In this paper, we show that this myopic strategy satisfies an analogous asymptotic optimality result when the convergence of the posterior distribution is considered as a function of the total cost (as opposed to the number of observations).
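The myopic heuristic described in the abstract above (maximize expected information gain divided by expected cost) can be sketched for a toy setting. Everything concrete here is an assumption for illustration: a finite grid of parameter values, binary outcomes, and the hypothetical names `response_prob` and `expected_cost`. It is not the paper's general setting.

```python
import math


def binary_entropy(p):
    """Entropy (in nats) of a binary outcome with success probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log(p) - (1 - p) * math.log(1 - p)


def expected_information_gain(posterior, thetas, x, response_prob):
    """Mutual information between the outcome at placement x and the parameter."""
    probs = [response_prob(x, t) for t in thetas]
    marginal = sum(w * p for w, p in zip(posterior, probs))
    conditional = sum(w * binary_entropy(p) for w, p in zip(posterior, probs))
    return binary_entropy(marginal) - conditional


def choose_placement(posterior, thetas, placements, response_prob, expected_cost):
    """Greedy step: pick the placement with the best gain-to-cost ratio."""
    return max(
        placements,
        key=lambda x: expected_information_gain(posterior, thetas, x, response_prob)
        / expected_cost(x),
    )


def update(posterior, thetas, x, outcome, response_prob):
    """Bayes update of the discrete posterior after observing a binary outcome."""
    likelihood = [
        response_prob(x, t) if outcome else 1 - response_prob(x, t) for t in thetas
    ]
    unnormalized = [w * l for w, l in zip(posterior, likelihood)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def logistic(x, theta):
    """Hypothetical response model: probability of a positive response."""
    return 1.0 / (1.0 + math.exp(-(x - theta)))


thetas = [-1.0, 0.0, 1.0]
posterior = [1 / 3, 1 / 3, 1 / 3]
placements = [-3.0, 0.0, 3.0]
best = choose_placement(posterior, thetas, placements, logistic, lambda x: 1.0)
posterior = update(posterior, thetas, best, True, logistic)
```

With unit costs the rule reduces to plain information gain maximization; making one placement much more expensive can shift the choice to a less informative but cheaper placement.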

preprint, 2016, arXiv

Classifying and sorting cluttered piles of unknown objects with robots: a learning approach

We consider the problem of sorting a densely cluttered pile of unknown objects using a robot. This as yet unsolved problem is relevant in the robotic waste sorting business. By extending previous active learning approaches to grasping, we show a system that learns the task autonomously. Instead of predicting just whether a grasp succeeds, we predict the classes of the objects that end up being picked and thrown onto the target conveyor. Segmenting and identifying objects from the uncluttered target conveyor, as opposed to the working area, is easier due to the added structure since the thrown objects will be the only ones present. Instead of trying to segment or otherwise understand the cluttered working area in any way, we simply allow the controller to learn a mapping from an RGBD image in the neighborhood of the grasp to a predicted result; all segmentation etc. in the working area is implicit in the learned function. The grasp selection operates in two stages: the first stage is hardcoded and outputs a distribution of possible grasps that sometimes succeed. The second stage uses a purely learned criterion to choose the grasp to make from the proposal distribution created by the first stage. In an experiment, the system quickly learned to make good pickups and predict correctly, in advance, which class of object it was going to pick up and was able to sort the objects from a densely cluttered pile by color.

preprint, 2016, arXiv

Context-Content Systems of Random Variables: The Contextuality-by-Default Theory

This paper provides a systematic yet accessible presentation of the Contextuality-by-Default theory. The consideration is confined to finite systems of categorical random variables, which allows us to focus on the basics of the theory without using full-scale measure-theoretic language. Contextuality-by-Default is a theory of random variables identified by their contents and their contexts, so that two variables have a joint distribution if and only if they share a context. Intuitively, the content of a random variable is the entity the random variable measures or responds to, while the context is formed by the conditions under which these measurements or responses are obtained. A system of random variables consists of stochastically unrelated "bunches," each of which is a set of jointly distributed random variables sharing a context. The variables that have the same content in different contexts form "connections" between the bunches. A probabilistic coupling of this system is a set of random variables obtained by imposing a joint distribution on the stochastically unrelated bunches. A system is considered noncontextual or contextual according to whether it can or cannot be coupled so that the joint distributions imposed on its connections possess a certain property (in the present version of the theory, "maximality"). We present a criterion of contextuality for a special class of systems of random variables, called cyclic systems. We also introduce a general measure of contextuality that makes use of (quasi-)couplings whose distributions may involve negative numbers or numbers greater than 1 in place of probabilities.

preprint, 2016, arXiv

Contextuality-by-Default 2.0: Systems with Binary Random Variables

The paper outlines a new development in the Contextuality-by-Default theory as applied to finite systems of binary random variables. The logic and principles of the original theory remain unchanged, but the definition of contextuality of a system of random variables is now based on multimaximal rather than maximal couplings of the variables that measure the same property in different contexts: a system is considered noncontextual if these multimaximal couplings are compatible with the distributions of the random variables sharing contexts. A multimaximal coupling is one that is a maximal coupling of any subset (equivalently, of any pair) of the random variables being coupled. Arguments are presented for why this modified theory is a superior generalization of the traditional understanding of contextuality in quantum mechanics. The modified theory coincides with the previous version in the important case of cyclic systems, which include the systems whose contextuality was most intensively studied in quantum physics and behavioral sciences.
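To make the notion of a multimaximal coupling concrete, here is a small Python sketch for binary (0/1) variables. The construction through a single shared uniform variable is a standard device chosen for this illustration and is not taken from the abstract; the function names are hypothetical. For two binary variables with P(X=1)=p and P(Y=1)=q, a maximal coupling attains P(X=Y) = 1 - |p - q|, and the sketch checks that value for every pair.

```python
def multimaximal_coupling(probs):
    """Joint pmf of binary variables X_i = 1 if U < probs[i], with U uniform on [0, 1).

    Returns a dict mapping outcome tuples to probabilities.
    """
    cuts = sorted({0.0, 1.0, *probs})
    pmf = {}
    for lo, hi in zip(cuts, cuts[1:]):
        # On the interval [lo, hi) each X_i is constant: it equals 1 exactly
        # when the whole interval lies below probs[i].
        outcome = tuple(1 if lo < p else 0 for p in probs)
        pmf[outcome] = pmf.get(outcome, 0.0) + (hi - lo)
    return pmf


def prob_equal(pmf, i, j):
    """Probability that variables i and j take the same value under the coupling."""
    return sum(w for outcome, w in pmf.items() if outcome[i] == outcome[j])


probs = [0.2, 0.5, 0.9]
coupling = multimaximal_coupling(probs)
pairwise = {
    (i, j): prob_equal(coupling, i, j)
    for i in range(len(probs))
    for j in range(i + 1, len(probs))
}
```

Each entry of `pairwise` can be compared with 1 - |p_i - p_j| to confirm that every pair, and not just the whole set, is maximally coupled.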

preprint, 2016, arXiv

On Contextuality in Behavioral Data

Dzhafarov, Zhang, and Kujala (Phil. Trans. Roy. Soc. A 374, 20150099) reviewed several behavioral data sets imitating the formal design of the quantum-mechanical contextuality experiments. The conclusion was that none of these data sets exhibited contextuality if understood in the generalized sense proposed in Dzhafarov, Kujala, and Larsson (Found. Phys. 7, 762-782, 2015), while the traditional definition of contextuality does not apply to these data because they violate the condition of consistent connectedness (also known as marginal selectivity, no-signaling condition, no-disturbance principle, etc.). In this paper we clarify the relationship between (in)consistent connectedness and (non)contextuality, as well as between the traditional and extended definitions of (non)contextuality, using as an example the Clauser-Horn-Shimony-Holt (CHSH) inequalities originally designed for detecting contextuality in entangled particles.

preprint, 2015, arXiv

A Qualified Kolmogorovian Account of Probabilistic Contextuality

We describe a mathematical language for determining all possible patterns of contextuality in the dependence of stochastic outputs of a system on its deterministic inputs. The central notion is that of all possible couplings for stochastically unrelated outputs indexed by mutually incompatible values of inputs. A system is characterized by a pattern of which outputs can be "directly influenced" by which inputs (a primitive relation, hypothetical or normative), and by certain constraints imposed on the outputs (such as Bell-type inequalities or their quantum analogues). The set of couplings compatible with these constraints represents a form of contextuality in the dependence of outputs on inputs with respect to the declared pattern of direct influences.

preprint, 2015, arXiv

Contextuality in Three Types of Quantum-Mechanical Systems

We present a formal theory of contextuality for a set of random variables grouped into different subsets (contexts) corresponding to different, mutually incompatible conditions. Within each context the random variables are jointly distributed, but across different contexts they are stochastically unrelated. The theory of contextuality is based on the analysis of the extent to which some of these random variables can be viewed as preserving their identity across different contexts when one considers all possible joint distributions imposed on the entire set of the random variables. We illustrate the theory on three systems of traditional interest in quantum physics (and also in non-physical, e.g., behavioral studies). These are systems of the Klyachko-Can-Binicioglu-Shumovsky-type, Einstein-Podolsky-Rosen-Bell-type, and Suppes-Zanotti-Leggett-Garg-type. Listed in this order, each of them is formally a special case of the previous one. For each of them we derive necessary and sufficient conditions for contextuality while allowing for experimental errors and contextual biases or signaling. Based on the same principles that underlie these derivations we also propose a measure for the degree of contextuality and compute it for the three systems in question.

preprint, 2015, arXiv

Contextuality is About Identity of Random Variables

Contextual situations are those in which seemingly "the same" random variable changes its identity depending on the conditions under which it is recorded. Such a change of identity is observed whenever the assumption that the variable is one and the same under different conditions leads to contradictions when one considers its joint distribution with other random variables (this is the essence of all Bell-type theorems). In our Contextuality-by-Default approach, instead of asking why or how the conditions force "one and the same" random variable to change "its" identity, any two random variables recorded under different conditions are considered different "automatically". They are never the same, nor are they jointly distributed, but one can always impose on them a joint distribution (probabilistic coupling). The special situations when there is a coupling in which these random variables are equal with probability 1 are considered non-contextual. Contextuality means that such couplings do not exist. We argue that the determination of the identity of random variables by conditions under which they are recorded is not a causal relationship and cannot violate laws of physics.

preprint, 2015, arXiv

Contextuality-by-Default: A Brief Overview of Ideas, Concepts, and Terminology

This paper is a brief overview of the concepts involved in measuring the degree of contextuality and detecting contextuality in systems of binary measurements of a finite number of objects. We discuss and clarify the main concepts and terminology of the theory called "contextuality-by-default," and then discuss a possible generalization of the theory from binary to arbitrary measurements.

preprint, 2015, arXiv

Embedding Quantum into Classical: Contextualization vs Conditionalization

We compare two approaches to embedding joint distributions of random variables recorded under different conditions (such as spins of entangled particles for different settings) into the framework of classical, Kolmogorovian probability theory. In the contextualization approach each random variable is "automatically" labeled by all conditions under which it is recorded, and the random variables across a set of mutually exclusive conditions are probabilistically coupled (imposed a joint distribution upon). Analysis of all possible probabilistic couplings for a given set of random variables allows one to characterize various relations between their separate distributions (such as Bell-type inequalities or quantum-mechanical constraints). In the conditionalization approach one considers the conditions under which the random variables are recorded as if they were values of another random variable, so that the observed distributions are interpreted as conditional ones. This approach is uninformative with respect to relations between the distributions observed under different conditions, because any set of such distributions is compatible with any distribution assigned to the conditions.

preprint, 2015, arXiv

Generalizing Bell-type and Leggett-Garg-type Inequalities to Systems with Signaling

Contextuality means non-existence of a joint distribution for random variables recorded under mutually incompatible conditions, subject to certain constraints imposed on how the identity of these variables may change across these conditions. In simple quantum systems contextuality is indicated by violations of Bell-type or Leggett-Garg-type inequalities. These inequalities, however, are predicated on the assumption of no-signaling, defined as invariance of the distributions of measurement results with respect to other (e.g., earlier in time) measurements' settings. Signaling makes the inequalities inapplicable: a non-signaling system with any degree of contextuality, however high, loses any relation to this concept as soon as it exhibits any degree of signaling, however small. This is unsatisfactory. We describe a principled way of defining and measuring contextuality in arbitrary systems with random outputs, whether signaling is absent or present.

preprint, 2015, arXiv

Measuring Observable Quantum Contextuality

Contextuality is a central property in comparative analysis of classical, quantum, and supercorrelated systems. We examine and compare two well-motivated approaches to contextuality. One approach ("contextuality-by-default") is based on the idea that one and the same physical property measured under different conditions (contexts) is represented by different random variables. The other approach is based on the idea that while a physical property is represented by a single random variable irrespective of its context, the joint distributions of the random variables describing the system can involve negative (quasi-)probabilities. We show that in the Leggett-Garg and EPR-Bell systems, the two measures essentially coincide.

preprint, 2015, arXiv

Minimal Distance to Approximating Noncontextual System as a Measure of Contextuality

Let random vectors $R^c=\{R_p^c:p\in P_c\}$ represent joint measurements of certain subsets $P_c$ of properties $p\in P$ in different contexts $c\in C$. Such a system is traditionally called noncontextual if there exists a jointly distributed set $\{Q_p:p\in P\}$ of random variables such that $R^c$ has the same distribution as $\{Q_p:p\in P_c\}$ for all $c\in C$. A trivial necessary condition for noncontextuality and a precondition for most approaches to measuring contextuality is that the system is consistently connected, i.e., all $R_p^c,R_p^{c'},\dots$ measuring the same property $p$ have the same distribution. The Contextuality-by-Default (CbD) approach allows detecting and measuring "true" contextuality on top of inconsistent connectedness, but at the price of a higher computational cost. In this paper we propose a novel approach to measuring contextuality that shares the generality and basic definitions of the CbD approach and the computational benefits of the previously proposed Negative Probability (NP) approach. The present approach differs from CbD in that instead of considering all possible joints of the double-indexed random variables $R_p^c$, it considers all possible approximating single-indexed systems $\{Q_p:p\in P\}$. The degree of contextuality is defined based on the minimum possible probabilistic distance of the actual measurements $R^c$ from $\{Q_p:p\in P_c\}$. We show that the defined measure agrees with a certain measure of contextuality of the CbD approach for all systems where each property enters in exactly two contexts and that this measure can be calculated far more efficiently than the CbD measure and even more efficiently than the NP measure for sufficiently large systems. The present approach can be modified so as to agree with the NP measure of contextuality on all consistently connected systems while extending it to inconsistently connected systems.

preprint, 2015, arXiv

Necessary and Sufficient Conditions for Extended Noncontextuality in a Broad Class of Quantum Mechanical Systems

The notion of (non)contextuality pertains to sets of properties measured one subset (context) at a time. We extend this notion to include so-called inconsistently connected systems, in which the measurements of a given property in different contexts may have different distributions, due to contextual biases in experimental design or physical interactions (signaling): a system of measurements has a maximally noncontextual description if they can be imposed a joint distribution on in which the measurements of any one property in different contexts are equal to each other with the maximal probability allowed by their different distributions. We derive necessary and sufficient conditions for the existence of such a description in a broad class of systems including Klyachko-Can-Binicioğlu-Shumovsky-type (KCBS), EPR-Bell-type, and Leggett-Garg-type systems. Because these conditions allow for inconsistent connectedness, they are applicable to real experiments. We illustrate this by analyzing an experiment by Lapkiewicz and colleagues aimed at testing contextuality in a KCBS-type system.

preprint, 2015, arXiv

No-Forcing and No-Matching Theorems for Classical Probability Applied to Quantum Mechanics

Correlations of spins in a system of entangled particles are inconsistent with Kolmogorov's probability theory (KPT), provided the system is assumed to be non-contextual. In the Alice-Bob EPR paradigm, non-contextuality means that the identity of Alice's spin (i.e., the probability space on which it is defined as a random variable) is determined only by the axis $\alpha_i$ chosen by Alice, irrespective of Bob's axis $\beta_j$ (and vice versa). Here, we study contextual KPT models, with two properties: (1) Alice's and Bob's spins are identified as $A_{ij}$ and $B_{ij}$, even though their distributions are determined by, respectively, $\alpha_i$ alone and $\beta_j$ alone, in accordance with the no-signaling requirement; and (2) the joint distributions of the spins $A_{ij},B_{ij}$ across all values of $\alpha_i,\beta_j$ are constrained by fixing distributions of some subsets thereof. Of special interest among these subsets is the set of probabilistic connections, defined as the pairs $(A_{ij},A_{ij'})$ and $(B_{ij},B_{i'j})$ with $\alpha_i\neq\alpha_{i'}$ and $\beta_j\neq\beta_{j'}$ (the non-contextuality assumption is obtained as a special case of connections, with zero probabilities of $A_{ij}\neq A_{ij'}$ and $B_{ij}\neq B_{i'j}$). Thus, one can achieve a complete KPT characterization of the Bell-type inequalities, or Tsirelson's inequalities, by specifying the distributions of probabilistic connections compatible with those and only those spin pairs $(A_{ij},B_{ij})$ that are subject to these inequalities. We show, however, that quantum-mechanical (QM) constraints are special. The no-forcing theorem says that if a set of probabilistic connections is not compatible with correlations violating QM, then it is compatible only with the classical-mechanical correlations. The no-matching theorem says that there are no subsets of the spin variables $A_{ij},B_{ij}$ whose distributions can be fixed to be compatible with and only with QM-compliant correlations.

preprint, 2015, arXiv

Picking a Conveyor Clean by an Autonomously Learning Robot

We present a research picking prototype related to our company's industrial waste sorting application. The goal of the prototype is to be as autonomous as possible and it both calibrates itself and improves its picking with minimal human intervention. The system learns to pick objects better based on a feedback sensor in its gripper and uses machine learning to choose the best proposal from a random sample produced by simple hard-coded geometric models. We show experimentally the system improving its picking autonomously by measuring the pick success rate as a function of time. We also show how this system can pick a conveyor belt clean, depositing 70 out of 80 objects in a difficult-to-manipulate pile of novel objects into the correct chute. We discuss potential improvements and next steps in this direction.

preprint, 2015, arXiv

Random Variables Recorded under Mutually Exclusive Conditions: Contextuality-by-Default

We present general principles underlying analysis of the dependence of random variables (outputs) on deterministic conditions (inputs). Random outputs recorded under mutually exclusive input values are labeled by these values and considered stochastically unrelated, possessing no joint distribution. An input that does not directly influence an output creates a context for the latter. Any constraint imposed on the dependence of random outputs on inputs can be characterized by considering all possible couplings (joint distributions) imposed on stochastically unrelated outputs. The target application of these principles is a quantum mechanical system of entangled particles, with directions of spin measurements chosen for each particle being inputs and the spins recorded outputs. The sphere of applicability, however, spans systems across physical, biological, and behavioral sciences.

preprint, 2013, arXiv

On selective influences, marginal selectivity, and Bell/CHSH inequalities

The Bell/CHSH inequalities of quantum physics are identical with the inequalities derived in mathematical psychology for the problem of selective influences in cases involving two binary experimental factors and two binary random variables recorded in response to them. The following points are made regarding cognitive science applications: (1) compliance of data with these inequalities is informative only if the data satisfy the requirement known as marginal selectivity; (2) both violations of marginal selectivity and violations of the Bell/CHSH inequalities are interpretable as indicating that at least one of the two responses is influenced by both experimental factors.
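The two checks named in the abstract above can be sketched in Python for a 2×2 design with responses coded as -1/+1. The data layout (a dict from factor levels to joint probability tables) and the function names are assumptions made for this illustration; the CHSH form used is the standard one, in which the sum of the four correlations with any one of them sign-flipped must lie in [-2, 2].

```python
def expectations(joint):
    """Return (E[A], E[B], E[AB]) for a joint pmf over pairs (a, b) in {-1, +1}^2."""
    e_a = sum(a * p for (a, b), p in joint.items())
    e_b = sum(b * p for (a, b), p in joint.items())
    e_ab = sum(a * b * p for (a, b), p in joint.items())
    return e_a, e_b, e_ab


def analyze(data, tol=1e-9):
    """Check marginal selectivity and the CHSH inequalities.

    `data` maps factor levels (i, j), with i, j in {1, 2}, to a joint pmf of
    the two responses. Returns (marginal_selectivity_holds, chsh_holds).
    """
    stats = {cond: expectations(joint) for cond, joint in data.items()}
    # For +/-1 responses, equal expectations mean equal distributions.
    first_ok = all(abs(stats[(i, 1)][0] - stats[(i, 2)][0]) <= tol for i in (1, 2))
    second_ok = all(abs(stats[(1, j)][1] - stats[(2, j)][1]) <= tol for j in (1, 2))
    correlations = [stats[c][2] for c in ((1, 1), (1, 2), (2, 1), (2, 2))]
    total = sum(correlations)
    chsh_ok = all(abs(total - 2 * e) <= 2 + tol for e in correlations)
    return first_ok and second_ok, chsh_ok


perfectly_correlated = {(1, 1): 0.5, (-1, -1): 0.5}
perfectly_anticorrelated = {(1, -1): 0.5, (-1, 1): 0.5}
example = {
    (1, 1): perfectly_correlated,
    (1, 2): perfectly_correlated,
    (2, 1): perfectly_correlated,
    (2, 2): perfectly_anticorrelated,
}
result = analyze(example)
```

Per point (1) of the abstract, the second element of the result is informative only when the first element is true.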

preprint, 2012, arXiv

Order-distance and other metric-like functions on jointly distributed random variables

We construct a class of real-valued nonnegative binary functions on a set of jointly distributed random variables, which satisfy the triangle inequality and vanish at identical arguments (pseudo-quasi-metrics). These functions are useful in dealing with the problem of selective probabilistic causality encountered in behavioral sciences and in quantum physics. The problem reduces to that of ascertaining the existence of a joint distribution for a set of variables with known distributions of certain subsets of this set. Any violation of the triangle inequality or its consequences by one of our functions when applied to such a set rules out the existence of this joint distribution. We focus on an especially versatile and widely applicable pseudo-quasi-metric called an order-distance and its special case called a classification distance.

preprint, 2012, arXiv

Quantum Entanglement and the Issue of Selective Influences in Psychology: An Overview

Similar formalisms have been independently developed in psychology, to deal with the issue of selective influences (deciding which of several experimental manipulations selectively influences each of several, generally non-independent, response variables), and in quantum mechanics (QM), to deal with the EPR entanglement phenomena (deciding whether an EPR experiment allows for a "classical" account). The parallels between these problems are established by observing that any two noncommuting measurements in QM are mutually exclusive and can therefore be treated as analogs of different values of one and the same input. Both problems reduce to that of the existence of a jointly distributed system of random variables, one variable for every value of every input (in psychology) or every measurement on every particle involved (in an EPR experiment). We overview three classes of necessary conditions (some of them also sufficient under additional constraints) for the existence of such joint distributions.

preprint, 2012, arXiv

Selectivity in Probabilistic Causality: Where Psychology Runs Into Quantum Physics

Given a set of several inputs into a system (e.g., independent variables characterizing stimuli) and a set of several stochastically non-independent outputs (e.g., random variables describing different aspects of responses), how can one determine, for each of the outputs, which of the inputs it is influenced by? The problem has applications ranging from modeling pairwise comparisons to reconstructing mental processing architectures to conjoint testing. A necessary and sufficient condition for a given pattern of selective influences is provided by the Joint Distribution Criterion, according to which the problem of "what influences what" is equivalent to that of the existence of a joint distribution for a certain set of random variables. For inputs and outputs with finite sets of values this criterion translates into a test of consistency of a certain system of linear equations and inequalities (Linear Feasibility Test) which can be performed by means of linear programming. While new in the behavioral context, both this test and the Joint Distribution Criterion on which it is based have been previously proposed in quantum physics, in dealing with generalizations of Bell inequalities for the quantum entanglement problem. The parallels between this problem and that of selective influences in behavioral sciences are established by observing that noncommuting measurements in quantum physics are mutually exclusive and can therefore be treated as different levels of one and the same factor.

preprint, 2011, arXiv

A Probabilistic Approach to Pronunciation by Analogy

The relationship between written and spoken words is convoluted in languages with a deep orthography such as English and therefore it is difficult to devise explicit rules for generating the pronunciations for unseen words. Pronunciation by analogy (PbA) is a data-driven method of constructing pronunciations for novel words from concatenated segments of known words and their pronunciations. PbA performs relatively well with English and outperforms several other proposed methods. However, the best published word accuracy of 65.5% (for the 20,000 word NETtalk corpus) suggests there is much room for improvement in it. Previous PbA algorithms have used several different scoring strategies such as the product of the frequencies of the component pronunciations of the segments, or the number of different segmentations that yield the same pronunciation, and different combinations of these methods, to evaluate the candidate pronunciations. In this article, we instead propose to use a probabilistically justified scoring rule. We show that this principled approach alone yields better accuracy (66.21% for the NETtalk corpus) than any previously published PbA algorithm. Furthermore, combined with certain ad hoc modifications motivated by earlier algorithms, the performance climbs up to 66.6%, and further improvements are possible by combining this method with other methods.

preprint, 2011, arXiv

A Remark on the Assumptions of Bayes' Theorem

We formulate simple equivalent conditions for the validity of Bayes' formula for conditional densities. We show that for any random variables X and Y (with values in arbitrary measurable spaces), the following are equivalent: 1. X and Y have a joint density w.r.t. a product measure μ × ν, 2. P_{X,Y} << P_X × P_Y (here P_{.} denotes the distribution of {.}), 3. X has a conditional density p(x | y) w.r.t. a sigma-finite measure μ, 4. X has a conditional distribution P_{X|Y} such that P_{X|y} << P_X for all y, 5. X has a conditional distribution P_{X|Y} and a marginal density p(x) w.r.t. a measure μ such that P_{X|y} << μ for all y. Furthermore, given random variables X and Y with a conditional density p(y | x) w.r.t. ν and a marginal density p(x) w.r.t. μ, we show that Bayes' formula p(x | y) = p(y | x)p(x) / ∫ p(y | x)p(x) dμ(x) yields a conditional density p(x | y) w.r.t. μ if and only if X and Y satisfy the above conditions. Counterexamples illustrating the nontriviality of the results are given, and implications for sequential adaptive estimation are considered.
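As a worked instance of the formula in the abstract above, the following Python sketch evaluates it in the simplest case, where X takes finitely many values and μ is the counting measure, so the integral becomes a sum. The function name and the numbers are hypothetical; the measure-theoretic subtleties the paper addresses do not arise in this discrete case.

```python
from fractions import Fraction


def posterior(prior, likelihood, y):
    """Discrete Bayes' formula: p(x | y) = p(y | x) p(x) / sum over x of p(y | x) p(x)."""
    joint = {x: likelihood[x][y] * px for x, px in prior.items()}
    evidence = sum(joint.values())
    if evidence == 0:
        raise ValueError("observed value has zero probability under the model")
    return {x: value / evidence for x, value in joint.items()}


# Hypothetical two-point prior and likelihood table, kept exact with fractions.
prior = {"a": Fraction(1, 4), "b": Fraction(3, 4)}
likelihood = {
    "a": {0: Fraction(1, 2), 1: Fraction(1, 2)},
    "b": {0: Fraction(1, 6), 1: Fraction(5, 6)},
}
post = posterior(prior, likelihood, 0)
```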

preprint, 2011, arXiv

Selectivity in Probabilistic Causality: Drawing Arrows from Inputs to Stochastic Outputs

Given a set of several inputs into a system (e.g., independent variables characterizing stimuli) and a set of several stochastically non-independent outputs (e.g., random variables describing different aspects of responses), how can one determine, for each of the outputs, which of the inputs it is influenced by? The problem has applications ranging from modeling pairwise comparisons to reconstructing mental processing architectures to conjoint testing. A necessary and sufficient condition for a given pattern of selective influences is provided by the Joint Distribution Criterion, according to which the problem of "what influences what" is equivalent to that of the existence of a joint distribution for a certain set of random variables. For inputs and outputs with finite sets of values this criterion translates into a test of consistency of a certain system of linear equations and inequalities (Linear Feasibility Test) which can be performed by means of linear programming. The Joint Distribution Criterion also leads to a metatheoretical principle for generating a broad class of necessary conditions (tests) for diagrams of selective influences. Among them is the class of distance-type tests based on the observation that certain functionals on jointly distributed random variables satisfy triangle inequality.