Source author record

Jan Hązła

Jan Hązła appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Machine Learning, Discrete Mathematics, math.CO, Computational Complexity, math.PR, math.ST, Multiagent Systems, Neural and Evolutionary Computing, Social and Information Networks, Statistics Theory

Catalog footprint

What is connected

6 works

10 topics

4 close collaborators

preprint · 2022 · arXiv

A Johnson--Lindenstrauss Framework for Randomly Initialized CNNs

How does the geometric representation of a dataset change after the application of each randomly initialized layer of a neural network? The celebrated Johnson–Lindenstrauss lemma answers this question for linear fully-connected neural networks (FNNs), stating that the geometry is essentially preserved. For FNNs with the ReLU activation, the angle between two inputs contracts according to a known mapping. The question for non-linear convolutional neural networks (CNNs) becomes much more intricate. To answer this question, we introduce a geometric framework. For linear CNNs, we show that the Johnson–Lindenstrauss lemma continues to hold, namely, that the angle between two inputs is preserved. For CNNs with ReLU activation, on the other hand, the behavior is richer: the angle between the outputs contracts, where the level of contraction depends on the nature of the inputs. In particular, after one layer, the geometry of natural images is essentially preserved, whereas for Gaussian correlated inputs, CNNs exhibit the same contracting behavior as FNNs with ReLU activation.
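
As a rough numerical illustration of the fully-connected case described in the abstract above, the following Python sketch sends two unit vectors at a fixed angle through one random Gaussian layer, with and without ReLU. It is not the paper's code. The closed form in `relu_angle_map` is the standard arc-cosine formula for Gaussian weights, assumed here to be the "known mapping" the abstract mentions; the helper names, dimension, width and seed are hypothetical choices made only for this example.

```python
import math
import random


def angle(u, v):
    """Angle in radians between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cosine)


def random_layer(width, dim, rng):
    """Weight matrix with i.i.d. standard Gaussian entries (hypothetical helper)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(width)]


def apply_layer(weights, x, relu=False):
    """Apply the layer to x, optionally followed by ReLU."""
    out = [sum(w * a for w, a in zip(row, x)) for row in weights]
    return [max(0.0, o) for o in out] if relu else out


def relu_angle_map(theta):
    """Assumed mapping: expected output angle after a wide Gaussian ReLU layer."""
    cosine = (math.sin(theta) + (math.pi - theta) * math.cos(theta)) / math.pi
    return math.acos(max(-1.0, min(1.0, cosine)))


rng = random.Random(0)
dim, width = 16, 4000
theta_in = math.pi / 3

# Two unit vectors at angle theta_in in the first two coordinates.
x = [1.0] + [0.0] * (dim - 1)
y = [math.cos(theta_in), math.sin(theta_in)] + [0.0] * (dim - 2)

W = random_layer(width, dim, rng)
theta_linear = angle(apply_layer(W, x), apply_layer(W, y))
theta_relu = angle(apply_layer(W, x, relu=True), apply_layer(W, y, relu=True))
theta_pred = relu_angle_map(theta_in)

print(f"input angle:        {theta_in:.3f}")
print(f"after linear layer: {theta_linear:.3f}")
print(f"after ReLU layer:   {theta_relu:.3f} (assumed formula: {theta_pred:.3f})")
```

For a wide layer, the linear output angle should stay close to the input angle, while the ReLU output angle should be smaller and close to the value given by the assumed formula. The convolutional setting studied in the paper is not covered by this sketch.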

preprint · 2022 · arXiv

An initial alignment between neural network and target is needed for gradient descent to learn

This paper introduces the notion of "Initial Alignment" (INAL) between a neural network at initialization and a target function. It is proved that if a network and a Boolean target function do not have a noticeable INAL, then noisy gradient descent on a fully connected network with normalized i.i.d. initialization will not learn in polynomial time. Thus a certain amount of knowledge about the target (measured by the INAL) is needed in the architecture design. This also provides an answer to an open problem posed in [AS20]. The results are based on deriving lower bounds for descent algorithms on symmetric neural networks without explicit knowledge of the target function beyond its INAL.

preprint · 2020 · arXiv

On arithmetic progressions in symmetric sets in finite field model

We consider two problems regarding arithmetic progressions in symmetric sets in the finite field (product space) model. First, we show that a symmetric set $S\subseteq\mathbb{Z}_q^n$ containing $|S|=\mu\cdot q^n$ elements must contain at least $\delta(q,\mu)\cdot q^n\cdot 2^n$ arithmetic progressions $x,x+d,\ldots,x+(q-1)\cdot d$ such that the difference $d$ is restricted to lie in $\{0,1\}^n$. Second, we show that for prime $p$ a symmetric set $S\subseteq\mathbb{F}^n_p$ with $|S|=\mu\cdot p^n$ elements contains at least $\mu^{C(p)}\cdot p^{2n}$ arithmetic progressions of length $p$. This establishes that the qualitative behavior of longer arithmetic progressions in symmetric sets is the same as for progressions of length three.
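
To make the first counting statement in the abstract above concrete, here is a small brute-force Python sketch that counts progressions of length q with difference restricted to {0,1}^n inside a given set. It is only an illustration under stated assumptions, not the paper's method: "symmetric" is assumed to mean invariant under permutations of coordinates, the count is taken over pairs (x, d) and includes the trivial difference d = 0, and the function name `count_restricted_aps` and the example sets are hypothetical.

```python
from itertools import product


def count_restricted_aps(S, q, n):
    """Count pairs (x, d) with d in {0,1}^n such that
    x, x+d, ..., x+(q-1)d (coordinates mod q) all lie in S."""
    S = set(S)
    count = 0
    for x in S:
        for d in product((0, 1), repeat=n):
            # x itself is in S, so only the steps k = 1..q-1 need checking.
            if all(
                tuple((xi + k * di) % q for xi, di in zip(x, d)) in S
                for k in range(1, q)
            ):
                count += 1
    return count


q, n = 3, 3
whole_space = list(product(range(q), repeat=n))

# A permutation-invariant example set: vectors whose coordinates sum to 0 mod q.
zero_sum = [x for x in whole_space if sum(x) % q == 0]

print(count_restricted_aps(whole_space, q, n))  # 216 = q^n * 2^n
print(count_restricted_aps(zero_sum, q, n))     # 18
```

In the second example the set has density 1/3, and a difference d keeps the progression inside the set only when its number of ones is divisible by 3, so each of the 9 points contributes exactly two progressions (d = 000 and d = 111).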

preprint · 2019 · arXiv

Bayesian Decision Making in Groups is Hard

We study the computations that Bayesian agents undertake when exchanging opinions over a network. The agents act repeatedly on their private information and take myopic actions that maximize their expected utility according to a fully rational posterior belief. We show that such computations are NP-hard for two natural utility functions: one with binary actions, and another where agents reveal their posterior beliefs. In fact, we show that distinguishing between posteriors that are concentrated on different states of the world is NP-hard. Therefore, even approximating the Bayesian posterior beliefs is hard. We also describe a natural search algorithm to compute agents' actions, which we call elimination of impossible signals, and show that if the network is transitive, the algorithm can be modified to run in polynomial time.

preprint · 2016 · arXiv

Forbidden Subgraph Bounds for Parallel Repetition and the Density Hales-Jewett Theorem

We study a special kind of bounds (so-called forbidden subgraph bounds, cf. Feige, Verbitsky '02) for parallel repetition of multi-prover games. First, we show that forbidden subgraph upper bounds for $r \ge 3$ provers imply the same bounds for the density Hales-Jewett theorem for an alphabet of size $r$. As a consequence, this yields a new family of games with slow decrease in the parallel repetition value. Second, we introduce a new technique for proving exponential forbidden subgraph upper bounds and explore its power and limitations. In particular, we obtain exponential upper bounds for two-prover games with question graphs of treewidth at most two and show that our method cannot give exponential bounds for all two-prover graphs.

preprint · 2015 · arXiv

Upper Tail Estimates with Combinatorial Proofs

We study generalisations of a simple, combinatorial proof of a Chernoff bound similar to the one by Impagliazzo and Kabanets (RANDOM, 2010). In particular, we prove a randomized version of the hitting property of expander random walks and apply it to obtain a concentration bound for expander random walks which is essentially optimal for small deviations and a large number of steps. At the same time, we present a simpler proof that still yields a "right" bound settling a question asked by Impagliazzo and Kabanets. Next, we obtain a simple upper tail bound for polynomials with input variables in $[0, 1]$ which are not necessarily independent, but obey a certain condition inspired by Impagliazzo and Kabanets. The resulting bound is used by Holenstein and Sinha (FOCS, 2012) in the proof of a lower bound for the number of calls in a black-box construction of a pseudorandom generator from a one-way function. We then show that the same technique yields the upper tail bound for the number of copies of a fixed graph in an Erdős-Rényi random graph, matching the one given by Janson, Oleszkiewicz and Ruciński (Israel J. Math, 2002).