Source author record

Eran Yahav

Eran Yahav appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Machine Learning, Formal Languages and Automata Theory, Programming Languages, Computation and Language, Cryptography and Security

Catalog footprint

What is connected

8 works

5 topics

4 close collaborators


Preprint, 2022, arXiv

How Attentive are Graph Attention Networks?

Graph Attention Networks (GATs) are one of the most popular GNN architectures and are considered the state-of-the-art architecture for representation learning with graphs. In GAT, every node attends to its neighbors given its own representation as the query. However, in this paper we show that GAT computes a very limited kind of attention: the ranking of the attention scores is unconditioned on the query node. We formally define this restricted kind of attention as static attention and distinguish it from a strictly more expressive dynamic attention. Because GATs use a static attention mechanism, there are simple graph problems that GAT cannot express: in a controlled problem, we show that static attention hinders GAT from even fitting the training data. To remove this limitation, we introduce a simple fix by modifying the order of operations and propose GATv2: a dynamic graph attention variant that is strictly more expressive than GAT. We perform an extensive evaluation and show that GATv2 outperforms GAT across 11 OGB and other benchmarks while matching their parametric costs. Our code is available at https://github.com/tech-srl/how_attentive_are_gats. GATv2 is available as part of the PyTorch Geometric library, the Deep Graph Library, and the TensorFlow GNN library.
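The difference between static and dynamic attention can be illustrated with a small, dependency-free sketch. It assumes the commonly cited scoring forms, LeakyReLU(a^T [W h_i || W h_j]) for GAT and a^T LeakyReLU(W [h_i || h_j]) for GATv2. The dimensions, random weights, fully connected toy "graph" and all function names are assumptions made for illustration, not the authors' code.

```python
import random

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(W, v):
    return [dot(row, v) for row in W]

def gat_score(a_l, a_r, W, h_i, h_j):
    # GAT: LeakyReLU(a^T [W h_i || W h_j]), with a split into (a_l, a_r).
    return leaky_relu(dot(a_l, matvec(W, h_i)) + dot(a_r, matvec(W, h_j)))

def gatv2_score(a, W2, h_i, h_j):
    # GATv2: a^T LeakyReLU(W [h_i || h_j]); the nonlinearity sits between W and a.
    return dot(a, [leaky_relu(z) for z in matvec(W2, h_i + h_j)])

def ranking(scores):
    # Key indices ordered from highest to lowest score.
    return sorted(range(len(scores)), key=lambda j: -scores[j])

rng = random.Random(0)

def rand_vec(k):
    return [rng.gauss(0, 1) for _ in range(k)]

d, d_out, n = 3, 4, 5                      # hypothetical toy sizes
nodes = [rand_vec(d) for _ in range(n)]    # every node is both a query and a key
W = [rand_vec(d) for _ in range(d_out)]
a_l, a_r = rand_vec(d_out), rand_vec(d_out)
W2 = [rand_vec(2 * d) for _ in range(d_out)]
a2 = rand_vec(d_out)

gat_rankings = [ranking([gat_score(a_l, a_r, W, q, k) for k in nodes]) for q in nodes]
gatv2_rankings = [ranking([gatv2_score(a2, W2, q, k) for k in nodes]) for q in nodes]

print("GAT rankings per query:  ", gat_rankings)
print("GATv2 rankings per query:", gatv2_rankings)
```

In the GAT form the query contributes only an additive offset inside a monotone function, so every query produces the same ordering of keys. In the GATv2 form the ordering is free to change with the query.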

Preprint, 2021, arXiv

On the Bottleneck of Graph Neural Networks and its Practical Implications

Since the proposal of the graph neural network (GNN) by Gori et al. (2005) and Scarselli et al. (2008), one of the major problems in training GNNs was their struggle to propagate information between distant nodes in the graph. We propose a new explanation for this problem: GNNs are susceptible to a bottleneck when aggregating messages across a long path. This bottleneck causes the over-squashing of exponentially growing information into fixed-size vectors. As a result, GNNs fail to propagate messages originating from distant nodes and perform poorly when the prediction task depends on long-range interaction. In this paper, we highlight the inherent problem of over-squashing in GNNs: we demonstrate that the bottleneck hinders popular GNNs from fitting long-range signals in the training data; we further show that GNNs that absorb incoming edges equally, such as GCN and GIN, are more susceptible to over-squashing than GAT and GGNN; finally, we show that prior work, which extensively tuned GNN models of long-range problems, suffers from over-squashing, and that breaking the bottleneck improves their state-of-the-art results without any tuning or additional weights. Our code is available at https://github.com/tech-srl/bottleneck/.
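To make "exponentially growing information" concrete, the sketch below counts how many nodes can influence a single node after k rounds of message passing on a full binary tree. The tree, the function names and the depth are assumptions chosen for illustration; the paper's own experiments are not reproduced here.

```python
from collections import deque

def full_binary_tree(depth):
    """Adjacency list of a full binary tree; node 0 is the root."""
    n = 2 ** (depth + 1) - 1
    adj = {v: [] for v in range(n)}
    for v in range(1, n):
        parent = (v - 1) // 2
        adj[v].append(parent)
        adj[parent].append(v)
    return adj

def receptive_field_size(adj, source, num_layers):
    """Number of nodes within num_layers hops of source (plain BFS)."""
    seen = {source}
    frontier = deque([(source, 0)])
    while frontier:
        v, dist = frontier.popleft()
        if dist == num_layers:
            continue  # messages from farther away need more layers
        for u in adj[v]:
            if u not in seen:
                seen.add(u)
                frontier.append((u, dist + 1))
    return len(seen)

tree = full_binary_tree(depth=6)
sizes = [receptive_field_size(tree, 0, k) for k in range(7)]
print(sizes)
```

The receptive field roughly doubles with each added layer, while the vector that has to summarize it keeps a fixed size.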

Preprint, 2020, arXiv

A Formal Hierarchy of RNN Architectures

We develop a formal hierarchy of the expressive capacity of RNN architectures. The hierarchy is based on two formal properties: space complexity, which measures the RNN's memory, and rational recurrence, defined as whether the recurrent update can be described by a weighted finite-state machine. We place several RNN variants within this hierarchy. For example, we prove the LSTM is not rational, which formally separates it from the related QRNN (Bradbury et al., 2016). We also show how these models' expressive capacity is expanded by stacking multiple layers or composing them with different pooling functions. Our results build on the theory of "saturated" RNNs (Merrill, 2019). While formally extending these findings to unsaturated RNNs is left to future work, we hypothesize that the practical learnable capacity of unsaturated RNNs obeys a similar hierarchy. Experimental findings from training unsaturated networks on formal languages support this conjecture.
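For readers unfamiliar with weighted finite-state machines, here is a minimal sketch of one: a word's weight is the product initial^T · M_x1 · ... · M_xn · final. The two-state automaton below, which counts occurrences of "a", is a textbook-style example chosen for illustration; it is an assumption and does not come from the paper.

```python
def wfa_weight(word, initial, transitions, final):
    """Weight of word: initial^T * M[x1] * ... * M[xn] * final."""
    state = list(initial)
    for symbol in word:
        M = transitions[symbol]
        # Row vector times matrix: one linear update per input symbol.
        state = [
            sum(state[i] * M[i][j] for i in range(len(state)))
            for j in range(len(M[0]))
        ]
    return sum(s * f for s, f in zip(state, final))

# Hypothetical 2-state WFA whose weight is the number of 'a' symbols.
INITIAL = [1, 0]
FINAL = [0, 1]
TRANSITIONS = {
    "a": [[1, 1], [0, 1]],  # stay in state 0, and also move weight to state 1
    "b": [[1, 0], [0, 1]],  # identity: 'b' changes nothing
}

print(wfa_weight("abab", INITIAL, TRANSITIONS, FINAL))
```

A recurrent update of this per-symbol linear form is the kind of computation that the notion of rational recurrence refers to.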

Preprint, 2020, arXiv

Extracting Automata from Recurrent Neural Networks Using Queries and Counterexamples

We present a novel algorithm that uses exact learning and abstraction to extract a deterministic finite automaton describing the state dynamics of a given trained RNN. We do this using Angluin's L* algorithm as a learner and the trained RNN as an oracle. Our technique efficiently extracts accurate automata from trained RNNs, even when the state vectors are large and require fine differentiation.
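The learner/oracle split can be sketched with one ingredient of L*: filling an observation table through membership queries and extending it until it is closed. This is only a partial illustration. The oracle below is a plain Python function standing in for a trained RNN, and the equivalence queries and abstraction that the paper relies on are omitted. All names are hypothetical.

```python
def stand_in_oracle(word):
    # Stand-in for a trained RNN classifier (an assumption for this sketch):
    # accepts words containing an even number of 'a'.
    return word.count("a") % 2 == 0

def row(prefix, suffixes, member):
    """Observation-table row: membership answers for prefix + each suffix."""
    return tuple(member(prefix + e) for e in suffixes)

def find_unclosed(S, E, alphabet, member):
    """Return a one-symbol extension of S whose row matches no row of S, if any."""
    s_rows = {row(s, E, member) for s in S}
    for s in S:
        for a in alphabet:
            if row(s + a, E, member) not in s_rows:
                return s + a
    return None

def close_table(S, E, alphabet, member):
    """Add prefixes until every one-symbol extension behaves like a known prefix."""
    S = list(S)
    while True:
        witness = find_unclosed(S, E, alphabet, member)
        if witness is None:
            return S
        S.append(witness)

states = close_table(S=[""], E=[""], alphabet="ab", member=stand_in_oracle)
print(states)
```

Each distinct row of the closed table corresponds to one state of the hypothesis automaton; here the two prefixes correspond to the even and odd parity states.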

Preprint, 2020, arXiv

Structural Language Models of Code

We address the problem of any-code completion: generating a missing piece of source code in a given program without any restriction on the vocabulary or structure. We introduce a new approach to any-code completion that leverages the strict syntax of programming languages to model a code snippet as a tree: structural language modeling (SLM). SLM estimates the probability of the program's abstract syntax tree (AST) by decomposing it into a product of conditional probabilities over its nodes. We present a neural model that computes these conditional probabilities by considering all AST paths leading to a target node. Unlike previous techniques that have severely restricted the kinds of expressions that can be generated in this task, our approach can generate arbitrary code in any programming language. Our model significantly outperforms both seq2seq and a variety of structured approaches in generating Java and C# code. Our code, data, and trained models are available at http://github.com/tech-srl/slm-code-generation/. An online demo is available at http://AnyCodeGen.org.
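The decomposition of a tree's probability into per-node conditionals can be sketched as follows. This is a deliberately simplified assumption: each node is conditioned only on the labels along its path from the root, whereas the paper's model considers all AST paths leading to the target node. The toy AST and probability table are hypothetical.

```python
import math

# Tiny AST as (label, [children]); stands in for a real parser's output.
AST = ("Greater", [("Name:x", []), ("IntLit:1", [])])

# Hypothetical conditional model: P(node label | labels on the path from the root).
TOY_MODEL = {
    ((), "Greater"): 0.5,
    (("Greater",), "Name:x"): 0.5,
    (("Greater",), "IntLit:1"): 0.25,
}

def tree_log_prob(node, model, path=()):
    """Sum of log conditional probabilities over all nodes, visited depth-first."""
    label, children = node
    logp = math.log(model[(path, label)])
    for child in children:
        logp += tree_log_prob(child, model, path + (label,))
    return logp

print(math.exp(tree_log_prob(AST, TOY_MODEL)))
```

Working in log space turns the product of conditionals into a sum, which avoids underflow on larger trees.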

Preprint, 2019, arXiv

Learning Deterministic Weighted Automata with Queries and Counterexamples

We present an algorithm for extraction of a probabilistic deterministic finite automaton (PDFA) from a given black-box language model, such as a recurrent neural network (RNN). The algorithm is a variant of the exact-learning algorithm L*, adapted to a probabilistic setting with noise. The key insight is the use of conditional probabilities for observations, and the introduction of a local tolerance when comparing them. When applied to RNNs, our algorithm often achieves better word error rate (WER) and normalised discounted cumulative gain (NDCG) than that achieved by spectral extraction of weighted finite automata (WFA) from the same networks. PDFAs are substantially more expressive than n-grams, and are guaranteed to be stochastic and deterministic, unlike spectrally extracted WFAs.
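One plausible reading of "local tolerance" is that two next-token distributions count as equal when every symbol's probability differs by at most a fixed threshold. The sketch below encodes that reading. The function name, the dictionary representation and the maximum-difference criterion are assumptions, not the paper's exact definition.

```python
def within_tolerance(p, q, tolerance):
    """True if two next-token distributions differ by at most tolerance per symbol."""
    symbols = set(p) | set(q)
    return all(abs(p.get(s, 0.0) - q.get(s, 0.0)) <= tolerance for s in symbols)

# Hypothetical conditional distributions observed after two different prefixes.
after_prefix_1 = {"a": 0.5, "b": 0.5}
after_prefix_2 = {"a": 0.55, "b": 0.45}

print(within_tolerance(after_prefix_1, after_prefix_2, tolerance=0.1))
print(within_tolerance(after_prefix_1, after_prefix_2, tolerance=0.01))
```

A looser tolerance merges more prefixes into the same state and yields a smaller automaton; a tighter one keeps noisy observations apart.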

Preprint, 2016, arXiv

Optimal Learning of Specifications from Examples

A fundamental challenge in synthesis from examples is designing a learning algorithm that poses the minimal number of questions to an end user while guaranteeing that the target hypothesis is discovered. Such guarantees are practically important because they ensure that end users will not be overburdened with unnecessary questions. We present SPEX, a learning algorithm that addresses the above challenge. SPEX considers the hypothesis space of formulas over first-order predicates and learns the correct hypothesis by only asking the user simple membership queries for concrete examples. Thus, SPEX is directly applicable to any learning problem that fits its hypothesis space and uses membership queries. SPEX works by iteratively eliminating candidate hypotheses from the space until converging to the target hypothesis. The main idea is to use the implication order between hypotheses to guarantee that in each step the question presented to the user obtains maximal pruning of the space. This problem is particularly challenging when predicates are potentially correlated. To show that SPEX is practically useful, we expressed two rather different application domains in its framework: learning programs for the domain of technical analysts (stock trading) and learning data structure specifications. The experimental results show that SPEX's optimality guarantee is effective: it drastically reduces the number of questions posed to the user while successfully learning the exact hypothesis.
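The general idea of eliminating hypotheses with membership queries can be sketched with a simple greedy strategy: ask about the example that splits the remaining candidates most evenly. This is a stand-in heuristic, not SPEX itself, which uses the implication order between hypotheses. The threshold hypothesis space, the simulated user and all names are assumptions made for illustration.

```python
def accepts(threshold, x):
    # Hypothetical hypothesis space: "x >= threshold" for integer thresholds.
    return x >= threshold

def learn(candidates, examples, ask_user):
    """Eliminate hypotheses with membership queries until one remains."""
    candidates = list(candidates)
    queries = []
    while len(candidates) > 1:
        def split_quality(x):
            yes = sum(accepts(h, x) for h in candidates)
            return min(yes, len(candidates) - yes)  # size of the smaller side
        x = max(examples, key=split_quality)
        if split_quality(x) == 0:
            break  # no example can tell the remaining candidates apart
        answer = ask_user(x)
        queries.append(x)
        candidates = [h for h in candidates if accepts(h, x) == answer]
    return candidates, queries

TARGET = 5  # the hypothesis the simulated user has in mind

def simulated_user(x):
    return accepts(TARGET, x)

remaining, asked = learn(range(8), range(8), simulated_user)
print(remaining, asked)
```

With eight candidate thresholds the loop needs three questions, since each answer removes half of the remaining space.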

Preprint, 2014, arXiv

Exploiting Social Navigation

We present an effective Sybil attack against social location-based services. Our attack is based on creating a large number of reputed "bot drivers", and controlling their reported locations using fake GPS reports. We show how this attack can be used to influence social navigation systems by applying it to Waze, a prominent social navigation application used by over 50 million drivers. We show that our attack can fake traffic jams and dramatically influence routing decisions. We present several techniques for preventing the attack, and show that effective mitigation likely requires the use of additional carrier information.
