Source author record

Daphna Weinshall

Daphna Weinshall appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher | Unclaimed source record

Computer Vision, Machine Learning, Artificial Intelligence, Computation and Language

Catalog footprint

What is connected

5 works

4 topics

4 close collaborators

Preprint, 2022, arXiv

The Grammar-Learning Trajectories of Neural Language Models

The learning trajectories of linguistic phenomena in humans provide insight into linguistic representation, beyond what can be gleaned from inspecting the behavior of an adult speaker. To apply a similar approach to analyze neural language models (NLM), it is first necessary to establish that different models are similar enough in the generalizations they make. In this paper, we show that NLMs with different initialization, architecture, and training data acquire linguistic phenomena in a similar order, despite their different end performance. These findings suggest that there is some mutual inductive bias that underlies these models' learning of linguistic phenomena. Taking inspiration from psycholinguistics, we argue that studying this inductive bias is an opportunity to study the linguistic representation implicit in NLMs. Leveraging these findings, we compare the relative performance on different phenomena at varying learning stages with simpler reference models. Results suggest that NLMs exhibit consistent "developmental" stages. Moreover, we find the learning trajectory to be approximately one-dimensional: given an NLM with a certain overall performance, it is possible to predict what linguistic generalizations it has already acquired. Initial analysis of these stages presents phenomena clusters (notably morphological ones), whose performance progresses in unison, suggesting a potential link between the generalizations behind them.

Preprint, 2021, arXiv

More Is More -- Narrowing the Generalization Gap by Adding Classification Heads

Overfitting is a fundamental problem in machine learning in general, and in deep learning in particular. To reduce overfitting and improve generalization in image classification, some approaches employ invariance to a group of transformations, such as rotations and reflections. However, since not all objects necessarily exhibit the same invariance, it seems desirable to allow the network to learn the useful level of invariance from the data. To this end, motivated by self-supervision, we introduce an architecture enhancement for existing neural network models based on input transformations, termed 'TransNet', together with a training algorithm suitable for it. Our model can be employed during training time only and then pruned for prediction, resulting in an architecture equivalent to the base model. Thus pruned, we show that our model improves performance on various datasets while exhibiting improved generalization, which is achieved in turn by enforcing soft invariance on the convolutional kernels of the last layer in the base model. Theoretical analysis is provided to support the proposed method.

Preprint, 2017, arXiv

Distance-based Confidence Score for Neural Network Classifiers

The reliable measurement of confidence in classifiers' predictions is very important for many applications and is, therefore, an important part of classifier design. Yet, although deep learning has received tremendous attention in recent years, not much progress has been made in quantifying the prediction confidence of neural network classifiers. Bayesian models offer a mathematically grounded framework to reason about model uncertainty, but usually come with prohibitive computational costs. In this paper we propose a simple, scalable method to achieve a reliable confidence score, based on the data embedding derived from the penultimate layer of the network. We investigate two ways to achieve desirable embeddings, by using either a distance-based loss or Adversarial Training. We then test the benefits of our method when used for classification error prediction, weighting an ensemble of classifiers, and novelty detection. In all tasks we show significant improvement over traditional, commonly used confidence scores.
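As a rough illustration of a confidence score computed from distances in an embedding space, the sketch below weights the k nearest training embeddings by exp(-distance) and reports the share of that weight carried by neighbors with the predicted label. The function name `distance_confidence`, the Euclidean metric, the exponential weighting and the default `k` are assumptions made for illustration; the abstract does not give the exact formula, so this should not be read as the authors' definition.

```python
import math


def distance_confidence(query, train_embeddings, train_labels, predicted_label, k=5):
    """Hypothetical distance-based confidence in [0, 1].

    query            -- embedding of the test point (sequence of floats)
    train_embeddings -- embeddings of training points, same dimensionality
    train_labels     -- class label of each training embedding
    predicted_label  -- label the classifier assigned to the query
    k                -- number of nearest neighbors to consider (assumed value)
    """
    # Pair every training point with its Euclidean distance to the query.
    neighbors = sorted(
        ((math.dist(query, emb), label)
         for emb, label in zip(train_embeddings, train_labels)),
        key=lambda pair: pair[0],
    )[:k]

    # Closer neighbors receive exponentially larger weight.
    weights = [(math.exp(-dist), label) for dist, label in neighbors]
    total = sum(w for w, _ in weights)
    agreeing = sum(w for w, label in weights if label == predicted_label)

    # Guard against an empty neighborhood or numerical underflow.
    return agreeing / total if total > 0 else 0.0
```

A score near 1 means the query sits among training points of the predicted class; a score near 0 means its neighborhood is dominated by other classes.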

Preprint, 2016, arXiv

Novelty Detection in MultiClass Scenarios with Incomplete Set of Class Labels

We address the problem of novelty detection in multiclass scenarios where some class labels are missing from the training set. Our method is based on the initial assignment of confidence values, which measure the affinity between a new test point and each known class. We first compare the values of the two top elements in this vector of confidence values. At the heart of our method lies the training of an ensemble of classifiers, each trained to discriminate known from novel classes based on some partition of the training data into presumed-known and presumed-novel classes. Our final novelty score is derived from the output of this ensemble of classifiers. We evaluated our method on two datasets of images containing a relatively large number of classes - the Caltech-256 and Cifar-100 datasets. We compared our method to 3 alternative methods which represent commonly used approaches, including the one-class SVM, novelty based on k-NN, novelty based on maximal confidence, and the recent KNFST method. The results show a very clear and marked advantage for our method over all alternative methods, in an experimental setup where class labels are missing during training.
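The comparison of the two top confidence values can be pictured with a small sketch. The abstract does not say whether the comparison is a difference, a ratio or something else, so the difference used here, and the helper name `top_two_margin`, are assumptions for illustration only.

```python
def top_two_margin(confidences):
    """Difference between the largest and second-largest confidence values.

    `confidences` holds one affinity value per known class. Using a plain
    difference is an assumed choice; the source only states that the two
    top values are compared.
    """
    if len(confidences) < 2:
        raise ValueError("need confidence values for at least two known classes")
    top, second = sorted(confidences, reverse=True)[:2]
    return top - second
```

Under this assumed reading, a small margin means no single known class stands out for the test point. In the method described above, such a signal is only a first step; the final novelty score comes from the ensemble of classifiers.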

Preprint, 2015, arXiv

A Cheap System for Vehicle Speed Detection

The reliable detection of the speed of moving vehicles is considered key to traffic law enforcement in most countries, and is seen by many as an important tool to reduce the number of traffic accidents and fatalities. Many automatic systems and different methods are employed in different countries, but as a rule they tend to be expensive and/or labor intensive, often employing outdated technology due to the long development time. Here we describe a speed detection system that relies on simple everyday equipment - a laptop and a consumer web camera. Our method is based on tracking the license plates of cars, which gives the relative movement of the cars in the image. This image displacement is translated to actual motion by projecting onto a reference plane, where the reference plane is the road itself. However, since license plates do not touch the road, we must compensate for the resulting distortion in the speed measurement. We show how to compute the compensation factor using knowledge of the standard license plate dimensions. Consequently, our system computes the true speed of moving vehicles quickly and accurately. We show promising results on videos obtained in a number of scenes and with different car models.
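To see why a plate raised above the road distorts the measurement, consider a pinhole camera at height H and a plate at height h. A point at height h, projected along the camera ray onto the road plane, lands farther away by a factor H / (H - h), so displacements measured on the road plane are exaggerated by the same factor. The sketch below undoes that scaling. Note that it assumes the camera and plate heights are known directly, whereas the abstract says the compensation factor is derived from the standard plate dimensions; the function name and parameters are hypothetical.

```python
def road_speed_kmh(projected_displacement_m, dt_s, camera_height_m, plate_height_m):
    """Estimate vehicle speed from a displacement measured on the road plane.

    projected_displacement_m -- plate displacement after projection to the road plane
    dt_s                     -- time elapsed between the two frames, in seconds
    camera_height_m          -- camera height above the road (assumed known)
    plate_height_m           -- plate height above the road (assumed known)
    """
    if dt_s <= 0:
        raise ValueError("dt_s must be positive")
    if not 0 <= plate_height_m < camera_height_m:
        raise ValueError("plate must lie between the road and the camera height")

    # Similar triangles: the road-plane projection stretches distances by
    # H / (H - h), so multiply by (H - h) / H to recover the true motion.
    compensation = (camera_height_m - plate_height_m) / camera_height_m
    true_displacement_m = projected_displacement_m * compensation

    # Convert metres per second to kilometres per hour.
    return true_displacement_m / dt_s * 3.6
```

For example, with a camera at 5 m and a plate at 0.5 m, the compensation factor is 0.9, so an uncorrected estimate would overstate the speed by roughly 11 percent.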