Source author record

Tayfun Gokmen

Tayfun Gokmen appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

eess.SY, Emerging Technologies, Machine Learning, Systems and Control, cond-mat.mes-hall, cond-mat.mtrl-sci, cond-mat.str-el, Hardware Architecture, math.OC, physics.soc-ph

Catalog footprint

What is connected

6 works

10 topics

4 close collaborators

Preprint, 2026, arXiv

Analog In-memory Training on General Non-ideal Resistive Elements: The Impact of Response Functions

As the economic and environmental costs of training and deploying large vision or language models increase dramatically, analog in-memory computing (AIMC) emerges as a promising energy-efficient solution. However, the training perspective, especially its training dynamics, is underexplored. In AIMC hardware, the trainable weights are represented by the conductance of resistive elements and updated using consecutive electrical pulses. Ideally, the conductance changes by a constant in response to each pulse; in reality, the change is scaled by asymmetric and non-linear response functions, leading to non-ideal training dynamics. This paper provides a theoretical foundation for gradient-based training on AIMC hardware with non-ideal response functions. We demonstrate that asymmetric response functions negatively impact Analog SGD by imposing an implicit penalty on the objective. To overcome the issue, we propose the Residual Learning algorithm, which provably converges exactly to a critical point by solving a bilevel optimization problem. We demonstrate that the proposed method can be extended to address other hardware imperfections, such as limited response granularity. To our knowledge, this is the first paper to investigate the impact of a class of generic non-ideal response functions. The conclusion is supported by simulations validating our theoretical insights.
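The effect of an asymmetric response function on Analog SGD can be illustrated with a toy single-weight simulation. This is only a sketch under stated assumptions: the linear "soft-bounds" response with saturation level `tau`, and the helper names `analog_sgd_step` and `run`, are hypothetical choices for illustration and are not taken from the paper, which treats a generic class of response functions.

```python
def analog_sgd_step(w, grad, lr, tau=None):
    """One SGD step on a single weight; tau=None models an ideal device."""
    delta = -lr * grad  # weight change requested by plain SGD
    if tau is None:
        return w + delta
    # Assumed soft-bounds response: upward pulses weaken near +tau,
    # downward pulses weaken near -tau, so up and down steps differ.
    response = (1.0 - w / tau) if delta > 0 else (1.0 + w / tau)
    return w + delta * response


def run(w0, grads, lr, tau=None):
    """Apply a sequence of gradients to a starting weight w0."""
    w = w0
    for g in grads:
        w = analog_sgd_step(w, g, lr, tau)
    return w


# Zero-mean gradient sequence: an ideal device leaves the weight in place.
grads = [1.0, -1.0] * 100
w_ideal = run(0.5, grads, lr=0.1)
w_asym = run(0.5, grads, lr=0.1, tau=1.0)
```

In this toy setting the ideal weight stays at its starting value, while the asymmetric device drifts toward the point where its up and down responses balance (near zero here). That drift acts like an extra pull on the weight that is independent of the objective, which is one way to picture the implicit penalty the abstract describes.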

Preprint, 2023, arXiv

Gridlock Models with the IBM Mega Traffic Simulator: Dependency on Vehicle Acceleration and Road Structure

Rush hour and sustained traffic flows in eight cities are studied using the IBM Mega Traffic Simulator to understand the importance of road structures and vehicle acceleration in the prevention of gridlock. Individual cars among the tens of thousands launched are monitored at every simulation time step using live streaming data transfer from the simulation software to analysis software on another computer. A measure of gridlock is the fraction of cars moving at less than 30% of their local road speed. Plots of this fraction versus the instantaneous number of cars on the road show hysteresis during rush hour simulations, indicating that it can take twice as long to unravel clogged roads as to fill them. The area under the hysteresis loop is used as a measure of gridlock to compare different cities normalized to the same central areas. The differences between cities, combined with differences between idealized models using square or triangular road grids, indicate that gridlock tends to occur most when there are a small number of long roads that channel large fractions of traffic. These long roads help light traffic flow but they make heavy flows worse. Increasing the speed on these long roads makes gridlock even worse in heavy conditions. City throughput rates are also modeled using a smooth ramp up to a constant vehicle launch rate. Models with increasing acceleration for the same road speeds show clear improvements in city traffic flow as a result of faster interactions at intersections and merging points. However, these improvements are relatively small when the gridlock is caused by long roads having many cars waiting to exit at the same intersection. In general, gridlock in our models begins at intersections regardless of the available road space in the network.
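The two gridlock measures described in this abstract can be sketched in a few lines. This is a minimal illustration, not the simulator's code: the function names are hypothetical, and computing the loop area with the shoelace formula over a closed (car count, slow fraction) trajectory is an assumption about how such an area could be obtained.

```python
def gridlock_fraction(speeds, road_speeds, threshold=0.3):
    """Fraction of cars moving slower than threshold * their local road speed."""
    if not speeds:
        return 0.0
    slow = sum(1 for v, limit in zip(speeds, road_speeds) if v < threshold * limit)
    return slow / len(speeds)


def hysteresis_loop_area(car_counts, fractions):
    """Area enclosed by the closed (car count, slow fraction) trajectory.

    Uses the shoelace formula; the last sample is joined back to the first.
    """
    n = len(car_counts)
    twice_area = 0.0
    for i in range(n):
        j = (i + 1) % n
        twice_area += car_counts[i] * fractions[j] - car_counts[j] * fractions[i]
    return abs(twice_area) / 2.0


# Four cars: three are below 30% of their local road speed.
fraction = gridlock_fraction([10, 2, 5, 30], [40, 40, 20, 30])

# Idealized rectangular loop: roads fill with no slow cars, then empty with half slow.
area = hysteresis_loop_area([0, 100, 100, 0], [0.0, 0.0, 0.5, 0.5])
```

A larger enclosed area means the slow fraction on the way out of rush hour stays well above its value on the way in at the same car count, which is the hysteresis the abstract uses to compare cities.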

Preprint, 2022, arXiv

Neural Network Training with Asymmetric Crosspoint Elements

Analog crossbar arrays comprising programmable nonvolatile resistors are under intense investigation for acceleration of deep neural network training. However, the ubiquitous asymmetric conductance modulation of practical resistive devices critically degrades the classification performance of networks trained with conventional algorithms. Here, we describe and experimentally demonstrate an alternative fully-parallel training algorithm: Stochastic Hamiltonian Descent. Instead of conventionally tuning weights in the direction of the error function gradient, this method programs the network parameters to successfully minimize the total energy (Hamiltonian) of the system that incorporates the effects of device asymmetry. We provide critical intuition on why device asymmetry is fundamentally incompatible with conventional training algorithms and how the new approach exploits it as a useful feature instead. Our technique enables immediate realization of analog deep learning accelerators based on readily available device technologies.

Preprint, 2019, arXiv

Design and Characterization of Superconducting Nanowire-Based Processors for Acceleration of Deep Neural Network Training

Training of deep neural networks (DNNs) is a computationally intensive task and requires massive volumes of data transfer. Performing these operations with conventional von Neumann architectures creates unmanageable time and power costs. Recent studies have shown that mixed-signal designs involving crossbar architectures are capable of achieving acceleration factors as high as 30,000x over state-of-the-art digital processors. These approaches involve utilization of non-volatile memory (NVM) elements as local processors. However, no technology has been developed to date that can satisfy the strict device requirements for the unit cell. This paper presents the superconducting nanowire-based processing element as a cross-point device. The unit cell has many programmable non-volatile states that can be used to perform analog multiplication. Importantly, these states are intrinsically discrete due to quantization of flux, which provides symmetric switching characteristics. Operation of these devices in a crossbar is described and verified with electro-thermal circuit simulations. Finally, validation of the concept in an actual DNN training task is shown using an emulator.

Preprint, 2014, arXiv

Suns-V$_\textrm{OC}$ characteristics of high performance kesterite solar cells

Low open circuit voltage ($V_{OC}$) has been recognized as the number one problem in the current generation of Cu$_{2}$ZnSn(Se,S)$_{4}$ (CZTSSe) solar cells. We report high light intensity and low temperature Suns-$V_{OC}$ measurements in high performance CZTSSe devices. The Suns-$V_{OC}$ curves exhibit bending at high light intensity, which points to several prospective $V_{OC}$-limiting mechanisms that could impact the $V_{OC}$, even at 1 sun for lower performing samples. These $V_{OC}$-limiting mechanisms include low bulk conductivity (because of low hole density or low mobility), bulk or interface defects including tail states, and a non-ohmic back contact for low carrier density CZTSSe. The non-ohmic back contact problem can be detected by Suns-$V_{OC}$ measurements with different monochromatic illumination. These limiting factors may also contribute to an artificially lower $J_{SC}$-$V_{OC}$ diode ideality factor.
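For readers unfamiliar with the diode ideality factor mentioned above, the sketch below estimates it from two Suns-$V_{OC}$ points using the standard single-diode relation $n = \Delta V_{OC} / \left( (kT/q) \, \Delta \ln(\text{suns}) \right)$. It assumes the short-circuit current scales linearly with light intensity; the function name and the sample values are hypothetical and are not data from the paper.

```python
import math

BOLTZMANN_V_PER_K = 8.617333262e-5  # k/q in volts per kelvin


def ideality_factor(voc_low, voc_high, suns_low, suns_high, temperature_k=300.0):
    """Two-point ideality factor from open-circuit voltages (V) at two intensities."""
    thermal_voltage = BOLTZMANN_V_PER_K * temperature_k  # kT/q in volts
    return (voc_high - voc_low) / (thermal_voltage * math.log(suns_high / suns_low))


# Hypothetical example: Voc rises from 0.40 V to 0.48 V between 1 and 10 suns.
n_example = ideality_factor(0.40, 0.48, 1.0, 10.0)
```

Under this relation, a Suns-$V_{OC}$ curve that bends over at high intensity gives a smaller voltage step per decade of light and therefore a smaller apparent $n$, which is consistent with the abstract's remark about an artificially lower ideality factor.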

Preprint, 2010, arXiv

Composite fermion valley polarization energies: Evidence for particle-hole asymmetry

In an ideal two-component two-dimensional electron system, particle-hole symmetry dictates that the fractional quantum Hall states around $\nu = 1/2$ are equivalent to those around $\nu = 3/2$. We demonstrate that composite fermions (CFs) around $\nu = 1/2$ in AlAs possess a valley degree of freedom like their counterparts around $\nu = 3/2$. However, focusing on $\nu = 2/3$ and $4/3$, we find that the energy needed to completely valley polarize the CFs around $\nu = 1/2$ is considerably smaller than the corresponding value for CFs around $\nu = 3/2$, thus betraying a particle-hole symmetry breaking.