Researcher profile

Tayfun Gokmen

Tayfun Gokmen contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 19 - Unverified
Verification L1
Unclaimed author
5 works
0 followers
9 topics
4 close collaborators

Published work

5 published items

preprint, 2026, arXiv

Analog In-memory Training on General Non-ideal Resistive Elements: The Impact of Response Functions

As the economic and environmental costs of training and deploying large vision or language models increase dramatically, analog in-memory computing (AIMC) emerges as a promising energy-efficient solution. However, training on AIMC hardware, and especially its training dynamics, remains underexplored. In AIMC hardware, the trainable weights are represented by the conductance of resistive elements and updated using consecutive electrical pulses. Ideally, the conductance changes by a constant amount in response to each pulse; in reality, the change is scaled by asymmetric and non-linear response functions, leading to non-ideal training dynamics. This paper provides a theoretical foundation for gradient-based training on AIMC hardware with non-ideal response functions. We demonstrate that asymmetric response functions negatively impact Analog SGD by imposing an implicit penalty on the objective. To overcome this issue, we propose the Residual Learning algorithm, which provably converges exactly to a critical point by solving a bilevel optimization problem. We demonstrate that the proposed method can be extended to address other hardware imperfections, such as limited response granularity. To our knowledge, this is the first paper to investigate the impact of a class of generic non-ideal response functions. The conclusions are supported by simulations validating our theoretical insights.
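The implicit penalty described in this abstract can be illustrated with a toy one-weight example. The sketch below is not the paper's algorithm or model: the linear "soft-bounds" response functions, the quadratic objective, the alternating noise, and all names and parameter values (`response_up`, `response_down`, `run_sgd`, `tau`, `w_star`) are assumptions chosen only for illustration.

```python
# Toy illustration (assumed model, not taken from the paper):
# a single weight trained on f(w) = 0.5 * (w - w_star)^2 with noisy gradients.

def response_up(w, tau=1.0):
    """Assumed response for positive updates: shrinks as w approaches +tau."""
    return 1.0 - w / tau


def response_down(w, tau=1.0):
    """Assumed response for negative updates: shrinks as w approaches -tau."""
    return 1.0 + w / tau


def run_sgd(analog, w_star=0.5, lr=0.05, noise=1.0, steps=2000, tau=1.0):
    """Return the average weight over the last 100 steps.

    The gradient noise alternates between +noise and -noise so that the
    run is deterministic and the noise averages to zero.
    """
    w = 0.0
    tail = []
    for k in range(steps):
        eps = noise if k % 2 == 0 else -noise
        grad = (w - w_star) + eps
        delta = -lr * grad  # ideal (digital) update
        if analog:
            # The update is scaled by a direction-dependent response function.
            scale = response_up(w, tau) if delta > 0 else response_down(w, tau)
            delta *= scale
        w += delta
        if k >= steps - 100:
            tail.append(w)
    return sum(tail) / len(tail)


ideal_avg = run_sgd(analog=False)   # settles around w_star
analog_avg = run_sgd(analog=True)   # pulled toward the symmetric point w = 0
```

Under these assumptions the ideal run averages out near `w_star`, while the asymmetric run settles noticeably closer to zero, the point where the two assumed response functions are equal. This behaves like an extra penalty term on the objective, which is the effect the abstract attributes to asymmetric response functions.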

preprint, 2023, arXiv

Gridlock Models with the IBM Mega Traffic Simulator: Dependency on Vehicle Acceleration and Road Structure

Rush hour and sustained traffic flows in eight cities are studied using the IBM Mega Traffic Simulator to understand the importance of road structures and vehicle acceleration in the prevention of gridlock. Individual cars among the tens of thousands launched are monitored at every simulation time step using live streaming data transfer from the simulation software to analysis software on another computer. A measure of gridlock is the fraction of cars moving at less than 30% of their local road speed. Plots of this fraction versus the instantaneous number of cars on the road show hysteresis during rush hour simulations, indicating that it can take twice as long to unravel clogged roads as fill them. The area under the hysteresis loop is used as a measure of gridlock to compare different cities normalized to the same central areas. The differences between cities, combined with differences between idealized models using square or triangular road grids, indicate that gridlock tends to occur most when there are a small number of long roads that channel large fractions of traffic. These long roads help light traffic flow but they make heavy flows worse. Increasing the speed on these long roads makes gridlock even worse in heavy conditions. City throughput rates are also modeled using a smooth ramp up to a constant vehicle launch rate. Models with increasing acceleration for the same road speeds show clear improvements in city traffic flow as a result of faster interactions at intersections and merging points. However, these improvements are relatively small when the gridlock is caused by long roads having many cars waiting to exit at the same intersection. In general, gridlock in our models begins at intersections regardless of the available road space in the network.
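The two gridlock measures named in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' analysis code: the function names and input layout are hypothetical, and the loop area is interpreted here as the area enclosed by the closed curve of slow-car fraction versus number of cars on the road, computed with the shoelace formula.

```python
def gridlock_fraction(speeds, road_speeds, threshold=0.3):
    """Fraction of cars moving at less than `threshold` times their local road speed."""
    if len(speeds) != len(road_speeds):
        raise ValueError("speeds and road_speeds must have the same length")
    if not speeds:
        return 0.0
    slow = sum(1 for v, limit in zip(speeds, road_speeds) if v < threshold * limit)
    return slow / len(speeds)


def loop_area(cars_on_road, slow_fraction):
    """Area enclosed by a closed (cars, fraction) curve, via the shoelace formula.

    The points are assumed to be ordered along the loop; the last point is
    connected back to the first.
    """
    n = len(cars_on_road)
    total = 0.0
    for i in range(n):
        j = (i + 1) % n
        total += cars_on_road[i] * slow_fraction[j] - cars_on_road[j] * slow_fraction[i]
    return abs(total) / 2.0


# Four cars: three are below 30% of their local road speed.
print(gridlock_fraction([10, 5, 30, 2], [50, 50, 50, 10]))  # 0.75

# A rectangular loop: 100 cars wide, 0.5 tall.
print(loop_area([0, 100, 100, 0], [0.0, 0.0, 0.5, 0.5]))  # 50.0
```

A wider loop means the slow-car fraction on the way down (roads emptying) stays well above its value on the way up (roads filling), which is the hysteresis the abstract describes.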

preprint, 2022, arXiv

Neural Network Training with Asymmetric Crosspoint Elements

Analog crossbar arrays comprising programmable nonvolatile resistors are under intense investigation for acceleration of deep neural network training. However, the ubiquitous asymmetric conductance modulation of practical resistive devices critically degrades the classification performance of networks trained with conventional algorithms. Here, we describe and experimentally demonstrate an alternative fully-parallel training algorithm: Stochastic Hamiltonian Descent. Instead of conventionally tuning weights in the direction of the error function gradient, this method programs the network parameters to successfully minimize the total energy (Hamiltonian) of the system that incorporates the effects of device asymmetry. We provide critical intuition on why device asymmetry is fundamentally incompatible with conventional training algorithms and how the new approach exploits it as a useful feature instead. Our technique enables immediate realization of analog deep learning accelerators based on readily available device technologies.

preprint, 2019, arXiv

Design and Characterization of Superconducting Nanowire-Based Processors for Acceleration of Deep Neural Network Training

Training of deep neural networks (DNNs) is a computationally intensive task and requires massive volumes of data transfer. Performing these operations with conventional von Neumann architectures creates unmanageable time and power costs. Recent studies have shown that mixed-signal designs involving crossbar architectures are capable of achieving acceleration factors as high as 30,000x over state-of-the-art digital processors. These approaches involve utilization of non-volatile memory (NVM) elements as local processors. However, no technology has been developed to date that can satisfy the strict device requirements for the unit cell. This paper presents the superconducting nanowire-based processing element as a cross-point device. The unit cell has many programmable non-volatile states that can be used to perform analog multiplication. Importantly, these states are intrinsically discrete due to quantization of flux, which provides symmetric switching characteristics. Operation of these devices in a crossbar is described and verified with electro-thermal circuit simulations. Finally, validation of the concept in an actual DNN training task is shown using an emulator.

preprint, 2010, arXiv

Composite fermion valley polarization energies: Evidence for particle-hole asymmetry

In an ideal two-component two-dimensional electron system, particle-hole symmetry dictates that the fractional quantum Hall states around $\nu = 1/2$ are equivalent to those around $\nu = 3/2$. We demonstrate that composite fermions (CFs) around $\nu = 1/2$ in AlAs possess a valley degree of freedom like their counterparts around $\nu = 3/2$. However, focusing on $\nu = 2/3$ and $4/3$, we find that the energy needed to completely valley polarize the CFs around $\nu = 1/2$ is considerably smaller than the corresponding value for CFs around $\nu = 3/2$, thus betraying a particle-hole symmetry breaking.