Source author record

Dongdong He

Dongdong He appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher: unclaimed source record

math.NA Computation and Language math.OC Neurons and Cognition Tissues and Organs

Catalog footprint

What is connected

5 works

5 topics

4 close collaborators

preprint 2025 arXiv

Training Report of TeleChat3-MoE

TeleChat3-MoE is the latest series of TeleChat large language models, featuring a Mixture-of-Experts (MoE) architecture with parameter counts ranging from 105 billion to over one trillion, trained end-to-end on an Ascend NPU cluster. This technical report mainly presents the underlying training infrastructure that enables reliable and efficient scaling to frontier model sizes. We detail systematic methodologies for operator-level and end-to-end numerical accuracy verification, ensuring consistency across hardware platforms and distributed parallelism strategies. Furthermore, we introduce a suite of performance optimizations, including interleaved pipeline scheduling, attention-aware data scheduling for long-sequence training, hierarchical and overlapped communication for expert parallelism, and DVM-based operator fusion. A systematic parallelization framework, leveraging analytical estimation and integer linear programming, is also proposed to optimize multi-dimensional parallelism configurations. Additionally, we present methodological approaches to cluster-level optimizations, addressing host- and device-bound bottlenecks during large-scale training tasks. These infrastructure advancements yield significant throughput improvements and near-linear scaling on clusters comprising thousands of devices, providing a robust foundation for large-scale language model development on hardware ecosystems.

preprint 2020 arXiv

Last-mile Delivery: Optimal Locker Location Under Multinomial Logit Choice Model

One innovative solution to the last-mile delivery problem is the self-service locker system. Motivated by a real case in Singapore, we consider a POP-Locker Alliance that operates a set of POP-stations and wishes to improve last-mile delivery by opening new locker facilities. We propose a quantitative approach to determine the optimal locker locations with the objective of maximizing the overall service provided by the alliance. Customers' choices regarding the use of facilities are explicitly considered and are predicted by a multinomial logit model. We then formulate the location problem as a multi-ratio linear-fractional 0-1 program and provide two solution approaches. The first reformulates the original problem as a mixed-integer linear program, which is further strengthened using conditional McCormick inequalities. This approach is an exact method, developed for small-scale problems. For large-scale problems, we propose a Suggest-and-Improve framework with two embedded algorithms. Numerical studies indicate that our framework is an efficient approach that yields high-quality solutions. Finally, we conduct a case study. The results highlight the importance of considering the customers' choices. Under different parameter values of the multinomial logit model, the decisions can be completely different. Therefore, the parameter values should be carefully estimated in advance.
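The link between the multinomial logit model and a fractional 0-1 objective can be illustrated with a minimal sketch. It assumes a single customer zone, hypothetical utility values, and an outside option (not using any facility) with utility 0; the function name `captured_share` and all numbers are illustrative and are not taken from the paper. The share captured by the open facilities is a ratio in the binary open/closed flags, and summing such ratios over several zones is one way a multi-ratio linear-fractional 0-1 objective can arise.

```python
import math

def captured_share(open_flags, utilities, outside_utility=0.0):
    """MNL probability that a customer chooses any open facility.

    open_flags: 0/1 flags, one per candidate site (hypothetical decision vector).
    utilities: assumed utility of each candidate site for this customer zone.
    outside_utility: assumed utility of not using any facility.
    """
    # Preference weights exp(u_j) for each site and for the outside option.
    weights = [math.exp(u) for u in utilities]
    outside = math.exp(outside_utility)
    # Only open facilities (flag = 1) enter the choice set.
    attraction = sum(w * x for w, x in zip(weights, open_flags))
    # Ratio of two expressions that are linear in the binary flags.
    return attraction / (outside + attraction)

utilities = [1.0, 0.5, 0.0]  # illustrative values only
share_one = captured_share([1, 0, 0], utilities)
share_two = captured_share([1, 1, 0], utilities)
print(share_one, share_two)
```

Opening a second facility raises the captured share, but by less than that facility would capture on its own, because the two sites compete for the same customers. This is also why the estimated utility parameters can change which sites are worth opening.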

preprint 2016 arXiv

An extrapolation cascadic multigrid method combined with a fourth order compact scheme for 3D Poisson equation

In this paper, we develop an EXCMG method to solve the three-dimensional Poisson equation on rectangular domains by using the compact finite difference (FD) method with unequal mesh sizes in different coordinate directions. The linear system resulting from the compact FD discretization is solved by the conjugate gradient (CG) method with a relative residual stopping criterion. By combining Richardson extrapolation and tri-quartic Lagrange interpolation of the numerical solutions from two levels of grids (the current and previous grids), we are able to produce an extremely accurate approximation of the actual numerical solution on the next finer grid, which can greatly reduce the number of relaxation sweeps needed. Additionally, a simple method based on the midpoint extrapolation formula is applied to the fourth-order FD solutions on two levels of grids to achieve sixth-order accuracy on the entire fine grid cheaply and directly. The gradient of the numerical solution can also be easily obtained by solving a series of tridiagonal linear systems resulting from the fourth-order compact FD discretizations. Numerical results show that our EXCMG method is much more efficient than the classical V-cycle and W-cycle multigrid methods. Moreover, only a few CG iterations are required on the finest grid to achieve full fourth-order accuracy in both the $L^2$-norm and $L^{\infty}$-norm for the solution and its gradient when the exact solution belongs to $C^6$. Finally, numerical results show that our EXCMG method is still effective when the exact solution has lower regularity, which widens the scope of applicability of our EXCMG method.
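The general principle behind Richardson extrapolation, combining results from two step sizes to cancel the leading error term, can be shown with a small sketch. This is not the paper's EXCMG algorithm or its midpoint extrapolation formula: it assumes a much simpler setting, a second-order central difference for a derivative, and the helper names `central_difference` and `richardson` are hypothetical.

```python
import math

def central_difference(f, x, h):
    """Second-order approximation of f'(x); the leading error term is O(h^2)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def richardson(coarse, fine, order):
    """Combine results from step sizes h and h/2 to cancel the O(h^order) term."""
    factor = 2.0 ** order
    return (factor * fine - coarse) / (factor - 1.0)

x, h = 1.0, 0.1
coarse = central_difference(math.sin, x, h)        # step size h
fine = central_difference(math.sin, x, h / 2.0)    # step size h/2
extrapolated = richardson(coarse, fine, order=2)   # error drops to O(h^4)

exact = math.cos(x)
err_fine = abs(fine - exact)
err_extrapolated = abs(extrapolated - exact)
print(err_fine, err_extrapolated)
```

The same idea, applied to fourth-order solutions on two grid levels, is what allows a sixth-order result to be obtained without a sixth-order discretization.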

preprint 2015 arXiv

A linearly implicit conservative difference scheme for the generalized Rosenau-Kawahara-RLW equation

This paper concerns the numerical study of the generalized Rosenau-Kawahara-RLW equation obtained by coupling the generalized Rosenau-RLW equation and the generalized Rosenau-Kawahara equation. We first derive the energy conservation law of the equation, and then develop a three-level linearly implicit difference scheme for solving the equation. We prove that the proposed scheme is energy-conserving, unconditionally stable and second-order accurate in both the time and space variables. Finally, numerical experiments are carried out to confirm the energy conservation, the convergence rates of the scheme and its effectiveness for long-time simulation.

preprint 2013 arXiv

A mathematical model of the metabolic and perfusion effects on cortical spreading depression

Cortical spreading depression (CSD) is a slow-moving ionic and metabolic disturbance that propagates in cortical brain tissue. In addition to massive cellular depolarization, CSD also involves significant changes in perfusion and metabolism, aspects of CSD that had not been modeled and are important to traumatic brain injury, subarachnoid hemorrhage, stroke, and migraine. In this study, we develop a mathematical model for CSD where we focus on modeling the features essential to understanding the implications of neurovascular coupling during CSD. In our model, the sodium-potassium ATPase, mainly responsible for ionic homeostasis and active during CSD, operates at a rate that is dependent on the supply of oxygen. The supply of oxygen is determined by modeling blood flow through a lumped vascular tree with an effective local vessel radius that is controlled by the extracellular potassium concentration. We show that during CSD, the metabolic demands of the cortex exceed the physiological limits placed on oxygen delivery, regardless of vascular constriction or dilation. However, vasoconstriction and vasodilation play important roles in the propagation of CSD and its recovery. Our model replicates the qualitative and quantitative behavior of CSD (vasoconstriction, oxygen depletion, extracellular potassium elevation, prolonged depolarization) found in experimental studies. We predict faster, longer-duration CSD in vivo than in vitro due to the contribution of the vasculature. Our results also help explain some of the variability of CSD between species and even within the same animal. These results have clinical and translational implications, as they allow for more precise in vitro, in vivo, and in silico exploration of a phenomenon broadly relevant to neurological disease.