Research on the combination of Top-K and Perm-K gradient sparsification algorithms in the distributed setting

19 May 2023, 16:25
15m
Phystech.Arctic, Lecture Hall (MIPT)

Computer & Data Science

Speaker

Kirill Acharya (MIPT)

Description

We present a theoretical analysis of the convergence rate and efficiency of a new distributed optimization method that combines independent segmentation of the gradient coordinates ($PermK$) with greedy coordinate selection ($TopK$) within each gradient segment. Our findings indicate that the new method attains results comparable to state-of-the-art techniques: $MARINA-PermK$ in the zero-variance regime and $EF-TopK$ in the general-variance regime. We also demonstrate the method's experimental performance on quadratic problems and computer vision models.
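The two-stage compression described in the abstract can be illustrated with a minimal sketch. It assumes that the segmentation is a shared random permutation of the coordinates split into one disjoint segment per worker, and that each worker then keeps the `k` largest-magnitude entries of its own segment. The function names (`perm_segments`, `top_k_in_segment`, `compress_all`) and the parameter `k` are hypothetical, and the abstract does not say whether the selected coordinates are rescaled, so the sketch leaves them unscaled.

```python
import numpy as np


def perm_segments(d, n_workers, rng):
    """Split coordinate indices 0..d-1 into n_workers disjoint random segments.

    Assumption: one shared permutation, one segment per worker.
    """
    perm = rng.permutation(d)
    return np.array_split(perm, n_workers)


def top_k_in_segment(grad, segment, k):
    """Keep the k largest-magnitude coordinates of grad inside segment.

    All other coordinates are set to zero. No rescaling is applied
    (assumption: the abstract does not specify a scaling factor).
    """
    out = np.zeros_like(grad)
    k = min(k, segment.size)
    if k == 0:
        return out
    magnitudes = np.abs(grad[segment])
    # argpartition puts the k largest magnitudes in the last k positions.
    cut = magnitudes.size - k
    keep = segment[np.argpartition(magnitudes, cut)[cut:]]
    out[keep] = grad[keep]
    return out


def compress_all(grads, k, rng):
    """Compress one gradient per worker; grads has shape (n_workers, d)."""
    n_workers, d = grads.shape
    segments = perm_segments(d, n_workers, rng)
    return np.stack(
        [top_k_in_segment(g, s, k) for g, s in zip(grads, segments)]
    )


rng = np.random.default_rng(0)
grads = rng.standard_normal((4, 12))   # 4 workers, 12 coordinates
compressed = compress_all(grads, k=2, rng=rng)
```

Because the segments are disjoint, the workers in this sketch never transmit the same coordinate, and each worker sends at most `k` values per round.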

Primary author

Co-authors

Alexandr Beznosikov (Moscow Institute of Physics and Technology)

Timur Kharisov (Moscow Institute of Physics and Technology)
