Research on the combination of Top-K and Perm-K gradient sparsification algorithms for distributed setting

19 May 2023, 16:10
15m
Физтех.Арктика (Phystech.Arctic), lecture hall (MIPT)

Computer & Data Science 19

Speaker

Timur Kharisov (Moscow Institute of Physics and Technology)

Description

We present a theoretical analysis of the convergence rate and efficiency of a new distributed optimization method. The method first splits the gradient coordinates into independent segments ($PermK$) and then applies greedy coordinate selection ($TopK$) within each segment. Our findings indicate that the new method attains results comparable to state-of-the-art techniques: MARINA-PermK [Szlendak21] in the zero-variance regime and EF-TopK [Alistarh18] in the general-variance regime. We also demonstrate the method experimentally on quadratic problems and computer vision models.
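
The two-stage compression described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the assignment of one segment per worker, and the absence of any rescaling or error-feedback step are assumptions made only for illustration.

```python
import numpy as np

def perm_partition(d, n_workers, rng):
    """PermK-style step (assumed form): randomly split the coordinate
    indices 0..d-1 into n_workers disjoint segments."""
    perm = rng.permutation(d)
    return np.array_split(perm, n_workers)

def top_k_in_segment(grad, segment, k):
    """TopK step: keep the k largest-magnitude entries of grad among the
    coordinates listed in segment; every other entry is set to zero."""
    out = np.zeros_like(grad)
    k = min(k, len(segment))
    if k == 0:
        return out
    magnitudes = np.abs(grad[segment])
    # argpartition places the indices of the k largest magnitudes last
    top = segment[np.argpartition(magnitudes, -k)[-k:]]
    out[top] = grad[top]
    return out

def compress_all(grads, k, rng):
    """Hypothetical driver: worker i compresses its own gradient on segment i."""
    d = grads[0].shape[0]
    segments = perm_partition(d, len(grads), rng)
    return [top_k_in_segment(g, s, k) for g, s in zip(grads, segments)]

rng = np.random.default_rng(0)
grads = [rng.standard_normal(12) for _ in range(3)]  # 3 workers, d = 12
messages = compress_all(grads, k=2, rng=rng)
```

Each worker's message has at most `k` nonzero entries, and because the segments are disjoint, the supports of different workers' messages never overlap.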

Primary author

Timur Kharisov (Moscow Institute of Physics and Technology)

Co-authors

Mr Alexander Beznosikov (Moscow Institute of Physics and Technology)

Mr Kirill Acharya (Moscow Institute of Physics and Technology)
