Speaker
Kirill Acharya
(Moscow Institute of Physics and Technology)
Description
We present a theoretical analysis of the convergence rate and efficiency of a new distributed optimization method that combines independent segmentation of the gradient coordinates ($PermK$) with greedy coordinate selection ($TopK$) within each gradient segment. Our results indicate that the method is comparable to state-of-the-art techniques: $MARINA-PermK$ in the zero-variance regime and $EF-TopK$ in the general-variance regime. We also demonstrate its experimental performance on quadratic problems and computer vision models.
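To make the two-stage compression idea concrete, here is a minimal NumPy sketch of one communication round. It is an illustration under assumptions, not the authors' implementation. It assumes that a shared random permutation assigns each worker a disjoint segment of coordinates (the $PermK$ step) and that each worker then keeps only the `k` largest-magnitude entries of its own gradient inside that segment (the $TopK$ step). The function names `perm_segments`, `top_k_in_segment`, and `compress_round` are hypothetical. Any rescaling, error feedback, or server-side aggregation used in the actual method is not shown.

```python
import numpy as np


def perm_segments(d, n_workers, rng):
    """Split coordinate indices 0..d-1 into n_workers disjoint random segments.

    Assumption: one permutation is shared by all workers in a round.
    """
    perm = rng.permutation(d)
    return np.array_split(perm, n_workers)


def top_k_in_segment(grad, segment, k):
    """Keep the k largest-magnitude entries of grad within segment; zero the rest."""
    out = np.zeros_like(grad)
    k = min(k, len(segment))
    if k == 0:
        return out
    magnitudes = np.abs(grad[segment])
    # argpartition places the indices of the k largest magnitudes in the last k slots.
    top = segment[np.argpartition(magnitudes, -k)[-k:]]
    out[top] = grad[top]
    return out


def compress_round(grads, k, rng):
    """Compress each worker's gradient: segment assignment, then greedy selection."""
    n_workers = len(grads)
    d = grads[0].shape[0]
    segments = perm_segments(d, n_workers, rng)
    return [top_k_in_segment(g, s, k) for g, s in zip(grads, segments)]
```

In this sketch each worker sends at most `k` coordinates, and the supports of different workers' messages never overlap because the segments are disjoint.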
Primary author
Kirill Acharya
(Moscow Institute of Physics and Technology)
Co-authors
Alexandr Beznosikov
(Moscow Institute of Physics and Technology)
Timur Kharisov
(Moscow Institute of Physics and Technology)