
Understanding Stochastic Average Gradient

by Andrey Kustarev | 2024/06/06

Too Long; Didn't Read

Gradient descent is a popular optimization used to locate the global minima of given objective functions. The algorithm uses the gradient of the objective function to traverse the function's slope until it reaches the lowest point. Full Gradient Descent (FG) and Stochastic Gradient Descent (SGD) are two popular variations of the algorithm. FG uses the entire dataset during each iteration and provides a high convergence rate at a high computational cost. At each iteration, SGD uses a subset of the data to run the algorithm. It is far more efficient but has an uncertain convergence. Stochastic Average Gradient (SAG) is another variation that provides the benefits of both previous algorithms. It uses the average of past gradients and a subset of the dataset to deliver a high convergence rate with low computation. The algorithm can be further modified to improve its efficiency using vectorization and mini-batches.



Gradient descent is the most popular optimization technique for training machine learning (ML) models. The algorithm minimizes the error between the predicted values and the ground truth. Since the technique considers each data point to understand and minimize the error, its performance depends on the size of the training data. Techniques like Stochastic Gradient Descent (SGD) are designed to improve the computational performance, but at the cost of convergence accuracy.


Stochastic Average Gradient balances the classic approach, known as Full Gradient Descent, with SGD, and offers the benefits of both. But before we can use the algorithm, we must first understand its significance for model optimization.

Optimizing Machine Learning Objectives with Gradient Descent

Every ML algorithm has an associated loss function that it aims to minimize in order to improve the model's performance. Mathematically, the loss can be defined as:

Loss = Y_actual - Y_predicted

It is simply the difference between the actual and the predicted output, and minimizing this difference means our model comes closer to the ground-truth values.
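To make this concrete, here is a minimal Python sketch of one common choice of loss, mean squared error; the article does not fix a specific loss function, so this particular form is an illustrative assumption:

```python
import numpy as np

def mse_loss(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    # Average squared difference between actual and predicted outputs.
    return float(np.mean((y_true - y_pred) ** 2))

# Hypothetical values for illustration
y_true = np.array([3.0, -0.5, 2.0])
y_pred = np.array([2.5, 0.0, 2.0])
print(mse_loss(y_true, y_pred))  # 0.1666...
```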


The minimization algorithm uses gradient descent to traverse the loss function and find the global minimum. Each traversal step involves updating the algorithm's weights to optimize the output.


Gradient Descent

The standard gradient descent algorithm uses the average of all the gradients calculated across the entire dataset. The lifecycle of a single training example looks as follows: the model produces a prediction, the loss against the ground truth is computed, and the gradient of that loss drives the next weight update.



The weight update equation looks as follows:

W = W - α (dJ/dW)

where W represents the model weights, α is the learning rate, and dJ/dW is the derivative of the loss function with respect to the weights. The conventional method has a high convergence rate but becomes computationally expensive when dealing with large datasets comprising millions of data points.
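As a minimal sketch, assuming a least-squares objective J(W) = (1/n)·||XW - y||² (the article does not specify one), full gradient descent in Python could look like this:

```python
import numpy as np

def full_gradient_descent(X, y, lr=0.01, n_iters=1000):
    # Full (batch) gradient descent: every iteration computes dJ/dW
    # over the ENTIRE dataset, then applies W = W - lr * dJ/dW.
    n, d = X.shape
    W = np.zeros(d)
    for _ in range(n_iters):
        residual = X @ W - y                 # predictions minus targets
        grad = (2.0 / n) * (X.T @ residual)  # dJ/dW averaged over all n points
        W -= lr * grad                       # the update rule above
    return W
```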

Stochastic Gradient Descent (SGD)

The SGD methodology remains the same as plain GD, but instead of using the entire dataset to calculate the gradients, it uses a small batch from the inputs. The method is much more efficient but may hop around the global minima too much, since each iteration uses only a portion of the data for learning.
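A matching sketch of SGD under the same assumed least-squares objective; the batch size and sampling scheme here are illustrative choices, not prescribed by the article:

```python
import numpy as np

def sgd(X, y, lr=0.01, batch_size=32, n_iters=1000, seed=0):
    # Each iteration estimates dJ/dW from a small random batch
    # instead of the full dataset, trading accuracy for speed.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = np.zeros(d)
    for _ in range(n_iters):
        idx = rng.choice(n, size=min(batch_size, n), replace=False)
        residual = X[idx] @ W - y[idx]
        grad = (2.0 / len(idx)) * (X[idx].T @ residual)
        W -= lr * grad
    return W
```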

Stochastic Average Gradient

The Stochastic Average Gradient (SAG) approach was introduced as a middle ground between GD and SGD. It selects a random data point and updates its value based on the gradient at that point and a weighted average of the past gradients stored for that particular data point.


Similar to SGD, SAG models every problem as a finite sum of convex, differentiable functions. At any given iteration, it uses the current gradient together with the average of previous gradients to update the weights. The update takes the following form:

W = W - (α / n) · Σᵢ yᵢ

where α is the step size and yᵢ is the most recently computed gradient for data point i: at each iteration, the entry for one randomly selected point is refreshed, and the stored values for all other points are reused.
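A minimal SAG sketch under the same assumed least-squares setup: a table of per-sample gradients is kept in memory, one randomly chosen entry is refreshed per iteration, and the step follows the average of all stored gradients:

```python
import numpy as np

def sag(X, y, lr=0.01, n_iters=1000, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = np.zeros(d)
    grad_table = np.zeros((n, d))  # y_i: last gradient seen for sample i
    grad_sum = np.zeros(d)         # running sum of the table's rows
    for _ in range(n_iters):
        i = rng.integers(n)
        g_i = 2.0 * X[i] * (X[i] @ W - y[i])  # fresh gradient of sample i
        grad_sum += g_i - grad_table[i]       # swap the old entry out of the sum
        grad_table[i] = g_i
        W -= lr * grad_sum / n                # step along the average gradient
    return W
```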



Convergence Rate

Of the two popular algorithms, full gradient (FG) and stochastic gradient descent (SGD), the FG algorithm has a better convergence rate since it utilizes the entire dataset during each iteration for the calculation.

Although SAG has a structure similar to SGD, its convergence rate is comparable to, and sometimes better than, the full gradient approach. Table 1 below summarizes the results from the experiments of Schmidt et al.

Source: https://arxiv.org/pdf/1309.2388

Further Modifications

Despite its impressive performance, several modifications have been proposed to the original SAG algorithm to help improve it further.


  • Re-weighting in early iterations: SAG convergence remains slow during the first few iterations since the algorithm normalizes the update direction by n (the total number of data points). This yields an inaccurate estimate, as the algorithm has not yet seen many of the data points. The modification suggests normalizing by m instead of n, where m is the number of data points seen at least once up to that particular iteration (see the sketch after this list).
  • Mini-batches: the stochastic gradient approach uses mini-batches to process multiple data points simultaneously. The same approach can be applied to SAG. This allows vectorization and parallelization for better computational efficiency. It also reduces the memory load, a prominent challenge for the SAG algorithm.
  • Step-size experimentation: the step size mentioned earlier (1/16L) provides remarkable results, but the authors experimented further with a step size of 1/L, which gave even better convergence. However, the authors could not present a formal analysis of the improved results. They conclude that the step size should be experimented with to find the optimal one for the specific problem.
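Here is the sketch referenced in the first item: the same SAG loop as before, but normalizing by m, the number of distinct samples seen so far, instead of n. As before, the least-squares gradient is an illustrative assumption:

```python
import numpy as np

def sag_reweighted(X, y, lr=0.01, n_iters=1000, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = np.zeros(d)
    grad_table = np.zeros((n, d))
    grad_sum = np.zeros(d)
    seen = np.zeros(n, dtype=bool)
    m = 0  # number of distinct data points encountered so far
    for _ in range(n_iters):
        i = rng.integers(n)
        if not seen[i]:
            seen[i] = True
            m += 1
        g_i = 2.0 * X[i] * (X[i] @ W - y[i])
        grad_sum += g_i - grad_table[i]
        grad_table[i] = g_i
        W -= lr * grad_sum / m  # early iterations: divide by m, not n
    return W
```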


Final Thoughts

Gradient descent is a popular optimization used to locate the global minima of given objective functions. The algorithm uses the gradient of the objective function to traverse the function's slope until it reaches the lowest point.

Full Gradient Descent (FG) and Stochastic Gradient Descent (SGD) are two popular variations of the algorithm. FG uses the entire dataset during each iteration and provides a high convergence rate at a high computational cost. At each iteration, SGD uses a subset of the data to run the algorithm. It is far more efficient but has an uncertain convergence.


Stochastic Average Gradient (SAG) is another variation that provides the benefits of both previous algorithms. It uses the average of past gradients and a subset of the dataset to deliver a high convergence rate with low computation. The algorithm can be further modified to improve its efficiency using vectorization and mini-batches.