Gradient descent is the most popular method for training machine learning (ML) models. The algorithm minimizes the error between the predicted values and the ground truth. Since the technique considers every data point to understand and reduce the error, its performance depends on the size of the training data. Techniques such as Stochastic Gradient Descent (SGD) are designed to improve computational performance, but at the cost of convergence accuracy.
Stochastic Average Gradient balances the classic approach, known as Full Gradient Descent, against SGD, and offers the benefits of both. But before we can use the algorithm, we must first understand its importance for model optimization.
Every ML algorithm has an associated loss function that aims to minimize the error or improve the model's performance. Mathematically, the loss can be defined as:
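One common concrete choice, shown here as an illustrative sketch rather than a definitive formula, is the mean squared error over the n training points, where y_i is the actual value and ŷ_i the prediction:

```latex
J(W) = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2
```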
It is simply the difference between the actual and the predicted output, and minimizing this difference means our model moves closer to the ground-truth values.
The minimization algorithm uses gradient descent to traverse the loss function and find a global minimum. Each traversal step involves updating the algorithm's weights to optimize the output.
The plain gradient descent algorithm uses the average of all the gradients computed over the entire dataset. The lifecycle of a single training example looks as follows:
The weight update equation looks as follows:
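A standard form of this rule (a sketch using a learning rate α, consistent with the symbols defined just below) is:

```latex
W := W - \alpha \, \frac{dJ}{dW}
```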
Here W represents the model weights and dJ/dW is the derivative of the loss function with respect to the weights. The conventional method has a high convergence rate but becomes computationally expensive when dealing with large datasets comprising millions of data points.
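As a concrete illustration, the full-batch update above can be sketched in NumPy for a linear model with squared-error loss. The function and variable names here are illustrative, not taken from the original text:

```python
import numpy as np

def full_gradient_descent(X, y, lr=0.05, epochs=500):
    """Vanilla (full-batch) gradient descent for linear regression
    with mean squared error loss J(w) = (1/n) * sum((Xw - y)^2)."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        residual = X @ w - y                # predictions minus targets
        grad = (2.0 / n) * X.T @ residual   # average gradient over ALL points
        w -= lr * grad                      # update: w := w - lr * dJ/dw
    return w

# Toy usage: recover a known slope from noiseless data y = 2x.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([2.0, 4.0, 6.0, 8.0])
w = full_gradient_descent(X, y)
```

Every epoch touches all n points once, which is exactly why the cost grows linearly with the dataset size.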
The SGD approach remains the same as plain GD, but instead of using the whole dataset to compute the gradients, it uses a small batch drawn from the inputs. The method is far more efficient, but it can oscillate considerably around the global minimum since each iteration uses only a fraction of the data for learning.
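The batching described above can be sketched as a small variation of the full-batch loop; only the gradient estimate changes. This is an illustrative sketch, with names not taken from the original text:

```python
import numpy as np

def sgd(X, y, lr=0.05, epochs=200, batch_size=2, seed=0):
    """Mini-batch stochastic gradient descent: each step estimates the
    gradient from a small random batch instead of the full dataset."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        idx = rng.choice(n, size=batch_size, replace=False)  # random batch
        Xb, yb = X[idx], y[idx]
        grad = (2.0 / batch_size) * Xb.T @ (Xb @ w - yb)     # noisy estimate
        w -= lr * grad
    return w

# Same toy problem as before: noiseless y = 2x.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([2.0, 4.0, 6.0, 8.0])
w = sgd(X, y)
```

Each step costs only O(batch_size) instead of O(n), which is where the efficiency comes from; on noisy real data the same randomness is what causes the oscillation around the minimum.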
The Stochastic Average Gradient (SAG) approach was introduced as a middle ground between GD and SGD. It selects a random data point and updates its value based on the gradient at that point, together with a weighted average of the past gradients stored for that particular data point.
Similar to SGD, SAG models every problem as a finite sum of convex, differentiable functions. At any iteration, it uses the current gradient together with the average of the previous gradients to update the weights. The update proceeds as follows:
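A standard way to write the SAG step (a sketch using common notation, not necessarily the article's original symbols): at iteration k, with randomly chosen index i_k, only that point's stored gradient g_i is refreshed, and the weights move along the average of all stored gradients:

```latex
g_i^{(k)} =
\begin{cases}
\nabla f_i\!\left(w^{(k-1)}\right) & \text{if } i = i_k \\
g_i^{(k-1)} & \text{otherwise}
\end{cases}
\qquad
w^{(k)} = w^{(k-1)} - \frac{\alpha}{n} \sum_{i=1}^{n} g_i^{(k)}
```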
Of the two popular algorithms, full gradient (FG) and stochastic gradient descent (SGD), the FG algorithm has the better convergence rate, since it uses the entire dataset for the computation during each iteration.
Although SAG has a structure similar to SGD, its convergence rate is comparable to, and sometimes better than, the full gradient approach. Table 1 below summarizes the experimental results.
Despite its remarkable performance, several modifications to the original SAG algorithm have been proposed to help improve performance.
Gradient descent is a popular optimization algorithm used to find the global minima of given objective functions. The algorithm uses the gradient of the objective function to traverse the function's slope until it reaches its lowest point.
Full Gradient Descent (FG) and Stochastic Gradient Descent (SGD) are two popular variations of the algorithm. FG uses the entire dataset during each iteration and provides a high convergence rate at a high computational cost. At each iteration, SGD instead uses a subset of the data to run the algorithm. It is far more efficient, but its convergence is less certain.
Stochastic Average Gradient (SAG) is another variation that provides the benefits of both previous algorithms. It uses the average of past gradients and a subset of the dataset to deliver a high convergence rate with low computation. The algorithm can be further modified to improve its performance using vectorization and mini-batches.
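The SAG update described above can be sketched in a few lines of NumPy. This is a minimal illustration of the memory-of-gradients idea, not a production implementation; the names and the step size are assumptions:

```python
import numpy as np

def sag(X, y, lr=0.005, iters=2000, seed=0):
    """Stochastic Average Gradient: keep a memory of the last gradient
    seen for each data point; each step refreshes one point's entry and
    moves along the average of all stored gradients."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    grad_memory = np.zeros((n, d))   # last known gradient per data point
    grad_sum = np.zeros(d)           # running sum of the memory
    for _ in range(iters):
        i = rng.integers(n)
        # fresh gradient of point i's squared-error term
        g_new = 2.0 * X[i] * (X[i] @ w - y[i])
        grad_sum += g_new - grad_memory[i]   # swap old entry out of the sum
        grad_memory[i] = g_new
        w -= lr * grad_sum / n               # step along the average gradient
    return w

# Toy problem: noiseless y = 2x.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([2.0, 4.0, 6.0, 8.0])
w = sag(X, y)
```

Each iteration computes only one fresh gradient (SGD-like cost) yet steps along an average over all n points (FG-like direction), which is the trade-off the paragraph above describes.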