Gradient descent is a popular optimization technique in machine learning (ML). The algorithm minimizes the error between the predicted values and the ground truth. Because the procedure considers every data point in order to measure and reduce that error, its performance depends on the size of the training data. Techniques such as Stochastic Gradient Descent (SGD) are designed to improve computational performance, but at the cost of convergence accuracy.
Stochastic Average Gradient balances the classic approach, known as Full Gradient Descent, against SGD, and offers the benefits of both. But before we can use the algorithm, we must first understand its significance for model optimization.
Every ML algorithm has an associated loss function, which training aims to minimize in order to improve the model's performance. Writing $y$ for the actual output and $\hat{y}$ for the predicted output, the loss $J$ can be defined as:
$$J = y - \hat{y}$$
It is simply the difference between the actual and the predicted output, and minimizing this difference means that our model moves closer to the ground-truth values.
The optimization algorithm uses gradient descent to traverse the loss function and find its minimum (the global minimum when the loss is convex). Each iterative step involves updating the model's weights to improve the output.
The conventional gradient descent algorithm uses the average of all the gradients computed over the entire dataset. The lifecycle of a single training example looks like this:
The weight update equation looks like this:
$$W \leftarrow W - \alpha \, \frac{dJ}{dW}$$
where $W$ represents the model weights, $dJ/dW$ is the derivative of the loss function with respect to the model weights, and $\alpha$ is the learning rate (step size). The conventional method has a high convergence rate but becomes computationally expensive when dealing with a large dataset comprising millions of data points.
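To make the full-gradient update concrete, here is a minimal Python sketch on a toy one-weight linear regression. The function name `full_gradient_descent`, the toy data, the mean-squared-error loss, and the default values of `lr` and `steps` are illustrative assumptions, not part of the original text.

```python
def full_gradient_descent(xs, ys, lr=0.05, steps=200):
    """Fit y ~ w * x by full-batch gradient descent on mean squared error."""
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        # dJ/dW averaged over every example in the dataset
        grad = sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        # W <- W - lr * dJ/dW
        w -= lr * grad
    return w


# Hypothetical toy data generated with a true weight of 2
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
w_fg = full_gradient_descent(xs, ys)
```

Note that every step loops over the whole dataset, which is exactly the cost that becomes prohibitive with millions of data points.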
The SGD approach remains the same as plain GD, but instead of using the entire dataset to compute the gradients, it uses a small batch of the inputs. The method is much more efficient, but it may hop around the global minimum considerably, since each iteration uses only part of the training data.
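The same toy problem can be sketched with mini-batch SGD. As before, the function name `sgd`, the slightly noisy data, the squared-error loss, and the hyperparameter defaults are assumptions chosen for illustration only.

```python
import random


def sgd(xs, ys, lr=0.02, epochs=200, batch_size=2, seed=0):
    """Fit y ~ w * x with mini-batch stochastic gradient descent."""
    rng = random.Random(seed)
    w = 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the examples in a new random order
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Gradient estimated from the current mini-batch only
            grad = sum(2.0 * (w * xs[i] - ys[i]) * xs[i] for i in batch) / len(batch)
            w -= lr * grad
    return w


# Hypothetical noisy data; the least-squares weight is 59.7 / 30 = 1.99
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]
w_sgd = sgd(xs, ys)
```

Each update touches only `batch_size` examples, so it is cheap, but because every mini-batch pulls the weight toward its own best fit, the iterate keeps moving around the least-squares solution instead of settling on it when the step size is held constant.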
The Stochastic Average Gradient (SAG) approach was introduced as a middle ground between GD and SGD. It selects a random data point and updates its value based on the gradient at that point and a weighted average of the past gradients stored for that particular data point.
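Like SGD, SAG models every problem as a finite sum of convex, differentiable functions. At any given iteration, it uses the current gradient and the average of the previously stored gradients to update the weights. The equation takes the following form:
$$W^{k+1} = W^{k} - \frac{\alpha}{n} \sum_{i=1}^{n} g_i^{k}, \qquad g_i^{k} = \begin{cases} \nabla f_i(W^{k}) & \text{if } i = i_k \\ g_i^{k-1} & \text{otherwise} \end{cases}$$
where $n$ is the number of training examples, $\alpha$ is the step size, $f_i$ is the loss on the $i$-th example, $i_k$ is the index sampled at random at iteration $k$, and $g_i^{k}$ is the most recently computed gradient for example $i$.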
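The SAG update described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function name `sag`, the toy data, the squared-error loss, the zero initialization of the stored gradients, and the step size of 1/(16L), with L the largest per-example gradient Lipschitz constant, are choices made for this example rather than details given in the text.

```python
import random


def sag(xs, ys, iters=5000, seed=0):
    """Fit y ~ w * x with Stochastic Average Gradient on squared error."""
    rng = random.Random(seed)
    n = len(xs)
    # Assumed step size: 1 / (16 * L), where L = max_i 2 * x_i^2
    lr = 1.0 / (16.0 * max(2.0 * x * x for x in xs))
    w = 0.0
    stored = [0.0] * n  # last gradient computed for each example
    grad_sum = 0.0      # running sum of the stored gradients
    for _ in range(iters):
        i = rng.randrange(n)  # pick one example at random
        g = 2.0 * (w * xs[i] - ys[i]) * xs[i]
        # Swap the stale gradient of example i for the fresh one
        grad_sum += g - stored[i]
        stored[i] = g
        # Step along the average of all stored gradients
        w -= lr * grad_sum / n
    return w


# Hypothetical noisy data; the least-squares weight is 59.7 / 30 = 1.99
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]
w_sag = sag(xs, ys)
```

Each iteration computes a single gradient, as in SGD, yet the step direction averages information from every example, as in full gradient descent. The price is the `stored` table, which holds one gradient per training example.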
Of the two popular algorithms, full gradient (FG) and stochastic gradient descent (SGD), FG has the better convergence rate, since it uses the entire dataset for its computation during each iteration.
Although SAG has a structure similar to SGD, its convergence rate is comparable to, and sometimes better than, that of the full gradient approach. Table 1 below summarizes the results from the experiments of
Despite its impressive performance, several modifications have been proposed to the original SGD algorithm to help improve performance.
Gradient descent is a popular optimization technique used to find the global minimum of a given objective function. The algorithm uses the gradient of the objective function to descend the function's slope until it reaches the lowest point.
Full Gradient Descent (FG) and Stochastic Gradient Descent (SGD) are two popular variants of the algorithm. FG uses the entire dataset during each iteration and offers a high convergence rate at a high computational cost. At each iteration, SGD uses a subset of the data to run the algorithm. It is far more efficient, but its convergence is less reliable.
Stochastic Average Gradient (SAG) is another variant that offers the benefits of both previous algorithms. It uses the average of past gradients and a subset of the dataset to deliver a high convergence rate with low computation. The algorithm can be modified further to improve its efficiency using vectorization and mini-batches.