Gradient Descent is among the most popular optimization techniques in machine learning (ML). The algorithm minimizes the error between the predicted values and the ground truth. Since the technique considers every data point to understand and reduce the error, its performance depends on the size of the training data. Techniques such as Stochastic Gradient Descent (SGD) are designed to improve computational performance, but at the cost of convergence accuracy.
Stochastic Average Gradient strikes a balance between the classic approach, known as Full Gradient Descent, and SGD, and offers the benefits of both. But before we can use the algorithm, we must first understand its importance in model optimization.
Optimizing Machine Learning Objectives with Gradient Descent
Every ML algorithm has an associated loss function that it aims to minimize in order to improve the model's performance. Mathematically, the loss can be defined as:
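A minimal form consistent with the description below (the precise formulation depends on the loss chosen for the model) is:

$$\mathrm{Loss} = Y_{\text{actual}} - Y_{\text{predicted}}$$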
It is simply the difference between the actual and predicted output, and minimizing this difference means our model moves closer to the ground-truth values.
The minimization algorithm uses gradient descent to traverse the loss function and find the global minimum. Each traversal step involves updating the algorithm's weights to improve the output.
Plain Gradient Descent
The plain gradient descent algorithm uses the average of all gradients calculated across the entire dataset. The life cycle of a single training example looks as follows:
The weight update equation looks as follows:

$$W = W - \alpha \frac{dJ}{dW}$$

where $W$ represents the model weights, $\alpha$ is the learning rate, and $dJ/dW$ is the derivative of the loss function with respect to the model weights. The conventional approach has a high convergence rate but becomes computationally expensive when working with large datasets comprising millions of data points.
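To make the life cycle concrete, here is a minimal, self-contained sketch of plain (full-batch) gradient descent on a least-squares objective. The function name, toy data, and hyperparameters are illustrative assumptions, not part of the original text:

```python
import numpy as np

def full_gradient_descent(X, y, lr=0.1, n_iters=200):
    """Plain (full-batch) gradient descent for least-squares regression."""
    n_samples, n_features = X.shape
    W = np.zeros(n_features)
    for _ in range(n_iters):
        # dJ/dW of the mean squared error, averaged over the ENTIRE dataset
        grad = X.T @ (X @ W - y) / n_samples
        W -= lr * grad  # weight update: W := W - lr * dJ/dW
    return W

# Hypothetical toy data: y ~ 2*x0 - 3*x1 plus a little noise
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))
y = X @ np.array([2.0, -3.0]) + 0.01 * rng.normal(size=1000)
print(full_gradient_descent(X, y))  # approaches [2.0, -3.0]
```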
Stochastic Gradient Descent (SGD)
The SGD approach works just like plain GD, but instead of using the whole dataset to compute the gradients, it uses a small subset of the input. The method is far more efficient, but it may oscillate considerably around the global minimum since each iteration uses only a portion of the training data.
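A sketch of the stochastic variant, under the same hypothetical least-squares setup as the previous block, might look as follows; each update now touches a single randomly drawn sample instead of the whole dataset:

```python
import numpy as np

def sgd(X, y, lr=0.01, n_iters=5000):
    """Stochastic gradient descent: one random sample per weight update."""
    n_samples, n_features = X.shape
    W = np.zeros(n_features)
    rng = np.random.default_rng(0)
    for _ in range(n_iters):
        i = rng.integers(n_samples)        # pick one random data point
        grad_i = (X[i] @ W - y[i]) * X[i]  # gradient of the loss at sample i only
        W -= lr * grad_i                   # cheap but noisy step; may oscillate
    return W
```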
Stochastic Average Gradient
The Stochastic Average Gradient (SAG) method was introduced as a middle ground between GD and SGD. It selects a random data point and updates its value based on the gradient at that point and a weighted average of the past gradients stored for that particular data point.
Similarly to SGD, SAG models every problem as a finite sum of convex, differentiable functions. At any given iteration, it uses the current gradients and the average of previous gradients for the weight update. The equation takes the following form:
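In the standard SAG formulation (the notation here is an assumption), the update keeps one stored gradient $y_i$ per data point and reads:

$$W^{k+1} = W^k - \frac{\alpha}{n}\sum_{i=1}^{n} y_i^k, \qquad y_i^k = \begin{cases}\nabla f_i(W^k) & \text{if } i = i_k\\ y_i^{k-1} & \text{otherwise}\end{cases}$$

where $i_k$ is the index drawn at iteration $k$: only that sample's stored gradient is refreshed, while the step always uses the average of all stored gradients. A minimal Python sketch under the same hypothetical least-squares setup as the earlier blocks:

```python
import numpy as np

def sag(X, y, lr=0.05, n_iters=5000):
    """Stochastic Average Gradient: a sketch, not a reference implementation."""
    n_samples, n_features = X.shape
    W = np.zeros(n_features)
    stored = np.zeros((n_samples, n_features))  # last gradient seen per sample
    grad_sum = np.zeros(n_features)             # running sum of stored gradients
    rng = np.random.default_rng(0)
    for _ in range(n_iters):
        i = rng.integers(n_samples)
        g_new = (X[i] @ W - y[i]) * X[i]  # fresh gradient at sample i
        grad_sum += g_new - stored[i]     # swap the old stored gradient out of the sum
        stored[i] = g_new
        W -= lr * grad_sum / n_samples    # step along the average stored gradient
    return W
```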
Convergence Rate
Between the two popular algorithms, full gradient (FG) and stochastic gradient descent (SGD), the FG algorithm has a better convergence rate since it uses the entire data set for the computation during each iteration.
Although SAG has a structure similar to SGD, its convergence rate is comparable to, and sometimes better than, the full gradient approach. Table 1 below summarizes the results from the authors' experiments.
Further Modifications
Despite its impressive performance, several modifications have been proposed to the original SAG algorithm to improve it further.
- Re-weighting in early iterations: Convergence of SAG remains slow during the first few iterations since the algorithm normalizes the update direction by n (the total number of data points). This yields an inaccurate estimate because the algorithm has not yet seen most of the data points. The modification proposes normalizing by m instead of n, where m is the number of data points seen at least once up until that particular iteration (see the sketch after this list).
- Mini-batches: The stochastic gradient approach uses mini-batches to process multiple data points at once. The same approach can be applied to SAG. This allows for vectorization and parallelization, improving computational efficiency. It also reduces the memory load, a prominent challenge for the SAG algorithm.
- Step-size experimentation: The step size mentioned earlier (1/16L) provides remarkable results, but the authors also experimented with a step size of 1/L, where L is the Lipschitz constant of the gradients. The latter provided even better convergence, though the authors could not present a formal analysis of the improvement. They conclude that the step size should be tuned experimentally to find the one best suited to the specific problem.
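To illustrate the first modification, the SAG sketch from earlier can be adjusted to normalize by m instead of n. The function below is a hypothetical variant for demonstration, not the authors' code:

```python
import numpy as np

def sag_early_reweighted(X, y, lr=0.05, n_iters=5000):
    """SAG normalized by m, the number of samples seen at least once."""
    n_samples, n_features = X.shape
    W = np.zeros(n_features)
    stored = np.zeros((n_samples, n_features))
    grad_sum = np.zeros(n_features)
    seen = np.zeros(n_samples, dtype=bool)
    rng = np.random.default_rng(0)
    for _ in range(n_iters):
        i = rng.integers(n_samples)
        g_new = (X[i] @ W - y[i]) * X[i]
        grad_sum += g_new - stored[i]
        stored[i] = g_new
        seen[i] = True
        # Divide by m = points seen so far, so early steps are not diluted
        # by the zero "gradients" of data points the algorithm has not visited.
        W -= lr * grad_sum / seen.sum()
    return W
```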
Final Thoughts
Gradient Descent is a popular optimization technique used to locate the global minima of given objective functions. The algorithm uses the gradient of the objective function to traverse the function's slope until it reaches its lowest point.
Full Gradient Descent (FG) and Stochastic Gradient Descent (SGD) are two popular variations of the algorithm. FG uses the entire dataset during each iteration and provides a high convergence rate at a high computational cost. At each iteration, SGD runs the algorithm on a subset of the data. It is far more efficient but comes with uncertain convergence.
Stochastic Average Gradient (SAG) is a further variation that provides the benefits of both previous algorithms. It uses the average of past gradients together with a subset of the dataset to deliver a high convergence rate with lower computation. The algorithm can be modified further to improve its efficiency using vectorization and mini-batches.