
Understanding Stochastic Average Gradient

by Andrey Kustarev · 2024/06/06

Too Long; Didn't Read

Gradient descent is a popular optimization technique used for locating the global minima of given objective functions. The algorithm uses the gradient of the objective function to traverse the function's slope until it reaches the lowest point. Full Gradient Descent (FG) and Stochastic Gradient Descent (SGD) are two popular variations of the algorithm. FG uses the entire dataset during each iteration and provides a high convergence rate at a high computation cost. At each iteration, SGD uses a subset of the data to run the algorithm. It is far more efficient but has an uncertain convergence. Stochastic Average Gradient (SAG) is another variation that provides the benefits of both previous algorithms. It uses the average of past gradients and a subset of the dataset to provide a high convergence rate with low computation. The algorithm can be further modified to improve its efficiency using vectorization and mini-batches.



Gradient descent is the most popular optimization technique in machine learning (ML) modeling. The algorithm minimizes the error between the predicted values and the ground truth. Since the technique considers each data point to understand and minimize the error, its performance depends on the size of the training data. Techniques like Stochastic Gradient Descent (SGD) are designed to improve computational performance, but at the cost of convergence accuracy.


Stochastic Average Gradient balances the classical approach, known as Full Gradient Descent, and SGD, and offers the benefits of both. But before we can use the algorithm, we must first understand its significance for model optimization.

Optimizing Machine Learning Objectives With Gradient Descent

Every ML algorithm has an associated loss function that it aims to minimize in order to improve the model's performance. Mathematically, the loss can be defined as:
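A simple form consistent with the description that follows is

$$\text{loss} = y_{\text{actual}} - y_{\text{predicted}}$$

(the exact formula varies by model; in practice a squared or absolute difference, averaged over all data points, is the common choice).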


It is simply the difference between the actual and the predicted output, and minimizing this difference means that our model comes closer to the ground truth values.


The minimization algorithm uses gradient descent to traverse the loss function and find the global minimum. Each traversal step involves updating the algorithm's weights to optimize the output.


Plain Gradient Descent

The conventional gradient descent algorithm uses the average of all the gradients calculated across the entire dataset: each training example contributes its gradient to the loss, and the averaged result drives a single weight update.



The weight update equation looks as follows:

$$W \leftarrow W - \alpha \, \frac{dJ}{dW}$$

where $W$ represents the model weights, $dJ/dW$ is the derivative of the loss function with respect to the model weights, and $\alpha$ is the learning rate that scales each update. The conventional method has a high convergence rate but becomes computationally expensive when dealing with large datasets comprising millions of data points.
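As a concrete illustration, here is a minimal NumPy sketch of full gradient descent, assuming a linear model with a mean-squared-error loss; the function name and hyperparameter defaults are illustrative, not from the original article.

```python
import numpy as np

def full_gradient_descent(X, y, lr=0.01, n_iters=1000):
    """Plain (full) gradient descent for a linear model with an MSE loss.

    Every iteration computes the gradient over the ENTIRE dataset,
    giving a stable descent direction at an O(n) cost per step.
    """
    n_samples, n_features = X.shape
    W = np.zeros(n_features)
    for _ in range(n_iters):
        # dJ/dW for J(W) = (1/n) * sum_i (x_i . W - y_i)^2
        grad = (2.0 / n_samples) * X.T @ (X @ W - y)
        W -= lr * grad  # W := W - alpha * dJ/dW
    return W
```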

Stochastic Gradient Descent (SGD)

The SGD methodology remains the same as plain GD, but instead of using the entire dataset to calculate the gradients, it uses a small batch from the inputs. The method is much more efficient, but it may hop around the global minima too much, since each iteration uses only a portion of the data for learning.
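Under the same linear-model assumptions as the previous sketch, an SGD variant only swaps the full-dataset gradient for an estimate from a small random batch:

```python
import numpy as np

def stochastic_gradient_descent(X, y, lr=0.01, n_iters=1000, batch_size=1):
    """SGD: each step estimates the gradient from a small random batch.

    Far cheaper per iteration than full GD, but the noisy estimate
    makes the iterates oscillate around the global minimum.
    """
    n_samples, n_features = X.shape
    W = np.zeros(n_features)
    rng = np.random.default_rng(seed=0)
    for _ in range(n_iters):
        idx = rng.choice(n_samples, size=batch_size, replace=False)
        Xb, yb = X[idx], y[idx]
        grad = (2.0 / batch_size) * Xb.T @ (Xb @ W - yb)
        W -= lr * grad
    return W
```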

Stochastic Average Gradient

The Stochastic Average Gradient (SAG) approach was introduced as a middle ground between GD and SGD. It selects a random data point and updates its value based on the gradient at that point and a weighted average of the past gradients stored for that particular data point.


Similar to SGD, SAG models every problem as a finite sum of convex, differentiable functions. At any given iteration, it uses the present gradients and the average of previous gradients for the weight update. The equation takes the following form:
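Following the Schmidt et al. paper linked below, the SAG update can be written as:

$$
W^{k+1} = W^{k} - \frac{\alpha}{n} \sum_{i=1}^{n} y_i^{k},
\qquad
y_i^{k} =
\begin{cases}
\nabla f_i\left(W^{k}\right) & \text{if } i = i_k \\
y_i^{k-1} & \text{otherwise,}
\end{cases}
$$

where $i_k$ is the index of the data point sampled at iteration $k$. A minimal NumPy sketch under the same linear-model assumptions as the earlier snippets:

```python
import numpy as np

def sag(X, y, lr=0.01, n_iters=10000):
    """Stochastic Average Gradient for a linear model with an MSE loss.

    A memory slot stores the most recent gradient seen for every data
    point; each step refreshes one slot and moves along the average of
    all stored gradients, combining SGD's per-step cost with a full
    gradient-like direction.
    """
    n_samples, n_features = X.shape
    W = np.zeros(n_features)
    grad_memory = np.zeros((n_samples, n_features))  # y_i, one row per point
    grad_sum = np.zeros(n_features)                  # running sum of all y_i
    rng = np.random.default_rng(seed=0)
    for _ in range(n_iters):
        i = rng.integers(n_samples)
        # Fresh gradient of the i-th loss term at the current weights
        g_new = 2.0 * X[i] * (X[i] @ W - y[i])
        # Swap the stored gradient for point i and update the running sum
        grad_sum += g_new - grad_memory[i]
        grad_memory[i] = g_new
        # Step along the average of all stored gradients
        W -= lr * grad_sum / n_samples
    return W
```

Note that for losses of the form $f_i(W) = \ell(x_i^\top W)$, the n-by-d gradient memory can be compressed to one scalar per data point, which keeps the algorithm's memory footprint manageable.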



Convergence Rate

Out of the two popular algorithms, full gradient (FG) and stochastic gradient descent (SGD), the FG algorithm has a better convergence rate since it utilizes the entire data set during each iteration for the calculation.

Although SAG has a structure similar to SGD, its convergence rate is comparable to, and sometimes better than, the full gradient approach. Table 1 below summarizes the results from the experiments of Schmidt et al.

Source: https://arxiv.org/pdf/1309.2388

Further Modifications

Despite its impressive performance, several modifications have been proposed to the original SAG algorithm to help improve performance.


  • Re-weighting in Early Iterations: SAG convergence remains slow during the first few iterations since the algorithm normalizes the direction with n (the total number of data points). This provides an inaccurate estimate, as the algorithm is yet to see many of the data points. The modification suggests normalizing by m instead of n, where m is the number of data points seen at least once up to that particular iteration (see the sketch after this list).
  • Mini-batches: The stochastic gradient approach uses mini-batches to process multiple data points simultaneously. The same approach can be applied to SAG. This allows vectorization and parallelization for improved computational efficiency. It also reduces the memory load, a prominent challenge for the SAG algorithm.
  • Step-size Experimentation: The step size mentioned earlier (1/16L) provides amazing results, but the authors further experimented with a step size of 1/L. The latter provided even better convergence. However, the authors could not present a formal analysis of the improved results. They conclude that the step size should be experimented with to find the optimal one for the specific problem.
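A sketch of the early-iteration re-weighting modification, under the same linear-model assumptions as the earlier snippets: the only change from plain SAG is dividing the gradient sum by m, the number of points visited so far, instead of n.

```python
import numpy as np

def sag_reweighted(X, y, lr=0.01, n_iters=10000):
    """SAG with early-iteration re-weighting: normalize by m, the number
    of distinct data points visited so far, instead of the dataset size n.

    Until every point has been seen, dividing by n treats the untouched
    (still zero) gradient slots as real information and shrinks the step;
    dividing by m removes that bias.
    """
    n_samples, n_features = X.shape
    W = np.zeros(n_features)
    grad_memory = np.zeros((n_samples, n_features))
    grad_sum = np.zeros(n_features)
    seen = np.zeros(n_samples, dtype=bool)
    m = 0  # number of data points seen at least once
    rng = np.random.default_rng(seed=0)
    for _ in range(n_iters):
        i = rng.integers(n_samples)
        if not seen[i]:
            seen[i] = True
            m += 1
        g_new = 2.0 * X[i] * (X[i] @ W - y[i])
        grad_sum += g_new - grad_memory[i]
        grad_memory[i] = g_new
        W -= lr * grad_sum / m  # normalize by m, not n
    return W
```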


Final Thoughts

Gradient descent is a popular optimization technique used for locating the global minima of given objective functions. The algorithm uses the gradient of the objective function to traverse the function's slope until it reaches the lowest point.

Full Gradient Descent (FG) and Stochastic Gradient Descent (SGD) are two popular variations of the algorithm. FG uses the entire dataset during each iteration and provides a high convergence rate at a high computation cost. At each iteration, SGD uses a subset of the data to run the algorithm. It is far more efficient but has an uncertain convergence.


Stochastic Average Gradient (SAG) is another variation that provides the benefits of both previous algorithms. It uses the average of past gradients and a subset of the dataset to provide a high convergence rate with low computation. The algorithm can be further modified to improve its efficiency using vectorization and mini-batches.