
Understanding Stochastic Average Gradient

by Andrey Kustarev · 4 min · 2024/06/06

Too Long; Didn't Read

Gradient descent is a popular optimization technique used to locate the global minima of a given objective function. The algorithm uses the gradient of the objective function to traverse the function slope until it reaches the lowest point. Full Gradient Descent (FG) and Stochastic Gradient Descent (SGD) are two popular variations of the algorithm. FG uses the entire dataset during each iteration and provides a high convergence rate at a high computational cost. At each iteration, SGD uses a subset of the data to run the algorithm. It is far more efficient but has an uncertain convergence. Stochastic Average Gradient (SAG) is another variation that provides the benefits of both previous algorithms. It uses the average of past gradients and a subset of the dataset to deliver a high convergence rate with low computation. The algorithm can be further modified to improve its efficiency using vectorization and mini-batches.



Gradient descent is the most popular optimization technique in machine learning (ML) modeling. The algorithm minimizes the error between the predicted values and the ground truth. Since the technique considers each data point to understand and minimize the error, its performance depends on the size of the training data. Techniques like Stochastic Gradient Descent (SGD) are designed to improve calculation performance, but at the cost of convergence accuracy.


Stochastic Average Gradient balances the classic approach, known as Full Gradient Descent, and SGD, and offers the benefits of both. But before we can use the algorithm, we must first understand its significance for model optimization.

Optimizing Machine Learning Objectives with Gradient Descent

Every ML algorithm has an associated loss function that it aims to minimize to improve the model's performance. Mathematically, the loss can be defined as:
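A minimal concrete instance, assuming a mean squared error loss over n training examples with targets yᵢ and predictions ŷᵢ:

J(W) = (1/n) · Σᵢ₌₁ⁿ (yᵢ − ŷᵢ)²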


It is simply the difference between the actual and the predicted output, and minimizing this difference means that our model comes closer to the ground truth values.


The minimization algorithm uses gradient descent to traverse the loss function and find a global minimum. Each traversal step involves updating the algorithm's weights to optimize the output.


Plain Gradient Descent

The conventional gradient descent algorithm uses the average of all the gradients calculated across the entire dataset. The lifecycle of a single training example looks as follows:



The weight update equation looks as follows:
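W = W − α · (dJ/dW)

Here α is the learning rate (step size); the remaining terms are defined below.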

Where W represents the model weights and dJ/dW is the derivative of the loss function with respect to the model weight. The conventional method has a high convergence rate but becomes computationally expensive when dealing with large datasets comprising millions of data points.
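To make the full-batch update concrete, here is a minimal sketch of plain gradient descent for a linear-regression model with the MSE loss above; the function and parameter names (full_gradient_descent, lr, n_iters) are illustrative, not from the article:

```python
import numpy as np

def full_gradient_descent(X, y, lr=0.1, n_iters=1000):
    """Plain (full-batch) gradient descent for linear regression with MSE loss."""
    n, d = X.shape
    W = np.zeros(d)
    for _ in range(n_iters):
        y_pred = X @ W                         # predictions for the whole dataset
        grad = (2.0 / n) * X.T @ (y_pred - y)  # dJ/dW averaged over all n points
        W -= lr * grad                         # W = W - alpha * dJ/dW
    return W

# Example usage on synthetic data
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
true_W = np.array([1.5, -2.0, 0.5])
y = X @ true_W + 0.01 * rng.normal(size=1000)
print(full_gradient_descent(X, y))  # should approach true_W
```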

Stochastic Gradient Descent (SGD)

The SGD methodology remains the same as plain GD, but instead of using the entire dataset to calculate the gradients, it uses a small batch from the inputs. The method is much more efficient, but it may hop around the global minima too much since each iteration uses only a portion of the data for learning.

The stochastic gradient update equation looks as follows:
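With a random index i (or a small batch) drawn at each iteration, the update takes the single-sample form:

W = W − α · (dJᵢ/dW)

where Jᵢ is the loss computed only on the sampled data. A minimal SGD sketch for the same linear-regression setup as above (function and parameter names are illustrative):

```python
import numpy as np

def sgd(X, y, lr=0.01, n_iters=10000):
    """Stochastic gradient descent: one randomly sampled example per update."""
    n, d = X.shape
    W = np.zeros(d)
    rng = np.random.default_rng(1)
    for _ in range(n_iters):
        i = rng.integers(n)                      # pick one random data point
        grad_i = 2.0 * X[i] * (X[i] @ W - y[i])  # gradient of the loss on that point
        W -= lr * grad_i                         # W = W - alpha * dJ_i/dW
    return W
```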

The Stochastic Average Gradient (SAG) approach was introduced as a middle ground between GD and SGD. It selects a random data point and updates its value based on the gradient at that point and a weighted average of the past gradients stored for that particular data point.


Similar to SGD, SAG models every problem as a finite sum of convex, differentiable functions. At any given iteration, it uses the present gradients and the average of previous gradients for the weight update. The equation takes the following form:
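Following the formulation in the paper linked below, with a stored gradient gᵢ for each of the n data points and a random index j refreshed at every iteration, the update can be written as:

g_j ← ∇f_j(W),    W ← W − (α/n) · Σᵢ₌₁ⁿ gᵢ

Only the j-th entry of the gradient table changes, while all n stored gradients contribute to the average. A minimal SAG sketch for the same linear-regression objective as above, assuming the MSE loss (the gradient table and function names are illustrative):

```python
import numpy as np

def sag(X, y, lr=0.05, n_iters=10000):
    """Stochastic Average Gradient: keeps one stored gradient per data point."""
    n, d = X.shape
    W = np.zeros(d)
    grads = np.zeros((n, d))   # stored gradient for every data point
    grad_sum = np.zeros(d)     # running sum of all stored gradients
    rng = np.random.default_rng(2)
    for _ in range(n_iters):
        j = rng.integers(n)
        new_grad = 2.0 * X[j] * (X[j] @ W - y[j])  # fresh gradient at point j
        grad_sum += new_grad - grads[j]            # swap the old entry out of the sum
        grads[j] = new_grad
        W -= lr * grad_sum / n                     # step along the average gradient
    return W
```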



Convergence Rate

Between the two popular algorithms, full gradient (FG) and stochastic gradient descent (SGD), the FG algorithm has a better convergence rate since it utilizes the entire dataset during each iteration for calculation.

Although SAG has a structure similar to SGD, its convergence rate is comparable to, and sometimes better than, the full gradient approach. Table 1 below summarizes the results from the experiments of Schmidt et al.
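The table image itself is not reproduced here; paraphrasing the headline rates reported in the source below:

  • Convex case: FG converges at O(1/k), SGD at O(1/√k), and SAG at O(1/k).
  • Strongly convex case: FG and SAG converge at a linear rate, while SGD remains sublinear at O(1/k).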

Source: https://arxiv.org/pdf/1309.2388

Further Modifications

Despite its impressive performance, several modifications have been proposed to the original SAG algorithm to help improve performance.


  • Re-weighting in Early Iterations: SAG convergence remains slow during the first few iterations since the algorithm normalizes the direction with n (the total number of data points). This yields an inaccurate estimate, as the algorithm has yet to see most of the data points. The modification suggests normalizing by m instead of n, where m is the number of data points seen at least once up to that particular iteration (see the sketch after this list).
  • Mini-batches: The stochastic gradient approach uses mini-batches to process multiple data points simultaneously. The same approach can be applied to SAG. This allows for vectorization and parallelization, improving computational performance. It also reduces the memory load, a prominent challenge for the SAG algorithm (also shown in the sketch below).
  • Step-Size Experimentation: The step size mentioned earlier (1/16L) provides impressive results, but the authors experimented further with a step size of 1/L. The latter provided even better convergence. However, the authors were unable to present a formal analysis of the improved results. They conclude that the step size should be tuned experimentally to find the optimal one for the specific problem.
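The sketch below combines the first two modifications, assuming the same linear-regression setup as the earlier examples: it normalizes by the number m of points seen at least once and samples a mini-batch per iteration. The function name, batch_size parameter, and bookkeeping are illustrative assumptions, not from the paper:

```python
import numpy as np

def sag_modified(X, y, lr=0.05, n_iters=5000, batch_size=16):
    """SAG with early-iteration re-weighting (divide by m) and mini-batches."""
    n, d = X.shape
    W = np.zeros(d)
    grads = np.zeros((n, d))        # stored gradient for every data point
    grad_sum = np.zeros(d)          # running sum of all stored gradients
    seen = np.zeros(n, dtype=bool)  # which data points have been visited
    rng = np.random.default_rng(3)
    for _ in range(n_iters):
        idx = rng.choice(n, size=batch_size, replace=False)  # sample a mini-batch
        residuals = X[idx] @ W - y[idx]
        new_grads = 2.0 * X[idx] * residuals[:, None]  # vectorized per-point gradients
        grad_sum += (new_grads - grads[idx]).sum(axis=0)
        grads[idx] = new_grads
        seen[idx] = True
        m = seen.sum()              # points seen at least once so far
        W -= lr * grad_sum / m      # normalize by m instead of n
    return W
```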


Final Thoughts

Gradient descent is a popular optimization technique used to locate the global minima of a given objective function. The algorithm uses the gradient of the objective function to traverse the function slope until it reaches the lowest point.

Full Gradient Descent (FG) and Stochastic Gradient Descent (SGD) are two popular variations of the algorithm. FG uses the entire dataset during each iteration and provides a high convergence rate at a high computational cost. At each iteration, SGD uses a subset of the data to run the algorithm. It is far more efficient but has an uncertain convergence.


Stochastic Average Gradient (SAG) is another variation that provides the benefits of both previous algorithms. It uses the average of past gradients and a subset of the dataset to deliver a high convergence rate with low computation. The algorithm can be further modified to improve its efficiency using vectorization and mini-batches.