Sonke sisombulula ingxaki " yeyona ndlela imfutshane " amaxesha amaninzi ngemini. Ngokungeyomfuneko kunjalo. Siyicombulula xa sisemsebenzini okanye xa sijonga i-Intanethi okanye xa sicwangcisa izinto zethu edesikeni.
Ivakala kancinci… kakhulu? Makhe sifumanise.
Khawufane ucinge, uthathe isigqibo sokudibana nabahlobo, kulungile ... masithi kwi-cafe. Okokuqala, kufuneka ufumane indlela (okanye indlela) eya kwi-cafe, ngoko uqala ukukhangela izithuthi zikawonke-wonke ezikhoyo (ukuba uhamba ngeenyawo) okanye iindlela kunye nezitrato (ukuba uqhuba). Ukhangela indlela esuka kwindawo okuyo ukuya kwikhefi kodwa hayi "nayiphi na" indlela - eyona imfutshane okanye ikhawulezayo.
Nanku omnye umzekelo, ongacacanga njengowangaphambili. Ngexesha lomsebenzi uthatha isigqibo sokuthatha "indlela emfutshane" kwisitalato esisecaleni kuba ... "yindlela emfutshane" kwaye "ikhawuleza" ukuhamba ngale ndlela. Kodwa wazi njani ukuba le "short cut" iyakhawuleza? Ngokusekelwe kumava akho, usombulule ingxaki "yeyona ndlela imfutshane" kwaye ukhethe indlela, ehamba ngesitrato esisecaleni.
Kuyo yomibini imizekelo, indlela “emfutshane” imiselwa nokuba ngumgama okanye ixesha elifunekayo ukusuka kwenye indawo ukuya kwenye. Imizekelo ehambayo lusetyenziso lwendalo lwethiyori yegrafu kunye nengxaki "yeyona ndlela imfutshane" ngakumbi. Noko ke, kuthekani ngomzekelo wokulungelelanisa izinto edesikeni? Kule meko "emfutshane" inokumela inani okanye ukuntsonkotha kwezenzo ekufuneka uzenze ukufumana, umzekelo, iphepha: idesika evulekileyo, ifolda evulekileyo, thatha iphepha vs ukuthatha iphepha ekunene kwidesika. .
Yonke le mizekelo ingasentla imele ingxaki yokufumana eyona ndlela imfutshane phakathi kwee-vertex ezimbini kwigrafu (indlela okanye indlela phakathi kweendawo ezimbini, inani lezenzo okanye ubunzima bokufumana iphepha ukusuka kwindawo enye okanye kwenye). Olu didi lweengxaki zendlela ezimfutshane lubizwa ngokuba yi-SSSP (Umthombo omnye oMfutshane weNdlela) kunye nesiseko se-algorithm yokusombulula ezi ngxaki yi -algorithm ye-Dijkstra , ene- O(n^2)
yobunzima bokubala.
Kodwa, ngamanye amaxesha kufuneka sifumane zonke iindlela ezimfutshane phakathi kwazo zonke ii-vertex. Qwalasela lo mzekelo ulandelayo: wenzela imephu yakho iintshukumo rhoqo phakathi kwekhaya , umsebenzi kunye nethiyetha . Kulo mzekelo uya kuphelela ngeendlela ezi-6: work ⭢ home
, home ⭢ work
, work ⭢ theatre
, theatre ⭢ work
, home ⭢ theatre
kunye theatre ⭢ home
(iindlela ezibuyela umva zinokwahluka ngenxa yeendlela zendlela enye umzekelo) .
Ukongeza indawo eyongezelelekileyo kwimephu kuya kukhokelela ekukhuleni okubalulekileyo kokudityaniswa - ngokungqinelana nefomula ye-n ye -combinatorics ingabalwa ngolu hlobo:
A(k, n) = n! / (n - m)! // where n - is a number of elements, // k - is a number of elements in permutation (in our case k = 2)
Okusinika iindlela ezili-12 zeendawo ezi-4 kunye neendlela ezingama-90 zeendawo ezili-10 (ezichukumisayo). Qaphela… oku ngaphandle kokuthathela ingqalelo iindawo eziphakathi phakathi kweendawo okt, ukusuka ekhaya ukuya emsebenzini kufuneka uwele izitalato ezi-4, uhambe ngomlambo kwaye uwele ibhulorho. Ukuba siyaqikelela, ezinye iindlela zinokuba neendawo eziqhelekileyo eziphakathi… kuhle ... ngenxa yoko siya kuba negrafu enkulu kakhulu, eneentloko ezininzi, apho i-vertex nganye iyakumela nokuba yindawo okanye indawo ephakathi ebalulekileyo. Udidi lweengxaki, apho kufuneka sifumane zonke iindlela ezimfutshane phakathi kwazo zonke izibini ze-vertexes kwigrafu, ibizwa ngokuba yi -APSP (Zonke Izibini Ezifutshane) kunye nesiseko se-algorithm yokusombulula ezi ngxaki yi -algorithm ye-Floyd-Warshall , ene- O(n^3)
ukuntsokotha kokubala.
Kwaye le yi-algorithm esiza kuyiphumeza namhlanje 🙂
I-algorithm ye-Floyd-Warshall ifumana zonke iindlela ezimfutshane phakathi kweperi nganye yeevertex kwigrafu. I-algorithms yapapashwa nguRobert Floyd kwi- [1] (jonga icandelo elithi "References" ukuze ufumane iinkcukacha ezingaphezulu). Kwangalo nyaka, uPeter Ingerman kwi- [2] uchaze ukuphunyezwa kwe-algorithm yangoku ngohlobo lwezintlu ezintathu ezibekwe for
:
algorithm FloydWarshall(W) do for k = 0 to N - 1 do for i = 0 to N - 1 do for j = 0 to N - 1 do W[i, j] = min(W[i, j], W[i, k] + W[k, j]) end for end for end for end algorithm // where W - is a weight matrix of N x N size, // min() - is a function which returns lesser of it's arguments
Ukuba awuzange ube notshintsho ekusebenzeni ngegrafu emelwe ngohlobo lwe-matrix ngoko kunokuba nzima ukuqonda ukuba i-algorithm ingentla yenza ntoni. Ke, ukuqinisekisa ukuba sikwiphepha elinye, makhe sijonge ukuba igrafu inokumelwa njani ngohlobo lwe-matrix kwaye kutheni umelo olunjalo luluncedo ukusombulula eyona ngxaki imfutshane yendlela.
Umfanekiso ongezantsi ubonisa igrafu eqondisiweyo, enobunzima bee-vertexes ezi-5. Ngasekhohlo, igrafu iboniswa ngendlela ebonwayo, eyenziwe ngezangqa kunye neentolo, apho isangqa ngasinye simele i-vertex kwaye utolo lumele isiphelo esinesalathiso. Inani elingaphakathi kwesangqa lihambelana nenombolo yevertex kunye nenani elingentla komphetho liyahambelana nobunzima bomphetho. Ngasekunene, igrafu efanayo inikezelwa ngendlela ye-matrix yobunzima. I-matrix yobunzima luhlobo lwe -matrix ekufutshane apho iseli nganye ye-matrix iqulethe "ubunzima" - umgama phakathi kwe-vertex i
(umqolo) kunye ne-vertex j
(ikholamu). I-matrix yobunzima ayibandakanyi ulwazi malunga “nendlela” phakathi kwee-vertex (uluhlu lwee-vertex ofumana ngalo ukusuka ku i
ukuya ku j
) – nje ubunzima (umgama) phakathi kwezi vesi.
Kwi-matrix yobunzima sinokubona ukuba ixabiso leseli lilingana nobunzima phakathi kwee-vertex ezikufutshane . Yiyo loo nto, ukuba sihlola iindlela ukusuka kwi-vertex 0
(umqolo we 0
), siya kubona ukuba ... kukho indlela enye kuphela - ukusuka ku 0
ukuya ku 1
. Kodwa, kumboniso obonakalayo sinokubona ngokucacileyo iindlela ukusuka kwi-vertex 0
ukuya kwi-vertexes yesi 2
kunye ne 3
(nge-vertex 1
). Isizathu salokhu silula - kwimeko yokuqala, i-matrix yobunzima iqulethe umgama phakathi kwee-vertex ezikufutshane kuphela. Nangona kunjalo, olu lwazi lulodwa lwanele ukufumana ukuphumla.
Makhe sibone ukuba isebenza njani. Nika ingqalelo kwiseli W[0, 1]
. Ixabiso libonisa, kukho indlela esuka kwivertex 0
ukuya kwivertex 1
enobunzima obulingana no 1
(ngokufutshane singabhala njenge: 0 ⭢ 1: 1
). Ngolu lwazi, ngoku sinokuskena zonke iiseli zomqolo woku 1
(oqulethe zonke izisindo zazo zonke iindlela ukusuka kwi-vertex 1
) kunye ne-port yangasemva le ngcaciso kumqolo 0
, ukwandisa ubunzima ngo 1
(ixabiso le- W[0, 1]
).
Ngokusebenzisa amanyathelo afanayo sinokufumana iindlela ukusuka kwi-vertex 0
ukuya kwezinye ii-vertex. Ngexesha lokukhangela, kunokwenzeka ukuba kukho iindlela ezingaphezu kwenye ezikhokelela kwi-vertex enye kwaye yintoni ebaluleke kakhulu ubunzima bezi ndlela zinokwahluka. Umzekelo wemeko enjalo ubonisiwe kumfanekiso ongezantsi, apho ukukhangela ukusuka kwi-vertex 0
ukuya kwi-vertex 2
kutyhile enye indlela eya kwi-vertex yesi 3
yobunzima obuncinane.
Sineendlela ezimbini: indlela yoqobo – 0 ⭢ 3: 4
kunye nendlela entsha esisandula ukuyifumanisa – 0 ⭢ 2 ⭢ 3: 3
(khumbula, ubunzima bematrix ayiqulathanga iindlela, ngoko asazi ukuba yeyiphi. iivertex zibandakanyiwe kwindlela yokuqala). Eli lixesha xa sithatha isigqibo sokugcina eyona imfutshane kwaye sibhale 3
kwiseli W[0, 3]
.
Kubonakala ngathi sifumene eyona ndlela yethu yokuqala imfutshane!
Amanyathelo esisanda kuwabona angoyena ndoqo we-algorithm ye-Floyd-Warshall. Jonga i-algorithm's pseudocode ixesha elingakumbi:
algorithm FloydWarshall(W) do for k = 0 to N - 1 do for i = 0 to N - 1 do for j = 0 to N - 1 do W[i, j] = min(W[i, j], W[i, k] + W[k, j]) end for end for end for end algorithm
Umjikelo ongaphandle for
- k
iiterates phezu kwazo zonke iivertex kwigrafu kwaye kuphindaphindo ngalunye, ulwahlulo u k
umele ivertex esizingela iindlela kuyo . Umjikelo wangaphakathi for
i
iphinda iphindaphinde ngaphezu kwazo zonke ii-vertex kwigrafu kwaye kwi-iteration nganye, i
i-vertex esikhangela iindlela ukusuka kuyo . Kwaye okokugqibela umjikelo ongaphakathi for
j
iterates phezu kwazo zonke iivertex kwigrafu nakuphinda-phindo ngalunye, j
imele ivertex esikhangela indlela eya kuyo. Xa kudityanisiwe isinika oku kulandelayo: kwi-iteration nganye k
sikhangela umendo kuzo zonke iivertex i
ukuya kuzo zonke iivertex j
ukugqitha kwivertex k
. Ngaphakathi kumjikelo sithelekisa indlela i ⭢ j
(emelwe ngu W[i, j]
) kunye nendlela i ⭢ k ⭢ j
(emelwe sisimbuku se- W[I, k]
kunye ne W[k, j]
) kunye nokubhala eyona imfutshane enye ibuyele kwi W[i, j]
.
Ngoku, xa siqonda i-mechanics lixesha lokuphumeza i-algorithm.
Ikhowudi yomthombo kunye neegrafu zokulinga ziyafumaneka kwindawo yokugcina kwi-GitHub. Iigrafu zovavanyo zinokufunyanwa kuvimba Data/Sparse-Graphs.zip
. Zonke iibenchmarks kwesi sithuba ziphunyezwe kwifayile ye -APSP01.cs .
Ngaphambi kokuba singene ekuphunyezweni kufuneka sicacise amaxesha ambalwa obugcisa:
NO_EDGE = (int.MaxValue / 2) - 1
.
Ngoku, makhe sibone ukuba kutheni.
Malunga ne-#1. Xa sithetha ngee-matrixes sichaza iiseli ngokwe “miqolo” kunye “nezikholamu”. Ngenxa yoko kubonakala kungokwemvelo ukucinga i-matrix ngendlela "yesikwere" okanye "uxande" (Umfanekiso 4a).
Nangona kunjalo, oku akuyomfuneko kuthetha ukuba kufuneka simele imatrix ngohlobo loluhlu lwee-arrays (Umfanekiso 4b) ukunamathela kwintelekelelo yethu. Endaweni yoku, sinokumela imatrix ngendlela yoluhlu lomgca (Umfanekiso 4c) apho isalathiso seseli nganye sibalwa kusetyenziswa le fomula ilandelayo:
i = row * row_size + col; // where row - cell row index, // col - cell column index, // row_size - number of cells in a row.
Uluhlu lomgca wobunzima be-matrix ngumqathango wangaphambili wokuphunyezwa okusebenzayo kwe-algorithm ye-Floyd-Warshall. Isizathu soko asilula kwaye ingcaciso eneenkcukacha ifanelwe yiposti eyahlukileyo… okanye izithuba ezimbalwa. Nangona kunjalo, okwangoku, kubalulekile ukukhankanya ukuba ukumelwa okunjalo kuphucula kakhulu indawo yedatha , eneneni enempembelelo enkulu ekusebenzeni kwe-algorithm.
Apha, ndicela ukuba undikholelwe kwaye nje olu lwazi engqondweni njengemeko yangaphambili, nangona kunjalo, kwangaxeshanye ndincoma ukuba uchithe ixesha kwaye ufunde lo mbuzo, kwaye ngendlela - ungakholelwa abantu kwi-Intanethi. .
- Inqaku lombhali
Ngokuphathelele #2. Ukuba ujonga ngokusondeleyo kwi-algorithm yepseudocode, awuzukufumana naziphi iitshekhi eziyelelene nobukho bendlela phakathi kweeverteksi ezimbini. Endaweni yoko, pseudocode sebenzisa nje min()
umsebenzi. Isizathu silula - ekuqaleni, ukuba akukho mendo phakathi kweeverteksi, ixabiso leseli limiselwe kwi infinity
kwaye kuzo zonke iilwimi, ngaphandle kokuba iJavaScript, onke amaxabiso angaphantsi kune infinity
. Kwimeko yeenombolo ezipheleleyo kunokuhenda ukusebenzisa int.MaxValue
njengexabiso elithi "akukho ndlela". Nangona kunjalo oku kuyakukhokelela ekuphuphumeni okupheleleyo kwiimeko xa amaxabiso azo zombini i ⭢ k
no k ⭢ j
iindlela ziyakulingana ne int.MaxValue
. Yiyo loo nto sisebenzisa ixabiso elingaphantsi kwesiqingatha se- int.MaxValue
.
Hayi! Kodwa kutheni singenakujonga nje ukuba indlela ikhona na phambi kokuba senze naziphi na izibalo. Umzekelo ngokuthelekisa iindlela zombini ukuya ku-0 (ukuba sithatha u-zero njengexabiso elithi "akukho ndlela").
- Umfundi ocingayo
Inokwenzeka ngokwenene kodwa ngelishwa iya kukhokelela kwisohlwayo esibalulekileyo sokusebenza. Ngokufutshane, i-CPU igcina iinkcukacha-manani zeziphumo zovavanyo lwesebe. xa if
zengxelo zivavanya ukuba true
okanye false
. Isebenzisa olu balo-manani ukwenza ikhowudi "yesebe eliqikelelweyo ngokweenkcukacha-manani" ngaphambili phambi kokuba if
ivavanywe (oku kubizwa ngokuba yintelekelelo ekwenziweni ) kwaye ke ngoko iphumeze ikhowudi ngokufanelekileyo ngakumbi. Nangona kunjalo, xa ukubikezelwa kwe-CPU kungachanekanga, kubangela ilahleko enkulu yokusebenza xa kuthelekiswa nokuqikelelwa okuchanekileyo kunye nokuphunyezwa okungenamiqathango kuba i-CPU kufuneka ime kwaye ibale isebe elichanekileyo.
Kuba kwi-iteration k
nganye sihlaziya inxalenye ebalulekileyo ye-matrix yobunzima, izibalo ze-CPU zesebe ziba lilize kuba akukho pateni yekhowudi, onke amasebe asekelwe ngokukodwa kwidatha. Ngoko ke oko kuhlolwa kuya kukhokelela kwisixa esibalulekileyo sokungacingelwa kakuhle kwesebe .
Apha kwakhona ndicela ukuba undikholelwe (okwangoku) kwaye emva koko uchithe ixesha lokufunda isihloko.
Uhh, kujongeka ngathi sigqibile ngethiyori inxalenye - masiphumeze i-algorithm (sichaza oku kuphunyezwa Baseline
):
public void Baseline(int[] matrix, int sz) { for (var k = 0; k < sz; ++k) { for (var i = 0; i < sz; ++i) { for (var j = 0; j < sz; ++j) { var distance = matrix[i * sz + k] + matrix[k * sz + j]; if (matrix[i * sz + j] > distance) { matrix[i * sz + j] = distance; } } } } }
Le khowudi ingentla iphantse ifane ikopi yepseudocode ekhankanywe ngaphambili ngaphandle kokunye - endaweni Math.Min()
sisebenzisa if
sihlaziya iseli kuphela xa kuyimfuneko.
Hayi! Yima umzuzu, yayingenguwe owabhala amagama amaninzi malunga nokuba kutheni ukuba akulunganga apha kunye nemigca embalwa kamva thina ngokwethu sazisa ukuba?
- Umfundi ocingayo
Isizathu silula. Ngomzuzu wokubhala i-JIT ikhupha ikhowudi ephantse ilingane kuzo zombini if
kunye nokuphunyezwa Math.Min
. Ungayijonga kwiinkcukacha kwi-shartlab.io kodwa nantsi i-snippets yemizimba ephambili ye-loop:
// // == Assembly code of implementation of innermost loop for of j using if. // 53: movsxd r14, r14d // // compare matrix[i * sz + j] and distance (Condition) // 56: cmp [rdx+r14*4+0x10], ebx 5b: jle short 64 // // if matrix[i * sz + j] greater than distance write distance to matrix // 5d: movsxd rbp, ebp 60: mov [rdx+rbp*4+0x10], ebx 64: // end of loop // // == Assembly code of implementation of innermost loop for of j using Math.Min. // 4f: movsxd rbp, ebp 52: mov r14d, [rdx+rbp*4+0x10] // // compare matrix[i * sz + j] and distance (Condition) // 57: cmp r14d, ebx. // // if matrix[i * sz + j] less than distance write value to matrix // 5a: jle short 5e // // otherwise write distance to matrix // 5c: jmp short 61 5e: mov ebx, r14d 61: mov [rdx+rbp*4+0x10], ebx 65: // end of loop
Ungabona, nokuba sisebenzisa if
okanye Math.Min
kusekho ujongo olunemiqathango. Nangona kunjalo, kwimeko if
akukho bhala ngokungeyomfuneko.
Ngoku xa sigqibile ngokuphunyezwa, lixesha lokuqhuba ikhowudi kwaye sibone ukuba ikhawuleza kangakanani?!
Ungaqinisekisa ukuchaneka kwekhowudi ngokwakho ngokuqhuba iimvavanyo kwindawo yokugcina .
Ndisebenzisa iBenchmark.NET (uguqulelo 0.12.1) kwikhowudi yebenchmark. Zonke iigrafu ezisetyenziswe kwiibenchmarks ziqondiswe , iigrafu ze-acyclic ze-300, 600, 1200, 2400 kunye ne-4800 vertexes. Inani lemiphetho kwiigrafu lijikeleze i-80% yobuninzi obunokwenzeka apho ulwalathiso, iigrafu ze-acyclic zingabalwa ngolu hlobo:
var max = v * (v - 1)) / 2; // where v - is a number of vertexes in a graph.
Masishukume!
Nazi iziphumo zebenchmarks eziqhutywa kwiPC yam (Windows 10.0.19042, Intel Core i7-7700 CPU 3.60Ghz (Kaby Lake) / 8 iiprosesa ezinengqiqo / 4 cores):
Indlela | Ubungakanani | Ithetha | Impazamo | StdDev |
---|---|---|---|---|
Isiseko | 300 | 27.525 ms | 0.1937 ms | 0.1617 ms |
Isiseko | 600 | 217.897 ms | 1.6415 ms | 1.5355 ms |
Isiseko | 1200 | 1,763.335 ms | 7.4561 ms | 6.2262 ms |
Isiseko | 2400 | 14,533.335 ms | 63.3518 ms | 52.9016 ms |
Isiseko | 4800 | 119,768.219 ms | 181.5514 ms | 160.9406 ms |
Kwiziphumo esizibonayo, ixesha lokubala likhula ngokumangalisayo xa lithelekiswa nobukhulu begrafu - kwigrafu ye-300 vertexes ithathe i-27 milliseconds, igrafu ye-2400 vertex - imizuzwana ye-14.5 kunye negrafu ye-4800 - 119 imizuzwana phantse 2 imizuzu !
Ukujonga ikhowudi ye-algorithm kunokuba nzima ukuyicinga, kukho into esinokuyenza ukukhawulezisa izibalo… kuba kuhle… kukho iilophu AMATHATHU, iilophu AMATHATHU kuphela.
Nangona kunjalo, njengoko kuqhele ukwenzeka - amathuba afihliweyo kwiinkcukacha 🙂
I-algorithm ye-Floyd-Warshall yi-algorithm yesiseko sokusombulula zonke-izibini eyona ngxaki imfutshane yendlela, ngakumbi xa kufikwa kwiigrafu ezixineneyo okanye ezipheleleyo (kuba i-algorithm yokukhangela iindlela phakathi kwazo zonke izibini ze-vertexes).
Nangona kunjalo, kwiimvavanyo zethu sisebenzisa i-directed, iigrafu ze-acyclic , ezinepropati emangalisayo - ukuba kukho indlela esuka kwi-vertex 1
ukuya kwi- 2
, ngoko akukho ndlela esuka kwi-vertex 2
ukuya kwi- 1
. Kithina, kuthetha ukuba, zininzi ii-vertex ezingadibananga esinokutsiba ukuba akukho ndlela esuka ku i
ukuya ku k
(sichaza oku kuphunyezwa njenge SpartialOptimisation
).
public void SpartialOptimisation(int[] matrix, int sz) { for (var k = 0; k < sz; ++k) { for (var i = 0; i < sz; ++i) { if (matrix[i * sz + k] == NO_EDGE) { continue; } for (var j = 0; j < sz; ++j) { var distance = matrix[i * sz + k] + matrix[k * sz + j]; if (matrix[i * sz + j] > distance) { matrix[i * sz + j] = distance; } } } } }
Nazi iziphumo zangaphambili ( Baseline
) kunye nangoku ( SpartialOptimisation
) yangoku kwiseti yeegrafu (iziphumo ezikhawulezayo ziphawulwe ngokungqindilili ):
Indlela | Ubungakanani | Ithetha | Impazamo | StdDev | Umlinganiselo |
---|---|---|---|---|---|
Isiseko | 300 | 27.525 ms | 0.1937 ms | 0.1617 ms | 1.00 |
SpartialOptimisation | 300 | 12.399 ms | 0.0943 ms | 0.0882 ms | 0.45 |
Isiseko | 600 | 217.897 ms | 1.6415 ms | 1.5355 ms | 1.00 |
SpartialOptimisation | 600 | 99.122 ms | 0.8230 ms | 0.7698 ms | 0.45 |
Isiseko | 1200 | 1,763.335 ms | 7.4561 ms | 6.2262 ms | 1.00 |
SpartialOptimisation | 1200 | 766.675 ms | 6.1147 ms | 5.7197 ms | 0.43 |
Isiseko | 2400 | 14,533.335 ms | 63.3518 ms | 52.9016 ms | 1.00 |
SpartialOptimisation | 2400 | 6,507.878 ms | 28.2317 ms | 26.4079 ms | 0.45 |
Isiseko | 4800 | 119,768.219 ms | 181.5514 ms | 160.9406 ms | 1.00 |
SpartialOptimisation | 4800 | 55,590.374 ms | 414.6051 ms | 387.8218 ms | 0.46 |
Iyancomeka kakhulu!
Sinciphise ixesha lokubala ngesiqingatha! Ngokuqinisekileyo, i-denser igrafu, isantya esincinci siya kuba. Nangona kunjalo, olu lolunye lolungiselelo oluza luncedo ukuba uyazi kwangaphambili ukuba loluphi udidi lwedatha ojonge ukusebenza nalo.
Ngaba sinokwenza okungakumbi? 🙂
I-PC yam ixhotyiswe nge Inter Core i7-7700 CPU 3.60GHz (Kaby Lake)
iprosesa ene-8 logical processors ( HW ) okanye ii-cores ze-4 ezinobuchwepheshe be-Hyper-Threading . Ukuba nondoqo ongaphezulu kwesinye kufana nokuba “nezandla ezizezinye” esinokuzisebenzisa. Ke, masibone ukuba leliphi icandelo lomsebenzi elinokwahlulwa ngokufanelekileyo phakathi kwabasebenzi abaninzi kwaye emva koko, lingqamanise.
Iiluphu zisoloko zingoyena mviwa ucacileyo wokunxulunyaniswa kuba kwiimeko ezininzi uphindaphindo luzimele kwaye lunokwenziwa ngaxeshanye. Kwi-algorithm, sinee-loops ezintathu ekufuneka sizihlalutye kwaye sibone ukuba kukho naziphi na izinto ezixhomekeke ekusithinteleni ukuba zihambelane nazo.
Masiqale for of k
loop. Kuluphu ngalunye lokuphindaphinda ubala iindlela ukusuka kwivertex nganye ukuya kwivertex nganye kwivertex k
. Iindlela ezintsha kunye nezihlaziyiweyo zigcinwa kwi-matrix yobunzima. Ukujonga uphindaphindo onokuthi uqaphele - banokubulawa ngalo naluphi na ulandelelwano: 0, 1, 2, 3 okanye 3, 1, 2, 0 ngaphandle kokuchaphazela isiphumo. Nangona kunjalo, kusafuneka ziphunyezwe ngokulandelelana, kungenjalo ezinye iiterations aziyi kukwazi ukusebenzisa iindlela ezintsha okanye ezihlaziyiweyo kuba aziyi kubhalwa kwi-matrix yobunzima, okwangoku. Ukungahambelani okunjalo ngokuqinisekileyo kuya kutyumza iziphumo. Ngoko kufuneka siqhubeke sikhangela.
Umgqatswa olandelayo for of i
loop. Kuluphu ngalunye lokuphinda lufundeka indlela esuka kwivertex i
ukuya kwivertex k
(iseli: W[i, k]
), indlela esuka kwivertex k
ukuya kwivertex j
(iseli: W[k, j
]) kwaye emva koko ijonga ukuba ngaba indlela eyaziwayo evela i
ukuya ku j
(iseli: W[i, j]
) mfutshane kuno i ⭢ k ⭢ j
indlela (isimbuku se: W[i, k]
+ W[k, j]
). Ukuba ujonga ngokusondeleyo kwipateni yokufikelela, unokuqaphela - kwi-iteration nganye i
loop ifunda idatha ukusuka k
kumqolo kwaye ihlaziywe i
rowu ethetha ngokusisiseko - uphindaphindo luzimele kwaye lunokuphunyezwa kungekuphela kulo naluphi na ulandelelwano kodwa nangokuhambelana!
Oku kubonakala kuthembisa, ngoko ke masikuphumeze (sichaza olu phumezo njenge SpartialParallelOptimisations
).
for of j
loop nayo inokudityaniswa. Nangona kunjalo, ukuhambelana komjikelo ongaphakathi kulo mzekelo akuphumelelanga kakhulu. Unokuziqinisekisa ngokwenza utshintsho olulula kwikhowudi yemvelaphi .
- Inqaku lombhali
public void SpartialParallelOptimisations(int[] matrix, int sz) { for (var k = 0; k < sz; ++k) { Parallel.For(0, sz, i => { if (matrix[i * sz + k] == NO_EDGE) { return; } for (var j = 0; j < sz; ++j) { var distance = matrix[i * sz + k] + matrix[k * sz + j]; if (matrix[i * sz + j] > distance) { matrix[i * sz + j] = distance; } } }); } }
Nazi iziphumo ze Baseline
, SpartialOptimisation
kunye SpartialParallelOptimisations
uzalisekiso kwiseti enye yeegrafu (ukulinganisa kwenziwa kusetyenziswa i-Parallel class):
Indlela | Ubungakanani | Ithetha | Impazamo | StdDev | Umlinganiselo |
---|---|---|---|---|---|
Isiseko | 300 | 27.525 ms | 0.1937 ms | 0.1617 ms | 1.00 |
SpartialOptimisation | 300 | 12.399 ms | 0.0943 ms | 0.0882 ms | 0.45 |
SpartialParallelOptimisations | 300 | 6.252 ms | 0.0211 ms | 0.0187 ms | 0.23 |
Isiseko | 600 | 217.897 ms | 1.6415 ms | 1.5355 ms | 1.00 |
SpartialOptimisation | 600 | 99.122 ms | 0.8230 ms | 0.7698 ms | 0.45 |
SpartialParallelOptimisations | 600 | 35.825 ms | 0.1003 ms | 0.0837 ms | 0.16 |
Isiseko | 1200 | 1,763.335 ms | 7.4561 ms | 6.2262 ms | 1.00 |
SpartialOptimisation | 1200 | 766.675 ms | 6.1147 ms | 5.7197 ms | 0.43 |
SpartialParallelOptimisations | 1200 | 248.801 ms | 0.6040 ms | 0.5043 ms | 0.14 |
Isiseko | 2400 | 14,533.335 ms | 63.3518 ms | 52.9016 ms | 1.00 |
SpartialOptimisation | 2400 | 6,507.878 ms | 28.2317 ms | 26.4079 ms | 0.45 |
SpartialParallelOptimisations | 2400 | 2,076.403 ms | 20.8320 ms | 19.4863 ms | 0.14 |
Isiseko | 4800 | 119,768.219 ms | 181.5514 ms | 160.9406 ms | 1.00 |
SpartialOptimisation | 4800 | 55,590.374 ms | 414.6051 ms | 387.8218 ms | 0.46 |
SpartialParallelOptimisations | 4800 | 15,614.506 ms | 115.6996 ms | 102.5647 ms | 0.13 |
Kwiziphumo sinokubona ukuba ukunxulunyaniswa kwe for of i
loop kunciphise ixesha lokubala ngama-2-4 amaxesha xa kuthelekiswa nokuphunyezwa kwangaphambili ( SpartialOptimisation
)! Oku kuchukumisa kakhulu, nangona kunjalo, kubalulekile ukukhumbula, ukuhambelana kobalo olusulungekileyo kudla zonke izixhobo zekhompyutha ezikhoyo ezinokukhokelela ekulambeni kobutyebi kwezinye izicelo kwinkqubo.
Njengoko unokuthekelela - esi ayisosiphelo 🙂
IVectorisation lutshintsho lwekhowudi esebenza kwinto enye ibe yikhowudi esebenza kwizinto ezininzi ngaxeshanye.
Oku kunokuvakala njengomcimbi onzima, ngoko ke makhe sibone ukuba isebenza njani kumzekelo olula:
var a = new [] { 5, 7, 8, 1 }; var b = new [] { 4, 2, 2, 6 }; var c = new [] { 0, 0, 0, 0 }; for (var i = 0; i < 4; ++i) c[i] = a[i] + b[i];
Ngendlela eyenziwe lula kakhulu , ukuphunyezwa kokuphinda-phinda oku- 0
koku kungasentla for
kwinqanaba le-CPU kunokucaciswa ngolu hlobo lulandelayo:
Nantsi into eyenzekayo. I-CPU ilayisha amaxabiso oluhlu a
kunye no b
ukusuka kwimemori ukuya kwiirejista ze-CPU (irejista enye inokubamba ngqo ixabiso elinye ). Emva koko i-CPU yenza umsebenzi wokongeza kwi-scalar kwezi rejista kwaye ibhale isiphumo sibuyele kwimemori ephambili - ngqo kuluhlu c
. Le nkqubo iphinda iphindwe kane ngaphambi kokuba i-loop iphele.
UVectorisation kuthetha ukusetyenziswa kweerejista ezikhethekileyo ze-CPU - iVector okanye iirejista ze-SIMD (umyalelo omnye wedatha ezininzi), kunye nemiyalelo ye-CPU ehambelanayo yokwenza imisebenzi kumaxabiso amaninzi ngaxeshanye:
Nantsi into eyenzekayo. I-CPU ilayisha amaxabiso oluhlu a
kunye b
ukusuka kwimemori ukuya kwiirejista ze-CPU (nangona ngeli xesha, irejista yevektha enye inokubamba ixabiso loluhlu olubini ). Emva koko i-CPU yenza umsebenzi wokongeza i-vector kwezi rejista kwaye ibhale isiphumo sibuyele kwimemori ephambili - ngqo kuluhlu c
. Ngenxa yokuba sisebenza kumaxabiso amabini kanye, le nkqubo iphindwa kabini endaweni yesine.
Ukuphumeza i-vectorisation kwi-.NET sinokusebenzisa - iVector kunye neVector<T> iintlobo (nathi sinokusebenzisa iindidi ezivela kwi -System.Runtime.Intrinsics namespace, nangona kunjalo zihamba phambili kuphunyezo lwangoku, ngoko ke andiyi kuzisebenzisa, kodwa ndizive simahla ukuba uzizame ngokwakho):
public void SpartialVectorOptimisations(int[] matrix, int sz) { for (var k = 0; k < sz; ++k) { for (var i = 0; i < sz; ++i) { if (matrix[i * sz + k] == NO_EDGE) { continue; } var ik_vec = new Vector<int>(matrix[i * sz + k]); var j = 0; for (; j < sz - Vector<int>.Count; j += Vector<int>.Count) { var ij_vec = new Vector<int>(matrix, i * sz + j); var ikj_vec = new Vector<int>(matrix, k * sz + j) + ik_vec; var lt_vec = Vector.LessThan(ij_vec, ikj_vec); if (lt_vec == new Vector<int>(-1)) { continue; } var r_vec = Vector.ConditionalSelect(lt_vec, ij_vec, ikj_vec); r_vec.CopyTo(matrix, i * sz + j); } for (; j < sz; ++j) { var distance = matrix[i * sz + k] + matrix[k * sz + j]; if (matrix[i * sz + j] > distance) { matrix[i * sz + j] = distance; } } } } }
Ikhowudi yeVectorised inokujongeka ingaqhelekanga, ke masihambe ngayo inyathelo ngenyathelo.
Senza i-vector ye- i ⭢ k
indlela: var ik_vec = new Vector<int>(matrix[i * sz + k])
. Njengomphumo, ukuba i-vector inokubamba amaxabiso amane ohlobo int
kunye nobunzima be i ⭢ k
indlela ilingana no-5, siya kudala i-vector yesine 5's - [5, 5, 5, 5]. Okulandelayo, kuphindaphindo ngalunye, ngaxeshanye sibala iindlela ukusuka kwi-vertex i
ukuya kwi-vertexes j, j + 1, ..., j + Vector<int>.Count
.
Vector<int>.Count
lubuyisela inani leezakhi zodidi int
olungena kwiirejista ze-vector. Ubungakanani beerejista zeVektha buxhomekeke kwimodeli ye-CPU, ke le propati inokubuyisela amaxabiso awohlukeneyo kwii-CPU's.
- Inqaku lombhali
Yonke inkqubo yokubala inokohlulwa ibe ngamanyathelo amathathu:
ij_vec
kunye ikj_vec
.ij_vec
kunye ikj_vec
iivektha kwaye ukhethe amaxabiso amancinci kwivektha entsha r_vec
.r_vec
umva ubunzima matrix.
Ngelixa i-#1 kunye ne #3 zithe ngqo, #2 ifuna ingcaciso. Njengoko bekutshiwo ngaphambili, ngee-vectors sisebenzisa amaxabiso amaninzi ngexesha elinye. Ngoko ke, isiphumo semisebenzi ethile ayinakuba sisinye, oko kukuthi, umsebenzi wothelekiso Vector.LessThan(ij_vec, ikj_vec)
ayinakubuya true
okanye false
kuba ithelekisa amaxabiso amaninzi. Ngoko endaweni yoko ibuyisela i-vector entsha equlathe iziphumo zothelekiso phakathi kwamaxabiso ahambelanayo asuka kwivektha ij_vec
kunye ne ikj_vec
( -1
, ukuba ixabiso elisuka kwi ij_vec
lingaphantsi kwexabiso kwi ikj_vec
kunye no 0
ukuba kungenjalo). IVektha ebuyiselweyo (ngokwayo) ayiloncedo ngenene, nangona kunjalo, sinokusebenzisa Vector.ConditionalSelect(lt_vec, ij_vec, ikj_vec)
ukusebenza kwevektha ukukhupha amaxabiso afunekayo kwi ij_vec
kunye ne ikj_vec
vector kwi-vector entsha – r_vec
. Lo msebenzi ubuyisela ivektha entsha apho amaxabiso alingana namancinane amabini ahambelanayo amaxabiso eevektha zongeniso okt, ukuba ixabiso le vector lt_vec
lilingana no -1
, ngoko umsebenzi ukhetha ixabiso kwi ij_vec
, kungenjalo ikhetha ixabiso kwi ikj_vec
.
Ngaphandle kwe- #2 , kukho enye inxalenye, ifuna inkcazo-yesibini, i-loop ye-non-vectorised. Kufuneka xa ubungakanani bobunzima be-matrix bungeyomveliso Vector<int>.Count
ixabiso. Kwimeko apho i-loop engundoqo ayikwazi ukucubungula onke amaxabiso (kuba awukwazi ukulayisha inxalenye ye-vector) kwaye okwesibini, i-non-vectored, i-loop iya kubala ukuphindaphinda okuseleyo.
Nazi iziphumo zoku kunye nakho konke ukuphunyezwa kwangaphambili:
Indlela | sz | Ithetha | Impazamo | StdDev | Umlinganiselo |
---|---|---|---|---|---|
Isiseko | 300 | 27.525 ms | 0.1937 ms | 0.1617 ms | 1.00 |
SpartialOptimisation | 300 | 12.399 ms | 0.0943 ms | 0.0882 ms | 0.45 |
SpartialParallelOptimisations | 300 | 6.252 ms | 0.0211 ms | 0.0187 ms | 0.23 |
SpartialVectorOptimisations | 300 | 3.056 ms | 0.0301 ms | 0.0281 ms | 0.11 |
Isiseko | 600 | 217.897 ms | 1.6415 ms | 1.5355 ms | 1.00 |
SpartialOptimisation | 600 | 99.122 ms | 0.8230 ms | 0.7698 ms | 0.45 |
SpartialParallelOptimisations | 600 | 35.825 ms | 0.1003 ms | 0.0837 ms | 0.16 |
SpartialVectorOptimisations | 600 | 24.378 ms | 0.4320 ms | 0.4041 ms | 0.11 |
Isiseko | 1200 | 1,763.335 ms | 7.4561 ms | 6.2262 ms | 1.00 |
SpartialOptimisation | 1200 | 766.675 ms | 6.1147 ms | 5.7197 ms | 0.43 |
SpartialParallelOptimisations | 1200 | 248.801 ms | 0.6040 ms | 0.5043 ms | 0.14 |
SpartialVectorOptimisations | 1200 | 185.628 ms | 2.1240 ms | 1.9868 ms | 0.11 |
Isiseko | 2400 | 14,533.335 ms | 63.3518 ms | 52.9016 ms | 1.00 |
SpartialOptimisation | 2400 | 6,507.878 ms | 28.2317 ms | 26.4079 ms | 0.45 |
SpartialParallelOptimisations | 2400 | 2,076.403 ms | 20.8320 ms | 19.4863 ms | 0.14 |
SpartialVectorOptimisations | 2400 | 2,568.676 ms | 31.7359 ms | 29.6858 ms | 0.18 |
Isiseko | 4800 | 119,768.219 ms | 181.5514 ms | 160.9406 ms | 1.00 |
SpartialOptimisation | 4800 | 55,590.374 ms | 414.6051 ms | 387.8218 ms | 0.46 |
SpartialParallelOptimisations | 4800 | 15,614.506 ms | 115.6996 ms | 102.5647 ms | 0.13 |
SpartialVectorOptimisations | 4800 | 18,257.991 ms | 84.5978 ms | 79.1329 ms | 0.15 |
Ukususela kwisiphumo kuyacaca, i-vectorisation iye yanciphisa kakhulu ixesha lokubala - ukusuka kwi-3 ukuya kwi-4 amaxesha xa kuthelekiswa ne-non-parallelised version ( SpartialOptimisation
). Umzuzu onomdla apha, kukuba i-vectorised version iphinda idlulele kwi-parallel version ( SpartialParallelOptimisations
) kwiigrafu ezincinci (ngaphantsi kweeverteksi ze-2400).
Kwaye okokugqibela kodwa okungakuncinananga - masidibanise i-vectorisation kunye ne-parallelism!
Ukuba unomdla kwisicelo esisebenzayo se-vectorisation, ndincoma ukuba ufunde uchungechunge lwezithuba zikaDan Shechter apho wafaka khona Array.Sort
(ezi ziphumo kamva zifumene uhlaziyo lwe-Garbage Collector kwi-.NET 5 ).
Uzalisekiso lokugqibela ludibanisa iinzame zongqamaniso kunye novetho kunye ... ukunyaniseka alunamntu 🙂 Kuba… ke, sisanda kutshintsha umzimba Parallel.For
kwi SpartialParallelOptimisations
kunye ne-vectorised loop evela kwi SpartialVectorOptimisations
:
public void SpartialParallelVectorOptimisations(int[] matrix, int sz) { for (var k = 0; k < sz; ++k) { Parallel.For(0, sz, i => { if (matrix[i * sz + k] == NO_EDGE) { return; } var ik_vec = new Vector<int>(matrix[i * sz + k]); var j = 0; for (; j < sz - Vector<int>.Count; j += Vector<int>.Count) { var ij_vec = new Vector<int>(matrix, i * sz + j); var ikj_vec = new Vector<int>(matrix, k * sz + j) + ik_vec; var lt_vec = Vector.LessThan(ij_vec, ikj_vec); if (lt_vec == new Vector<int>(-1)) { continue; } var r_vec = Vector.ConditionalSelect(lt_vec, ij_vec, ikj_vec); r_vec.CopyTo(matrix, i * sz + j); } for (; j < sz; ++j) { var distance = matrix[i * sz + k] + matrix[k * sz + j]; if (matrix[i * sz + j] > distance) { matrix[i * sz + j] = distance; } } }); } }
Zonke iziphumo zesi sithuba zithiwe thaca apha ngezantsi. Njengoko kulindelekile, ukusetyenziswa ngaxeshanye ukuhambelana kunye ne-vectorisation kubonise iziphumo ezilungileyo, ukunciphisa ixesha lokubala ukuya kumaxesha angama-25 (kwiigrafu ze-1200 vertexes) xa kuthelekiswa nokuphunyezwa kokuqala.
Indlela | sz | Ithetha | Impazamo | StdDev | Umlinganiselo |
---|---|---|---|---|---|
Isiseko | 300 | 27.525 ms | 0.1937 ms | 0.1617 ms | 1.00 |
SpartialOptimisation | 300 | 12.399 ms | 0.0943 ms | 0.0882 ms | 0.45 |
SpartialParallelOptimisations | 300 | 6.252 ms | 0.0211 ms | 0.0187 ms | 0.23 |
SpartialVectorOptimisations | 300 | 3.056 ms | 0.0301 ms | 0.0281 ms | 0.11 |
SpartialParallelVectorOptimisations | 300 | 3.008 ms | 0.0075 ms | 0.0066 ms | 0.11 |
Isiseko | 600 | 217.897 ms | 1.6415 ms | 1.5355 ms | 1.00 |
SpartialOptimisation | 600 | 99.122 ms | 0.8230 ms | 0.7698 ms | 0.45 |
SpartialParallelOptimisations | 600 | 35.825 ms | 0.1003 ms | 0.0837 ms | 0.16 |
SpartialVectorOptimisations | 600 | 24.378 ms | 0.4320 ms | 0.4041 ms | 0.11 |
SpartialParallelVectorOptimisations | 600 | 13.425 ms | 0.0319 ms | 0.0283 ms | 0.06 |
Isiseko | 1200 | 1,763.335 ms | 7.4561 ms | 6.2262 ms | 1.00 |
SpartialOptimisation | 1200 | 766.675 ms | 6.1147 ms | 5.7197 ms | 0.43 |
SpartialParallelOptimisations | 1200 | 248.801 ms | 0.6040 ms | 0.5043 ms | 0.14 |
SpartialVectorOptimisations | 1200 | 185.628 ms | 2.1240 ms | 1.9868 ms | 0.11 |
SpartialParallelVectorOptimisations | 1200 | 76.770 ms | 0.3021 ms | 0.2522 ms | 0.04 |
Isiseko | 2400 | 14,533.335 ms | 63.3518 ms | 52.9016 ms | 1.00 |
SpartialOptimisation | 2400 | 6,507.878 ms | 28.2317 ms | 26.4079 ms | 0.45 |
SpartialParallelOptimisations | 2400 | 2,076.403 ms | 20.8320 ms | 19.4863 ms | 0.14 |
SpartialVectorOptimisations | 2400 | 2,568.676 ms | 31.7359 ms | 29.6858 ms | 0.18 |
SpartialParallelVectorOptimisations | 2400 | 1,281.877 ms | 25.1127 ms | 64.8239 ms | 0.09 |
Isiseko | 4800 | 119,768.219 ms | 181.5514 ms | 160.9406 ms | 1.00 |
SpartialOptimisation | 4800 | 55,590.374 ms | 414.6051 ms | 387.8218 ms | 0.46 |
SpartialParallelOptimisations | 4800 | 15,614.506 ms | 115.6996 ms | 102.5647 ms | 0.13 |
SpartialVectorOptimisations | 4800 | 18,257.991 ms | 84.5978 ms | 79.1329 ms | 0.15 |
SpartialParallelVectorOptimisations | 4800 | 12,785.824 ms | 98.6947 ms | 87.4903 ms | 0.11 |
Kule post singene kancinci kweyona ngxaki imfutshane yezibini kwaye siphumeze i-algorithm ye-Floyd-Warshall kwi-C # ukuyisombulula. Siphinde sahlaziya ukuphunyezwa kwethu ukuhlonipha idatha kunye nokusebenzisa umgangatho ophantsi we-.NET kunye ne-hardware.
Kule post sidlale "zonke ngaphakathi". Nangona kunjalo, kwizicelo zobomi benene kubalulekile ukukhumbula - asodwa. I-Hardcore parallelization inokuthoba kakhulu ukusebenza kwenkqubo kwaye ibangele ingozi enkulu kunokulunga. I-Vectorisation kwelinye icala ikhuselekile ngakumbi ukuba yenziwa ngendlela ezimeleyo yeqonga. Khumbula ukuba eminye imiyalelo yeVector inokufumaneka kuphela kwii-CPU ezithile.
Ndiyathemba ukuba ukonwabele ukufunda kwaye wonwabe “ekucudiseni” amasuntswana okusebenza kwi-algorithm enemijikelo EMITHATHU nje 🙂