Researchers Build a GPU Engine That Simulates Brain Cells 1,500x Faster

Authors: Yichen Zhang, Gan He, Lei Ma, Xiaofei Liu, J. J. Johannes Hjorth, Alexander Kozlov, Yutao He, Shenjian Zhang, Jeanette Hellgren Kotaleski, Yonghong Tian, Sten Grillner, Kai Du, Tiejun Huang
Abstract
Biophysically detailed multi-compartment models are powerful tools for exploring the computational principles of the brain. However, their expensive computational cost severely limits applications in both neuroscience and AI. The major bottleneck in simulating detailed compartment models is the ability of a simulator to solve large systems of linear equations. Here we present a novel Dendritic Hierarchical Scheduling (DHS) method to markedly accelerate this process. We theoretically prove that the DHS implementation is computationally optimal and accurate. This GPU-based method runs 2-3 orders of magnitude faster than the classic serial Hines method on a conventional CPU platform. We build the DeepDendrite framework, which integrates the DHS method with the GPU computing engine of the NEURON simulator, and demonstrate applications of DeepDendrite in neuroscience tasks. We investigate how spatial patterns of spine inputs affect neuronal excitability in a detailed human pyramidal neuron model with 25,000 spines.
Introduction
Deciphering the coding and computational principles of neurons is essential to neuroscience. Mammalian brains comprise many different types of neurons with unique morphological and biophysical properties. Even though it is no longer conceptually true, the "point-neuron" doctrine [1], in which neurons are regarded as simple summing units, is still widely applied in neural computation, especially in neural network analysis.
In recent years, modern artificial intelligence (AI) has adopted this principle and developed powerful tools such as artificial neural networks (ANNs) [2]. However, in addition to comprehensive computations at the single-neuron level, subcellular compartments such as neuronal dendrites can also carry out nonlinear operations as independent computational units [3, 4, 5, 6, 7]. Furthermore, dendritic spines, the small protrusions that densely cover dendrites, can compartmentalize synaptic signals, allowing them to be separated from their parent dendrites ex vivo and in vivo [8, 9, 10, 11].
Simulations using biologically detailed neurons provide a theoretical framework for linking biological details to computational principles. The core of the biophysically detailed multi-compartment model framework [12, 13] allows us to model neurons with realistic dendritic morphologies, intrinsic ionic conductances, and extrinsic synaptic inputs. The backbone of the detailed multi-compartment model, i.e., the dendrites, is built on classical cable theory [12], which models the biophysical membrane properties of dendrites as passive cables and provides a mathematical description of how electrical signals invade and propagate through complex neuronal processes. By combining cable theory with active biophysical mechanisms such as ion channels and excitatory and inhibitory synaptic currents, a detailed multi-compartment model can capture cellular and subcellular computations beyond experimental limitations [4, 7].
Beyond their profound impact on neuroscience, biologically detailed neuron models have recently been used to bridge the gap between neuronal structural and biophysical details and AI.
Although ANNs trained with the "backpropagation-of-error" (backprop) algorithm have achieved remarkable performance in specialized applications, even beating top human professional players at Go and chess [14, 15], the human brain still outperforms ANNs in more dynamic and noisy environments [16, 17]. Recent theoretical studies suggest that dendritic integration is crucial for generating efficient learning algorithms [18, 19, 20]. Furthermore, a single detailed multi-compartment model can learn network-level nonlinear computations for point neurons by adjusting only the synaptic strength [21, 22], demonstrating the full potential of detailed models for building more powerful brain-like AI systems. It is therefore a high priority to expand brain-like AI paradigms from single detailed neuron models to large-scale biologically detailed networks.
One long-standing challenge of the detailed simulation approach is its exceedingly high computational cost, which has severely limited its application in neuroscience and AI. The major bottleneck of the simulation is solving the linear equations that arise from the foundational theories of detailed modeling [12, 23, 24]. To improve efficiency, the classic Hines method reduces the time complexity of solving these equations from O(n^3) to O(n) and is widely used as the core algorithm in popular simulators such as NEURON [25] and GENESIS [26]. However, this method processes the compartments serially, one after another. When a simulation involves many biophysically detailed dendrites with dendritic spines, the linear equation matrix (the "Hines matrix") grows with the number of dendrites or spines (Fig. 1e), and the Hines method is no longer practical because it places a very heavy burden on the entire simulation.
Fig. 1. a A layer-5 pyramidal neuron model and the mathematical formula used with detailed neuron models. b Workflow when numerically simulating detailed neuron models. The equation-solving phase is the bottleneck in the simulation. c An example of linear equations in the simulation.
d Data dependency of the Hines method when solving the linear equations in c. e The size of the Hines matrix scales with model complexity. The number of linear equations to be solved increases significantly as models become more detailed. f Computational cost (steps taken in the equation-solving phase) of the serial Hines method on different types of neuron models. g Illustration of different solving methods. In the parallel methods (middle, right), different parts of a neuron are assigned to multiple processing units, shown in different colors. In the serial method (left), all compartments are computed by one unit. h Computational cost of the three methods in g when solving the equations of a pyramidal model. i Run time of different methods when solving the equations of 500 pyramidal models. The run time is the time needed for 1 s of simulation (solving the equations 40,000 times with a time step of 0.025 ms). p-Hines: parallel method in CoreNEURON (on GPU); Branch-based: branch-based parallel method (on GPU); DHS: Dendritic Hierarchical Scheduling method (on GPU).
Over the past decades, tremendous progress has been made in speeding up the Hines method with cellular-level parallel methods, which parallelize the computation of different parts of each cell [27, 28, 29, 30, 31, 32]. However, current cellular-level parallel methods often lack an efficient parallelization strategy or lack sufficient numerical accuracy compared to the original Hines method.
Here, we develop a fully automatic, numerically accurate, and optimized simulation tool that can significantly improve computational efficiency and reduce computational cost. In addition, this simulation tool can be used to build and test neural networks with biological details for machine learning and AI applications.
Critically, we formulate the parallel computation of the Hines method as a mathematical scheduling problem and derive a Dendritic Hierarchical Scheduling (DHS) method based on combinatorial optimization [33] and parallel computing theory [34]. We demonstrate that our algorithm provides optimal scheduling without any loss of precision. Furthermore, we have optimized DHS for the currently most advanced GPU chips by leveraging the GPU memory hierarchy and memory access mechanisms. Together, these allow DHS to speed up computation (Supplementary Table 1) compared to the classic simulator NEURON [25] while maintaining identical accuracy.
To enable detailed dendritic simulations for use in AI, we next establish the DeepDendrite framework by integrating the DHS-embedded CoreNEURON platform (an optimized compute engine for NEURON) [35] as the simulation engine with two auxiliary modules (an I/O module and a learning module) that support dendritic learning algorithms during simulations. DeepDendrite runs on the GPU hardware platform and supports both regular simulation tasks in neuroscience and learning tasks in AI.
Last but not least, we present several applications of DeepDendrite that target critical questions in neuroscience and AI: (1) We demonstrate how spatial patterns of dendritic spine inputs affect neuronal activity in neurons with spines distributed throughout the dendritic tree. DeepDendrite enables us to study neuronal computation in a simulated human pyramidal neuron model with ~25,000 dendritic spines. (2) In the Discussion, we consider DeepDendrite in the context of AI, specifically networks built from morphologically detailed human pyramidal neurons. All source code for DeepDendrite, the full-spine models, and the detailed dendritic network model are available online (see Code Availability).
Our open-source learning framework can be readily integrated with other dendritic learning rules, such as learning rules for nonlinear (full-active) dendrites [21], burst-dependent synaptic plasticity [20], and learning with spike prediction [36].
Results
Dendritic Hierarchical Scheduling (DHS) method
Computing ionic currents and solving linear equations are the two critical phases in simulating biophysically detailed neurons; both are time-consuming and impose a severe computational burden. Fortunately, computing the ionic currents of each compartment is a fully independent process, so it can be naturally parallelized on devices with massively parallel computing units such as GPUs [37]. As a consequence, solving the linear equations becomes the remaining bottleneck for parallelization (Fig. 1).
To tackle this bottleneck, cellular-level parallel methods have been developed, which accelerate single-cell computation by "splitting" a single cell into several compartments that can be computed in parallel [27, 28, 38]. However, such methods rely heavily on prior knowledge to generate a practical strategy for splitting a single neuron (Fig. 1g-i; Supplementary Fig. 1). They therefore become less efficient for neurons with asymmetrical morphologies, e.g., pyramidal neurons and Purkinje neurons.
We aim to develop a more efficient and precise parallel method for simulating biologically detailed neural networks. First, we establish criteria for the accuracy of a cellular-level parallel method. Based on parallel computing theory [34], we propose three conditions that such a parallel method should satisfy.
Based on simulation accuracy and computational cost, we formulate the parallelization problem as a mathematical scheduling problem (see Methods). In simple terms, we view a single neuron as a tree with many nodes (compartments). With k parallel threads, we can compute at most k nodes at each step, but a node can be computed only after all of its children nodes have been processed; our goal is to find a strategy that completes the whole procedure in the minimum number of steps. To generate an optimal partition, we propose a method called Dendritic Hierarchical Scheduling (DHS).
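To make the data dependency concrete, the following is a minimal Python sketch of a serial Hines-style solve on a tree-structured system. It is an illustration under stated assumptions, not the NEURON or DeepDendrite implementation: the array names (`parent`, `diag`, `upper`, `lower`, `rhs`), the node numbering (node 0 is the root and every parent index is smaller than its child's index), and the small example tree are all hypothetical. The backward loop shows why a node can be folded into its parent only after its own children have been handled, which is the constraint the scheduling problem has to respect.

```python
def hines_solve(parent, diag, upper, lower, rhs):
    """Solve a tree-structured linear system in O(n) steps (serial sketch).

    Assumed layout (hypothetical, for illustration only):
      parent[i] -- index of node i's parent; parent[0] == -1 (root) and
                   parent[i] < i for every other node
      diag[i]   -- diagonal entry A[i][i]
      upper[i]  -- entry A[parent[i]][i]
      lower[i]  -- entry A[i][parent[i]]
      rhs[i]    -- right-hand side entry
    """
    n = len(parent)
    d = list(diag)
    r = list(rhs)
    # Triangularization: sweep from the leaves toward the root. Children have
    # larger indices than their parent, so d[i] is final when node i is used.
    for i in range(n - 1, 0, -1):
        p = parent[i]
        factor = upper[i] / d[i]
        d[p] -= factor * lower[i]
        r[p] -= factor * r[i]
    # Back-substitution: sweep from the root toward the leaves.
    x = [0.0] * n
    x[0] = r[0] / d[0]
    for i in range(1, n):
        x[i] = (r[i] - lower[i] * x[parent[i]]) / d[i]
    return x


def max_residual(parent, diag, upper, lower, rhs, x):
    """Largest absolute entry of A @ x - rhs, computed from the tree layout."""
    res = [diag[i] * x[i] - rhs[i] for i in range(len(parent))]
    for i in range(1, len(parent)):
        p = parent[i]
        res[i] += lower[i] * x[p]  # contribution of A[i][p] * x[p]
        res[p] += upper[i] * x[i]  # contribution of A[p][i] * x[i]
    return max(abs(v) for v in res)


# Hypothetical 6-compartment tree: node 0 is the root (soma).
parent = [-1, 0, 0, 1, 1, 2]
diag = [4.0] * 6
upper = [0.0, -1.0, -1.0, -1.0, -1.0, -1.0]
lower = [0.0, -1.0, -1.0, -1.0, -1.0, -1.0]
rhs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

x = hines_solve(parent, diag, upper, lower, rhs)
print("max residual:", max_residual(parent, diag, upper, lower, rhs, x))
```

In this serial form the elimination loop visits the n - 1 non-root nodes one at a time; a parallel schedule has to preserve the same children-before-parent order while processing several nodes per step.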
The key idea of DHS is to prioritize deep nodes (Fig. 2a). The DHS method consists of two steps, analyzing the dendritic topology and finding the best partition: (1) Given a detailed model, we first obtain its corresponding tree and calculate the depth of each node on the tree (the depth of a node is the number of its ancestor nodes) (Fig. 2b, c). (2) After topology analysis, we search the candidates and pick at most k deepest candidate nodes (a node is a candidate only if all of its children nodes have been processed) (Fig. 2d).
Fig. 2. a DHS workflow. DHS processes the k deepest candidate nodes in each iteration. b Illustration of calculating node depth of a compartmental model. The model is first converted to a tree structure, then the depth of each node is computed. Colors indicate different depth values. c Topology analysis on different neuron models. Neurons with distinct morphologies are shown here. For each model, the soma is selected as the root of the tree, so node depth increases from the soma (0) to the distal dendrites. d Illustration of performing DHS on the model in b with four threads. Candidates: nodes that can be processed. Selected candidates: nodes that are picked by DHS, i.e., the k deepest candidates. Processed nodes: nodes that have been processed before. e Parallelization strategy obtained by DHS after the process in d. DHS reduces the number of serial node-processing steps from 14 to 5 by distributing nodes across multiple threads. f Relative cost, i.e., the ratio of the computational cost of DHS to that of the serial Hines method, when applying DHS with different numbers of threads to different types of models.
Take a simplified model with 15 compartments as an example. Using the serial Hines method, it takes 14 steps to process all nodes.
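The two DHS steps described above can be sketched in Python as follows. This is a hedged illustration, not the authors' code: the function names, the `parent` array encoding, and the 15-node example tree are hypothetical (the tree is not the one shown in Fig. 2), ties between equally deep candidates are broken arbitrarily, and the root is left out of the schedule on the assumption that only the 14 non-root nodes are eliminated, which matches the 14 serial steps quoted for a 15-compartment model.

```python
def node_depths(parent):
    """Depth of each node = number of its ancestors (the root has depth 0).

    Assumes parent[0] == -1 and parent[i] < i for all other nodes.
    """
    depth = [0] * len(parent)
    for i in range(1, len(parent)):
        depth[i] = depth[parent[i]] + 1
    return depth


def dhs_schedule(parent, k):
    """Greedy schedule: at each step pick at most k deepest candidates.

    A node is a candidate once all of its children have been processed.
    Returns a list of steps; each step is a list of node indices that can
    be processed in parallel.
    """
    n = len(parent)
    depth = node_depths(parent)
    pending_children = [0] * n
    for i in range(1, n):
        pending_children[parent[i]] += 1
    # Leaves are the initial candidates (the root, node 0, is excluded).
    candidates = [i for i in range(1, n) if pending_children[i] == 0]
    steps = []
    while candidates:
        candidates.sort(key=lambda i: depth[i], reverse=True)
        chosen, candidates = candidates[:k], candidates[k:]
        steps.append(chosen)
        for i in chosen:
            p = parent[i]
            pending_children[p] -= 1
            if p != 0 and pending_children[p] == 0:
                candidates.append(p)  # parent becomes a candidate
    return steps


# Hypothetical 15-compartment tree (node 0 is the soma/root).
example_parent = [-1, 0, 0, 1, 1, 2, 2, 3, 3, 4, 5, 6, 7, 9, 10]
steps = dhs_schedule(example_parent, k=4)
for t, step in enumerate(steps):
    print("step", t, "->", step)
```

With `k=1` the same function degenerates to a serial order with one node per step, which is a convenient sanity check when experimenting with other trees.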
Using DHS with four parallel units, its nodes can be partitioned into five subsets (Fig. 2d): {{9,10,12,14}, {1,7,11,13}, {2,3,4,8}, {6}, {5}}. Because the nodes in the same subset can be processed in parallel, only five steps are needed to process all nodes with DHS (Fig. 2e).
Next, we apply the DHS method to six representative detailed neuron models (selected from ModelDB [39]) with different numbers of threads (Fig. 2f), including cortical and hippocampal pyramidal neurons [40, 41, 42], cerebellar Purkinje neurons [43], striatal projection neurons (SPN) [44], and olfactory bulb mitral cells [45], covering the major principal neurons in sensory, cortical, and subcortical areas. We then measure the computational cost. The relative computational cost is defined here as the ratio of the computational cost of DHS to that of the serial Hines method. The computational cost, i.e., the number of steps taken in solving the equations, drops as the number of threads increases. For example, with 16 threads, the computational cost of DHS is 7%-10% of that of the serial Hines method. Intriguingly, the DHS method reaches the lower bound of its computational cost with 16 or even 8 parallel threads (Fig. 2f), suggesting that adding more threads does not improve performance further because of the dependencies between compartments.
Together, we have developed a DHS method that enables automated analysis of the dendritic topology and optimal partitioning for parallel computing. DHS finds the optimal partition before the simulation starts.
GPU memory boosting speeds up DHS
DHS computes each neuron with multiple threads, which consumes a vast number of threads when running neural network simulations. Graphics Processing Units (GPUs) consist of massive numbers of processing units (i.e., streaming processors, SPs, Fig. 3a, b) for parallel computing [46]. In theory, many SPs on the GPU should support efficient simulation for large-scale neural networks (Fig. 3c).
However, we consistently observed that the efficiency of DHS decreased significantly as the network size grew, which might result from scattered data storage or from the extra memory accesses caused by loading and writing intermediate results (Fig. 3d).
Fig. 3. a GPU architecture and its memory hierarchy. Each GPU contains massive numbers of processing units (stream processors). Different types of memory have different throughput. b Architecture of Streaming Multiprocessors (SMs). Each SM contains multiple streaming processors, registers, and L1 cache. c Applying DHS to two neurons. During simulation, each thread executes on one stream processor. d Memory optimization strategy on the GPU. Top panel: thread assignment and data storage of DHS before (left) and after (right) memory boosting. Bottom: an example of a single step in triangularization when simulating the two neurons in d. Processors send data requests to load data for each thread from global memory. Without memory boosting (left), it takes seven transactions to load all requested data, plus some extra transactions for intermediate results. With memory boosting (right), it takes only two transactions to load all requested data. e Run time of DHS (32 threads per cell) with and without memory boosting on multiple layer-5 pyramidal models with spines. f Speedup from memory boosting on multiple layer-5 pyramidal models. Memory boosting brings a 1.6-2 times speedup.
We address this problem with GPU memory boosting, which builds on the memory hierarchy and memory access mechanism of the GPU [46, 47].
To achieve high throughput, we first align the computing orders of the nodes and rearrange the threads according to the number of nodes on them. Then we permute the data storage in global memory to be consistent with the computing orders, i.e., nodes that are processed at the same step are stored successively in global memory. Moreover, we use GPU registers to store intermediate results. The example shows that memory boosting needs only two memory transactions to load eight requested data items (Fig. 3d). Furthermore, experiments on multiple numbers of pyramidal neurons with spines and on the typical neuron models (Fig. 3e, f; Supplementary Fig. 2) show that memory boosting further speeds up DHS.
To comprehensively test the performance of DHS with GPU memory boosting, we select six typical neuron models and evaluate the run time of solving the cable equations on massive numbers of each model (Fig. 4). Compared to the GPU method in CoreNEURON, DHS-4 and DHS-16 achieve about 5-fold and 15-fold speedups, respectively (Fig. 4a). Moreover, compared to the conventional serial Hines method in NEURON running on a single CPU thread, DHS speeds up the simulation by 2-3 orders of magnitude (Supplementary Fig. 3) while retaining identical numerical accuracy in the presence of dense spines (Supplementary Figs. 4 and 8), active dendrites (Supplementary Fig. 7), and different segmentation strategies (Supplementary Fig. 7).
Fig. 4. a Run time of solving the equations for a 1 s simulation on the GPU (dt = 0.025 ms, 40,000 iterations in total). CoreNEURON: the parallel method used in CoreNEURON; DHS-4: DHS with four threads per neuron; DHS-16: DHS with 16 threads per neuron. b, c Visualization of the partition by DHS-4 and DHS-16; each color indicates a single thread. During computation, each thread switches among different branches.
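As a toy illustration of the storage permutation used in memory boosting (described earlier in this section), the sketch below counts how many memory transactions one batch of loads would need. It is a deliberate simplification and not how a GPU or DeepDendrite accounts for memory traffic: it assumes a hypothetical memory that serves one aligned segment of `SEGMENT_SIZE` consecutive elements per transaction, and the schedule and node indices are made up. It shows only the idea stated in the text: when nodes processed at the same step are stored successively, the same eight loads touch fewer segments.

```python
SEGMENT_SIZE = 4  # hypothetical: elements served by one memory transaction


def count_transactions(addresses, segment_size=SEGMENT_SIZE):
    """Number of distinct aligned segments touched by one batch of loads."""
    return len({addr // segment_size for addr in addresses})


def permute_storage(steps):
    """Assign new storage slots so that nodes processed in the same step
    are stored next to each other, in the order the steps are executed."""
    new_slot = {}
    for step in steps:
        for node in step:
            new_slot[node] = len(new_slot)
    return new_slot


# Hypothetical schedule: 16 nodes, 8 threads, so 8 loads per step.
# Before permutation, a node's address is simply its original index.
schedule = [
    [1, 3, 5, 6, 8, 11, 12, 14],
    [0, 2, 4, 7, 9, 10, 13, 15],
]

before = [count_transactions(step) for step in schedule]
slots = permute_storage(schedule)
after = [count_transactions([slots[node] for node in step]) for step in schedule]

print("transactions per step before permutation:", before)
print("transactions per step after permutation: ", after)
```

On real hardware the segment width, alignment rules, and caching behavior differ, so the counts here are only meant to show the direction of the effect.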
DHS creates cell-type-specific optimal partitioning
To gain insights into the working mechanism of the DHS method, we visualized the partitioning process by mapping compartments to each thread (each color represents a single thread in Fig. 4b, c). The visualization shows that a single thread frequently switches among different branches (Fig. 4b, c). Interestingly, DHS generates aligned partitions in morphologically symmetric neurons such as the striatal projection neuron (SPN) and the mitral cell (Fig. 4b, c). By contrast, it generates fragmented partitions in morphologically asymmetric neurons such as pyramidal neurons and the Purkinje cell (Fig. 4b, c), indicating that DHS splits the neural tree at the scale of individual compartments (i.e., tree nodes) rather than at the scale of branches. This cell-type-specific, fine-grained partition enables DHS to fully exploit all available threads.
In summary, DHS and memory boosting together provide a theoretically proven optimal solution for solving linear equations in parallel with high efficiency. Using this principle, we built the DeepDendrite platform, which neuroscientists can use to implement models without any specific GPU programming knowledge. Below, we demonstrate how DeepDendrite can be used in neuroscience tasks. The potential of DeepDendrite for AI-related tasks is addressed in the Discussion section.
DHS enables spine-level modeling
Dendritic spines receive most of the excitatory input to neurons [10, 48, 49, 50, 51]. However, spines are too small (~1 μm in length) for their voltage-dependent processes to be measured directly in experiments. Theoretical work is therefore critical for a full understanding of spine computations.
We can model a single spine with two compartments: the spine head, where the synapse is located, and the spine neck, which links the spine head to the dendrite [52]. Theory predicts that the very thin spine neck (0.1-0.5 μm in diameter) electrically isolates the spine head from its parent dendrite, thus compartmentalizing the signals generated at the spine head [53]. However, the detailed model with fully distributed spines on dendrites ("full-spine model") is computationally very expensive. A common compromise is to modify the capacitance and resistance of the membrane by a factor F [54] instead of modeling all spines explicitly. Here, the F factor approximates the effect of spines on the biophysical properties of the cell membrane [54].
Inspired by the previous work of Eyal et al. [51], we investigated how different spatial patterns of excitatory inputs formed on dendritic spines shape neuronal activity in a human pyramidal neuron model with explicitly modeled spines (Fig. 5a). Notably, Eyal et al. employed the F factor to incorporate spines into dendrites, while only a few activated spines were explicitly attached to dendrites ("few-spine model" in Fig. 5a). The value of F in their model was computed from the dendritic area and spine area in the reconstructed data.
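The F-factor compromise can be sketched in a few lines of Python. This is an illustration under assumptions rather than the procedure used by Eyal et al. or by DeepDendrite: the definition F = (dendritic area + spine area) / dendritic area and the convention of multiplying the specific capacitance by F while dividing the specific membrane resistance by F are a commonly used form that is consistent with the description above, and every numerical value below is a hypothetical placeholder.

```python
def spine_factor(dendritic_area_um2, spine_area_um2):
    """F factor: total membrane area relative to the spine-free dendrite.

    Assumed definition: F = (A_dendrite + A_spines) / A_dendrite.
    """
    return (dendritic_area_um2 + spine_area_um2) / dendritic_area_um2


def fold_spines_into_membrane(cm, rm, f):
    """Fold spine membrane into the dendrite instead of modeling spines.

    Assumed convention: specific capacitance is scaled up by F and
    specific membrane resistance is scaled down by F.
    """
    return cm * f, rm / f


def spine_count(total_dendritic_length_um, density_per_um):
    """Number of explicit spines implied by a linear spine density."""
    return round(total_dendritic_length_um * density_per_um)


# Hypothetical numbers, for illustration only.
f = spine_factor(dendritic_area_um2=10000.0, spine_area_um2=9000.0)
cm_eff, rm_eff = fold_spines_into_membrane(cm=1.0, rm=20000.0, f=f)
n_spines = spine_count(total_dendritic_length_um=1000.0, density_per_um=1.3)

print("F =", f)
print("effective cm, rm =", cm_eff, rm_eff)
print("explicit spines on 1000 um of dendrite:", n_spines)
```

A full-spine model replaces this global rescaling with one explicit head-and-neck pair per spine, which is what drives the compartment count, and hence the size of the Hines matrix, up so sharply.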
Accordingly, we calculated the spine density from their reconstructed data to make our full-spine model more consistent with Eyal's few-spine model. With the spine density set to 1.3 μm^-1, the pyramidal neuron model contained about 25,000 spines without altering the model's original morphological and biophysical properties. Further, we repeated the previous experimental protocols with both full-spine and few-spine models. We used the same synaptic input as in Eyal's work but attached extra background noise to each sample. By comparing the somatic traces (Fig. 5b, c) and spike probability (Fig. 5d) in full-spine and few-spine models, we found that the full-spine model is much leakier than the few-spine model. In addition, the spike probability triggered by the activation of clustered spines appeared to be more nonlinear in the full-spine model (the solid blue line in Fig. 5d) than in the few-spine model (Fig. 5d). These results indicate that the conventional F-factor method may underestimate the impact of dense spines on the computations of dendritic excitability and nonlinearity.
Fig. 5. a Experiment setup. We examine two major types of models: few-spine models and full-spine models. Few-spine models (two on the left) incorporate spine area globally into dendrites and attach individual spines only together with activated synapses. In full-spine models (two on the right), all spines are explicitly attached over the whole dendritic tree. We explore the effects of clustered and randomly distributed synaptic inputs on the few-spine models and the full-spine models, respectively. b Somatic voltages recorded for the cases in a. Colors of the voltage curves correspond to a. Scale bar: 20 ms, 20 mV. c Color-coded voltages during the simulation in b at specific times. Colors indicate the magnitude of the voltage. d Somatic spike probability as a function of the number of simultaneously activated synapses (as in Eyal et al.'s work) for the four cases in a.
Background noise is attached. e Run time of the experiments in d with different simulation methods. NEURON: conventional NEURON simulator running on a single CPU core. CoreNEURON: CoreNEURON simulator on a single GPU. DeepDendrite: DeepDendrite on a single GPU.
In the DeepDendrite platform, both full-spine and few-spine models achieved an 8-fold speedup compared to CoreNEURON on the GPU platform and a 100-fold speedup compared to serial NEURON on the CPU platform (Fig. 5e; Supplementary Table 1) while producing identical simulation results (Supplementary Figs. 4 and 8). Therefore, the DHS method enables explorations of dendritic excitability under more realistic anatomical conditions.
Discussion
In this work, we propose the DHS method to parallelize the computation of the Hines method [55], and we mathematically demonstrate that DHS provides an optimal solution without any loss of precision. Next, we implement DHS on the GPU hardware platform and use GPU memory boosting techniques to refine it (Fig. 3). When simulating a large number of neurons with complex morphologies, DHS with memory boosting achieves a 15-fold speedup (Supplementary Table 1) compared to the GPU method used in CoreNEURON and up to a 1,500-fold speedup compared to the serial Hines method on the CPU platform (Fig. 4; Supplementary Fig. 3 and Supplementary Table 1). Furthermore, we develop the GPU-based DeepDendrite framework by integrating DHS into CoreNEURON. Finally, as a demonstration of the capacity of DeepDendrite, we present a representative application: examining spine computations in a detailed pyramidal neuron model with 25,000 spines. Further in this section, we elaborate on how we have expanded the DeepDendrite framework to enable efficient training of biophysically detailed neural networks. To explore the hypothesis that dendrites improve robustness against adversarial attacks [56], we train our network on typical image classification tasks.
We show that DeepDendrite can support both neuroscience simulations and AI-related detailed neural network tasks with unprecedented speed, therefore significantly promoting detailed neuroscience simulations and potentially future AI explorations.
Decades of effort have been invested in speeding up the Hines method with parallel methods. Early work mainly focused on network-level parallelization. In network simulations, each cell independently solves its corresponding linear equations with the Hines method. Network-level parallel methods distribute a network over multiple threads and parallelize the computation of each cell group with each thread [57, 58]. With network-level methods, we can simulate detailed networks on clusters or supercomputers [59]. In recent years, GPUs have been used for detailed network simulation. Because the GPU contains massive numbers of computing units, one thread is usually assigned one cell rather than a cell group [35, 60, 61]. With further optimization, GPU-based methods achieve much higher efficiency in network simulation. However, the computation inside each cell is still serial in network-level methods, so they still cannot cope when the "Hines matrix" of each cell grows large.
Cellular-level parallel methods further parallelize the computation inside each cell. The main idea of cellular-level parallel methods is to split each cell into several sub-blocks and parallelize the computation of those sub-blocks [27, 28]. However, typical cellular-level methods (e.g., the "multi-split" method [28]) pay less attention to the parallelization strategy. The lack of a fine parallelization strategy results in unsatisfactory performance. To achieve higher efficiency, some studies try to obtain finer-grained parallelization by introducing extra computation operations [29, 38, 62] or by making approximations on some crucial compartments while solving the linear equations [63, 64].
These finer-grained parallelization strategies achieve higher efficiency but lack the numerical accuracy of the original Hines method.
Unlike previous methods, DHS adopts the finest-grained parallelization strategy, i.e., compartment-level parallelization. By modeling the problem of "how to parallelize" as a combinatorial optimization problem, DHS provides an optimal compartment-level parallelization strategy. Moreover, DHS does not introduce any extra operation or value approximation, so it achieves the lowest computational cost while retaining the numerical accuracy of the original Hines method.
Dendritic spines are the most abundant microstructures in the brain for projection neurons in the cortex, hippocampus, cerebellum, and basal ganglia. As spines receive most of the excitatory inputs in the central nervous system, electrical signals generated by spines are the main driving force for large-scale neuronal activities in the forebrain and cerebellum [10, 11]. The structure of the spine, with an enlarged spine head and a very thin spine neck, leads to surprisingly high input impedance at the spine head, which could be up to 500 MΩ according to studies combining experimental data with the detailed compartment modeling approach [48, 65]. Due to such high input impedance, a single synaptic input can evoke a "gigantic" EPSP (~20 mV) at the spine-head level [48, 66], thereby boosting NMDA currents and ion channel currents in the spine [11]. However, in classic detailed compartment models, all spines are replaced by the F coefficient, which modifies the dendritic cable geometries [54].
On the other hand, the spine's electrical compartmentalization is always accompanied by biochemical compartmentalization [8, 52, 67], resulting in a drastic increase of internal [Ca2+] within the spine and a cascade of molecular processes involving synaptic plasticity that is important for learning and memory. Intriguingly, the biochemical process triggered by learning in turn remodels the spine's morphology, enlarging (or shrinking) the spine head or elongating (or shortening) the spine neck, which significantly alters the spine's electrical capacity [67, 68, 69, 70]. Such experience-dependent changes in spine morphology, also referred to as "structural plasticity", have been widely observed in the visual cortex [71, 72], somatosensory cortex [73, 74], motor cortex [75], hippocampus [9], and the basal ganglia [76] in vivo. They play a critical role in motor and spatial learning as well as memory formation. However, due to the computational costs, nearly all detailed network models use the "F-factor" approach to replace actual spines and are thus unable to explore spine functions at the system level.
By taking advantage of our framework and the GPU platform, we can run a few thousand detailed neuron models, each with tens of thousands of spines, on a single GPU, while remaining ~100 times faster than the traditional serial method on a single CPU (Fig. 5e). Therefore, it enables us to explore structural plasticity in large-scale circuit models across diverse brain regions.

Another critical issue is how to link dendrites to brain functions at the systems/network level. It has been well established that dendrites can perform comprehensive computations on synaptic inputs due to enriched ion channels and local biophysical membrane properties [5, 6, 7]. For example, cortical pyramidal neurons can carry out sublinear synaptic integration at the proximal dendrite but progressively shift to supralinear integration at the distal dendrite [77]. Moreover, distal dendrites can produce regenerative events such as dendritic sodium spikes, calcium spikes, and NMDA spikes/plateau potentials [6, 78]. Such dendritic events are widely observed in mouse [6] or even human [79] cortical neurons in vitro, which may offer various logical operations [6, 79] or gating functions [80, 81]. Recently, in vivo recordings in awake or behaving mice have provided strong evidence that dendritic spikes/plateau potentials are crucial for orientation selectivity in the visual cortex [82], sensory-motor integration in the whisker system [83, 84], and spatial navigation in the hippocampal CA1 region [85].

To establish the causal link between dendrites and animal (including human) patterns of behavior, large-scale biophysically detailed neural circuit models are a powerful computational tool. However, running a large-scale detailed circuit model of 10,000-100,000 neurons generally requires the computing power of supercomputers. It is even more challenging to optimize such models for in vivo data, as this requires iterative simulations of the models.
The DeepDendrite framework can directly support many state-of-the-art large-scale circuit models [86, 87, 88], which were initially developed based on NEURON. Moreover, using our framework, a single GPU card such as Tesla A100 could easily support the operation of detailed circuit models of up to 10,000 neurons, thereby providing carbon-efficient and affordable plans for ordinary labs to develop and optimize their own large-scale detailed models.

... and exploring full learning potentials on more realistic neurons. However, there is a trade-off between model size and biological detail, as increasing the network scale often sacrifices neuron-level complexity [19, 20, 89]. Moreover, more detailed neuron models are less mathematically tractable and computationally expensive [21].

There has also been progress on the role of active dendrites in ANNs for computer vision tasks. Iyer et al. [90] proposed a novel ANN architecture with active dendrites, demonstrating competitive results in multi-task and continual learning. Jones and Kording [91] used a binary tree to approximate dendrite branching and provided valuable insights into the influence of tree structure on single neurons’ computational capacity. Bird et al. [92] proposed a dendritic normalization rule based on biophysical behavior, offering an interesting perspective on the contribution of dendritic arbor structure to computation. While these studies offer valuable insights, they primarily rely on abstractions derived from spatially extended neurons, and do not fully exploit the detailed biological properties and spatial information of dendrites. Further investigation is needed to unveil the potential of leveraging more realistic neuron models for understanding the shared mechanisms underlying brain computation and deep learning.
To address these challenges, we developed DeepDendrite, a tool that uses the Dendritic Hierarchical Scheduling (DHS) method to reduce computational cost and includes an I/O module for handling large datasets. Using DeepDendrite, we successfully built a three-layer hybrid neural network, the Human Pyramidal Cell Network (HPC-Net) (Fig. 6a, b). This network demonstrated efficient training capabilities in image classification tasks, achieving approximately 25 times speedup compared to training on a traditional CPU-based platform (Fig. 6f; Supplementary Table 1).

Fig. 6
a The illustration of the Human Pyramidal Cell Network (HPC-Net) for image classification. Images are transformed to spike trains and fed into the network model. Learning is triggered by error signals propagated from soma to dendrites.
b Training with mini-batch. Multiple networks are simulated simultaneously with different images as inputs. The total weight updates ΔW are computed as the average of ΔWi from each network.
c Comparison of the HPC-Net before and after training. Left, the visualization of hidden neuron responses to a specific input before (top) and after (bottom) training. Right, hidden layer weights (from input to hidden layer) distribution before (top) and after (bottom) training.
d Workflow of the transfer adversarial attack experiment. We first generate adversarial samples of the test set on a 20-layer ResNet. Then we use these adversarial samples (noisy images) to test the classification accuracy of models trained with clean images.
e Prediction accuracy of each model on adversarial samples after training 30 epochs on MNIST (left) and Fashion-MNIST (right) datasets.
f Run time of training and testing for the HPC-Net. The batch size is set to 16. Left, run time of training one epoch. Right, run time of testing.
Parallel NEURON + Python: training and testing on a single CPU with multiple cores, using 40-process-parallel NEURON to simulate the HPC-Net and extra Python code to support mini-batch training. DeepDendrite: training and testing the HPC-Net on a single GPU with DeepDendrite.

Additionally, it is widely recognized that the performance of artificial neural networks (ANNs) can be undermined by adversarial attacks [93], i.e., intentionally engineered perturbations devised to mislead ANNs. Intriguingly, an existing hypothesis suggests that dendrites and synapses may innately defend against such attacks [56]. Our experimental results with HPC-Net lend support to this hypothesis: networks endowed with detailed dendritic structures showed somewhat increased resilience to transfer adversarial attacks [94] compared to standard ANNs on the MNIST [95] and Fashion-MNIST [96] datasets (Fig. 6d, e). This evidence implies that the inherent biophysical properties of dendrites could be pivotal in augmenting the robustness of ANNs against adversarial interference. Nonetheless, further studies are needed to validate these findings on more challenging datasets such as ImageNet [97].

In conclusion, DeepDendrite has shown remarkable potential in image classification tasks and opens up several future directions. To further advance DeepDendrite and the application of biologically detailed dendritic models in AI tasks, we may focus on developing multi-GPU systems and exploring applications in other domains, such as Natural Language Processing (NLP), where dendritic filtering properties align well with the inherently noisy and ambiguous nature of human language. Challenges include testing scalability in larger-scale problems, understanding performance across various tasks and domains, and addressing the computational complexity introduced by novel biological principles, such as active dendrites.
By overcoming these limitations, we can further advance the understanding and capabilities of biophysically detailed dendritic neural networks, potentially uncovering new advantages, enhancing their robustness against adversarial attacks and noisy inputs, and ultimately bridging the gap between neuroscience and modern AI.

Methods

Simulation with DHS

The CoreNEURON simulator [35] (https://github.com/BlueBrain/CoreNeuron) uses the NEURON [25] architecture and is optimized for both memory usage and computational speed. We implemented our Dendritic Hierarchical Scheduling (DHS) method in the CoreNEURON environment by modifying its source code. All models that can be simulated on GPU with CoreNEURON can also be simulated with DHS by executing the following command:

    coreneuron_exec -d /path/to/models -e time --cell-permute 3 --cell-nthread 16 --gpu

The usage options are as in Table 1.

Accuracy of the simulation using cellular-level parallel computation

To ensure the accuracy of the simulation, we first need to define the correctness of a cellular-level parallel algorithm, so that we can judge whether it generates solutions identical to those of proven serial methods such as the Hines method used in the NEURON simulation platform. Based on the theories in parallel computing [34], a parallel algorithm will yield a result identical to that of its corresponding serial algorithm if and only if the data processing order in the parallel algorithm is consistent with the data dependency in the serial method. The Hines method has two symmetrical phases: triangularization and back-substitution. By analyzing the serial Hines method [55], we find that its data dependency can be formulated as a tree structure, where the nodes of the tree represent the compartments of the detailed neuron model. In the triangularization process, the value of each node depends on its children nodes. In contrast, during the back-substitution process, the value of each node depends on its parent node (Fig. 1d).
Thus, we can compute nodes on different branches in parallel, as their values do not depend on each other.

Based on the data dependency of the serial Hines method, we propose three conditions to make sure a parallel method will yield solutions identical to those of the serial Hines method: (1) The tree morphology and initial values of all nodes are identical to those in the serial Hines method; (2) In the triangularization phase, a node can be processed if and only if all its children nodes are already processed; (3) In the back-substitution phase, a node can be processed only if its parent node is already processed. Once a parallel computing method satisfies these three conditions, it will produce solutions identical to those of the serial computing method.

Computational cost of cellular-level parallel computing method

To theoretically evaluate the run time, i.e., efficiency, of the serial and parallel computing methods, we introduce and formulate the concept of computational cost as follows: given a tree $T$ and $k$ threads (basic computational units) to perform triangularization, parallel triangularization is equivalent to dividing the node set $V$ of $T$ into $n$ subsets, i.e., $V = \{V_1, V_2, \ldots, V_n\}$, where the size of each subset $|V_i| \le k$, i.e., at most $k$ nodes can be processed at each step because there are only $k$ threads. The triangularization phase follows the order $V_1 \to V_2 \to \ldots \to V_n$, and nodes in the same subset $V_i$ can be processed in parallel. So, we define $|V|$ (the size of set $V$, i.e., the number of subsets $n$ here) as the computational cost of the parallel computing method. In short, we define the computational cost of a parallel method as the number of steps it takes in the triangularization phase. Because back-substitution is symmetrical with triangularization, the total cost of the entire equation-solving phase is twice that of the triangularization phase.
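The cost definition above can be illustrated with a minimal Python sketch. It is not taken from DeepDendrite or CoreNEURON: the function names `schedule_cost` and `lower_bound`, the dictionary-based tree encoding, and the seven-node toy tree are all assumptions made for illustration. The sketch checks that a given partition respects the thread limit and the children-first rule of condition (2), then reports the number of triangularization steps.

```python
from math import ceil


def schedule_cost(parent, partition, k):
    """Validate a triangularization schedule and return its cost (number of steps).

    parent    -- dict: node -> parent node (the root maps to None)
    partition -- list of subsets V_1..V_n, processed in this order
    k         -- number of threads; each subset may hold at most k nodes
    """
    children = {node: [] for node in parent}
    for node, par in parent.items():
        if par is not None:
            children[par].append(node)

    done = set()
    for step, subset in enumerate(partition, start=1):
        if len(subset) > k:
            raise ValueError(f"step {step} uses more than {k} threads")
        for node in subset:
            # Condition (2): all children must sit in an earlier subset.
            if any(child not in done for child in children[node]):
                raise ValueError(f"node {node} scheduled before its children")
        done.update(subset)

    if sum(len(s) for s in partition) != len(parent) or done != set(parent):
        raise ValueError("partition must contain every node exactly once")
    return len(partition)


def lower_bound(parent, k):
    """No valid schedule can beat max(ceil(|V| / k), longest root-to-leaf path)."""
    def path_len(node):
        length = 0
        while node is not None:
            length += 1
            node = parent[node]
        return length
    return max(ceil(len(parent) / k), max(path_len(n) for n in parent))


# Hypothetical toy tree: 0 is the root (soma); 3, 4 and 6 are leaves.
toy_parent = {0: None, 1: 0, 2: 0, 3: 1, 4: 1, 5: 2, 6: 5}
serial = [{3}, {4}, {6}, {1}, {5}, {2}, {0}]   # one node per step
parallel = [{3, 6}, {4, 5}, {1, 2}, {0}]       # two threads

print(schedule_cost(toy_parent, serial, k=1))
print(schedule_cost(toy_parent, parallel, k=2))
print(lower_bound(toy_parent, k=2))
```

On this toy tree the serial schedule takes 7 triangularization steps and the two-thread schedule takes 4, which equals the lower bound of 4, so the full solve costs 14 and 8 steps respectively once back-substitution is counted.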
Mathematical scheduling problem

Based on the simulation accuracy and computational cost, we formulate the parallelization problem as a mathematical scheduling problem:

Given a tree $T = \{V, E\}$ and a positive integer $k$, where $V$ is the node set and $E$ is the edge set. Define partition $P(V) = \{V_1, V_2, \ldots, V_n\}$, $|V_i| \le k$, $1 \le i \le n$, where $|V_i|$ indicates the cardinal number of subset $V_i$, i.e., the number of nodes in $V_i$, and for each node $v \in V_i$, all its children nodes $\{c \mid c \in \mathrm{children}(v)\}$ must be in a previous subset $V_j$, where $1 \le j < i$. Our goal is to find an optimal partition $P^*(V)$ whose computational cost $|P^*(V)|$ is minimal.

Here subset $V_i$ consists of all nodes that will be computed at the $i$-th step (Fig. 2e), so $|V_i| \le k$ indicates that we can compute at most $k$ nodes at each step because the number of available threads is $k$. The restriction “for each node $v \in V_i$, all its children nodes $\{c \mid c \in \mathrm{children}(v)\}$ must be in a previous subset $V_j$, where $1 \le j < i$” indicates that node $v$ can be processed only if all its child nodes are processed.

DHS implementation

We aim to find an optimal way to parallelize the computation of solving linear equations for each neuron model by solving the mathematical scheduling problem above. To get the optimal partition, DHS first analyzes the topology and calculates the depth $d(v)$ for all nodes $v \in V$. Then, the following two steps are executed iteratively until every node $v \in V$ is assigned to a subset: (1) find all candidate nodes and put these nodes into the candidate set $Q$. A node is a candidate only if all its child nodes have been processed or it does not have any child nodes. (2) If $|Q| \le k$, i.e., the number of candidate nodes is smaller than or equal to the number of available threads, remove all nodes in $Q$ and put them into $V_i$; otherwise, remove the $k$ deepest nodes from $Q$ and add them to subset $V_i$. Label these nodes as processed nodes (Fig. 2d).
After filling in subset $V_i$, go to step (1) to fill in the next subset $V_{i+1}$.

Correctness proof for DHS

After applying DHS to a neural tree $T = \{V, E\}$, we get a partition $P(V) = \{V_1, V_2, \ldots, V_n\}$, $|V_i| \le k$, $1 \le i \le n$. Nodes in the same subset $V_i$ will be computed in parallel, taking $n$ steps to perform triangularization and back-substitution, respectively. We then demonstrate that the reordering of the computation in DHS produces a result identical to that of the serial Hines method.

The partition $P(V)$ obtained from DHS decides the computation order of all nodes in a neural tree. Below we demonstrate that the computation order determined by $P(V)$ satisfies the correctness conditions. $P(V)$ is obtained from the given neural tree $T$. Operations in DHS do not modify the tree topology and values of tree nodes (corresponding values in the linear equations), so the tree morphology and initial values of all nodes are not changed, which satisfies condition 1: the tree morphology and initial values of all nodes are identical to those in the serial Hines method. In triangularization, nodes are processed from subset $V_1$ to $V_n$. As shown in the implementation of DHS, all nodes in subset $V_i$ are selected from the candidate set $Q$, and a node can be put into $Q$ only if all its child nodes have been processed. Thus the child nodes of all nodes in $V_i$ are in $\{V_1, V_2, \ldots, V_{i-1}\}$, meaning that a node is only computed after all its children have been processed, which satisfies condition 2: in triangularization, a node can be processed if and only if all its child nodes are already processed. In back-substitution, the computation order is the opposite of that in triangularization, i.e., from $V_n$ to $V_1$. As shown before, the child nodes of all nodes in $V_i$ are in $\{V_1, V_2, \ldots, V_{i-1}\}$, so the parent nodes of nodes in $V_i$ are in $\{V_{i+1}, V_{i+2}, \ldots, V_n\}$, which satisfies condition 3: in back-substitution, a node can be processed only if its parent node is already processed.
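The two-step DHS procedure and the children-first ordering used in the correctness argument can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the DeepDendrite GPU implementation: the function name `dhs_partition`, the parent-dictionary tree encoding, and the use of a heap to pick the deepest candidates are choices made here, and node labels are assumed to be mutually comparable (e.g. integers) so that ties in depth can be broken.

```python
import heapq


def dhs_partition(parent, k):
    """Greedy deepest-first schedule: returns the subsets V_1..V_n as lists.

    parent -- dict: node -> parent node (the root maps to None)
    k      -- number of available threads (k >= 1)
    """
    if k < 1:
        raise ValueError("k must be a positive integer")

    # Depth d(v): distance from the root, computed without recursion.
    depth = {}
    for node in parent:
        path, cur = [], node
        while cur is not None and cur not in depth:
            path.append(cur)
            cur = parent[cur]
        base = depth[cur] if cur is not None else -1
        for n in reversed(path):
            base += 1
            depth[n] = base

    # pending[v] counts children of v that are not processed yet.
    pending = {node: 0 for node in parent}
    for node, par in parent.items():
        if par is not None:
            pending[par] += 1

    # Candidate set Q: nodes whose children are all processed (initially leaves).
    # heapq is a min-heap, so depths are negated to pop the deepest node first.
    heap = [(-depth[n], n) for n in parent if pending[n] == 0]
    heapq.heapify(heap)

    partition = []
    while heap:
        # Step (2): take all candidates if |Q| <= k, else the k deepest ones.
        step = [heapq.heappop(heap)[1] for _ in range(min(k, len(heap)))]
        partition.append(step)
        # Step (1) for the next round: parents become candidates once their
        # last child has been processed in this step.
        for node in step:
            par = parent[node]
            if par is not None:
                pending[par] -= 1
                if pending[par] == 0:
                    heapq.heappush(heap, (-depth[par], par))
    return partition


# Hypothetical toy tree: 0 is the root (soma); 3, 4 and 6 are leaves.
toy_parent = {0: None, 1: 0, 2: 0, 3: 1, 4: 1, 5: 2, 6: 5}
for threads in (1, 2, 3):
    print(threads, dhs_partition(toy_parent, threads))
```

Because a parent only enters the candidate heap after its last child has been placed in an earlier subset, the returned order satisfies condition 2, and reversing it gives the back-substitution order required by condition 3.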
Optimality proof for DHS

The idea of the proof is that if there is another optimal solution, it can be transformed into our DHS solution without increasing the number of steps the algorithm requires, thus indicating that the DHS solution is optimal.

For each subset $V_i$ in $P(V)$, DHS moves the $k$ (thread number) deepest nodes from the corresponding candidate set $Q_i$ to $V_i$. If the number of nodes in $Q_i$ is smaller than $k$, it moves all nodes from $Q_i$ to $V_i$. To simplify, we introduce $D_i$, indicating the depth sum of the $k$ deepest nodes in $Q_i$. All subsets in $P(V)$ satisfy the max-depth criteria (Supplementary Fig. 6a): $\sum_{v \in V_i} d(v) = D_i$. We then prove that selecting the deepest nodes in each iteration makes $P(V)$ an optimal partition. If there exists an optimal partition $P^*(V) = \{V^*_1, V^*_2, \ldots, V^*_s\}$ containing subsets that do not satisfy the max-depth criteria, we can modify the subsets in $P^*(V)$ so that all subsets consist of the deepest nodes from $Q$ and the number of subsets ($|P^*(V)|$) remains the same after modification.

Without any loss of generality, we start from the first subset $V^*_i$ not satisfying the criteria, i.e., $\sum_{v \in V^*_i} d(v) < D_i$. There are two possible cases that will make $V^*_i$ not satisfy the max-depth criteria: (1) $|V^*_i| < k$ and there exist some valid nodes in $Q_i$ that are not put into $V^*_i$; (2) $|V^*_i| = k$ but the nodes in $V^*_i$ are not the $k$ deepest nodes in $Q_i$.

For case (1), because some candidate nodes are not put into $V^*_i$, these nodes must be in the subsequent subsets. As $|V^*_i| < k$, we can move the corresponding nodes from the subsequent subsets to $V^*_i$, which will not increase the number of subsets and makes $V^*_i$ satisfy the criteria (Supplementary Fig. 6b, top). For case (2), $|V^*_i| = k$, the deeper nodes that are not moved from the candidate set $Q_i$ into $V^*_i$ must be added to subsequent subsets (Supplementary Fig. 6b, bottom). These deeper nodes can be moved from subsequent subsets to $V^*_i$ through the following method.
Assume that after filling $V^*_i$, $v$ is picked and one of the $k$ deepest nodes $v'$ is still in $Q_i$, thus $v'$ will be put into a subsequent subset $V^*_j$ ($j > i$). We first move $v$ from $V^*_i$ to $V^*_{i+1}$, then modify subset $V^*_{i+1}$ as follows: if $|V^*_{i+1}| \le k$ and none of the nodes in $V^*_{i+1}$ is the parent of node $v$, stop modifying the latter subsets. Otherwise, modify $V^*_{i+1}$ as follows (Supplementary Fig. 6c): if the parent node of $v$ is in $V^*_{i+1}$, move this parent node to $V^*_{i+2}$; else move the node with minimum depth from $V^*_{i+1}$ to $V^*_{i+2}$. After adjusting $V^*_i$, modify the subsequent subsets $V^*_{i+1}$, $V^*_{i+2}$, …, $V^*_{j-1}$ with the same strategy. Finally, move $v'$ from $V^*_j$ to $V^*_i$.

With the modification strategy described above, we can replace all shallower nodes in $V^*_i$ with the $k$-th deepest node in $Q_i$ and keep the number of subsets, i.e., $|P^*(V)|$, the same after modification. We can modify the nodes with the same strategy for all subsets in $P^*(V)$ that do not contain the deepest nodes. Finally, all subsets $V^*_i \in P^*(V)$ can satisfy the max-depth criteria, and $|P^*(V)|$ does not change after modification.

In conclusion, DHS generates a partition $P(V)$, and all subsets $V_i \in P(V)$ satisfy the max-depth condition. For any other optimal partition $P^*(V)$ we can modify its subsets to make its structure the same as $P(V)$, i.e., each subset consists of the deepest nodes in the candidate set, and keep $|P^*(V)|$ the same after modification. So, the partition $P(V)$ obtained from DHS is one of the optimal partitions.

GPU implementation and memory boosting

To achieve high memory throughput, the GPU uses a memory hierarchy of (1) global memory, (2) cache, and (3) registers, where global memory has large capacity but low throughput, while registers have low capacity but high throughput. We aim to boost memory throughput by leveraging the memory hierarchy of the GPU.
The GPU employs the SIMT (Single-Instruction, Multiple-Thread) architecture. Warps are the basic scheduling units on the GPU (a warp is a group of 32 parallel threads). A warp executes the same instruction with different data for different threads [46]. Correctly ordering the nodes is essential for this batching of computation in warps, to make sure DHS obtains results identical to those of the serial Hines method. When implementing DHS on GPU, we first group all cells into multiple warps based on their morphologies. Cells with similar morphologies are grouped in the same warp. We then apply DHS to all neurons, assigning the compartments of each neuron to multiple threads. Because neurons are grouped into warps, the threads for the same neuron are in the same warp. Therefore, the intrinsic synchronization in warps keeps the computation order consistent with the data dependency of the serial Hines method. Finally, threads in each warp are aligned and rearranged according to the number of compartments.

Full-spine and few-spine biophysical models

We used the published human pyramidal neuron [51]. The membrane capacitance $c_m$ = 0.44 μF cm⁻², membrane resistance $r_m$ = 48,300 Ω cm², and axial resistivity $r_a$ = 261.97 Ω cm. In this model, all dendrites were modeled as passive cables while somas were active. The leak reversal potential $E_l$ = -83.1 mV. Ion channels such as Na+ and K+ were inserted on the soma and initial axon, and their reversal potentials were $E_{Na}$ = 67.6 mV and $E_K$ = -102 mV, respectively. All these specific parameters were set the same as in the model of Eyal, et al. [51]; for more details please refer to the published model (ModelDB, access No. 238347).
In the few-spine model, the membrane capacitance and maximum leak conductance of the dendritic cables 60 μm away from the soma were multiplied by a factor $F_{spine}$ to approximate dendritic spines. In this model, $F_{spine}$ was set to 1.9. Only the spines that receive synaptic inputs were explicitly attached to dendrites.

In the full-spine model, all spines were explicitly attached to dendrites. We calculated the spine density with the reconstructed neuron in Eyal, et al. [51]. The spine density was set to 1.3 μm⁻¹, and each cell contained 24994 spines on dendrites 60 μm away from the soma.

The morphologies and biophysical mechanisms of spines were the same in the few-spine and full-spine models. The length of the spine neck $L_{neck}$ = 1.35 μm and the diameter $D_{neck}$ = 0.25 μm, whereas the length and diameter of the spine head were 0.944 μm, i.e., the spine head area was set to 2.8 μm². Both spine neck and spine head were modeled as passive cables, with the reversal potential $E_l$ = -86 mV. The specific membrane capacitance, membrane resistance, and axial resistivity were the same as those for dendrites.

Synaptic inputs

We investigated neuronal excitability for both distributed and clustered synaptic inputs. AMPA-based and NMDA-based synaptic currents were simulated as in Eyal et al.’s work. AMPA conductance was modeled as a double-exponential function and NMDA conductance as a voltage-dependent double-exponential function.
For the AMPA model, the specific $\tau_{rise}$ and $\tau_{decay}$ were set to 0.3 and 1.8 ms. For the NMDA model, $\tau_{rise}$ and $\tau_{decay}$ were set to 8.019 and 34.9884 ms, respectively. The maximum conductances of AMPA and NMDA were 0.73 nS and 1.31 nS.

Background noise

We attached background noise to each cell to simulate a more realistic environment. Noise patterns were implemented as Poisson spike trains with a constant rate of 1.0 Hz. Each pattern started at $t_{start}$ = 10 ms and lasted until the end of the simulation. We generated 400 noise spike trains for each cell and attached them to randomly selected synapses. The model and specific parameters of synaptic currents were the same as described in Synaptic inputs, except that the maximum conductance of NMDA was uniformly distributed from 1.57 to 3.275 nS, resulting in a higher NMDA to AMPA ratio.

Exploring neuronal excitability

We investigated the spike probability when multiple synapses were activated simultaneously. For distributed inputs, we tested 14 cases, from 0 to 240 activated synapses. For clustered inputs, we tested 9 cases in total, activating from 0 to 12 clusters respectively. Each cluster consisted of 20 synapses. For each case in both distributed and clustered inputs, we calculated the spike probability with 50 random samples. Spike probability was defined as the ratio of the number of neurons that fired to the total number of samples. All 1150 samples were simulated simultaneously on our DeepDendrite platform, reducing the simulation time from days to minutes.

Performing AI tasks with the DeepDendrite platform

Conventional detailed neuron simulators lack two functionalities important to modern AI tasks: (1) alternately performing simulations and weight updates without heavy reinitialization and (2) simultaneously processing multiple stimuli samples in a batch-like manner. Here we present the DeepDendrite platform, which supports both biophysical simulation and deep learning tasks with detailed dendritic models.
DeepDendrite consists of three modules (Supplementary Fig. 5): (1) an I/O module; (2) a DHS-based simulating module; (3) a learning module. When training a biophysically detailed model to perform learning tasks, users first define the learning rule, then feed all training samples to the detailed model for learning. In each step during training, the I/O module picks a specific stimulus and its corresponding teacher signal (if necessary) from all training samples and attaches the stimulus to the network model. Then, the DHS-based simulating module initializes the model and starts the simulation. After simulation, the learning module updates all synaptic weights according to the difference between model responses and teacher signals. After training, the learned model can achieve performance comparable to that of an ANN. The testing phase is similar to training, except that all synaptic weights are fixed.

HPC-Net model

Image classification is a typical task in the field of AI. In this task, a model should learn to recognize the content of a given image and output the corresponding label. Here we present the HPC-Net, a network consisting of detailed human pyramidal neuron models that can learn to perform image classification tasks by utilizing the DeepDendrite platform.

HPC-Net has three layers, i.e., an input layer, a hidden layer, and an output layer. The neurons in the input layer receive spike trains converted from images as their input. Hidden layer neurons receive the output of input layer neurons and deliver responses to neurons in the output layer. The responses of the output layer neurons are taken as the final output of HPC-Net. Neurons between adjacent layers are fully connected.

For each image stimulus, we first convert each normalized pixel to a homogeneous spike train. For the pixel with coordinates $(x, y)$ in the image, the corresponding spike train has a constant interspike interval $\tau_{ISI}(x, y)$ (in ms), which is determined by the pixel value $p(x, y)$ as shown in Eq. (1).
In our experiment, the simulation for each stimulus lasted 50 ms. All spike trains started at 9 + $\tau_{ISI}$ ms and lasted until the end of the simulation. Then we attached all spike trains to the input layer neurons in a one-to-one manner. The synaptic current triggered by the spike arriving at time $t_0$ is defined in terms of the post-synaptic voltage $v$, the reversal potential $E_{syn}$ = 1 mV, the maximum synaptic conductance $g_{max}$ = 0.05 μS, and the time constant $\tau$ = 0.5 ms.

Neurons in the input layer were modeled with a passive single-compartment model. The specific parameters were set as follows: membrane capacitance $c_m$ = 1.0 μF cm⁻², membrane resistance $r_m$ = 10⁴ Ω cm², axial resistivity $r_a$ = 100 Ω cm, reversal potential of the passive compartment $E_l$ = 0 mV.

The hidden layer contains a group of human pyramidal neuron models, receiving the somatic voltages of input layer neurons. The morphology was from Eyal, et al. [51], and all neurons were modeled with passive cables. The specific membrane capacitance $c_m$ = 1.5 μF cm⁻², membrane resistance $r_m$ = 48,300 Ω cm², axial resistivity $r_a$ = 261.97 Ω cm, and the reversal potential of all passive cables $E_l$ = 0 mV. Input neurons could make multiple connections to randomly selected locations on the dendrites of hidden neurons. The synaptic current activated by the $k$-th synapse of the $i$-th input neuron on neuron $j$’s dendrite is defined as in Eq. (4), where $g_{ijk}$ is the synaptic conductance, $W_{ijk}$ is the synaptic weight, and a ReLU-like somatic activation function is applied to the somatic voltage of the $i$-th input neuron at time $t$.

Neurons in the output layer were also modeled with a passive single-compartment model, and each hidden neuron only made one synaptic connection to each output neuron. All specific parameters were set the same as those of the input neurons. Synaptic currents activated by hidden neurons are also in the form of Eq. (4).
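The input encoding of the HPC-Net model (a constant interspike interval, first spike at 9 + $\tau_{ISI}$ ms, 50 ms of simulation) can be sketched in Python as below. This is only an illustration: the function name `spike_times` is hypothetical, the interspike interval is taken as a given input because the pixel-to-interval mapping of Eq. (1) is not reproduced here, and treating a spike that falls exactly at the end of the simulation as included is an assumption.

```python
def spike_times(isi_ms, t_end_ms=50.0, offset_ms=9.0):
    """Spike times (ms) of a homogeneous train with constant interspike interval.

    isi_ms    -- interspike interval of the pixel's spike train, in ms
    t_end_ms  -- simulation length (50 ms per stimulus in the text)
    offset_ms -- the first spike is placed at offset_ms + isi_ms
    """
    if isi_ms <= 0:
        raise ValueError("interspike interval must be positive")
    times = []
    n = 1
    t = offset_ms + n * isi_ms
    # Assumption: a spike exactly at t_end_ms still counts.
    while t <= t_end_ms:
        times.append(t)
        n += 1
        t = offset_ms + n * isi_ms  # recompute from n to avoid drift from repeated addition
    return times


print(spike_times(10.0))   # a short interval yields several spikes
print(spike_times(50.0))   # an interval this long yields no spike within 50 ms
```

Shorter intervals therefore deliver more spikes to the corresponding input neuron within the 50 ms window, while sufficiently long intervals deliver none.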
Image classification with HPC-Net

For each input image stimulus, we first normalized all pixel values to 0.0-1.0. Then we converted normalized pixels to spike trains and attached them to input neurons. Somatic voltages of the output neurons are used to compute the predicted probability of each class, as shown in Eq. (6), where $p_i$ is the probability of the $i$-th class predicted by the HPC-Net, computed from the average somatic voltage of the $i$-th output neuron from 20 ms to 50 ms, and $C$ indicates the number of classes, which equals the number of output neurons. The class with the maximum predicted probability is the final classification result. In this paper, we built the HPC-Net with 784 input neurons, 64 hidden neurons, and 10 output neurons.

Synaptic plasticity rules for HPC-Net

Inspired by previous work [36], we use a gradient-based learning rule to train our HPC-Net to perform the image classification task. The loss function we use here is cross-entropy, given in Eq. (7), where $p_i$ is the predicted probability for class $i$, and $y_i$ indicates the actual class the stimulus image belongs to: $y_i$ = 1 if the input image belongs to class $i$, and $y_i$ = 0 if not.

When training HPC-Net, we compute the update for weight $W_{ijk}$ (the synaptic weight of the $k$-th synapse connecting neuron $i$ to neuron $j$) at each time step. After the simulation of each image stimulus, $W_{ijk}$ is updated as shown in Eq. (8). Eq. (8) involves the learning rate and the update value at time $t$; $v_j$ and $v_i$ are the somatic voltages of neurons $j$ and $i$, respectively; $I_{ijk}$ is the $k$-th synaptic current activated by neuron $i$ on neuron $j$, and $g_{ijk}$ is its synaptic conductance; $r_{ijk}$ is the transfer resistance between the $k$-th connected compartment of neuron $i$ on neuron $j$’s dendrite and neuron $j$’s soma; $t_s$ = 30 ms and $t_e$ = 50 ms are the start time and end time for learning, respectively. For output neurons, the error term can be computed as shown in Eq. (10). For hidden neurons, the error term is calculated from the error terms in the output layer, given in Eq. (11).
Since all output neurons are single-compartment, $r_{ijk}$ equals the input resistance of the corresponding compartment. Transfer and input resistances are computed by NEURON.

Mini-batch training is a typical method in deep learning for achieving higher prediction accuracy and accelerating convergence. DeepDendrite also supports mini-batch training. When training HPC-Net with mini-batch size $N_{batch}$, we make $N_{batch}$ copies of HPC-Net. During training, each copy is fed with a different training sample from the batch. DeepDendrite first computes the weight update for each copy separately. After all copies in the current training batch are done, the average weight update is calculated and the weights in all copies are updated by this same amount.

Robustness against adversarial attack with HPC-Net

To demonstrate the robustness of HPC-Net, we tested its prediction accuracy on adversarial samples and compared it with an analogous ANN (one with the same 784-64-10 structure and ReLU activation; for a fair comparison, each input neuron in our HPC-Net made only one synaptic connection to each hidden neuron). We first trained HPC-Net and the ANN with the original training set (original clean images). Then we added adversarial noise to the test set and measured their prediction accuracy on the noisy test set. We used Foolbox [98, 99] to generate adversarial noise with the FGSM method [93]. The ANN was trained with PyTorch [100], and HPC-Net was trained with our DeepDendrite. For fairness, we generated adversarial noise on a significantly different network model, a 20-layer ResNet [101]. The noise level ranged from 0.02 to 0.2. We experimented on two typical datasets, MNIST [95] and Fashion-MNIST [96]. Results show that the prediction accuracy of HPC-Net is 19% and 16.72% higher than that of the analogous ANN, respectively.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability

The data that support the findings of this study are available within the paper and in the Supplementary Information and Source Data files provided with it. The source code and data used to reproduce the results in Figs. 3–6 are available for download at https://github.com/pkuzyc/DeepDendrite. The MNIST dataset is publicly available at http://yann.lecun.com/exdb/mnist. The Fashion-MNIST dataset is publicly available at https://github.com/zalandoresearch/fashion-mnist. Source data are provided with this paper.

Code availability

The source code of DeepDendrite, as well as the models and code used to reproduce Figs. 3–6 in this study, is available at https://github.com/pkuzyc/DeepDendrite.

References

1. McCulloch, W. S. & Pitts, W. A logical calculus of the ideas immanent in nervous activity. Bull. Math. Biophys. 5, 115–133 (1943).
2. LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).
3. Poirazi, P., Brannon, T. & Mel, B. W. Arithmetic of subthreshold synaptic summation in a model CA1 pyramidal cell. Neuron 37, 977–987 (2003).
4. London, M. & Häusser, M. Dendritic computation. Annu. Rev. Neurosci. 28, 503–532 (2005).
5. Branco, T. & Häusser, M. The single dendritic branch as a fundamental functional unit in the nervous system. Curr. Opin. Neurobiol. 20, 494–502 (2010).
6. Stuart, G. J. & Spruston, N. Dendritic integration: 60 years of progress. Nat. Neurosci. 18, 1713–1721 (2015).
7. Poirazi, P. & Papoutsi, A. Illuminating dendritic function with computational models. Nat. Rev. Neurosci. 21, 303–321 (2020).
8. Yuste, R. & Denk, W. Dendritic spines as basic functional units of neuronal integration. Nature 375, 682–684 (1995).
9. Engert, F. & Bonhoeffer, T. Dendritic spine changes associated with hippocampal long-term synaptic plasticity. Nature 399, 66–70 (1999).
10. Yuste, R. Dendritic spines and distributed circuits. Neuron 71, 772–781 (2011).
11. Yuste, R. Electrical compartmentalization in dendritic spines. Annu. Rev. Neurosci. 36, 429–449 (2013).
12. Rall, W. Branching dendritic trees and motoneuron membrane resistivity. Exp. Neurol. 1, 491–527 (1959).
13. Segev, I. & Rall, W. Computational study of an excitable dendritic spine. J. Neurophysiol. 60, 499–523 (1988).
14. Silver, D. et al. Mastering the game of Go with deep neural networks and tree search. Nature 529, 484–489 (2016).
15. Silver, D. et al. A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. Science 362, 1140–1144 (2018).
16. McCloskey, M. & Cohen, N. J. Catastrophic interference in connectionist networks: the sequential learning problem. Psychol. Learn. Motiv. 24, 109–165 (1989).
17. French, R. M. Catastrophic forgetting in connectionist networks. Trends Cogn. Sci. 3, 128–135 (1999).
18. Naud, R. & Sprekeler, H. Sparse bursts optimize information transmission in a multiplexed neural code. Proc. Natl Acad. Sci. USA 115, E6329–E6338 (2018).
19. Sacramento, J., Costa, R. P., Bengio, Y. & Senn, W. Dendritic cortical microcircuits approximate the backpropagation algorithm. In Advances in Neural Information Processing Systems 31 (NeurIPS 2018) (NeurIPS, 2018).
20. Payeur, A., Guerguiev, J., Zenke, F., Richards, B. A. & Naud, R. Burst-dependent synaptic plasticity can coordinate learning in hierarchical circuits. Nat. Neurosci. 24, 1010–1019 (2021).
21. Bicknell, B. A. & Häusser, M. A synaptic learning rule for exploiting nonlinear dendritic computation. Neuron 109, 4001–4017 (2021).
22. Moldwin, T., Kalmenson, M. & Segev, I. The gradient clusteron: a model neuron that learns to solve classification tasks via dendritic nonlinearities, structural plasticity, and gradient descent. PLoS Comput. Biol. 17, e1009015 (2021).
23. Hodgkin, A. L. & Huxley, A. F. A quantitative description of membrane current and its application to conduction and excitation in nerve. J. Physiol. 117, 500–544 (1952).
24. Rall, W. Theory of physiological properties of dendrites. Ann. N. Y. Acad. Sci. 96, 1071–1092 (1962).
25. Hines, M. L. & Carnevale, N. T. The NEURON simulation environment. Neural Comput. 9, 1179–1209 (1997).
26. Bower, J. M. & Beeman, D. In The Book of GENESIS: Exploring Realistic Neural Models with the GEneral NEural SImulation System (eds Bower, J. M. & Beeman, D.) 17–27 (Springer New York, 1998).
27. Hines, M. L., Eichner, H. & Schürmann, F. Neuron splitting in compute-bound parallel network simulations enables runtime scaling with twice as many processors. J. Comput. Neurosci. 25, 203–210 (2008).
28. Hines, M. L., Markram, H. & Schürmann, F. Fully implicit parallel simulation of single neurons. J. Comput. Neurosci. 25, 439–448 (2008).
29. Ben-Shalom, R., Liberman, G. & Korngreen, A. Accelerating compartmental modeling on a graphical processing unit. Front. Neuroinform. 7, 4 (2013).
30. Tsuyuki, T., Yamamoto, Y. & Yamazaki, T. Efficient numerical simulation of neuron models with spatial structure on graphics processing units. In Proc. 2016 International Conference on Neural Information Processing (eds Hirose, A. et al.) 279–285 (Springer International Publishing, 2016).
31. Vooturi, D. T., Kothapalli, K. & Bhalla, U. S. Parallelizing Hines matrix solver in neuron simulations on GPU. In Proc. IEEE 24th International Conference on High Performance Computing (HiPC) 388–397 (IEEE, 2017).
32. Huber, F. Efficient tree solver for Hines matrices on the GPU. Preprint at https://arxiv.org/abs/1810.12742 (2018).
33. Korte, B. & Vygen, J. Combinatorial Optimization: Theory and Algorithms 6 edn (Springer, 2018).
34. Gebali, F. Algorithms and Parallel Computing (Wiley, 2011).
35. Kumbhar, P. et al. CoreNEURON: an optimized compute engine for the NEURON simulator. Front. Neuroinform. 13, 63 (2019).
36. Urbanczik, R. & Senn, W. Learning by the dendritic prediction of somatic spiking. Neuron 81, 521–528 (2014).
37. Ben-Shalom, R., Aviv, A., Razon, B. & Korngreen, A. Optimizing ion channel models using a parallel genetic algorithm on graphical processors. J. Neurosci. Methods 206, 183–194 (2012).
38. Mascagni, M. A parallelizing algorithm for computing solutions to arbitrarily branched cable neuron models. J. Neurosci. Methods 36, 105–114 (1991).
39. McDougal, R. A. et al. Twenty years of ModelDB and beyond: building essential modeling tools for the future of neuroscience. J. Comput. Neurosci. 42, 1–10 (2017).
40. Migliore, M., Messineo, L. & Ferrante, M. Dendritic Ih selectively blocks temporal summation of unsynchronized distal inputs in CA1 pyramidal neurons. J. Comput. Neurosci. 16, 5–13 (2004).
41. Hemond, P. et al. Distinct classes of pyramidal cells exhibit mutually exclusive firing patterns in hippocampal area CA3b. Hippocampus 18, 411–424 (2008).
42. Hay, E., Hill, S., Schürmann, F., Markram, H. & Segev, I. Models of neocortical layer 5b pyramidal cells capturing a wide range of dendritic and perisomatic active properties. PLoS Comput. Biol. 7, e1002107 (2011).
43. Masoli, S., Solinas, S. & D’Angelo, E. Action potential processing in a detailed Purkinje cell model reveals a critical role for axonal compartmentalization. Front. Cell. Neurosci. 9, 47 (2015).
44. Lindroos, R. et al. Basal ganglia neuromodulation over multiple temporal and structural scales—simulations of direct pathway MSNs investigate the fast onset of dopaminergic effects and predict the role of Kv4.2. Front. Neural Circuits 12, 3 (2018).
45. Migliore, M. et al. Synaptic clusters function as odor operators in the olfactory bulb. Proc. Natl Acad. Sci. USA 112, 8499–8504 (2015).
46. NVIDIA. CUDA C++ Programming Guide. https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html (2021).
47. NVIDIA. CUDA C++ Best Practices Guide. https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html (2021).
48. Harnett, M. T., Makara, J. K., Spruston, N., Kath, W. L. & Magee, J. C. Synaptic amplification by dendritic spines enhances input cooperativity. Nature 491, 599–602 (2012).
49. Chiu, C. Q. et al. Compartmentalization of GABAergic inhibition by dendritic spines. Science 340, 759–762 (2013).
50. Tønnesen, J., Katona, G., Rózsa, B. & Nägerl, U. V. Spine neck plasticity regulates compartmentalization of synapses. Nat. Neurosci. 17, 678–685 (2014).
51. Eyal, G. et al. Human cortical pyramidal neurons: from spines to spikes via models. Front. Cell. Neurosci. 12, 181 (2018).
52. Koch, C. & Zador, A. The function of dendritic spines: devices subserving biochemical rather than electrical compartmentalization. J. Neurosci. 13, 413–422 (1993).
53. Koch, C. Dendritic spines. In Biophysics of Computation (Oxford University Press, 1999).
54. Rapp, M., Yarom, Y. & Segev, I. The impact of parallel fiber background activity on the cable properties of cerebellar Purkinje cells. Neural Comput. 4, 518–533 (1992).
55. Hines, M. Efficient computation of branched nerve equations. Int. J. Bio-Med. Comput. 15, 69–76 (1984).
56. Nayebi, A. & Ganguli, S. Biologically inspired protection of deep networks from adversarial attacks. Preprint at https://arxiv.org/abs/1703.09202 (2017).
57. Goddard, N. H. & Hood, G. Large-scale simulation using parallel GENESIS. In The Book of GENESIS: Exploring Realistic Neural Models with the GEneral NEural SImulation System (eds Bower, J. M. & Beeman, D.) 349–379 (Springer New York, 1998).
58. Migliore, M., Cannia, C., Lytton, W. W., Markram, H. & Hines, M. L. Parallel network simulations with NEURON. J. Comput. Neurosci. 21, 119 (2006).
59. Lytton, W. W. et al. Simulation neurotechnologies for advancing brain research: parallelizing large networks in NEURON. Neural Comput. 28, 2063–2090 (2016).
60. Valero-Lara, P. et al. cuHinesBatch: solving multiple Hines systems on GPUs. In Proc. 2017 International Conference on Computational Science 566–575 (IEEE, 2017).
61. Akar, N. A. et al. Arbor—A morphologically-detailed neural network simulation library for contemporary high-performance computing architectures. In 27th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP) 274–282 (IEEE, 2019).
62. Ben-Shalom, R. et al. NeuroGPU: accelerating multi-compartment, biophysically detailed neuron simulations on GPUs. J. Neurosci. Methods 366, 109400 (2022).
63. Rempe, M. J. & Chopp, D. L. A predictor-corrector algorithm for reaction-diffusion equations associated with neural activity on branched structures. SIAM J. Sci. Comput. 28, 2139–2161 (2006).
64. Kozloski, J. & Wagner, J. An ultrascalable solution to large-scale neural tissue simulation. Front. Neuroinform. 5, 15 (2011).
65. Jayant, K. et al. Targeted intracellular voltage recordings from dendritic spines using quantum-dot-coated nanopipettes. Nat. Nanotechnol. 12, 335–342 (2017).
66. Palmer, L. M. & Stuart, G. J. Membrane potential changes in dendritic spines during action potentials and synaptic input. J. Neurosci. 29, 6897–6903 (2009).
67. Nishiyama, J. & Yasuda, R. Biochemical computation for spine structural plasticity. Neuron 87, 63–75 (2015).
68. Yuste, R. & Bonhoeffer, T. Morphological changes in dendritic spines associated with long-term synaptic plasticity. Annu. Rev. Neurosci. 24, 1071–1089 (2001).
69. Holtmaat, A. & Svoboda, K. Experience-dependent structural synaptic plasticity in the mammalian brain. Nat. Rev. Neurosci. 10, 647–658 (2009).
70. Caroni, P., Donato, F. & Muller, D. Structural plasticity upon learning: regulation and functions. Nat. Rev. Neurosci. 13, 478–490 (2012).
71. Keck, T. et al. Massive restructuring of neuronal circuits during functional reorganization of adult visual cortex. Nat. Neurosci. 11, 1162 (2008).
72. Hofer, S. B., Mrsic-Flogel, T. D., Bonhoeffer, T. & Hübener, M. Experience leaves a lasting structural trace in cortical circuits. Nature 457, 313–317 (2009).
73. Trachtenberg, J. T. et al. Long-term in vivo imaging of experience-dependent synaptic plasticity in adult cortex. Nature 420, 788–794 (2002).
74. Marik, S. A., Yamahachi, H., McManus, J. N., Szabo, G. & Gilbert, C. D. Axonal dynamics of excitatory and inhibitory neurons in somatosensory cortex. PLoS Biol. 8, e1000395 (2010).
75. Xu, T. et al. Rapid formation and selective stabilization of synapses for enduring motor memories. Nature 462, 915–919 (2009).
76. Albarran, E., Raissi, A., Jáidar, O., Shatz, C. J. & Ding, J. B. Enhancing motor learning by increasing the stability of newly formed dendritic spines in the motor cortex. Neuron 109, 3298–3311 (2021).
77. Branco, T. & Häusser, M. Synaptic integration gradients in single cortical pyramidal cell dendrites. Neuron 69, 885–892 (2011).
78. Major, G., Larkum, M. E. & Schiller, J. Active properties of neocortical pyramidal neuron dendrites. Annu. Rev. Neurosci. 36, 1–24 (2013).
79. Gidon, A. et al. Dendritic action potentials and computation in human layer 2/3 cortical neurons. Science 367, 83–87 (2020).
80. Doron, M., Chindemi, G., Muller, E., Markram, H. & Segev, I. Timed synaptic inhibition shapes NMDA spikes, influencing local dendritic processing and global I/O properties of cortical neurons. Cell Rep. 21, 1550–1561 (2017).
81. Du, K. et al. Cell-type-specific inhibition of the dendritic plateau potential in striatal spiny projection neurons. Proc. Natl Acad. Sci. USA 114, E7612–E7621 (2017).
82. Smith, S. L., Smith, I. T., Branco, T. & Häusser, M. Dendritic spikes enhance stimulus selectivity in cortical neurons in vivo. Nature 503, 115–120 (2013).
83. Xu, N.-l. et al. Nonlinear dendritic integration of sensory and motor input during an active sensing task. Nature 492, 247–251 (2012).
84. Takahashi, N., Oertner, T. G., Hegemann, P. & Larkum, M. E. Active cortical dendrites modulate perception. Science 354, 1587–1590 (2016).
85. Sheffield, M. E. & Dombeck, D. A. Calcium transient prevalence across the dendritic arbour predicts place field properties. Nature 517, 200–204 (2015).
86. Markram, H. et al. Reconstruction and simulation of neocortical microcircuitry. Cell 163, 456–492 (2015).
87. Billeh, Y. N. et al. Systematic integration of structural and functional data into multi-scale models of mouse primary visual cortex. Neuron 106, 388–403 (2020).
88. Hjorth, J. et al. The microcircuits of striatum in silico. Proc. Natl Acad. Sci. USA 117, 202000671 (2020).
89. Guerguiev, J., Lillicrap, T. P. & Richards, B. A. Towards deep learning with segregated dendrites. eLife 6, e22901 (2017).
90. Iyer, A. et al. Avoiding catastrophe: active dendrites enable multi-task learning in dynamic environments. Front. Neurorobot. 16, 846219 (2022).
91. Jones, I. S. & Kording, K. P. Might a single neuron solve interesting machine learning problems through successive computations on its dendritic tree? Neural Comput. 33, 1554–1571 (2021).
92. Bird, A. D., Jedlicka, P. & Cuntz, H. Dendritic normalisation improves learning in sparsely connected artificial neural networks. PLoS Comput. Biol. 17, e1009202 (2021).
93. Goodfellow, I. J., Shlens, J. & Szegedy, C. Explaining and harnessing adversarial examples. In 3rd International Conference on Learning Representations (ICLR) (ICLR, 2015).
94. Papernot, N., McDaniel, P. & Goodfellow, I. Transferability in machine learning: from phenomena to black-box attacks using adversarial samples. Preprint at https://arxiv.org/abs/1605.07277 (2016).
95. Lecun, Y., Bottou, L., Bengio, Y. & Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 86, 2278–2324 (1998).
96. Xiao, H., Rasul, K. & Vollgraf, R. Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms. Preprint at http://arxiv.org/abs/1708.07747 (2017).
97. Bartunov, S. et al. Assessing the scalability of biologically-motivated deep learning algorithms and architectures. In Advances in Neural Information Processing Systems 31 (NeurIPS 2018) (NeurIPS, 2018).
98. Rauber, J., Brendel, W. & Bethge, M. Foolbox: a Python toolbox to benchmark the robustness of machine learning models. In Reliable Machine Learning in the Wild Workshop, 34th International Conference on Machine Learning (2017).
99. Rauber, J., Zimmermann, R., Bethge, M. & Brendel, W. Foolbox native: fast adversarial attacks to benchmark the robustness of machine learning models in PyTorch, TensorFlow, and JAX. J. Open Source Softw. 5, 2607 (2020).
100. Paszke, A. et al. PyTorch: an imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems 32 (NeurIPS 2019) (NeurIPS, 2019).
101. He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proc. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 770–778 (IEEE, 2016).

This paper is available on Nature under the CC BY 4.0 Deed (Attribution 4.0 International) license.