Authors: Yichen Zhang, Gan He, Lei Ma, Xiaofei Liu, J. J. Johannes Hjorth, Alexander Kozlov, Yutao He, Shenjian Zhang, Jeanette Hellgren Kotaleski, Yonghong Tian, Sten Grillner, Kai Du, Tiejun Huang

Abstract

Biophysically detailed multi-compartment models are powerful tools for exploring the computational principles of the brain, and they also serve as a theoretical framework for generating algorithms for artificial intelligence (AI) systems. However, their extremely high computational cost severely limits their application in both neuroscience and AI. The major bottleneck when simulating detailed compartment models is the ability of the simulator to solve large systems of linear equations. Here, we present the Dendritic Hierarchical Scheduling (DHS) method to markedly accelerate this process. We theoretically prove that the DHS implementation is computationally optimal and accurate. This GPU-based method runs 2-3 orders of magnitude faster than the classic serial Hines method on a conventional CPU platform. We build the DeepDendrite framework, which integrates the DHS method with the GPU computing engine of the NEURON simulator, and demonstrate applications of DeepDendrite in neuroscience tasks. We investigate how spatial patterns of spine inputs affect neuronal excitability in a biophysically detailed human pyramidal neuron model with 25,000 spines. Furthermore, we provide a brief discussion of the potential of DeepDendrite for AI.

Deciphering the coding and computational principles of neurons is essential for neuroscience. The animal brain is composed of thousands of different types of neurons, each with unique morphological and biophysical properties. Although no longer held to be literally true, the "point-neuron" doctrine treats the neuron as the elementary computational unit of the brain. In recent years, artificial intelligence (AI) has exploited this principle and developed powerful tools such as artificial neural networks (ANNs). However, beyond the integrative computations performed at the single-neuron level, subcellular components such as dendrites can also carry out nonlinear operations as independent computational units. Moreover, dendritic spines, the tiny protrusions that densely cover the dendrites of spiny neurons, can integrate synaptic signals while compartmentalizing them from their parent dendrites ex vivo and in vivo.

Simulations using biologically detailed neurons provide a theoretical framework for linking biological details to computational principles. The biophysically detailed multi-compartment model framework enables one to build neuron models with realistic dendritic morphologies, intrinsic ionic conductances, and extrinsic synaptic inputs. The core of detailed multi-compartment models, namely the dendrites, is built upon classic cable theory, which idealizes the biophysical membrane properties of dendrites as passive cables and provides a mathematical description of how electrical signals attenuate and propagate through complex dendritic processes. By combining cable theory with biophysical descriptions of active mechanisms such as ion channels and excitatory and inhibitory synaptic conductances, a detailed multi-compartment model can capture physiological details at the subcellular level that lie beyond experimental limits.
Beyond its profound impact on neuroscience, biologically detailed neuron models have recently been used to bridge the gap between structural and biophysical neural details and AI. The prevailing approach in the modern AI field is ANNs built from point neurons, an analog of biological neural networks. Although ANNs trained with the "backpropagation-of-error" (backprop) algorithm have achieved striking performance in specific applications, even beating top human players in games such as Go, the human brain still outperforms ANNs in settings that involve more dynamic and noisy environments. Recent studies suggest that dendritic integration is key to generating efficient learning algorithms that may surpass backprop when information must be processed in parallel. Moreover, a single detailed multi-compartment model can learn nonlinear computations that would otherwise require a network of point neurons, simply by adjusting its synaptic strengths, demonstrating the full potential of detailed models for building more powerful brain-like AI systems. Therefore, it is of high priority to expand paradigms in brain-like AI from single detailed neuron models to large-scale biologically detailed networks.

One long-standing challenge of the detailed simulation approach lies in its extremely high computational cost, which has greatly limited its application to neuroscience and AI. To improve efficiency, the classic Hines method reduces the time complexity of solving the equations from O(n³) to O(n) and has been widely applied as the core algorithm in popular simulators such as NEURON and GENESIS. However, this method processes each compartment sequentially in a serial manner. When a simulation involves multiple biophysically detailed dendrites with dendritic spines, the linear equation matrix (the "Hines matrix") scales accordingly with the increasing number of dendrites or spines (Fig. 1), making the Hines method inefficient, because solving the equations then imposes a dominant burden on the entire simulation.

Fig. 1: a A reconstructed layer-5 pyramidal neuron model with the realistic morphology used in detailed neuron models. b Workflow of simulating a population of detailed neuron models; the equation-solving step is the bottleneck of the simulation. c The system of linear equations in the simulation. d Data dependency of the Hines method when solving the linear equations in c. e The size of the Hines matrix scales with model complexity; the number of linear equations to be solved increases significantly as models become more detailed. f Computational cost (steps taken in the equation-solving phase) of the serial Hines method on different types of neuron models. g Illustration of different solving methods. In parallel methods (middle, right), different parts of a neuron are assigned to multiple processing units, shown in different colors. h Computational cost of the three approaches in g when solving the equations of a pyramidal neuron model with spines. i Run time of the different methods for solving the equations of 500 pyramidal models with spines. Run time is the time spent on a 1 s simulation (solving the equations 40,000 times with a time step of 0.025 ms). p-Hines: the parallel method in CoreNEURON (on GPU); Branch-based: the branch-based parallel method (on GPU); DHS: the Dendritic Hierarchical Scheduling method (on GPU).
Over the past decades, substantial progress has been made in accelerating the Hines method with cellular-level parallel techniques, which allow the computation of different parts of each cell to proceed in parallel. However, current cellular-level parallel methods often lack an efficient parallelization strategy or lack sufficient numerical accuracy as compared to the original Hines method.

Here, we build a user-friendly, mathematically rigorous, and efficient simulation tool that can dramatically accelerate computation and reduce computational cost. Furthermore, this simulation tool can readily be used to build and train biologically detailed neural networks for machine-learning and AI applications. Based on the theory of parallel computing, we show that our algorithm provides optimal scheduling without any loss of accuracy. Moreover, we have implemented DHS on state-of-the-art GPU hardware by exploiting the GPU memory hierarchy and memory-access mechanisms. Together, DHS can accelerate the computation by 60-1,500 times (Supplementary Table 1) compared with the classic NEURON simulator while maintaining identical accuracy.

To enable detailed dendritic simulations for use in AI, we next establish the DeepDendrite framework by integrating the DHS-embedded CoreNEURON (an optimized compute engine for NEURON) platform as the simulation engine with two auxiliary modules (an I/O module and a learning module) that support dendritic learning algorithms during simulations. DeepDendrite runs on the GPU hardware platform, supporting both regular simulation tasks in neuroscience and learning tasks in AI.

Last but not least, we present several applications of DeepDendrite that target key challenges in neuroscience and AI: (1) We examine how the spatial patterns of excitatory synaptic inputs on dendritic spines affect neuronal activity in neurons whose dendritic trees are fully covered with spines (the full-spine model). DeepDendrite enables us to explore neural computation in a simulated human pyramidal neuron model with ~25,000 dendritic spines. (2) In the Discussion we also address the potential of DeepDendrite in the context of AI, specifically in building ANNs with detailed human pyramidal neurons. Our results indicate that DeepDendrite can drastically reduce training time, thereby making biophysically detailed network models practical for such tasks. All source code for DeepDendrite, the full-spine models, and the detailed dendritic network model is publicly available online (see Code Availability). Our open-source learning framework can easily be combined with other dendritic learning rules, such as learning rules for nonlinear (active) dendrites, burst-dependent synaptic plasticity, and learning with spike prediction. Overall, our work provides a complete set of tools with the potential to transform the current research landscape of the computational neuroscience community. By harnessing GPU computing power, we anticipate that these tools will facilitate system-level investigations of the computational principles of fine brain structures and promote the interaction between neuroscience and modern AI.

Results

The Dendritic Hierarchical Scheduling (DHS) method

Evaluating ionic currents and solving the linear equations are the two essential steps when simulating biophysically detailed neurons; both are time-consuming and impose a heavy computational burden.
Fortunately, computing the ionic currents of each compartment is a fully independent process, so it can be naturally distributed over hardware with massive parallel computing units such as GPUs. Consequently, solving the linear equations becomes the remaining bottleneck of the parallelization process (Fig. 1a-f). To tackle this bottleneck, cellular-level parallel methods have been developed, which accelerate single-cell computation by "splitting" a single cell into several blocks that can be computed in parallel. However, such methods rely heavily on prior knowledge to generate practical strategies for how to split a single neuron into sub-blocks (Fig. 1g; Supplementary Fig. 1). Consequently, they become less efficient for neurons with asymmetric morphologies, e.g., pyramidal neurons and Purkinje neurons.

We aim to develop a more efficient and precise parallel method for the simulation of biologically detailed neural networks. First, we establish the criteria for the accuracy of a cellular-level parallel method. Based on the theory of parallel computing, we propose three conditions that ensure a parallel method produces exactly the same solutions as the serial Hines method, derived from the data dependency of the Hines method (see Methods). Then, to theoretically evaluate the run time, i.e., the efficiency, of serial and parallel computing methods, we introduce and formulate the concept of computational cost as the number of steps a method takes to solve the equations (see Methods).

Based on simulation accuracy and computational cost, we formulate the parallelization problem as a mathematical scheduling problem (see Methods). In simple terms, we view a single neuron as a tree with many nodes (compartments). With k parallel threads, we can compute at most k nodes at each step, but we must ensure that a node is computed only after all of its child nodes have been processed; our goal is to find a strategy with the minimum number of steps for the whole procedure.

To generate optimal partitions, we propose a method termed Dendritic Hierarchical Scheduling (DHS) (the theoretical proof is presented in Methods). The DHS method involves two steps, dendritic topology analysis and optimal partition search: (1) Given a detailed model, we first obtain its corresponding tree and compute the depth of each node (the depth of a node is the number of its ancestor nodes) on the tree (Fig. 2a-c). (2) After the topology analysis, we iteratively search the candidate nodes and select the k deepest candidates (a node is a candidate only if all of its child nodes have already been selected) (Fig. 2d).

Fig. 2: a Workflow of DHS: the k deepest candidate nodes are selected at each iteration. b Illustration of computing node depth for a compartmental model. The model is first converted into a tree structure, and the depth of each node is then computed; colors indicate different depth values. c Topology analysis of different neuron models. Six neurons with distinct morphologies are shown. For each model, the soma is selected as the root of the tree, so node depth increases from the soma (0) to the distal dendrites. d Illustration of applying DHS to the model in b with four threads. Candidates: nodes that can be processed. Selected candidates: nodes selected by DHS, i.e., the deepest candidates. Processed nodes: nodes that have been processed before. e Parallelization strategy obtained by DHS after the process in d. Each node is assigned to one of the four parallel threads. DHS reduces the number of serial node-processing steps from 14 to 5 by distributing nodes over multiple threads. f Relative cost, i.e., the proportion of the computational cost of DHS to that of the serial Hines method, when applying DHS with different numbers of threads to different types of models.
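To make the two DHS steps concrete, the following minimal Python sketch computes node depths and then repeatedly selects the k deepest candidates. It is an illustration written for this text (function and variable names such as dhs_partition are ours), not the DeepDendrite source code.

```python
from collections import defaultdict

def dhs_partition(parent, k):
    """Minimal sketch of DHS partitioning.
    parent: dict mapping node -> parent node (the root, e.g. the soma, has parent None).
    k: number of parallel threads.
    Returns a list of subsets; nodes in the same subset can be processed
    in parallel during triangularization."""
    children = defaultdict(list)
    for node, par in parent.items():
        if par is not None:
            children[par].append(node)

    # Step 1: topology analysis - the depth of a node is its number of ancestors.
    depth = {}
    def get_depth(node):
        if node not in depth:
            par = parent[node]
            depth[node] = 0 if par is None else get_depth(par) + 1
        return depth[node]
    for node in parent:
        get_depth(node)

    # Step 2: iteratively select the k deepest candidate nodes.
    unprocessed_children = {n: len(children[n]) for n in parent}
    processed, partition = set(), []
    while len(processed) < len(parent):
        # A node is a candidate only if all of its child nodes have been processed.
        candidates = [n for n in parent
                      if n not in processed and unprocessed_children[n] == 0]
        selected = sorted(candidates, key=lambda n: depth[n], reverse=True)[:k]
        for n in selected:
            processed.add(n)
            if parent[n] is not None:
                unprocessed_children[parent[n]] -= 1
        partition.append(selected)
    return partition

# Tiny example: a soma with two dendritic branches.
tree = {"soma": None, "d1": "soma", "d2": "soma", "d1a": "d1", "d1b": "d1"}
print(dhs_partition(tree, k=2))  # [['d1a', 'd1b'], ['d1', 'd2'], ['soma']]
```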
Take a simplified model with 15 compartments as an example: with the serial Hines method it takes 14 steps to process all nodes, whereas DHS with four parallel threads partitions the nodes into five subsets (Fig. 2d). Because nodes in the same subset can be processed in parallel, it takes only five steps to process all nodes with DHS (Fig. 2e).

Next, we applied the DHS method to six well-known neuron models (selected from ModelDB) with different numbers of threads (Fig. 2f), including cortical and hippocampal pyramidal neurons, cerebellar Purkinje neurons, striatal projection neurons (SPNs), and an olfactory bulb mitral cell, covering the major principal neurons in sensory, cortical, and subcortical areas. We then measured the computational cost. The relative computational cost is defined as the proportion of the computational cost of DHS to that of the serial Hines method. The computational cost, i.e., the number of steps taken in solving the equations, drops dramatically with increasing thread numbers. For example, with 16 threads, the computational cost of DHS is 7%-10% of that of the serial Hines method. Intriguingly, DHS reaches the lower bound of its computational cost for the presented neurons when given 16 or even 8 parallel threads (Fig. 2f), indicating that adding more threads does not improve performance further because of the dependencies between compartments.

Together, we developed the DHS method, which enables automatic analysis of dendritic topology and optimal partitioning for parallel computing. It is worth noting that DHS finds the optimal partition before the simulation starts, so no extra computation is required when solving the equations.

Speeding up DHS by GPU memory boosting

DHS computes each neuron with multiple threads, which consumes a vast number of threads when running neural network simulations. Graphics Processing Units (GPUs) consist of massive numbers of processing units (i.e., streaming processors, SPs; Fig. 3a, b) for parallel computing. In theory, the many SPs on a GPU should support efficient simulation of large-scale neural networks (Fig. 3c). However, we consistently observed that the efficiency of DHS dropped substantially as the network size increased, which could result from scattered data storage or from redundant memory accesses caused by loading and writing intermediate results (Fig. 3d, left).
Fig. 3: a GPU architecture and its memory hierarchy. Each GPU contains massive numbers of processing units (streaming processors); different types of memory have different capacities. b Architecture of streaming multiprocessors (SMs). Each SM consists of many streaming processors, registers, and an L1 cache. c Applying DHS to two neurons, each with four threads. During simulation, each thread runs on a single streaming processor. d Memory optimization strategy on the GPU. Top, thread assignment and data storage of DHS before (left) and after (right) memory boosting. Bottom, an example of a single triangularization step when simulating the two neurons in c. Processors send data requests to load the data for each thread from global memory. Without memory boosting (left), it takes seven memory transactions to load all requested data, plus some extra transactions for intermediate results. With memory boosting (right), it takes only two transactions to load all requested data, and registers are used for intermediate results, which further increases memory throughput. e Run time of DHS (32 threads per cell) with and without memory boosting on populations of layer-5 pyramidal neuron models with spines. f Speedup from memory boosting on populations of layer-5 pyramidal models with spines; memory boosting yields a 1.6-2 times speedup.

We solved this problem with GPU memory boosting, a technique that increases memory throughput by exploiting the GPU memory hierarchy and its memory-access mechanism. Owing to the way the GPU loads memory, successive threads loading aligned, contiguously stored data achieve much higher memory throughput than threads accessing scattered data, which requires more memory transactions. To achieve high throughput, we first align the computing order of nodes and rearrange threads according to the number of nodes assigned to them. We then permute the data storage in global memory to be consistent with the computing order, i.e., nodes that are processed at the same step are stored contiguously in global memory. Moreover, we use GPU registers to store intermediate results, further strengthening memory throughput. In the example, memory boosting requires only two memory transactions to load the eight requested data items (Fig. 3d). Furthermore, experiments on populations of pyramidal neurons with spines and on other typical neuron models (Fig. 3e, f; Supplementary Fig. 2) show that memory boosting achieves a 1.2-3.8 times speedup over naïve DHS.

To validate the performance of DHS with GPU memory boosting, we selected six typical neuron models and evaluated the run time of solving the cable equations for large populations of each model (Fig. 4). We examined DHS with four threads (DHS-4) and sixteen threads (DHS-16) per neuron. Compared with the GPU method in CoreNEURON, DHS-4 and DHS-16 speed up the computation by about 5 and 15 times, respectively (Fig. 4a). Moreover, compared with the conventional serial Hines method in NEURON running on a single CPU thread, DHS speeds up the simulation by 2-3 orders of magnitude (Supplementary Fig. 3), while retaining identical numerical accuracy in the presence of dense spines (Supplementary Figs. 4 and 8), active dendrites (Supplementary Fig. 7), and different segmentation strategies (Supplementary Fig. 7).

Fig. 4: a Run time of solving the equations for a 1 s simulation on GPU (dt = 0.025 ms, 40,000 iterations in total). CoreNEURON: the parallel method used in CoreNEURON; DHS-4: DHS with four threads per neuron; DHS-16: DHS with 16 threads per neuron. b, c Visualization of the partitions for DHS-4 and DHS-16; each color indicates one thread.

DHS creates cell-type-specific optimal partitioning

To gain insight into the working mechanism of the DHS method, we visualized the partitioning process by mapping compartments to each thread (every color represents a single thread in Fig. 4b, c). The visualization shows that a single thread frequently switches among different branches (Fig. 4b, c). Interestingly, DHS generates aligned partitions in morphologically symmetric neurons such as the striatal projection neuron (SPN) and the mitral cell (Fig. 4b, c).
By comparison, it generates fragmented partitions for morphologically asymmetric neurons such as pyramidal neurons and the Purkinje cell (Fig. 4b, c), indicating that DHS partitions the neural tree at the level of individual compartments (i.e., tree nodes) rather than at the branch level.

In short, DHS with memory boosting provides an optimal solution for solving the linear equations with unprecedented efficiency. Building on this approach, we developed the publicly accessible DeepDendrite platform, which neuroscientists can use to implement their models without any knowledge of GPU programming. Below, we illustrate how DeepDendrite can be applied to neuroscience tasks; we also discuss the potential of the DeepDendrite framework for AI-related tasks in the Discussion section.

DHS enables spine-level modelling

As dendritic spines receive most of the excitatory input to cortical and hippocampal pyramidal neurons, striatal projection neurons, etc., their morphologies and plasticity are crucial for regulating neuronal excitability. However, spines are too small (~1 μm in length) for their voltage-dependent processes to be measured directly in experiments. Thus, theoretical work is critical for a full understanding of spine computations.

A single spine can be modeled with two compartments: the spine head, where the synapses are located, and the spine neck, which connects the spine head to the dendrite. Theory predicts that the very thin spine neck (0.1-0.5 μm in diameter) electrically isolates the spine head from its parent dendrite, thus compartmentalizing the signals generated at the spine head. However, a detailed model with fully distributed spines on the dendrites (the "full-spine model") is computationally very expensive. A common compromise is to modify the capacitance and resistance of the membrane by a spine factor F instead of modeling all spines explicitly. Here, the spine factor F approximates the effect of spines on the biophysical properties of the cell membrane.

Inspired by the previous work of Eyal et al., we tested how different spatial patterns of excitatory synaptic inputs impinging on dendritic spines shape neuronal activity in a human pyramidal neuron model with explicitly modeled spines (Fig. 5a). Specifically, Eyal et al. used the spine factor F to incorporate spines into the dendrites, while only a few activated spines were explicitly attached to the dendrites (the "few-spine model" in Fig. 5a). The value of F in their model was computed from the dendritic area and spine area in the reconstructed data. Accordingly, we calculated the spine density from their reconstructed data to make our full-spine model consistent with Eyal's few-spine model. With the spine density set to 1.3 μm⁻¹, the pyramidal neuron model contained about 25,000 spines without altering the model's original morphological and biophysical properties. We then repeated the previous experimental protocols with both the full-spine and the few-spine models. We used the same synaptic input as in Eyal's work but attached extra background noise to each sample. By comparing the somatic traces (Fig. 5b, c) and spike probability (Fig. 5d) of the full-spine and few-spine models, we found that the full-spine model is much leakier than the few-spine model. In addition, the spike probability triggered by the activation of clustered spines appeared to be more nonlinear in the full-spine model (the solid blue line in Fig. 5d) than in the few-spine model (the dashed blue line in Fig. 5d).
These results indicate that the conventional F-factor method may underestimate the impact of dense spines on dendritic excitability and nonlinearity.

Fig. 5: a Experimental setup. We examine two major types of models: few-spine models and full-spine models. Few-spine models (two on the left) incorporate the spine area globally into the dendrites and only attach individual spines together with activated synapses. In full-spine models (two on the right), all spines are explicitly attached over the whole dendritic tree. We explore the effects of clustered and randomly distributed synaptic inputs on the few-spine and full-spine models, respectively. b Somatic voltages recorded for the cases in a. Colors of the voltage curves correspond to a; scale bar: 20 ms, 20 mV. c Color-coded voltages at specific times during the simulations in b. Colors indicate the magnitude of the voltage. d Spike probability of the soma as a function of the number of simultaneously activated synapses (as in the work of Eyal et al.) for the four cases in a; background noise is attached. e Run time of the experiments in d with different simulation methods. NEURON: the standard NEURON simulator running on a single CPU core. CoreNEURON: the CoreNEURON simulator on a single GPU. DeepDendrite: DeepDendrite on a single GPU.

On the DeepDendrite platform, both the full-spine and the few-spine simulations ran about 8 times faster than with CoreNEURON on the GPU platform and about 100 times faster than with serial NEURON on the CPU platform (Fig. 5e; Supplementary Table 1), while producing identical simulation results (Supplementary Figs. 4 and 8). Therefore, the DHS method enables explorations of dendritic excitability under more realistic anatomical conditions.

Discussion

In this work, we propose the DHS method to parallelize the computation of the Hines method, and we mathematically demonstrate that DHS provides an optimal solution without any loss of precision. Next, we implement DHS on the GPU hardware platform and use GPU memory boosting techniques to refine DHS (Fig. 3). When simulating a large number of neurons with complex morphologies, DHS with memory boosting achieves a 15-fold speedup (Supplementary Table 1) compared with the GPU method used in CoreNEURON, and up to a 1,500-fold speedup compared with the serial Hines method on the CPU platform (Fig. 4; Supplementary Fig. 3; Supplementary Table 1). Furthermore, we develop the GPU-based DeepDendrite framework by integrating DHS into CoreNEURON. Finally, as a demonstration of the capacity of DeepDendrite, we present a representative application: examining spine computations in a detailed pyramidal neuron model with 25,000 spines. Further in this section, we elaborate on how we have expanded the DeepDendrite framework to enable efficient training of biophysically detailed neural networks, in order to explore the hypothesis that dendrites improve robustness against adversarial attacks. We show that DeepDendrite can support both neuroscience simulations and AI-related neural network tasks at unprecedented speed, thereby greatly facilitating detailed neuroscience simulations and, potentially, future AI research.

Decades of effort have been devoted to accelerating the Hines method with parallel approaches. Early work focused mainly on network-level parallelization.
In network simulations, each cell solves its own linear equations independently with the Hines method. Network-level parallel methods distribute the network over multiple computing units and parallelize the computation across groups of cells, with each group assigned to one unit. With network-level methods, detailed networks can be simulated on clusters or supercomputers. In recent years, GPUs have also been used for detailed network simulation. Because a GPU has massive numbers of computing units, a single thread is dedicated to each cell rather than to a group of cells. With further optimization, GPU-based methods achieve much higher efficiency in network simulation. However, the computation inside each cell is still serial in network-level methods, so they cannot cope with the situation in which the "Hines matrix" of each cell grows large.

Cellular-level parallel methods further parallelize the computation inside each cell. Their main idea is to split each cell into several sub-blocks and parallelize the computation of those sub-blocks. However, typical cellular-level methods (e.g., the "multi-split" method) pay little attention to the parallelization strategy, and the lack of a fine-grained strategy results in unsatisfactory performance. To achieve higher efficiency, some studies obtain finer-grained parallelization by introducing extra computational operations, or by approximating values at certain critical compartments when solving the linear equations. These finer-grained parallelization strategies achieve higher efficiency but lack the full numerical accuracy of the original Hines method.

Unlike previous methods, DHS adopts the finest-grained parallelization strategy, i.e., compartment-level parallelization. By modeling the problem of "how to parallelize" as a combinatorial optimization problem, DHS provides an optimal compartment-level parallelization strategy. Moreover, DHS introduces no extra operations or value approximations, so it achieves the lowest computational cost while retaining the same numerical accuracy as the original Hines method.

Dendritic spines are the most abundant microstructures of projection neurons in the cortex, hippocampus, cerebellum, and basal ganglia. As spines receive most of the excitatory inputs in the central nervous system, the electrical signals generated by spines are the main driving force for large-scale neuronal activity in the forebrain and cerebellum. The unique structure of the spine, a bulbous head connected to a very thin neck, gives rise to a huge input impedance at the spine head, which can reach up to 500 MΩ according to estimates that combine experimental data with the detailed compartmental modeling approach. Because of this high input impedance, a single synaptic input can evoke a "gigantic" EPSP (~20 mV) locally at the spine head, thereby boosting NMDA currents and ion-channel currents in the spine. However, in classic detailed compartmental models, all spines are replaced by a coefficient that modifies the dendritic cable geometry. This approach may compensate for the leak and capacitance currents of spines, but it cannot reproduce the high input impedance at the spine head, which may weaken excitatory synaptic inputs, particularly NMDA currents, and thereby reduce the nonlinearity of the neuron's input-output curve. Our modeling results are in line with this interpretation.
On the other hand, the spine's electrical compartmentalization is always accompanied by biochemical compartmentalization, resulting in a drastic increase of the internal [Ca²⁺] within the spine and a cascade of molecular processes involving synaptic plasticity, of importance for learning and memory. Intriguingly, the biochemical processes triggered by learning in turn remodel the spine's morphology, enlarging (or shrinking) the spine head or elongating (or shortening) the spine neck, which significantly alters the spine's electrical properties. Such experience-dependent changes in spine morphology, also referred to as "structural plasticity", have been widely observed in vivo in the visual cortex, somatosensory cortex, motor cortex, hippocampus, and basal ganglia. They play a critical role in motor and spatial learning as well as in memory formation. However, because of the computational cost, nearly all detailed network models use the "F-factor" approach to replace actual spines and are thus unable to explore spine functions at the system level. By taking advantage of our framework and the GPU platform, we can run a few thousand detailed neuron models, each with tens of thousands of spines, on a single GPU, while remaining ~100 times faster than the traditional serial method on a single CPU (Fig. 5e). This enables the exploration of structural plasticity in large-scale circuit models across diverse brain regions.

Another critical issue is how to link dendrites to brain functions at the systems/network level. It is well established that dendrites can perform comprehensive computations on synaptic inputs owing to their rich complement of ion channels and local biophysical membrane properties. For example, cortical pyramidal neurons can carry out sublinear synaptic integration at the proximal dendrite but progressively shift to supralinear integration at the distal dendrite. Moreover, distal dendrites can produce regenerative events such as dendritic sodium spikes, calcium spikes, and NMDA spikes/plateau potentials. Such dendritic events are widely observed in mice, and even in human cortical neurons in vitro, and may implement various logical operations or gating functions. Recently, in vivo recordings in awake or behaving mice have provided strong evidence that dendritic spikes/plateau potentials are crucial for orientation selectivity in the visual cortex, sensorimotor integration in the whisker system, and spatial navigation in the hippocampal CA1 region.

To establish a causal link between dendrites and animal (including human) behavior, large-scale biophysically detailed neural circuit models are a powerful computational tool. However, running a large-scale detailed circuit model of 10,000-100,000 neurons generally requires the computing power of supercomputers. It is even more challenging to optimize such models against in vivo data, as this requires iterative simulations of the models. The DeepDendrite framework can directly support many state-of-the-art large-scale circuit models, which were originally developed based on NEURON.
Moreover, using our framework, a single GPU card such as a Tesla A100 can easily support the operation of detailed circuit models of up to 10,000 neurons, thereby providing a carbon-efficient and affordable way for ordinary labs to develop and optimize their own large-scale detailed models.

Recent work on unraveling the role of dendrites in task-specific learning has achieved remarkable results in two directions: solving challenging tasks, such as the ImageNet image-classification dataset, with simplified dendritic networks, and exploring the full learning potential of more realistic neurons. However, there is a trade-off between model size and biological detail, as network scale is often sacrificed for neuron-level complexity. Moreover, more detailed neuron models are mathematically less tractable and computationally more expensive.

There has also been progress on the functional role of dendrites in ANNs for computer-vision tasks. Iyer et al. proposed a novel ANN architecture with active dendrites, demonstrating competitive results in multi-task and continual learning. Jones and Kording used a binary tree to approximate dendritic branching and provided valuable insights into the influence of tree structure on the computational capacity of single neurons. Bird et al. proposed a dendritic normalization rule based on biophysical behavior, offering an interesting perspective on the contribution of dendritic arbor structure to computation. While these studies offer valuable insights, they primarily rely on abstractions derived from spatially extended neurons and do not fully exploit the detailed biological properties and spatial information of dendrites. Further investigation is needed to unveil the potential of leveraging more realistic neuron models for understanding the shared mechanisms underlying brain computation and deep learning.

In response to these challenges, we developed DeepDendrite, a tool that uses the Dendritic Hierarchical Scheduling (DHS) method to significantly reduce computational costs and that incorporates an I/O module and a learning module to handle large datasets. With DeepDendrite, we implemented a three-layer hybrid neural network, the Human Pyramidal Cell Network (HPC-Net) (Fig. 6a, b). This network can be trained efficiently on image classification tasks, achieving an approximately 25 times speedup compared with training on a traditional CPU-based platform (Fig. 6f; Supplementary Table 1).

Fig. 6: a Illustration of the Human Pyramidal Cell Network (HPC-Net) for image classification. Images are transformed into spike trains and fed into the network model. Learning is triggered by error signals propagated from the soma to the dendrites. b Training with mini-batches. Multiple networks are simulated simultaneously with different images as inputs. The total weight update ΔW is computed as the average of the ΔWi from each network. c Comparison of the HPC-Net before and after training. Left, visualization of hidden-neuron responses to a specific input before (top) and after (bottom) training. Right, distribution of hidden-layer weights (from input to hidden layer) before (top) and after (bottom) training. d Workflow of the transfer adversarial attack experiment. We first generate adversarial samples of the test set on a 20-layer ResNet, and then use these adversarial samples (noisy images) to test the classification accuracy of models trained with clean images.
e Prediction accuracy of each model on adversarial samples after training for 30 epochs on the MNIST (left) and Fashion-MNIST (right) datasets. f Run time of training and testing the HPC-Net, with the batch size set to 16. Left, run time of training one epoch; right, run time of testing. Parallel NEURON + Python: training and testing on a single CPU with multiple cores, using 40-process parallel NEURON to simulate the HPC-Net and extra Python code to support mini-batch training. DeepDendrite: training and testing the HPC-Net on a single GPU with DeepDendrite.

Furthermore, it is well recognized that the performance of artificial neural networks (ANNs) can be degraded by adversarial attacks—intentionally engineered perturbations devised to mislead ANNs. Intriguingly, an existing hypothesis suggests that dendrites and synapses may innately defend against such attacks. Our experimental results with the HPC-Net lend support to this hypothesis, as we observed that networks endowed with detailed dendritic structures showed increased resilience to transfer adversarial attacks compared with standard ANNs, as evident on the MNIST and Fashion-MNIST datasets (Fig. 6d, e). This evidence implies that the inherent biophysical properties of dendrites could be pivotal in augmenting the robustness of ANNs against adversarial interference. Nonetheless, further studies are needed to validate these findings on more challenging datasets such as ImageNet.

In conclusion, DeepDendrite has shown remarkable potential in image classification tasks, opening up exciting future directions and possibilities. To further advance DeepDendrite and the application of biologically detailed dendritic models to AI tasks, future work may focus on developing multi-GPU systems and exploring applications in other domains, such as Natural Language Processing (NLP), where dendritic filtering properties align well with the inherently noisy and ambiguous nature of human language. Challenges include testing scalability on larger problems, understanding performance across various tasks and domains, and addressing the computational complexity introduced by additional biological principles, such as active dendrites. By overcoming these limitations, we can further advance the understanding and capabilities of biophysically detailed dendritic neural networks, potentially uncovering new advantages, enhancing their robustness against adversarial attacks and noisy inputs, and ultimately bridging the gap between neuroscience and modern AI.

Methods

Simulator for DHS

The CoreNEURON simulator (https://github.com/BlueBrain/CoreNeuron) uses the NEURON architecture and is optimized for both memory usage and computational speed. We implement our Dendritic Hierarchical Scheduling (DHS) method in the CoreNEURON environment by modifying its source code. All models that can be simulated on GPU with CoreNEURON can also be simulated with DHS by executing the following command:

coreneuron_exec -d /path/to/models -e time --cell-permute 3 --cell-nthread 16 --gpu

The usage options are listed in Table 1.

Accuracy of the simulation using cellular-level parallel computation

To ensure the accuracy of the simulation, we first need to define the correctness of a cellular-level parallel algorithm, i.e., to judge whether it will generate solutions identical to those of proven serial methods, such as the Hines method used in the NEURON simulation platform.
Based on the theory of parallel computing, a parallel algorithm yields a result identical to that of its corresponding serial algorithm if and only if the data-processing order of the parallel algorithm is consistent with the data dependency of the serial method. The Hines method has two symmetrical phases: triangularization and back-substitution. By analyzing the serial Hines method, we find that its data dependency can be formulated as a tree structure, where the nodes of the tree represent the compartments of the detailed neuron model. In the triangularization phase, the value of each node depends on its child nodes; in contrast, during the back-substitution phase, the value of each node depends on its parent node (Fig. 1d). Thus, we can compute nodes on different branches in parallel, as their values do not depend on each other.

Based on the data dependency of the serial Hines method, we propose three conditions that ensure a parallel method yields solutions identical to those of the serial Hines method: (1) the tree morphology and initial values of all nodes are identical to those in the serial Hines method; (2) in the triangularization phase, a node can be processed if and only if all of its child nodes have already been processed; (3) in the back-substitution phase, a node can be processed only if its parent node has already been processed. Once a parallel computing method satisfies these three conditions, it produces solutions identical to those of the serial method.

Computational cost of cellular-level parallel computing methods

To theoretically evaluate the run time, i.e., the efficiency, of serial and parallel computing methods, we introduce and formulate the concept of computational cost as follows: given a tree T and k threads (basic computational units) to perform triangularization, parallel triangularization amounts to dividing the node set V of T into n ordered subsets, i.e., P(V) = {V1, V2, …, Vn}, where the size of each subset |Vi| ≤ k, i.e., at most k nodes can be processed at each step because there are only k threads. The triangularization phase follows the order V1 → V2 → … → Vn, and nodes in the same subset can be processed in parallel. We therefore define |P(V)| (the number of subsets, i.e., n here) as the computational cost of the parallel computing method. In short, the computational cost of a parallel method is the number of steps it takes in the triangularization phase. Because back-substitution is symmetrical to triangularization, the total cost of the entire equation-solving phase is twice that of the triangularization phase.

Mathematical scheduling problem

Based on simulation accuracy and computational cost, we formulate the parallelization problem as a mathematical scheduling problem: given a tree T = {V, E} and a positive integer k, where V is the node set and E is the edge set, partition V into an ordered sequence of subsets P(V) = {V1, V2, …, Vn} with |Vi| ≤ k for 1 ≤ i ≤ n, where |Vi| denotes the cardinal number of subset Vi, i.e., the number of nodes in Vi, such that for each node v ∈ Vi all of its child nodes {c | c ∈ children(v)} lie in a previous subset Vj with 1 ≤ j < i. Our goal is to find an optimal partition P*(V) whose computational cost |P*(V)| is minimal.

Here, subset Vi consists of all nodes that will be computed at the i-th step (Fig. 2e), and |Vi| ≤ k means that at most k nodes can be computed at each step because the number of available threads is k. The restriction that for each node v ∈ Vi all of its child nodes {c | c ∈ children(v)} must lie in a previous subset Vj with 1 ≤ j < i means that a node can be processed only if all of its child nodes have already been processed.
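Restated compactly in LaTeX (the notation follows the definitions above):

\[
\begin{aligned}
&\text{Given } T=\{V,E\}\ \text{and an integer } k>0,\ \text{find an ordered partition } P(V)=\{V_1,\dots,V_n\}\ \text{such that}\\
&\quad \textstyle\bigcup_{i=1}^{n} V_i = V,\qquad V_i \cap V_j = \emptyset\ (i \neq j),\qquad |V_i|\le k,\\
&\quad \forall\, v\in V_i:\ \mathrm{children}(v)\subseteq V_1\cup\cdots\cup V_{i-1},\\
&\text{minimizing the computational cost } |P(V)| = n.
\end{aligned}
\]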
DHS implementation

We aim to find an optimal way to parallelize the solution of the linear equations for each neuron model by solving the mathematical scheduling problem above. To obtain the optimal partition, DHS first analyzes the topology and calculates the depth d(v) of every node v ∈ V. Then the following two steps are executed iteratively until every node v ∈ V has been assigned to a subset: (1) find all candidate nodes and put them into the candidate set Q; a node is a candidate only if all of its child nodes have been processed or it has no child nodes. (2) If |Q| ≤ k, i.e., the number of candidate nodes is smaller than or equal to the number of available threads, remove all nodes from Q and put them into subset Vi; otherwise, remove the k deepest nodes from Q and add them to subset Vi. Mark these nodes as processed (Fig. 2d). After filling subset Vi, go back to step (1) to fill the next subset Vi+1.

Correctness proof for DHS

After applying DHS to a neural tree T = {V, E}, we obtain a partition P(V) = {V1, V2, …, Vn} with |Vi| ≤ k for 1 ≤ i ≤ n. Nodes in the same subset are computed in parallel, so it takes n steps to perform triangularization and n steps to perform back-substitution. We now demonstrate that the reordering of the computation in DHS yields a result identical to that of the serial Hines method.

The partition P(V) obtained from DHS determines the computation order of all nodes in the neural tree. Below we demonstrate that the computation order determined by P(V) satisfies the correctness conditions. P(V) is obtained from the given neural tree T. Operations in DHS modify neither the tree topology nor the values of the tree nodes (the corresponding values in the linear equations), so the tree morphology and initial values of all nodes are unchanged, which satisfies condition 1: the tree morphology and initial values of all nodes are identical to those in the serial Hines method. In triangularization, nodes are processed from subset V1 to Vn. As described in the DHS implementation, all nodes in subset Vi are selected from the candidate set Q, and a node can be put into Q only if all of its child nodes have been processed. Thus the child nodes of all nodes in Vi are in {V1, V2, …, Vi-1}, meaning that a node is computed only after all of its children have been processed, which satisfies condition 2: in triangularization, a node can be processed if and only if all of its child nodes have already been processed. In back-substitution, the computation order is the reverse of that in triangularization, i.e., from Vn to V1. As shown above, the child nodes of all nodes in Vi are in {V1, V2, …, Vi-1}, so the parent nodes of the nodes in Vi are in {Vi+1, Vi+2, …, Vn}, which satisfies condition 3: in back-substitution, a node can be processed only if its parent node has already been processed.
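As a concrete companion to the proof, the short helper below checks condition 2 and the per-step thread limit for a given partition (condition 3 then follows because back-substitution simply reverses the subset order, and condition 1 holds because the tree itself is untouched). It is an illustrative utility written for this text, not part of DeepDendrite.

```python
def satisfies_dhs_conditions(parent, partition, k):
    """Check that `partition` (an ordered list of node subsets, processed in
    that order during triangularization) respects the correctness conditions.
    parent: dict mapping node -> parent node (root has parent None).
    k: number of parallel threads."""
    flattened = [n for subset in partition for n in subset]
    # Every node must appear exactly once in the partition.
    if len(flattened) != len(set(flattened)) or set(flattened) != set(parent):
        return False

    processed = set()
    for subset in partition:
        if len(subset) > k:                      # at most k nodes per step
            return False
        for node in subset:
            # Condition 2: all child nodes must already be processed.
            child_nodes = [c for c, p in parent.items() if p == node]
            if any(c not in processed for c in child_nodes):
                return False
        processed.update(subset)
    return True

# Example, reusing the tree and partition from the DHS sketch above.
tree = {"soma": None, "d1": "soma", "d2": "soma", "d1a": "d1", "d1b": "d1"}
print(satisfies_dhs_conditions(
    tree, [["d1a", "d1b"], ["d1", "d2"], ["soma"]], k=2))  # True
```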
Optimality proof for DHS

The idea of the proof is that any other optimal solution can be transformed into the DHS solution without increasing the number of steps the algorithm requires, which indicates that the DHS solution is optimal. For each subset Vi in P(V), DHS moves the k (thread number) deepest nodes from the corresponding candidate set Qi into Vi; if the number of nodes in Qi is smaller than k, all nodes are moved from Qi into Vi. To simplify, we introduce Di, the sum of the depths of the k deepest nodes in Qi. All subsets in P(V) satisfy the max-depth criterion (Supplementary Fig. 6a): the summed depth of the nodes selected into Vi equals Di. We then prove that selecting the k deepest nodes at each iteration yields an optimal partition. If there exists an optimal partition P*(V) = {V*1, V*2, …, V*s} that contains a subset that does not satisfy the max-depth criterion, we can modify the subsets in P*(V) so that every subset consists of the deepest nodes of its candidate set while the number of subsets (|P*(V)|) remains the same after modification.

Without loss of generality, we start from the first subset V*i that violates the criterion; there are two possible cases in which V*i does not satisfy the max-depth criterion: (1) |V*i| < k and some valid candidate nodes in Qi are not put into V*i; (2) |V*i| = k but the nodes in V*i are not the k deepest nodes in Qi.

For case (1), because some candidate nodes are not put into V*i, these nodes must appear in subsequent subsets. As |V*i| < k, we can move the corresponding nodes from the subsequent subsets into V*i, which does not increase the number of subsets and makes V*i satisfy the max-depth criterion (Supplementary Fig. 6b, top). For case (2), since |V*i| = k, the deeper nodes that were not moved from the candidate set Qi into V*i must be added to subsequent subsets (Supplementary Fig. 6b, bottom). These deeper nodes can be moved from the subsequent subsets into V*i through the following method. Assume that after filling V*i, a node v has been picked into V*i while v', one of the k deepest nodes in Qi, has not, so that v' is put into a subsequent subset V*j (j > i). First move v from V*i to V*i+1, then modify subset V*i+1 as follows: if |V*i+1| ≤ k and none of the nodes in V*i+1 is the parent of node v, stop modifying subsequent subsets. Otherwise, modify V*i+1 as follows (Supplementary Fig. 6c): if the parent node of v is in V*i+1, move this parent node to V*i+2; otherwise move the node with minimum depth from V*i+1 to V*i+2. After adjusting V*i+1, modify the subsequent subsets V*i+2, V*i+3, … with the same strategy. Finally, move v' from V*j to V*i.

With the modification strategy described above, we can replace all shallower nodes in V*i with the k deepest nodes in Qi while keeping the number of subsets, i.e., |P*(V)|, the same after the modification. We can modify the nodes with the same strategy for all subsets in P*(V) that do not contain the deepest candidates. Finally, all subsets V*i ∈ P*(V) satisfy the max-depth criterion, and |P*(V)| remains unchanged after the modification.

In conclusion, DHS generates a partition P(V) in which all subsets Vi ∈ P(V) satisfy the max-depth condition. Any other optimal partition P*(V) can have its subsets modified so that its structure becomes equivalent to P(V), i.e., each subset consists of the deepest nodes of the candidate set, while |P*(V)| stays the same after modification. Therefore, the partition P(V) obtained from DHS is one of the optimal partitions.

GPU implementation and memory boosting

To achieve high memory throughput, the GPU uses a memory hierarchy of (1) global memory, (2) cache, and (3) registers, where global memory has large capacity but low throughput, while registers have low capacity but high throughput. We aim to boost memory throughput by leveraging this memory hierarchy.

GPUs use the SIMT (Single-Instruction, Multiple-Thread) architecture. Warps are the basic scheduling units on a GPU (a warp is a group of 32 parallel threads); a warp executes the same instruction on different data for different threads. Correctly ordering the nodes is essential for this batching of computation into warps, to make sure DHS obtains results identical to those of the serial Hines method. When implementing DHS on the GPU, we first group all cells into multiple warps based on their morphologies, with cells of similar morphology grouped into the same warp. We then apply DHS to all neurons, assigning the compartments of each neuron to multiple threads. Because neurons are grouped into warps, the threads for the same neuron are in the same warp; therefore, the intrinsic synchronization within warps keeps the computation order consistent with the data dependency of the serial Hines method. Finally, the threads in each warp are aligned and rearranged according to the number of compartments assigned to them.

When a warp loads pre-aligned, contiguously stored data from global memory, it can make full use of the cache, which leads to high memory throughput, whereas accessing scattered data reduces memory throughput. After compartment assignment and thread rearrangement, we permute the data in global memory to make it consistent with the computing order, so that warps load contiguously stored data when executing the program. Moreover, we place the necessary temporary variables in registers rather than in global memory. Registers have the highest memory throughput, so their use further accelerates DHS.
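The data-layout step can be illustrated with a small NumPy sketch: given the DHS partition, nodes computed at the same step are stored contiguously, so that the threads of a warp read neighbouring addresses. This is only an illustration of the reordering idea written for this text (the real implementation permutes CoreNEURON's GPU arrays); names such as build_permutation are ours.

```python
import numpy as np

def build_permutation(partition):
    """Map each node to a new storage index: nodes computed at the same DHS
    step are stored contiguously, step after step, so that a warp loading
    the nodes of one step touches consecutive global-memory addresses."""
    order = [node for step in partition for node in step]
    return {node: new_idx for new_idx, node in enumerate(order)}

# Example: one Hines-matrix coefficient per node, originally stored in an
# arbitrary (scattered) order.
nodes = ["soma", "d1", "d2", "d1a", "d1b"]
diag = np.array([1.0, 2.0, 3.0, 4.0, 5.0])            # one value per node

partition = [["d1a", "d1b"], ["d1", "d2"], ["soma"]]   # e.g. from dhs_partition()
perm = build_permutation(partition)

# Permute the storage so that step-0 nodes come first, then step-1 nodes, ...
new_diag = np.empty_like(diag)
for old_idx, node in enumerate(nodes):
    new_diag[perm[node]] = diag[old_idx]
print(new_diag)  # [4. 5. 2. 3. 1.]
```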
Full-spine and few-spine biophysical models

We used the published human pyramidal neuron model. The specific membrane capacitance was cm = 0.44 μF cm⁻², the membrane resistance rm = 48,300 Ω cm², and the axial resistivity ra = 261.97 Ω cm. In this model, all dendrites were modeled as passive cables while the soma was active. The leak reversal potential was El = -83.1 mV. Na⁺ and K⁺ channels were inserted on the soma and the initial axon, with reversal potentials ENa = 67.6 mV and EK = -102 mV, respectively. All of these specific parameters were set as in the model of Eyal et al.; for more details please refer to the published model (ModelDB, accession No. 238347).

In the few-spine model, the membrane capacitance and maximum leak conductance of the dendritic cables more than 60 μm away from the soma were multiplied by a spine factor F to approximate the dendritic spines. In this model, F was computed from the dendritic and spine membrane areas of the reconstructed data.

In the full-spine model, all spines were explicitly attached to the dendrites. We calculated the spine density from the reconstructed neuron in Eyal et al. The spine density was set to 1.3 μm⁻¹, and each cell contained 24,994 spines on the dendrites more than 60 μm away from the soma.

The morphologies and biophysical mechanisms of the spines were the same in the few-spine and full-spine models. The length of the spine neck was Lneck = 1.35 μm and its diameter Dneck = 0.25 μm, whereas the length and diameter of the spine head were 0.944 μm, i.e., the spine head area was 2.8 μm². Both the spine neck and the spine head were modeled as passive cables, with reversal potential El = -86 mV. The specific membrane capacitance, membrane resistance, and axial resistivity were the same as those of the dendrites.
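For readers who wish to reproduce the few-spine membrane correction in NEURON, the fragment below sketches how a spine factor F can be folded into the passive properties of dendritic segments located more than 60 μm from the soma. It is a schematic example under our own assumptions (a hypothetical cell object with a soma section and a dend section list, with the pas mechanism inserted); the published model files should be consulted for the exact implementation and parameter values.

```python
from neuron import h

h.load_file("stdrun.hoc")

def apply_spine_factor(cell, F, min_dist_um=60.0):
    """Fold spines into the distal dendritic membrane: multiply the specific
    capacitance and the passive (leak) conductance of every segment farther
    than `min_dist_um` from the soma by the spine factor F."""
    h.distance(0, cell.soma(0.5))          # set the soma as the distance origin
    for sec in cell.dend:                  # assumed section list of dendrites
        for seg in sec:
            if h.distance(seg) > min_dist_um:
                seg.cm *= F                # spines add membrane capacitance
                seg.g_pas *= F             # and extra leak conductance
```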
Synaptic inputs

We investigated neuronal excitability for both distributed and clustered synaptic inputs. All activated synapses were attached to the terminal of the spine head. For distributed inputs, all activated synapses were randomly distributed over all dendrites. For clustered inputs, each cluster consisted of 20 activated synapses uniformly distributed on a single randomly selected compartment. All synapses were activated simultaneously during the simulation. AMPA-based and NMDA-based synaptic currents were simulated as in Eyal et al.'s work: the AMPA conductance was modeled as a double-exponential function and the NMDA conductance as a voltage-dependent double-exponential function. For the AMPA model, the rise and decay time constants τrise and τdecay were set to 0.3 and 1.8 ms; for the NMDA model, τrise and τdecay were set to 8.019 and 34.9884 ms, respectively. The maximum conductances of AMPA and NMDA were 0.73 nS and 1.31 nS.

Background noise

We attached background noise to each cell to simulate a more realistic environment. Noise patterns were implemented as Poisson spike trains with a constant rate of 1.0 Hz. Each pattern started at tstart = 10 ms and lasted until the end of the simulation. We generated 400 noise spike trains for each cell and attached them to randomly selected synapses. The model and specific parameters of the synaptic currents were the same as described in Synaptic inputs, except that the maximum conductance of NMDA was uniformly distributed from 1.57 to 3.275 nS, resulting in a higher NMDA to AMPA ratio.

Exploring neuronal excitability

We investigated the spike probability when multiple synapses were activated simultaneously. For distributed inputs, we tested 14 cases, ranging from 0 to 240 activated synapses. For clustered inputs, we tested 9 cases in total, activating from 0 to 12 clusters, with each cluster consisting of 20 synapses. For each case of both distributed and clustered inputs, we calculated the spike probability from 50 random samples. Spike probability was defined as the ratio of the number of neurons that fired to the total number of samples. All 1,150 samples were simulated simultaneously on our DeepDendrite platform, reducing the simulation time from days to minutes.

Performing AI tasks with the DeepDendrite platform

Conventional detailed-neuron simulators lack two functionalities important for modern AI tasks: (1) alternately performing simulations and weight updates without heavy reinitialization, and (2) simultaneously processing multiple stimulus samples in a batch-like manner. Here we present the DeepDendrite platform, which supports both biophysical simulation and deep-learning tasks with detailed dendritic models. DeepDendrite consists of three modules (Supplementary Fig. 5): (1) an I/O module, (2) a DHS-based simulation module, and (3) a learning module. When training a biophysically detailed model to perform learning tasks, users first define the learning rule and then feed all training samples to the detailed model for learning. At each step during training, the I/O module picks a specific stimulus and its corresponding teacher signal (if necessary) from the training samples and attaches the stimulus to the network model. The DHS-based simulation module then initializes the model and runs the simulation. After the simulation, the learning module updates all synaptic weights according to the difference between the model responses and the teacher signals. After training, the learned model can achieve performance comparable to that of an ANN. The testing phase is similar to training, except that all synaptic weights are fixed.
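The division of labour among the three modules can be summarised in a short Python sketch. It only mirrors the loop described above; simulate and learning_rule are placeholders standing in for the DHS-based engine and the user-defined learning rule (they are not the DeepDendrite API), and the mini-batch averaging anticipates the training scheme described for the HPC-Net below.

```python
from typing import Any, Callable, Iterable, Tuple
import numpy as np

def train_epoch(weights: np.ndarray,
                samples: Iterable[Tuple[Any, np.ndarray]],
                simulate: Callable[[np.ndarray, Any], np.ndarray],
                learning_rule: Callable[[np.ndarray, np.ndarray], np.ndarray],
                batch_size: int = 16) -> np.ndarray:
    """Illustrative DeepDendrite-style training loop (all names are ours)."""
    pending_updates = []
    for stimulus, teacher in samples:              # I/O module: pick a sample
        responses = simulate(weights, stimulus)    # DHS-based simulation engine
        pending_updates.append(learning_rule(responses, teacher))  # learning module
        if len(pending_updates) == batch_size:     # mini-batch: average the updates
            weights = weights + np.mean(pending_updates, axis=0)
            pending_updates = []
    if pending_updates:                            # flush the last partial batch
        weights = weights + np.mean(pending_updates, axis=0)
    return weights
# Testing uses the same loop but skips the weight update (weights stay fixed).
```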
HPC-Net model

Image classification is a typical task in the field of AI: a model must learn to recognize the content of a given image and output the corresponding label. Here we present HPC-Net, a network of detailed human pyramidal neuron models that can learn to perform image classification by utilizing the DeepDendrite platform. HPC-Net has three layers: an input layer, a hidden layer, and an output layer. Neurons in the input layer receive spike trains converted from images as their input. Hidden-layer neurons receive the output of the input-layer neurons and deliver their responses to neurons in the output layer. The responses of the output-layer neurons are taken as the final output of HPC-Net. Neurons in adjacent layers are fully connected.

For each image stimulus, we first convert each normalized pixel to a homogeneous spike train. For the pixel at coordinates (x, y) in the image, the corresponding spike train has a constant interspike interval ISI(x, y) (in ms), which is determined by the pixel value p(x, y) as shown in Eq. (1).

In our experiments, the simulation of each stimulus lasted 50 ms. All spike trains started at 9 + ISI ms and lasted until the end of the simulation. We then attached the spike trains to the input-layer neurons in a one-to-one manner. The synaptic current triggered by a spike arriving at time T_0 is determined by the post-synaptic voltage v, the reversal potential E_syn = 1 mV, the maximum synaptic conductance g_max = 0.05 μS, and the time constant τ = 0.5 ms.

Neurons in the input layer were modeled as passive single compartments with the following parameters: membrane capacitance c_m = 1.0 μF cm⁻², membrane resistance r_m = 10,000 Ω cm², axial resistivity r_a = 100 Ω cm, and reversal potential of the passive compartment E_l = 0 mV.

The hidden layer contains a group of human pyramidal neuron models that receive the somatic voltages of the input-layer neurons. The morphology was taken from Eyal et al. [51], and all neurons were modeled with passive cables: specific membrane capacitance c_m = 1.5 μF cm⁻², membrane resistance r_m = 48,300 Ω cm², axial resistivity r_a = 261.97 Ω cm, and reversal potential of all passive cables E_l = 0 mV. Input neurons could make multiple connections to randomly selected locations on the dendrites of hidden neurons. The synaptic current activated by the k-th synapse of input neuron i on the dendrite of neuron j is defined as in Eq. (4), where g_ijk is the synaptic conductance, W_ijk is the synaptic weight, the somatic activation function is ReLU-like, and v_i(t) is the somatic voltage of input neuron i at time t.

Neurons in the output layer were also modeled as passive single compartments, and each hidden neuron made only one synaptic connection to each output neuron. All specific parameters were set the same as those of the input neurons. Synaptic currents activated by hidden neurons also take the form of Eq. (4).

Image classification with HPC-Net

For each input image, we first normalized all pixel values to 0.0-1.0, converted the normalized pixels to spike trains, and attached them to the input neurons. The somatic voltages of the output neurons are used to compute the predicted probability of each class, as shown in Eq. (6), where p_i is the probability of the i-th class predicted by HPC-Net, computed from the somatic voltage of the i-th output neuron averaged over 20 ms to 50 ms, and C, the number of classes, equals the number of output neurons. The class with the maximum predicted probability is the final classification result.
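As a small illustration of this readout, the NumPy sketch below converts per-output-neuron average somatic voltages (over the 20-50 ms window) into class probabilities and a predicted label. Treating Eq. (6) as a softmax over the average voltages is an assumption made here for illustration; the exact form of Eq. (6) is not reproduced in the text.

```python
import numpy as np

def predict_class(v_avg):
    """Map average somatic voltages of the C output neurons to a class prediction.

    `v_avg` has shape (C,). A softmax over the averaged voltages is assumed
    here as the probability mapping of Eq. (6).
    """
    z = v_avg - np.max(v_avg)            # numerical stabilization
    p = np.exp(z) / np.sum(np.exp(z))    # predicted probability p_i for each class
    return int(np.argmax(p)), p          # class with the maximum probability

# example: 10 output neurons, one per MNIST digit class
label, probs = predict_class(np.random.rand(10))
```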
In this paper, we built HPC-Net with 784 input neurons, 64 hidden neurons, and 10 output neurons.

Synaptic plasticity rules for HPC-Net

Inspired by previous work [36], we use a gradient-based learning rule to train HPC-Net on the image classification task. The loss function is the cross-entropy, given in Eq. (7), where p_i is the predicted probability for class i and y_i indicates the actual class of the stimulus image: y_i = 1 if the input image belongs to class i, and 0 otherwise.

When training HPC-Net, we compute the update for the weight W_ijk (the synaptic weight of the k-th synapse connecting neuron i to neuron j) at each time step. After the simulation of each image stimulus, W_ijk is updated as shown in Eq. (8). Here, η is the learning rate and ΔW_ijk(t) is the update value at time t; v_j and v_i are the somatic voltages of neurons j and i, respectively; I_ijk is the k-th synaptic current activated by neuron i on neuron j and g_ijk its synaptic conductance; r_ijk is the transfer resistance between the k-th compartment connected by neuron i on neuron j's dendrite and neuron j's soma; and t_s = 30 ms and t_e = 50 ms are the start and end times of learning, respectively. For output neurons, the error term is computed as shown in Eq. (10); for hidden neurons, the error term is calculated from the error terms of the output layer, as given in Eq. (11).

Since all output neurons are single-compartment models, the transfer resistance equals the input resistance of the corresponding compartment. Transfer and input resistances are computed by NEURON.

Mini-batch training is a standard method in deep learning for achieving higher prediction accuracy and accelerating convergence, and DeepDendrite also supports it. When training HPC-Net with mini-batch size N_batch, we make N_batch copies of HPC-Net. During training, each copy is fed a different training sample from the batch. DeepDendrite first computes the weight update for each copy separately; after all copies in the current batch are done, the average weight update is calculated and the weights in all copies are updated by this same amount.

Robustness against adversarial attack with HPC-Net

To demonstrate the robustness of HPC-Net, we tested its prediction accuracy on adversarial samples and compared it with an analogous ANN, one with the same 784-64-10 structure and ReLU activations (for a fair comparison, in this HPC-Net each input neuron made only one synaptic connection to each hidden neuron). We first trained HPC-Net and the ANN on the original training set (clean images). We then added adversarial noise to the test set and measured their prediction accuracy on the noisy test images. We used Foolbox [98, 99] to generate adversarial noise with the FGSM method [93]. The ANN was trained with PyTorch [100], and HPC-Net was trained with our DeepDendrite. For fairness, the adversarial noise was generated on a significantly different network model, a 20-layer ResNet [101], with noise levels ranging from 0.02 to 0.2. We experimented on two typical datasets, MNIST [95] and Fashion-MNIST [96]. The results show that the prediction accuracy of HPC-Net is 19% and 16.72% higher than that of the analogous ANN on the two datasets, respectively.
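For reference, the following is a minimal PyTorch-style sketch of the FGSM perturbation underlying these adversarial tests. It re-implements the attack directly for illustration (the experiments described above generated the noise with Foolbox); model, images, and labels are assumed inputs, with pixel values in 0.0-1.0.

```python
import torch
import torch.nn.functional as F

def fgsm_perturb(model, images, labels, eps):
    """Fast Gradient Sign Method: shift each input by eps in the direction of
    the sign of the loss gradient. Illustrative re-implementation only; the
    experiments in the text used Foolbox to generate the adversarial noise.
    """
    images = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    loss.backward()
    adv = images + eps * images.grad.sign()   # eps is the noise level (0.02-0.2)
    return adv.clamp(0.0, 1.0).detach()       # keep pixels in the valid range
```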
Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

The data that support the findings of this study are available within the paper, the Supplementary Information and the Source Data files provided with this paper. The source code and data used to reproduce the results in Figs. 3–6 are available at https://github.com/pkuzyc/DeepDendrite. The MNIST dataset is publicly available at http://yann.lecun.com/exdb/mnist. The Fashion-MNIST dataset is publicly available at https://github.com/zalandoresearch/fashion-mnist. Source data are provided with this paper.

Code availability

The source code of DeepDendrite, as well as the models and code used to reproduce Figs. 3–6 in this study, is available at https://github.com/pkuzyc/DeepDendrite.

References

1. McCulloch, W. S. & Pitts, W. A logical calculus of the ideas immanent in nervous activity. Bull. Math. Biophys. 5, 115–133 (1943).
2. LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).
3. Poirazi, P., Brannon, T. & Mel, B. W. Arithmetic of subthreshold synaptic summation in a model CA1 pyramidal cell. Neuron 37, 977–987 (2003).
4. London, M. & Häusser, M. Dendritic computation. Annu. Rev. Neurosci. 28, 503–532 (2005).
5. Branco, T. & Häusser, M. The single dendritic branch as a fundamental functional unit in the nervous system. Curr. Opin. Neurobiol. 20, 494–502 (2010).
6. Stuart, G. J. & Spruston, N. Dendritic integration: 60 years of progress. Nat. Neurosci. 18, 1713–1721 (2015).
7. Poirazi, P. & Papoutsi, A. Illuminating dendritic function with computational models. Nat. Rev. Neurosci. 21, 303–321 (2020).
8. Yuste, R. & Denk, W. Dendritic spines as basic functional units of neuronal integration. Nature 375, 682–684 (1995).
9. Engert, F. & Bonhoeffer, T. Dendritic spine changes associated with hippocampal long-term synaptic plasticity. Nature 399, 66–70 (1999).
10. Yuste, R. Dendritic spines and distributed circuits. Neuron 71, 772–781 (2011).
11. Yuste, R. Electrical compartmentalization in dendritic spines. Annu. Rev. Neurosci. 36, 429–449 (2013).
12. Rall, W. Branching dendritic trees and motoneuron membrane resistivity. Exp. Neurol. 1, 491–527 (1959).
13. Segev, I. & Rall, W. Computational study of an excitable dendritic spine. J. Neurophysiol. 60, 499–523 (1988).
14. Silver, D. et al. Mastering the game of Go with deep neural networks and tree search. Nature 529, 484–489 (2016).
15. Silver, D. et al. A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. Science 362, 1140–1144 (2018).
16. McCloskey, M. & Cohen, N. J. Catastrophic interference in connectionist networks: the sequential learning problem. Psychol. Learn. Motiv. 24, 109–165 (1989).
17. French, R. M. Catastrophic forgetting in connectionist networks. Trends Cogn. Sci. 3, 128–135 (1999).
18. Naud, R. & Sprekeler, H. Sparse bursts optimize information transmission in a multiplexed neural code. Proc. Natl Acad. Sci. USA 115, E6329–E6338 (2018).
19. Sacramento, J., Costa, R. P., Bengio, Y. & Senn, W. Dendritic cortical microcircuits approximate the backpropagation algorithm. In Advances in Neural Information Processing Systems 31 (NeurIPS, 2018).
20. Payeur, A., Guerguiev, J., Zenke, F., Richards, B. A. & Naud, R. Burst-dependent synaptic plasticity can coordinate learning in hierarchical circuits. Nat. Neurosci. 24, 1010–1019 (2021).
21. Bicknell, B. A. & Häusser, M. A synaptic learning rule for exploiting nonlinear dendritic computation. Neuron 109, 4001–4017 (2021).
22. Moldwin, T., Kalmenson, M. & Segev, I. The gradient clusteron: a model neuron that learns to solve classification tasks via dendritic nonlinearities, structural plasticity, and gradient descent. PLoS Comput. Biol. 17, e1009015 (2021).
23. Hodgkin, A. L. & Huxley, A. F. A quantitative description of membrane current and its application to conduction and excitation in nerve. J. Physiol. 117, 500–544 (1952).
24. Rall, W. Theory of physiological properties of dendrites. Ann. N. Y. Acad. Sci. 96, 1071–1092 (1962).
25. Hines, M. L. & Carnevale, N. T. The NEURON simulation environment. Neural Comput. 9, 1179–1209 (1997).
26. Bower, J. M. & Beeman, D. in The Book of GENESIS: Exploring Realistic Neural Models with the GEneral NEural SImulation System (eds Bower, J. M. & Beeman, D.) 17–27 (Springer New York, 1998).
27. Hines, M. L., Eichner, H. & Schürmann, F. Neuron splitting in compute-bound parallel network simulations enables runtime scaling with twice as many processors. J. Comput. Neurosci. 25, 203–210 (2008).
28. Hines, M. L., Markram, H. & Schürmann, F. Fully implicit parallel simulation of single neurons. J. Comput. Neurosci. 25, 439–448 (2008).
29. Ben-Shalom, R., Liberman, G. & Korngreen, A. Accelerating compartmental modeling on a graphical processing unit. Front. Neuroinform. 7, 4 (2013).
30. Tsuyuki, T., Yamamoto, Y. & Yamazaki, T. Efficient numerical simulation of neuron models with spatial structure on graphics processing units. In Proc. 2016 International Conference on Neural Information Processing (eds Hirose, A. et al.) 279–285 (Springer International Publishing, 2016).
31. Vooturi, D. T., Kothapalli, K. & Bhalla, U. S. Parallelizing Hines matrix solver in neuron simulations on GPU. In Proc. IEEE 24th International Conference on High Performance Computing (HiPC) 388–397 (IEEE, 2017).
32. Huber, F. Efficient tree solver for Hines matrices on the GPU. Preprint at https://arxiv.org/abs/1810.12742 (2018).
33. Korte, B. & Vygen, J. Combinatorial Optimization: Theory and Algorithms 6th edn (Springer, 2018).
34. Gebali, F. Algorithms and Parallel Computing (Wiley, 2011).
35. Kumbhar, P. et al. CoreNEURON: an optimized compute engine for the NEURON simulator. Front. Neuroinform. 13, 63 (2019).
36. Urbanczik, R. & Senn, W. Learning by the dendritic prediction of somatic spiking. Neuron 81, 521–528 (2014).
37. Ben-Shalom, R., Aviv, A., Razon, B. & Korngreen, A. Optimizing ion channel models using a parallel genetic algorithm on graphical processors. J. Neurosci. Methods 206, 183–194 (2012).
38. Mascagni, M. A parallelizing algorithm for computing solutions to arbitrarily branched cable neuron models. J. Neurosci. Methods 36, 105–114 (1991).
39. McDougal, R. A. et al. Twenty years of ModelDB and beyond: building essential modeling tools for the future of neuroscience. J. Comput. Neurosci. 42, 1–10 (2017).
40. Migliore, M., Messineo, L. & Ferrante, M. Dendritic Ih selectively blocks temporal summation of unsynchronized distal inputs in CA1 pyramidal neurons. J. Comput. Neurosci. 16, 5–13 (2004).
41. Hemond, P. et al. Distinct classes of pyramidal cells exhibit mutually exclusive firing patterns in hippocampal area CA3b. Hippocampus 18, 411–424 (2008).
42. Hay, E., Hill, S., Schürmann, F., Markram, H. & Segev, I. Models of neocortical layer 5b pyramidal cells capturing a wide range of dendritic and perisomatic active properties. PLoS Comput. Biol. 7, e1002107 (2011).
43. Masoli, S., Solinas, S. & D'Angelo, E. Action potential processing in a detailed Purkinje cell model reveals a critical role for axonal compartmentalization. Front. Cell. Neurosci. 9, 47 (2015).
44. Lindroos, R. et al. Basal ganglia neuromodulation over multiple temporal and structural scales—simulations of direct pathway MSNs investigate the fast onset of dopaminergic effects and predict the role of Kv4.2. Front. Neural Circuits 12, 3 (2018).
45. Migliore, M. et al. Synaptic clusters function as odor operators in the olfactory bulb. Proc. Natl Acad. Sci. USA 112, 8499–8504 (2015).
46. NVIDIA. CUDA C++ Programming Guide. https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html (2021).
47. NVIDIA. CUDA C++ Best Practices Guide. https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html (2021).
48. Harnett, M. T., Makara, J. K., Spruston, N., Kath, W. L. & Magee, J. C. Synaptic amplification by dendritic spines enhances input cooperativity. Nature 491, 599–602 (2012).
49. Chiu, C. Q. et al. Compartmentalization of GABAergic inhibition by dendritic spines. Science 340, 759–762 (2013).
50. Tønnesen, J., Katona, G., Rózsa, B. & Nägerl, U. V. Spine neck plasticity regulates compartmentalization of synapses. Nat. Neurosci. 17, 678–685 (2014).
51. Eyal, G. et al. Human cortical pyramidal neurons: from spines to spikes via models. Front. Cell. Neurosci. 12, 181 (2018).
52. Koch, C. & Zador, A. The function of dendritic spines: devices subserving biochemical rather than electrical compartmentalization. J. Neurosci. 13, 413–422 (1993).
53. Koch, C. Dendritic spines. In Biophysics of Computation (Oxford University Press, 1999).
54. Rapp, M., Yarom, Y. & Segev, I. The impact of parallel fiber background activity on the cable properties of cerebellar Purkinje cells. Neural Comput. 4, 518–533 (1992).
55. Hines, M. Efficient computation of branched nerve equations. Int. J. Bio-Med. Comput. 15, 69–76 (1984).
56. Nayebi, A. & Ganguli, S. Biologically inspired protection of deep networks from adversarial attacks. Preprint at https://arxiv.org/abs/1703.09202 (2017).
57. Goddard, N. H. & Hood, G. Large-scale simulation using parallel GENESIS. In The Book of GENESIS: Exploring Realistic Neural Models with the GEneral NEural SImulation System (eds Bower, J. M. & Beeman, D.) 349–379 (Springer New York, 1998).
58. Migliore, M., Cannia, C., Lytton, W. W., Markram, H. & Hines, M. L. Parallel network simulations with NEURON. J. Comput. Neurosci. 21, 119 (2006).
59. Lytton, W. W. et al. Simulation neurotechnologies for advancing brain research: parallelizing large networks in NEURON. Neural Comput. 28, 2063–2090 (2016).
60. Valero-Lara, P. et al. cuHinesBatch: solving multiple Hines systems on GPUs human brain project. In Proc. 2017 International Conference on Computational Science 566–575 (IEEE, 2017).
61. Akar, N. A. et al. Arbor—a morphologically-detailed neural network simulation library for contemporary high-performance computing architectures. In Proc. 27th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP) 274–282 (IEEE, 2019).
62. Ben-Shalom, R. et al. NeuroGPU: accelerating multi-compartment, biophysically detailed neuron simulations on GPUs. J. Neurosci. Methods 366, 109400 (2022).
63. Rempe, M. J. & Chopp, D. L. A predictor-corrector algorithm for reaction-diffusion equations associated with neural activity on branched structures. SIAM J. Sci. Comput. 28, 2139–2161 (2006).
64. Kozloski, J. & Wagner, J. An ultrascalable solution to large-scale neural tissue simulation. Front. Neuroinform. 5, 15 (2011).
65. Jayant, K. et al. Targeted intracellular voltage recordings from dendritic spines using quantum-dot-coated nanopipettes. Nat. Nanotechnol. 12, 335–342 (2017).
66. Palmer, L. M. & Stuart, G. J. Membrane potential changes in dendritic spines during action potentials and synaptic input. J. Neurosci. 29, 6897–6903 (2009).
67. Nishiyama, J. & Yasuda, R. Biochemical computation for spine structural plasticity. Neuron 87, 63–75 (2015).
68. Yuste, R. & Bonhoeffer, T. Morphological changes in dendritic spines associated with long-term synaptic plasticity. Annu. Rev. Neurosci. 24, 1071–1089 (2001).
69. Holtmaat, A. & Svoboda, K. Experience-dependent structural synaptic plasticity in the mammalian brain. Nat. Rev. Neurosci. 10, 647–658 (2009).
70. Caroni, P., Donato, F. & Muller, D. Structural plasticity upon learning: regulation and functions. Nat. Rev. Neurosci. 13, 478–490 (2012).
71. Keck, T. et al. Massive restructuring of neuronal circuits during functional reorganization of adult visual cortex. Nat. Neurosci. 11, 1162 (2008).
72. Hofer, S. B., Mrsic-Flogel, T. D., Bonhoeffer, T. & Hübener, M. Experience leaves a lasting structural trace in cortical circuits. Nature 457, 313–317 (2009).
73. Trachtenberg, J. T. et al. Long-term in vivo imaging of experience-dependent synaptic plasticity in adult cortex. Nature 420, 788–794 (2002).
74. Marik, S. A., Yamahachi, H., McManus, J. N., Szabo, G. & Gilbert, C. D. Axonal dynamics of excitatory and inhibitory neurons in somatosensory cortex. PLoS Biol. 8, e1000395 (2010).
75. Xu, T. et al. Rapid formation and selective stabilization of synapses for enduring motor memories. Nature 462, 915–919 (2009).
76. Albarran, E., Raissi, A., Jáidar, O., Shatz, C. J. & Ding, J. B. Enhancing motor learning by increasing the stability of newly formed dendritic spines in the motor cortex. Neuron 109, 3298–3311 (2021).
77. Branco, T. & Häusser, M. Synaptic integration gradients in single cortical pyramidal cell dendrites. Neuron 69, 885–892 (2011).
78. Major, G., Larkum, M. E. & Schiller, J. Active properties of neocortical pyramidal neuron dendrites. Annu. Rev. Neurosci. 36, 1–24 (2013).
79. Gidon, A. et al. Dendritic action potentials and computation in human layer 2/3 cortical neurons. Science 367, 83–87 (2020).
80. Doron, M., Chindemi, G., Muller, E., Markram, H. & Segev, I. Timed synaptic inhibition shapes NMDA spikes, influencing local dendritic processing and global I/O properties of cortical neurons. Cell Rep. 21, 1550–1561 (2017).
81. Du, K. et al. Cell-type-specific inhibition of the dendritic plateau potential in striatal spiny projection neurons. Proc. Natl Acad. Sci. USA 114, E7612–E7621 (2017).
82. Smith, S. L., Smith, I. T., Branco, T. & Häusser, M. Dendritic spikes enhance stimulus selectivity in cortical neurons in vivo. Nature 503, 115–120 (2013).
83. Xu, N.-l. et al. Nonlinear dendritic integration of sensory and motor input during an active sensing task. Nature 492, 247–251 (2012).
84. Takahashi, N., Oertner, T. G., Hegemann, P. & Larkum, M. E. Active cortical dendrites modulate perception. Science 354, 1587–1590 (2016).
85. Sheffield, M. E. & Dombeck, D. A. Calcium transient prevalence across the dendritic arbour predicts place field properties. Nature 517, 200–204 (2015).
86. Markram, H. et al. Reconstruction and simulation of neocortical microcircuitry. Cell 163, 456–492 (2015).
87. Billeh, Y. N. et al. Systematic integration of structural and functional data into multi-scale models of mouse primary visual cortex. Neuron 106, 388–403 (2020).
88. Hjorth, J. et al. The microcircuits of striatum in silico. Proc. Natl Acad. Sci. USA 117, 202000671 (2020).
89. Guerguiev, J., Lillicrap, T. P. & Richards, B. A. Towards deep learning with segregated dendrites. eLife 6, e22901 (2017).
90. Iyer, A. et al. Avoiding catastrophe: active dendrites enable multi-task learning in dynamic environments. Front. Neurorobot. 16, 846219 (2022).
91. Jones, I. S. & Kording, K. P. Might a single neuron solve interesting machine learning problems through successive computations on its dendritic tree? Neural Comput. 33, 1554–1571 (2021).
92. Bird, A. D., Jedlicka, P. & Cuntz, H. Dendritic normalisation improves learning in sparsely connected artificial neural networks. PLoS Comput. Biol. 17, e1009202 (2021).
93. Goodfellow, I. J., Shlens, J. & Szegedy, C. Explaining and harnessing adversarial examples. In 3rd International Conference on Learning Representations (ICLR, 2015).
94. Papernot, N., McDaniel, P. & Goodfellow, I. Transferability in machine learning: from phenomena to black-box attacks using adversarial samples. Preprint at https://arxiv.org/abs/1605.07277 (2016).
95. Lecun, Y., Bottou, L., Bengio, Y. & Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 86, 2278–2324 (1998).
96. Xiao, H., Rasul, K. & Vollgraf, R. Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms. Preprint at http://arxiv.org/abs/1708.07747 (2017).
97. Bartunov, S. et al. Assessing the scalability of biologically-motivated deep learning algorithms and architectures. In Advances in Neural Information Processing Systems 31 (NeurIPS, 2018).
98. Rauber, J., Brendel, W. & Bethge, M. Foolbox: a Python toolbox to benchmark the robustness of machine learning models. In Reliable Machine Learning in the Wild Workshop, 34th International Conference on Machine Learning (2017).
99. Rauber, J., Zimmermann, R., Bethge, M. & Brendel, W. Foolbox Native: fast adversarial attacks to benchmark the robustness of machine learning models in PyTorch, TensorFlow, and JAX. J. Open Source Softw. 5, 2607 (2020).
100. Paszke, A. et al. PyTorch: an imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems 32 (NeurIPS, 2019).
101. He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proc. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 770–778 (IEEE, 2016).

Acknowledgements

The authors sincerely thank Dr. Rita Zhang, Daochen Shi and members at NVIDIA for their valuable technical support with GPU computing. This work was supported by the National Key R&D Program of China (No. 2020AAA0130400) to K.D. and T.H.; the National Natural Science Foundation of China (No. 61088102) to T.H.; the National Key R&D Program of China (No. 2022ZD01163005) to L.M.; the Key Area R&D Program of Guangdong Province (No. 2018B030338001) to T.H.; the National Natural Science Foundation of China (No. 61825101) to Y.T.; the Swedish Research Council (VR-M-2020-01652), the Swedish e-Science Research Centre (SeRC), EU/Horizon 2020 No. 945539 (HBP SGA3), and KTH Digital Futures to J.H.K., J.H., and A.K.; and the Swedish Research Council (VR-M-2021-01995) and EU/Horizon 2020 No. 945539 (HBP SGA3) to S.G. and A.K. Part of the simulations were enabled by resources provided by the Swedish National Infrastructure for Computing (SNIC) at PDC, KTH, partially funded by the Swedish Research Council through grant agreement No. 2018-05973.

This paper is under the CC BY 4.0 Deed (Attribution 4.0 International) license.