Scientists Build a GPU Engine That Simulates Brain Cells up to 1,500 Times Faster

Authors: Yichen Zhang, Gan He, Lei Ma, Xiaofei Liu, J. J. Johannes Hjorth, Alexander Kozlov, Yutao He, Shenjian Zhang, Jeanette Hellgren Kotaleski, Yonghong Tian, Sten Grillner, Kai Du, Tiejun Huang

Abstract

Biophysically detailed multi-compartment models are powerful tools for exploring the computational principles of the brain and also serve as a theoretical framework for generating algorithms for artificial intelligence (AI) systems. However, their high computational cost severely limits applications in both neuroscience and AI. The major bottleneck in simulating detailed compartment models is the ability of a simulator to solve large systems of linear equations. Here, we present a Dendritic Hierarchical Scheduling (DHS) method to markedly accelerate this process. We theoretically prove that the DHS implementation is computationally optimal and accurate. This GPU-based method runs two to three orders of magnitude faster than the classic serial Hines method on a conventional CPU platform. We build a DeepDendrite framework, which integrates the DHS method with the GPU computing engine of the NEURON simulator, and demonstrate applications of DeepDendrite in neuroscience. We investigate how spatial patterns of spine inputs affect neuronal excitability in a detailed human pyramidal neuron model with about 25,000 spines.

Introduction

Deciphering the coding and computational principles of neurons is essential to neuroscience. The mammalian brain is composed of thousands of different types of neurons with unique morphological and biophysical properties.
The point-neuron view, in which neurons are regarded as simple summing units, is still widely applied in neural computation, especially in neural network analysis. In recent years, modern artificial intelligence (AI) has used this principle to develop powerful tools such as artificial neural networks (ANNs). However, in addition to comprehensive computations at the single-neuron level, subcellular compartments such as neuronal dendrites can also perform nonlinear operations as independent computational units. Furthermore, dendritic spines, small protrusions that densely cover the dendrites of spiny neurons, can compartmentalize synaptic signals, allowing them to be separated from their parent dendrites ex vivo and in vivo.

Simulations using biologically detailed neurons provide a theoretical framework for linking biological details to computational principles. This framework allows us to model neurons with realistic dendritic morphologies, intrinsic ionic conductances and extrinsic synaptic inputs. The backbone of the detailed multi-compartment model, i.e., the dendrites, is built on classical cable theory, which models the biophysical membrane properties of dendrites as passive cables and provides a mathematical description of how electrical signals invade and propagate throughout complex neuronal processes. By combining cable theory with active biophysical mechanisms such as ion channels and excitatory and inhibitory synaptic currents, a detailed multi-compartment model can capture cellular and subcellular neuronal computations beyond experimental limitations.
Beyond its profound impact on neuroscience, biologically detailed neuron models have recently been used to bridge the gap between neuronal structural and biophysical details and AI. The prevailing technique in modern AI is ANNs consisting of point neurons, an analog of biological neural networks. Although ANNs trained with the “backpropagation-of-error” (backprop) algorithm have achieved remarkable performance in specialized applications, even beating top human professional players at Go and chess, the human brain still outperforms ANNs in domains that involve more dynamic and noisy environments. Recent theoretical studies suggest that dendritic integration is crucial for generating efficient learning algorithms that potentially exceed backprop in parallel information processing. Furthermore, a single detailed multi-compartment model can learn network-level nonlinear computations for point neurons by adjusting only the synaptic strength, which demonstrates the full potential of detailed models for building more powerful brain-like AI systems. Therefore, it is a high priority to expand paradigms in brain-like AI from single detailed neuron models to large-scale biologically detailed networks.

A long-standing challenge of the detailed simulation approach lies in its exceedingly high computational cost, which has severely limited its application to neuroscience and AI. To improve efficiency, the classic Hines method reduces the time complexity of solving the equations from O(n³) to O(n), and it has been widely adopted as the core algorithm in popular simulators such as NEURON and GENESIS. However, this method uses a serial approach that processes each compartment sequentially. When a simulation involves multiple biophysically detailed dendrites with dendritic spines, the linear equation matrix (“Hines matrix”) scales accordingly with the increasing number of dendrites or spines (Fig. 1e), and the Hines method is no longer practical because it places a very heavy burden on the entire simulation.

Fig. 1. a A reconstructed layer-5 pyramidal neuron model and the mathematical formula used with detailed neuron models. b Workflow of the numerical simulation of detailed neuron models. The equation-solving phase is the bottleneck in the simulation. c An example of linear equations in the simulation. d Data dependency of the Hines method when solving the linear equations in c. e The number of linear equation systems to be solved increases significantly as models become more detailed. f Computational cost (steps taken in the equation-solving phase) of the serial Hines method on different types of neuron models. g Illustration of different solving methods. Different parts of a neuron are assigned to multiple processing units in the parallel methods (middle, right), shown in different colors. h Computational cost of the three methods in g when solving the equations of a pyramidal model with spines. i Run time: the time consumed by a 1 s simulation (solving the equations 40,000 times with a time step of 0.025 ms). p-Hines: parallel method in CoreNEURON (on GPU); Branch-based: branch-based parallel method (on GPU); DHS: Dendritic Hierarchical Scheduling method (on GPU).
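The linear-time, strictly sequential behavior of the Hines method described above can be sketched in a few lines. This is a minimal illustration of a Hines-style solve on a tree-structured system, not NEURON's implementation: the function names `hines_solve` and `tree_matvec`, the array names (`parent`, `d`, `a`, `b`, `rhs`) and the five-compartment example are assumptions made for this sketch. It assumes compartments are numbered so that every parent has a smaller index than its children, with the soma as node 0.

```python
def hines_solve(parent, d, a, b, rhs):
    """Solve a tree-structured linear system in O(n) (illustrative sketch).

    parent[i] : index of the parent of node i (parent[0] = -1, parent[i] < i)
    d[i]      : diagonal entry of row i
    a[i]      : entry in row parent[i], column i (unused for i = 0)
    b[i]      : entry in row i, column parent[i] (unused for i = 0)
    """
    n = len(d)
    d, rhs = list(d), list(rhs)  # work on copies
    # Triangularization: visit nodes from the leaves toward the root, so a
    # node is handled only after all of its children have been eliminated.
    for i in range(n - 1, 0, -1):
        p = parent[i]
        factor = a[i] / d[i]
        d[p] -= factor * b[i]
        rhs[p] -= factor * rhs[i]
    # Substitution: visit nodes from the root toward the leaves.
    x = [0.0] * n
    x[0] = rhs[0] / d[0]
    for i in range(1, n):
        x[i] = (rhs[i] - b[i] * x[parent[i]]) / d[i]
    return x


def tree_matvec(parent, d, a, b, x):
    """Multiply the same tree-structured matrix by x (used only as a check)."""
    y = [d[i] * x[i] for i in range(len(d))]
    for i in range(1, len(d)):
        p = parent[i]
        y[p] += a[i] * x[i]
        y[i] += b[i] * x[p]
    return y


# Hypothetical 5-compartment tree; the values are made up for illustration.
parent = [-1, 0, 0, 1, 1]
d = [4.0, 4.0, 4.0, 4.0, 4.0]
a = [0.0, -1.0, -1.0, -1.0, -1.0]
b = [0.0, -1.0, -1.0, -1.0, -1.0]
rhs = [1.0, 2.0, 3.0, 4.0, 5.0]

x = hines_solve(parent, d, a, b, rhs)
residual = max(abs(u - v) for u, v in zip(tree_matvec(parent, d, a, b, x), rhs))
```

Each loop touches every node once, which gives the O(n) cost, but the first loop also shows the data dependency: a parent row cannot be finalized until all of its children have been processed.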
Over the past decades, tremendous progress has been made in speeding up the Hines method with cellular-level parallel methods, which parallelize the computation of different parts within each cell. However, current cellular-level parallel methods often lack an efficient parallelization strategy or lack sufficient numerical accuracy compared to the original Hines method.

Here, we develop a fully automated, numerically accurate and optimized simulation tool that can significantly accelerate computation and reduce computational cost. In addition, this simulation tool can be seamlessly adopted for establishing and testing neural networks with biological details for machine learning and AI applications. We formulate the parallel computation of the Hines method as a scheduling problem and, drawing on parallel computing theory, develop the Dendritic Hierarchical Scheduling (DHS) method. We demonstrate that our algorithm provides optimal scheduling without any loss of accuracy. Furthermore, we have optimized DHS for current state-of-the-art GPUs by exploiting the GPU memory hierarchy and memory access mechanisms. Compared with the classic simulator NEURON, DHS speeds up computation by two to three orders of magnitude while maintaining identical accuracy.

To enable detailed dendritic simulations for use in AI, we create the DeepDendrite framework by combining the DHS-embedded CoreNEURON platform (an optimized compute engine for NEURON), which serves as the simulation engine, with two auxiliary modules (an I/O module and a learning module) that support dendritic learning algorithms during simulations. DeepDendrite runs on the GPU hardware platform and supports both regular simulation tasks in neuroscience and learning tasks in AI.
Last but not least, we also present several applications of DeepDendrite, targeting some critical challenges in neuroscience and AI: (1) We demonstrate how spatial patterns of dendritic spine inputs affect neuronal activity in neurons with spines distributed throughout the dendritic trees (full-spine models). DeepDendrite enables us to explore neuronal computation in a simulated human pyramidal neuron model with ~25,000 dendritic spines. (2) In the Discussion, we also consider the potential of DeepDendrite in the context of AI, specifically in creating ANNs with morphologically detailed human pyramidal neurons.

All source code for DeepDendrite, the full-spine models and the detailed dendritic network model is publicly available online (see Code Availability). Dendritic learning rules relevant to this framework include burst-dependent synaptic plasticity and learning with spike prediction. Overall, our study provides a complete set of tools that have the potential to change the current ecosystem of computational neuroscience. By harnessing the power of GPU computing, we expect these tools to facilitate system-level investigations of the computational principles of the brain's fine structures and to promote the interaction between neuroscience and modern AI.
Results

Dendritic Hierarchical Scheduling (DHS)

Computing ionic currents and solving linear equations are two critical phases when simulating biophysically detailed neurons; both are time-consuming and impose severe computational burdens. Fortunately, computing the ionic currents of each compartment is a fully independent process, so it can be naturally parallelized on devices with massively parallel computing units such as GPUs. As a consequence, solving linear equations becomes the remaining bottleneck for parallelization (Fig. 1a-f).

To tackle this bottleneck, cellular-level parallel methods have been developed, which accelerate single-cell computation by “splitting” a single cell into several compartments that can be computed in parallel. However, such methods rely heavily on prior knowledge to generate practical strategies for splitting a single neuron into compartments (Fig. 1g-i; Supplementary Fig. 1). They therefore become less efficient for neurons with asymmetric morphologies, e.g., pyramidal neurons and Purkinje neurons.

We aim to develop a more efficient and accurate parallel method for simulating biologically detailed neural networks. First, we establish the criteria for the accuracy of a cellular-level parallel method: we propose three conditions that ensure a parallel method yields solutions identical to those of the serial Hines method, according to the data dependency in the Hines method (see Methods). Then, based on simulation accuracy and computational cost, we formulate the parallelization problem as a mathematical scheduling problem (see Methods).
With k parallel threads, we can compute at most k nodes in each step, but we must ensure that a node is computed only after all of its child nodes have been processed; our goal is to find a strategy with the minimum number of steps for the whole procedure.

To generate an optimal partition, we propose a method called Dendritic Hierarchical Scheduling (DHS) (the theoretical proof is presented in the Methods). The DHS method comprises two steps, analyzing the dendritic topology and finding the best partition (Fig. 2a): (1) Given a detailed model, we first obtain the corresponding dependency tree and compute the depth of each node on the tree (the depth of a node is the number of its ancestor nodes) (Fig. 2b, c). (2) After topology analysis, we search the candidates and select at most k deepest candidate nodes (a node is a candidate only if all of its child nodes have been processed), repeating until all nodes are processed (Fig. 2d).

Fig. 2. a DHS workflow. DHS processes the k deepest candidate nodes in each iteration. b The model is first converted into a tree structure, and then the depth of each node is computed. c Topology analysis on different neuron models. Six neurons with different morphologies are shown here. For each model, the soma is selected as the root of the tree, so the depth of a node increases from the soma (0) to the distal dendrites. d Illustration of running DHS on the model in b. Candidates: nodes that can be processed. Selected candidates: nodes picked by DHS, i.e., the k deepest candidates. Processed nodes: nodes that have been processed before. e Parallelization strategy obtained by DHS after the process in d. Each node is assigned to one of four parallel threads. DHS reduces the number of serial node-processing steps from 14 to 5 by distributing nodes across multiple threads. f Relative cost, i.e., the ratio of the computational cost of DHS to that of the serial Hines method, when applying DHS with different numbers of threads to different types of models.

Take a simplified model with 15 compartments as an example. With the serial Hines method, it takes 14 steps to process all nodes, whereas DHS with four parallel units can partition the nodes into five subsets (Fig. 2d): {{9,10,12,14}, {1,7,11,13}, {2,3,4,8}, {6}, {5}}. Because nodes in the same subset can be processed in parallel, it takes only five steps to process all nodes with DHS (Fig. 2e).

Next, we apply the DHS method to six representative detailed neuron models (selected from ModelDB) with different numbers of threads (Fig. 2f), including cortical and hippocampal pyramidal neurons, cerebellar Purkinje neurons, striatal projection neurons (SPNs) and olfactory bulb mitral cells, covering the major principal neurons in sensory, cortical and subcortical areas. We then measured the computational cost. The relative computational cost is defined as the ratio of the computational cost of DHS to that of the serial Hines method. The computational cost, i.e., the number of steps taken in solving the equations, drops dramatically as the number of threads increases. For example, with 16 threads, the computational cost of DHS is 7%-10% of that of the serial Hines method. Intriguingly, the DHS method reaches the lower bounds of the computational cost for the presented neurons (Fig. 2f), suggesting that adding more threads does not further improve performance because of the dependencies between compartments.
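The two DHS steps described above (depth computation, then repeatedly picking at most k deepest candidates) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `dhs_schedule` and the 15-node tree are hypothetical (the tree is not the model from Fig. 2), nodes are assumed to be numbered so that `parent[i] < i`, and the root (node 0) is left out of the schedule, in line with the 14 steps quoted for a 15-compartment model.

```python
import heapq


def dhs_schedule(parent, k):
    """Greedy schedule: in each step, process at most k deepest candidates.

    parent[i] is the parent of node i (parent[0] = -1, parent[i] < i).
    Returns a list of steps; each step is a list of node ids that could be
    processed in parallel. The root (node 0) is not scheduled.
    """
    n = len(parent)
    depth = [0] * n
    pending_children = [0] * n
    for i in range(1, n):
        depth[i] = depth[parent[i]] + 1  # number of ancestors
        pending_children[parent[i]] += 1

    # Candidates are nodes whose children have all been processed.
    # heapq is a min-heap, so store negative depth to pop the deepest first.
    heap = [(-depth[i], i) for i in range(1, n) if pending_children[i] == 0]
    heapq.heapify(heap)

    steps = []
    while heap:
        picked = [heapq.heappop(heap)[1] for _ in range(min(k, len(heap)))]
        steps.append(picked)
        # Parents become candidates only for the NEXT step.
        for i in picked:
            p = parent[i]
            if p > 0:
                pending_children[p] -= 1
                if pending_children[p] == 0:
                    heapq.heappush(heap, (-depth[p], p))
    return steps


# Hypothetical 15-node dependency tree (node 0 is the soma).
parent = [-1, 0, 0, 1, 1, 2, 2, 3, 3, 4, 5, 5, 6, 7, 7]
steps = dhs_schedule(parent, k=4)
serial_steps = dhs_schedule(parent, k=1)
relative_cost = len(steps) / len(serial_steps)
```

For this toy tree, four threads need 4 steps where a single thread needs 14. No schedule can take fewer steps than the depth of the deepest node, which is one way to see why adding threads eventually stops helping.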
Together, we generate a DHS method that enables automated analysis of the dendritic topology and optimal partitioning for parallel computing. It is worth noting that DHS finds the optimal partition before the simulation starts, so no extra computation is needed while solving the equations.

Speeding up DHS by GPU memory boosting

DHS computes each neuron with multiple threads, which consumes a vast number of threads when running neural network simulations. GPUs contain massive numbers of processing units (streaming processors, SPs; Fig. 3a, b) for parallel computing. In theory, the many SPs on a GPU should support efficient simulation of large-scale neural networks (Fig. 3c). However, we consistently observed that the efficiency of DHS decreased significantly as the network size grew, which might result from scattered data storage or from extra memory accesses caused by loading and writing intermediate results (Fig. 3d, left).

Fig. 3. a GPU architecture and its memory hierarchy. Each GPU contains massive numbers of processing units (streaming processors). Different types of memory have different throughput. b Architecture of streaming multiprocessors (SMs). Each SM contains multiple streaming processors, registers and an L1 cache. c Applying DHS to two neurons, each with four threads. d Memory optimization strategy on the GPU. Top panel: thread assignment and data storage of DHS, before (left) and after (right) memory boosting. Processors send a data request to load data for each thread from global memory. Without memory boosting (left), it takes seven transactions to load all requested data, plus some extra transactions for intermediate results. With memory boosting (right), it takes only two transactions to load all requested data, and registers are used for intermediate results, which further improves memory throughput.
e Run time of DHS (32 threads per cell) with and without memory boosting on multiple layer-5 pyramidal models with spines. f Speedup from memory boosting on multiple layer-5 pyramidal models with spines. Memory boosting brings a 1.6- to 2-fold speedup.

We solve this problem with GPU memory boosting, a method that increases memory throughput by exploiting the GPU's memory hierarchy and access mechanism. Given the memory loading mechanism of the GPU, successive threads that load aligned and successively stored data achieve high memory throughput, whereas accessing scattered data reduces memory throughput. To achieve high throughput, we first align the computing order of the nodes and rearrange the threads according to the number of nodes assigned to them. Then we permute the data storage in global memory to be consistent with the computing order, i.e., nodes that are processed in the same step are stored successively in global memory. Moreover, we use GPU registers to store intermediate results, which further increases memory throughput. Experiments on multiple numbers of pyramidal neurons with spines and on the typical neuron models (Fig. 3e, f; Supplementary Fig. 2) show that memory boosting achieves a 1.2- to 3.8-fold speedup compared to naive DHS.

To comprehensively test the performance of DHS with GPU memory boosting, we select six typical neuron models and evaluate the run time of solving the cable equations on massive numbers of each model (Fig. 4). We examined DHS with four threads (DHS-4) and with sixteen threads (DHS-16) for each neuron.
Moreover, compared to the conventional serial Hines method in NEURON running on a single CPU thread, DHS speeds up the simulation by two to three orders of magnitude (Supplementary Fig. 3), while retaining identical numerical accuracy in the presence of dense spines (Supplementary Figs. 4 and 8), active dendrites (Supplementary Fig. 7) and different segmentation strategies (Supplementary Fig. 7).

Fig. 4. a Run time of solving the equations for a 1 s simulation on GPU (dt = 0.025 ms, 40,000 iterations in total). CoreNEURON: the parallel method used in CoreNEURON; DHS-4: DHS with four threads for each neuron; DHS-16: DHS with 16 threads for each neuron. b, c Visualization of the partition by DHS-4 and DHS-16; each color indicates a single thread.

DHS creates cell-type-specific optimal partitioning

To gain insight into the working mechanism of the DHS method, we visualize the partitioning process by mapping compartments to each thread (each color represents a single thread in Fig. 4b, c). The visualization shows that a single thread frequently switches between different branches (Fig. 4b, c). Interestingly, DHS generates aligned partitions in morphologically symmetric neurons such as the striatal projection neuron (SPN) and the mitral cell (Fig. 4b, c). By contrast, it generates fragmented partitions of morphologically asymmetric neurons such as the pyramidal neurons and the Purkinje cell (Fig. 4b, c), indicating that DHS splits the neural tree at the scale of individual compartments (i.e., tree nodes) rather than at the branch scale.

In summary, DHS and memory boosting generate a theoretically proven optimal solution for solving linear equations in parallel with unprecedented efficiency. Using this principle, we built the open-access DeepDendrite platform, which neuroscientists can use to implement models without any specific GPU programming knowledge.
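The storage permutation used in memory boosting, described earlier in this section, can be illustrated with a toy index calculation. This is a sketch under loud assumptions, not the DeepDendrite code and not a model of real GPU hardware: the helper names `coalesced_layout` and `count_transactions`, the example schedule, and the idea that one memory transaction fetches one aligned block of `width` consecutive elements are all simplifications chosen for illustration.

```python
def coalesced_layout(steps):
    """Assign storage addresses so that nodes processed in the same step
    are stored next to each other, in processing order."""
    address = {}
    for step in steps:
        for node in step:
            address[node] = len(address)
    return address


def count_transactions(steps, address, width):
    """Toy cost model: one transaction loads one aligned block of `width`
    consecutive elements; count the distinct blocks touched in each step."""
    return sum(len({address[node] // width for node in step}) for step in steps)


# Hypothetical schedule for a 15-node tree with four threads
# (each inner list is one parallel step; the root, node 0, is not scheduled).
steps = [[13, 14, 8, 9], [7, 10, 11, 12], [3, 4, 5, 6], [1, 2]]

# Naive layout: node i is stored at address i.
naive_address = {node: node for step in steps for node in step}
# Boosted layout: storage order follows the computing order.
boosted_address = coalesced_layout(steps)

naive = count_transactions(steps, naive_address, width=4)
boosted = count_transactions(steps, boosted_address, width=4)
```

In this toy setting the naive layout touches 8 blocks and the permuted layout touches 4, one per step. Actual transaction sizes and savings depend on the GPU and on the data types involved.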
DHS enables spine-level modeling

As dendritic spines receive most of the excitatory input to cortical and hippocampal pyramidal neurons, striatal projection neurons, etc., their morphologies and plasticity are crucial for regulating neuronal excitability. However, spines are too small (~1 μm in length) to be directly measured experimentally with regard to voltage-dependent processes.

We can model a single spine with two compartments: the spine head, where synapses are located, and the spine neck, which links the spine head to the dendrite. Theory predicts that the very thin spine neck (0.1-0.5 μm in diameter) electrically isolates the spine head from its parent dendrite, thereby compartmentalizing the signals generated at the spine head. However, the detailed model with fully distributed spines on dendrites (“full-spine model”) is computationally very expensive. A common compromise is to use a spine factor, F, instead of modeling all spines explicitly. Here, the F factor aims to approximate the effect of spines on the biophysical properties of the cell membrane.

Inspired by the previous work of Eyal et al., we investigated how different spatial patterns of excitatory inputs formed on dendritic spines shape neuronal activities in a human pyramidal neuron model with explicitly modeled spines (Fig. 5a). Notably, Eyal et al. employed the F factor to incorporate spines into dendrites, while only a few activated spines were explicitly attached to dendrites (“few-spine model” in Fig. 5a). The value of F in their model was computed from the dendritic area and spine area in the reconstructed data. Accordingly, we calculated the spine density from their reconstructed data to make our full-spine model more consistent with Eyal's few-spine model.
With the spine density set to 1.3 μm⁻¹, the pyramidal neuron model contained about 25,000 spines without altering the model's original morphological and biophysical properties. Further, we repeated the previous experimental protocols with both full-spine and few-spine models. We used the same synaptic input as in Eyal's work but added extra background noise to each sample. By comparing the somatic traces (Fig. 5b, c) and spike probability (Fig. 5d) in full-spine and few-spine models, we found that the full-spine model is much leakier than the few-spine model. In addition, the spike probability triggered by the activation of clustered spines appeared to be more nonlinear in the full-spine model (the solid blue line in Fig. 5d) than in the few-spine model (the dashed blue line in Fig. 5d). These results indicate that the conventional F-factor method may underestimate the impact of dense spines on dendritic excitability and nonlinearity.

Fig. 5. a Experiment setup. We examine two major types of models: few-spine models and full-spine models. Few-spine models (two on the left) incorporate spine area globally into dendrites and attach individual spines only together with activated synapses. In full-spine models (two on the right), all spines are explicitly attached over the whole dendrites. We explore the effects of clustered and randomly distributed synaptic inputs on the few-spine models and the full-spine models, respectively. b Somatic voltages recorded for the cases in a. Colors of the voltage curves correspond to a; scale bar: 20 ms, 20 mV. c Color-coded voltages during the simulation in b at specific times. Colors indicate the magnitude of the voltage. d Somatic spike probability as a function of the number of simultaneously activated synapses (as in Eyal et al.'s work) for the four cases in a. Background noise is attached. e Run time of the experiments in d with different simulation methods.
NEURON: conventional NEURON simulator running on a single CPU core. CoreNEURON: CoreNEURON simulator on a single GPU. DeepDendrite: DeepDendrite on a single GPU.

On the DeepDendrite platform, both full-spine and few-spine models achieved an 8-fold speedup compared to CoreNEURON on the GPU platform and a 100-fold speedup compared to serial NEURON on the CPU platform (Fig. 5e; Supplementary Table 1), while producing identical simulation results (Supplementary Figs. 4 and 8). Therefore, the DHS method enables explorations of dendritic excitability under more realistic anatomical conditions.

Discussion

In this work, we propose the DHS method to parallelize the computation of the Hines method. Next, we implement DHS on the GPU hardware platform and use GPU memory boosting techniques to refine DHS (Fig. 3). When simulating a large number of neurons with complex morphologies, DHS with memory boosting achieves a 15-fold speedup (Supplementary Table 1) compared to the GPU method used in CoreNEURON and up to a 1,500-fold speedup compared to the serial Hines method on the CPU platform (Fig. 4; Supplementary Fig. 3 and Supplementary Table 1). Furthermore, we develop the GPU-based DeepDendrite framework by integrating DHS into CoreNEURON. Finally, as a demonstration of the capacity of DeepDendrite, we present a representative application: examining spine computations in a detailed pyramidal neuron model with 25,000 spines. Further in this section, we elaborate on how we have expanded the DeepDendrite framework to enable efficient training of biophysically detailed neural networks. To explore the hypothesis that dendrites improve robustness against adversarial attacks, we train our network on typical image classification tasks.
We show that DeepDendrite can support both neuroscience simulations and AI-related detailed neural network tasks with unprecedented speed, thereby significantly promoting detailed neuroscience simulations and potentially future AI explorations.

Decades of effort have been invested in speeding up the Hines method with parallel methods. Early work mainly focused on network-level parallelization. In network simulations, each cell independently solves its corresponding linear equations with the Hines method. Network-level parallel methods distribute a network across multiple threads and parallelize the computation of each cell group with each thread. With network-level methods, we can simulate detailed networks on clusters or supercomputers. In recent years, GPUs have been used for detailed network simulation. Because a GPU contains massive numbers of computing units, one thread is usually assigned to one cell rather than to a cell group. With further optimization, GPU-based methods achieve much higher efficiency in network simulation. However, the computation inside each cell is still serial in network-level methods, so they still cannot cope when the “Hines matrix” of each cell grows large.

Cellular-level parallel methods further parallelize the computation inside each cell. The main idea of cellular-level parallel methods is to split each cell into several sub-blocks and parallelize the computation of those sub-blocks. However, typical cellular-level methods (e.g., the “multi-split” method) pay less attention to the parallelization strategy. The lack of a fine parallelization strategy results in unsatisfactory performance.
To achieve higher efficiency, some studies try to obtain finer-grained parallelization by introducing extra computation operations or by making approximations on some crucial compartments while solving the linear equations. These finer-grained parallelization strategies can achieve higher efficiency but lack the numerical accuracy of the original Hines method.

Unlike previous methods, DHS adopts the finest-grained parallelization strategy, i.e., compartment-level parallelization. By modeling the problem of “how to parallelize” as a combinatorial optimization problem, DHS provides an optimal compartment-level parallelization strategy. Moreover, DHS does not introduce any extra operation or value approximation, so it achieves the lowest computational cost while retaining the numerical accuracy of the original Hines method.

Dendritic spines are the most abundant microstructures in the brain for projection neurons in the cortex, hippocampus, cerebellum and basal ganglia. As spines receive most of the excitatory inputs in the central nervous system, electrical signals generated by spines are the main driving force for large-scale neuronal activities in the forebrain and cerebellum. The structure of the spine, with an enlarged spine head and a very thin spine neck, leads to surprisingly high input impedance at the spine head, which could be up to 500 MΩ according to combined experimental data and detailed compartment modeling. Due to such high input impedance, a single synaptic input can evoke a “gigantic” EPSP (~20 mV) at the spine-head level, thereby boosting NMDA currents and ion channel currents in the spine. However, in classic detailed compartment models, all spines are replaced by the F coefficient, which modifies the dendritic cable geometries. This approach may compensate for the leak currents and capacitance currents of spines.
Still, it cannot reproduce the high input impedance at the spine head, which may weaken excitatory synaptic inputs, particularly NMDA currents, thereby reducing the nonlinearity in the neuron's input-output curve. Our modeling results are in line with this interpretation.

On the other hand, the spine's electrical compartmentalization is always accompanied by biochemical compartmentalization, resulting in a drastic increase of internal [Ca²⁺] within the spine and a cascade of molecular processes involving synaptic plasticity of importance for learning and memory. Intriguingly, the biochemical process triggered by learning, in turn, remodels the spine's morphology, enlarging (or shrinking) the spine head or elongating (or shortening) the spine neck, which significantly alters the spine's electrical capacity. Such experience-dependent changes in spine morphology, also referred to as “structural plasticity”, have been widely observed in the visual cortex, somatosensory cortex, motor cortex, hippocampus and the basal ganglia in vivo. They play a critical role in motor and spatial learning as well as memory formation. However, due to the computational costs, nearly all detailed network models exploit the “F-factor” approach to replace actual spines and are thus unable to explore spine functions at the system level. By taking advantage of our framework and the GPU platform, we can run a few thousand detailed neuron models, each with tens of thousands of spines, on a single GPU, while remaining ~100 times faster than the traditional serial method on a single CPU (Fig. 5e). Therefore, it enables us to explore structural plasticity in large-scale circuit models across diverse brain regions.

Another critical issue is how to link dendrites to brain functions at the systems/network level.
It has been well established that dendrites can perform comprehensive computations on synaptic inputs due to enriched ion channels and local biophysical membrane properties [5, 6, 7]. For example, cortical pyramidal neurons can carry out sublinear synaptic integration at the proximal dendrite but progressively shift to supralinear integration at the distal dendrite [77]. Moreover, distal dendrites can produce regenerative events such as dendritic sodium spikes, calcium spikes, and NMDA spikes/plateau potentials [6, 78]. Such dendritic events are widely observed in mouse [6] and even human [79] cortical neurons in vitro, and may offer various logical operations [6, 79] or gating functions [80, 81]. Recently, in vivo recordings in awake or behaving mice have provided strong evidence that dendritic spikes/plateau potentials are crucial for orientation selectivity in the visual cortex [82], sensory-motor integration in the whisker system [83, 84], and spatial navigation in the hippocampal CA1 region [85].

To establish the causal link between dendrites and animal (including human) patterns of behavior, large-scale biophysically detailed neural circuit models are a powerful computational tool. However, running a large-scale detailed circuit model of 10,000-100,000 neurons generally requires the computing power of supercomputers. It is even more challenging to optimize such models for in vivo data, as this requires iterative simulations of the models. The DeepDendrite framework can directly support many state-of-the-art large-scale circuit models [86, 87, 88], which were initially developed based on NEURON. Moreover, using our framework, a single GPU card such as the Tesla A100 could easily support the operation of detailed circuit models of up to 10,000 neurons, thereby providing carbon-efficient and affordable plans for ordinary labs to develop and optimize their own large-scale detailed models.
Recent works on unraveling the dendritic roles in task-specific learning have achieved remarkable results in two directions, i.e., solving challenging tasks such as the image classification dataset ImageNet with simplified dendritic networks [20], and exploring full learning potentials on more realistic neurons [21, 22]. However, there is a trade-off between model size and biological detail, as the increase in network scale is often sacrificed for neuron-level complexity [19, 20, 89]. Moreover, more detailed neuron models are less mathematically tractable and more computationally expensive [21].

There has also been progress on the role of active dendrites in ANNs for computer vision tasks. Iyer et al. [90] proposed a novel ANN architecture with active dendrites, demonstrating competitive results in multi-task and continual learning. Jones and Kording [91] used a binary tree to approximate dendrite branching and provided valuable insights into the influence of tree structure on the computational capacity of single neurons. A further study [92] proposed a dendritic normalization rule based on biophysical behavior, offering an interesting perspective on the contribution of dendritic arbor structure to computation. While these studies offer valuable insights, they primarily rely on abstractions derived from spatially extended neurons, and do not fully exploit the detailed biological properties and spatial information of dendrites. Further investigation is needed to unveil the potential of leveraging more realistic neuron models for understanding the shared mechanisms underlying brain computation and deep learning.

In response to these challenges, we developed DeepDendrite, a tool that uses the Dendritic Hierarchical Scheduling (DHS) method to significantly reduce computational costs and incorporates an I/O module and a learning module to handle large datasets.
With DeepDendrite, we successfully implemented a three-layer hybrid neural network, the Human Pyramidal Cell Network (HPC-Net) (Fig. 6a, b). This network demonstrated efficient training capabilities in image classification tasks, achieving approximately 25 times speedup compared to training on a traditional CPU-based platform (Fig. 6f; Supplementary Table 1).

Fig. 6. a The illustration of the Human Pyramidal Cell Network (HPC-Net) for image classification. Images are transformed to spike trains and fed into the network model. Learning is triggered by error signals propagated from soma to dendrites. b Training with mini-batch. Multiple networks are simulated simultaneously with different images as inputs. The total weight updates ΔW are computed as the average of ΔW_i from each network. c Comparison of the HPC-Net before and after training. Left, the visualization of hidden neuron responses to a specific input before (top) and after (bottom) training. Right, the distribution of hidden layer weights (from input to hidden layer) before (top) and after (bottom) training. d We first generate adversarial samples of the test set on a 20-layer ResNet, then use these adversarial samples (noisy images) to test the classification accuracy of models trained with clean images. e Prediction accuracy of each model on adversarial samples after training 30 epochs on MNIST (left) and Fashion-MNIST (right) datasets. f Run time of training and testing for the HPC-Net. The batch size is set to 16. Left, run time of training one epoch. Right, run time of testing. Parallel NEURON + Python: training and testing on a single CPU with multiple cores, using 40-process-parallel NEURON to simulate the HPC-Net and extra Python code to support mini-batch training. DeepDendrite: training and testing the HPC-Net on a single GPU with DeepDendrite.
Additionally, it is widely recognized that the performance of Artificial Neural Networks (ANNs) can be undermined by adversarial attacks [93], i.e., intentionally engineered perturbations devised to mislead ANNs. Intriguingly, an existing hypothesis suggests that dendrites and synapses may innately defend against such attacks [56]. Our experimental results with HPC-Net lend support to this hypothesis: networks endowed with detailed dendritic structures showed somewhat increased resilience to transfer adversarial attacks [94] compared with standard ANNs, as is evident on the MNIST [95] and Fashion-MNIST [96] datasets (Fig. 6d, e). This evidence implies that the inherent biophysical properties of dendrites could be pivotal in augmenting the robustness of ANNs against adversarial interference. Nonetheless, further studies are needed to validate these findings on more challenging datasets such as ImageNet [97].

In conclusion, DeepDendrite has shown remarkable potential in image classification tasks, opening up a range of future directions. To further advance DeepDendrite and the application of biologically detailed dendritic models in AI tasks, we may focus on developing multi-GPU systems and exploring applications in other domains, such as Natural Language Processing (NLP), where dendritic filtering properties align well with the inherently noisy and ambiguous nature of human language. Challenges include testing scalability in larger-scale problems, understanding performance across various tasks and domains, and addressing the computational complexity introduced by novel biological principles, such as active dendrites.
By overcoming these limitations, we can further advance the understanding and capabilities of biophysically detailed dendritic neural networks, potentially uncovering new advantages, enhancing their robustness against adversarial attacks and noisy inputs, and ultimately bridging the gap between neuroscience and modern AI.

Methods

Simulation with DHS

The CoreNEURON [35] simulator (https://github.com/BlueBrain/CoreNeuron) uses the NEURON [25] architecture and is optimized for both memory usage and computational speed. We implement our Dendritic Hierarchical Scheduling (DHS) method in the CoreNEURON environment by modifying its source code. All models that can be simulated on GPU with CoreNEURON can also be simulated with DHS by executing the following command:

    coreneuron_exec -d /path/to/models -e time --cell-permute 3 --cell-nthread 16 --gpu

The usage options are as in Table 1.

Accuracy of the simulation using cellular-level parallel computation

To ensure the accuracy of the simulation, we first need to define the correctness of a cellular-level parallel algorithm, so that we can judge whether it generates solutions identical to those of the proven correct serial methods, like the Hines method used in the NEURON simulation platform. Based on the theories in parallel computing [34], a parallel algorithm will yield the same result as its corresponding serial algorithm if and only if the data processing order in the parallel algorithm is consistent with the data dependency in the serial method. The Hines method has two symmetrical phases: triangularization and back-substitution. By analyzing the serial computing Hines method [55], we find that its data dependency can be formulated as a tree structure, where the nodes on the tree represent the compartments of the detailed neuron model. In the triangularization process, the value of each node depends on its children nodes, whereas in back-substitution it depends on its parent node (Fig. 1d).
Thus, we can compute nodes on different branches in parallel, as their values do not depend on each other.

Based on the data dependency of the serial computing Hines method, we propose three conditions to make sure a parallel method will yield the same solutions as the serial computing Hines method: (1) The tree morphology and initial values of all nodes are identical to those in the serial computing Hines method; (2) In the triangularization phase, a node can be processed if and only if all its children nodes are already processed; (3) In the back-substitution phase, a node can be processed only if its parent node is already processed. Once a parallel computing method satisfies these three conditions, it will produce the same solutions as the serial computing method.

Computational cost of cellular-level parallel computing method

To theoretically evaluate the run time, i.e., the efficiency, of the serial and parallel computing methods, we introduce and formulate the concept of computational cost as follows: given a tree T and k threads (basic computational units) to perform triangularization, parallel triangularization is equivalent to dividing the node set V of T into n subsets, i.e., V = {V_1, V_2, …, V_n}, where the size of each subset |V_i| ≤ k, i.e., at most k nodes can be processed each step since there are only k threads. The triangularization phase follows the order V_1 → V_2 → … → V_n, and nodes in the same subset V_i can be processed in parallel. So, we define |V| (the size of set V, i.e., n here) as the computational cost of the parallel computing method. In short, we define the computational cost of a parallel method as the number of steps it takes in the triangularization phase. Because back-substitution is symmetrical with triangularization, the total cost of the entire equation-solving phase is twice that of the triangularization phase.
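The correctness condition for triangularization and the cost definition above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the tree is encoded as a hypothetical `parent` list (`parent[v]` is the parent of node `v`, `-1` for the root), and the function names `is_valid_schedule` and `computational_cost` are invented here.

```python
def is_valid_schedule(parent, partition, k):
    """Check a candidate triangularization schedule (assumed encoding).

    parent[v] is the parent of node v (-1 for the root); partition is a
    list of subsets V_1..V_n, each a list of node indices; k is the
    number of threads.
    """
    step_of = {}
    for step, subset in enumerate(partition):
        if len(subset) > k:                      # at most k nodes per step
            return False
        for v in subset:
            if not 0 <= v < len(parent) or v in step_of:
                return False                     # unknown or duplicated node
            step_of[v] = step
    if len(step_of) != len(parent):              # every node must be scheduled
        return False
    # Condition (2): each child is processed in a strictly earlier step
    # than its parent.
    return all(p < 0 or step_of[v] < step_of[p] for v, p in enumerate(parent))


def computational_cost(partition):
    """Cost of a schedule = number of steps in the triangularization phase."""
    return len(partition)


# Small example tree: 0 is the root, 1 and 2 its children,
# 3 and 4 children of 1, 5 a child of 2.
example_parent = [-1, 0, 0, 1, 1, 2]
serial = [[3], [4], [5], [1], [2], [0]]          # one node per step
parallel = [[3, 4], [5, 1], [2], [0]]            # two threads
```

Because back-substitution mirrors triangularization, the total cost of solving the equations in this sketch would be `2 * computational_cost(partition)`.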
Mathematical scheduling problem

Based on the simulation accuracy and computational cost, we formulate the parallelization problem as a mathematical scheduling problem:

Given a tree T = {V, E} and a positive integer k, where V is the node set and E is the edge set. Define a partition P(V) = {V_1, V_2, …, V_n}, |V_i| ≤ k, 1 ≤ i ≤ n, where |V_i| indicates the cardinal number of subset V_i, i.e., the number of nodes in V_i, and for each node v ∈ V_i, all its children nodes {c | c ∈ children(v)} must be in a previous subset V_j, where 1 ≤ j < i. Our goal is to find an optimal partition P*(V) whose computational cost |P*(V)| is minimal.

Here subset V_i consists of all nodes that will be computed at the i-th step (Fig. 2e), so |V_i| ≤ k indicates that we can compute at most k nodes each step because the number of available threads is k. The restriction "for each node v ∈ V_i, all its children nodes {c | c ∈ children(v)} must be in a previous subset V_j, where 1 ≤ j < i" indicates that node v can be processed only if all its child nodes are processed.

DHS implementation

We aim to find an optimal way to parallelize the computation of solving linear equations for each neuron model by solving the mathematical scheduling problem above. To get the optimal partition, DHS first analyzes the topology and calculates the depth d(v) for all nodes v ∈ V. Then, the following two steps are executed iteratively until every node v ∈ V is assigned to a subset: (1) find all candidate nodes and put these nodes into the candidate set Q. A node is a candidate only if all its child nodes have been processed or it does not have any child nodes. (2) if |Q| ≤ k, i.e., the number of candidate nodes is smaller than or equal to the number of available threads, remove all nodes from Q and put them into V_i; otherwise, remove the k deepest nodes from Q and add them to subset V_i. Label these nodes as processed nodes (Fig. 2d).
After filling in subset V_i, go to step (1) to fill in the next subset V_{i+1}.

Correctness proof for DHS

After applying DHS to a neural tree T = {V, E}, we get a partition P(V) = {V_1, V_2, …, V_n}, |V_i| ≤ k, 1 ≤ i ≤ n. Nodes in the same subset V_i will be computed in parallel, taking n steps to perform triangularization and back-substitution, respectively. We then demonstrate that the reordering of the computation in DHS yields a result identical to that of the serial Hines method.

The partition P(V) obtained from DHS decides the computation order of all nodes in a neural tree. Below we demonstrate that the computation order determined by P(V) satisfies the correctness conditions. P(V) is obtained from the given neural tree T. Operations in DHS do not modify the tree topology or the values of tree nodes (corresponding values in the linear equations), so the tree morphology and initial values of all nodes are not changed, which satisfies condition 1: the tree morphology and initial values of all nodes are identical to those in the serial Hines method. In triangularization, nodes are processed from subset V_1 to V_n. As shown in the implementation of DHS, all nodes in subset V_i are selected from the candidate set Q, and a node can be put into Q only if all its child nodes have been processed. Thus the child nodes of all nodes in V_i are in {V_1, V_2, …, V_{i-1}}, meaning that a node is only computed after all its children have been processed, which satisfies condition 2: in triangularization, a node can be processed if and only if all its child nodes are already processed. In back-substitution, the computation order is the opposite of that in triangularization, i.e., from V_n to V_1. As shown before, the child nodes of all nodes in V_i are in {V_1, V_2, …, V_{i-1}}, so the parent nodes of nodes in V_i are in {V_{i+1}, V_{i+2}, …, V_n}, which satisfies condition 3: in back-substitution, a node can be processed only if its parent node is already processed.
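The two-step DHS procedure described above (collect candidates, then take the k deepest) can be sketched in a few lines of Python. This is a minimal sketch, not the CoreNEURON-based implementation: the `parent`-list encoding of the tree, the use of a heap as the candidate set Q, and the name `dhs_partition` are all assumptions made for illustration.

```python
import heapq


def dhs_partition(parent, k):
    """Greedy deepest-first scheduling sketch (assumed tree encoding).

    parent[v] is the parent of node v (-1 for the root); k is the number
    of threads. Returns the subsets V_1..V_n as lists of node indices.
    """
    n = len(parent)

    # Depth d(v): number of edges from the root to v.
    depth = [-1] * n
    for v in range(n):
        path, u = [], v
        while u != -1 and depth[u] < 0:
            path.append(u)
            u = parent[u]
        base = depth[u] if u != -1 else -1
        for w in reversed(path):
            base += 1
            depth[w] = base

    # Number of not-yet-processed children per node.
    pending = [0] * n
    for p in parent:
        if p >= 0:
            pending[p] += 1

    # Candidate set Q as a max-heap on depth (leaves are candidates at start).
    q = [(-depth[v], v) for v in range(n) if pending[v] == 0]
    heapq.heapify(q)

    partition = []
    while q:
        # Step (2): take all candidates if |Q| <= k, else the k deepest.
        subset = [heapq.heappop(q)[1] for _ in range(min(k, len(q)))]
        partition.append(subset)
        # Step (1) for the next round: a parent becomes a candidate once
        # all of its children have been processed.
        for v in subset:
            p = parent[v]
            if p >= 0:
                pending[p] -= 1
                if pending[p] == 0:
                    heapq.heappush(q, (-depth[p], p))
    return partition


# Example tree: 0 is the root, 1 and 2 its children,
# 3 and 4 children of 1, 5 a child of 2.
example_parent = [-1, 0, 0, 1, 1, 2]
schedule_k2 = dhs_partition(example_parent, 2)
```

New candidates are pushed only after a subset has been fixed, so a parent is never scheduled in the same step as one of its children; reversing the returned list gives a back-substitution order in which every parent precedes its children.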
Optimality proof for DHS

The idea of the proof is that if there is another optimal solution, it can be transformed into our DHS solution without increasing the number of steps the algorithm requires, thus indicating that the DHS solution is optimal.

For each subset V_i in P(V), DHS moves the k (thread number) deepest nodes from the corresponding candidate set Q_i to V_i. If the number of nodes in Q_i is smaller than k, it moves all nodes from Q_i to V_i. To simplify, we introduce D_i to denote the sum of the depths of the k deepest nodes in Q_i. All subsets in P(V) satisfy the max-depth criteria (Supplementary Fig. 6a), i.e., each subset consists of the deepest nodes in its candidate set. We then prove that selecting the deepest nodes in each iteration makes P(V) an optimal partition. If there exists an optimal partition P*(V) = {V*_1, V*_2, …, V*_s} containing subsets that do not satisfy the max-depth criteria, we can modify the subsets in P*(V) so that all subsets consist of the deepest nodes from Q and the number of subsets (|P*(V)|) remains the same after modification.

Without loss of generality, we start from the first subset V*_i not satisfying the criteria. There are two possible cases that will make V*_i not satisfy the max-depth criteria: (1) |V*_i| < k and there exist some valid nodes in Q_i that are not put into V*_i; (2) |V*_i| = k but the nodes in V*_i are not the k deepest nodes in Q_i.

For case (1), because some candidate nodes are not put into V*_i, these nodes must be in the subsequent subsets. As |V*_i| < k, we can move the corresponding nodes from the subsequent subsets to V*_i, which will not increase the number of subsets and makes V*_i satisfy the criteria (Supplementary Fig. 6b, top). For case (2), |V*_i| = k, the deeper nodes that are not moved from the candidate set Q_i into V*_i must be added to subsequent subsets (Supplementary Fig. 6b, bottom). These deeper nodes can be moved from subsequent subsets to V*_i through the following method.
Assume that after filling V*_i, v is picked and one of the k-th deepest nodes v' is still in Q_i, thus v' will be put into a subsequent subset V*_j (j > i). We first move v from V*_i to V*_{i+1}, then modify subset V*_{i+1} as follows: if |V*_{i+1}| ≤ k and none of the nodes in V*_{i+1} is the parent of node v, stop modifying the latter subsets. Otherwise, modify V*_{i+1} as follows (Supplementary Fig. 6c): if the parent node of v is in V*_{i+1}, move this parent node to V*_{i+2}; else move the node with minimum depth from V*_{i+1} to V*_{i+2}. After adjusting V*_i, modify the subsequent subsets V*_{i+1}, V*_{i+2}, …, V*_{j-1} with the same strategy. Finally, move v' from V*_j to V*_i.

With the modification strategy described above, we can replace all shallower nodes in V*_i with the k-th deepest node in Q_i and keep the number of subsets, i.e., |P*(V)|, the same after modification. We can modify the nodes with the same strategy for all subsets in P*(V) that do not contain the deepest nodes. Finally, all subsets V*_i ∈ P*(V) can satisfy the max-depth criteria, and |P*(V)| does not change after modifying.

In conclusion, DHS generates a partition P(V), and all subsets V_i ∈ P(V) satisfy the max-depth condition. For any other optimal partition P*(V), we can modify its subsets to make its structure the same as P(V), i.e., each subset consists of the deepest nodes in the candidate set, and keep |P*(V)| the same after modification. So, the partition P(V) obtained from DHS is one of the optimal partitions.

GPU implementation and memory boosting

To achieve high memory throughput, the GPU uses a memory hierarchy of (1) global memory, (2) cache, and (3) registers, where global memory has large capacity but low throughput, while registers have low capacity but high throughput. We aim to boost memory throughput by leveraging the memory hierarchy of the GPU.
The GPU uses the SIMT (Single-Instruction, Multiple-Thread) architecture. Warps are the basic scheduling units on the GPU (a warp is a group of 32 parallel threads). A warp executes the same instruction with different data for different threads [46]. Correctly ordering the nodes is essential for this batching of computation in warps, to make sure DHS obtains results identical to those of the serial Hines method. When implementing DHS on the GPU, we first group all cells into multiple warps based on their morphologies. Cells with similar morphologies are grouped into the same warp. We then apply DHS to all neurons, assigning the compartments of each neuron to multiple threads. Because neurons are grouped into warps, the threads for the same neuron are in the same warp. Therefore, the intrinsic synchronization in warps keeps the computation order consistent with the data dependency of the serial Hines method.

When a warp loads pre-aligned and successively stored data from global memory, it can make full use of the cache, which leads to high memory throughput, while accessing scatter-stored data would reduce memory throughput. After compartment assignment and thread rearrangement, we permute data in global memory to make it consistent with the computing order, so that warps can load successively stored data when executing the program. Moreover, we put the necessary temporary variables into registers rather than global memory. Registers have the highest memory throughput, so the use of registers further accelerates DHS.

Full-spine and few-spine biophysical models

We used the published human pyramidal neuron [51]. The membrane capacitance c_m = 0.44 μF cm^-2, membrane resistance r_m = 48,300 Ω cm^2, and axial resistivity r_a = 261.97 Ω cm.
In this model, all dendrites were modeled as passive cables while somas were active. The leak reversal potential E_l = -83.1 mV. Ion channels such as Na+ and K+ were inserted on the soma and initial axon, and their reversal potentials were E_Na = 67.6 mV and E_K = -102 mV, respectively. All these specific parameters were set the same as in the model of Eyal et al. [51]; for more details please refer to the published model (ModelDB, accession No. 238347).

In the few-spine model, the membrane capacitance and maximum leak conductance of the dendritic cables 60 μm away from the soma were multiplied by an F_spine factor to approximate dendritic spines. In this model, F_spine was set to 1.9. Only the spines that receive synaptic inputs were explicitly attached to dendrites.

In the full-spine model, all spines were explicitly attached to dendrites. We calculated the spine density with the reconstructed neuron in Eyal et al. [51]. The spine density was set to 1.3 μm^-1, and each cell contained 24994 spines on dendrites 60 μm away from the soma.

The morphologies and biophysical mechanisms of spines were the same in the few-spine and full-spine models. The length of the spine neck L_neck = 1.35 μm and the diameter D_neck = 0.25 μm, whereas the length and diameter of the spine head were 0.944 μm, i.e., the spine head area was set to 2.8 μm^2. For spines, the leak reversal potential E_l = -86 mV. The specific membrane capacitance, membrane resistance, and axial resistivity were the same as those for dendrites.

Synaptic inputs

We investigated neuronal excitability for both distributed and clustered synaptic inputs. All activated synapses were attached to the terminal of the spine head. For distributed inputs, all activated synapses were randomly distributed on all dendrites. For clustered inputs, each cluster consisted of 20 activated synapses that were uniformly distributed on a single randomly-selected compartment. All synapses were activated simultaneously during the simulation.
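The difference between the distributed and clustered input patterns described above can be made concrete with a small placement sketch. It is only an illustration under assumptions that the text does not specify: compartments are identified by an integer index, a position along a compartment is a fraction in (0, 1), "uniformly distributed" within a cluster is read as evenly spaced, and the function names are hypothetical.

```python
import random


def distributed_inputs(n_synapses, n_compartments, rng):
    """Place each synapse on an independently chosen random compartment."""
    return [(rng.randrange(n_compartments), rng.random())
            for _ in range(n_synapses)]


def clustered_inputs(n_clusters, n_compartments, rng, per_cluster=20):
    """Each cluster sits on one random compartment; its synapses are
    evenly spaced along that compartment (an assumed reading)."""
    placements = []
    for _ in range(n_clusters):
        comp = rng.randrange(n_compartments)
        placements += [(comp, (j + 0.5) / per_cluster)
                       for j in range(per_cluster)]
    return placements


rng = random.Random(0)                      # fixed seed for repeatability
n_compartments = 500                        # hypothetical model size
spread = distributed_inputs(240, n_compartments, rng)
clusters = clustered_inputs(12, n_compartments, rng)
```

With 12 clusters of 20 synapses, the clustered pattern activates the same total number of synapses (240) as the largest distributed case, but concentrates them on at most 12 compartments.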
AMPA-based and NMDA-based synaptic currents were simulated as in Eyal et al.'s work. AMPA conductance was modeled as a double-exponential function and NMDA conductance as a voltage-dependent double-exponential function. For the AMPA model, τ_rise and τ_decay were set to 0.3 and 1.8 ms, respectively. For the NMDA model, τ_rise and τ_decay were set to 8.019 and 34.9884 ms, respectively. The maximum conductances of AMPA and NMDA were 0.73 nS and 1.31 nS.

Background noise

We attached background noise to each cell to simulate a more realistic environment. Noise patterns were implemented as Poisson spike trains with a constant rate of 1.0 Hz. Each pattern started at t_start = 10 ms and lasted until the end of the simulation. We generated 400 noise spike trains for each cell and attached them to randomly-selected synapses. The model and specific parameters of synaptic currents were the same as described in Synaptic inputs, except that the maximum conductance of NMDA was uniformly distributed from 1.57 to 3.275, resulting in a higher AMPA to NMDA ratio.

Exploring neuronal excitability

We investigated the spike probability when multiple synapses were activated simultaneously. For distributed inputs, we tested 14 cases, from 0 to 240 activated synapses. For clustered inputs, we tested 9 cases in total, activating from 0 to 12 clusters respectively. Each cluster consisted of 20 synapses. For each case in both distributed and clustered inputs, we calculated the spike probability with 50 random samples. Spike probability was defined as the ratio of the number of neurons that fired to the total number of samples. All 1150 samples were simulated simultaneously on our DeepDendrite platform, reducing the simulation time from days to minutes.
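As a worked illustration of the synaptic model and sample count in this section, the sketch below evaluates a double-exponential conductance with the AMPA and NMDA time constants given above. Two points are assumptions rather than statements from the text: the waveform is normalized so that its peak equals the maximum conductance (a common convention), and the voltage-dependent factor of the NMDA conductance is left out because its parameters are not given here. The function names are hypothetical.

```python
import math


def peak_time(tau_rise, tau_decay):
    """Time (ms) at which exp(-t/tau_decay) - exp(-t/tau_rise) is maximal."""
    return (tau_rise * tau_decay) / (tau_decay - tau_rise) * math.log(tau_decay / tau_rise)


def double_exp_conductance(t_ms, g_max, tau_rise, tau_decay):
    """Double-exponential conductance, normalized so the peak equals g_max."""
    if t_ms < 0.0:
        return 0.0                       # no conductance before activation
    t_p = peak_time(tau_rise, tau_decay)
    norm = 1.0 / (math.exp(-t_p / tau_decay) - math.exp(-t_p / tau_rise))
    return g_max * norm * (math.exp(-t_ms / tau_decay) - math.exp(-t_ms / tau_rise))


# Parameters from the text (conductances in nS, time constants in ms).
AMPA = dict(g_max=0.73, tau_rise=0.3, tau_decay=1.8)
NMDA = dict(g_max=1.31, tau_rise=8.019, tau_decay=34.9884)  # voltage factor omitted

# 14 distributed cases and 9 clustered cases, 50 random samples each.
n_samples = (14 + 9) * 50
```

The last line reproduces the figure quoted in the text: 23 cases with 50 samples each give the 1150 samples that were simulated together.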
Performing AI tasks with the DeepDendrite platform

Conventional detailed neuron simulators lack two functionalities important to modern AI tasks: (1) alternately performing simulations and weight updates without heavy reinitialization and (2) simultaneously processing multiple stimulus samples in a batch-like manner. Here we present the DeepDendrite platform, which supports both biophysical simulation and deep learning tasks with detailed dendritic models.

DeepDendrite consists of three modules (Supplementary Fig. 5): (1) an I/O module; (2) a DHS-based simulating module; (3) a learning module. When training a biophysically detailed model to perform learning tasks, users first define the learning rule, then feed all training samples to the detailed model for learning. In each step during training, the I/O module picks a specific stimulus and its corresponding teacher signal (if necessary) from all training samples and attaches the stimulus to the network model. Then, the DHS-based simulating module initializes the model and starts the simulation. After simulation, the learning module updates all synaptic weights according to the difference between model responses and teacher signals. After training, the learned model can achieve performance comparable to an ANN. The testing phase is similar to training, except that all synaptic weights are fixed.

HPC-Net model

Image classification is a typical task in the field of AI. In this task, a model should learn to recognize the content in a given image and output the corresponding label. Here we present the HPC-Net, a network consisting of detailed human pyramidal neuron models that can learn to perform image classification tasks by utilizing the DeepDendrite platform.

HPC-Net has three layers, i.e., an input layer, a hidden layer, and an output layer. The neurons in the input layer receive spike trains converted from images as their input.
Hidden layer neurons receive the output of input layer neurons and deliver responses to neurons in the output layer. The responses of the output layer neurons are taken as the final output of HPC-Net. Neurons between adjacent layers are fully connected.

For each image stimulus, we first convert each normalized pixel to a homogeneous spike train. For the pixel with coordinates (x, y) in the image, the corresponding spike train has a constant interspike interval τ_ISI(x, y) (in ms), which is determined by the pixel value p(x, y) as shown in Eq. (1).

In our experiment, the simulation for each stimulus lasted 50 ms. All spike trains started at 9 + τ_ISI ms and lasted until the end of the simulation. Then we attached all spike trains to the input layer neurons in a one-to-one manner. The synaptic current triggered by a spike arriving at time t_0 depends on the post-synaptic voltage v, the reversal potential E_syn = 1 mV, the maximum synaptic conductance g_max = 0.05 μS, and the time constant τ = 0.5 ms.

Neurons in the input layer were modeled with a passive single-compartment model. The specific parameters were set as follows: membrane capacitance c_m = 1.0 μF cm^-2, membrane resistance r_m = 10^4 Ω cm^2, axial resistivity r_a = 100 Ω cm, and reversal potential of the passive compartment E_l = 0 mV.

The hidden layer contains a group of human pyramidal neuron models [51], which receive the somatic voltages of the input layer neurons; all neurons were modeled with passive cables. The specific membrane capacitance c_m = 1.5 μF cm^-2, membrane resistance r_m = 48,300 Ω cm^2, axial resistivity r_a = 261.97 Ω cm, and the reversal potential of all passive cables E_l = 0 mV. Input neurons could make multiple connections to randomly-selected locations on the dendrites of hidden neurons. The synaptic current activated by the k-th synapse of the i-th input neuron on neuron j's dendrite is defined as in Eq.
(4), where g_ijk is the synaptic conductance, W_ijk is the synaptic weight, and a ReLU-like somatic activation function is applied to the somatic voltage of the i-th input neuron at time t.

Neurons in the output layer were also modeled with a passive single-compartment model, and each hidden neuron only made one synaptic connection to each output neuron. All specific parameters were set the same as those of the input neurons. Synaptic currents activated by hidden neurons are also in the form of Eq. (4).

Image classification with HPC-Net

For each input image stimulus, we first normalized all pixel values to 0.0-1.0. Then we converted normalized pixels to spike trains and attached them to input neurons. Somatic voltages of the output neurons are used to compute the predicted probability of each class, as shown in Eq. (6), where p_i is the probability of the i-th class predicted by the HPC-Net, computed from the average somatic voltage from 20 ms to 50 ms of the i-th output neuron, and C is the number of classes. The class with the maximum predicted probability is the final classification result. In this paper, we built the HPC-Net with 784 input neurons, 64 hidden neurons, and 10 output neurons.

Synaptic plasticity rules for HPC-Net

Inspired by previous work [36], we use a gradient-based learning rule to train our HPC-Net to perform the image classification task. The loss function we use here is cross-entropy, given in Eq. (7), where p_i is the predicted probability for class i and y_i indicates the actual class the stimulus image belongs to: y_i = 1 if the input image belongs to class i, and y_i = 0 if not.

When training HPC-Net, we compute the update for weight W_ijk (the synaptic weight of the k-th synapse connecting neuron i to neuron j) at each time step. After the simulation of each image stimulus, W_ijk is updated as shown in Eq.
(8). The quantities involved are the learning rate, the update value at time t, the somatic voltages V_j and v_i of neurons j and i respectively, the k-th synaptic current I_ijk activated by neuron i on neuron j, its synaptic conductance g_ijk, and the transfer resistance r_ijk between the k-th connected compartment of neuron i on neuron j's dendrite and neuron j's soma; t_s = 30 ms and t_e = 50 ms are the start time and end time for learning, respectively. For output neurons, the error term can be computed as shown in Eq. (10). For hidden neurons, the error term is calculated from the error terms in the output layer, as given in Eq. (11).

Since all output neurons are single-compartment, the transfer resistance equals the input resistance of the corresponding compartment. Transfer and input resistances are computed by NEURON.

Mini-batch training is a typical method in deep learning for achieving higher prediction accuracy and accelerating convergence. DeepDendrite also supports mini-batch training. When training HPC-Net with mini-batch size N_batch, we make N_batch copies of HPC-Net. During training, each copy is fed with a different training sample from the batch. DeepDendrite first computes the weight update for each copy separately. After all copies in the current training batch are done, the average weight update is calculated and the weights in all copies are updated by this same amount.

Robustness against adversarial attack with HPC-Net

To demonstrate the robustness of HPC-Net, we tested its prediction accuracy on adversarial samples and compared it with an analogous ANN (one with the same 784-64-10 structure and ReLU activation; for a fair comparison, in our HPC-Net each input neuron made only one synaptic connection to each hidden neuron). We first trained HPC-Net and the ANN with the original training set (original clean images). We then generated adversarial noise [98, 99] with the FGSM method [93].
The ANN was trained with PyTorch [100], and HPC-Net was trained with our DeepDendrite. For fairness, we generated adversarial noise on a significantly different network model, a 20-layer ResNet [101]. The noise level ranged from 0.02 to 0.2. We experimented on two typical datasets, MNIST [95] and Fashion-MNIST [96]. Results show that the prediction accuracy of HPC-Net is 19% and 16.72% higher than that of the analogous ANN, respectively.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

The data that support the findings of this study are available within the paper, Supplementary Information and Source Data files provided with this paper. The source code and data used to reproduce the results in Figs. 3–6 are available at https://github.com/pkuzyc/DeepDendrite. The MNIST dataset is publicly available at http://yann.lecun.com/exdb/mnist. The Fashion-MNIST dataset is publicly available at https://github.com/zalandoresearch/fashion-mnist. Source data are provided with this paper.

Code availability

The source code of DeepDendrite as well as the models and code used to reproduce Figs. 3–6 in this study are available at https://github.com/pkuzyc/DeepDendrite.

References

1. McCulloch, W. S. & Pitts, W. A logical calculus of the ideas immanent in nervous activity. Bull. Math. Biophys. 5, 115–133 (1943).
2. LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).
3. Poirazi, P., Brannon, T. & Mel, B. W. Arithmetic of subthreshold synaptic summation in a model CA1 pyramidal cell.
4. London, M. & Häusser, M. Dendritic computation. Annu. Rev. Neurosci. 28, 503–532 (2005).
5. Branco, T. & Häusser, M. The single dendritic branch as a fundamental functional unit in the nervous system. Curr. Opin. Neurobiol. 20, 494–502 (2010).
6. Stuart, G. J. & Spruston, N. Dendritic integration: 60 years of progress. Nat. Neurosci. 18, 1713–1721 (2015).
7. Poirazi, P. & Papoutsi, A. Illuminating dendritic function with computational models. Nat. Rev. Neurosci. 21, 303–321 (2020).
8. Yuste, R. & Denk, W. Dendritic spines as basic functional units of neuronal integration.
9. Engert, F. & Bonhoeffer, T. Dendritic spine changes associated with hippocampal long-term synaptic plasticity.
10. Yuste, R. Dendritic spines and distributed circuits. Neuron 71, 772–781 (2011).
11. Yuste, R. Electrical compartmentalization in dendritic spines. Annu. Rev. Neurosci. 36, 429–449 (2013).
12. Rall, W. Branching dendritic trees and motoneuron membrane resistivity. Exp. Neurol. 1, 491–527 (1959).
13. Segev, I. & Rall, W. Computational study of an excitable dendritic spine. J. Neurophysiol. 60, 499–523 (1988).
14. Silver, D. et al. Mastering the game of go with deep neural networks and tree search. Nature 529, 484–489 (2016).
15. Silver, D. et al. A general reinforcement learning algorithm that masters chess, shogi, and go through self-play. Science 362, 1140–1144 (2018).
16. McCloskey, M. & Cohen, N. J. Catastrophic interference in connectionist networks: the sequential learning problem. Psychol. Learn. Motiv. 24, 109–165 (1989).
17. French, R. M. Catastrophic forgetting in connectionist networks. Trends Cogn. Sci. 3, 128–135 (1999).
18. Naud, R. & Sprekeler, H. Sparse bursts optimize information transmission in a multiplexed neural code. Proc. Natl Acad. Sci. USA 115, E6329–E6338 (2018).
19. Sacramento, J., Costa, R. P., Bengio, Y. & Senn, W. Dendritic cortical microcircuits approximate the backpropagation algorithm. In Advances in Neural Information Processing Systems 31 (NeurIPS 2018) (NeurIPS, 2018).
20. Payeur, A., Guerguiev, J., Zenke, F., Richards, B. A. & Naud, R. Burst-dependent synaptic plasticity can coordinate learning in hierarchical circuits. Nat. Neurosci. 24, 1010–1019 (2021).
21. Bicknell, B. A. & Häusser, M. A synaptic learning rule for exploiting nonlinear dendritic computation. Neuron 109, 4001–4017 (2021).
22. Moldwin, T., Kalmenson, M. & Segev, I. The gradient clusteron: a model neuron that learns to solve classification tasks via dendritic nonlinearities, structural plasticity, and gradient descent.
23. Hodgkin, A. L. & Huxley, A. F. A quantitative description of membrane current and its application to conduction and excitation in nerve. J. Physiol. 117, 500–544 (1952).
24. Rall, W. Theory of physiological properties of dendrites. Ann. N. Y. Acad. Sci. 96, 1071–1092 (1962).
25. Hines, M. L. & Carnevale, N. T. The NEURON simulation environment. Neural Comput. 9, 1179–1209 (1997).
26. Bower, J. M. & Beeman, D. In The Book of GENESIS: Exploring Realistic Neural Models with the General Neural Simulation System (eds Bower, J. M. & Beeman, D.) 17–27 (Springer New York, 1998).
27. Hines, M. L., Eichner, H. & Schürmann, F. Neuron splitting in compute-bound parallel network simulations enables runtime scaling with twice as many processors.
28. Hines, M. L., Markram, H. & Schürmann, F. Fully implicit parallel simulation of single neurons.
29. Ben-Shalom, R., Liberman, G. & Korngreen, A. Accelerating compartmental modeling on a graphical processing unit.
30. Tsuyuki, T., Yamamoto, Y. & Yamazaki, T. Efficient numerical simulation of neuron models with spatial structure on graphics processing units. In Proc. 2016 International Conference on Neural Information Processing (eds Hirose, A. et al.) 279–285 (Springer International Publishing, 2016).
31. Vooturi, D. T., Kothapalli, K. & Bhalla, U. S. Parallelizing Hines Matrix Solver in Neuron Simulations on GPU. In Proc. IEEE 24th International Conference on High Performance Computing (HiPC) 388–397 (IEEE, 2017).
32. Huber, F. Efficient tree solver for hines matrices on the GPU. Preprint at https://arxiv.org/abs/1810.12742 (2018).
33. Korte, B. & Vygen, J. Combinatorial Optimization Theory and Algorithms 6 edn (Springer, 2018).
34. Gebali, F. Algorithms and Parallel Computing (Wiley, 2011).
35. Kumbhar, P. et al. CoreNEURON: an optimized compute engine for the NEURON simulator. Front. Neuroinform. 13, 63 (2019).
36. Urbanczik, R. & Senn, W. Learning by the dendritic prediction of somatic spiking. Neuron 81, 521–528 (2014).
37. Ben-Shalom, R., Aviv, A., Razon, B. & Korngreen, A. Optimizing ion channel models using a parallel genetic algorithm on graphical processors.
38. Mascagni, M. A parallelizing algorithm for computing solutions to arbitrarily branched cable neuron models.
39. McDougal, R. A. et al. Twenty years of modelDB and beyond: building essential modeling tools for the future of neuroscience. J. Comput. Neurosci. 42, 1–10 (2017).
40. Migliore, M., Messineo, L. & Ferrante, M. Dendritic Ih selectively blocks temporal summation of unsynchronized distal inputs in CA1 pyramidal neurons. J. Comput. Neurosci. 16, 5–13 (2004).
41. Hemond, P. et al. Distinct classes of pyramidal cells exhibit mutually exclusive firing patterns in hippocampal area CA3b. Hippocampus 18, 411–424 (2008).
42. Hay, E., Hill, S., Schürmann, F., Markram, H. & Segev, I. Models of neocortical layer 5b pyramidal cells capturing a wide range of dendritic and perisomatic active properties. PLoS Comput. Biol. 7, e1002107 (2011).
43. Masoli, S., Solinas, S. & D’Angelo, E. Action potential processing in a detailed purkinje cell model reveals a critical role for axonal compartmentalization. Front. Cell. Neurosci. 9, 47 (2015).
44. Lindroos, R. et al. Basal ganglia neuromodulation over multiple temporal and structural scales: simulations of direct pathway MSNs investigate the fast onset of dopaminergic effects and predict the role of Kv4.2.
45. Migliore, M. et al. Synaptic clusters function as odor operators in the olfactory bulb. Proc. Natl Acad. Sci. USA 112, 8499–8504 (2015).
46. NVIDIA. CUDA C++ Programming Guide. https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html (2021).
47. NVIDIA. CUDA C++ Best Practices Guide. https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html (2021).
48. Harnett, M. T., Makara, J. K., Spruston, N., Kath, W. L. & Magee, J. C. Synaptic amplification by dendritic spines enhances input cooperativity. Nature 491, 599–602 (2012).
49. Chiu, C. Q. et al. Compartmentalization of GABAergic inhibition by dendritic spines. Science 340, 759–762 (2013).
50. Tønnesen, J., Katona, G., Rózsa, B. & Nägerl, U. V. Spine neck plasticity regulates compartmentalization of synapses. Nat. Neurosci. 17, 678–685 (2014).
51. Eyal, G. et al. Human cortical pyramidal neurons: from spines to spikes via models. Front. Cell. Neurosci. 12, 181 (2018).
52. Koch, C. & Zador, A. The function of dendritic spines: devices subserving biochemical rather than electrical compartmentalization.
53. Koch, C. Dendritic spines. In Biophysics of Computation (Oxford University Press, 1999).
54. Rapp, M., Yarom, Y. & Segev, I. The impact of parallel fiber background activity on the cable properties of cerebellar purkinje cells. Neural Comput. 4, 518–533 (1992).
55. Hines, M. Efficient computation of branched nerve equations. Int. J. Bio-Med. Comput. 15, 69–76 (1984).
56. Nayebi, A. & Ganguli, S. Biologically inspired protection of deep networks from adversarial attacks. Preprint at https://arxiv.org/abs/1703.09202 (2017).
57. Goddard, N. H. & Hood, G. Large-Scale Simulation Using Parallel GENESIS. In The Book of GENESIS: Exploring Realistic Neural Models with the General Neural Simulation System (eds Bower, J. M. & Beeman, D.) 349–379 (Springer New York, 1998).
58. Migliore, M., Cannia, C., Lytton, W. W., Markram, H. & Hines, M. L. Parallel network simulations with NEURON.
59. Lytton, W. W. et al. Simulation neurotechnologies for advancing brain research: parallelizing large networks in NEURON.
60. Valero-Lara, P. et al. cuHinesBatch: solving multiple Hines systems on GPUs human brain project. In Proc. 2017 International Conference on Computational Science 566–575 (IEEE, 2017).
61. Akar, N. A. et al. Arbor: a morphologically-detailed neural network simulation library for contemporary high-performance computing architectures. In Proc. 27th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP) 274–282 (IEEE, 2019).
62. Ben-Shalom, R. et al. NeuroGPU: Accelerating multi-compartment, biophysically detailed neuron simulations on GPUs. J. Neurosci. Methods 366, 109400 (2022).
63. Rempe, M. J. & Chopp, D. L. A predictor-corrector algorithm for reaction-diffusion equations associated with neural activity on branched structures. SIAM J. Sci. Comput. 28, 2139–2161 (2006).
64. Kozloski, J. & Wagner, J. An ultrascalable solution to large-scale neural tissue simulation. Front. Neuroinform. 5, 15 (2011).
65. Jayant, K. et al. Targeted intracellular voltage recordings from dendritic spines using quantum-dot-coated nanopipettes. Nat. Nanotechnol. 12, 335–342 (2017).
66. Palmer, L. M. & Stuart, G. J. Membrane potential changes in dendritic spines during action potentials and synaptic input. J. Neurosci. 29, 6897–6903 (2009).
67. Nishiyama, J. & Yasuda, R. Biochemical computation for spine structural plasticity. Neuron 87, 63–75 (2015).
68. Yuste, R. & Bonhoeffer, T. Morphological changes in dendritic spines associated with long-term synaptic plasticity. Annu. Rev. Neurosci. 24, 1071–1089 (2001).
69. Holtmaat, A. & Svoboda, K. Experience-dependent structural synaptic plasticity in the mammalian brain. Nat. Rev. Neurosci. 10, 647–658 (2009).
70. Caroni, P., Donato, F. & Muller, D. Structural plasticity upon learning: regulation and functions. Nat. Rev. Neurosci. 13, 478–490 (2012).
71. Keck, T. et al. Massive restructuring of neuronal circuits during functional reorganization of adult visual cortex. Nat. Neurosci. 11, 1162 (2008).
72. Hofer, S. B., Mrsic-Flogel, T. D., Bonhoeffer, T. & Hübener, M. Experience leaves a lasting structural trace in cortical circuits.
73. Trachtenberg, J. T. et al. Long-term in vivo imaging of experience-dependent synaptic plasticity in adult cortex. Nature 420, 788–794 (2002).
74. Marik, S. A., Yamahachi, H., McManus, J. N., Szabo, G. & Gilbert, C. D. Axonal dynamics of excitatory and inhibitory neurons in somatosensory cortex.
75. Xu, T. et al. Rapid formation and selective stabilization of synapses for enduring motor memories. Nature 462, 915–919 (2009).
76. Albarran, E., Raissi, A., Jáidar, O., Shatz, C. J. & Ding, J. B. Enhancing motor learning by increasing the stability of newly formed dendritic spines in the motor cortex.
77. Branco, T. & Häusser, M. Synaptic integration gradients in single cortical pyramidal cell dendrites. Neuron 69, 885–892 (2011).
78. Major, G., Larkum, M. E. & Schiller, J. Active properties of neocortical pyramidal neuron dendrites. Annu. Rev. Neurosci. 36, 1–24 (2013).
79. Gidon, A. et al. Dendritic action potentials and computation in human layer 2/3 cortical neurons. Science 367, 83–87 (2020).
80. Doron, M., Chindemi, G., Muller, E., Markram, H. & Segev, I. Timed synaptic inhibition shapes NMDA spikes, influencing local dendritic processing and global I/O properties of cortical neurons. Cell Rep. 21, 1550–1561 (2017).
81. Du, K. et al. Cell-type-specific inhibition of the dendritic plateau potential in striatal spiny projection neurons. Proc. Natl Acad. Sci. USA 114, E7612–E7621 (2017).
82. Smith, S. L., Smith, I. T., Branco, T. & Häusser, M. Dendritic spikes enhance stimulus selectivity in cortical neurons in vivo. Nature 503, 115–120 (2013).
83. Xu, N.-l et al. Nonlinear dendritic integration of sensory and motor input during an active sensing task. Nature 492, 247–251 (2012).
84. Takahashi, N., Oertner, T. G., Hegemann, P. & Larkum, M. E. Active cortical dendrites modulate perception. Science 354, 1587–1590 (2016).
85. Sheffield, M. E. & Dombeck, D. A. Calcium transient prevalence across the dendritic arbour predicts place field properties. Nature 517, 200–204 (2015).
86. Markram, H. et al. Reconstruction and simulation of neocortical microcircuitry. Cell 163, 456–492 (2015).
87. Billeh, Y. N. et al. Systematic integration of structural and functional data into multi-scale models of mouse primary visual cortex. Neuron 106, 388–403 (2020).
88. Hjorth, J. et al. The microcircuits of striatum in silico. Proc. Natl Acad. Sci. USA 117, 202000671 (2020).
89. Guerguiev, J., Lillicrap, T. P. & Richards, B. A. Towards deep learning with segregated dendrites. elife 6, e22901 (2017).
90. Iyer, A. et al. Avoiding catastrophe: active dendrites enable multi-task learning in dynamic environments. Front. Neurorobot. 16, 846219 (2022).
91. Jones, I. S. & Kording, K. P. Might a single neuron solve interesting machine learning problems through successive computations on its dendritic tree? Neural Comput. 33, 1554–1571 (2021).
92. Bird, A. D., Jedlicka, P. & Cuntz, H. Dendritic normalisation improves learning in sparsely connected artificial neural networks. PLoS Comput. Biol. 17, e1009202 (2021).
93. Goodfellow, I. J., Shlens, J. & Szegedy, C. Explaining and harnessing adversarial examples. In 3rd International Conference on Learning Representations (ICLR) (ICLR, 2015).
94. Papernot, N., McDaniel, P. & Goodfellow, I. Transferability in machine learning: from phenomena to black-box attacks using adversarial samples. Preprint at https://arxiv.org/abs/1605.07277 (2016).
95. Lecun, Y., Bottou, L., Bengio, Y. & Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 86, 2278–2324 (1998).
96. Xiao, H., Rasul, K. & Vollgraf, R. Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms. Preprint at http://arxiv.org/abs/1708.07747 (2017).
97. Bartunov, S. et al. Assessing the scalability of biologically-motivated deep learning algorithms and architectures. In Advances in Neural Information Processing Systems 31 (NeurIPS 2018) (NeurIPS, 2018).
98. Rauber, J., Brendel, W. & Bethge, M. Foolbox: A Python toolbox to benchmark the robustness of machine learning models. In Reliable Machine Learning in the Wild Workshop, 34th International Conference on Machine Learning (2017).
99. Rauber, J., Zimmermann, R., Bethge, M. & Brendel, W. Foolbox native: fast adversarial attacks to benchmark the robustness of machine learning models in PyTorch, TensorFlow, and JAX. J. Open Source Softw. 5, 2607 (2020).
100. Paszke, A. et al. PyTorch: An imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems 32 (NeurIPS 2019) (NeurIPS, 2019).
101. He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proc. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 770–778 (IEEE, 2016).

Acknowledgements

The authors sincerely thank Dr. Rita Zhang, Daochen Shi and members at NVIDIA for their valuable technical support with GPU computing. This work was supported by the National Key R&D Program of China (No. 2020AAA0130400) to K.D. and T.H., National Natural Science Foundation of China (No. 61088102) to T.H., National Key R&D Program of China (No. 2022ZD01163005) to L.M., Key Area R&D Program of Guangdong Province (No. 2018B030338001) to T.H., National Natural Science Foundation of China (No. 61825101) to Y.T., Swedish Research Council (VR-M-2020-01652), Swedish e-Science Research Centre (SeRC), EU/Horizon 2020 No. 945539 (HBP SGA3), and KTH Digital Futures to J.H.K., J.H., and A.K., Swedish Research Council (VR-M-2021-01995) and EU/Horizon 2020 No. 945539 (HBP SGA3) to S.G. and A.K. Part of the simulations were enabled by resources provided by the Swedish National Infrastructure for Computing (SNIC) at PDC KTH, partially funded by the Swedish Research Council through grant agreement No. 2018-05973.

This paper is available under the CC BY 4.0 Deed (Attribution 4.0 International) license.
This paper is available at Nature.