Scientists Built a GPU Engine That Simulates Brain Cells 1,500 Times Faster

Authors: Yichen Zhang, Gan He, Lei Ma, Xiaofei Liu, J. J. Johannes Hjorth, Alexander Kozlov, Yutao He, Shenjian Zhang, Jeanette Hellgren Kotaleski, Yonghong Tian, Sten Grillner, Kai Du, Tiejun Huang

Abstract

Biophysically detailed multi-compartment models are powerful tools to explore computational principles of the brain and also serve as a theoretical framework to generate algorithms for artificial intelligence (AI) systems. However, their high computational cost severely limits applications in both the neuroscience and AI fields. The major bottleneck when simulating detailed compartment models is the ability of a simulator to solve large systems of linear equations. Here, we present a novel Dendritic Hierarchical Scheduling (DHS) method to markedly accelerate this process. We theoretically prove that the DHS implementation is computationally optimal and accurate. This GPU-based method runs 2-3 orders of magnitude faster than the classic serial Hines method on a conventional CPU platform. We build the DeepDendrite framework, which integrates the DHS method with the GPU computing engine of the NEURON simulator, and demonstrate applications of DeepDendrite in neuroscience tasks. We investigate how spatial patterns of spine inputs affect neuronal activity.

Introduction

Deciphering the coding and computational principles of neurons is essential to neuroscience. The mammalian brain is composed of thousands of different types of neurons with unique morphological and biophysical properties.
The doctrine in which neurons are regarded as simple summing units is still widely applied in neural computation, especially in neural network analysis. In recent years, modern artificial intelligence (AI) has adopted this principle and developed powerful tools such as artificial neural networks (ANNs). However, in addition to integrative computation at the single-neuron level, subcellular compartments such as neuronal dendrites can also perform nonlinear operations as independent computational units. Furthermore, dendritic spines, small protrusions that densely cover the dendrites of spiny neurons, can compartmentalize synaptic signals, allowing them to be separated from their parent dendrites ex vivo and in vivo.

Simulations using biologically detailed neurons provide a theoretical framework for linking biological details to computational principles. Biophysically detailed multi-compartment modeling allows us to model neurons with realistic dendritic morphologies, intrinsic ionic conductances, and extrinsic synaptic inputs. The backbone of the detailed multi-compartment model, i.e., the dendrites, is built on classical cable theory, which models the biophysical membrane properties of dendrites as passive cables and provides a mathematical description of how electrical signals invade and propagate through complex neuronal processes. By combining cable theory with active biophysical mechanisms such as ion channels and excitatory and inhibitory synaptic currents, a detailed multi-compartment model can capture cellular and subcellular neuronal computations beyond experimental limitations.
In addition to its profound impact on neuroscience, biologically detailed neuron models have recently been used to bridge the gap between neuronal structural and biophysical details and AI. The prevailing technique in modern AI is ANNs consisting of point neurons, an analog of biological neural networks. Although ANNs trained with the "backpropagation-of-error" (backprop) algorithm achieve remarkable performance in specialized applications, even beating top human players, the human brain still outperforms ANNs in domains that involve more dynamic and noisy environments. Recent theoretical studies suggest that dendritic integration is crucial for generating efficient learning algorithms that potentially exceed backprop in parallel information processing. Furthermore, a single detailed multi-compartment model can learn network-level nonlinear computations for point neurons by adjusting only the synaptic strength. Therefore, it is a high priority to extend paradigms in brain-like AI from single detailed neuron models to large-scale biologically detailed networks.

A long-standing challenge of the detailed simulation approach lies in its exceedingly high computational cost, which has severely limited its application in neuroscience and AI. To improve efficiency, the classic Hines method reduces the time complexity of solving the equations from O(n^3) to O(n), and it has been widely adopted as the core algorithm in popular simulators such as NEURON and GENESIS. However, this method uses a serial approach that processes each compartment sequentially. When a simulation involves multiple biophysically detailed dendrites with dendritic spines, the linear equation matrix ("Hines matrix") scales with the growing number of dendrites or spines (Fig. 1e), making the Hines method impractical because it places a very heavy burden on the entire simulation.

Fig. 1. a, A reconstructed layer-5 pyramidal neuron model and the mathematical formula used with detailed neuron models. b, Workflow when numerically simulating detailed neuron models. The equation-solving phase is the bottleneck in the simulation. c, An example of linear equations in the simulation. d, Data dependency of the Hines method when solving the linear equations in c. e, The number of linear equations to be solved increases significantly when models become more detailed. f, Computational cost (steps taken in the equation-solving phase) of the serial Hines method on different types of neuron models. g, Different parts of a neuron are assigned to multiple processing units in parallel methods (middle, right), shown with different colors. h, Computational cost of the three methods in g when solving the equations of a pyramidal model with spines. i, Run time of different methods for solving equations for 500 pyramidal models with spines. The run time indicates the time consumption of a 1 s simulation (solving the equations 40,000 times with a time step of 0.025 ms). p-Hines: parallel method in CoreNEURON (on GPU); Branch-based: branch-based parallel method (on GPU); DHS: Dendritic Hierarchical Scheduling method (on GPU).

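To make the serial data dependency concrete, here is a minimal Python sketch of a Hines-style solve on a tree-structured linear system. It is an illustration under stated assumptions, not the NEURON implementation: the function names (`hines_solve`, `tree_residual`), the array layout, the toy five-node tree, and the requirement that every node has a larger index than its parent are all choices made for this example.

```python
def hines_solve(parent, diag, lower, upper, rhs):
    """Solve a tree-structured linear system M x = rhs in O(n).

    Assumed layout (hypothetical, chosen for this sketch):
      parent[i] : index of node i's parent; parent[0] == -1 for the root,
                  and parent[i] < i for every other node
      diag[i]   : M[i][i]
      lower[i]  : M[i][parent[i]]
      upper[i]  : M[parent[i]][i]
    """
    n = len(parent)
    d = list(diag)   # work on copies so the inputs stay untouched
    r = list(rhs)

    # Triangularization: sweep from the leaves toward the root.
    # Each node is eliminated only after all of its children,
    # which is the data dependency that makes the method serial.
    for i in range(n - 1, 0, -1):
        p = parent[i]
        factor = upper[i] / d[i]
        d[p] -= factor * lower[i]
        r[p] -= factor * r[i]

    # Back-substitution: sweep from the root toward the leaves.
    x = [0.0] * n
    x[0] = r[0] / d[0]
    for i in range(1, n):
        x[i] = (r[i] - lower[i] * x[parent[i]]) / d[i]
    return x


def tree_residual(parent, diag, lower, upper, rhs, x):
    """Return max |M x - rhs| using only the tree structure."""
    n = len(parent)
    res = [diag[i] * x[i] - rhs[i] for i in range(n)]
    for i in range(1, n):
        p = parent[i]
        res[i] += lower[i] * x[p]   # contribution of M[i][p]
        res[p] += upper[i] * x[i]   # contribution of M[p][i]
    return max(abs(v) for v in res)


# Toy tree: node 0 is the root; 1 and 2 are its children; 3 and 4 hang off 1.
parent = [-1, 0, 0, 1, 1]
diag = [4.0, 4.0, 4.0, 4.0, 4.0]
lower = [0.0, -1.0, -1.0, -1.0, -1.0]
upper = [0.0, -1.0, -1.0, -1.0, -1.0]
rhs = [1.0, 2.0, 3.0, 4.0, 5.0]

x = hines_solve(parent, diag, lower, upper, rhs)
residual = tree_residual(parent, diag, lower, upper, rhs, x)
```

The backward sweep visits the n - 1 non-root nodes one at a time, which is why the step count of the serial method grows with the number of compartments.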
In the past decades, tremendous progress has been made in speeding up the Hines method with cellular-level parallel methods, which compute different parts of each cell in parallel. However, current cellular-level parallel methods often lack an efficient parallelization strategy or lack sufficient numerical accuracy compared with the original Hines method.

Here, we develop a fully automatic, numerically accurate, and optimized simulation tool that can significantly accelerate computation and reduce computational cost. In addition, this simulation tool can be seamlessly adopted for building and testing neural networks with biological details for machine learning and AI applications. Drawing on parallel computing theory, we prove that our algorithm provides optimal scheduling without any loss of accuracy. Furthermore, we have optimized DHS for the most advanced GPU chips by leveraging the GPU memory hierarchy and memory access mechanisms. Together, DHS speeds up computation by 2-3 orders of magnitude compared with the classic NEURON simulator while maintaining identical accuracy.

To enable detailed dendritic simulations for use in AI, we next establish the DeepDendrite framework by integrating the DHS-embedded CoreNEURON platform (an optimized compute engine for NEURON) as the simulation engine, together with two auxiliary modules (an I/O module and a learning module) that support dendritic learning algorithms during simulations.
Last but not least, we also present several applications of DeepDendrite, targeting a few critical challenges in neuroscience and AI: (1) We demonstrate how spatial patterns of dendritic spine inputs affect neuronal activities in neurons containing spines throughout the dendritic trees (full-spine models). DeepDendrite enables us to explore neuronal computation in a simulated human pyramidal neuron model with ~25,000 dendritic spines. (2) In the Discussion, we consider how DeepDendrite can support the training of biophysically detailed neural networks.

All source code for DeepDendrite, the full-spine models, and the detailed dendritic network model is publicly available online (see Code Availability). Our open-source learning framework can be readily integrated with other dendritic learning rules, such as learning rules for nonlinear (fully active) dendrites, burst-dependent synaptic plasticity, and learning with spike prediction. Overall, our study provides a complete set of tools that have the potential to change the current ecosystem of the computational neuroscience community. By leveraging the power of GPU computing, we envision that these tools will facilitate system-level exploration of the computational principles of the brain's fine structures, as well as promote the interaction between neuroscience and modern AI.

Results

Dendritic Hierarchical Scheduling (DHS) method

Computing ionic currents and solving linear equations are two critical phases when simulating biophysically detailed neurons; both are time-consuming and impose severe computational burdens.
Fortunately, computing the ionic currents of each compartment is a fully independent process, so it can be naturally parallelized on devices with massive parallel computing units such as GPUs. As a consequence, solving linear equations becomes the remaining bottleneck for the parallelization process (Fig. 1a-f).

To tackle this bottleneck, cellular-level parallel methods have been developed, which accelerate single-cell computation by "splitting" a single cell into several compartments that can be computed in parallel. However, such methods rely heavily on prior knowledge to generate practical strategies for splitting a single neuron into compartments (Fig. 1g; Supplementary Fig. 1). Hence, they become less efficient for neurons with asymmetrical morphologies, e.g., pyramidal neurons and Purkinje neurons.

We aim to develop a more efficient and precise parallel method for simulating biologically detailed neural networks. First, we establish the criteria for the accuracy of a cellular-level parallel method. We propose three conditions to ensure that a parallel method yields solutions identical to those of the serial Hines method, according to the data dependency in the Hines method (see Methods).

Based on simulation accuracy and computational cost, we formulate the parallelization problem as a mathematical scheduling problem (see Methods). Given k parallel threads, we can compute at most k nodes (compartments of the neuron's dependency tree) at each step, but we must ensure that a node is computed only if all of its child nodes have been processed. Our goal is to find a strategy with the minimum number of steps for the entire procedure.
To generate an optimal partition, we propose a method called Dendritic Hierarchical Scheduling (DHS) (the theoretical proof is presented in the Methods). The DHS method includes two steps, analyzing the dendritic topology and finding the best partition (Fig. 2a): (1) Given a detailed model, we first obtain its corresponding dependency tree and calculate the depth of each node on the tree (the depth of a node is the number of its ancestor nodes) (Fig. 2b, c). (2) After topology analysis, we search the candidates and pick at most k deepest candidate nodes (a node is a candidate only if all of its child nodes have been processed), repeating this step until all nodes are processed (Fig. 2d).

Fig. 2. a, DHS workflow. DHS processes the k deepest candidate nodes in each iteration. b, Illustration of calculating node depth in a compartmental model. The model is first converted to a tree structure, then the depth of each node is computed. c, Topology analysis on different neuron models. Six neurons with distinct morphologies are shown here. For each model, the soma is selected as the root of the tree, so the depth of the nodes increases from the soma (0) to the distal dendrites. d, Illustration of running DHS on the model in b with four threads. Candidates: nodes that can be processed. Selected candidates: nodes that are picked by DHS, i.e., the k deepest candidates. Processed nodes: nodes that have been processed before. e, Parallelization strategy obtained by DHS after the process in d. Each node is assigned to one of the four parallel threads. DHS reduces the steps of serial node processing from 14 to 5 by distributing nodes to multiple threads. f, Relative cost, i.e., the proportion of the computational cost of DHS to that of the serial Hines method, when applying DHS with different numbers of threads to different types of models.

Take a simplified model with 15 compartments as an example. Using the serial Hines method, it takes 14 steps to process all nodes, while DHS with four parallel units can partition the nodes into five subsets (Fig. 2d): {{9,10,12,14}, {1,7,11,13}, {2,3,4,8}, {6}, {5}}. Because nodes in the same subset can be processed in parallel, it takes only five steps to process all nodes using DHS (Fig. 2e).

Next, we apply the DHS method to six representative detailed neuron models (selected from ModelDB) with different numbers of threads (Fig. 2f), including cortical and hippocampal pyramidal neurons, cerebellar Purkinje neurons, striatal projection neurons (SPN), and olfactory bulb mitral cells, covering the major principal neurons in sensory, cortical and subcortical areas. We then measured the computational cost. The relative computational cost here is defined as the proportion of the computational cost of DHS to that of the serial Hines method. The computational cost, i.e., the number of steps taken in solving equations, drops dramatically with increasing thread numbers. For example, with 16 threads, the computational cost of DHS is 7%-10% of that of the serial Hines method. Intriguingly, the DHS method reaches the lower bound of the computational cost for the presented neurons when given 16 or even 8 parallel threads (Fig. 2f), suggesting that adding more threads does not improve performance further because of the dependencies between compartments.
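The two DHS steps described above (depth computation, then repeatedly picking at most k deepest candidates) can be sketched in a few lines of Python. This is a minimal illustration, not the DeepDendrite implementation: the function name `dhs_schedule`, the parent-array encoding, the tie-breaking by node index, and the eight-node toy tree are assumptions made here, and the tree is not the 15-compartment model of Fig. 2.

```python
def dhs_schedule(parent, k):
    """Greedy schedule: at each step pick at most k deepest candidates.

    Assumptions of this sketch (hypothetical encoding):
      parent[0] == -1 marks the root (soma), which is not scheduled,
      and parent[i] < i for every other node i.
    Returns a list of steps; each step is a list of node indices that
    can be processed in parallel.
    """
    n = len(parent)

    # Step 1: topology analysis. depth = number of ancestor nodes.
    depth = [0] * n
    for i in range(1, n):
        depth[i] = depth[parent[i]] + 1

    # Number of children of each node that are still unprocessed.
    pending_children = [0] * n
    for i in range(1, n):
        pending_children[parent[i]] += 1

    # A node is a candidate only if all of its children are processed.
    candidates = [i for i in range(1, n) if pending_children[i] == 0]

    # Step 2: repeatedly select at most k deepest candidates.
    schedule = []
    while candidates:
        # Deepest first; ties broken by node index (a choice of this sketch).
        candidates.sort(key=lambda i: (-depth[i], i))
        chosen, candidates = candidates[:k], candidates[k:]
        schedule.append(chosen)
        for i in chosen:
            p = parent[i]
            if p > 0:  # the root itself is never scheduled
                pending_children[p] -= 1
                if pending_children[p] == 0:
                    candidates.append(p)
    return schedule


# Toy tree: 0 is the root; 1, 2 are its children; 3, 4 hang off 1;
# 5, 6 hang off 2; 7 hangs off 3.
parent = [-1, 0, 0, 1, 1, 2, 2, 3]
schedule = dhs_schedule(parent, k=2)

serial_steps = len(parent) - 1          # one node per step in the serial case
relative_cost = len(schedule) / serial_steps
```

For this toy tree, two threads need four steps instead of seven, so the relative cost in the sense used above is 4/7. Because the schedule depends only on the tree, it can be computed once before the simulation starts.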
Together, we generate a DHS method that enables automated analysis of the dendritic topology and optimal partitioning for parallel computing. It is worth noting that DHS finds the optimal partition before the simulation starts, and no extra computation is needed to solve the equations.

Speeding up DHS by GPU memory boosting

DHS computes each neuron with multiple threads, which consumes a vast number of threads when running neural network simulations. Graphics processing units (GPUs) consist of massive processing units (i.e., streaming processors, SPs; Fig. 3a, b) for parallel computing. In theory, the many SPs on a GPU should support efficient simulation of large-scale neural networks (Fig. 3c). However, we consistently observed that the efficiency of DHS decreased significantly as the network size grew, which might result from scattered data storage or from extra memory accesses caused by loading and writing intermediate results (Fig. 3d, left).

Fig. 3. a, GPU architecture and its memory hierarchy. Each GPU contains massive processing units (stream processors). b, Architecture of Streaming Multiprocessors (SMs). Each SM contains multiple streaming processors, registers, and L1 cache. c, Applying DHS to two neurons, each with four threads. During simulation, each thread executes on one stream processor. d, Memory optimization strategy on GPU. Top panel, thread assignment and data storage of DHS, before (left) and after (right) memory boosting. Processors send a data request to load data for each thread from global memory. Without memory boosting (left), it takes seven transactions to load all requested data and some extra transactions for intermediate results.
e, Run time of DHS (32 threads per cell) with and without memory boosting on multiple layer-5 pyramidal models with spines. f, Speedup from memory boosting on multiple layer-5 pyramidal models with spines.

We solve this problem with GPU memory boosting, a method that increases memory throughput by leveraging the GPU's memory hierarchy and access mechanism. Given the memory loading mechanism of the GPU, successive threads that load aligned and successively stored data achieve high memory throughput, whereas accessing scattered data reduces memory throughput. To achieve high throughput, we first align the computing orders of nodes and rearrange threads according to the number of nodes on them. Then we permute data storage in global memory, consistent with computing orders, i.e., nodes that are processed at the same step are stored successively in global memory. Moreover, we use GPU registers to store intermediate results, further strengthening memory throughput. The example shows that memory boosting takes only two memory transactions to load eight requested data items (Fig. 3d, right). Furthermore, experiments on multiple numbers of pyramidal neurons with spines and on the typical neuron models (Fig. 3e, f; Supplementary Fig. 2) show that memory boosting achieves a 1.2-3.8 times speedup compared with the naïve DHS.

To comprehensively test the performance of DHS with GPU memory boosting, we select six typical neuron models and evaluate the run time of solving cable equations on massive numbers of each model (Fig. 4). We examined DHS with four threads (DHS-4) and sixteen threads (DHS-16) for each neuron. Compared with the GPU method in CoreNEURON, DHS-4 and DHS-16 achieve speedups of about 5 and 15 times, respectively (Fig. 4a).
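The storage permutation behind memory boosting, described above, can be illustrated with a deliberately simplified model. The Python sketch below assumes that one memory transaction fetches one aligned block of `width` consecutive slots; real GPU coalescing rules are more involved, and the function names, the `width` parameter, and the example schedule are hypothetical choices for this illustration only.

```python
def transactions_per_step(slots, width):
    """Count the aligned blocks of `width` slots touched by one step.

    Toy model (an assumption of this sketch): one memory transaction
    loads one aligned block of `width` consecutive slots.
    """
    return len({slot // width for slot in slots})


def total_transactions(schedule, position, width):
    """Sum the transactions over all steps for a given storage layout."""
    return sum(
        transactions_per_step([position[node] for node in step], width)
        for step in schedule
    )


def boosted_layout(schedule):
    """Store nodes in the order in which they are computed.

    Nodes processed at the same step end up in consecutive slots.
    """
    order = [node for step in schedule for node in step]
    return {node: slot for slot, node in enumerate(order)}


# Example schedule (hypothetical): each inner list is one parallel step.
schedule = [[7, 4], [3, 5], [6, 1], [2]]

# Naive layout: node i is stored in slot i, regardless of computing order.
naive = {node: node for step in schedule for node in step}
boosted = boosted_layout(schedule)

naive_cost = total_transactions(schedule, naive, width=2)
boosted_cost = total_transactions(schedule, boosted, width=2)
```

In this toy setting the naive layout needs seven transactions and the permuted layout needs four, because each step then reads from a single aligned block. The sketch only captures the layout idea; it does not model registers or the thread rearrangement mentioned in the text.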
Moreover, compared with the conventional serial Hines method in NEURON running on a single CPU thread, DHS speeds up the simulation by 2-3 orders of magnitude (Supplementary Fig. 3), while retaining identical numerical accuracy in the presence of dense spines (Supplementary Figs. 4 and 8), active dendrites (Supplementary Fig. 7) and different segmentation strategies (Supplementary Fig. 7).

Fig. 4. a, Run time of solving equations for a 1 s simulation on GPU (dt = 0.025 ms, 40,000 iterations in total). CoreNEURON: the parallel method used in CoreNEURON; DHS-4: DHS with four threads for each neuron; DHS-16: DHS with 16 threads for each neuron. b, c, Visualization of the partition by DHS-4 and DHS-16; each color indicates a single thread. During computation, each thread switches among different branches.

DHS creates cell-type-specific optimal partitioning

To gain insights into the working mechanism of the DHS method, we visualized the partitioning process by mapping compartments to each thread (every color represents a single thread in Fig. 4b, c). The visualization shows that a single thread frequently switches among different branches (Fig. 4b, c). Interestingly, DHS generates aligned partitions in morphologically symmetric neurons such as the striatal projection neuron (SPN) and the mitral cell (Fig. 4b, c). By contrast, it generates fragmented partitions in morphologically asymmetric neurons such as the pyramidal neurons and the Purkinje cell (Fig. 4b, c), indicating that DHS splits the neural tree at the scale of individual compartments (i.e., tree nodes) rather than at the scale of branches.

In summary, DHS and memory boosting together provide a theoretically proven optimal solution for solving linear equations in parallel with unprecedented efficiency.
Using this principle, we built the open-access DeepDendrite platform, which neuroscientists can use to implement models without any specific GPU programming knowledge. Below, we demonstrate how DeepDendrite can be used in neuroscience tasks.

DHS enables spine-level modeling

As dendritic spines receive most of the excitatory input to cortical and hippocampal pyramidal neurons, striatal projection neurons, etc., their morphologies and plasticity are crucial for regulating neuronal excitability. However, spines are too small (~1 μm in length) to be directly measured experimentally with regard to voltage-dependent processes. Thus, theoretical work is critical for a full understanding of spine computations.

We can model a single spine with two compartments: the spine head, where synapses are located, and the spine neck, which links the spine head to the dendrite. The theory predicts that the very thin spine neck (0.1-0.5 μm in diameter) electrically isolates the spine head from its parent dendrite, thus compartmentalizing the signals generated at the spine head. However, a detailed model with fully distributed spines on dendrites ("full-spine model") is computationally very expensive. A common compromise is the F spine factor: instead of modeling all spines explicitly, the F factor aims at approximating the effect of spines on the biophysical properties of the cell membrane.

Inspired by the previous work of Eyal et al., we investigated how different spatial patterns of excitatory inputs formed on dendritic spines shape neuronal activities in a human pyramidal neuron model with explicitly modeled spines (Fig. 5a).
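As a rough illustration of the F spine factor mentioned above, the following Python sketch uses a commonly cited convention: F is the ratio of total membrane area (dendrite plus spines) to dendritic area, the specific membrane capacitance is multiplied by F, and the specific membrane resistance is divided by F. This convention is an assumption here, since the text only states that F approximates the effect of spines on membrane properties; the function names and the numerical values are hypothetical.

```python
def f_factor(dendritic_area_um2, spine_area_um2):
    """Spine factor F = (dendritic area + spine area) / dendritic area.

    Assumed definition (a common convention, not taken from the text).
    """
    return (dendritic_area_um2 + spine_area_um2) / dendritic_area_um2


def scale_membrane(cm_uF_per_cm2, rm_ohm_cm2, f):
    """Fold the spine membrane into the dendrite.

    Assumed convention: capacitance is multiplied by F and
    membrane resistance is divided by F.
    """
    return cm_uF_per_cm2 * f, rm_ohm_cm2 / f


# Hypothetical example values, chosen only to show the arithmetic.
f = f_factor(dendritic_area_um2=100.0, spine_area_um2=90.0)
cm_eff, rm_eff = scale_membrane(cm_uF_per_cm2=1.0, rm_ohm_cm2=20000.0, f=f)
```

The scaling adjusts the leak and capacitive load of the dendrite but leaves no explicit spine head or neck in the model, which is the limitation that full-spine models are meant to address.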
Notably, Eyal et al. employed the F spine factor to incorporate spines into dendrites, while only a few activated spines were explicitly attached to dendrites ("few-spine model" in Fig. 5a). The value of F in their model was computed from the dendritic area and spine area in the reconstructed data. Accordingly, we calculated the spine density from their reconstructed data to make our full-spine model more consistent with Eyal's few-spine model. With the spine density set to 1.3 μm^-1, the pyramidal neuron model contained about 25,000 spines without altering the model's original morphological and biophysical properties. We then repeated the previous experimental protocols with both full-spine and few-spine models. We used the same synaptic input as in Eyal's work but added extra background noise to each sample. By comparing the somatic traces (Fig. 5b, c) and spike probability (Fig. 5d) in the full-spine and few-spine models, we found that the full-spine model is much leakier than the few-spine model. In addition, the spike probability triggered by the activation of clustered spines appeared to be more nonlinear in the full-spine model (the solid blue line in Fig. 5d) than in the few-spine model (the dashed blue line in Fig. 5d). These results indicate that the conventional F-factor method may underestimate the impact of dense spines on the computations of dendritic excitability and nonlinearity.

Fig. 5. a, Experiment setup. We examine two major types of models: few-spine models and full-spine models. Few-spine models (two on the left) incorporate the spine area globally into the dendrites and explicitly attach only the individual spines that carry activated synapses. In full-spine models (two on the right), all spines are explicitly attached over the whole dendritic tree.
We explore the effects of clustered and randomly distributed synaptic inputs on the few-spine models and the full-spine models, respectively. b, Somatic voltages recorded for the cases in a. Colors of the voltage curves correspond to a; scale bar: 20 ms, 20 mV. c, Color-coded voltages during the simulation in b at specific times. Colors indicate the magnitude of voltage. d, Somatic spike probability as a function of the number of simultaneously activated synapses (as in Eyal et al.'s work) for the four cases in a. Background noise is attached. e, Run time of the experiments in d with different simulation methods. NEURON: conventional NEURON simulator running on a single CPU core. CoreNEURON: CoreNEURON simulator on a single GPU. DeepDendrite: DeepDendrite on a single GPU.

On the DeepDendrite platform, both full-spine and few-spine models achieved an 8 times speedup compared with CoreNEURON on the GPU platform and a 100 times speedup compared with serial NEURON on the CPU platform (Fig. 5e; Supplementary Table 1), while producing identical simulation results (Supplementary Figs. 4 and 8). Therefore, the DHS method enables the exploration of dendritic excitability under more realistic anatomical conditions.

Discussion

In this work, we propose the DHS method to parallelize the computation of the Hines method, and we mathematically demonstrate that DHS provides an optimal solution without any loss of precision. Next, we implement DHS on the GPU hardware platform and use GPU memory boosting techniques to refine it (Fig. 3). When simulating a large number of neurons with complex morphologies, DHS with memory boosting achieves a 15-fold speedup (Supplementary Table 1) compared with the GPU method used in CoreNEURON and up to a 1,500-fold speedup compared with the serial Hines method on the CPU platform (Fig. 4; Supplementary Fig. 3 and Supplementary Table 1). Furthermore, we develop the GPU-based DeepDendrite framework by integrating DHS into CoreNEURON.
Finally, as a demonstration of the capacity of DeepDendrite, we present a representative application: examining spine computations in a detailed pyramidal neuron model with 25,000 spines. Further in this section, we elaborate on how we have expanded the DeepDendrite framework to enable efficient training of biophysically detailed neural networks. To explore the hypothesis that dendrites improve robustness against adversarial attacks, we train our network on typical image classification tasks. We show that DeepDendrite can support both neuroscience simulations and AI-related detailed neural network tasks with unprecedented speed, thereby significantly promoting detailed neuroscience simulations and, potentially, future AI explorations.

Decades of effort have been invested in speeding up the Hines method with parallel methods. Early work mainly focused on network-level parallelization. In network simulations, each cell independently solves its corresponding linear equations with the Hines method. Network-level parallel methods distribute a network over multiple threads and parallelize the computation of each cell group with each thread. With network-level methods, we can simulate detailed networks on clusters or supercomputers. In recent years, GPUs have been used for detailed network simulation. Because a GPU contains massive computing units, one thread is usually assigned one cell rather than a cell group. With further optimization, GPU-based methods achieve much higher efficiency in network simulation. However, the computation inside each cell is still serial in network-level methods, so they cannot cope when the "Hines matrix" of each cell becomes large.

Cellular-level parallel methods further parallelize the computation inside each cell. The main idea of cellular-level parallel methods is to split each cell into several sub-blocks and parallelize the computation of those sub-blocks.
However, typical cellular-level methods (e.g., the "multi-split" method) pay less attention to the parallelization strategy. The lack of a fine parallelization strategy results in unsatisfactory performance. To achieve higher efficiency, some studies try to obtain finer-grained parallelization by introducing extra computation operations or by making approximations on some crucial compartments while solving linear equations. These finer-grained parallelization strategies can reach higher efficiency but lack the numerical accuracy of the original Hines method.

Unlike previous methods, DHS adopts the finest-grained parallelization strategy, i.e., compartment-level parallelization. By modeling the problem of "how to parallelize" as a combinatorial optimization problem, DHS provides an optimal compartment-level parallelization strategy. Moreover, DHS does not introduce any extra operation or value approximation, so it achieves the lowest computational cost while retaining the numerical accuracy of the original Hines method.

Dendritic spines are the most abundant microstructures in the brain for projection neurons in the cortex, hippocampus, cerebellum, and basal ganglia. As spines receive most of the excitatory inputs in the central nervous system, electrical signals generated by spines are the main driving force for large-scale neuronal activities in the forebrain and cerebellum. The structure of the spine, with an enlarged spine head and a very thin spine neck, leads to surprisingly high input impedance at the spine head, which could be up to 500 MΩ according to studies combining experimental data with detailed compartmental modeling. Due to such high input impedance, a single synaptic input can evoke a "gigantic" EPSP (~20 mV) at the spine-head level, thereby boosting NMDA currents and ion channel currents in the spine.
However, in classic detailed compartment models, all spines are replaced by the F coefficient, which modifies the dendritic cable geometries. This approach may compensate for the leak currents and capacitive currents of spines. Still, it cannot reproduce the high input impedance at the spine head, which may weaken excitatory synaptic inputs, particularly NMDA currents, thereby reducing the nonlinearity in the neuron's input-output curve. Our modeling results are in line with this interpretation.

On the other hand, the spine's electrical compartmentalization is always accompanied by biochemical compartmentalization, resulting in a drastic increase of internal [Ca2+] within the spine and a cascade of molecular processes involving synaptic plasticity that are important for learning and memory. Intriguingly, the biochemical process triggered by learning, in turn, remodels the spine's morphology, enlarging (or shrinking) the spine head, or elongating (or shortening) the spine neck, which significantly alters the spine's electrical capacity. Such experience-dependent changes in spine morphology, also referred to as "structural plasticity", have been widely observed in the visual cortex, somatosensory cortex, motor cortex, hippocampus, and the basal ganglia in vivo. They play a critical role in motor and spatial learning as well as memory formation. However, due to the computational costs, nearly all detailed network models rely on the "F-factor" approach to replace actual spines, and are thus unable to explore spine functions at the system level. By taking advantage of our framework and the GPU platform, we can run a few thousand detailed neuron models, each with tens of thousands of spines, on a single GPU, while running ~100 times faster than the traditional serial method on a single CPU (Fig. 5e). Therefore, it enables us to explore structural plasticity in large-scale circuit models across diverse brain regions.
Another critical issue is how to link dendrites to brain functions at the systems/network level. It has been well established that dendrites can perform comprehensive computations on synaptic inputs due to enriched ion channels and local biophysical membrane properties [5, 6, 7]. For example, cortical pyramidal neurons can carry out sublinear synaptic integration at the proximal dendrite but progressively shift to supralinear integration at the distal dendrite [77]. Moreover, distal dendrites can produce regenerative events such as dendritic sodium spikes, calcium spikes, and NMDA spikes/plateau potentials [6, 78]. Such dendritic events are widely observed in mice [6] or even human cortical neurons [79] in vitro, which may offer various logical operations [6, 79] or gating functions [80, 81]. Recently, in vivo recordings in awake or behaving mice have provided strong evidence that dendritic spikes/plateau potentials are crucial for orientation selectivity in the visual cortex [82], sensory-motor integration in the whisker system [83, 84], and spatial navigation in the hippocampal CA1 region [85].

To establish the causal link between dendrites and animal (including human) patterns of behavior, large-scale biophysically detailed neural circuit models are a powerful computational tool. However, running a large-scale detailed circuit model of 10,000-100,000 neurons generally requires the computing power of supercomputers. It is even more challenging to optimize such models for in vivo data, as this requires iterative simulations of the models. The DeepDendrite framework can directly support many state-of-the-art large-scale circuit models [86, 87, 88], which were initially developed based on NEURON.
Moreover, using our framework, a single GPU card such as the Tesla A100 could easily support the operation of detailed circuit models of up to 10,000 neurons, thereby providing carbon-efficient and affordable plans for ordinary labs to develop and optimize their own large-scale detailed models.

Recent works on unraveling the dendritic roles in task-specific learning have achieved remarkable results in two directions, i.e., solving challenging tasks such as image classification on the ImageNet dataset with simplified dendritic networks [20], and exploring full learning potentials on more realistic neurons [21, 22]. However, there lies a trade-off between model size and biological detail, as the increase in network scale is often sacrificed for neuron-level complexity [19, 20, 89]. Moreover, more detailed neuron models are less mathematically tractable and more computationally expensive [21].

There has also been progress on the role of active dendrites in ANNs for computer vision tasks. Iyer et al. [90] proposed a novel ANN architecture with active dendrites, demonstrating competitive results in multi-task and continual learning. Jones and Kording [91] used a binary tree to approximate dendrite branching and provided valuable insights into the influence of tree structure on single neurons’ computational capacity. A further study [92] proposed a dendritic normalization rule based on biophysical behavior, offering an interesting perspective on the contribution of dendritic arbor structure to computation. While these studies offer valuable insights, they primarily rely on abstractions derived from spatially extended neurons, and do not fully exploit the detailed biological properties and spatial information of dendrites.
In response to these challenges, we developed DeepDendrite, a tool that uses the Dendritic Hierarchical Scheduling (DHS) method to significantly reduce computational costs and incorporates an I/O module and a learning module to handle large datasets. With DeepDendrite, we successfully implemented a three-layer hybrid neural network, the Human Pyramidal Cell Network (HPC-Net) (Fig. 6a, b). This network demonstrated efficient training capabilities in image classification tasks, achieving approximately 25 times speedup compared to training on a traditional CPU-based platform (Fig. 6f; Supplementary Table 1).

Fig. 6. a, The illustration of the Human Pyramidal Cell Network (HPC-Net) for image classification. Images are transformed to spike trains and fed into the network model. Learning is triggered by error signals propagated from soma to dendrites. b, Training with mini-batch. Multiple networks are simulated simultaneously with different images as inputs. The total weight updates ΔW are computed as the average of ΔWi from each network. c, Comparison of the HPC-Net before and after training. Left, the visualization of hidden neuron responses to a specific input before (top) and after (bottom) training. Right, the distribution of hidden layer weights (from input to hidden layer) before (top) and after (bottom) training. d, Workflow of the transfer adversarial attack experiment. We first generate adversarial samples of the test set on a 20-layer ResNet, then use these adversarial samples (noisy images) to test the classification accuracy of models trained with clean images. e, Prediction accuracy of each model on adversarial samples after training for 30 epochs on the MNIST (left) and Fashion-MNIST (right) datasets. f, Run time of training and testing for the HPC-Net. The batch size is set to 16. Left, run time of training one epoch. Right, run time of testing.
Parallel NEURON + Python: training and testing on a single CPU with multiple cores, using 40-process-parallel NEURON to simulate the HPC-Net and extra Python code to support mini-batch training. DeepDendrite: training and testing the HPC-Net on a single GPU with DeepDendrite.

Additionally, it is widely recognized that the performance of Artificial Neural Networks (ANNs) can be undermined by adversarial attacks [93], i.e., intentionally engineered perturbations devised to mislead ANNs. Intriguingly, an existing hypothesis suggests that dendrites and synapses may innately defend against such attacks [56, 94]. Our experimental results with HPC-Net lend support to this hypothesis: networks endowed with detailed dendritic structures demonstrated some increased resilience to transfer adversarial attacks compared to standard ANNs, as evident on the MNIST [95] and Fashion-MNIST [96] datasets (Fig. 6d, e). This evidence implies that the inherent biophysical properties of dendrites could be pivotal in augmenting the robustness of ANNs against adversarial interference. Nonetheless, further studies are essential to validate these findings using more challenging datasets such as ImageNet [97].

In conclusion, DeepDendrite has shown remarkable potential in image classification tasks, opening up a range of future directions. To further advance DeepDendrite and the application of biologically detailed dendritic models in AI tasks, we may focus on developing multi-GPU systems and exploring applications in other domains, such as Natural Language Processing (NLP), where dendritic filtering properties align well with the inherently noisy and ambiguous nature of human language. Challenges include testing scalability in larger-scale problems, understanding performance across various tasks and domains, and addressing the computational complexity introduced by novel biological principles, such as active dendrites.
By overcoming these limitations, we can further advance the understanding and capabilities of biophysically detailed dendritic neural networks, potentially uncovering new advantages, enhancing their robustness against adversarial attacks and noisy inputs, and ultimately bridging the gap between neuroscience and modern AI.

Methods

Simulation with DHS

The CoreNEURON simulator [35] (https://github.com/BlueBrain/CoreNeuron) uses the NEURON [25] architecture and is optimized for both memory usage and computational speed. We implement our Dendritic Hierarchical Scheduling (DHS) method in the CoreNEURON environment by modifying its source code. All models that can be simulated on GPU with CoreNEURON can also be simulated with DHS by executing the following command:

    coreneuron_exec -d /path/to/models -e time --cell-permute 3 --cell-nthread 16 --gpu

The usage options are as in Table 1.

Accuracy of the simulation using cellular-level parallel computation

To ensure the accuracy of the simulation, we first need to define the correctness of a cellular-level parallel algorithm, so that we can judge whether it generates solutions identical to those of proven correct serial methods, such as the Hines method used in the NEURON simulation platform. Based on the theories of parallel computing [34], a parallel algorithm yields a result identical to that of its corresponding serial algorithm if and only if the data processing order in the parallel algorithm is consistent with the data dependency in the serial method. The Hines method has two symmetrical phases: triangularization and back-substitution. By analyzing the serial Hines method [55], we find that its data dependency can be formulated as a tree structure, where the nodes of the tree represent the compartments of the detailed neuron model. In the triangularization process, the value of each node depends on its child nodes.
In contrast, during the back-substitution process, the value of each node depends on its parent node (Fig. 1d). Thus, we can compute nodes on different branches in parallel, as their values do not depend on each other.

Based on the data dependency of the serial Hines method, we propose three conditions to ensure that a parallel method yields solutions identical to those of the serial Hines method: (1) the tree morphology and initial values of all nodes are identical to those in the serial Hines method; (2) in the triangularization phase, a node can be processed if and only if all its child nodes have already been processed; (3) in the back-substitution phase, a node can be processed only if its parent node has already been processed.

Computational cost of cellular-level parallel computing method

To theoretically evaluate the run time, i.e., efficiency, of the serial and parallel computing methods, we introduce and formulate the concept of computational cost as follows: given a tree $T$ and $k$ threads (basic computational units) to perform triangularization, parallel triangularization is equivalent to dividing the node set $V$ of $T$ into $n$ subsets, i.e., $V = \{V_1, V_2, \ldots, V_n\}$, where the size of each subset $|V_i| \le k$, i.e., at most $k$ nodes can be processed in each step since there are only $k$ threads. The triangularization phase follows the order $V_1 \to V_2 \to \ldots \to V_n$, and nodes in the same subset $V_i$ can be processed in parallel. So, we define $|V|$ (the size of the set $V$, i.e., $n$ here) as the computational cost of the parallel computing method. In short, we define the computational cost of a parallel method as the number of steps it takes in the triangularization phase. Because back-substitution is symmetrical with triangularization, the total cost of the entire equation-solving phase is twice that of the triangularization phase.
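As a hedged illustration of this cost definition, two elementary lower bounds on the number of steps $n$ follow directly from the constraints above. The symbols $N$ (total number of nodes in the tree) and $h(T)$ (number of nodes on the longest root-to-leaf path) are introduced here for the illustration only and are not part of the original notation.

```latex
% n = number of steps (subsets), k = number of threads, as in the text.
% N and h(T) are symbols introduced for this illustration only.
\begin{align*}
  n &\ge \left\lceil \frac{N}{k} \right\rceil
    && \text{(at most $k$ of the $N$ nodes are processed per step)} \\
  n &\ge h(T)
    && \text{(a node and its parent can never share a step)} \\
  k = 1 \;&\Rightarrow\; n = N
    && \text{(serial case: one node per step)} \\
  \text{total cost} &= 2n
    && \text{(triangularization plus back-substitution)}
\end{align*}
```

For example, an unbranched chain of 5 nodes needs $n = 5$ steps for any $k$, because $h(T) = 5$. A root with 6 leaf children processed with $k = 3$ threads needs at least $\max(\lceil 7/3 \rceil, 2) = 3$ steps, and 3 steps suffice: two steps for the leaves, one for the root.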
Mathematical scheduling problem

Based on the simulation accuracy and computational cost, we formulate the parallelization problem as a mathematical scheduling problem:

Given a tree $T = \{V, E\}$ and a positive integer $k$, where $V$ is the node set and $E$ is the edge set, define a partition $P(V) = \{V_1, V_2, \ldots, V_n\}$ with $|V_i| \le k$, $1 \le i \le n$, where $|V_i|$ indicates the cardinality of subset $V_i$, i.e., the number of nodes in $V_i$, and for each node $v \in V_i$, all its child nodes $\{c \mid c \in \mathrm{children}(v)\}$ must be in a previous subset $V_j$, where $1 \le j < i$. Our goal is to find an optimal partition $P^*(V)$ whose computational cost $|P^*(V)|$ is minimal.

Here subset $V_i$ consists of all nodes that will be computed at the $i$-th step (Fig. 2e), so $|V_i| \le k$ indicates that we can compute at most $k$ nodes in each step because the number of available threads is $k$. The restriction “for each node $v \in V_i$, all its child nodes $\{c \mid c \in \mathrm{children}(v)\}$ must be in a previous subset $V_j$, where $1 \le j < i$” indicates that node $v$ can be processed only if all its child nodes have been processed.

DHS implementation

We aim to find an optimal way to parallelize the computation of solving linear equations for each neuron model by solving the mathematical scheduling problem above. To get the optimal partition, DHS first analyzes the topology and calculates the depth $d(v)$ for all nodes $v \in V$. Then, the following two steps are executed iteratively until every node $v \in V$ is assigned to a subset: (1) find all candidate nodes and put these nodes into the candidate set $Q$. A node is a candidate only if all its child nodes have been processed or it does not have any child nodes. (2) If $|Q| \le k$, i.e., the number of candidate nodes is smaller than or equal to the number of available threads, remove all nodes in $Q$ and put them into $V_i$; otherwise, remove the $k$ deepest nodes from $Q$ and add them to subset $V_i$.
Label these nodes as processed nodes (Fig. 2d). After filling in subset $V_i$, go to step (1) to fill in the next subset $V_{i+1}$.

Correctness proof for DHS

After applying DHS to a neural tree $T = \{V, E\}$, we get a partition $P(V) = \{V_1, V_2, \ldots, V_n\}$, $|V_i| \le k$, $1 \le i \le n$. Nodes in the same subset $V_i$ are computed in parallel, taking $n$ steps each to perform triangularization and back-substitution. We then demonstrate that the reordering of the computation in DHS yields a result identical to that of the serial Hines method.

The partition $P(V)$ obtained from DHS decides the computation order of all nodes in a neural tree. Below we demonstrate that the computation order determined by $P(V)$ satisfies the correctness conditions. $P(V)$ is obtained from the given neural tree $T$. Operations in DHS do not modify the tree topology or the values of tree nodes (the corresponding values in the linear equations), so the tree morphology and initial values of all nodes are not changed, which satisfies condition 1: the tree morphology and initial values of all nodes are identical to those in the serial Hines method. In triangularization, nodes are processed from subset $V_1$ to $V_n$. As shown in the implementation of DHS, all nodes in subset $V_i$ are selected from the candidate set $Q$, and a node can be put into $Q$ only if all its child nodes have been processed. Thus the child nodes of all nodes in $V_i$ are in $\{V_1, V_2, \ldots, V_{i-1}\}$, meaning that a node is only computed after all its children have been processed, which satisfies condition 2: in triangularization, a node can be processed if and only if all its child nodes are already processed. In back-substitution, the computation order is the opposite of that in triangularization, i.e., from $V_n$ to $V_1$. As shown before, the child nodes of all nodes in $V_i$ are in $\{V_1, V_2, \ldots, V_{i-1}\}$, so the parent nodes of nodes in $V_i$ are in $\{V_{i+1}, V_{i+2}, \ldots, V_n\}$, which satisfies condition 3: in back-substitution, a node can be processed only if its parent node is already processed.
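The scheduling steps and the ordering conditions above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' GPU/CoreNEURON implementation: the tree is encoded as a hypothetical parent array, the names `dhs_partition` and `respects_dependencies` are made up for this example, and ties between equally deep candidates are broken arbitrarily.

```python
from collections import defaultdict


def dhs_partition(parent, k):
    """Greedy deepest-first scheduling of a tree onto k threads.

    parent[v] is the parent of node v, or -1 for the root (hypothetical
    encoding chosen for this sketch). Returns a list of subsets
    V_1..V_n, each holding at most k node ids.
    """
    n_nodes = len(parent)
    children = defaultdict(list)
    for v, p in enumerate(parent):
        if p >= 0:
            children[p].append(v)

    # Depth d(v): number of edges between v and the root.
    depth = [0] * n_nodes
    stack = [v for v, p in enumerate(parent) if p < 0]
    while stack:
        v = stack.pop()
        for c in children[v]:
            depth[c] = depth[v] + 1
            stack.append(c)

    # A node is a candidate once all of its children are processed.
    pending = [len(children[v]) for v in range(n_nodes)]
    candidates = [v for v in range(n_nodes) if pending[v] == 0]

    partition = []
    while candidates:
        # Take all candidates if there are at most k, else the k deepest.
        candidates.sort(key=lambda v: depth[v], reverse=True)
        step, candidates = candidates[:k], candidates[k:]
        partition.append(step)
        # Parents become candidates only for later steps.
        for v in step:
            p = parent[v]
            if p >= 0:
                pending[p] -= 1
                if pending[p] == 0:
                    candidates.append(p)
    return partition


def respects_dependencies(parent, partition):
    """Check conditions 2 and 3: every child is scheduled in a strictly
    earlier triangularization step than its parent (and therefore in a
    strictly later back-substitution step)."""
    step_of = {v: i for i, step in enumerate(partition) for v in step}
    if len(step_of) != len(parent):
        return False
    return all(step_of[v] < step_of[p] for v, p in enumerate(parent) if p >= 0)


# Small example tree: node 0 is the root (soma-like node).
example_parent = [-1, 0, 0, 1, 1, 2, 3]
for threads in (1, 2, 3):
    schedule = dhs_partition(example_parent, threads)
    print(threads, len(schedule), respects_dependencies(example_parent, schedule))
```

Running the schedule in list order corresponds to triangularization, and running it in reverse corresponds to back-substitution. The number of subsets is the computational cost defined earlier.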
Optimality proof for DHS

The idea of the proof is that if there is another optimal solution, it can be transformed into our DHS solution without increasing the number of steps the algorithm requires, thus indicating that the DHS solution is optimal.

For each subset $V_i$ in $P(V)$, DHS moves the $k$ (thread number) deepest nodes from the corresponding candidate set $Q_i$ to $V_i$. If the number of nodes in $Q_i$ is smaller than $k$, it moves all nodes from $Q_i$ to $V_i$. To simplify, we introduce $D_i$, indicating the depth sum of the $k$ deepest nodes in $Q_i$. All subsets in $P(V)$ satisfy the max-depth criteria (Supplementary Fig. 6a), i.e., the depths of the nodes in $V_i$ sum to $D_i$: $\sum_{v \in V_i} d(v) = D_i$. We then prove that selecting the deepest nodes in each iteration makes $P(V)$ an optimal partition. If there exists an optimal partition $P^*(V) = \{V^*_1, V^*_2, \ldots, V^*_s\}$ containing subsets that do not satisfy the max-depth criteria, we can modify the subsets in $P^*(V)$ so that all subsets consist of the deepest nodes from $Q$ and the number of subsets ($|P^*(V)|$) remains the same after modification.

Without loss of generality, we start from the first subset $V^*_i$ not satisfying the criteria, i.e., $\sum_{v \in V^*_i} d(v) < D_i$. There are two possible cases that make $V^*_i$ not satisfy the max-depth criteria: (1) $|V^*_i| < k$ and there exist some valid nodes in $Q_i$ that are not put into $V^*_i$; (2) $|V^*_i| = k$ but the nodes in $V^*_i$ are not the $k$ deepest nodes in $Q_i$.

For case (1), because some candidate nodes are not put into $V^*_i$, these nodes must be in the subsequent subsets. As $|V^*_i| < k$, we can move the corresponding nodes from the subsequent subsets to $V^*_i$, which does not increase the number of subsets and makes $V^*_i$ satisfy the criteria (Supplementary Fig. 6b, top). For case (2), $|V^*_i| = k$, the deeper nodes that are not moved from the candidate set $Q_i$ into $V^*_i$ must be added to subsequent subsets (Supplementary Fig. 6b, bottom). These deeper nodes can be moved from subsequent subsets to $V^*_i$ through the following method.
Assume that after filling $V^*_i$, $v$ is picked and one of the $k$-th deepest nodes $v'$ is still in $Q_i$, thus $v'$ will be put into a subsequent subset $V^*_j$ ($j > i$). We first move $v$ from $V^*_i$ to $V^*_{i+1}$, then modify subset $V^*_{i+1}$ as follows: if $|V^*_{i+1}| \le k$ and none of the nodes in $V^*_{i+1}$ is the parent of node $v$, stop modifying the latter subsets. Otherwise, modify $V^*_{i+1}$ as follows (Supplementary Fig. 6c): if the parent node of $v$ is in $V^*_{i+1}$, move this parent node to $V^*_{i+2}$; else move the node with minimum depth from $V^*_{i+1}$ to $V^*_{i+2}$. After adjusting $V^*_i$, modify subsequent subsets $V^*_{i+1}$, $V^*_{i+2}$, …, $V^*_{j-1}$ with the same strategy. Finally, move $v'$ from $V^*_j$ to $V^*_i$.

With the modification strategy described above, we can replace all shallower nodes in $V^*_i$ with the $k$-th deepest node in $Q_i$ and keep the number of subsets, i.e., $|P^*(V)|$, the same after modification. We can modify the nodes with the same strategy for all subsets in $P^*(V)$ that do not contain the deepest nodes. Finally, all subsets $V^*_i \in P^*(V)$ satisfy the max-depth criteria, and $|P^*(V)|$ does not change after the modification.

In conclusion, DHS generates a partition $P(V)$, and all subsets $V_i \in P(V)$ satisfy the max-depth condition. For any other optimal partition $P^*(V)$, we can modify its subsets to make its structure the same as $P(V)$, i.e., each subset consists of the deepest nodes in the candidate set, while keeping $|P^*(V)|$ the same after modification. So, the partition $P(V)$ obtained from DHS is one of the optimal partitions.

GPU implementation and memory boosting

To achieve high memory throughput, the GPU utilizes a memory hierarchy of (1) global memory, (2) cache, and (3) registers, where global memory has large capacity but low throughput, while registers have low capacity but high throughput. We aim to boost memory throughput by leveraging the memory hierarchy of the GPU.
The GPU employs the SIMT (Single-Instruction, Multiple-Thread) architecture. Warps are the basic scheduling units on the GPU (a warp is a group of 32 parallel threads) [46]. Correctly ordering the nodes is essential for this batching of computation in warps, to make sure that DHS obtains results identical to those of the serial Hines method. When implementing DHS on the GPU, we first group all cells into multiple warps based on their morphologies. Cells with similar morphologies are grouped in the same warp. We then apply DHS to all neurons, assigning the compartments of each neuron to multiple threads. Because neurons are grouped into warps, the threads for the same neuron are in the same warp.

When a warp loads pre-aligned and successively stored data from global memory, it can make full use of the cache, which leads to high memory throughput, while accessing scattered data reduces memory throughput. After compartment assignment and thread rearrangement, we permute the data in global memory to make it consistent with the computing order, so that warps can load successively stored data when executing the program. Moreover, we put the necessary temporary variables into registers rather than global memory.

Full-spine and few-spine biophysical models

We used the published human pyramidal neuron [51]. The membrane capacitance $c_m$ = 0.44 μF cm⁻², membrane resistance $r_m$ = 48,300 Ω cm², and axial resistivity $r_a$ = 261.97 Ω cm. In this model, all dendrites were modeled as passive cables while somas were active. The leak reversal potential $E_l$ = -83.1 mV. Ion channels such as Na+ and K+ were inserted on the soma and initial axon, and their reversal potentials were $E_{Na}$ = 67.6 mV and $E_K$ = -102 mV, respectively. All these specific parameters were set the same as in the model of Eyal et al.
[51]; for more details, please refer to the published model (ModelDB, accession No. 238347).

In the few-spine model, the membrane capacitance and maximum leak conductance of the dendritic cables 60 μm away from the soma were multiplied by an $F_{spine}$ factor to approximate dendritic spines. In this model, $F_{spine}$ was set to 1.9. Only the spines that receive synaptic inputs were explicitly attached to dendrites.

In the full-spine model, all spines were explicitly attached to dendrites. We calculated the spine density with the reconstructed neuron in Eyal et al. [51]. The spine density was set to 1.3 μm⁻¹, and each cell contained 24,994 spines on dendrites 60 μm away from the soma.

The morphologies and biophysical mechanisms of spines were the same in the few-spine and full-spine models. The length of the spine neck $L_{neck}$ = 1.35 μm and the diameter $D_{neck}$ = 0.25 μm, whereas the length and diameter of the spine head were 0.944 μm, i.e., the spine head area was set to 2.8 μm². For the spines, the reversal potential $E_l$ = -86 mV. The specific membrane capacitance, membrane resistance, and axial resistivity were the same as those of the dendrites.

Synaptic inputs

We investigated neuronal excitability for both distributed and clustered synaptic inputs. All activated synapses were attached to the terminal of the spine head. For distributed inputs, all activated synapses were randomly distributed on all dendrites. For clustered inputs, each cluster consisted of 20 activated synapses that were uniformly distributed on a single randomly selected compartment. All synapses were activated simultaneously during the simulation.

AMPA-based and NMDA-based synaptic currents were simulated as in Eyal et al.’s work. AMPA conductance was modeled as a double-exponential function and NMDA conductance as a voltage-dependent double-exponential function. For the AMPA model, the specific $\tau_{rise}$ and $\tau_{decay}$ were set to 0.3 and 1.8 ms.
For the NMDA model, $\tau_{rise}$ and $\tau_{decay}$ were set to 8.019 and 34.9884 ms, respectively. The maximum conductances of AMPA and NMDA were 0.73 nS and 1.31 nS.

Background noise

We attached background noise to each cell to simulate a more realistic environment. Noise patterns were implemented as Poisson spike trains with a constant rate of 1.0 Hz. Each pattern started at $t_{start}$ = 10 ms and lasted until the end of the simulation. We generated 400 noise spike trains for each cell and attached them to randomly selected synapses. The model and specific parameters of synaptic currents were the same as described in Synaptic inputs, except that the maximum conductance of NMDA was uniformly distributed from 1.57 to 3.275, resulting in a higher NMDA to AMPA ratio.

Exploring neuronal excitability

We investigated the spike probability when multiple synapses were activated simultaneously. For distributed inputs, we tested 14 cases, from 0 to 240 activated synapses. For clustered inputs, we tested 9 cases in total, activating from 0 to 12 clusters. Each cluster consisted of 20 synapses. For each case in both distributed and clustered inputs, we calculated the spike probability with 50 random samples. Spike probability was defined as the ratio of the number of neurons that fired to the total number of samples. All 1150 samples were simulated simultaneously on our DeepDendrite platform, reducing the simulation time from days to minutes.

Performing AI tasks with the DeepDendrite platform

Conventional detailed neuron simulators lack two functionalities important to modern AI tasks: (1) alternately performing simulations and weight updates without heavy reinitialization and (2) simultaneously processing multiple stimulus samples in a batch-like manner. Here we present the DeepDendrite platform, which supports both biophysical simulation and deep learning tasks with detailed dendritic models.

DeepDendrite consists of three modules (Supplementary Fig. 5
): (1) an I/O module; (2) a DHS-based simulating module; (3) a learning module. When training a biophysically detailed model to perform learning tasks, users first define the learning rule, then feed all training samples to the detailed model for learning. In each step during training, the I/O module picks a specific stimulus and its corresponding teacher signal (if necessary) from all training samples and attaches the stimulus to the network model. Then, the DHS-based simulating module initializes the model and starts the simulation. After simulation, the learning module updates all synaptic weights according to the difference between model responses and teacher signals. After training, the learned model can achieve performance comparable to an ANN. The testing phase is similar to training, except that all synaptic weights are fixed.

HPC-Net model

Image classification is a typical task in the field of AI. In this task, a model should learn to recognize the content in a given image and output the corresponding label. Here we present the HPC-Net, a network consisting of detailed human pyramidal neuron models that can learn to perform image classification tasks by utilizing the DeepDendrite platform.

HPC-Net has three layers, i.e., an input layer, a hidden layer, and an output layer. The neurons in the input layer receive spike trains converted from images as their input. Hidden layer neurons receive the output of input layer neurons and deliver responses to neurons in the output layer. The responses of the output layer neurons are taken as the final output of HPC-Net. Neurons between adjacent layers are fully connected.

For each image stimulus, we first convert each normalized pixel to a homogeneous spike train. For the pixel with coordinates $(x, y)$ in the image, the corresponding spike train has a constant interspike interval $\tau_{ISI}(x, y)$ (in ms) determined by the pixel value $p(x, y)$ as shown in Eq. (1).
In our experiment, the simulation for each stimulus lasted 50 ms. All spike trains started at 9 + $\tau_{ISI}$ ms and lasted until the end of the simulation. Then we attached all spike trains to the input layer neurons in a one-to-one manner. The synaptic current triggered by the spike arriving at time $t_0$ is determined by the post-synaptic voltage $v$, the reversal potential $E_{syn}$ = 1 mV, the maximum synaptic conductance $g_{max}$ = 0.05 μS, and the time constant $\tau$ = 0.5 ms.

Neurons in the input layer were modeled with a passive single-compartment model. The specific parameters were set as follows: membrane capacitance $c_m$ = 1.0 μF cm⁻², membrane resistance $r_m$ = 10⁴ Ω cm², axial resistivity $r_a$ = 100 Ω cm, and reversal potential of the passive compartment $E_l$ = 0 mV.

The hidden layer contains a group of human pyramidal neuron models, receiving the somatic voltages of input layer neurons. The morphology was from Eyal et al. [51], and all neurons were modeled with passive cables. The specific membrane capacitance $c_m$ = 1.5 μF cm⁻², membrane resistance $r_m$ = 48,300 Ω cm², axial resistivity $r_a$ = 261.97 Ω cm, and the reversal potential of all passive cables $E_l$ = 0 mV. Input neurons could make multiple connections to randomly selected locations on the dendrites of hidden neurons. The synaptic current activated by the $k$-th synapse of the $i$-th input neuron on neuron $j$’s dendrite is defined as in Eq. (4), which involves the synaptic conductance $g_{ijk}$, the synaptic weight $W_{ijk}$, a ReLU-like somatic activation function, and the somatic voltage of the $i$-th input neuron at time $t$.

Neurons in the output layer were also modeled with a passive single-compartment model, and each hidden neuron only made one synaptic connection to each output neuron. All specific parameters were set the same as those of the input neurons. Synaptic currents activated by hidden neurons are also in the form of Eq. (4).
Image classification with HPC-Net

For each input image stimulus, we first normalized all pixel values to 0.0-1.0. Then we converted the normalized pixels to spike trains and attached them to the input neurons. The predicted probability of each class is given in Eq. (6), where $p_i$ is the probability of the $i$-th class predicted by the HPC-Net, computed from the average somatic voltage of the $i$-th output neuron from 20 ms to 50 ms, and $C$ denotes the number of classes. The class with the maximum predicted probability is the final classification result. In this paper, we built the HPC-Net with 784 input neurons, 64 hidden neurons, and 10 output neurons.

Synaptic plasticity rules for HPC-Net

Inspired by previous work [36], we use a gradient-based learning rule to train our HPC-Net to perform the image classification task. The loss function we use here is cross-entropy, given in Eq. (7), where $p_i$ is the predicted probability for class $i$ and $y_i$ indicates the actual class the stimulus image belongs to: $y_i$ = 1 if the input image belongs to class $i$, and $y_i$ = 0 if not.

When training HPC-Net, we compute the update for weight $W_{ijk}$ (the synaptic weight of the $k$-th synapse connecting neuron $i$ to neuron $j$) at each time step. After the simulation of each image stimulus, $W_{ijk}$ is updated as shown in Eq. (8). The terms in Eq. (8) are the learning rate, the update value at time $t$, the somatic voltages $v_i$ and $v_j$ of neurons $i$ and $j$, the $k$-th synaptic current $I_{ijk}$ activated by neuron $i$ on neuron $j$, its synaptic conductance $g_{ijk}$, and the transfer resistance $r_{ijk}$ between the $k$-th connected compartment of neuron $i$ on neuron $j$’s dendrite and neuron $j$’s soma; $t_s$ = 30 ms and $t_e$ = 50 ms are the start time and end time for learning, respectively. For output neurons, the error term can be computed as shown in Eq. (10). For hidden neurons, the error term is calculated from the error terms in the output layer, given in Eq. (11).
Since all output neurons are single-compartment, the transfer resistance $r_{ijk}$ equals the input resistance of the corresponding compartment. Transfer and input resistances are computed by NEURON.

Mini-batch training is a typical method in deep learning for achieving higher prediction accuracy and accelerating convergence. DeepDendrite also supports mini-batch training. When training HPC-Net with mini-batch size $N_{batch}$, we make $N_{batch}$ copies of HPC-Net. During training, each copy is fed with a different training sample from the batch. DeepDendrite first computes the weight update for each copy separately. After all copies in the current training batch are done, the average weight update is calculated and the weights in all copies are updated by this same amount.

Robustness against adversarial attack with HPC-Net

To demonstrate the robustness of HPC-Net, we tested its prediction accuracy on adversarial samples and compared it with an analogous ANN (one with the same 784-64-10 structure and ReLU activation; for a fair comparison, each input neuron in our HPC-Net only made one synaptic connection to each hidden neuron). We first trained HPC-Net and the ANN with the original training set (original clean images). Then we added adversarial noise to the test set and measured their prediction accuracy on the noisy test set. We used Foolbox [98, 99] to generate adversarial noise with the FGSM method [93]. The ANN was trained with PyTorch [100], and HPC-Net was trained with our DeepDendrite. For fairness, we generated adversarial noise on a significantly different network model, a 20-layer ResNet [101]. The noise level ranged from 0.02 to 0.2. We experimented on two typical datasets, MNIST [95] and Fashion-MNIST [96]. Results show that the prediction accuracy of HPC-Net is 19% and 16.72% higher than that of the analogous ANN, respectively.
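For readers unfamiliar with FGSM, the perturbation step itself is short enough to sketch with NumPy. This is a hedged illustration only: the experiment described above used Foolbox on a 20-layer ResNet, whereas the sketch below substitutes a toy linear softmax classifier as the surrogate model so that the gradient can be written analytically. The weights `W`, bias `b`, the random input, and all function names are assumptions made for this example.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax of a 1-D score vector."""
    e = np.exp(z - z.max())
    return e / e.sum()


def cross_entropy(W, b, x, label):
    """Cross-entropy loss of a linear softmax classifier (toy surrogate)."""
    return -np.log(softmax(W @ x + b)[label])


def input_gradient(W, b, x, label):
    """Gradient of the cross-entropy loss with respect to the input x."""
    p = softmax(W @ x + b)
    p[label] -= 1.0  # p - y, where y is the one-hot label
    return W.T @ p


def fgsm(x, grad, eps):
    """One FGSM step: shift every pixel by eps along the sign of the
    loss gradient, then clip back to the valid pixel range [0, 1]."""
    return np.clip(x + eps * np.sign(grad), 0.0, 1.0)


rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(10, 784))  # hypothetical surrogate weights
b = np.zeros(10)
x = rng.random(784)                         # stand-in for a normalized image
label = 3

for eps in (0.02, 0.1, 0.2):                # noise levels within the range above
    x_adv = fgsm(x, input_gradient(W, b, x, label), eps)
    print(eps, cross_entropy(W, b, x, label), cross_entropy(W, b, x_adv, label))
```

In a transfer attack, the adversarial images produced on the surrogate model are then fed unchanged to the models under test, which never expose their own gradients to the attacker.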
Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

The data that support the findings of this study are available within the paper, Supplementary Information and Source Data files provided with this paper. The source code and data used to reproduce the results in Figs. 3–6 are available at https://github.com/pkuzyc/DeepDendrite. The MNIST dataset is publicly available at http://yann.lecun.com/exdb/mnist. The Fashion-MNIST dataset is publicly available at https://github.com/zalandoresearch/fashion-mnist. Source data are provided with this paper.

Code availability

The source code of DeepDendrite as well as the models and code used to reproduce Figs. 3–6 in this study are available at https://github.com/pkuzyc/DeepDendrite.

References

1. McCulloch, W. S. & Pitts, W. A logical calculus of the ideas immanent in nervous activity. Bull.
2. LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).
3. Poirazi, P., Brannon, T. & Mel, B. W. Arithmetic of subthreshold synaptic summation in a model CA1 pyramidal cell. Neuron 37, 977–987 (2003).
4. London, M. & Häusser, M. Dendritic computation. Annu. Rev. Neurosci. 28, 503–532 (2005).
5. Branco, T. & Häusser, M. The single dendritic branch as a fundamental functional unit in the nervous system. Curr. Opin. Neurobiol. 20, 494–502 (2010).
6. Stuart, G. J. & Spruston, N. Dendritic integration: 60 years of progress. Nat. Neurosci. 18, 1713–1721 (2015).
7. Poirazi, P. & Papoutsi, A. Illuminating dendritic function with computational models. Nat. Rev. Neurosci. 21, 303–321 (2020).
8. Yuste, R. & Denk, W. Dendritic spines as basic functional units of neuronal integration. Nature 375, 682–684 (1995).
9. Engert, F. & Bonhoeffer, T. Dendritic spine changes associated with hippocampal long-term synaptic plasticity. Nature 399, 66–70 (1999).
10. Yuste, R. Dendritic spines and distributed circuits. Neuron 71, 772–781 (2011).
11. Yuste, R. Electrical compartmentalization in dendritic spines. Annu. Rev. Neurosci. 36, 429–449 (2013).
12. Rall, W. Branching dendritic trees and motoneuron membrane resistivity. Exp. Neurol. 1, 491–527 (1959).
13. Segev, I. & Rall, W. Computational study of an excitable dendritic spine. J. Neurophysiol. 60, 499–523 (1988).
14. Silver, D. et al. Mastering the game of go with deep neural networks and tree search. Nature 529, 484–489 (2016).
15. Silver, D. et al. A general reinforcement learning algorithm that masters chess, shogi, and go through self-play. Science 362, 1140–1144 (2018).
16. McCloskey, M. & Cohen, N. J. Catastrophic interference in connectionist networks: the sequential learning problem.
17. French, R. M. Catastrophic forgetting in connectionist networks. Trends Cogn. Sci. 3, 128–135 (1999).
18. Naud, R. & Sprekeler, H. Sparse bursts optimize information transmission in a multiplexed neural code. Proc. Natl Acad. Sci. USA 115, E6329–E6338 (2018).
19. Sacramento, J., Costa, R. P., Bengio, Y. & Senn, W. Dendritic cortical microcircuits approximate the backpropagation algorithm. In Advances in Neural Information Processing Systems 31 (NeurIPS 2018) (NeurIPS, 2018).
20. Payeur, A., Guerguiev, J., Zenke, F., Richards, B. A. & Naud, R. Burst-dependent synaptic plasticity can coordinate learning in hierarchical circuits. Nat. Neurosci. 24, 1010–1019 (2021).
21. Bicknell, B. A. & Häusser, M. A synaptic learning rule for exploiting nonlinear dendritic computation. Neuron 109, 4001–4017 (2021).
22. Moldwin, T., Kalmenson, M. & Segev, I. The gradient clusteron: a model neuron that learns to solve classification tasks via dendritic nonlinearities, structural plasticity, and gradient descent. PLoS Comput. Biol. 17, e1009015 (2021).
23. Hodgkin, A. L. & Huxley, A. F. A quantitative description of membrane current and its application to conduction and excitation in nerve. J. Physiol. 117, 500–544 (1952).
24. Rall, W. Theory of physiological properties of dendrites. Ann. N. Y. Acad. Sci. 96, 1071–1092 (1962).
25. Hines, M. L. & Carnevale, N. T. The NEURON simulation environment. Neural Comput. 9, 1179–1209 (1997).
26. Bower, J. M. & Beeman, D. In The Book of GENESIS: Exploring Realistic Neural Models with the GEneral NEural SImulation System (eds Bower, J. M. & Beeman, D.) 17–27 (Springer New York, 1998).
27. Hines, M. L., Eichner, H. & Schürmann, F. Neuron splitting in compute-bound parallel network simulations enables runtime scaling with twice as many processors. J. Comput. Neurosci. 25, 203–210 (2008).
28. Hines, M. L., Markram, H. & Schürmann, F. Fully implicit parallel simulation of single neurons. J. Comput. Neurosci. 25, 439–448 (2008).
29. Ben-Shalom, R., Liberman, G. & Korngreen, A. Accelerating compartmental modeling on a graphical processing unit. Front. Neuroinform. 7, 4 (2013).
30. Tsuyuki, T., Yamamoto, Y. & Yamazaki, T. Efficient numerical simulation of neuron models with spatial structure on graphics processing units. In Proc. 2016 International Conference on Neural Information Processing (eds Hirose, A. et al.) 279–285 (Springer International Publishing, 2016).
31. Vooturi, D. T., Kothapalli, K. & Bhalla, U. S. Parallelizing Hines matrix solver in neuron simulations on GPU. In Proc. IEEE 24th International Conference on High Performance Computing (HiPC) 388–397 (IEEE, 2017).
32. Huber, F. Efficient tree solver for Hines matrices on the GPU. Preprint at https://arxiv.org/abs/1810.12742 (2018).
33. Korte, B. & Vygen, J. Combinatorial Optimization: Theory and Algorithms 6th edn (Springer, 2018).
34. Gebali, F. Algorithms and Parallel Computing (Wiley, 2011).
35. Kumbhar, P. et al. CoreNEURON: an optimized compute engine for the NEURON simulator.
36. Urbanczik, R. & Senn, W. Learning by the dendritic prediction of somatic spiking. Neuron 81, 521–528 (2014).
37. Ben-Shalom, R., Aviv, A., Razon, B. & Korngreen, A. Optimizing ion channel models using a parallel genetic algorithm on graphical processors. J. Neurosci. Methods 206, 183–194 (2012).
38. Mascagni, M. A parallelizing algorithm for computing solutions to arbitrarily branched cable neuron models. J. Neurosci. Methods 36, 105–114 (1991).
39. McDougal, R. A. et al. Twenty years of ModelDB and beyond: building essential modeling tools for the future of neuroscience. J. Comput. Neurosci. 42, 1–10 (2017).
40. Migliore, M., Messineo, L. & Ferrante, M. Dendritic Ih selectively blocks temporal summation of unsynchronized distal inputs in CA1 pyramidal neurons. J. Comput. Neurosci. 16, 5–13 (2004).
41. Hemond, P. et al. Distinct classes of pyramidal cells exhibit mutually exclusive firing patterns in hippocampal area CA3b. Hippocampus 18, 411–424 (2008).
42. Hay, E., Hill, S., Schürmann, F., Markram, H. & Segev, I. Models of neocortical layer 5b pyramidal cells capturing a wide range of dendritic and perisomatic active properties. PLoS Comput. Biol. 7, e1002107 (2011).
43. Masoli, S., Solinas, S. & D'Angelo, E. Action potential processing in a detailed purkinje cell model reveals a critical role for axonal compartmentalization. Front. Cell. Neurosci. 9, 47 (2015).
44. Lindroos, R. et al. Basal ganglia neuromodulation over multiple temporal and structural scales - simulations of direct pathway MSNs investigate the fast onset of dopaminergic effects and predict the role of Kv4.2. Front. Neural Circuits 12, 3 (2018).
45. Migliore, M. et al. Synaptic clusters function as odor operators in the olfactory bulb. Proc. Natl Acad. Sci. USA 112, 8499–8504 (2015).
46. NVIDIA. CUDA C++ Programming Guide. https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html (2021).
47. NVIDIA. CUDA C++ Best Practices Guide. https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html (2021).
48. Harnett, M. T., Makara, J. K., Spruston, N., Kath, W. L. & Magee, J. C. Synaptic amplification by dendritic spines enhances input cooperativity. Nature 491, 599–602 (2012).
49. Chiu, C. Q. et al. Compartmentalization of GABAergic inhibition by dendritic spines. Science 340, 759–762 (2013).
50. Tønnesen, J., Katona, G., Rózsa, B. & Nägerl, U. V. Spine neck plasticity regulates compartmentalization of synapses. Nat. Neurosci. 17, 678–685 (2014).
51. Eyal, G. et al. Human cortical pyramidal neurons: from spines to spikes via models. Front. Cell. Neurosci. 12, 181 (2018).
52. Koch, C. & Zador, A. The function of dendritic spines: devices subserving biochemical rather than electrical compartmentalization. J. Neurosci. 13, 413–422 (1993).
53. Koch, C. Dendritic spines. In Biophysics of Computation (Oxford University Press, 1999).
54. Rapp, M., Yarom, Y. & Segev, I. The impact of parallel fiber background activity on the cable properties of cerebellar purkinje cells.
55. Hines, M. Efficient computation of branched nerve equations. Int. J. Bio-Med. Comput. 15, 69–76 (1984).
56. Nayebi, A. & Ganguli, S. Biologically inspired protection of deep networks from adversarial attacks. Preprint at https://arxiv.org/abs/1703.09202 (2017).
57. Goddard, N. H. & Hood, G. Large-scale simulation using parallel GENESIS. In The Book of GENESIS: Exploring Realistic Neural Models with the GEneral NEural SImulation System (eds Bower, J. M. & Beeman, D.) 349–379 (Springer New York, 1998).
58. Migliore, M., Cannia, C., Lytton, W. W., Markram, H. & Hines, M. L. Parallel network simulations with NEURON. J. Comput. Neurosci. 21, 119 (2006).
59. Lytton, W. W. et al. Simulation neurotechnologies for advancing brain research: parallelizing large networks in NEURON. Neural Comput. 28, 2063–2090 (2016).
60. Valero-Lara, P. et al. cuHinesBatch: solving multiple Hines systems on GPUs human brain project. In Proc. 2017 International Conference on Computational Science 566–575 (IEEE, 2017).
61. Akar, N. A. et al. Arbor - a morphologically-detailed neural network simulation library for contemporary high-performance computing architectures. In Proc. 27th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP) 274–282 (IEEE, 2019).
62. Ben-Shalom, R. et al. NeuroGPU: accelerating multi-compartment, biophysically detailed neuron simulations on GPUs. J. Neurosci. Methods 366, 109400 (2022).
63. Rempe, M. J. & Chopp, D. L. A predictor-corrector algorithm for reaction-diffusion equations associated with neural activity on branched structures. SIAM J. Sci. Comput. 28, 2139–2161 (2006).
64. Kozloski, J. & Wagner, J. An ultrascalable solution to large-scale neural tissue simulation. Front. Neuroinform. 5, 15 (2011).
65. Jayant, K. et al. Targeted intracellular voltage recordings from dendritic spines using quantum-dot-coated nanopipettes. Nat. Nanotechnol. 12, 335–342 (2017).
66. Palmer, L. M. & Stuart, G. J. Membrane potential changes in dendritic spines during action potentials and synaptic input. J. Neurosci. 29, 6897–6903 (2009).
67. Nishiyama, J. & Yasuda, R. Biochemical computation for spine structural plasticity. Neuron 87, 63–75 (2015).
68. Yuste, R. & Bonhoeffer, T. Morphological changes in dendritic spines associated with long-term synaptic plasticity.
69. Holtmaat, A. & Svoboda, K. Experience-dependent structural synaptic plasticity in the mammalian brain.
70. Caroni, P., Donato, F. & Muller, D. Structural plasticity upon learning: regulation and functions. Nat. Rev. Neurosci. 13, 478–490 (2012).
71. Keck, T. et al. Massive restructuring of neuronal circuits during functional reorganization of adult visual cortex. Nat. Neurosci. 11, 1162 (2008).
72. Hofer, S. B., Mrsic-Flogel, T. D., Bonhoeffer, T. & Hübener, M. Experience leaves a lasting structural trace in cortical circuits. Nature 457, 313–317 (2009).
73. Trachtenberg, J. T. et al. Long-term in vivo imaging of experience-dependent synaptic plasticity in adult cortex. Nature 420, 788–794 (2002).
74. Marik, S. A., Yamahachi, H., McManus, J. N., Szabo, G. & Gilbert, C. D. Axonal dynamics of excitatory and inhibitory neurons in somatosensory cortex. PLoS Biol. 8, e1000395 (2010).
75. Xu, T. et al. Rapid formation and selective stabilization of synapses for enduring motor memories. Nature 462, 915–919 (2009).
76. Albarran, E., Raissi, A., Jáidar, O., Shatz, C. J. & Ding, J. B. Enhancing motor learning by increasing the stability of newly formed dendritic spines in the motor cortex. Neuron 109, 3298–3311 (2021).
77. Branco, T. & Häusser, M. Synaptic integration gradients in single cortical pyramidal cell dendrites. Neuron 69, 885–892 (2011).
78. Major, G., Larkum, M. E. & Schiller, J. Active properties of neocortical pyramidal neuron dendrites. Annu. Rev. Neurosci. 36, 1–24 (2013).
79. Gidon, A. et al. Dendritic action potentials and computation in human layer 2/3 cortical neurons. Science 367, 83–87 (2020).
80. Doron, M., Chindemi, G., Muller, E., Markram, H. & Segev, I. Timed synaptic inhibition shapes NMDA spikes, influencing local dendritic processing and global I/O properties of cortical neurons. Cell Rep. 21, 1550–1561 (2017).
81. Du, K. et al. Cell-type-specific inhibition of the dendritic plateau potential in striatal spiny projection neurons. Proc. Natl Acad. Sci. USA 114, E7612–E7621 (2017).
82. Smith, S. L., Smith, I. T., Branco, T. & Häusser, M. Dendritic spikes enhance stimulus selectivity in cortical neurons in vivo. Nature 503, 115–120 (2013).
83. Xu, N.-l. et al. Nonlinear dendritic integration of sensory and motor input during an active sensing task. Nature 492, 247–251 (2012).
84. Takahashi, N., Oertner, T. G., Hegemann, P. & Larkum, M. E. Active cortical dendrites modulate perception. Science 354, 1587–1590 (2016).
85. Sheffield, M. E. & Dombeck, D. A. Calcium transient prevalence across the dendritic arbour predicts place field properties. Nature 517, 200–204 (2015).
86. Markram, H. et al. Reconstruction and simulation of neocortical microcircuitry. Cell 163, 456–492 (2015).
87. Billeh, Y. N. et al. Systematic integration of structural and functional data into multi-scale models of mouse primary visual cortex. Neuron 106, 388–403 (2020).
88. Hjorth, J. et al. The microcircuits of striatum in silico. Proc. Natl Acad. Sci. USA 117, 202000671 (2020).
89. Guerguiev, J., Lillicrap, T. P. & Richards, B. A. Towards deep learning with segregated dendrites. eLife 6, e22901 (2017).
90. Iyer, A. et al. Avoiding catastrophe: active dendrites enable multi-task learning in dynamic environments. Front. Neurorobot. 16, 846219 (2022).
91. Jones, I. S. & Kording, K. P. Might a single neuron solve interesting machine learning problems through successive computations on its dendritic tree? Neural Comput. 33, 1554–1571 (2021).
92. Bird, A. D., Jedlicka, P. & Cuntz, H. Dendritic normalisation improves learning in sparsely connected artificial neural networks. PLoS Comput. Biol. 17, e1009202 (2021).
93. Goodfellow, I. J., Shlens, J. & Szegedy, C. Explaining and harnessing adversarial examples. In 3rd International Conference on Learning Representations (ICLR) (ICLR, 2015).
94. Papernot, N., McDaniel, P. & Goodfellow, I. Transferability in machine learning: from phenomena to black-box attacks using adversarial samples. Preprint at https://arxiv.org/abs/1605.07277 (2016).
95. Lecun, Y., Bottou, L., Bengio, Y. & Haffner, P. Gradient-based learning applied to document recognition.
96. Xiao, H., Rasul, K. & Vollgraf, R. Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms. Preprint at http://arxiv.org/abs/1708.07747 (2017).
97. Bartunov, S. et al. Assessing the scalability of biologically-motivated deep learning algorithms and architectures. In Advances in Neural Information Processing Systems 31 (NeurIPS 2018) (NeurIPS, 2018).
98. Rauber, J., Brendel, W. & Bethge, M. Foolbox: a Python toolbox to benchmark the robustness of machine learning models. In Reliable Machine Learning in the Wild Workshop, 34th International Conference on Machine Learning (2017).
99. Rauber, J., Zimmermann, R., Bethge, M. & Brendel, W. Foolbox native: fast adversarial attacks to benchmark the robustness of machine learning models in PyTorch, TensorFlow, and JAX. J. Open Source Softw. 5, 2607 (2020).
100. Paszke, A. et al. PyTorch: an imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems 32 (NeurIPS 2019) (NeurIPS, 2019).
101. He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proc. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 770–778 (IEEE, 2016).

Acknowledgements

The authors sincerely thank Dr. Rita Zhang, Daochen Shi and members at NVIDIA for the valuable technical support of GPU computing. This work was supported by the National Key R&D Program of China (No. 2020AAA0130400) to K.D. and T.H., National Natural Science Foundation of China (No. 61088102) to T.H., National Key R&D Program of China (No. 2022ZD01163005) to L.M., Key Area R&D Program of Guangdong Province (No. 2018B030338001) to T.H., National Natural Science Foundation of China (No. 61825101) to Y.T., Swedish Research Council (VR-M-2020-01652), Swedish e-Science Research Centre (SeRC), EU/Horizon 2020 No. 945539 (HBP SGA3), and KTH Digital Futures to J.H.K., J.H., and A.K., Swedish Research Council (VR-M-2021-01995) and EU/Horizon 2020 No. 945539 (HBP SGA3) to S.G. and A.K. Part of the simulations were enabled by resources provided by the Swedish National Infrastructure for Computing (SNIC) at PDC KTH, partially funded by the Swedish Research Council through grant agreement No. 2018-05973.

This paper is available on Nature under the CC BY 4.0 Deed (Attribution 4.0 International) license.