Authors: Lei Ma, Xiaofei Liu, J. J. Johannes Hjorth, Alexander Kozlov, Yutao He, Shenjian Zhang, Jeanette Hellgren Kotaleski, Yonghong Tian, Sten Grillner, Tiejun Huang

Abstract

Biophysically detailed multi-compartment models are powerful tools for exploring the computational principles of the brain and also serve as a theoretical framework for generating algorithms for artificial intelligence (AI) systems. However, their expensive computational cost severely limits applications in both neuroscience and AI. The major bottleneck when simulating detailed compartment models is the simulator's ability to solve large systems of linear equations. Here, we present a Dendritic Hierarchical Scheduling (DHS) method to accelerate this equation-solving step. We theoretically prove that the DHS implementation is computationally optimal and accurate. This GPU-based method runs two to three orders of magnitude faster than the classic serial Hines method on conventional CPU platforms. We build the DeepDendrite framework, which integrates the DHS method with the GPU computing engine of the NEURON simulator, and demonstrate applications of DeepDendrite in neuroscience tasks. In particular, we study how spines affect neuronal excitability in a detailed human pyramidal neuron model with 25,000 spines.

Introduction

Deciphering the coding and computational principles of the nervous system is essential to neuroscience. The brain is composed of thousands of different types of neurons with distinct morphological and biophysical properties. Historically, neurons were regarded as simple summation units (ref. 1), a conceptual simplification that is no longer strictly true but is still widely applied in neural computation, particularly in the analysis of neural networks. In recent years, modern artificial intelligence (AI) has exploited this principle to develop powerful tools such as artificial neural networks (ANNs) (ref. 2). Beyond comprehensive computation at the single-neuron level, however, subcellular compartments such as dendrites can act as independent computational units that perform nonlinear operations (refs. 3-7). Moreover, dendritic spines, the small protrusions that densely cover the dendrites of spiny neurons, can compartmentalize synaptic signals and can be electrically isolated from their parent dendrites both ex vivo and in vivo (refs. 8-11).

Simulations with biologically detailed neuron models provide a theoretical framework that links biological detail to computational principles. The core of this framework is the biophysically detailed multi-compartment model (refs. 12, 13), which can represent neurons with realistic dendritic morphologies, intrinsic ionic conductances, and extrinsic synaptic inputs. The backbone of the detailed multi-compartment model, i.e., the dendrites, is built on classical cable theory (ref. 12), which models the passive biophysical properties of dendritic fibers as passive cables and provides a mathematical description of how electrical signals attenuate and propagate through complex neuronal processes. By combining cable theory with active biophysical mechanisms such as ion channels and excitatory and inhibitory synaptic currents, a detailed multi-compartment model can probe cellular and subcellular neuronal computation beyond experimental limitations (refs. 4, 7).

Beyond their profound impact on neuroscience, biologically detailed neuron models have recently been used to bridge the gap between neuronal structure, biophysical detail, and AI. The prevailing technique in modern AI is the ANN, composed of point neurons loosely analogous to biological neural networks. ANNs trained with the backpropagation-of-error (backprop) algorithm have achieved remarkable performance in specialized applications, even beating the best human players at Go and chess (refs. 14, 15), yet the human brain still surpasses ANNs in domains involving dynamic and noisy environments (refs. 16, 17). Recent theoretical studies suggest that dendritic integration is key to generating efficient learning algorithms that may exceed backprop in parallel information processing (refs. 18-20). Furthermore, a single detailed multi-compartment model can learn network-level nonlinear computations of point-neuron networks simply by adjusting its synaptic strengths (refs. 21, 22). Extending the paradigm of brain-inspired AI from single detailed neuron models to large-scale, biologically detailed networks is therefore a priority for revealing the full potential of detailed models in building more powerful brain-like AI systems.

One long-standing challenge of the detailed simulation approach is its extremely high computational cost, which severely limits its application in neuroscience and AI (refs. 12, 23, 24). To improve efficiency, the classic Hines method reduces the time complexity of solving the equations from O(n^3) to O(n), and it has been widely adopted as the core algorithm in popular simulators such as NEURON (ref. 25) and GENESIS (ref. 26). However, this method processes each compartment serially. When a simulation involves dendritic spines and many biologically detailed dendrites, the matrix of the linear equations (the "Hines matrix") scales up with the growing number of dendrites and spines (Fig. 1e), imposing a very heavy burden on the entire simulation and making the Hines method impractical.

Fig. 1. Reconstructed layer-5 pyramidal neuron model and the mathematical formulation used in detailed neuron models. Workflow of numerically simulating a detailed neuron model; the equation-solving phase is the bottleneck of the simulation. An example of the linear equations in the simulation. Data dependency of the Hines method when solving the linear equations. The size of the Hines matrix scales with the complexity of the model: the number of linear equations grows considerably as the model becomes more detailed.
Computational cost of the serial Hines method (number of equation-solving steps) for different types of neuron models. Illustration of different solving methods: different parts of a neuron are computed in parallel on multiple processing units (middle, right), indicated by different colors. Computational cost of three methods when solving the equations of a pyramidal model with spines; run time denotes the time consumed by a 1 s simulation (solving the equations 40,000 times with a time step of 0.025 ms). p-Hines: the parallel Hines method in CoreNEURON (on GPU); Branch: the branch-based parallel method (on GPU); DHS: the dendritic hierarchical scheduling method (on GPU).

Over the past decades, tremendous progress has been made in accelerating the Hines method with cellular-level parallel methods, which parallelize the computation of different parts of each cell (refs. 27-32). However, current cellular-level parallel methods often lack an efficient parallelization strategy or sacrifice numerical accuracy relative to the original Hines method.

Here we develop a fully automated, numerically accurate, and optimized simulation tool that greatly accelerates the computation and reduces its cost. This tool can also be readily adopted to build and test neural networks with biological detail for machine learning and AI applications. Using the theory of parallel computing (refs. 33, 34), we show that our algorithm provides optimal scheduling without any loss of precision. We further optimize DHS for current state-of-the-art GPU chips by exploiting the GPU memory hierarchy and memory access mechanisms, while maintaining the same accuracy as the classic NEURON simulator (Fig. 1; ref. 25).

To enable detailed dendritic simulations for use in AI, we next build the DeepDendrite framework by embedding DHS into CoreNEURON, an optimized compute engine for NEURON (ref. 35). DeepDendrite consists of a simulation engine and two auxiliary modules, an I/O module and a learning module, which support dendritic learning algorithms during simulation. DeepDendrite runs on GPU hardware platforms and supports both regular simulation tasks in neuroscience and learning tasks in AI.

Finally, we use DeepDendrite to demonstrate applications that target key questions in neuroscience and AI: (1) we show how the spatial pattern of excitatory inputs onto dendritic spines shapes neuronal activity in a model with spines distributed across the entire dendritic tree (the "full-spine" model); DeepDendrite allows us to explore neuronal computation in a simulated human pyramidal neuron model with ~25,000 dendritic spines. (2) In the Discussion, we also consider the potential of DeepDendrite in the context of AI. All source code for DeepDendrite, the full-spine models, and the detailed dendritic network models is publicly available online (see Code availability). Our open-source learning framework can easily be combined with other dendritic learning rules, such as rules for nonlinear (fully active) dendrites (ref. 21), burst-dependent synaptic plasticity (ref. 20), and learning by somatic spike prediction (ref. 36). Overall, our work provides a complete toolset with the potential to change the ecosystem of the computational neuroscience community. By harnessing the power of GPU computing, we anticipate that these tools will facilitate system-level exploration of the computational principles of the brain's microcircuits and promote the interaction between neuroscience and modern AI.

Results

Dendritic hierarchical scheduling (DHS) method

Computing ionic currents and solving the linear equations are the two critical steps when simulating biologically detailed neurons; both are time-consuming and impose a heavy computational burden. Fortunately, computing the ionic currents of each compartment is a fully independent process, so it can be naturally parallelized on devices with massively parallel computing units such as GPUs (ref. 37). Consequently, solving the linear equations becomes the remaining bottleneck for parallelization (Fig. 1a-f).

To tackle this bottleneck, cellular-level parallel methods have been developed that accelerate single-cell computation by "splitting" a cell into several blocks that can be computed in parallel (refs. 27, 28, 38). However, these methods rely heavily on prior knowledge to generate practical strategies for splitting a single neuron into blocks (Fig. 1g and Supplementary Fig. 1). As a result, they become less efficient for neurons with asymmetric morphologies, e.g., pyramidal neurons and Purkinje neurons.

We aim to develop a more efficient and accurate parallel method for simulating biologically detailed neural networks. First, we establish criteria for the accuracy of cellular-level parallel methods (ref. 34): based on the data dependency of the Hines method, we propose three conditions that guarantee solutions identical to those of the serial Hines method (see Methods). To theoretically evaluate the run time, i.e., the efficiency, of serial and parallel methods, we introduce the concept of computational cost, defined as the number of steps a method takes to solve the equations (see Methods).

Building on simulation accuracy and computational cost, we formulate the parallelization problem as a mathematical scheduling problem (see Methods). In simple terms, we view a single neuron as a tree with many nodes (compartments). Given k parallel threads, we can compute at most k nodes at each step, but a node can be computed only after all of its child nodes have been processed; our goal is to find a strategy that completes the whole procedure in the minimum number of steps.

To generate the optimal partition, we propose the Dendritic Hierarchical Scheduling (DHS) method (the theoretical proof is presented in Methods). DHS consists of two steps, analyzing the dendritic topology and finding the optimal partition: (1) given a detailed model, we first obtain the corresponding dependency tree and compute the depth of each node, i.e., the number of its ancestors in the tree (Fig. 2a-c); (2) after the topology analysis, we search the candidate nodes and select at most k of the deepest candidates (a node is a candidate only if all of its child nodes have been processed). This procedure is repeated until all nodes are processed (Fig. 2d).
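For illustration, a minimal Python sketch of this greedy deepest-candidate scheduling is given below. It is our own paraphrase rather than the DeepDendrite source code: the function name dhs_partition and the representation of a neuron as a simple parent array are assumptions made for the example.

# Illustrative sketch of the DHS greedy scheduling (our paraphrase, not the DeepDendrite source).
# A neuron is given as a parent array: node 0 is the root (soma) and parent[0] = -1.

def dhs_partition(parent, k):
    """Greedily split the tree nodes into ordered subsets of size <= k (one subset per step)."""
    n = len(parent)
    children = [[] for _ in range(n)]
    for node, p in enumerate(parent):
        if p >= 0:
            children[p].append(node)

    # depth of a node = number of its ancestors
    depth = [0] * n
    for v in range(n):
        p = parent[v]
        while p >= 0:
            depth[v] += 1
            p = parent[p]

    remaining_children = [len(c) for c in children]
    candidates = {v for v in range(n) if remaining_children[v] == 0}  # leaves first
    partition = []
    while candidates:
        # pick at most k of the deepest candidates for this step
        step = sorted(candidates, key=lambda v: depth[v], reverse=True)[:k]
        partition.append(step)
        for v in step:
            candidates.remove(v)
            p = parent[v]
            if p >= 0:
                remaining_children[p] -= 1
                if remaining_children[p] == 0:   # all children done: parent becomes a candidate
                    candidates.add(p)
    return partition  # len(partition) = number of triangularization steps


# Toy example: a 15-node binary tree scheduled on k = 4 threads.
parent = [-1, 0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6]
for i, subset in enumerate(dhs_partition(parent, k=4), start=1):
    print(f"step {i}: {subset}")

On this toy 15-node tree, the serial Hines method would need 15 steps, whereas the sketch returns a five-step schedule for four threads, mirroring the 14-to-5 reduction illustrated in Fig. 2d, e.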
Fig. 2. a DHS workflow: at each iteration, DHS processes the deepest candidate nodes. b Illustration of computing node depth for a compartment model. The model is first converted into a tree structure and the depth of each node is computed; colors indicate different depth values. c Topology analysis of various neuron models. Six neurons with distinct morphologies are shown; for each model, the soma is chosen as the root of the tree, so node depth increases from the soma (depth 0) to the distal dendrites. d Illustration of performing DHS on a model with four threads. Candidates: nodes that can be processed. Selected candidates: nodes picked by DHS, i.e., the deepest candidates. Processed nodes: nodes processed in previous iterations. e Parallelization strategy obtained by DHS: each node is assigned to one of four parallel threads. By distributing nodes over multiple threads, DHS reduces the number of serial node-processing steps from 14 to 5. f Relative cost, i.e., the ratio of the computational cost of DHS to that of the serial Hines method, when applying DHS with different numbers of threads to different types of models.

For example, whereas the serial Hines method takes 14 steps to process all nodes of the example tree, DHS with four parallel units partitions the nodes into five subsets (Fig. 2d, e): {{9,10,12,14}, {1,7,11,13}, {2,3,4,8}, {6}, {5}}. Because nodes in the same subset can be processed in parallel, it takes only five steps to process all nodes using DHS (Fig. 2e).

Next, we applied the DHS method with different numbers of threads to six representative detailed neuron models selected from ModelDB (ref. 39) (Fig. 2c, f): cortical and hippocampal pyramidal neurons (refs. 40-42), a cerebellar Purkinje neuron (ref. 43), a striatal projection neuron (SPN) (ref. 44), and an olfactory bulb mitral cell (ref. 45), covering principal neurons in sensory, cortical, and subcortical regions. We measured the relative computational cost, defined as the ratio of the computational cost of DHS to that of the serial Hines method. The computational cost, i.e., the number of steps needed to solve the equations, drops dramatically as the number of threads increases; with 16 threads, for example, the computational cost of DHS is 7%-10% of that of the serial Hines method. Interestingly, DHS nearly saturates with 16 or even 8 parallel threads (Fig. 2f), suggesting that adding more threads does not further improve performance because of the dependencies between compartments.

Together, the DHS method enables automatic analysis of dendritic topology and optimal partitioning for parallel computation. DHS finds the optimal partition before the simulation starts, so no extra computation is needed while solving the equations.

Accelerating DHS with GPU memory boosting

DHS computes each neuron with multiple threads, which consumes an enormous number of threads when running network simulations. Graphics processing units (GPUs) provide massive numbers of processing units (streaming processors, SPs; Fig. 3a, b) (ref. 46). In theory, the many SPs of a GPU should support efficient simulation of large neural networks (Fig. 3c). However, we consistently observed that the efficiency of DHS decreased significantly as the network size grew, which might result from scattered data storage or from the extra memory accesses caused by loading and writing intermediate results (Fig. 3d, left).

Fig. 3. a GPU architecture and memory hierarchy. Each GPU contains massive processing units (streaming processors). b Streaming multiprocessor (SM) architecture. Each SM contains multiple streaming processors, registers, and an L1 cache. c Applying DHS to two neurons, each with four threads. During simulation, each thread executes on one streaming processor. d Memory optimization strategy on GPU. Top, thread assignment and data storage of DHS before (left) and after (right) memory boosting. Bottom, a single triangularization step when simulating the two neurons in c. Processors send requests to load the data for each thread from global memory. Without memory boosting (left), loading all requested data takes seven transactions, plus several extra transactions for intermediate results. With memory boosting (right), loading all requested data takes only two transactions. e Run time of DHS (32 threads per cell) with and without memory boosting on multiple layer-5 pyramidal neuron models with spines. f Speedup from memory boosting on multiple layer-5 pyramidal models with spines; memory boosting brings a 1.6-2x speedup.

We therefore exploit the GPU memory hierarchy and its access mechanism to increase memory throughput, a procedure we call GPU memory boosting. Owing to the GPU's memory-loading mechanism, consecutive threads that load aligned, successively stored data achieve much higher memory throughput than threads accessing scattered data (refs. 46, 47).
To achieve high throughput, we first align the computing order of the nodes and rearrange the threads according to the number of nodes assigned to them. We then permute the data storage in global memory so that it is consistent with the computing order, i.e., nodes processed at the same step are stored successively in global memory. Moreover, we use GPU registers to store intermediate results, further strengthening memory throughput. In the example, memory boosting takes only two memory transactions to load the eight requested data (Fig. 3d, right). Furthermore, experiments on multiple numbers of pyramidal neurons with spines and on the typical neuron models (Fig. 3e, f and Supplementary Fig. 2) show that memory boosting achieves a 1.2-3.8x speedup over naive DHS.

To comprehensively test the performance of DHS with GPU memory boosting, we selected six typical neuron models and evaluated the run time of solving the cable equations on massive numbers of each model (Fig. 4). We examined DHS with four threads (DHS-4) and with sixteen threads (DHS-16) per neuron. Compared with the GPU method in CoreNEURON, DHS-4 and DHS-16 speed up the simulation about 5-fold and 15-fold, respectively (Fig. 4a). Moreover, compared with the traditional serial Hines method of NEURON running on a single CPU thread, DHS accelerates the simulation by 2-3 orders of magnitude (Supplementary Fig. 3), while maintaining identical numerical precision in the presence of densely packed spines (Supplementary Figs. 4 and 8), active dendrites (Supplementary Fig. 7), and different splitting strategies (Supplementary Fig. 7).

Fig. 4. a Run time of solving the equations for a 1 s simulation on GPU (dt = 0.025 ms, 40,000 iterations in total). CoreNEURON: the parallel method used in CoreNEURON; DHS-4: DHS with four threads per neuron; DHS-16: DHS with 16 threads per neuron. b, c Visualization of the partitions produced by DHS-4 and DHS-16; each color indicates a single thread.

DHS creates cell-type-specific optimal partitioning

To gain insight into the working mechanism of the DHS method, we visualized the partitioning process by mapping compartments onto threads (each color indicates a single thread in Fig. 4b, c). The visualization shows that a single thread frequently switches between different branches (Fig. 4b, c). Interestingly, the partitions obtained for striatal projection neurons (SPNs) and mitral cells differ from those obtained for pyramidal neurons and Purkinje cells (Fig. 4b, c). This cell-type-specific, fine-grained partitioning enables DHS to fully exploit all available threads.

In conclusion, DHS together with memory boosting yields a theoretically proven optimal solution for solving the linear equations with unprecedented efficiency. Using these principles, we built the open-access DeepDendrite platform, which neuroscientists can use to implement their models without any specific GPU programming knowledge.

DHS enables spine-level modeling

Dendritic spines receive most of the excitatory inputs in cortical and hippocampal pyramidal neurons, striatal projection neurons, and other cell types, and their morphology and plasticity are critical for regulating neuronal excitability (refs. 10, 48-51). However, spines are very small (~1 μm in length), so theoretical work is critical for a complete understanding of spine computation.

A single spine can be modeled with two compartments: the spine head, where the synapse is located, and the spine neck, which connects the spine head to the dendrite (refs. 52, 53). Theory predicts that the very thin spine neck (0.1-0.5 μm in diameter) electrically isolates the spine head from its parent dendrite, thus compartmentalizing the signals generated at the spine head. However, a detailed model with spines fully distributed over the dendrites (a "full-spine model") is computationally very expensive. Consequently, instead of modeling every spine explicitly, most studies use a spine factor F, which approximates the effect of spines on the biophysical properties of the cell membrane (ref. 54).

Inspired by the previous work of Eyal et al. (ref. 51), we investigated how different spatial patterns of excitatory inputs onto dendritic spines shape neuronal activity in a human pyramidal neuron model with explicitly modeled spines (Fig. 5a). Notably, Eyal et al. used the spine factor to incorporate spines into the dendrites, and only a few activated spines were explicitly attached to the dendrites (the "few-spine model" in Fig. 5a).
The value of the spine factor F in their model was computed from the dendritic area and spine area in the reconstructed data. Accordingly, we calculated the spine density from their reconstructed data to make our full-spine model consistent with Eyal's few-spine model. With the spine density set to 1.3 μm^-1, the pyramidal neuron model contained about 25,000 spines without altering the model's original morphological and biophysical properties. We then repeated the previous experimental protocols with both the full-spine and few-spine models, using the same synaptic input as in Eyal's work but attaching extra background noise to each sample. By comparing the somatic voltage traces (Fig. 5b, c) and the spike probability (Fig. 5d) of the full-spine and few-spine models, we found that the full-spine model is much leakier than the few-spine model. In addition, the spike probability triggered by the activation of clustered spines appeared more nonlinear in the full-spine model (solid blue line in Fig. 5d) than in the few-spine model (dashed blue line in Fig. 5d). These results indicate that the conventional F-factor method may underestimate the impact of dense spines on dendritic excitability and nonlinearity.

Fig. 5. a Experimental setup. We examine two major types of models: few-spine models and full-spine models. Few-spine models (two on the left) incorporate the spine area globally into the dendrites and only attach individual spines together with activated synapses. In full-spine models (two on the right), all spines are explicitly attached over the whole dendritic tree. We explore the effects of clustered and of randomly distributed synaptic inputs on the few-spine and full-spine models. b Somatic voltages recorded for the cases in a; colors of the voltage curves correspond to a; scale bar: 20 ms, 20 mV. c Color-coded membrane voltage at specific times during the simulation; colors indicate the magnitude of the voltage. d Somatic spike probability as a function of the number of simultaneously activated synapses (as in Eyal et al.'s work) for the four cases in a, with background noise attached. e Run time of the experiments in d with different simulation methods. NEURON: the conventional NEURON simulator running on a single CPU core. CoreNEURON: the CoreNEURON simulator on a single GPU. DeepDendrite: DeepDendrite on a single GPU.

On the DeepDendrite platform, both the full-spine and few-spine models achieved an 8-fold speedup compared with CoreNEURON on the GPU platform and a 100-fold speedup compared with serial NEURON on the CPU platform (Fig. 5e; Supplementary Table 1), while producing identical simulation results (Supplementary Figs. 4 and 8). The DHS method therefore enables the exploration of dendritic excitability under more realistic anatomical conditions.

Discussion

In this work, we propose the DHS method to parallelize the computation of the Hines method (ref. 55), and we mathematically demonstrate that DHS provides an optimal solution without any loss of precision. We implement DHS on the GPU hardware platform and refine it with GPU memory boosting (Fig. 3). When simulating a large number of neurons with complex morphologies, DHS with memory boosting achieves a 15-fold speedup compared with the GPU method used in CoreNEURON (Supplementary Table 1) and up to a 1,500-fold speedup compared with the serial Hines method on the CPU platform (Fig. 4; Supplementary Fig. 3 and Supplementary Table 1). Furthermore, we develop the GPU-based DeepDendrite framework by integrating DHS into CoreNEURON.
Finally, as a demonstration of the capacity of DeepDendrite, we present a representative application: examining spine computation in a detailed pyramidal neuron model with 25,000 spines. Later in this section, we describe how we have extended the DeepDendrite framework to enable efficient training of biophysically detailed neural networks. To explore the hypothesis that dendrites improve robustness against adversarial attacks (ref. 56), we train our network on typical image classification tasks. We show that DeepDendrite can support both neuroscience simulations and AI-related detailed neural network tasks with unprecedented speed, thereby significantly promoting detailed neuroscience simulations and, potentially, future AI explorations.

Decades of effort have been invested in speeding up the Hines method with parallel methods. Early work mainly focused on network-level parallelization. In network simulations, each cell independently solves its corresponding linear equations with the Hines method. Network-level parallel methods distribute a network over multiple threads and parallelize the computation across cell groups, one group per thread (refs. 57, 58). With network-level methods, detailed networks can be simulated on clusters or supercomputers (ref. 59). In recent years, GPUs have been used for detailed network simulation; because a GPU contains massive computing units, one thread is usually assigned one cell rather than a cell group (refs. 35, 60, 61). With further optimization, GPU-based methods achieve much higher efficiency in network simulations. However, the computation inside each cell is still serial in network-level methods, so they cannot cope with the case in which the Hines matrix of each cell becomes large.

Cellular-level parallel methods further parallelize the computation inside each cell. Their main idea is to split each cell into several sub-blocks and parallelize the computation of those sub-blocks (refs. 27, 28). However, typical cellular-level methods (e.g., the "multi-split" method, ref. 28) pay little attention to the parallelization strategy, and the lack of a fine-grained strategy results in unsatisfactory performance. To achieve higher efficiency, some studies obtain finer-grained parallelization by introducing extra computational operations (refs. 29, 38, 62) or by making approximations at some crucial compartments while solving the linear equations (refs. 63, 64). These fine-grained parallelization strategies achieve higher efficiency, but they lack the full numerical accuracy of the original Hines method.

Unlike previous methods, DHS adopts the finest-grained parallelization strategy, i.e., compartment-level parallelization. By modeling the question of how to parallelize as a combinatorial optimization problem, DHS provides an optimal compartment-level parallelization strategy. Moreover, DHS introduces no extra operations or value approximations, so it achieves the lowest computational cost while retaining the numerical accuracy of the original Hines method.

Dendritic spines are the most abundant microstructures in the brain for projection neurons in the cortex, hippocampus, cerebellum, and basal ganglia. As spines receive most of the excitatory inputs in the central nervous system, electrical signals generated by spines are the main driving force for large-scale neuronal activity in the forebrain and cerebellum (refs. 10, 11).
The structure of the spine, with an enlarged spine head and a very thin spine neck, leads to a surprisingly high input impedance at the spine head, which could reach 500 MΩ according to combined experimental data and detailed compartmental modeling (refs. 48, 65). Owing to this high input impedance, a single synaptic input can evoke a "gigantic" EPSP (~20 mV) at the spine-head level (refs. 48, 66), thereby boosting NMDA currents and ion-channel currents in the spine (ref. 11). However, in classic detailed compartment models, all spines are replaced by the coefficient F that modifies the dendritic cable geometry (ref. 54). This approach may compensate for the leak and capacitive currents of the spines, but it cannot reproduce the high input impedance at the spine head, which may weaken excitatory synaptic inputs, particularly NMDA currents, thereby reducing the nonlinearity of the neuron's input-output curve. Our modeling results are in line with this interpretation.

On the other hand, the spine's electrical compartmentalization is always accompanied by biochemical compartmentalization (refs. 8, 52, 67), resulting in a drastic increase of internal [Ca2+] within the spine and a cascade of molecular processes involving synaptic plasticity of importance for learning and memory. Intriguingly, the biochemical processes triggered by learning in turn remodel the spine's morphology, enlarging (or shrinking) the spine head or elongating (or shortening) the spine neck, which significantly alters the spine's electrical properties (refs. 67-70). Such experience-dependent changes in spine morphology, also referred to as "structural plasticity", have been widely observed in vivo in the visual cortex (refs. 71, 72), somatosensory cortex (refs. 73, 74), motor cortex (ref. 75), hippocampus (ref. 9), and basal ganglia (ref. 76), and they play a critical role in motor and spatial learning as well as memory formation. However, because of the computational cost, nearly all detailed network models use the F-factor approach to replace actual spines and are thus unable to explore spine function at the system level. By taking advantage of our framework and the GPU platform, we can run a few thousand detailed neuron models, each with tens of thousands of spines, on a single GPU while remaining ~100 times faster than the traditional serial method on a single CPU (Fig. 5e). This enables the exploration of structural plasticity in large-scale circuit models across diverse brain regions.

Another critical issue is how to link dendrites to brain function at the systems/network level. It is well established that dendrites can perform comprehensive computations on synaptic inputs thanks to their enriched ion channels and local biophysical membrane properties (refs. 5-7). For example, cortical pyramidal neurons can carry out sublinear synaptic integration at proximal dendrites but progressively shift to supralinear integration at distal dendrites (ref. 77). Moreover, distal dendrites can produce regenerative events such as dendritic sodium spikes, calcium spikes, and NMDA spikes/plateau potentials (refs. 6, 78). Such dendritic events are widely observed in vitro in mouse and even human cortical neurons, and they may implement various logical operations or gating functions (refs. 6, 79-81). Recently, in vivo recordings in awake or behaving mice have provided strong evidence that dendritic spikes/plateau potentials are crucial for orientation selectivity in the visual cortex, sensorimotor integration in the whisker system, and spatial navigation in the hippocampal CA1 region (refs. 82-85).
To establish the causal link between dendrites and animal (including human) behavior, large-scale biophysically detailed neural circuit models are a powerful computational tool. However, running a large-scale detailed circuit model of 10,000-100,000 neurons generally requires the computing power of supercomputers, and it is even more challenging to optimize such models against in vivo data, as this requires iterative simulations of the models. The DeepDendrite framework can directly support many state-of-the-art large-scale circuit models (refs. 86-88), which were initially developed with NEURON. Moreover, using our framework, a single GPU card such as a Tesla A100 could easily support detailed circuit models of up to 10,000 neurons, thereby providing a carbon-efficient and affordable way for ordinary labs to develop and optimize their own large-scale detailed models.

Recent work on unraveling the roles of dendrites in task-specific learning has achieved remarkable results in two directions: solving challenging tasks, such as the ImageNet image classification dataset, with simplified dendritic networks, and exploring the full learning potential of more realistic neuron models (refs. 20-22). However, there is a trade-off between model size and biological detail, as increases in network scale are often paid for with reduced neuron-level complexity (refs. 19, 20, 89). Moreover, more detailed neuron models are less mathematically tractable and computationally more expensive (ref. 21).

There has also been progress on the role of active dendrites in ANNs for computer vision tasks. One study proposed a novel ANN architecture with active dendrites, demonstrating competitive results in multi-task and continual learning (ref. 90). Jones and Kording used a binary tree to approximate dendritic branching and provided valuable insights into the influence of tree structure on a single neuron's computational capacity (ref. 91). Bird et al. proposed a dendritic normalization rule based on biophysical behavior, offering an interesting perspective on the contribution of dendritic arbor structure to computation (ref. 92). While these studies offer valuable insights, they primarily rely on abstractions of spatially extended neurons and do not fully exploit the detailed biological properties and spatial information of dendrites. Further investigation is needed to unveil the potential of leveraging more realistic neuron models for understanding the shared mechanisms underlying brain computation and deep learning.

In response to these challenges, we developed DeepDendrite, a tool that uses the Dendritic Hierarchical Scheduling (DHS) method to significantly reduce computational costs and that incorporates an I/O module and a learning module to handle large datasets. With DeepDendrite, we implemented a three-layer hybrid neural network, the Human Pyramidal Cell Network (HPC-Net) (Fig. 6a, b). This network can be trained efficiently on image classification tasks, achieving approximately a 25-fold speedup compared with training on a traditional CPU-based platform (Fig. 6f; Supplementary Table 1).

Fig. 6. a Illustration of the Human Pyramidal Cell Network (HPC-Net) for image classification. Images are transformed into spike trains and fed into the network model. Learning is triggered by error signals propagated from the soma to the dendrites. b Training with mini-batches. Multiple networks are simulated simultaneously with different images as inputs; the total weight update ΔW is computed as the average of the ΔWi from each network.
c Comparison of the HPC-Net before and after training. Left, visualization of hidden-neuron responses to a specific input before (top) and after (bottom) training. Right, distribution of hidden-layer weights (from the input to the hidden layer) before (top) and after (bottom) training. d Workflow of the transfer adversarial attack experiment. We first generate adversarial samples of the test set on a 20-layer ResNet and then use these adversarial samples (noisy images) to test the classification accuracy of models trained on clean images. e Prediction accuracy of each model on adversarial samples after training for 30 epochs on the MNIST (left) and Fashion-MNIST (right) datasets. f Run time of training and testing the HPC-Net with a batch size of 16. Left, run time of training one epoch; right, run time of testing. Parallel NEURON + Python: training and testing on a single CPU with multiple cores, using 40-process-parallel NEURON to simulate the HPC-Net and extra Python code to support mini-batch training. DeepDendrite: training and testing the HPC-Net on a single GPU with DeepDendrite.

Additionally, it is widely recognized that the performance of artificial neural networks (ANNs) can be undermined by adversarial attacks (ref. 93), intentionally engineered perturbations devised to mislead ANNs. Intriguingly, an existing hypothesis suggests that dendrites and synapses may innately defend against such attacks (ref. 56). Our experimental results with the HPC-Net support this hypothesis: networks endowed with detailed dendritic structures showed increased resilience to transfer adversarial attacks (ref. 94) compared with standard ANNs, as evident on the MNIST (ref. 95) and Fashion-MNIST (ref. 96) datasets (Fig. 6d, e). This evidence suggests that the intrinsic biophysical properties of dendrites may be important for increasing the robustness of ANNs against adversarial interference. However, it is essential to validate these findings on more challenging datasets such as ImageNet (ref. 97).

In conclusion, DeepDendrite has shown remarkable potential in image classification tasks, opening up exciting future directions. To further advance DeepDendrite and the application of biologically detailed dendritic models to AI tasks, we may focus on developing multi-GPU systems and exploring applications in other domains, such as natural language processing (NLP), where dendritic filtering properties align well with the inherently noisy and ambiguous nature of human language. Challenges include testing scalability on larger problems, understanding performance across various tasks and domains, and addressing the computational complexity introduced by additional biological principles such as active dendrites. By overcoming these limitations, we can further advance the understanding and capabilities of biophysically detailed dendritic neural networks, potentially uncovering new advantages, enhancing their robustness against adversarial attacks and noisy inputs, and ultimately bridging the gap between neuroscience and modern AI.

Methods

Simulation with DHS

CoreNEURON (ref. 35) (https://github.com/BlueBrain/CoreNeuron) uses the NEURON (ref. 25) architecture and is optimized for both memory usage and computational speed. We implemented our Dendritic Hierarchical Scheduling (DHS) method in the CoreNEURON environment by modifying its source code. All models that can be simulated on GPU with CoreNEURON can also be simulated with DHS by executing the following command:

coreneuron_exec -d /path/to/models -e time --cell-permute 3 --cell-nthread 16 --gpu

The options used are listed in Table 1.
Accuracy of the simulation using cellular-level parallel computation

To ensure the accuracy of the simulation, we first need to define the correctness of a cellular-level parallel algorithm, i.e., to judge whether it generates solutions identical to those of proven serial methods such as the Hines method used in the NEURON simulation platform. Based on the theory of parallel computing (ref. 34), a parallel algorithm yields a result identical to that of its corresponding serial algorithm if and only if the data-processing order of the parallel algorithm is consistent with the data dependency of the serial method. The Hines method has two symmetrical phases: triangularization and back-substitution. By analyzing the serial Hines method (ref. 55), we find that its data dependency can be formulated as a tree structure in which the nodes of the tree represent the compartments of the detailed neuron model. In the triangularization phase, the value of each node depends on its child nodes; conversely, in the back-substitution phase, the value of each node depends on its parent node (Fig. 1d). Nodes on different branches can therefore be computed in parallel, as their values do not depend on each other.

Based on the data dependency of the serial Hines method, we propose three conditions that ensure a parallel method yields solutions identical to those of the serial Hines method: (1) the tree morphology and the initial values of all nodes are identical to those in the serial Hines method; (2) in the triangularization phase, a node can be processed if and only if all of its child nodes have already been processed; (3) in the back-substitution phase, a node can be processed only if its parent node has already been processed. Once a parallel computing method satisfies these three conditions, it produces solutions identical to those of the serial method.

Computational cost of cellular-level parallel computing methods

To theoretically evaluate the run time, i.e., the efficiency, of serial and parallel computing methods, we introduce and formulate the concept of computational cost as follows. Given a tree T and k threads (basic computational units) to perform triangularization, parallel triangularization amounts to dividing the node set V of T into n subsets, i.e., V = {V_1, V_2, ..., V_n}, where the size of each subset |V_i| ≤ k, since at most k nodes can be processed at each step with only k threads. The triangularization phase follows the order V_1 → V_2 → ... → V_n, and nodes in the same subset can be processed in parallel. We therefore define |V| (the number of subsets, i.e., n here) as the computational cost of the parallel computing method. In short, the computational cost of a parallel method is the number of steps it takes in the triangularization phase. Because back-substitution is symmetrical with triangularization, the total cost of the entire equation-solving phase is twice that of the triangularization phase.

Mathematical scheduling problem

Based on the simulation-accuracy conditions and the computational cost, we formulate the parallelization problem as a mathematical scheduling problem: given a tree T = {V, E} and a positive integer k, where V is the node set and E is the edge set, define a partition P(V) = {V_1, V_2, ..., V_n} with |V_i| ≤ k for 1 ≤ i ≤ n, where |V_i| denotes the cardinality of subset V_i, i.e., the number of nodes in V_i, and where for each node v ∈ V_i all its child nodes {c | c ∈ children(v)} must lie in a previous subset V_j with 1 ≤ j < i. Our goal is to find an optimal partition P*(V) whose computational cost |P*(V)| is minimal.
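As an illustration only (our own sketch, not part of the paper's code), the helper below checks that a proposed partition respects the |V_i| ≤ k restriction and the child-before-parent order of condition (2) and returns its computational cost |P(V)|; condition (3) needs no separate check because back-substitution visits the same subsets in reverse order.

# Sketch (ours) of validating a partition V1 -> V2 -> ... -> Vn against the constraints above
# and returning its computational cost (number of triangularization steps).
# e.g. partition_cost(parent, dhs_partition(parent, k=4), k=4) with the earlier toy sketch.

def partition_cost(parent, partition, k):
    n_nodes = len(parent)
    processed = set()
    for step, subset in enumerate(partition, start=1):
        if len(subset) > k:
            raise ValueError(f"step {step}: more than k = {k} nodes scheduled")
        for v in subset:
            children = [c for c, p in enumerate(parent) if p == v]
            # condition 2: every child of v must already be processed
            if any(c not in processed for c in children):
                raise ValueError(f"step {step}: node {v} scheduled before its children")
        processed.update(subset)
    if len(processed) != n_nodes:
        raise ValueError("some nodes were never scheduled")
    return len(partition)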
In this formulation, subset V_i consists of all the nodes that are computed at the i-th step (Fig. 2e). The constraint |V_i| ≤ k states that at most k nodes can be computed at each step, because only k threads are available. The restriction that for each node v ∈ V_i all its child nodes {c | c ∈ children(v)} must lie in a previous subset V_j with 1 ≤ j < i states that a node v can be processed only after all its child nodes have been processed.

DHS implementation

We aim to find the optimal schedule for the computation of solving the linear equations of each neuron model by solving the mathematical scheduling problem above. We first compute the depth d(v) for all nodes v ∈ V. Then the following two steps are executed iteratively until every node v ∈ V is assigned to a subset: (1) find all candidate nodes and put them into the candidate set Q; a node is a candidate only if all of its child nodes have been processed or if it has no child nodes; (2) if |Q| ≤ k, i.e., the number of candidate nodes is smaller than or equal to the number of available threads, remove all nodes from Q and put them into V_i; otherwise, remove the k deepest nodes from Q and add them to subset V_i. Label these nodes as processed (Fig. 2d). After filling subset V_i, return to step (1) to fill the next subset V_{i+1}.

Correctness proof for DHS

After applying DHS to a neural tree T = {V, E}, we obtain a partition P(V) = {V_1, V_2, ..., V_n}, with |V_i| ≤ k for 1 ≤ i ≤ n. Nodes in the same subset V_i are computed simultaneously, so it takes n steps to perform triangularization and n steps to perform back-substitution. We now show that the reordering of the computation in DHS yields a result identical to that of the serial Hines method.

The partition P(V) obtained from DHS determines the computation order of all nodes in the neural tree. Below we demonstrate that the computation order determined by P(V) satisfies the correctness conditions. P(V) is obtained from the given neural tree T, and the operations in DHS modify neither the tree topology nor the values of the tree nodes (the corresponding values in the linear equations), so the tree morphology and the initial values of all nodes are unchanged, satisfying condition 1: the tree morphology and initial values of all nodes are identical to those in the serial Hines method. In triangularization, nodes are processed from subset V_1 to V_n. As shown in the DHS implementation, all nodes in subset V_i are selected from the candidate set Q, and a node can be put into Q only after all its child nodes have been processed. The child nodes of all nodes in V_i therefore lie in {V_1, V_2, ..., V_{i-1}}, meaning that a node is computed only after all its children have been processed, which satisfies condition 2: in triangularization, a node can be processed if and only if all its child nodes have already been processed. In back-substitution, the computation order is the reverse of that in triangularization, i.e., from V_n to V_1. As shown above, the child nodes of all nodes in V_i lie in {V_1, V_2, ..., V_{i-1}}, so the parent nodes of the nodes in V_i lie in {V_{i+1}, V_{i+2}, ..., V_n}, which satisfies condition 3: in back-substitution, a node can be processed only if its parent node has already been processed.

Optimality proof for DHS

The idea of the proof is that, if another optimal solution exists, it can be transformed into the DHS solution without increasing the number of steps the algorithm requires. For each subset V_i in P(V), DHS moves the k (thread number) deepest nodes of the corresponding candidate set Q_i into V_i; if the number of nodes in Q_i is smaller than k, all nodes of Q_i are moved into V_i. To simplify the argument, we introduce D_i, the sum of the depths of the k deepest nodes in Q_i. All subsets in P(V) satisfy the max-depth criterion (Supplementary Fig. 6a).
We then prove that selecting the deepest nodes at each iteration yields an optimal partition. Suppose there exists an optimal partition P*(V) = {V*_1, V*_2, ..., V*_s} containing subsets that do not satisfy the max-depth criterion. We show that the subsets of P*(V) can be modified so that every subset consists of the deepest nodes of its candidate set Q, while the number of subsets |P*(V)| remains the same after the modification.

Without loss of generality, we start from the first subset V*_i that does not satisfy the criterion. There are two possible cases in which V*_i fails the max-depth criterion: (1) |V*_i| < k and some valid candidate nodes in Q_i are not put into V*_i; (2) |V*_i| = k but the nodes in V*_i are not the deepest nodes in Q_i.

For case (1), because some candidate nodes are not put into V*_i, these nodes must appear in subsequent subsets. Since |V*_i| < k, we can move the corresponding nodes from the subsequent subsets into V*_i; this does not increase the number of subsets and makes V*_i satisfy the criterion (Supplementary Fig. 6b, top). For case (2), because |V*_i| = k, the deeper nodes that were not moved from the candidate set into V*_i must be added to subsequent subsets (Supplementary Fig. 6b, bottom). These deeper nodes can be moved into V*_i by the following procedure. Assume that after filling V*_i, node v was picked while one of the k deepest nodes, v', is still in Q_i, so v' will be put into a subsequent subset V*_j (j > i). We first move v from V*_i to V*_{i+1} and then modify subset V*_{i+1} as follows: if |V*_{i+1}| ≤ k and no node in V*_{i+1} is the parent of node v, stop modifying the later subsets; otherwise, modify V*_{i+1} as follows (Supplementary Fig. 6c): if the parent node of v is in V*_{i+1}, move this parent node to V*_{i+2}; otherwise move the node with minimum depth from V*_{i+1} to V*_{i+2}. After adjusting V*_{i+1}, modify the subsequent subsets V*_{i+2}, ..., V*_{j-1} with the same strategy. Finally, move v' from V*_j to V*_i.

With the modification strategy described above, we can replace all shallower nodes in V*_i with the k deepest nodes of Q_i while keeping the number of subsets |P*(V)| unchanged. The same strategy can be applied to all subsets of P*(V) that do not contain the deepest nodes. Finally, all subsets V*_i ∈ P*(V) satisfy the max-depth criterion, and |P*(V)| is unchanged after the modification.

In conclusion, DHS generates a partition P(V) in which all subsets V_i ∈ P(V) satisfy the max-depth criterion. Any other optimal partition P*(V) can be modified so that its structure matches that of P(V), i.e., each subset consists of the deepest nodes of its candidate set, while |P*(V)| stays the same. The partition P(V) obtained by DHS is therefore one of the optimal partitions.

GPU implementation and memory boosting

To achieve high memory throughput, GPUs use a memory hierarchy comprising (1) global memory, (2) cache, and (3) registers. Global memory has a large capacity but low throughput, whereas registers have a small capacity but the highest throughput. GPUs adopt the single-instruction, multiple-thread (SIMT) architecture: the warp, a group of 32 parallel threads, is the basic scheduling unit on the GPU, and a warp executes the same instruction on different data for its threads. Preserving the correct node-processing order within this architecture is necessary for DHS to obtain results identical to those of the serial Hines method. When implementing DHS on the GPU, we first group all cells into multiple warps, with cells of similar morphology grouped into the same warp. We then apply DHS to every neuron and assign the compartments of each neuron to multiple threads. Because neurons are grouped by warp, the threads of one neuron reside in the same warp, so the intrinsic synchronization of the warp maintains a computation order consistent with the data dependency of the serial Hines method.
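The following sketch (ours; the actual implementation operates on CoreNEURON's internal data structures) illustrates the storage permutation used in memory boosting: the data of nodes that a warp's threads process at the same step are laid out contiguously, so one coalesced transaction can serve the whole warp. The 4-thread schedule below is a hypothetical assignment consistent with the example partition from the Results section.

# Sketch (ours) of the storage-permutation idea behind memory boosting.
import numpy as np

def coalesced_order(thread_schedules):
    """thread_schedules[t][s] = node processed by thread t at step s (None if the thread idles)."""
    n_steps = max(len(s) for s in thread_schedules)
    order = []
    for step in range(n_steps):
        for sched in thread_schedules:              # consecutive threads of the warp
            if step < len(sched) and sched[step] is not None:
                order.append(sched[step])           # stored back-to-back for this step
    return np.array(order)

# Hypothetical 4-thread schedule consistent with the example partition
# {{9,10,12,14}, {1,7,11,13}, {2,3,4,8}, {6}, {5}} shown in the Results section.
schedules = [
    [9, 1, 2, 6, 5],
    [10, 7, 3, None, None],
    [12, 11, 4, None, None],
    [14, 13, 8, None, None],
]
print(coalesced_order(schedules))   # node data would be copied into this order in global memory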
Finally, when a warp loads aligned, consecutively stored data from global memory, it makes full use of the cache, which results in high memory throughput, whereas accessing scattered data reduces throughput (ref. 46). After warp assignment and thread rearrangement, we therefore permute the data in global memory so that the storage order matches the computing order, allowing warps to load consecutively stored data while executing the program. In addition, we place the required temporary variables in registers instead of global memory; because registers have the highest memory throughput, using them accelerates DHS further.

Full-spine and few-spine biophysical models

We used the human pyramidal neuron model of ref. 51. The membrane capacitance c_m = 0.44 μF cm^-2, membrane resistance r_m = 48,300 Ω cm^2, and axial resistivity r_a = 261.97 Ω cm. In this model, all dendrites were modeled as passive cables, whereas the somas were active. The leak reversal potential E_l = -83.1 mV. Ion channels such as Na+ and K+ channels were inserted in the soma and initial axon, with reversal potentials E_Na = 67.6 mV and E_K = -102 mV, respectively. All these parameters were set as in the model of Eyal et al. (ref. 51); for more details, please refer to the published model (ModelDB, accession no. 238347).

In the few-spine model, the membrane capacitance and maximum leak conductance of the dendritic cables more than 60 μm from the soma were multiplied by a spine factor F to approximate the dendritic spines; here F was set to 1.9. Only the spines that receive synaptic inputs were explicitly attached to the dendrites.

In the full-spine model, all spines were explicitly attached to the dendrites. We calculated the spine density from the reconstructed neuron of Eyal et al. (ref. 51). The spine density was set to 1.3 μm^-1, and each cell contained 24,994 spines on the dendrites more than 60 μm from the soma.

The morphology and biophysical mechanisms of the spines were the same in the few-spine and full-spine models. The spine neck had length L_neck = 1.35 μm and diameter D_neck = 0.25 μm, whereas the length and diameter of the spine head were both 0.944 μm, i.e., the spine-head area was 2.8 μm^2. Both the spine neck and the spine head were modeled as passive cables with reversal potential E_l = -86 mV. The specific membrane capacitance, membrane resistance, and axial resistivity were the same as those of the dendrites.

Synaptic inputs

We investigated neuronal excitability for both distributed and clustered synaptic inputs. All activated synapses were attached to the tip of the spine head. For distributed inputs, the activated synapses were randomly distributed over all dendrites. For clustered inputs, each cluster consisted of 20 activated synapses uniformly distributed on a single, randomly selected compartment. All synapses were activated simultaneously during the simulation. AMPA- and NMDA-based synaptic currents were simulated as in Eyal et al.'s work: the AMPA conductance was modeled as a double-exponential function and the NMDA conductance as a voltage-dependent double-exponential function. For the AMPA model, the rise and decay time constants τ_rise and τ_decay were set to 0.3 and 1.8 ms; for the NMDA model, τ_rise and τ_decay were set to 8.019 and 34.9884 ms, respectively. The maximum conductances of AMPA and NMDA were 0.73 nS and 1.31 nS.

Background noise

We attached background noise to each cell to simulate a more realistic environment. Noise patterns were implemented as Poisson spike trains with a constant rate of 1.0 Hz. Each pattern started at t_start = 10 ms and lasted until the end of the simulation. We generated 400 noise spike trains for each cell and attached them to randomly selected synapses. The model and parameters of the synaptic currents were the same as described above, except that the maximum NMDA conductance was uniformly distributed from 1.57 to 3.275, resulting in a higher AMPA-to-NMDA ratio.
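For illustration, a minimal NumPy sketch of this background-noise protocol is given below (ours, not the simulation code); the simulation length t_stop and the attachment of each train to a randomly chosen synapse are assumptions, and the coupling to NEURON synapse objects is omitted.

# Sketch (ours) of the background-noise generation described above: homogeneous 1 Hz
# Poisson spike trains starting at t_start = 10 ms. t_stop is an assumed simulation length.
import numpy as np

def poisson_trains(n_trains=400, rate_hz=1.0, t_start=10.0, t_stop=200.0, seed=0):
    """Return n_trains homogeneous Poisson spike-time arrays (times in ms)."""
    rng = np.random.default_rng(seed)
    mean_isi_ms = 1000.0 / rate_hz
    trains = []
    for _ in range(n_trains):
        t, spikes = t_start, []
        while True:
            t += rng.exponential(mean_isi_ms)
            if t >= t_stop:
                break
            spikes.append(t)
        trains.append(np.array(spikes))
    return trains

noise = poisson_trains()
# each train would then be attached to one randomly chosen synapse; 25_000 is roughly the
# spine count of the full-spine model and is used here only as an illustrative pool size
target_synapses = np.random.default_rng(1).choice(25_000, size=len(noise), replace=False)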
Exploring neuronal excitability

We investigated the spike probability when multiple synapses were activated simultaneously. For distributed inputs, we tested 14 cases, ranging from 0 to 240 activated synapses. For clustered inputs, we tested 9 cases in total, activating from 0 to 12 clusters, each cluster consisting of 20 synapses. For every case of both distributed and clustered inputs, we calculated the spike probability over 50 random samples, with the spike probability defined as the fraction of samples in which the neuron fired. All 1,150 samples were simulated simultaneously on our DeepDendrite platform, reducing the simulation time from days to minutes.

Performing AI tasks with the DeepDendrite platform

Conventional detailed-neuron simulators lack two functionalities that are important for modern AI tasks: (1) alternately performing simulations and weight updates without heavy reinitialization, and (2) simultaneously processing multiple stimulus samples in a batch-like manner. Here we present the DeepDendrite platform, which supports both biophysical simulation and deep learning tasks with detailed dendritic models. DeepDendrite consists of three modules (Supplementary Fig. 5): (1) an I/O module; (2) a DHS-based simulation module; and (3) a learning module. When training a biophysically detailed model on a learning task, users first define the learning rule and then feed all training samples to the detailed model. At each step during training, the I/O module picks a specific stimulus and its corresponding teacher signal (if necessary) from the training samples and attaches the stimulus to the network model. The DHS-based simulation module then initializes the model and runs the simulation. After the simulation, the learning module updates all synaptic weights according to the difference between the model responses and the teacher signals. After training, the learned model can achieve performance comparable to an ANN. The testing phase is similar to training, except that all synaptic weights are fixed.

HPC-Net model

Image classification is a typical task in the field of AI: a model should learn to recognize the content of a given image and output the corresponding label. Here we present the HPC-Net, a network consisting of detailed human pyramidal neuron models that can learn to perform image classification tasks using the DeepDendrite platform. The HPC-Net has three layers: an input layer, a hidden layer, and an output layer. Neurons in the input layer receive spike trains converted from images as their input. Hidden-layer neurons receive the output of the input-layer neurons and deliver their responses to the neurons in the output layer, whose responses are taken as the final output of the HPC-Net. Neurons in adjacent layers are fully connected.

For each image stimulus, we first convert each normalized pixel to a homogeneous spike train. For the pixel with coordinates (x, y) in the image, the corresponding spike train has a constant interspike interval ISI(x, y) (in ms), which is determined by the pixel value p(x, y) as shown in Eq. (1). In our experiments, the simulation of each stimulus lasted 50 ms. All spike trains started at 9 + ISI ms and lasted until the end of the simulation. We then attached the spike trains to the input-layer neurons in a one-to-one manner.
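As a rough illustration of this stimulus conversion (our sketch, not the DeepDendrite I/O module), the code below builds one constant-ISI spike train per pixel. The actual mapping of Eq. (1) is not reproduced here, so the ISI function is passed in as a placeholder; the inverse mapping used in the example is only an assumption.

# Sketch (ours) of converting a normalized image into constant-ISI spike trains.
import numpy as np

def image_to_spike_trains(image, isi_fn, t_stop=50.0, t_offset=9.0):
    """image: 2-D array of pixels normalized to [0, 1]; returns one spike-time array per pixel."""
    trains = []
    for p in image.ravel():
        if p <= 0.0:
            trains.append(np.array([]))      # zero pixels give a silent input neuron
            continue
        isi = isi_fn(p)
        first = t_offset + isi               # spike trains start at 9 + ISI ms
        trains.append(np.arange(first, t_stop, isi))
    return trains

img = np.random.default_rng(0).random((28, 28))          # stand-in for a normalized MNIST image
trains = image_to_spike_trains(img, isi_fn=lambda p: 1.0 / p)   # placeholder mapping, not Eq. (1)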
The synaptic current triggered by a spike arriving at time t0 is given by Eq. (2), where v is the post-synaptic voltage, the reversal potential E_syn = 1 mV, the maximum synaptic conductance g_max = 0.05 μS, and the time constant τ = 0.5 ms.

Neurons in the input layer were modeled as passive single-compartment models with the following parameters: membrane capacitance c_m = 1.0 μF cm^-2, membrane resistance r_m = 10^4 Ω cm^2, axial resistivity r_a = 100 Ω cm, and reversal potential of the passive compartment E_l = 0 mV.

The hidden layer contains a group of human pyramidal neuron models that receive the somatic voltages of the input-layer neurons. The morphology was taken from Eyal et al. (ref. 51), and all neurons were modeled with passive cables: specific membrane capacitance c_m = 1.5 μF cm^-2, membrane resistance r_m = 48,300 Ω cm^2, axial resistivity r_a = 261.97 Ω cm, and reversal potential of all passive cables E_l = 0 mV. Input neurons can make multiple connections onto randomly selected locations on the dendrites of the hidden neurons. The conductance of the k-th synapse made by input neuron i onto the dendrite of neuron j is defined as in Eq. (4), where g_ijk is the synaptic conductance, W_ijk is the synaptic weight, f is a ReLU-like somatic activation function, and v_i(t) is the somatic voltage of input neuron i at time t.

Neurons in the output layer were also modeled as passive single-compartment models, and each hidden neuron made only one synaptic connection onto each output neuron. All specific parameters were set to the same values as for the input neurons, and the synaptic conductance followed Eq. (4).

Image classification with the HPC-Net

For each input image, we first normalized all pixel values to the range 0.0-1.0, converted the normalized pixels to spike trains, and attached them to the input neurons. The somatic voltages of the output neurons are used to compute the predicted probability of each class, as shown in Eq. (6), where p_i is the probability of the i-th class predicted by the HPC-Net, v_i is the somatic voltage of the i-th output neuron averaged from 20 ms to 50 ms, and C is the number of classes, which equals the number of output neurons. The class with the maximum predicted probability is the final classification result. In this work, we built the HPC-Net with 784 input neurons, 64 hidden neurons, and 10 output neurons.

Synaptic plasticity rule for the HPC-Net

Inspired by previous work (ref. 36), we use a gradient-based learning rule to train the HPC-Net on the image classification task. The loss function is the cross-entropy given in Eq. (7), where p_i is the predicted probability for class i and y_i indicates the actual class of the stimulus image: y_i = 1 if the input image belongs to class i and y_i = 0 otherwise.

When training the HPC-Net, we compute the update for the weight W_ijk (the synaptic weight of the k-th synapse connecting neuron i to neuron j) at each time step. After the simulation of each image stimulus, W_ijk is updated as shown in Eq. (8). Here, η is the learning rate, ΔW_ijk(t) is the update value at time t, v_j and v_i are the somatic voltages of neurons j and i, respectively, I_ijk is the k-th synaptic current activated by neuron i on neuron j and g_ijk its synaptic conductance, r_ijk is the transfer resistance from the k-th connected compartment of neuron i on neuron j's dendrite to neuron j's soma, and t_s = 30 ms and t_e = 50 ms are the start and end times of learning, respectively. For output neurons, the error term is computed as shown in Eq. (10); for hidden neurons, the error term is calculated from the error terms of the output layer, as given in Eq. (11).
Since all output neurons are single-compartment models, r_ijk equals the input resistance of the corresponding compartment. Transfer and input resistances are computed by NEURON.

Mini-batch training is a typical method in deep learning for achieving higher prediction accuracy and accelerating convergence, and DeepDendrite also supports it. When training the HPC-Net with mini-batch size N_batch, we make N_batch copies of the HPC-Net. During training, each copy is fed a different training sample from the batch. DeepDendrite first computes the weight update for each copy separately; after all copies in the current batch are done, the average weight update is calculated and the weights in all copies are updated by this same amount.

Robustness against adversarial attacks with the HPC-Net

To demonstrate the robustness of the HPC-Net, we tested its prediction accuracy on adversarial samples and compared it with an analogous ANN (one with the same 784-64-10 structure and ReLU activation; for a fair comparison, in our HPC-Net each input neuron made only one synaptic connection to each hidden neuron). We first trained the HPC-Net and the ANN on the original training set (clean images). We then added adversarial noise to the test set and measured their prediction accuracy on the noisy test set. We used Foolbox (refs. 98, 99) to generate adversarial noise with the FGSM method (ref. 93). The ANN was trained with PyTorch (ref. 100) and the HPC-Net with our DeepDendrite. For fairness, the adversarial noise was generated on a substantially different network model, a 20-layer ResNet (ref. 101). The noise level ranged from 0.02 to 0.2. We experimented on two typical datasets, MNIST (ref. 95) and Fashion-MNIST (ref. 96). The results show that the prediction accuracy of the HPC-Net is 19% and 16.72% higher than that of the analogous ANN on the two datasets, respectively.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

The data supporting the findings of this study are available within the paper, its Supplementary Information, and the Source Data files provided with this paper. The source code and data used to reproduce the results in Figs. 3-6 are available at https://github.com/pkuzyc/DeepDendrite. The MNIST dataset is publicly available at http://yann.lecun.com/exdb/mnist. The Fashion-MNIST dataset is publicly available at https://github.com/zalandoresearch/fashion-mnist. Source data are provided with this paper.

Code availability

The source code of DeepDendrite as well as the models and code used to reproduce Figs. 3-6 in this study are available at https://github.com/pkuzyc/DeepDendrite.

References

1. McCulloch, W. S. & Pitts, W. A logical calculus of the ideas immanent in nervous activity. Bull. Math. Biophys. 5, 115–133 (1943).
2. LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).
3. Poirazi, P., Brannon, T. & Mel, B. W. Arithmetic of subthreshold synaptic summation in a model CA1 pyramidal cell. Neuron 37, 977–987 (2003).
4. London, M. & Häusser, M. Dendritic computation. Annu. Rev. Neurosci. 28, 503–532 (2005).
5. Branco, T. & Häusser, M. The single dendritic branch as a fundamental functional unit in the nervous system. Curr. Opin. Neurobiol. 20, 494–502 (2010).
6. Stuart, G. J. & Spruston, N. Dendritic integration: 60 years of progress. Nat. Neurosci. 18, 1713–1721 (2015).
7. Poirazi, P. & Papoutsi, A. Illuminating dendritic function with computational models. Nat. Rev. Neurosci. 21, 303–321 (2020).
8. Yuste, R. & Denk, W. Dendritic spines as basic functional units of neuronal integration. Nature 375, 682–684 (1995).
9. Engert, F. & Bonhoeffer, T. Dendritic spine changes associated with hippocampal long-term synaptic plasticity. Nature 399, 66–70 (1999).
10. Yuste, R. Dendritic spines and distributed circuits. Neuron 71, 772–781 (2011).
11. Yuste, R. Electrical compartmentalization in dendritic spines. Annu. Rev. Neurosci. 36, 429–449 (2013).
12. Rall, W. Branching dendritic trees and motoneuron membrane resistivity. Exp. Neurol. 1, 491–527 (1959).
13. Segev, I. & Rall, W. Computational study of an excitable dendritic spine. J. Neurophysiol. 60, 499–523 (1988).
14. Silver, D. et al. Mastering the game of Go with deep neural networks and tree search. Nature 529, 484–489 (2016).
15. Silver, D. et al. A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. Science 362, 1140–1144 (2018).
16. McCloskey, M. & Cohen, N. J. Catastrophic interference in connectionist networks: the sequential learning problem. Psychol. Learn. Motiv. 24, 109–165 (1989).
17. French, R. M. Catastrophic forgetting in connectionist networks. Trends Cogn. Sci. 3, 128–135 (1999).
18. Naud, R. & Sprekeler, H. Sparse bursts optimize information transmission in a multiplexed neural code. Proc. Natl Acad. Sci. USA 115, E6329–E6338 (2018).
19. Sacramento, J., Costa, R. P., Bengio, Y. & Senn, W. Dendritic cortical microcircuits approximate the backpropagation algorithm. In Advances in Neural Information Processing Systems 31 (NeurIPS 2018).
20. Payeur, A., Guerguiev, J., Zenke, F., Richards, B. A. & Naud, R. Burst-dependent synaptic plasticity can coordinate learning in hierarchical circuits. Nat. Neurosci. 24, 1010–1019 (2021).
21. Bicknell, B. A. & Häusser, M. A synaptic learning rule for exploiting nonlinear dendritic computation. Neuron 109, 4001–4017 (2021).
22. Moldwin, T., Kalmenson, M. & Segev, I. The gradient clusteron: a model neuron that learns to solve classification tasks via dendritic nonlinearities, structural plasticity, and gradient descent.
23. Hodgkin, A. L. & Huxley, A. F. A quantitative description of membrane current and its application to conduction and excitation in nerve. J. Physiol. 117, 500–544 (1952).
24. Rall, W. Theory of physiological properties of dendrites. Ann. N. Y. Acad. Sci. 96, 1071–1092 (1962).
25. Hines, M. L. & Carnevale, N. T. The NEURON simulation environment. Neural Comput. 9, 1179–1209 (1997).
26. Bower, J. M. & Beeman, D. In The Book of GENESIS: Exploring Realistic Neural Models with the GEneral NEural SImulation System (eds Bower, J. M. & Beeman, D.) 17–27 (Springer New York, 1998).
27. Hines, M. L., Eichner, H. & Schürmann, F. Neuron splitting in compute-bound parallel network simulations enables runtime scaling with twice as many processors. J. Comput. Neurosci. 25, 203–210 (2008).
28. Hines, M. L., Markram, H. & Schürmann, F. Fully implicit parallel simulation of single neurons. J. Comput. Neurosci. 25, 439–448 (2008).
29. Ben-Shalom, R., Liberman, G. & Korngreen, A. Accelerating compartmental modeling on a graphical processing unit. Front. Neuroinform. 7, 4 (2013).
30. Tsuyuki, T., Yamamoto, Y. & Yamazaki, T. Efficient numerical simulation of neuron models with spatial structure on graphics processing units. In Proc. 2016 International Conference on Neural Information Processing (eds Hirose, A. et al.) 279–285 (Springer International Publishing, 2016).
31. Vooturi, D. T., Kothapalli, K. & Bhalla, U. S. Parallelizing Hines matrix solver in neuron simulations on GPU. In Proc. IEEE 24th International Conference on High Performance Computing (HiPC) 388–397 (IEEE, 2017).
32. Huber, F. Efficient tree solver for Hines matrices on the GPU. Preprint at https://arxiv.org/abs/1810.12742 (2018).
33. Korte, B. & Vygen, J. Combinatorial Optimization: Theory and Algorithms 6th edn (Springer, 2018).
34. Gebali, F. Algorithms and Parallel Computing (Wiley, 2011).
35. Kumbhar, P. et al. CoreNEURON: an optimized compute engine for the NEURON simulator. Front. Neuroinform. 13, 63 (2019).
36. Urbanczik, R. & Senn, W. Learning by the dendritic prediction of somatic spiking. Neuron 81, 521–528 (2014).
37. Ben-Shalom, R., Aviv, A., Razon, B. & Korngreen, A. Optimizing ion channel models using a parallel genetic algorithm on graphical processors. J. Neurosci. Methods 206, 183–194 (2012).
38. Mascagni, M. A parallelizing algorithm for computing solutions to arbitrarily branched cable neuron models. J. Neurosci. Methods 36, 105–114 (1991).
39. McDougal, R. A. et al. Twenty years of ModelDB and beyond: building essential modeling tools for the future of neuroscience. J. Comput. Neurosci. 42, 1–10 (2017).
40. Migliore, M., Messineo, L. & Ferrante, M. Dendritic Ih selectively blocks temporal summation of unsynchronized distal inputs in CA1 pyramidal neurons.
41. Hemond, P. et al. Distinct classes of pyramidal cells exhibit mutually exclusive firing patterns in hippocampal area CA3b. Hippocampus 18, 411–424 (2008).
42. Hay, E., Hill, S., Schürmann, F., Markram, H. & Segev, I. Models of neocortical layer 5b pyramidal cells capturing a wide range of dendritic and perisomatic active properties. PLoS Comput. Biol. 7, e1002107 (2011).
43. Masoli, S., Solinas, S. & D'Angelo, E. Action potential processing in a detailed Purkinje cell model reveals a critical role for axonal compartmentalization.
44. Lindroos, R. et al. Basal ganglia neuromodulation over multiple temporal and structural scales—simulations of direct pathway MSNs investigate the fast onset of dopaminergic effects and predict the role of Kv4.2. Front. Neural Circuits 12, 3 (2018).
45. Migliore, M. et al. Synaptic clusters function as odor operators in the olfactory bulb. Proc. Natl Acad. Sci. USA 112, 8499–8504 (2015).
46. NVIDIA. CUDA C++ Programming Guide. https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html (2021).
47. NVIDIA. CUDA C++ Best Practices Guide. https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html (2021).
48. Harnett, M. T., Makara, J. K., Spruston, N., Kath, W. L. & Magee, J. C. Synaptic amplification by dendritic spines enhances input cooperativity. Nature 491, 599–602 (2012).
49. Chiu, C. Q. et al. Compartmentalization of GABAergic inhibition by dendritic spines. Science 340, 759–762 (2013).
50. Tønnesen, J., Katona, G., Rózsa, B. & Nägerl, U. V. Spine neck plasticity regulates compartmentalization of synapses. Nat. Neurosci. 17, 678–685 (2014).
51. Eyal, G. et al. Human cortical pyramidal neurons: from spines to spikes via models. Front. Cell. Neurosci. 12, 181 (2018).
52. Koch, C. & Zador, A. The function of dendritic spines: devices subserving biochemical rather than electrical compartmentalization. J. Neurosci. 13, 413–422 (1993).
53. Koch, C. Dendritic spines. In Biophysics of Computation (Oxford University Press, 1999).
54. Rapp, M., Yarom, Y. & Segev, I. The impact of parallel fiber background activity on the cable properties of cerebellar Purkinje cells.
55. Hines, M. Efficient computation of branched nerve equations. Int. J. Bio-Med. Comput. 15, 69–76 (1984).
56. Nayebi, A. & Ganguli, S. Biologically inspired protection of deep networks from adversarial attacks. Preprint at https://arxiv.org/abs/1703.09202 (2017).
57. Goddard, N. H. & Hood, G. Large-scale simulation using parallel GENESIS. In The Book of GENESIS: Exploring Realistic Neural Models with the GEneral NEural SImulation System (eds Bower, J. M. & Beeman, D.) 349–379 (Springer New York, 1998).
58. Migliore, M., Cannia, C., Lytton, W. W., Markram, H. & Hines, M. L. Parallel network simulations with NEURON. J. Comput. Neurosci. 21, 119 (2006).
59. Lytton, W. W. et al. Simulation neurotechnologies for advancing brain research: parallelizing large networks in NEURON. Neural Comput. 28, 2063–2090 (2016).
60. Valero-Lara, P. et al. cuHinesBatch: solving multiple Hines systems on GPUs for the Human Brain Project. In Proc. 2017 International Conference on Computational Science 566–575 (IEEE, 2017).
61. Akar, N. A. et al. Arbor—a morphologically-detailed neural network simulation library for contemporary high-performance computing architectures. In Proc. 27th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP) 274–282 (IEEE, 2019).
62. Ben-Shalom, R. et al. NeuroGPU: accelerating multi-compartment, biophysically detailed neuron simulations on GPUs. J. Neurosci. Methods 366, 109400 (2022).
63. Rempe, M. J. & Chopp, D. L. A predictor-corrector algorithm for reaction-diffusion equations associated with neural activity on branched structures. SIAM J. Sci. Comput. 28, 2139–2161 (2006).
64. Kozloski, J. & Wagner, J. An ultrascalable solution to large-scale neural tissue simulation. Front. Neuroinform. 5, 15 (2011).
65. Jayant, K. et al. Targeted intracellular voltage recordings from dendritic spines using quantum-dot-coated nanopipettes. Nat. Nanotechnol. 12, 335–342 (2017).
66. Palmer, L. M. & Stuart, G. J. Membrane potential changes in dendritic spines during action potentials and synaptic input. J. Neurosci. 29, 6897–6903 (2009).
67. Nishiyama, J. & Yasuda, R. Biochemical computation for spine structural plasticity. Neuron 87, 63–75 (2015).
68. Yuste, R. & Bonhoeffer, T. Morphological changes in dendritic spines associated with long-term synaptic plasticity. Annu. Rev. Neurosci. 24, 1071–1089 (2001).
69. Holtmaat, A. & Svoboda, K. Experience-dependent structural synaptic plasticity in the mammalian brain. Nat. Rev. Neurosci. 10, 647–658 (2009).
70. Caroni, P., Donato, F. & Muller, D. Structural plasticity upon learning: regulation and functions. Nat. Rev. Neurosci. 13, 478–490 (2012).
71. Keck, T. et al. Massive restructuring of neuronal circuits during functional reorganization of adult visual cortex. Nat. Neurosci. 11, 1162 (2008).
72. Hofer, S. B., Mrsic-Flogel, T. D., Bonhoeffer, T. & Hübener, M. Experience leaves a lasting structural trace in cortical circuits. Nature 457, 313–317 (2009).
73. Trachtenberg, J. T. et al. Long-term in vivo imaging of experience-dependent synaptic plasticity in adult cortex. Nature 420, 788–794 (2002).
74. Marik, S. A., Yamahachi, H., McManus, J. N., Szabo, G. & Gilbert, C. D. Axonal dynamics of excitatory and inhibitory neurons in somatosensory cortex. PLoS Biol. 8, e1000395 (2010).
75. Xu, T. et al. Rapid formation and selective stabilization of synapses for enduring motor memories. Nature 462, 915–919 (2009).
76. Albarran, E., Raissi, A., Jáidar, O., Shatz, C. J. & Ding, J. B. Enhancing motor learning by increasing the stability of newly formed dendritic spines in the motor cortex. Neuron 109, 3298–3311 (2021).
77. Branco, T. & Häusser, M. Synaptic integration gradients in single cortical pyramidal cell dendrites. Neuron 69, 885–892 (2011).
78. Major, G., Larkum, M. E. & Schiller, J. Active properties of neocortical pyramidal neuron dendrites. Annu. Rev. Neurosci. 36, 1–24 (2013).
79. Gidon, A. et al. Dendritic action potentials and computation in human layer 2/3 cortical neurons. Science 367, 83–87 (2020).
80. Doron, M., Chindemi, G., Muller, E., Markram, H. & Segev, I. Timed synaptic inhibition shapes NMDA spikes, influencing local dendritic processing and global I/O properties of cortical neurons. Cell Rep. 21, 1550–1561 (2017).
81. Du, K. et al. Cell-type-specific inhibition of the dendritic plateau potential in striatal spiny projection neurons. Proc. Natl Acad. Sci. USA 114, E7612–E7621 (2017).
82. Smith, S. L., Smith, I. T., Branco, T. & Häusser, M. Dendritic spikes enhance stimulus selectivity in cortical neurons in vivo. Nature 503, 115–120 (2013).
83. Xu, N.-L. et al. Nonlinear dendritic integration of sensory and motor input during an active sensing task. Nature 492, 247–251 (2012).
84. Takahashi, N., Oertner, T. G., Hegemann, P. & Larkum, M. E. Active cortical dendrites modulate perception. Science 354, 1587–1590 (2016).
85. Sheffield, M. E. & Dombeck, D. A. Calcium transient prevalence across the dendritic arbour predicts place field properties. Nature 517, 200–204 (2015).
86. Markram, H. et al. Reconstruction and simulation of neocortical microcircuitry. Cell 163, 456–492 (2015).
87. Billeh, Y. N. et al. Systematic integration of structural and functional data into multi-scale models of mouse primary visual cortex. Neuron 106, 388–403 (2020).
88. Hjorth, J. et al. The microcircuits of striatum in silico. Proc. Natl Acad. Sci. USA 117, 202000671 (2020).
89. Guerguiev, J., Lillicrap, T. P. & Richards, B. A. Towards deep learning with segregated dendrites. eLife 6, e22901 (2017).
90. Iyer, A. et al. Avoiding catastrophe: active dendrites enable multi-task learning in dynamic environments. Front. Neurorobot. 16, 846219 (2022).
91. Jones, I. S. & Kording, K. P. Might a single neuron solve interesting machine learning problems through successive computations on its dendritic tree?
92. Bird, A. D., Jedlicka, P. & Cuntz, H. Dendritic normalisation improves learning in sparsely connected artificial neural networks. PLoS Comput. Biol. 17, e1009202 (2021).
93. Goodfellow, I. J., Shlens, J. & Szegedy, C. Explaining and harnessing adversarial examples. In 3rd International Conference on Learning Representations (ICLR, 2015).
94. Papernot, N., McDaniel, P. & Goodfellow, I. Transferability in machine learning: from phenomena to black-box attacks using adversarial samples. Preprint at https://arxiv.org/abs/1605.07277 (2016).
95. LeCun, Y., Bottou, L., Bengio, Y. & Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 86, 2278–2324 (1998).
96. Xiao, H., Rasul, K. & Vollgraf, R. Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms. Preprint at http://arxiv.org/abs/1708.07747 (2017).
97. Bartunov, S. et al. Assessing the scalability of biologically-motivated deep learning algorithms and architectures. In Advances in Neural Information Processing Systems 31 (NeurIPS, 2018).
98. Rauber, J., Brendel, W. & Bethge, M. Foolbox: a Python toolbox to benchmark the robustness of machine learning models. In Reliable Machine Learning in the Wild Workshop, 34th International Conference on Machine Learning (2017).
99. Rauber, J., Zimmermann, R., Bethge, M. & Brendel, W. Foolbox Native: fast adversarial attacks to benchmark the robustness of machine learning models in PyTorch, TensorFlow, and JAX. J. Open Source Softw. 5, 2607 (2020).
100. Paszke, A. et al. PyTorch: an imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems 32 (NeurIPS, 2019).
101. He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proc. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 770–778 (IEEE, 2016).

Acknowledgements

This work was supported by grants from China to K.D. and T.H.; by the National Natural Science Foundation of China (No. 6182588101) to K.M.; by the National Key R&D Program of China (No. 2018B030338001) to T.H.; by the National Natural Science Foundation of China (No. 6182588102) to Y.T.; by the Swedish Research Council (VR-M-2020-01652) and the Swedish e-Science Research Centre (No. 2022ZD01163005) to L.M.; and by the EU/Horizon 2020 Key-Area Research and Development Program (No. 2018B030338001).

This paper is distributed under a CC BY 4.0 Deed (Attribution 4.0 International) license and is available on nature.com.