This article looks at the design of the feature-extraction blocks developed to improve multi-scale object detection while keeping inference fast in real-world applications.

Cross-Stage Partial connections (CSP)

WongKinYiu et al. [1] first introduced this architectural innovation to address the problem of redundant gradient information in the backbones of larger convolutional neural networks. Its main goal is to enrich gradient interactions while reducing computational cost.

Cross-stage partial connections (CSP) preserve gradient diversity by combining feature maps from the beginning and the end of each network stage. The feature maps of the base layer are split into two parts: one passes through a dense block and a transition layer, while the other skips that path and connects directly to the next stage. This structure was designed to solve several problems, including strengthening the learning capability of the CNN and removing computational bottlenecks [1].

CSP-DenseNet keeps DenseNet's feature-reuse benefit while reducing duplicated gradient information by truncating the gradient flow. It does this through a hierarchical feature-fusion strategy in a partial transition layer. According to the authors' experiments [1], this approach reduces computation by 20% while achieving equal or even higher accuracy on the ImageNet dataset.

C3

YOLOv4 and YOLOv5 use the Cross Stage Partial (CSP) module to improve feature extraction in the bottleneck. The C3 block is a practical implementation of this CSP architecture in the Ultralytics models.

In the C3 block, the input feature map is split into two branches. One branch is processed by a 1×1 convolution followed by a stack of n Bottleneck blocks applied one after another, while the other branch passes through a separate 1×1 convolution and skips the bottlenecks entirely. The two branches are then concatenated along the channel dimension and fused by another 1×1 convolution to produce the output.
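To get a feel for why sending only a reduced-width branch through the bottlenecks is cheaper, the sketch below counts convolution weights for a C3-style block and compares the result with the same number of bottlenecks run at full width. It is a rough illustration rather than the Ultralytics code: the function names are hypothetical, each bottleneck is assumed to be a 1×1 convolution followed by a 3×3 convolution that keeps the channel count, the branch width is assumed to be `int(c2 * e)`, and BatchNorm parameters and biases are ignored.

```python
def conv_weights(c_in: int, c_out: int, k: int = 1) -> int:
    """Weights in a k x k convolution without bias (BatchNorm ignored)."""
    return c_in * c_out * k * k


def bottleneck_weights(c: int) -> int:
    """Assumed bottleneck: a 1x1 conv followed by a 3x3 conv, both c -> c."""
    return conv_weights(c, c, 1) + conv_weights(c, c, 3)


def c3_weights(c1: int, c2: int, n: int = 1, e: float = 0.5) -> int:
    """Conv weights of a C3-style block (hypothetical helper)."""
    c_hidden = int(c2 * e)                           # width of each branch
    branch_convs = 2 * conv_weights(c1, c_hidden)    # bottleneck branch + bypass branch
    bottlenecks = n * bottleneck_weights(c_hidden)   # only the reduced-width branch pays this
    fuse = conv_weights(2 * c_hidden, c2)            # 1x1 conv after the channel-wise concat
    return branch_convs + bottlenecks + fuse


def plain_stack_weights(c: int, n: int = 1) -> int:
    """Baseline: the same n bottlenecks applied to all c channels, with no split."""
    return n * bottleneck_weights(c)


if __name__ == "__main__":
    csp = c3_weights(256, 256, n=3)
    plain = plain_stack_weights(256, n=3)
    print(f"C3-style: {csp:,} conv weights, plain stack: {plain:,} conv weights")
```

Under these assumptions, with 256 input and output channels and three bottlenecks, the C3-style block needs roughly a third of the convolution weights of the full-width stack. This is only a back-of-the-envelope figure for this toy setting, not the 20% reduction reported for CSPNet.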
                     Input (x)
                         │
                ┌────────┴─────────┐
                │                  │
           [1x1 Conv]         [1x1 Conv]
              (cv1)              (cv2)
                │                  │
          [Bottlenecks]            │
          (m: n blocks)            │
                │                  │
                └────────┬─────────┘
                         │
                 [Concat along C]
                         │
                 [1x1 Conv → cv3]
                         │
                      Output

The Ultralytics implementation:

    import torch
    import torch.nn as nn
    from ultralytics.nn.modules import Bottleneck, Conv  # building blocks used by C3


    class C3(nn.Module):
        """CSP Bottleneck with 3 convolutions."""

        def __init__(self, c1: int, c2: int, n: int = 1, shortcut: bool = True, g: int = 1, e: float = 0.5):
            """
            Initialize the CSP Bottleneck with 3 convolutions.

            Args:
                c1 (int): Input channels.
                c2 (int): Output channels.
                n (int): Number of Bottleneck blocks.
                shortcut (bool): Whether to use shortcut connections.
                g (int): Groups for convolutions.
                e (float): Expansion ratio.
            """
            super().__init__()
            c_ = int(c2 * e)  # hidden channels
            self.cv1 = Conv(c1, c_, 1, 1)  # 1x1 conv feeding the Bottleneck branch
            self.cv2 = Conv(c1, c_, 1, 1)  # 1x1 conv on the bypass branch
            self.cv3 = Conv(2 * c_, c2, 1)  # 1x1 conv fusing the concatenated branches
            # n Bottlenecks applied one after another
            self.m = nn.Sequential(*(Bottleneck(c_, c_, shortcut, g, k=((1, 1), (3, 3)), e=1.0) for _ in range(n)))

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            """Forward pass through the CSP bottleneck with 3 convolutions."""
            return self.cv3(torch.cat((self.m(self.cv1(x)), self.cv2(x)), 1))

Cross-Stage Partial with 2F connections (C2f)

The C2f block builds on CSPNet and extends it further: instead of a single fusion path, it introduces two parallel feature-fusion connections, each with half the number of output channels. This idea, which first appeared in YOLOv7 and YOLOv8 [2, 3], follows the same principles as CSP, splitting the input feature map to reduce computational redundancy and improve feature reuse.

In a C2f block, the input tensor is split into two paths: one bypasses the Bottleneck layers as a shortcut, while the other passes through several Bottleneck layers. Unlike the original CSP, which uses only the final Bottleneck output, C2f collects all intermediate Bottleneck outputs and concatenates them, improving feature diversity and representation. This dual feature-fusion (2F) structure also helps the network handle occlusion better, making detections more robust in difficult scenes.
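The difference between keeping only the last Bottleneck output and keeping every intermediate output shows up directly in the width of the concatenated tensor. The sketch below tracks channel counts only, with no tensors involved. The function names are hypothetical, and it assumes a hidden width of `int(c2 * e)` and Bottlenecks that preserve their channel count.

```python
def last_only_widths(c2: int, n: int, e: float = 0.5) -> list:
    """Channel widths fused when only the final Bottleneck output is kept (C3-style)."""
    c = int(c2 * e)
    chain_out = c   # n Bottlenecks run back to back; only the last output survives
    bypass = c      # the branch that skips the Bottlenecks
    return [chain_out, bypass]


def c2f_widths(c2: int, n: int, e: float = 0.5) -> list:
    """Channel widths fused when every Bottleneck output is kept (C2f-style)."""
    c = int(c2 * e)
    widths = [c, c]        # output of the first conv, split into two halves
    for _ in range(n):
        widths.append(c)   # each Bottleneck reads the latest entry and appends its own output
    return widths


if __name__ == "__main__":
    for name, widths in (("last-only", last_only_widths(256, 3)), ("C2f", c2f_widths(256, 3))):
        print(f"{name}: pieces={widths}, concatenated channels={sum(widths)}")
```

For `c2 = 256` and `n = 3`, the C2f-style fuse convolution sees 5 × 128 = 640 input channels, compared with 2 × 128 = 256 when only the last output is kept. The extra width is the price of exposing every intermediate feature map to the final 1×1 convolution.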
The Ultralytics implementation:

    import torch
    import torch.nn as nn
    from ultralytics.nn.modules import Bottleneck, Conv  # building blocks used by C2f


    class C2f(nn.Module):
        """Faster Implementation of CSP Bottleneck with 2 convolutions."""

        def __init__(self, c1: int, c2: int, n: int = 1, shortcut: bool = False, g: int = 1, e: float = 0.5):
            """
            Initialize a CSP bottleneck with 2 convolutions.

            Args:
                c1 (int): Input channels.
                c2 (int): Output channels.
                n (int): Number of Bottleneck blocks.
                shortcut (bool): Whether to use shortcut connections.
                g (int): Groups for convolutions.
                e (float): Expansion ratio.
            """
            super().__init__()
            self.c = int(c2 * e)  # hidden channels
            self.cv1 = Conv(c1, 2 * self.c, 1, 1)
            self.cv2 = Conv((2 + n) * self.c, c2, 1)  # optional act=FReLU(c2)
            self.m = nn.ModuleList(Bottleneck(self.c, self.c, shortcut, g, k=((3, 3), (3, 3)), e=1.0) for _ in range(n))

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            """Forward pass through C2f layer."""
            y = list(self.cv1(x).chunk(2, 1))  # two halves of self.c channels each
            # Each Bottleneck consumes the most recent entry, so every intermediate output is kept.
            y.extend(m(y[-1]) for m in self.m)
            return self.cv2(torch.cat(y, 1))

        def forward_split(self, x: torch.Tensor) -> torch.Tensor:
            """Forward pass using split() instead of chunk()."""
            y = self.cv1(x).split((self.c, self.c), 1)
            y = [y[0], y[1]]
            y.extend(m(y[-1]) for m in self.m)
            return self.cv2(torch.cat(y, 1))

Cross Stage Partial with Kernel size 2 (C3k2) block

YOLOv11 [4] uses the C3k2 blocks described below in the head, and for feature extraction at different stages of the backbone, to process multi-scale features. They are a further evolution of the classic CSP bottleneck. The C3k2 block splits the feature map and processes it with several lightweight 3×3 convolutions, fusing the results afterwards. This improves information flow while remaining more compact than a full CSP bottleneck, reducing the number of trainable parameters.

The C3k block keeps the same basic structure as C2f but does not split the output after the initial convolution. Instead, it runs the input through Bottleneck layers with internal concatenations, finishing with a final 1×1 convolution.
Unlike C2f, C3k adds flexibility through customizable kernel sizes, helping the model capture fine details at different object scales.

Building on this idea, C3k2 replaces the plain Bottlenecks with C3k blocks (when its c3k flag is set; otherwise it falls back to plain Bottlenecks). It starts with a Conv block, stacks several C3k blocks in series, concatenates their outputs with the original input, and finishes with another Conv layer. This combines CSP's split-and-fuse concept with flexible kernels to balance speed, parameter efficiency, and richer multi-scale feature extraction.

    Input: [Batch, c1, H, W]
               │
    [cv1] (1x1 Conv) → splits channels into 2c
               │
        ┌──────┴──────┐
        │             │
     Branch 1      Branch 2
     (Bypass)      (Bottleneck chain)
        │             │
        │             ├─> C3k Block #1
        │             ├─> C3k Block #2
        │             │   ... (n times)
        │             │
        └──────┬──────┘
               │
    Concatenate [Bypass, Split, C3k outputs]
               │
    [cv2] (1x1 Conv)
               │
    Output: [Batch, c2, H, W]

Each C3k block runs a chain of Bottlenecks with a custom kernel size alongside a bypass branch, giving more flexibility for feature extraction and letting the model adapt better to complex patterns.

    C3k Input: [Batch, c, H, W]
               │
    [cv1] (1x1 Conv, expand/split)
               │
        ┌──────┴──────┐
        │             │
      ByPass    Bottleneck blocks
        │       B1, B2, ... Bn (chained)
        │             │
        └──────┬──────┘
               │
          Concatenate
               │
    [cv2] (1x1 Conv)
               │
    C3k Output: [Batch, c, H, W]

The Ultralytics implementation:

    # C3 and C2f are the classes from the earlier sections; nn and Bottleneck
    # come from torch and ultralytics.nn.modules respectively.
    class C3k(C3):
        """C3k is a CSP bottleneck module with customizable kernel sizes for feature extraction in neural networks."""

        def __init__(self, c1: int, c2: int, n: int = 1, shortcut: bool = True, g: int = 1, e: float = 0.5, k: int = 3):
            """
            Initialize C3k module.

            Args:
                c1 (int): Input channels.
                c2 (int): Output channels.
                n (int): Number of Bottleneck blocks.
                shortcut (bool): Whether to use shortcut connections.
                g (int): Groups for convolutions.
                e (float): Expansion ratio.
                k (int): Kernel size.
            """
            super().__init__(c1, c2, n, shortcut, g, e)
            c_ = int(c2 * e)  # hidden channels
            # self.m = nn.Sequential(*(RepBottleneck(c_, c_, shortcut, g, k=(k, k), e=1.0) for _ in range(n)))
            # Replace the Bottleneck chain built by C3 with one that uses k x k kernels.
            self.m = nn.Sequential(*(Bottleneck(c_, c_, shortcut, g, k=(k, k), e=1.0) for _ in range(n)))


    class C3k2(C2f):
        """Faster Implementation of CSP Bottleneck with 2 convolutions."""

        def __init__(
            self, c1: int, c2: int, n: int = 1, c3k: bool = False, e: float = 0.5, g: int = 1, shortcut: bool = True
        ):
            """
            Initialize C3k2 module.

            Args:
                c1 (int): Input channels.
                c2 (int): Output channels.
                n (int): Number of blocks.
                c3k (bool): Whether to use C3k blocks.
                e (float): Expansion ratio.
                g (int): Groups for convolutions.
                shortcut (bool): Whether to use shortcut connections.
            """
            super().__init__(c1, c2, n, shortcut, g, e)
            # Same outer wiring as C2f; only the inner blocks change.
            self.m = nn.ModuleList(
                C3k(self.c, self.c, 2, shortcut, g) if c3k else Bottleneck(self.c, self.c, shortcut, g) for _ in range(n)
            )

Conclusion

In short, modern YOLO architectures continue to evolve by adding blocks such as C3, C2f, C3k, and C3k2, all built around the core idea of Cross-Stage Partial (CSP) connections. The CSP approach reduces computation and improves feature representation at the same time.
Block   Outer structure                                Inner structure                                          Kernel flexibility
C3      Bottleneck chain in parallel with a bypass     Bottlenecks                                              Fixed kernels
C2f     Serial Bottlenecks, all outputs concatenated   Bottlenecks                                              Fixed kernels
C3k     Bottleneck chain in parallel with a bypass     Bottlenecks                                              Custom kernels
C3k2    Serial C3k blocks, all outputs concatenated    Each C3k has a Bottleneck chain parallel to a bypass     Custom kernels

Together, these architectural refinements help YOLO models maintain high detection accuracy while staying fast and lightweight enough for real-time deployment, a critical advantage for a wide range of applications.

Links

[1] https://arxiv.org/pdf/1911.11929
[2] https://arxiv.org/pdf/2207.02696
[3] https://arxiv.org/pdf/2408.15857
[4] https://arxiv.org/html/2410.17725v1#S3