Authors: Ittai Dayan, Holger R. Roth, Aoxiao Zhong, Ahmed Harouni, Amilcare Gentili, Anas Z. Abidin, Andrew Liu, Anthony Beardsworth Costa, Bradford J. Wood, Chien-Sung Tsai, Chih-Hung Wang, Chun-Nan Hsu, C. K. Lee, Peiying Ruan, Daguang Xu, Dufan Wu, Eddie Huang, Felipe Campos Kitamura, Griffin Lacey, Gustavo César de Antônio Corradi, Gustavo Nino, Hao-Hsin Shin, Hirofumi Obinata, Hui Ren, Jason C. Crane, Jesse Tetreault, Jiahui Guan, John W. Garrett, Joshua D. Kaggie, Jung Gil Park, Keith Dreyer, Krishna Juluru, Kristopher Kersten, Marcio Aloisio Bezerra Cavalcanti Rockenbach, Marius George Linguraru, Masoom A. Haider, Meena AbdelMaseeh, Nicola Rieke, Pablo F. Damasceno, Pedro Mario Cruz e Silva, Pochuan Wang, Sheng Xu, Shuichi Kawano, Sira Sriswasdi, Soo Young Park, Thomas M. Grist, Varun Buch, Watsamon Jantarabenjakul, Weichung Wang, Won Young Tak, Xiang Li, Xihong Lin, Young Joon Kwon, Abood Quraini, Andrew Feng, Andrew N. Priest, Baris Turkbey, Benjamin Glicksberg, Bernardo Bizzo, Byung Seok Kim, Carlos Tor-Díez, Chia-Cheng Lee, Chia-Jung Hsu, Chin Lin, Chiu-Ling Lai, Christopher P. Hess, Colin Compas, Deepeksha Bhatia, Eric K. Oermann, Evan Leibovitz, Hisashi Sasaki, Hitoshi Mori, Isaac Yang, Jae Ho Sohn, Krishna Nand Keshava Murthy, Li-Chen Fu, Matheus Ribeiro Furtado de Mendonça, Mike Fralick, Min Kyu Kang, Mohammad Adil, Natalie Gangai, Peerapon Vateekul, Pierre Elnajjar, Sarah Hickman, Sharmila Majumdar, Shelley L. McLeod, Sheridan Reed, Stefan Gräf, Stephanie Harmon, Tatsuya Kodama, Thanyawee Puthanakit, Tony Mazzulli, Vitor Lima de Lavor, Yothin Rakvongthai, Yu Rim Lee, Yuhong Wen, Fiona J. Gilbert, Mona G. Flores, Quanzheng Li

Abstract

Federated learning (FL) is a method used for training artificial intelligence models with data from multiple sources while maintaining data anonymity, thus removing many barriers to data sharing. Here we used data from 20 institutes across the globe to train an FL model, called EXAM (electronic medical record (EMR) chest X-ray AI model), that predicts the future oxygen requirements of symptomatic patients with COVID-19 using inputs of vital signs, laboratory data and chest X-rays.
EXAM achieved an average area under the curve (AUC) >0.92 for predicting outcomes at 24 and 72 h from the time of initial presentation to the emergency room, and provided a 16% improvement in average AUC compared with models trained at a single site.

Main

The scientific, academic, medical and data science communities have come together in the face of the COVID-19 pandemic crisis to rapidly assess novel paradigms in artificial intelligence (AI) that are rapid and secure, and that potentially incentivize data sharing and model training and testing without the usual privacy and data ownership hurdles of conventional collaborations [1,2]. Healthcare providers, researchers and industry have pivoted their focus to address unmet and critical clinical needs created by the crisis, with remarkable results [3–9]. Clinical trial recruitment has been expedited and facilitated by national regulatory bodies and an international spirit of cooperation [10–12]. The data analytics and AI disciplines have always fostered open and collaborative approaches, embracing concepts such as open-source software, reproducible research, data repositories and making anonymized datasets publicly available [13,14]. The pandemic has emphasized the need to expeditiously conduct data collaborations that empower the clinical and scientific communities when responding to rapidly evolving and widespread global challenges [15–17].

A concrete example of these types of collaboration is our previous work on an AI-based SARS-COV-2 clinical decision support (CDS) model. This CDS model was developed at Mass General Brigham (MGB) and was validated across data from multiple health systems.
The inputs to the CDS model were chest X-ray (CXR) images, vital signs, demographic data and laboratory values that were shown in previous publications to be predictive of outcomes of patients with COVID-19 [18–21]. CXR was selected as the imaging input because it is widely available and commonly indicated by guidelines such as those provided by the ACR [22], the Fleischner Society [23], the WHO [24], national thoracic societies [25], national health ministry COVID handbooks and radiology societies across the world [26]. The output of the CDS model was a score, termed CORISK [27], that corresponds to oxygen support requirements and that could aid frontline clinicians in triaging patients [28–30]. Healthcare providers are known to prefer models that have been validated on their own data [27]. To date, most AI models, including the aforementioned CDS model, have been trained and validated on 'narrow' data that often lack diversity [31,32], potentially resulting in overfitting and lower generalizability. This can be mitigated by training with diverse data from multiple sites without centralization of data [33], using methods such as transfer learning [34,35] or FL. FL is a method used to train AI models on disparate data sources, without the data being transported or exposed outside their original location [36].

Federated learning supports the rapid launch of centrally orchestrated experiments with improved traceability of data and assessment of algorithmic changes and impact [37]. One approach to FL, called client-server, sends an 'untrained' model to other servers ('nodes') that conduct partial training tasks, in turn sending the results back to be merged in the central ('federated') server [36].
Governance of data for FL is maintained locally, alleviating privacy concerns, with only model weights or gradients communicated between client sites and the federated server [38,39]. FL has already shown promise in recent medical imaging applications [40–43], including in COVID-19 analysis [8,44,45]. A notable example is a mortality prediction model in patients infected with SARS-COV-2 that uses clinical features, albeit limited in terms of number of modalities and scale [46].

Our objective was to develop a robust, generalizable model that could assist in triaging patients. We theorized that the CDS model could be federated successfully, given its use of data inputs that are relatively common in clinical practice and that do not rely heavily on operator-dependent assessments of patient condition (such as clinical impressions or reported symptoms). Rather, laboratory results, vital signs, an imaging study and a commonly captured demographic (that is, age) were used. We therefore retrained the CDS model with diverse data using a client-server FL approach to develop a new global FL model, named EXAM, using CXR and EMR inputs. Our hypothesis was that EXAM would perform better than local models and would generalize better across healthcare systems.

Results

EXAM model architecture

The EXAM model is based on the CDS model mentioned above [27]. In total, 20 features (19 from the EMR and one CXR) were used as input to the model (Table 1). The outcome (that is, 'ground truth') labels were assigned based on patient oxygen therapy after 24- and 72-h periods from initial admission to the emergency department (ED).

The outcome labels of patients were set to 0, 0.25, 0.50 and 0.75 depending on the most intensive oxygen therapy the patient received in the prediction window.
The oxygen therapy categories were, respectively, room air (RA), low-flow oxygen (LFO), high-flow oxygen (HFO)/noninvasive ventilation (NIV) or mechanical ventilation (MV). If the patient died within the prediction window, the outcome label was set to 1. For EMR features, only the first values captured in the ED were used, and data preprocessing included deidentification, missing value imputation and normalization to zero mean and unit variance.

The model therefore fuses information from both EMR and CXR features, using a 34-layer convolutional neural network (ResNet34) to extract features from the CXR and a Deep & Cross network to combine these with the EMR features (for further details, see Methods). The model output is a risk score, termed the EXAM score, which is a continuous value in the range 0–1 for each of the 24- and 72-h predictions, corresponding to the labels described above.

Federating the model

The EXAM model was trained using a cohort of 16,148 cases, making it not only among the first FL models for COVID-19 but also a very large, multicontinent development project in clinically relevant AI (Fig. 1a,b). Data between sites were not harmonized before extraction and, in light of real-life clinical informatics circumstances, a meticulous harmonization of the data input was not conducted by the authors (Fig. 1c,d).

Fig. 1: a, World map showing the 20 different client sites contributing to the EXAM study. b, Number of cases contributed by each institution or site (client 1 represents the site contributing the largest number of cases). c, Chest X-ray intensity distribution at each client site. d, Age of patients at each client site, showing minimum and maximum ages (asterisks), mean age (triangles) and standard deviation (horizontal bars).
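The outcome labelling described under 'EXAM model architecture' can be illustrated with a short sketch. This is not the authors' code: the dictionary THERAPY_LABEL and the function outcome_label are hypothetical names, and the only facts taken from the text are the five label values and the rule that the most intensive therapy in the window (or death) determines the label.

```python
# Hypothetical encoding of EXAM outcome labels from the most intensive
# oxygen therapy received within one prediction window (24 h or 72 h).
THERAPY_LABEL = {
    "RA": 0.0,        # room air
    "LFO": 0.25,      # low-flow oxygen
    "HFO/NIV": 0.50,  # high-flow oxygen / noninvasive ventilation
    "MV": 0.75,       # mechanical ventilation
}
DEATH_LABEL = 1.0     # patient died within the prediction window


def outcome_label(therapies, died=False):
    """Return the outcome label for one patient and one prediction window.

    therapies: iterable of therapy codes recorded in the window.
    died: True if the patient died within the window.
    """
    if died:
        return DEATH_LABEL
    therapies = list(therapies)
    if not therapies:
        raise ValueError("at least one therapy record is required")
    # The label follows the most intensive therapy, i.e. the largest value.
    return max(THERAPY_LABEL[code] for code in therapies)
```

For example, a patient who moved from room air to low-flow oxygen within the window would receive the LFO label, and any patient who died would receive 1 regardless of therapy.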
Training the model through FL resulted in a significant performance improvement (P ≪ 1 × 10^-3, Wilcoxon signed-rank test) of 16% (as defined by average AUC when running the model on the respective local test sets: from 0.795 to 0.920, or 12.5 percentage points) (Fig. 2a). It also resulted in a 38% improvement in generalizability (as defined by average AUC when running the model on all test sets: from 0.667 to 0.920, or 25.3 percentage points) of the best global model for prediction of 24-h oxygen treatment compared with models trained only on a site's own data (Fig. 2b). For prediction of 72-h oxygen treatment, the best global model resulted in an average performance improvement of 18% compared with locally trained models, while generalizability of the global model improved on average by 34% (Extended Data Fig. 1). The stability of our results was validated by repeating three runs of local and FL training on different randomized data splits.
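The relative improvements and percentage-point gains quoted above are two views of the same AUC pairs. The short sketch below, with a hypothetical helper named improvement, reproduces the arithmetic for the 24-h figures; it is only a check of the reported numbers, not part of the study's analysis code.

```python
def improvement(baseline_auc, new_auc):
    """Return (gain in percentage points, relative gain in percent)."""
    points = (new_auc - baseline_auc) * 100.0
    relative = (new_auc - baseline_auc) / baseline_auc * 100.0
    return points, relative


# Local test sets: average AUC rises from 0.795 (local) to 0.920 (FL).
local_points, local_relative = improvement(0.795, 0.920)

# All test sets (generalizability): 0.667 (local) to 0.920 (FL).
gen_points, gen_relative = improvement(0.667, 0.920)

print(f"performance: {local_points:.1f} points, {local_relative:.0f}% relative")
print(f"generalizability: {gen_points:.1f} points, {gen_relative:.0f}% relative")
```

The relative gain divides by the local baseline, which is why a 12.5-point gain corresponds to about 16% and a 25.3-point gain to about 38%.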
Fig. 2: a, Performance on each client's test set in prediction of 24-h oxygen treatment for models trained on local data only (Local) versus that of the best global model available on the server (FL (gl. best)). b, Generalizability (average performance on other sites' test data, as represented by average AUC) as a function of a client's dataset size (no. of cases). The green horizontal line denotes the generalizability performance of the best global model. The performance for 18 of 20 clients is shown, because client 12 had outcomes only for 72-h oxygen treatment (Extended Data Fig. 1) and client 14 had cases only with RA treatment, such that the evaluation metric (average AUC) was not applicable in either of these cases (Methods). Data for client 14 were also excluded from computation of average generalizability in the local models.

Local models that were trained using unbalanced cohorts (for example, mostly mild cases of COVID-19) markedly benefited from the FL approach, with a substantial improvement in average AUC for categories with only a few cases. This was evident at client site 16 (an unbalanced dataset), where most patients experienced mild disease severity and there were only a few severe cases (Fig. 3a and Extended Data Fig. 2). More importantly, the generalizability of the FL model was considerably increased over that of the locally trained model.

Fig. 3: a, ROC at client site 16, with unbalanced data and mostly mild cases. b, ROC of the local model at client site 12 (a small dataset), mean ROC of models trained on larger datasets corresponding to the five client sites in the Boston area (1, 4, 5, 6, 8) and ROC of the best global model in prediction of 72-h oxygen treatment for different thresholds of EXAM score (left, middle, right). The mean ROC is calculated based on five locally trained models, while the gray area denotes the ROC standard deviation.
ROCs for three different cutoff values (t) of the EXAM risk score are shown. Pos and neg denote the number of positive and negative cases, respectively, as defined by this range of EXAM score.

In the case of client sites with relatively small datasets, the best FL model markedly outperformed not only the local model but also those trained on larger datasets from five client sites in the Boston area of the USA (Fig. 3b).

The global model performed well in predicting oxygen needs at 24/72 h in both COVID-positive and COVID-negative patients (Extended Data Fig. 3).

Validation at independent sites

Following initial training, EXAM was tested at three independent validation sites: Cooley Dickinson Hospital (CDH), Martha's Vineyard Hospital (MVH) and Nantucket Cottage Hospital (NCH), all in Massachusetts, USA. The model was not retrained at these sites and was used only for validation purposes. The cohort size and model inference results are summarized in Table 2, and the ROC curves and confusion matrices for the largest dataset (from CDH) are shown in Fig. 4. The operating point was set to discriminate between nonmechanical ventilation and mechanical ventilation (MV) treatment (or death). The FL global trained model, EXAM, achieved an average AUC of 0.944 and 0.924 for the 24- and 72-h prediction tasks, respectively (Table 2), which exceeded the average performance among sites used in training EXAM. For prediction of MV treatment (or death) at 24 h, EXAM achieved a sensitivity of 0.950 and specificity of 0.882 at CDH, and a sensitivity of 1.000 and specificity of 0.934 at MVH. NCH did not have any cases with MV/death at 24 h. For 72-h MV prediction, EXAM achieved a sensitivity of 0.929 and specificity of 0.880 at CDH, a sensitivity of 1.000 and specificity of 0.976 at MVH, and a sensitivity of 1.000 and specificity of 0.929 at NCH.
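For readers less familiar with these metrics, the sketch below shows how sensitivity and specificity follow from confusion-matrix counts at a fixed operating point, and how the false-negative rate is the complement of sensitivity. The function names and the counts in the usage lines are hypothetical and chosen only for illustration; the per-site confusion matrices are given in the paper's figures, not here.

```python
def sensitivity(tp, fn):
    """True-positive rate: fraction of MV/death cases flagged by the model."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True-negative rate: fraction of non-MV cases correctly not flagged."""
    return tn / (tn + fp)


def false_negative_rate(sens):
    """Fraction of MV/death cases missed; complement of sensitivity."""
    return 1.0 - sens


# Illustrative counts only (assumed, not taken from the study).
example_sens = sensitivity(tp=19, fn=1)
example_spec = specificity(tn=90, fp=10)

# Using the reported 72-h sensitivity at CDH (0.929):
cdh_fnr_72h = false_negative_rate(0.929)
print(f"{cdh_fnr_72h:.1%}")
```

With the reported 72-h sensitivity of 0.929 at CDH, the complement is a false-negative rate of roughly 7.1%.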
Fig. 4: a,b, Performance (ROC) (top) and confusion matrices (bottom) of the EXAM FL model on the CDH dataset for prediction of oxygen requirement at 24 h (a) and 72 h (b). ROCs for three different cutoff values (t) of the EXAM risk score are shown.

For MV at CDH at 72 h, EXAM had a low false-negative rate of 7.1%. Representative failure cases are presented in Extended Data Fig. 4, showing two false-negative cases from CDH where one case had many missing EMR data features and the other had a CXR with a motion artifact and some missing EMR features.

Use of differential privacy

A primary motivation for healthcare institutes to use FL is to preserve the security and privacy of their data, as well as adherence to data compliance measures. For FL, there remains the potential risk of model 'inversion' [47] or even the reconstruction of training images from the model gradients themselves [48]. To counter these risks, security-enhancing measures were used to mitigate risk in the event of data 'interception' during site-server communication [49]. We experimented with techniques to avoid interception of FL data, and added a security feature that we believe could encourage more institutions to use FL. We thus validated previous findings showing that partial weight sharing, and other differential privacy techniques, can successfully be applied in FL [50]. Through investigation of a partial weight-sharing scheme [50–52], we showed that models can reach a comparable performance even when only 25% of weight updates are shared (Extended Data Fig. 5).

Discussion

This study features a large, real-world healthcare FL study in terms of the number of sites and the number of data points used. We believe that it provides a powerful proof of concept of the feasibility of using FL for fast and collaborative development of needed AI models in healthcare.
Our study involved multiple sites across four continents and under the oversight of different regulatory bodies, and thus holds the promise of being provided to different regulated markets in an expedited way. The global FL model, EXAM, proved to be more robust and achieved better results at individual sites than any model trained only on local data. We believe that this consistent improvement was achieved owing to a larger, but also more diverse, dataset and the use of data inputs that can be standardized.

For a client site with a relatively small dataset, two typical approaches could be used for fitting a useful model: one is to train locally with its own data, the other is to apply a model trained on a larger dataset. For sites with small datasets, it would have been virtually impossible to build a performant deep learning model using only their local data. The finding that these two approaches were outperformed on all three prediction tasks by the global FL model indicates that the benefit for client sites with small datasets arising from participation in FL collaborations is substantial. This is probably a reflection of FL's ability to capture more diversity than local training, and to mitigate the bias present in models trained on a homogeneous population. An under-represented population or age group in one hospital or region might be highly represented in another region, such as children, who might be differentially affected by COVID-19, including disease manifestations in lung imaging [46].

The validation results confirmed that the global model is robust, supporting our hypothesis that FL-trained models are generalizable across healthcare systems. They provide a compelling case for the use of predictive algorithms in the care of patients with COVID-19, and for the use of FL in model creation and testing. By participating in this study, the client sites received access to EXAM, which will be further validated before pursuing any regulatory approval or future introduction into clinical care.
Plans are under way to validate EXAM prospectively in 'production' settings at MGB, leveraging COVID-19 targeted resources [53], as well as at different sites that were not part of the EXAM training.

Over 200 prediction models to support decision-making in patients with COVID-19 have been published [19]. Unlike the majority of publications, which focus on diagnosis of COVID-19 or prediction of mortality, we predicted oxygen requirements, which have implications for patient management. We also used cases with unknown SARS-COV-2 status, so the model could provide input to the physician before a result for PCR with reverse transcription (RT–PCR) is received, making it useful in a real-life clinical setting. The model's imaging input is used in common practice, in contrast to models that use chest computed tomography, a diagnostic modality on which there is no consensus. The model's design was restricted to objective predictors, unlike many published studies.

Patient cohort identification and data harmonization are not novel issues in research and data science [54], but are further complicated when using FL, given the lack of visibility of other sites' datasets. Improvements to clinical information systems are needed to streamline data preparation, leading to better leverage of a network of sites participating in FL. This, in conjunction with hyperparameter engineering, can allow algorithms to 'learn' more effectively from larger data batches and adapt model parameters to a particular site for further personalization, for example, through further fine-tuning on that site [39]. A system that would allow seamless, close-to-real-time model inference and results processing would also be of benefit and would 'close the loop' from training to model deployment.

Because data were not centralized, they are not readily accessible. Given that, any future analysis of the results, beyond what was derived and collected, is limited.
Similar to other machine learning models, EXAM is limited by the quality of the training data. Institutions interested in deploying this algorithm for clinical care need to understand potential biases in the training. For example, the labels used as ground truth in the training of the EXAM model were derived from 24- and 72-h oxygen consumption in the patient; it is assumed that the oxygen delivered to the patient equates to the oxygen need. However, in the early phase of the COVID-19 pandemic, many patients were provided high-flow oxygen prophylactically regardless of their oxygen need. Such clinical practice could skew the predictions made by this model.

Since our data access was limited, we did not have sufficient information to generate detailed statistics regarding failure causes, post hoc, at most sites. However, we did study failure cases from the largest independent test site, CDH, and were able to generate hypotheses that we can test in the future. For high-performing sites, it seems that most failure cases fall into one of two categories: (1) low quality of input data, for example, missing data or motion artifact in CXR; or (2) out-of-distribution data, for example, a very young patient.

In the future, we also intend to investigate the potential for 'population drift' due to different phases of disease progression.

A feature that would enhance these kinds of large-scale collaboration is the ability to predict the contribution of each client site towards improving the global FL model. This will help in client site selection, and in prioritization of data acquisition and annotation efforts. It is especially important given the high costs and difficult logistics of these large consortium endeavors, and it will enable these endeavors to capture diversity rather than the sheer quantity of data samples.
Future approaches may incorporate automated hyperparameter searching [55], neural architecture search [56] and other automated machine learning [57] approaches to find optimal training parameters for each client site more efficiently.

Known issues of batch normalization (BN) in FL [58] motivated us to fix our base model [49] for image feature extraction to reduce the divergence between unbalanced client sites. Future work might explore different types of normalization techniques to allow more effective training of AI models in FL when client data are not independent and identically distributed.

Recent works on privacy attacks within FL have raised concerns about data leakage during model training [59]. Meanwhile, protection algorithms remain underexplored and constrained by multiple factors. Although differential privacy algorithms [36,48,49] show good protection, they may weaken the model's performance. Encryption algorithms, such as homomorphic encryption [60], maintain performance but may substantially increase message size and training time. A quantifiable way to measure privacy would allow better choices when deciding the minimal privacy parameters necessary while maintaining clinically acceptable performance [36,48,49].

Following further validation, we envision deployment of the EXAM model in the ED setting as a way to evaluate risk at both the per-patient and the population level, and to provide clinicians with an additional reference point when carrying out the frequently difficult task of triaging patients. We also envision using the model as a more sensitive population-level metric to help balance resources between regions, hospitals and departments. Our hope is that similar FL efforts can break down data silos and allow faster development of much-needed AI models in the near future.
Methods

Ethics approval

All procedures were conducted in accordance with the principles for human experimentation as defined in the Declaration of Helsinki and the International Conference on Harmonization Good Clinical Practice guidelines, and were approved by the relevant institutional review boards at the following validation sites: CDH, MVH and NCH; and at the following training sites: MGB, Mass General Hospital (MGH), Newton-Wellesley Hospital, North Shore Medical Center and Faulkner Hospital (all eight of these hospitals were covered under MGB's ethics approval, no. 2020P002673, and informed consent was waived).

MI-CLAIM guidelines for reporting of clinical AI models were followed (Supplementary Note 2).

Study setting

The study included data from 20 institutions (Fig. 1a): MGB, MGH, Brigham and Women's Hospital, Newton-Wellesley Hospital, North Shore Medical Center and Faulkner Hospital; Children's National Hospital in Washington, DC; NIHR Cambridge Biomedical Research Centre; The Self-Defense Forces Central Hospital in Tokyo; National Taiwan University MeDA Lab and MAHC and Taiwan National Health Insurance Administration; Tri-Service General Hospital in Taiwan; Kyungpook National University Hospital in South Korea; Faculty of Medicine, Chulalongkorn University in Thailand; Diagnosticos da America SA in Brazil; University of California, San Francisco; VA San Diego; University of Toronto; National Institutes of Health in Bethesda, Maryland; University of Wisconsin-Madison School of Medicine and Public Health; Memorial Sloan Kettering Cancer Center in New York; and Mount Sinai Health System in New York. Institutions were recruited between March and May 2020. Dataset curation started in June 2020 and the final data cohort was added in September 2020.
Between August and October 2020, 140 independent FL runs were conducted to develop the EXAM model and, by the end of October 2020, EXAM was made public on NVIDIA NGC [61–63]. Data from three independent sites were used for independent validation: CDH, MVH and NCH, all in Massachusetts, USA. These three hospitals had patient population characteristics different from those of the training sites. The data used for algorithm validation consisted of patients admitted to the ED at these sites between March 2020 and February 2021 who satisfied the same inclusion criteria as the data used to train the FL model.

Data collection

The 20 client sites prepared a total of 16,148 cases (both positive and negative) for the purposes of training, validation and testing of the model (Fig. 1b). Medical data were accessed in relation to patients who satisfied the study inclusion criteria. Client sites strove to include all COVID-positive cases from the beginning of the pandemic in December 2019 up to the time they started local training for the EXAM study. All local training had started by 30 September 2020. The sites also included other patients in the same period with negative RT–PCR test results. Since most of the sites had more SARS-COV-2-negative than -positive patients, we limited the number of negative patients included to, at most, 95% of the total cases at each client site.

A 'case' included a CXR and the requisite data inputs taken from the patient's medical record. The distribution and patterns of CXR image intensity (pixel values) varied greatly among sites owing to a multitude of patient- and site-specific factors, such as different device manufacturers and imaging protocols, as shown in Fig. 1c,d. Patient age and EMR feature distribution varied greatly among sites, as expected owing to the differing demographics between globally distributed hospitals (Extended Data Fig. 6).
Patient inclusion criteria

Patient inclusion criteria were: (1) the patient presented to the hospital's ED or equivalent; (2) the patient had an RT–PCR test performed at any time between presentation to the ED and discharge from the hospital; (3) the patient had a CXR in the ED; and (4) the patient's record had at least five of the EMR values detailed in Table 1, all obtained in the ED, and the relevant outcomes captured during hospitalization. Of note, the CXR, laboratory results and vitals used were the first available for capture during the visit to the ED. The model did not incorporate any CXR, laboratory results or vitals acquired after leaving the ED.

Model input

In total, 21 EMR features were used as input to the model. The outcome (that is, ground truth) labels were assigned based on patient requirements after 24- and 72-h periods from initial admission to the ED. A detailed list of the requested EMR features and outcomes can be seen in Table 1.

The distribution of oxygen treatment using different devices at different client sites is shown in Extended Data Fig. 7, which details the device usage at admission to the ED and after 24- and 72-h periods. The difference in dataset distribution between the largest and smallest client sites can be seen in Extended Data Fig. 8.

The number of positive COVID-19 cases, as confirmed by a single RT–PCR test obtained at any time between presentation to the ED and discharge from the hospital, is listed in Supplementary Table 1. Each client site was asked to randomly split its dataset into three parts: 70% for training, 10% for validation and 20% for testing. For both 24- and 72-h outcome prediction models, random splits for each of the three repeated local and FL training and evaluation experiments were independently generated.
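The per-site split described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used at the client sites: the function name split_cases, the use of integer case identifiers and the choice of seeds are all hypothetical. Only the 70/10/20 proportions and the three independently generated random splits come from the text.

```python
import random


def split_cases(case_ids, seed, fractions=(0.7, 0.1, 0.2)):
    """Randomly split case identifiers into train, validation and test lists."""
    ids = list(case_ids)
    random.Random(seed).shuffle(ids)  # seeded so each split is reproducible
    n_train = int(round(fractions[0] * len(ids)))
    n_val = int(round(fractions[1] * len(ids)))
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]      # remainder, about 20% of the cases
    return train, val, test


# Three independently generated splits, one per repeated experiment.
site_cases = range(1000)              # hypothetical site with 1,000 cases
splits = [split_cases(site_cases, seed=s) for s in (0, 1, 2)]
```

Using a different seed for each repetition gives independent splits while keeping every individual run reproducible.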
EXAM model development

There is wide variation in the clinical course of patients who present to hospital with symptoms of COVID-19, with some experiencing rapid deterioration in respiratory function requiring different interventions to prevent or mitigate hypoxemia [62,63]. A critical decision made during the evaluation of a patient at the initial point of care, or in the ED, is whether the patient is likely to require more invasive or resource-limited countermeasures or interventions (such as MV or monoclonal antibodies), and should therefore receive a scarce but effective therapy, a therapy with a narrow risk–benefit ratio due to side effects or a higher level of care, such as admittance to the intensive care unit [64]. In contrast, a patient who is at lower risk of requiring invasive oxygen therapy may be placed in a less intensive care setting, such as a regular ward, or even released from the ED for continued self-monitoring at home [65]. EXAM was developed to help triage such patients.

Of note, the model is not approved by any regulatory agency at this time and it should be used only for research purposes.

EXAM score

EXAM was trained using FL; it outputs a risk score (termed EXAM score) similar to CORISK [27] (Extended Data Fig. 9a) and can be used in the same way to triage patients. It corresponds to a patient's oxygen support requirements within two windows (24 and 72 h) after initial presentation to the ED. Extended Data Fig. 9b illustrates how CORISK and the EXAM score can be used for patient triage.

Chest X-ray images were preprocessed to select the anterior position image and exclude lateral view images, and then scaled to a resolution of 224 × 224. As shown in Extended Data Fig. 9a, the model fuses information from both EMR and CXR features (based on a modified ResNet34 with spatial attention [66], pretrained on the CheXpert dataset [67]) and the Deep & Cross network [68].
To converge these different data types, a 512-dimensional feature vector was extracted from each CXR image using a pretrained ResNet34 with spatial attention, then concatenated with the EMR features as the input for the Deep & Cross network. The final output was a continuous value in the range 0–1 for both 24- and 72-h predictions, corresponding to the labels described above, as shown in Extended Data Fig. 9b. We used cross-entropy as the loss function and Adam as the optimizer. The model was implemented in TensorFlow [69] using the NVIDIA Clara Train SDK [70]. The average AUC for the classification tasks (≥LFO, ≥HFO/NIV or ≥MV) was calculated and used as the final evaluation metric, with normalization to zero mean and unit variance.

Feature imputation and normalization

A MissForest algorithm [71] was used to impute EMR features, based on the local training dataset. If an EMR feature was completely missing from a client site dataset, the mean value of that feature, calculated exclusively on data from the MGB client sites, was used. EMR features were then rescaled to zero mean and unit variance based on statistics calculated on data from the MGB client sites.

Details of EMR–CXR data fusion using the Deep & Cross network

To model the interactions of features from EMR and CXR data at the case level, a deep-feature scheme based on the Deep & Cross network architecture [68] was used. Binary and categorical features for the EMR inputs, as well as 512-dimensional image features in the CXR, were transformed into fused dense vectors of real values by embedding and stacking layers. The transformed dense vectors served as input to the fusion framework, which specifically employed a crossing network to enforce fusion among input from different sources.
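The explicit feature crossing used by a Deep & Cross network can be sketched in a few lines. The example below follows the cross-layer formula of Wang et al. [68], x_{l+1} = x_0 (x_l · w_l) + b_l + x_l, and is only an illustration under stated assumptions: the batch size, the three cross layers, the random weights and the plain concatenation used in place of the embedding and stacking layers are all hypothetical, not the configuration used for EXAM.

```python
import numpy as np

def cross_layer(x0, xl, w, b):
    """One cross layer: x_{l+1} = x0 * (xl . w) + b + xl, applied row-wise."""
    # (batch, d) @ (d,) -> (batch,): inner product of the previous output
    # with the layer weights, one scalar per case.
    scale = xl @ w
    # Rescale the original input by that scalar, then add bias and residual.
    return x0 * scale[:, None] + b + xl

def cross_network(x0, weights, biases):
    """Stack cross layers; every layer re-uses the original input x0."""
    x = x0
    for w, b in zip(weights, biases):
        x = cross_layer(x0, x, w, b)
    return x

rng = np.random.default_rng(0)
image_features = rng.normal(size=(4, 512))  # stand-in for CXR feature vectors
emr_features = rng.normal(size=(4, 21))     # stand-in for the 21 EMR features
x0 = np.concatenate([image_features, emr_features], axis=1)
d = x0.shape[1]
weights = [rng.normal(scale=0.01, size=d) for _ in range(3)]  # assumed depth
biases = [np.zeros(d) for _ in range(3)]
crossed = cross_network(x0, weights, biases)
```

Because each layer multiplies the original input by a scalar derived from the previous layer, the polynomial degree of the feature interactions grows by one per layer while the parameter count stays linear in the input dimension.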
The crossing network performed explicit feature crossing within its layers, by conducting inner products between the original input feature and the output from the previous layer, thus increasing the degree of interaction across features. At the same time, two individual classic deep neural networks with several stacked, fully connected feed-forward layers were trained. The final output of our framework was then derived from the concatenation of both classic and crossing networks.

FL details

Arguably the most established form of FL is an implementation of the federated averaging algorithm as proposed by McMahan et al. [72], or variations thereof. This algorithm can be realized using a client-server setup where each participating site acts as a client. One can think of FL as a method aiming to minimize a global loss function by reducing a set of local loss functions, which are estimated at each site. By minimizing each client site's local loss while also synchronizing the learned client site weights on a centralized aggregation server, one can minimize global loss without needing to access the entire dataset in a centralized location. Each client site learns locally, and shares model weight updates with a central server that aggregates contributions using secure sockets layer encryption and communication protocols. The server then sends an updated set of weights to each client site after aggregation, and sites resume training locally. The server and client site iterate back and forth until the model converges (Extended Data Fig. 9c).

A pseudoalgorithm for FL is shown in Supplementary Note 1. In our experiments, we set the number of federated rounds to $T = 200$, with one local training epoch per round $t$ at each client. The number of clients, $K$, was up to 20 depending on the network connectivity of clients or available data for a specific targeted outcome period (24 or 72 h).
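The server-side aggregation of federated averaging can be sketched as a weighted mean of the client weights, with each client weighted by its number of local training iterations. The code below is a toy illustration, not the Clara Train implementation: `make_client`, `run_rounds`, the quadratic toy loss and the iteration counts are assumptions introduced for the example.

```python
import numpy as np

def federated_average(client_weights, n_iterations):
    """Weighted mean of client weights; client k contributes n_k / sum(n)."""
    total = float(sum(n_iterations))
    # zip(*client_weights) groups the same layer across all clients.
    return [
        sum(n / total * w for n, w in zip(n_iterations, layer))
        for layer in zip(*client_weights)
    ]

def make_client(target, n_k, lr=0.5):
    """Hypothetical client whose local optimum is `target`."""
    def local_train(weights):
        # One gradient step on the toy loss 0.5 * ||w - target||^2.
        return [w - lr * (w - target) for w in weights], n_k
    return local_train

def run_rounds(global_weights, clients, num_rounds):
    """Alternate local training and server aggregation for num_rounds."""
    for _ in range(num_rounds):
        updates, counts = [], []
        for local_train in clients:
            new_weights, n_k = local_train([w.copy() for w in global_weights])
            updates.append(new_weights)
            counts.append(n_k)
        global_weights = federated_average(updates, counts)
    return global_weights

clients = [make_client(1.0, 100), make_client(3.0, 300)]
final = run_rounds([np.zeros(2)], clients, num_rounds=200)
```

In this toy setting the global weights settle at the iteration-weighted mean of the two local optima, which shows how larger sites pull the shared model further towards their own data.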
The number of local training iterations, $n_k$, depends on the dataset size at each client $k$ and is used to weight each client's contributions when aggregating the model weights in federated averaging. During the FL training task, each client site selects its best local model by tracking the model's performance on its local validation set. At the same time, the server determines the best global model based on the average validation scores sent from each client site to the server after each FL round. After FL training finishes, the best local models and the best global model are automatically shared with all client sites and evaluated on their local test data.

When training on local data only (the baseline), we set the number of epochs to 200. The Adam optimizer was used for both local training and FL, with an initial learning rate of $5 \times 10^{-5}$ and a stepwise learning rate decay by a factor of 0.5 after every 40 epochs, which is important for the convergence of federated averaging [73]. Random affine transformations, including rotation, translation, shear and scaling, together with random intensity noise and shifts, were applied to the images for data augmentation during training.

Owing to the sensitivity of BN layers [58] when dealing with different clients in a non-independent and identically distributed setting, we found that the best model performance occurred when the parameters of the pretrained ResNet34 with spatial attention were kept fixed during FL training (that is, using a learning rate of zero for those layers) [47]. The Deep & Cross network that combines image features with EMR features does not contain BN layers and hence was not affected by BN instability issues.

In this study, we investigated a privacy-preserving scheme that shares only partial model updates between the server and client sites. The weight updates were ranked during each iteration by magnitude of contribution, and only a certain percentage of the largest weight updates was shared with the server.
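Sharing only the largest weight updates can be sketched as a percentile cut on the magnitudes of the non-zero updates. This is a hedged illustration, not the study's implementation: the function name `select_partial_update`, the 60th-percentile setting and the use of a greater-than-or-equal comparison are assumptions.

```python
import numpy as np

def select_partial_update(delta_w, percentile):
    """Keep only updates whose magnitude reaches the percentile threshold.

    The threshold is computed from the non-zero updates only; everything
    below it is replaced by zero and therefore not shared with the server.
    """
    magnitudes = np.abs(delta_w)
    nonzero = magnitudes[magnitudes > 0]
    if nonzero.size == 0:
        return np.zeros_like(delta_w), 0.0
    tau = np.percentile(nonzero, percentile)
    shared = np.where(magnitudes >= tau, delta_w, 0.0)
    return shared, tau

# Hypothetical update vector for one client in one round.
delta_w = np.array([0.0, -0.5, 0.1, 2.0, -0.05, 0.3])
shared, tau = select_partial_update(delta_w, percentile=60)
```

Because the threshold is recomputed from each client's own updates in every round, the shared subset adapts to the scale of the gradients instead of relying on a fixed cut-off.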
The sharing threshold, $\tau_k^{(t)}$ (Extended Data Fig. 5), was computed from all non-zero gradients, $\Delta W_k^{(t)}$, and could be different for each client $k$ in each FL round $t$. Variations of this scheme could include additional clipping of large gradients or differential privacy schemes [49] that add random noise to the gradients, or even to the raw data, before they are fed into the network [51].

Statistical analysis

We conducted a Wilcoxon signed-rank test to confirm the significance of the observed improvement in performance between the locally trained model and the FL model for the 24- and 72-h time points (Fig. 2 and Extended Data Fig. 1). The null hypothesis was rejected with one-sided $P \ll 1 \times 10^{-3}$ in both cases.

Pearson's correlation was used to assess the generalizability (robustness of the average AUC value to other client sites' test data) of locally trained models in relation to the respective local dataset size. The observed correlation was $r = 0.43$ ($P = 0.035$, degrees of freedom (df) = 17) for the 24-h model and $r = 0.62$ ($P = 0.003$, df = 16) for the 72-h model. This indicates that dataset size is not the only factor determining a model's robustness to unseen data.

To compare ROC curves from the global FL model and local models trained at different sites (Extended Data Fig. 3), we bootstrapped 1,000 samples from the data and computed the resulting AUCs. We then calculated the difference between the two series and standardized it using the formula $D = (\mathrm{AUC}_1 - \mathrm{AUC}_2)/s$, where $D$ is the standardized difference, $s$ is the standard deviation of the bootstrap differences and $\mathrm{AUC}_1$ and $\mathrm{AUC}_2$ are the corresponding bootstrapped AUC series. By comparing $D$ with the normal distribution, we obtained the $P$ values shown in Supplementary Table 2. The results show that the null hypothesis was rejected with very low $P$ values, indicating the statistical significance of the superiority of the FL results. The $P$ values were computed in R with the pROC library [74].
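The bootstrap comparison of two ROC curves can be sketched in Python as follows. The study used the pROC library in R; this sketch is not that implementation. It assumes paired resampling of cases, a numerator computed on the original sample and a two-sided normal $P$ value, and the function names and synthetic scores are hypothetical.

```python
import math
import numpy as np

def auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)

def bootstrap_auc_test(labels, scores_1, scores_2, n_boot=1000, seed=0):
    """Standardized AUC difference D = (AUC1 - AUC2) / s and its P value."""
    rng = np.random.default_rng(seed)
    n = labels.size
    diffs = []
    while len(diffs) < n_boot:
        idx = rng.integers(0, n, size=n)  # resample cases with replacement
        y = labels[idx]
        if y.min() == y.max():            # AUC needs both classes present
            continue
        diffs.append(auc(y, scores_1[idx]) - auc(y, scores_2[idx]))
    s = np.std(diffs, ddof=1)             # s.d. of the bootstrap differences
    d = (auc(labels, scores_1) - auc(labels, scores_2)) / s
    p = math.erfc(abs(d) / math.sqrt(2))  # two-sided P from the normal law
    return d, p

# Synthetic example: an informative score versus pure noise.
rng = np.random.default_rng(1)
labels = rng.integers(0, 2, size=200)
informative = labels + rng.normal(scale=0.5, size=200)
uninformative = rng.normal(size=200)
d, p = bootstrap_auc_test(labels, informative, uninformative)
```

Resampling the same cases for both models keeps the comparison paired, so $s$ reflects the variability of the difference rather than of each AUC separately.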
Because the model outputs a continuous score from 0 to 1 for a discrete outcome, a straightforward calibration evaluation such as a qqplot is not possible. We therefore quantified discrimination instead (Extended Data Fig. 10). We conducted one-way analysis of variance (ANOVA) tests to compare local and FL model scores among the four ground truth categories (RA, LFO, HFO, MV). The $F$-statistic, calculated as the variation between the sample means divided by the variation within the samples and representing the degree of dispersion among different groups, was used to quantify the models. Our results show that the $F$-values of five different local sites are 245.7, 253.4, 342.3, 389.8 and 634.8, while that of the FL model is 843.5. Given that larger $F$-values mean that groups are more separable, the scores from our FL model clearly show a greater dispersion among the four ground truth categories. Furthermore, the $P$ value of the ANOVA test on the FL model is $< 2 \times 10^{-16}$, indicating that the FL prediction scores are statistically significantly different among the different prediction classes.

Reporting Summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability

The dataset from the 20 institutes that participated in this study remains under their custody. These data were used for training at each of the local sites and were not shared with any of the other participating institutions or with the federated server, and they are not publicly available. Data from the independent validation sites are maintained by CAMCA, and access can be requested by contacting Q.L. Based on determination by CAMCA, a data-sharing review and amendment of the IRB for research purposes can be conducted by MGB research administration, in accordance with MGB IRB and policy.

Code availability

All code and software used in this study are publicly available at NGC. To access them, log in as a guest or create a profile, then enter one of the URLs below.
The trained models, data preparation guidelines, code for training and validation testing of the model, readme file, installation guidelines and license files are publicly available at NVIDIA NGC [61] (https://ngc.nvidia.com/catalog/models/nvidia:med:clara_train_covid19_exam_ehr_xray). The federated learning software is available as part of the Clara Train SDK (https://ngc.nvidia.com/catalog/containers/nvidia:clara-train-sdk). Alternatively, use this command to download the model: “wget --content-disposition https://api.ngc.nvidia.com/v2/models/nvidia/med/clara_train_covid19_exam_ehr_xray/versions/1/zip -O clara_train_covid19_exam_ehr_xray_1.zip”.

References

1. Budd, J. et al. Digital technologies in the public-health response to COVID-19. Nat. Med. 26, 1183–1192 (2020).
2. Moorthy, V., Henao Restrepo, A. M., Preziosi, M.-P. & Swaminathan, S. Data sharing for novel coronavirus (COVID-19). Bull. World Health Organ. 98, 150 (2020).
3. Chen, Q., Allot, A. & Lu, Z. Keep up with the latest coronavirus research. Nature 579, 193 (2020).
4. Fabbri, F., Bhatia, A., Mayer, A., Schlotter, B. & Kaiser, J. BCG IT spend pulse: how COVID-19 is shifting tech priorities (2020). https://www.bcg.com/publications/2020/how-covid-19-is-shifting-big-it-spend
5. Candelon, F., Reichert, T., Duranton, S., di Carlo, R. C. & De Bondt, M. The rise of the AI-powered company in the postcrisis world (2020). https://www.bcg.com/en-gb/publications/2020/business-applications-artificial-intelligence-post-covid
6. Chao, H. et al. Integrative analysis for COVID-19 patient outcome prediction. Med. Image Anal. 67, 101844 (2021).
7. Zhu, X. et al. Joint prediction and time estimation of COVID-19 developing severe symptoms using chest CT scan. Med. Image Anal. 67, 101824 (2021).
8. Yang, D. et al. Federated semi-supervised learning for Covid region segmentation in chest CT using multi-national data from China, Italy, Japan. Med. Image Anal. 70, 101992 (2021).
9. Minaee, S., Kafieh, R., Sonka, M., Yazdani, S. & Jamalipour Soufi, G.
Deep-COVID: predicting COVID-19 from chest X-ray images using deep transfer learning. Med. Image Anal. 65, 101794 (2020).
10. COVID-19 Studies from the World Health Organization Database (2020). https://clinicaltrials.gov/ct2/who_table
11. ACTIV (2020). https://www.nih.gov/research-training/medical-research-initiatives/activ
12. Coronavirus Treatment Acceleration Program (CTAP). US Food and Drug Administration (2020). https://www.fda.gov/drugs/coronavirus-covid-19-drugs/coronavirus-treatment-acceleration-program-ctap
13. Gleeson, P., Davison, A. P., Silver, R. A. & Ascoli, G. A. A commitment to open source in neuroscience. Neuron 96, 964–965 (2017).
14. Piwowar, H. et al. The state of OA: a large-scale analysis of the prevalence and impact of open access articles. PeerJ 6, e4375 (2018).
15. European Society of Radiology (ESR). What the radiologist should know about artificial intelligence – an ESR white paper. Insights Imaging 10, 44 (2019).
16. Pesapane, F., Codari, M. & Sardanelli, F. Artificial intelligence in medical imaging: threat or opportunity? Radiologists again at the forefront of innovation in medicine. Eur. Radiol. Exp. 2, 35 (2018).
17. Price, W. N. 2nd & Cohen, I. G. Privacy in the age of medical big data. Nat. Med. 25, 37–43 (2019).
18. Liang, W. et al. Development and validation of a clinical risk score to predict the occurrence of critical illness in hospitalized patients with COVID-19. JAMA Intern. Med. 180, 1081–1089 (2020).
19. Wynants, L. et al. Prediction models for diagnosis and prognosis of covid-19 infection: systematic review and critical appraisal. Brit. Med. J. 369, m1328 (2020).
20. Zhang, L. et al. D-dimer levels on admission to predict in-hospital mortality in patients with Covid-19. J. Thromb. Haemost. 18, 1324–1329 (2020).
21. Sands, K. E. et al. Patient characteristics and admitting vital signs associated with coronavirus disease 2019 (COVID-19)-related mortality among patients admitted with noncritical illness (2020).
https://doi.org/10.1017/ice.2020.461
22. American College of Radiology. ACR recommendations for the use of chest radiography and computed tomography (CT) for suspected COVID-19 infection (2020). https://www.acr.org/Advocacy-and-Economics/ACR-Position-Statements/Recommendations-for-Chest-Radiography-and-CT-for-Suspected-COVID19-Infection
23. Rubin, G. D. et al. The role of chest imaging in patient management during the COVID-19 pandemic: a multinational consensus statement from the Fleischner Society. Radiology 296, 172–180 (2020).
24. World Health Organization. Use of chest imaging in COVID-19 (2020). https://www.who.int/publications/i/item/use-of-chest-imaging-in-covid-19
25. Jamil, S. et al. Diagnosis and management of COVID-19 disease. Am. J. Respir. Crit. Care Med. 201, 10 (2020).
26. Redmond, C. E., Nicolaou, S., Berger, F. H., Sheikh, A. M. & Patlas, M. N. Emergency radiology during the COVID-19 pandemic: The Canadian Association of Radiologists Recommendations for Practice. Can. Assoc. Radiologists J. 71, 425–430 (2020).
27. Buch, V. et al. Development and validation of a deep learning model for prediction of severe outcomes in suspected COVID-19 infection. Preprint at https://arxiv.org/abs/2103.11269 (2021).
28. Lyons, C. & Callaghan, M. The use of high-flow nasal oxygen in COVID-19. Anaesthesia 75, 843–847 (2020).
29. Whittle, J. S., Pavlov, I., Sacchetti, A. D., Atwood, C. & Rosenberg, M. S. Respiratory support for adult patients with COVID-19. J. Am. Coll. Emerg. Physicians Open 1, 95–101 (2020).
30. Ai, J., Li, Y., Zhou, X. & Zhang, W. COVID-19: treating and managing severe cases. Cell Res. 30, 370–371 (2020).
31. Esteva, A. et al. A guide to deep learning in healthcare. Nat. Med. 25, 24–29 (2019).
32. Cahan, E. M., Hernandez-Boussard, T., Thadaney-Israni, S. & Rubin, D. L. Putting the data before the algorithm in big data addressing personalized healthcare. NPJ Digit. Med. 2, 78 (2019).
33. Thrall, J. H. et al.
Artificial intelligence and machine learning in radiology: opportunities, challenges, pitfalls, and criteria for success. J. Am. Coll. Radiol. 15, 504–508 (2018).
34. Shilo, S., Rossman, H. & Segal, E. Axes of a revolution: challenges and promises of big data in healthcare. Nat. Med. 26, 29–38 (2020).
35. Gao, Y. & Cui, Y. Deep transfer learning for reducing health care disparities arising from biomedical data inequality. Nat. Commun. 11, 5131 (2020).
36. Rieke, N. et al. The future of digital health with federated learning. NPJ Digit. Med. 3, 119 (2020).
37. Yang, Q., Liu, Y., Chen, T. & Tong, Y. Federated machine learning: concept and applications. ACM Trans. Intell. Syst. Technol. 10, 12 (2019).
38. Ma, C. et al. On safeguarding privacy and security in the framework of federated learning. IEEE Netw. 34, 242–248 (2020).
39. Brisimi, T. S. et al. Federated learning of predictive models from federated Electronic Health Records. Int. J. Med. Inform. 112, 59–67 (2018).
40. Roth, H. R. et al. Federated learning for breast density classification: a real-world implementation. In Proc. Second MICCAI Workshop, DART 2020 and First MICCAI Workshop, DCL 2020: Domain Adaptation and Representation Transfer, and Distributed and Collaborative Learning (eds Albarqouni, S. et al.) Vol. 12444, 181–191 (Springer International Publishing, 2020).
41. Sheller, M. J. et al. Federated learning in medicine: facilitating multi-institutional collaborations without sharing patient data. Sci. Rep. 10, 12598 (2020).
42. Remedios, S. W., Butman, J. A., Landman, B. A. & Pham, D. L. in Federated Gradient Averaging for Multi-Site Training with Momentum-Based Optimizers (eds Remedios, S. W. et al.) (Springer, 2020).
43. Xu, Y. et al. A collaborative online AI engine for CT-based COVID-19 diagnosis. Preprint at https://www.medrxiv.org/content/10.1101/2020.05.10.20096073v2 (2020).
44. Raisaro, J. L. et al. SCOR: A secure international informatics infrastructure to investigate COVID-19. J. Am. Med. Inform. Assoc. 27, 1721–1726 (2020).
45. Vaid, A. et al. Federated learning of electronic health records to improve mortality prediction in hospitalized patients with COVID-19: machine learning approach. JMIR Med. Inform. 9, e24207 (2021).
46. Nino, G. et al. Pediatric lung imaging features of COVID-19: a systematic review and meta-analysis. Pediatr. Pulmonol. 56, 252–263 (2021).
47. Fredrikson, M., Jha, S. & Ristenpart, T. Model inversion attacks that exploit confidence information and basic countermeasures. In Proc. 22nd ACM SIGSAC Conference on Computer and Communications Security 1322–1333 (2015). https://doi.org/10.1145/2810103.2813677
48. Zhu, L., Liu, Z. & Han, S. in Advances in Neural Information Processing Systems 32 (eds Wallach, H. et al.) 14774–14784 (Curran Associates, Inc., 2019).
49. Kaissis, G. A., Makowski, M. R., Rückert, D. & Braren, R. F. Secure, privacy-preserving and federated machine learning in medical imaging. Nat. Mach. Intell. 2, 305–311 (2020).
50. Li, W. et al. in Privacy-Preserving Federated Brain Tumour Segmentation 133–141 (Springer, 2019).
51. Shokri, R. & Shmatikov, V. Privacy-preserving deep learning. In Proc. 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton) (2015). https://doi.org/10.1109/allerton.2015.7447103
52. Li, X. et al. Multi-site fMRI analysis using privacy-preserving federated learning and domain adaptation: ABIDE results. Med. Image Anal. 65, 101765 (2020).
53. Estiri, H. et al. Predicting COVID-19 mortality with electronic medical records. NPJ Digit. Med. 4, 15 (2021).
54. Jiang, G. et al. Harmonization of detailed clinical models with clinical study data standards. Methods Inf. Med. 54, 65–74 (2015).
55. Yang, D. et al. in Searching Learning Strategy with Reinforcement Learning for 3D Medical Image Segmentation (2019). https://doi.org/10.1007/978-3-030-32245-8_1
56. Elsken, T., Metzen, J. H. & Hutter, F. Neural architecture search: a survey. J. Mach. Learning Res. 20, 1–21 (2019).
57. Yao, Q. et al.
Taking human out of learning applications: a survey on automated machine learning. Preprint at https://arxiv.org/abs/1810.13306 (2019).
58. Ioffe, S. & Szegedy, C. Batch normalization: accelerating deep network training by reducing internal covariate shift. In Proc. 32nd International Conf. Machine Learning, PMLR 37, 448–456 (2015).
59. Kaufman, S., Rosset, S. & Perlich, C. Leakage in data mining: formulation, detection, and avoidance. In Proc. 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 556–563 (2011).
60. Zhang, C. et al. BatchCrypt: efficient homomorphic encryption for cross-silo federated learning. In Proc. 2020 USENIX Annual Technical Conference, ATC 2020, 493–506 (2020).
61. Nvidia NGC Catalog: COVID-19 Related Models (2020). https://ngc.nvidia.com/catalog/models?orderBy=scoreDESC&pageNumber=0&query=covid&quickFilter=models&filters
62. Marini, J. J. & Gattinoni, L. Management of COVID-19 respiratory distress. JAMA 323, 2329–2330 (2020).
63. Cook, T. M. et al. Consensus guidelines for managing the airway in patients with COVID-19: Guidelines from the Difficult Airway Society, the Association of Anaesthetists the Intensive Care Society, the Faculty of Intensive Care Medicine and the Royal College of Anaesthetists. Anaesthesia 75, 785–799 (2020).
64. Galloway, J. B. et al. A clinical risk score to identify patients with COVID-19 at high risk of critical care admission or death: an observational cohort study. J. Infect. 81, 282–288 (2020).
65. Kilaru, A. S. et al. Return hospital admissions among 1419 COVID-19 patients discharged from five U.S. emergency departments. Acad. Emerg. Med. 27, 1039–1042 (2020).
66. He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proc. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016). https://doi.org/10.1109/cvpr.2016.90
67. Irvin, J. et al. CheXpert: a large chest radiograph dataset with uncertainty labels and expert comparison. Proc. AAAI Conf. Artif. Intell. 33, 590–597 (2019).
68. Wang, R., Fu, B., Fu, G. & Wang, M. Deep & Cross network for Ad Click predictions. In Proc. ADKDD’17 Article no. 12 (2017).
69. Abadi, M. et al. TensorFlow: a system for large-scale machine learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), USENIX Association 265–283 (2016).
70. NVIDIA Clara Imaging (2020). https://developer.nvidia.com/clara-medical-imaging
71. Stekhoven, D. J. & Bühlmann, P. MissForest–non-parametric missing value imputation for mixed-type data. Bioinformatics 28, 112–118 (2012).
72. McMahan, H., Moore, E., Ramage, D., Hampson, S. & y Arcas, B. A. Communication-efficient learning of deep networks from decentralized data (2017). http://proceedings.mlr.press/v54/mcmahan17a.html
73. Hsieh, K., Phanishayee, A., Mutlu, O. & Gibbons, P. B. The non-IID data quagmire of decentralized machine learning. In Proc. 37th International Conf. Machine Learning, PMLR 119 (2020).
74. Robin, X. et al. pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics 12, 77 (2011).

Acknowledgements

MGB thanks the following individuals for their support: J. Brink, Department of Radiology, Massachusetts General Hospital, Boston, MA; N. Guo, Center for Advanced Medical Computing and Analysis, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA; J. K. Cramer, Director, Center for Clinical Data Science, Massachusetts General Brigham, Boston, MA; T. Schultz, Center for Biomedical Imaging, Massachusetts General Hospital, Boston, MA; S. Pomantz, Department of Radiological Computing and Analysis, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA; and J. K. Cramer, Director, QTIM lab at the Athinoula A. Martinos Center for Biomedical Imaging, MGH. The Faculty of Medicine, Chulalongkorn University thanks the Ratchadapisek Sompoch Endowment Fund RA (PO) (no.
001/63) for the collection and management of COVID-19-related clinical data and biological specimens for the Research Task Force, Faculty of Medicine, Chulalongkorn University. NIHR Cambridge Biomedical Research Centre thanks A. Priest, who is supported by the NIHR (Cambridge Biomedical Research Centre at Cambridge University Hospitals NHS Foundation Trust). The National Taiwan University MeDA Lab and MAHC and the Taiwan National Health Insurance Administration thank the MOST Joint Research Center for AI Technology and All Vista, National Health Insurance Administration, Taiwan. https://data.ucsf.edu/covid19

This article is available on Nature under a CC BY 4.0 Deed (Attribution 4.0 International) license.