A New Privacy-First AI Predicts COVID Severity Using X-Rays and Medical Records

Authors: Ittai Dayan, Holger R. Roth, Aoxiao Zhong, Ahmed Harouni, Amilcare Gentili, Anas Z. Abidin, Andrew Liu, Anthony Beardsworth Costa, Bradford J. Wood, Chien-Sung Tsai, Chih-Hung Wang, Chun-Nan Hsu, C. K. Lee, Peiying Ruan, Daguang Xu, Dufan Wu, Eddie Huang, Felipe Campos Kitamura, Griffin Lacey, Gustavo César de Antônio Corradi, Gustavo Nino, Hao-Hsin Shin, Hirofumi Obinata, Hui Ren, Jason C. Crane, Jesse Tetreault, Jiahui Guan, John W. Garrett, Joshua D. Kaggie, Jung Gil Park, Keith Dreyer, Krishna Juluru, Kristopher Kersten, Marcio Aloisio Bezerra Cavalcanti Rockenbach, Marius George Linguraru, Masoom A. Haider, Meena AbdelMaseeh, Nicola Rieke, Pablo F. Damasceno, Pedro Mario Cruz e Silva, Pochuan Wang, Sheng Xu, Shuichi Kawano, Sira Sriswasdi, Soo Young Park, Thomas M. Grist, Varun Buch, Watsamon Jantarabenjakul, Weichung Wang, Won Young Tak, Xiang Li, Xihong Lin, Young Joon Kwon, Abood Quraini, Andrew Feng, Andrew N. Priest, Baris Turkbey, Benjamin Glicksberg, Bernardo Bizzo, Byung Seok Kim, Carlos Tor-Díez, Chia-Cheng Lee, Chia-Jung Hsu, Chin Lin, Chiu-Ling Lai, Christopher P. Hess, Colin Compas, Deepeksha Bhatia, Eric K. Oermann, Evan Leibovitz, Hisashi Sasaki, Hitoshi Mori, Isaac Yang, Jae Ho Sohn, Krishna Nand Keshava Murthy, Li-Chen Fu, Matheus Ribeiro Furtado de Mendonça, Mike Fralick, Min Kyu Kang, Mohammad Adil, Natalie Gangai, Peerapon Vateekul, Pierre Elnajjar, Sarah Hickman, Sharmila Majumdar, Shelley L. McLeod, Sheridan Reed, Stefan Gräf, Stephanie Harmon, Tatsuya Kodama, Thanyawee Puthanakit, Tony Mazzulli, Vitor Lima de Lavor, Yothin Rakvongthai, Yu Rim Lee, Yuhong Wen, Fiona J. Gilbert, Mona G. Flores, Quanzheng Li

Abstract

Federated learning (FL) is a method used for training artificial intelligence models with data from multiple sources while maintaining data anonymity, thus removing many barriers to data sharing. Here we used data from 20 institutes across the globe to train a FL model, called EXAM (electronic medical record (EMR) chest X-ray AI model), that predicts the future oxygen requirements of symptomatic patients with COVID-19 using inputs of vital signs, laboratory data and chest X-rays.
EXAM achieved an average area under the curve (AUC) >0.92 for predicting outcomes at 24 and 72 h from the time of initial presentation to the emergency room.

Main

The scientific, academic, medical and data science communities have come together in the face of the COVID-19 pandemic crisis to rapidly assess novel paradigms in artificial intelligence (AI) that are rapid and secure, and potentially incentivize data sharing and model training and testing without the usual privacy and data ownership hurdles of conventional collaborations. Healthcare providers, researchers and industry have pivoted their focus to address unmet and critical clinical needs created by the crisis, with remarkable results. Clinical trial recruitment has been expedited and facilitated by national regulatory bodies and an international cooperative spirit. The data analytics and AI disciplines have always fostered open and collaborative approaches, embracing concepts such as open-source software, reproducible research, data repositories and making anonymized datasets publicly available. The pandemic has emphasized the need to expeditiously conduct data collaborations that empower the clinical and scientific communities when responding to rapidly evolving and widespread global challenges. Data sharing has ethical, regulatory and legal complexities that are underscored, and perhaps somewhat complicated, by the recent entrance of large technology companies into the healthcare data world.

A concrete example of these types of collaboration is our previous work on an AI-based SARS-COV-2 clinical decision support (CDS) model. This CDS model was developed at Mass General Brigham (MGB) and was validated across multiple health systems’ data.
The inputs to the CDS model were chest X-ray (CXR) images, vital signs, demographic data and laboratory values that were shown in previous publications to be predictive of outcomes of patients with COVID-19. CXR was selected as the imaging input because it is widely available and commonly indicated by guidelines such as those provided by ACR, the Fleischner Society, the WHO, national thoracic societies, national health ministry COVID handbooks and radiology societies throughout the world. The output of the CDS model was a score, termed CORISK, that corresponds to oxygen support requirements and that could aid in triaging patients by frontline clinicians. Healthcare providers have been known to prefer models that were validated on their own data. To date, most AI models, including the aforementioned CDS model, have been trained and validated on ‘narrow’ data that often lack diversity, potentially resulting in overfitting and lower generalizability. This can be mitigated by training with diverse data from multiple sites without centralization of data, using methods such as transfer learning or FL. FL is a method used to train AI models on disparate data sources, without the data being transported or exposed outside their original location.

Federated learning supports the rapid launch of centrally orchestrated experiments with improved traceability of data and assessment of algorithmic changes and impact. One approach to FL, called client-server, sends an ‘untrained’ model to other servers (‘nodes’) that conduct partial training tasks, in turn sending the results back to be merged in the central (‘federated’) server.

Governance of data for FL is maintained locally, alleviating privacy concerns, with only model weights or gradients communicated between client sites and the federated server.
FL has already shown promise in recent medical imaging applications, including in COVID-19 analysis. A notable example is a mortality prediction model in patients infected with SARS-COV-2 that uses clinical features, albeit limited in terms of number of modalities and scale.

Our objective was to develop a robust, generalizable model that could assist in triaging patients. We theorized that the CDS model could be federated successfully, given its use of data inputs that are relatively common in clinical practice and that do not rely heavily on operator-dependent assessments of patient condition (such as clinical impressions or reported symptoms). Rather, laboratory results, vital signs, an imaging study and a commonly captured demographic (that is, age) were used. We therefore retrained the CDS model with diverse data using a client-server FL approach to develop a new global FL model, which was named EXAM. Our hypothesis was that EXAM would perform better than local models and would generalize better across healthcare systems.

Results

The EXAM model architecture

The EXAM model is based on the CDS model mentioned above. In total, 20 features (19 from the EMR and one CXR) were used as input to the model. The outcome (that is, ‘ground truth’) labels were assigned based on patient oxygen therapy after 24- and 72-h periods from initial admission to the emergency department (ED).

The outcome labels of patients were set to 0, 0.25, 0.50 and 0.75 depending on the most intensive oxygen therapy the patient received in the prediction window. The oxygen therapy categories were, respectively, room air (RA), low-flow oxygen (LFO), high-flow oxygen (HFO)/noninvasive ventilation (NIV) or mechanical ventilation (MV).
If the patient died within the prediction window, the outcome label was set to 1. For EMR features, only the first values captured in the ED were used, and data preprocessing included deidentification, missing value imputation and normalization to zero-mean and unit variance.

The model therefore fuses information from both EMR and CXR features, using a 34-layer convolutional neural network (ResNet34) to extract features from a CXR and a Deep & Cross network to concatenate the features together with the EMR features (for further details, see Methods). The model output is a risk score, termed the EXAM score, which is a continuous value in the range 0–1 for each of the 24- and 72-h predictions corresponding to the labels described above.

Federating the model

The EXAM model was trained using a cohort of 16,148 cases, making it not only among the first FL models for COVID-19 but also a very large, multicontinent development project in clinically relevant AI (Fig. 1a,b). Data between sites were not harmonized before extraction and, in light of real-life clinical informatics circumstances, a meticulous harmonization of the data inputs was not conducted by the authors (Fig. 1c,d).

a, World map indicating the 20 different client sites contributing to the EXAM study. b, Number of cases contributed by each institution or site (client 1 represents the site contributing the largest number of cases). c, Chest X-ray intensity distribution at each client site. d, Age of patients at each client site, showing minimum and maximum ages (asterisks), mean age (triangles) and standard deviation (horizontal bars). The number of samples of each client site is shown in Supplementary Table 1.
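
The outcome-label scheme described in this section (0, 0.25, 0.50 and 0.75 for the most intensive oxygen therapy in the window, and 1 for death) can be written as a small lookup. This is an illustrative sketch only: the dictionary keys reuse the abbreviations from the text, while the function name `outcome_label` and its arguments are assumptions made for the example, not part of the EXAM code.

```python
# Label values taken from the text; the names below are hypothetical.
OXYGEN_LABELS = {"RA": 0.0, "LFO": 0.25, "HFO/NIV": 0.50, "MV": 0.75}


def outcome_label(therapies_in_window, died_in_window=False):
    """Return the outcome label for one prediction window (24 h or 72 h).

    therapies_in_window: oxygen therapy categories recorded in the window.
    died_in_window: True if the patient died within the window.
    """
    if died_in_window:
        return 1.0
    # The label follows the most intensive therapy received in the window.
    return max(OXYGEN_LABELS[t] for t in therapies_in_window)


label_24h = outcome_label(["RA", "LFO"])
label_72h = outcome_label(["LFO", "MV", "HFO/NIV"])
label_death = outcome_label(["RA"], died_in_window=True)
```

Each case would receive two such labels, one for the 24-h window and one for the 72-h window.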

For the prediction of 24-h oxygen treatment, the best global model showed a significant average performance improvement (P ≪ 1 × 10⁻³, Wilcoxon signed-rank test) of 16% over models trained only on a site’s own data (as defined by average AUC when running the model on respective local test sets: from 0.795 to 0.920, or 12.5 percentage points) (Fig. 2a). It also resulted in a 38% improvement in generalizability (as defined by average AUC when running the model on all test sets: from 0.667 to 0.920, or 25.3 percentage points) of the best global model for prediction of 24-h oxygen treatment compared with models trained only on a site’s own data (Fig. 2b). For the prediction results of 72-h oxygen treatment, the best global model training resulted in an average performance improvement of 18% compared to locally trained models, while generalizability of the global model improved on average by 34% (Extended Data Fig. 1). The stability of our results was validated by repeating three runs of local and FL training on different randomized data splits.

a, Performance on each client’s test set in prediction of 24-h oxygen treatment for models trained on local data only (Local) versus that of the best global model available on the server (FL (global best)). b, Generalizability (average performance on other sites’ test data, as represented by average AUC) as a function of a client’s dataset size (no. of cases). The green horizontal line denotes the generalizability performance of the best global model. … and client 14 had cases only with RA treatment, such that the evaluation metric (av. AUC) was not applicable in either of these cases (Methods).
Data for client 14 were also excluded from computation of average generalizability in local models.

Local models that were trained using unbalanced cohorts (for example, mostly mild cases of COVID-19) markedly benefited from the FL approach, with a substantial improvement in average AUC prediction performance for categories with only a few cases. This was evident at client site 16 (an unbalanced dataset), with most patients experiencing mild disease severity and with only a few severe cases (Fig. 3a and Extended Data Fig. 2). More important, the generalizability of the FL model was considerably increased over the locally trained model.

a, ROC at client site 16, with unbalanced data and mostly mild cases. b, ROC of the local model at client site 12 (a small dataset), mean ROC of models trained on larger datasets corresponding to the five client sites in the Boston area (1, 4, 5, 6, 8) and ROC of the best global model in prediction of 72-h oxygen treatment for different thresholds of EXAM score (left, middle, right). The mean ROC is calculated based on five locally trained models while the gray area denotes the ROC standard deviation. ROCs for three different cutoff values (t) of the EXAM risk score are shown. Pos and neg denote the number of positive and negative cases, respectively, as defined by this range of EXAM score.

In the case of client sites with relatively small datasets, the best FL model markedly outperformed not only the local model but also those trained on larger datasets from five client sites in the Boston area of the USA (Fig. 3b).

The global model performed well in predicting oxygen needs at 24/72 h in patients both COVID positive and negative (Extended Data Fig. 3).
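
The average AUC used throughout these results is described in the Methods as the mean over three classification tasks (≥LFO, ≥HFO/NIV or ≥MV). A minimal sketch of that metric follows, under two assumptions that the text does not spell out: each task is formed by thresholding the outcome label at 0.25, 0.50 and 0.75, and a task is skipped when only one class is present. The function names and toy data are hypothetical.

```python
# Cutoffs follow the label values given in the text; forming the binary
# tasks by thresholding the label is an assumption of this sketch.
THRESHOLDS = {">=LFO": 0.25, ">=HFO/NIV": 0.50, ">=MV": 0.75}


def auc(scores, positives):
    """Pairwise AUC: chance that a positive case outscores a negative one."""
    pos = [s for s, p in zip(scores, positives) if p]
    neg = [s for s, p in zip(scores, positives) if not p]
    if not pos or not neg:
        return None  # AUC is undefined when one class is absent
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_auc(scores, labels):
    """Mean AUC over the three threshold tasks that can be evaluated."""
    values = []
    for cutoff in THRESHOLDS.values():
        value = auc(scores, [y >= cutoff for y in labels])
        if value is not None:
            values.append(value)
    return sum(values) / len(values) if values else None


# Toy example: five cases with one risk score each.
toy_labels = [0.0, 0.25, 0.50, 0.75, 1.0]
toy_scores = [0.1, 0.3, 0.2, 0.8, 0.9]
toy_average = average_auc(toy_scores, toy_labels)

# A site whose cases are all RA has no positive class for any task.
ra_only = average_auc([0.1, 0.2], [0.0, 0.0])
```

In this sketch a site with RA-only cases yields no defined AUC, which is consistent with the caption's note that the metric was not applicable to client 14.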

Validation at independent sites

Following initial training, EXAM was subsequently tested at three independent validation sites: Cooley Dickinson Hospital (CDH), Martha’s Vineyard Hospital (MVH) and Nantucket Cottage Hospital (NCH), all in Massachusetts, USA. The model was not retrained at these sites and it was used only for validation purposes. The cohort size and model inference results are summarized in Table 2, and the ROC curves and confusion matrices for the largest dataset (from CDH) are shown in Fig. 4. The operating point was set to discriminate between nonmechanical ventilation and mechanical ventilation (MV) treatment (or death). The globally trained FL model, EXAM, achieved an average AUC of 0.944 and 0.924 for 24- and 72-h prediction tasks, respectively (Table 2). For prediction of MV treatment (or death) at 24 h, EXAM achieved a sensitivity of 0.950 and specificity of 0.882 at CDH, and a sensitivity of 1.000 and specificity of 0.934 at MVH. NCH did not have any cases with MV/death at 24 h. With regard to 72-h MV prediction, EXAM achieved a sensitivity of 0.929 and specificity of 0.880 at CDH, sensitivity of 1.000 and specificity of 0.976 at MVH and sensitivity of 1.000 and specificity of 0.929 at NCH.

a,b, Performance (ROC) (top) and confusion matrices (bottom) of the EXAM FL model on the CDH dataset for prediction of oxygen requirement at 24 h (a) and 72 h (b). ROCs for three different cutoff values (t) of the EXAM risk score are shown.

For MV at CDH at 72 h, EXAM had a low false-negative rate of 7.1%. Of two false-negative cases from CDH, one had many missing EMR data features and the other had a CXR with a motion artifact and some missing EMR features.

Use of differential privacy

A primary motivation for healthcare institutes to use FL is to preserve the security and privacy of their data, as well as adherence to data compliance measures.
For FL, there remains the potential risk of model ‘inversion’ or even the reconstruction of training images from the model gradients themselves. To counter these risks, security-enhancing measures were used to mitigate risk in the event of data ‘interception’ during site-server communication. We experimented with techniques to avoid interception of FL data, and added a security feature that we believe could encourage more institutions to use FL. We thus validated previous findings showing that partial weight sharing, and other differential privacy techniques, can successfully be applied in FL. Through investigation of a partial weight-sharing scheme, we showed that models can reach a comparable performance even when only 25% of the weight updates are shared (Fig. 5).

Discussion

This study features a large, real-world healthcare FL study in terms of number of sites and number of data points used. We believe that it provides a powerful proof of concept of the feasibility of using FL for fast and collaborative development of needed AI models in healthcare. Our study involved multiple sites across four continents and under the oversight of different regulatory bodies, and thus holds the promise of being provided to different regulated markets in an expedited way. The global FL model, EXAM, proved to be more robust and achieved better results at individual sites than any model trained on only local data. We believe that consistent improvement was achieved owing to a larger, but also a more diverse, dataset, the use of …

For a client site with a relatively small dataset, two typical approaches could be used for fitting a useful model: one is to train locally with its own data, the other is to apply a model trained on a larger dataset.
For sites with small datasets, it would have been virtually impossible to build a performant deep learning model using only their local data. The finding that these two approaches were outperformed on all three prediction tasks by the global FL model indicates that the benefit for client sites with small datasets arising from participation in FL collaborations is substantial. This is probably a reflection of FL’s ability to capture more diversity than local training, and to mitigate the bias present in models trained on a homogeneous population. An under-represented population or age group in one hospital/region might be highly represented in another region, such as children who might be differentially affected by COVID-19, including disease manifestations in lung imaging.

The validation results confirmed that the global model is robust, supporting our hypothesis that FL-trained models are generalizable across healthcare systems. They provide a compelling case for the use of predictive algorithms in COVID-19 patient care, and the use of FL in model creation and testing. By participating in this study the client sites received access to EXAM, to be further validated ahead of pursuing any regulatory approval or future introduction into clinical care. Plans are under way to validate EXAM prospectively in ‘production’ settings at MGB leveraging COVID-19 targeted resources, as well as at different sites that were not a part of the EXAM training.

Over 200 prediction models to support decision-making in patients with COVID-19 have been published. Unlike the majority of publications focused on diagnosis of COVID-19 or prediction of mortality, we predicted oxygen requirements that have implications for patient management. We also used cases with unknown SARS-COV-2 status, and so the model could provide input to the physician ahead of receiving a result for PCR with reverse transcription (RT–PCR), making it useful for a real-life clinical setting.
The model’s imaging input is used in common practice, in contrast with models that use chest computed tomography, a diagnostic modality on which there is no consensus. The model’s design was constrained to objective predictors, unlike many published studies that leveraged subjective clinical impressions. The data collected reflect varied incidence rates, and thus the ‘population momentum’ we encountered is more diverse. This implies that the algorithm can be useful in populations with different incidence rates.

Patient cohort identification and data harmonization are not novel issues in research and data science, but are further complicated when using FL, given the lack of visibility on other sites’ datasets. Improvements to clinical information systems are needed to streamline data preparation, leading to better leverage of a network of sites participating in FL. This, in conjunction with hyperparameter engineering, can allow algorithms to ‘learn’ more effectively from larger data batches and adapt model parameters to a particular site for further personalization, for example through further fine-tuning on that site. A system that would allow seamless, close-to-real-time model inference and results processing would also be of benefit and would ‘close the loop’ from training to model deployment.

Because the data were not centralized, they are not readily accessible, and any future analysis of the results, beyond what was derived and collected, is therefore limited.

Similar to other machine learning models, EXAM is limited by the quality of the training data. Institutions interested in deploying this algorithm for clinical care need to understand potential biases in the training. For example, the labels used as ground truth in the training of the EXAM model were derived from 24- and 72-h oxygen consumption in the patient; it is assumed that oxygen delivered to the patient equates to the oxygen need.
However, in the early phase of the COVID-19 pandemic, many patients were provided high-flow oxygen prophylactically regardless of their oxygen need. Such clinical practice could skew the predictions made by this model.

Since our access to data was limited, we did not have sufficient information to generate detailed statistics regarding failure causes, post hoc, at most sites. However, we did study failure cases from the largest independent test site, CDH, and were able to generate hypotheses that we could test in the future.

In future, we also intend to investigate the potential for a ‘population drift’ due to different phases of disease progression. We believe that, owing to the diversity across the 20 sites, this risk may have been mitigated.

A feature that would enhance these kinds of large-scale collaboration is the ability to predict the contribution of each client site towards improving the global FL model. This will help in client site selection, and in prioritization of data acquisition and annotation efforts. The latter is especially important given the high costs and difficult logistics of these large-consortia endeavors, and it will enable these endeavors to capture diversity rather than the sheer quantity of data samples. Future approaches may incorporate automated hyperparameter searching, neural architecture search and other automated machine learning approaches to find the optimal training parameters for each client site more efficiently.

Known issues of batch normalization (BN) in FL motivated us to fix our base model for image feature extraction to reduce the divergence between unbalanced client sites. Future work might explore different types of normalization techniques to allow the training of AI models in FL more effectively when client data are not independent and identically distributed.

Recent works on privacy attacks within the FL setting have raised concerns on data leakage during model training. Meanwhile, protection algorithms remain underexplored and constrained by multiple factors. While differential privacy algorithms show good protection, they may weaken the model’s performance. Encryption algorithms, such as homomorphic encryption, maintain performance but may substantially increase message size and training time. A quantifiable way to measure privacy would allow better choices for deciding the minimal privacy parameters necessary while maintaining clinically acceptable performance.

Following further validation, we envision deployment of the EXAM model in the ED setting as a way to evaluate risk at both the per-patient and population level, and to provide clinicians with an additional reference point when performing the frequently difficult task of triaging patients.

Methods

Ethics approval

All procedures were conducted in accordance with the principles of the Declaration of Helsinki and the International Conference on Harmonization Good Clinical Practice guidelines, and were approved by the relevant institutional review boards at the following validation sites: CDH, MVH and NCH; and at the following training sites: MGB, Mass General Hospital (MGH), Brigham and Women’s Hospital, Newton-Wellesley Hospital, North Shore Medical Center and Faulkner Hospital (all eight of these hospitals were covered under one ethics board reference, no. 000 …).

MI-CLAIM guidelines for reporting of clinical AI models were followed (Supplementary Note 2).

Study setting

The study included data from 20 institutions (Fig. 1a): MGB, MGH, Brigham and Women’s Hospital, Newton-Wellesley Hospital, North Shore Medical Center and Faulkner Hospital; Children’s National Hospital in Washington, DC; NIHR Cambridge Biomedical Research Centre; the Self-Defense Forces Central Hospital in Tokyo; National Taiwan University MeDA Lab and MAHC and Taiwan National Health Insurance Administration; Tri-Service General Hospital in Taiwan; Kyungpook National University Hospital in South Korea; Faculty of Medicine, Chulalongkorn University in Thailand; Diagnosticos da America SA in Brazil; University of California, San Francisco; VA San Diego; University of Toronto; National Institutes of Health in Bethesda, Maryland; University of Wisconsin-Madison School of Medicine and Public Health; Memorial Sloan Ketter …

Data from three independent sites were used for independent validation: CDH, MVH and NCH, all in Massachusetts, USA. These three hospitals had patient population characteristics different from those of the training sites. The data used for the algorithm validation consisted of patients admitted to the ED at these sites between March 2020 and February 2021, and who satisfied the same inclusion criteria as the data used to train the FL model.

Data collection

The 20 client sites prepared a total of 16,148 cases (both positive and negative) for the purposes of training, validation and testing of the model (Fig. 1b). Medical data were accessed in relation to patients who satisfied the study inclusion criteria. Client sites strived to include all COVID-positive cases from the beginning of the pandemic in December 2019 and up to the time they started local training for the EXAM study. All local training had started by 30 September 2020. The sites also included other patients in the same period with negative RT–PCR test results.
Since most of the sites had more SARS-COV-2-negative than -positive patients, we limited the number of negative patients included to, at most, 95% of the total cases at each client site.

A ‘case’ included a CXR and the requisite data inputs taken from the patient’s medical record. A breakdown of the cohort size of the dataset for each client site is shown in Fig. 1b. The distribution and patterns of CXR image intensity (pixel values) varied greatly among sites owing to a multitude of patient- and site-specific factors, such as different device manufacturers and imaging protocols, as shown in Fig. 1c,d. Patient age and EMR feature distribution varied greatly among sites, as expected owing to the differing demographics between globally distributed hospitals (Extended Data Fig. 6).

Patient inclusion criteria

Patient inclusion criteria were: (1) patient presented to the hospital’s ED or equivalent; (2) patient had a RT–PCR test performed at any time between presentation to the ED and discharge from the hospital; (3) patient had a CXR in the ED; and (4) patient’s record had at least five of the EMR values detailed in Table 1, all obtained in the ED, and the relevant outcomes captured during hospitalization. Of note, the CXR, laboratory results and vitals used were the first available for capture during the visit to the ED. The model did not incorporate any CXR, laboratory results or vitals acquired after leaving the ED.

Model input

In total, 21 EMR features were used as input to the model. The outcome (that is, ground truth) labels were assigned based on patient requirements after 24- and 72-h periods from initial admission to the ED. A detailed list of the requested EMR features and outcomes can be seen in Table 1.

The distribution of oxygen treatment using different devices at different client sites is shown in Extended Data Fig. 7, which details the device usage at admission to the ED and after 24- and 72-h periods.
The difference in dataset distribution between the largest and smallest client sites can be seen in Extended Data Fig. 8.

The number of positive COVID-19 cases, as confirmed by a single RT–PCR test obtained at any time between presentation to the ED and discharge from the hospital, is listed in Supplementary Table 1. Each client site was asked to randomly split its dataset into three parts: 70% for training, 10% for validation and 20% for testing. For both 24- and 72-h outcome prediction models, random splits for each of the three repeated local and FL training and evaluation experiments were independently generated.

EXAM model development

There is wide variation in the clinical course of patients who present to hospital with symptoms of COVID-19, with some experiencing rapid deterioration in respiratory function requiring different interventions to prevent or mitigate hypoxemia. A critical decision made during the evaluation of a patient at the initial point of care, or in the ED, is whether the patient is likely to require more invasive or resource-limited countermeasures or interventions (such as MV or monoclonal antibodies), and should therefore receive a scarce but effective therapy, a therapy with a narrow risk–benefit ratio due to side effects or a higher level of care, such as admittance to the intensive care unit. In contrast, a patient who is at lower risk of requiring invasive oxygen therapy may be placed in a less intensive care setting such as a regular ward, or even released from the ED for continuing self-monitoring at home. EXAM was developed to help triage such patients.

Of note, the model is not approved by any regulatory agency at this time and it should be used only for research purposes.

EXAM score

EXAM was trained using FL; it outputs a risk score (termed EXAM score) similar to CORISK (Extended Data Fig. 9a) and can be used in the same way to triage patients.
It corresponds to a patient’s oxygen support requirements within two windows—24 and 72 h—after initial presentation to the ED. Extended Data Fig. illustrates how CORISK and the EXAM score can be used for patient triage. 27 9a 9b Chest X-ray images were preprocessed to select the anterior position image and exclude lateral view images, and then scaled to a resolution of 224 × 224. As shown in Extended Data Fig. , the model fuses information from both EMR and CXR features (based on a modified ResNet34 with spatial attention Pre-entrenat en el conjunt de dades de CheXpert) and the Deep & Cross network . To converge these different data types, a 512-dimensional feature vector was extracted from each CXR image using a pretrained ResNet34, with spatial attention, then concatenated with the EMR features as the input for the Deep & Cross network. The final output was a continuous value in the range 0–1 for both 24- and 72-h predictions, corresponding to the labels described above, as shown in Extended Data Fig. . We used cross-entropy as the loss function and ‘Adam’ as the optimizer. The model was implemented in Tensorflow using the NVIDIA Clara Train SDK . The average AUC for the classification tasks (≥LFO, ≥HFO/NIV or ≥MV) was calculated and used as the final evaluation metric, with normalization to zero-mean and unit variance. CXR images were preprocessed to select the correct series and exclude lateral view images, then scaled to a resolution of 224 × 224 (ref. ). 9a 66 67 68 9b 69 70 27 Imputació i normalització A MissForest algorithm was used to impute EMR features, based on the local training dataset. If an EMR feature was completely missing from a client site dataset, the mean value of that feature, calculated exclusively on data from MGB client sites, was used. Then, EMR features were rescaled to zero-mean and unit variance based on statistics calculated on data from the MGB client sites. 
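The fallback and rescaling steps described for the EMR features can be sketched as follows. This is a minimal illustration and not the study's code: the function name `fill_and_rescale`, the use of `None` for missing values and the reference statistics `ref_mean` and `ref_std` (standing in for values computed on MGB client-site data) are assumptions. Partially missing features are filled with the local observed mean purely as a placeholder for the MissForest imputation used in the study.

```python
def fill_and_rescale(rows, ref_mean, ref_std):
    """Impute missing EMR values (None) and rescale with reference statistics.

    rows     : list of per-case feature lists from one client site
    ref_mean : per-feature means from the reference sites (hypothetical values)
    ref_std  : per-feature standard deviations from the reference sites
    """
    n_features = len(ref_mean)
    fill_values = []
    for j in range(n_features):
        observed = [row[j] for row in rows if row[j] is not None]
        if not observed:
            # Feature completely missing at this site: fall back to the reference mean.
            fill_values.append(ref_mean[j])
        else:
            # Placeholder for MissForest (assumption): use the local observed mean.
            fill_values.append(sum(observed) / len(observed))

    scaled = []
    for row in rows:
        filled = [fill_values[j] if row[j] is None else row[j]
                  for j in range(n_features)]
        # Zero-mean, unit-variance rescaling using the reference statistics.
        scaled.append([(filled[j] - ref_mean[j]) / ref_std[j]
                       for j in range(n_features)])
    return scaled


# Toy site: feature 0 is partly missing, feature 1 is missing everywhere.
site_rows = [[1.0, None], [3.0, None], [None, None]]
print(fill_and_rescale(site_rows, ref_mean=[2.0, 10.0], ref_std=[1.0, 5.0]))
# → [[-1.0, 0.0], [1.0, 0.0], [0.0, 0.0]]
```

Rescaling every site with the same reference statistics, rather than each site's own, keeps a given feature value on the same scale wherever the model is trained or evaluated.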
Details of EMR–CXR data fusion using the Deep & Cross network

To model the interactions of features from EMR and CXR data at the case level, a deep-feature scheme was used based on a Deep & Cross network architecture (ref. 68). Binary and categorical features for the EMR inputs, as well as 512-dimensional image features in the CXR, were transformed into fused dense vectors of real values by embedding and stacking layers. The transformed dense vectors served as input to the fusion framework, which specifically employed a crossing network to enforce fusion among input from different sources. The crossing network performed explicit feature crossing within its layers, by conducting inner products between the original input feature and output from the previous layer, thus increasing the degree of interaction across features. At the same time, two individual classic deep neural networks with several stacked, fully connected feed-forward layers were trained. The final output of our framework was then derived from the concatenation of both classic and crossing networks.

FL details

Arguably the most established form of FL is the implementation of the federated averaging algorithm as proposed by McMahan et al. (ref. 72), or variations thereof. This algorithm can be realized using a client-server setup where each participating site acts as a client. One can think of FL as a method aiming to minimize a global loss function by reducing a set of local loss functions, which are estimated at each site. By minimizing each client site’s local loss while also synchronizing the learned client site weights on a centralized aggregation server, one can minimize global loss without needing to access the entire dataset in a centralized location. Each client site learns locally, and shares model weight updates with a central server that aggregates contributions using secure sockets layer encryption and communication protocols. The server then sends an updated set of weights to each client site after aggregation, and sites resume training locally. The server and client site iterate back and forth until the model converges (Extended Data Fig. 9c).

A pseudoalgorithm of FL is shown in Supplementary Note 1. In our experiments, we set the number of federated rounds at T = 200, with one local training epoch per round t at each client. The number of clients, K, was up to 20 depending on the network connectivity of the clients or the data available for a specific outcome period (24 or 72 h). The weighting term n_k depends on the dataset size at each client k and is used to weigh each client’s contributions when aggregating the model weights in federated averaging. During the FL training task, each client site selects its best local model by tracking the model’s performance on its local validation set. At the same time, the server determines the best global model based on the average validation scores sent from each client site to the server after each FL round. After FL training finishes, the best local models and the best global model are automatically shared with all client sites and evaluated on their local test data.

The Adam optimizer was used for both local training and FL, with an initial learning rate of 5 × 10⁻⁵ and a stepwise learning rate decay by a factor of 0.5 after every 40 epochs, which is important for the convergence of federated averaging (ref. 73). Random affine transformations, including rotation, translation, shear and scaling, together with random intensity noise and shifts, were applied to the images for data augmentation during training.
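The aggregation step and learning-rate schedule described above can be sketched as follows. This is a toy illustration under stated assumptions, not the Clara Train implementation: model weights are flat Python lists, n_k is taken to be each client's dataset size, and `local_update` is a hypothetical callback standing in for one local training epoch.

```python
def federated_average(client_weights, client_sizes):
    """Aggregate client weight vectors, weighting client k by n_k."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


def step_decay_lr(epoch, initial_lr=5e-5, factor=0.5, step=40):
    """Stepwise decay: multiply the learning rate by `factor` every `step` epochs."""
    return initial_lr * factor ** (epoch // step)


def run_rounds(global_weights, local_update, client_sizes, rounds):
    """Alternate local training and server aggregation for `rounds` rounds."""
    for t in range(rounds):
        # One local epoch per round, so the epoch index equals the round index.
        lr = step_decay_lr(t)
        updated = [
            local_update(list(global_weights), k, lr)  # hypothetical local training
            for k in range(len(client_sizes))
        ]
        global_weights = federated_average(updated, client_sizes)
    return global_weights


# Two clients with 1 and 3 cases: the larger client dominates the average.
print(federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
# → [2.5, 3.5]
```

With T = 200 rounds and a decay step of 40, this schedule would halve the learning rate four times before the final round.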
Owing to the sensitivity of BN layers (ref. 58) when dealing with different clients in a nonindependent and identically distributed setting, we found the best model performance occurred when keeping the pretrained ResNet34 with spatial attention parameters fixed during FL training (that is, using a learning rate of zero for those layers). The Deep & Cross network that combines image features with EMR features does not contain BN layers and hence was not affected by BN instability issues.

In this study we investigated a privacy-preserving scheme that shares only partial model updates between server and client sites. The weight updates were ranked during each iteration by magnitude of contribution, and only a certain percentage of the largest weight updates was shared with the server. To be exact, weight updates (also known as gradients) were shared only if their absolute value was above a certain percentile threshold, τ_k(t) (Extended Data Fig. 5), which was computed from all non-zero gradients, ΔW_k(t), and could be different for each client k in each FL round t (ref. 49). Variations of this scheme could include additional clipping of large gradients or differential privacy schemes that add random noise to the gradients, or even to the raw data, before feeding into the network (ref. 51).

Statistical analysis

We conducted a Wilcoxon signed-rank test to confirm the significance of the observed improvement in performance between the locally trained model and the FL model for the 24- and 72-h time points (Fig. 2 and Extended Data Fig. 1). The null hypothesis was rejected with one-sided P ≪ 1 × 10⁻³ in both cases.

Pearson’s correlation was used to assess the generalizability (robustness of the average AUC value to other client sites’ test data) of locally trained models in relation to the respective local dataset size. The correlation was only moderate (r = 0.43, P = 0.035, degrees of freedom (df) = 17 for the 24-h model and r = 0.62, P = 0.003, df = 16 for the 72-h model). This indicates that dataset size alone is not the only factor determining a model’s robustness to unseen data.

To compare the ROC curves of the global FL model and the local models trained at different sites (Fig. 3), we bootstrapped 1,000 samples from the data and computed the resulting AUCs. We then calculated the difference between the two series and standardized it using the formula D = (AUC1 − AUC2)/s, where D is the standardized difference, s is the standard deviation of the bootstrap differences and AUC1 and AUC2 are the corresponding bootstrapped AUC series. By comparing D with the normal distribution, we obtained the P values illustrated in Supplementary Table 2. The results show that the null hypothesis was rejected with very low P values, indicating the statistical significance of the superiority of FL outcomes. The computation of P values was performed in R with the pROC library (ref. 74).

Because the model outputs a continuous score from 0 to 1 for a discrete outcome, a direct calibration evaluation such as a Q–Q plot is not possible. We performed one-way analysis of variance (ANOVA) tests to compare local and FL model scores among the four ground-truth categories (RA, LFO, HFO, MV) (Extended Data Fig. 10). The F-statistic, calculated as the variation between the sample means divided by the variation within the samples and representing the degree of dispersion among the different groups, was used to quantify the models. The F values of five different local sites are 245.7, 253.4, 342.3, 389.8 and 634.8, while that of the FL model is 843.5. Given that larger F values mean that groups are more separable, the scores from our FL model clearly show greater dispersion among the four ground-truth categories. Furthermore, the P value of the ANOVA test on the FL model is <2 × 10⁻¹⁶, indicating that the FL prediction scores are statistically significantly different among the different prediction classes.
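As a rough illustration of the F-statistic described above (variation between group means divided by variation within groups), the following sketch computes it for model scores grouped by ground-truth category. The source does not state which software was used for the ANOVA; the function name and the toy scores here are assumptions made only for illustration.

```python
def one_way_f_statistic(groups):
    """One-way ANOVA F: between-group mean square over within-group mean square."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Toy scores for three categories (hypothetical values, not study data).
toy_scores = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
print(one_way_f_statistic(toy_scores))
```

Moving the group means further apart while keeping the within-group spread fixed increases F, which is why a larger F is read as better separation of the outcome categories.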
Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability

These data were used for training at each of the local sites and were not shared with any of the other participating institutions or with the federated server, and they are not publicly available. Data from the independent validation sites are maintained by CAMCA, and access can be requested by contacting Q.L. Based on determination by CAMCA, a data-sharing review and an IRB amendment for research purposes can be conducted by MGB research administration, in accordance with MGB IRB and policy.

Code availability

All code and software used in this study are publicly available at NGC. To access them, log in as a guest or create a profile, then enter one of the URLs below. The trained models, data preparation guidelines, code for training, validation and testing of the model, readme file, installation guidelines and license files are publicly available at NVIDIA NGC (ref. 61): https://ngc.nvidia.com/catalog/models/nvidia:med:clara_train_covid19_exam_ehr_xray. The federated learning software is available as part of the Clara Train SDK: https://ngc.nvidia.com/catalog/containers/nvidia:clara-train-sdk. Alternatively, use this command to download the model: “wget --content-disposition https://api.ngc.nvidia.com/v2/models/nvidia/med/clara_train_covid19_exam_ehr_xray/versions/1/zip -O clara_train_covid19_exam_ehr_xray_1.zip”.

References

1. Budd, J. et al. Digital technologies in the public-health response to COVID-19. Nat. Med. 26, 1183–1192 (2020).
2. Moorthy, V., Henao Restrepo, A. M., Preziosi, M.-P. & Swaminathan, S. Data sharing for novel coronavirus (COVID-19). Bull. World Health Organ. 98, 150 (2020).
3. Chen, Q., Allot, A. & Lu, Z. Keep up with the latest coronavirus research. Nature 579, 193 (2020).
4. Fabbri, F., Bhatia, A., Mayer, A., Schlotter, B. & Kaiser, J. BCG IT spend pulse: how COVID-19 is shifting tech priorities. https://www.bcg.com/publications/2020/how-covid-19-is-shifting-big-it-spend (2020).
5. Candelon, F., Reichert, T., Duranton, S., di Carlo, R. C. & De Bondt, M. The rise of the AI-powered company in the postcrisis world. https://www.bcg.com/en-gb/publications/2020/business-applications-artificial-intelligence-post-covid (2020).
6. Chao, H. et al. Integrative analysis for COVID-19 patient outcome prediction. Med. Image Anal. 67, 101844 (2021).
7. Zhu, X. et al. Joint prediction and time estimation of COVID-19 developing severe symptoms using chest CT scan. Med. Image Anal. 67, 101824 (2021).
8. Yang, D. et al. Federated semi-supervised learning for COVID region segmentation in chest CT using multi-national data from China, Italy, Japan. Med. Image Anal. 70, 101992 (2021).
9. Minaee, S., Kafieh, R., Sonka, M., Yazdani, S. & Jamalipour Soufi, G. Deep-COVID: predicting COVID-19 from chest X-ray images using deep transfer learning. Med. Image Anal. 65, 101794 (2020).
10. COVID-19 Studies from the World Health Organization Database. https://clinicaltrials.gov/ct2/who_table (2020).
11. ACTIV. https://www.nih.gov/research-training/medical-research-initiatives/activ (2020).
12. Coronavirus Treatment Acceleration Program (CTAP). US Food and Drug Administration https://www.fda.gov/drugs/coronavirus-covid-19-drugs/coronavirus-treatment-acceleration-program-ctap (2020).
13. Gleeson, P., Davison, A. P., Silver, R. A. & Ascoli, G. A. A commitment to open source in neuroscience. Neuron 96, 964–965 (2017).
14. Piwowar, H. et al. The state of OA: a large-scale analysis of the prevalence and impact of open access articles. PeerJ 6, e4375 (2018).
15. European Society of Radiology (ESR). What the radiologist should know about artificial intelligence – an ESR white paper. Insights Imaging 10, 44 (2019).
16. Pesapane, F., Codari, M. & Sardanelli, F. Artificial intelligence in medical imaging: threat or opportunity? Radiologists again at the forefront of innovation in medicine. Eur. Radiol. Exp. 2, 35 (2018).
17. Price, W. N. 2nd & Cohen, I. G. Privacy in the age of medical big data. Nat. Med. 25, 37–43 (2019).
18. Liang, W. et al. Development and validation of a clinical risk score to predict the occurrence of critical illness in hospitalized patients with COVID-19. JAMA Intern. Med. 180, 1081–1089 (2020).
19. Wynants, L. et al. Prediction models for diagnosis and prognosis of covid-19 infection: systematic review and critical appraisal. Brit. Med. J. 369, m1328 (2020).
20. Zhang, L. et al. D-dimer levels on admission to predict in-hospital mortality in patients with Covid-19. J. Thromb. Haemost. 18, 1324–1329 (2020).
21. Sands, K. E. et al. Patient characteristics and admitting vital signs associated with coronavirus disease 2019 (COVID-19)-related mortality among patients admitted with noncritical illness. https://doi.org/10.1017/ice.2020.461 (2020).
22. American College of Radiology. ACR recommendations for the use of chest radiography and computed tomography (CT) for suspected COVID-19 infection. https://www.acr.org/Advocacy-and-Economics/ACR-Position-Statements/Recommendations-for-Chest-Radiography-and-CT-for-Suspected-COVID19-Infection (2020).
23. Rubin, G. D. et al. The role of chest imaging in patient management during the COVID-19 pandemic: a multinational consensus statement from the Fleischner Society. Radiology 296, 172–180 (2020).
24. World Health Organization. Use of chest imaging in COVID-19. https://www.who.int/publications/i/item/use-of-chest-imaging-in-covid-19 (2020).
25. Jamil, S. et al. Diagnosis and management of COVID-19 disease. Am. J. Respir. Crit. Care Med. 201, 10 (2020).
26. Redmond, C. E., Nicolaou, S., Berger, F. H., Sheikh, A. M. & Patlas, M. N. Emergency radiology during the COVID-19 pandemic: The Canadian Association of Radiologists Recommendations for Practice. Can. Assoc. Radiologists J. 71, 425–430 (2020).
27. Buch, V. et al. Development and validation of a deep learning model for prediction of severe outcomes in suspected COVID-19 Infection. Preprint at https://arxiv.org/abs/2103.11269 (2021).
28. Lyons, C. & Callaghan, M. The use of high-flow nasal oxygen in COVID-19. Anaesthesia 75, 843–847 (2020).
29. Whittle, J. S., Pavlov, I., Sacchetti, A. D., Atwood, C. & Rosenberg, M. S. Respiratory support for adult patients with COVID-19. J. Am. Coll. Emerg. Physicians Open 1, 95–101 (2020).
30. Ai, J., Li, Y., Zhou, X. & Zhang, W. COVID-19: treating and managing severe cases. Cell Res. 30, 370–371 (2020).
31. Esteva, A. et al. A guide to deep learning in healthcare. Nat. Med. 25, 24–29 (2019).
32. Cahan, E. M., Hernandez-Boussard, T., Thadaney-Israni, S. & Rubin, D. L. Putting the data before the algorithm in big data addressing personalized healthcare. NPJ Digit. Med. 2, 78 (2019).
33. Thrall, J. H. et al. Artificial intelligence and machine learning in radiology: opportunities, challenges, pitfalls, and criteria for success. J. Am. Coll. Radiol. 15, 504–508 (2018).
34. Shilo, S., Rossman, H. & Segal, E. Axes of a revolution: challenges and promises of big data in healthcare. Nat. Med. 26, 29–38 (2020).
35. Gao, Y. & Cui, Y. Deep transfer learning for reducing health care disparities arising from biomedical data inequality. Nat. Commun. 11, 5131 (2020).
36. Rieke, N. et al. The future of digital health with federated learning. NPJ Digit. Med. 3, 119 (2020).
37. Yang, Q., Liu, Y., Chen, T. & Tong, Y. Federated machine learning: concept and applications. ACM Trans. Intell. Syst. Technol. 10, 12 (2019).
38. Ma, C. et al. On safeguarding privacy and security in the framework of federated learning. IEEE Netw. 34, 242–248 (2020).
39. Brisimi, T. S. et al. Federated learning of predictive models from federated Electronic Health Records. Int. J. Med. Inform. 112, 59–67 (2018).
40. Roth, H. R. et al. Federated learning for breast density classification: a real-world implementation. In Proc. Second MICCAI Workshop, DART 2020 and First MICCAI Workshop, DCL 2020, Domain Adaptation and Representation Transfer, and Distributed and Collaborative Learning (eds Albarqouni, S. et al.) Vol. 12444, 181–191 (Springer International Publishing, 2020).
41. Sheller, M. J. et al. Federated learning in medicine: facilitating multi-institutional collaborations without sharing patient data. Sci. Rep. 10, 12598 (2020).
42. Remedios, S. W., Butman, J. A., Landman, B. A. & Pham, D. L. in Federated Gradient Averaging for Multi-Site Training with Momentum-Based Optimizers (eds Remedios, S. W. et al.) (Springer, 2020).
43. Xu, Y. et al. A collaborative online AI engine for CT-based COVID-19 diagnosis. Preprint at https://www.medrxiv.org/content/10.1101/2020.05.10.20096073v2 (2020).
44. Raisaro, J. L. et al. SCOR: A secure international informatics infrastructure to investigate COVID-19. J. Am. Med. Inform. Assoc. 27, 1721–1726 (2020).
45. Vaid, A. et al. Federated learning of electronic health records to improve mortality prediction in hospitalized patients with COVID-19: machine learning approach. JMIR Med. Inform. 9, e24207 (2021).
46. Nino, G. et al. Pediatric lung imaging features of COVID-19: a systematic review and meta-analysis. Pediatr. Pulmonol. 56, 252–263 (2021).
47. Fredrikson, M., Jha, S. & Ristenpart, T. Model inversion attacks that exploit confidence information and basic countermeasures. In Proc. 22nd ACM SIGSAC Conference on Computer and Communications Security 1322–1333 https://doi.org/10.1145/2810103.2813677 (2015).
48. Zhu, L., Liu, Z. & Han, S. in Advances in Neural Information Processing Systems 32 (eds Wallach, H. et al.) 14774–14784 (Curran Associates, Inc., 2019).
49. Kaissis, G. A., Makowski, M. R., Rückert, D. & Braren, R. F. Secure, privacy-preserving and federated machine learning in medical imaging. Nat. Mach. Intell. 2, 305–311 (2020).
50. Li, W. et al. in Privacy-Preserving Federated Brain Tumour Segmentation 133–141 (Springer, 2019).
51. Shokri, R. & Shmatikov, V. Privacy-preserving deep learning. In Proc. 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton) https://doi.org/10.1109/allerton.2015.7447103 (2015).
52. Li, X. et al. Multi-site fMRI analysis using privacy-preserving federated learning and domain adaptation: ABIDE results. Med. Image Anal. 65, 101765 (2020).
53. Estiri, H. et al. Predicting COVID-19 mortality with electronic medical records. NPJ Digit. Med. 4, 15 (2021).
54. Jiang, G. et al. Harmonization of detailed clinical models with clinical study data standards. Methods Inf. Med. 54, 65–74 (2015).
55. Yang, D. et al. in Searching Learning Strategy with Reinforcement Learning for 3D Medical Image Segmentation https://doi.org/10.1007/978-3-030-32245-8_1 (2019).
56. Elsken, T., Metzen, J. H. & Hutter, F. Neural architecture search: a survey. J. Mach. Learning Res. 20, 1–21 (2019).
57. Yao, Q. et al. Taking human out of learning applications: a survey on automated machine learning. Preprint at https://arxiv.org/abs/1810.13306 (2019).
58. Ioffe, S. & Szegedy, C. Batch normalization: accelerating deep network training by reducing internal covariate shift. In Proc. 32nd International Conf. Machine Learning, PMLR 37, 448–456 (2015).
59. Kaufman, S., Rosset, S. & Perlich, C. Leakage in data mining: formulation, detection, and avoidance. In Proc. 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 556–563 (2011).
60. Zhang, C. et al. BatchCrypt: efficient homomorphic encryption for cross-silo federated learning. In Proc. 2020 USENIX Annual Technical Conference, ATC 2020, 493–506 (2020).
61. Nvidia NGC Catalog: COVID-19 Related Models. https://ngc.nvidia.com/catalog/models?orderBy=scoreDESC&pageNumber=0&query=covid&quickFilter=models&filters (2020).
62. Marini, J. J. & Gattinoni, L. Management of COVID-19 respiratory distress. JAMA 323, 2329–2330 (2020).
63. Cook, T. M. et al. Consensus guidelines for managing the airway in patients with COVID-19: Guidelines from the Difficult Airway Society, the Association of Anaesthetists, the Intensive Care Society, the Faculty of Intensive Care Medicine and the Royal College of Anaesthetists. Anaesthesia 75, 785–799 (2020).
64. Galloway, J. B. et al. A clinical risk score to identify patients with COVID-19 at high risk of critical care admission or death: an observational cohort study. J. Infect. 81, 282–288 (2020).
65. Kilaru, A. S. et al. Return hospital admissions among 1419 COVID-19 patients discharged from five U.S. emergency departments. Acad. Emerg. Med. 27, 1039–1042 (2020).
66. He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proc. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) https://doi.org/10.1109/cvpr.2016.90 (2016).
67. Irvin, J. et al. CheXpert: a large chest radiograph dataset with uncertainty labels and expert comparison. Proc. AAAI Conf. Artif. Intell. 33, 590–597 (2019).
68. Wang, R., Fu, B., Fu, G. & Wang, M. Deep & Cross network for Ad Click predictions. In Proc. ADKDD’17 Article no. 12 (2017).
69. Abadi, M. et al. TensorFlow: a system for large-scale machine learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16) 265–283 (USENIX Association, 2016).
70. NVIDIA Clara Imaging. https://developer.nvidia.com/clara-medical-imaging (2020).
71. Stekhoven, D. J. & Bühlmann, P. MissForest–non-parametric missing value imputation for mixed-type data. Bioinformatics 28, 112–118 (2012).
72. McMahan, H., Moore, E., Ramage, D., Hampson, S. & y Arcas, B. A. Communication-efficient learning of deep networks from decentralized data. http://proceedings.mlr.press/v54/mcmahan17a.html (2017).
73. Hsieh, K., Phanishayee, A., Mutlu, O. & Gibbons, P. B. The non-IID data quagmire of decentralized machine learning. In Proc. 37th International Conf. Machine Learning, PMLR 119 (2020).
74. Robin, X. et al. pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics 12, 77 (2011).

Acknowledgements

The views expressed in this study are those of the authors and not necessarily those of the NHS, the NIHR, the Department of Health and Social Care or any of the organizations associated with the authors. MGB thank the following individuals for their support: J. Brink, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA; M. Kalra, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA; N. Neumark, Center for Clinical Data Science, Massachusetts General Brigham, Boston, MA; T. Schultz, Department of Radiology, Massachusetts General Hospital, Boston, MA; N. Guo, Center for Advanced Medical Computing and Analysis, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA. The Faculty of Medicine, Chulalongkorn University thank the Ratchadapisek Sompoch Endowment Fund RA (PO) (no. 001/63) for the collection and management of COVID-19-related clinical data and biological specimens for the Research Task Force, Faculty of Medicine, Chulalongkorn University. NIHR Cambridge Biomedical Research Centre thank A. Priest, who is supported by the NIHR (Cambridge Biomedical Research Centre at the Cambridge University Hospitals NHS Foundation Trust). National Taiwan University MeDA Lab, MAHC and the Taiwan National Health Insurance Administration thank the MOST Joint Research Center for AI Technology, All Vista Health https://data.ucsf.edu/covid19

This paper is available on Nature under the CC BY 4.0 Deed (Attribution 4.0 International) license.