This New AI Tool Claims to Solve Data Problems Better Than Anything Else: Here's Why It Matters
by @hugedata

by Huge Data | 7m | 2025/01/02

Too Long; Didn't Read

Researchers have developed a solution that emphasizes three key techniques to improve problem-solving in data science.

Authors:

(1) Sirui Hong, DeepWisdom and these authors contributed equally to this work;

(2) Yizhang Lin, DeepWisdom and these authors contributed equally to this work;

(3) Bang Liu, Université de Montréal & Mila and these authors are listed in alphabetical order;

(4) Bangbang Liu, DeepWisdom and these authors contributed equally to this work;

(5) Binhao Wu, DeepWisdom and these authors contributed equally to this work;

(6) Danyang Li, DeepWisdom and these authors contributed equally to this work;

(7) Jiaqi Chen, Fudan University and these authors contributed equally to this work;

(8) Jiayi Zhang, Renmin University of China and these authors contributed equally to this work;

(9) Jinlin Wang, DeepWisdom and these authors contributed equally to this work;

(10) Li Zhang, Fudan University and these authors contributed equally to this work;

(11) Lingyao Zhang, these authors contributed equally to this work;

(12) Min Yang, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences and these authors contributed equally to this work;

(13) Mingchen Zhuge, AI Initiative, King Abdullah University of Science and Technology and these authors contributed equally to this work;

(14) Taicheng Guo, University of Notre Dame and these authors contributed equally to this work;

(15) Tuo Zhou, The University of Hong Kong and these authors contributed equally to this work;

(16) Wei Tao, Fudan University and these authors contributed equally to this work;

(17) Wenyi Wang, AI Initiative, King Abdullah University of Science and Technology and these authors contributed equally to this work;

(18) Xiangru Tang, Yale University and these authors contributed equally to this work;

(19) Xiangtao Lu, DeepWisdom and these authors contributed equally to this work;

(20) Xiawu Zheng, Xiamen University and these authors contributed equally to this work;

(21) Xinbing Liang, DeepWisdom, East China Normal University and these authors contributed equally to this work;

(22) Yaying Fei, Beijing University of Technology and these authors contributed equally to this work;

(23) Yuheng Cheng, The Chinese University of Hong Kong, Shenzhen and these authors contributed equally to this work;

(24) Zongze Xu, DeepWisdom, Hohai University and these authors contributed equally to this work;

(25) Chenglin Wu, DeepWisdom and a corresponding author.

Editor's Note: This is Part 1 of 5 of a research study detailing the development of Data Interpreter, a solution for various data science and real-world tasks. Read the rest below.


ABSTRACT

Large Language Model (LLM)-based agents have demonstrated remarkable effectiveness. However, their performance can be compromised in data science scenarios that require real-time data adjustment, expertise in optimization due to complex dependencies among various tasks, and the ability to identify logical errors for precise reasoning. In this study, we introduce the Data Interpreter, a solution designed to solve problems with code that emphasizes three pivotal techniques to augment problem-solving in data science: 1) dynamic planning with hierarchical graph structures for real-time data adaptability; 2) dynamic tool integration to enhance code proficiency during execution, enriching the requisite expertise; 3) identification of logical inconsistencies in feedback, and efficiency enhancement through experience recording. We evaluate the Data Interpreter on various data science and real-world tasks. Compared to open-source baselines, it demonstrated superior performance, exhibiting significant improvements in machine learning tasks, increasing from 0.86 to 0.95. Additionally, it showed a 26% increase on the MATH dataset and a remarkable 112% improvement in open-ended tasks. The solution will be released at https://github.com/geekan/MetaGPT.

1 INTRODUCTION

Large Language Models (LLMs) have enabled agents to excel in a wide range of applications, demonstrating their adaptability and effectiveness (Guo et al., 2024; Wu et al., 2023a; Zhou et al., 2023b). These LLM-powered agents have significantly influenced areas such as software engineering (Hong et al., 2023), navigating complex open-world scenarios (Wang et al., 2023; Chen et al., 2024a), facilitating collaborative multi-agent structures for multimodal tasks (Zhuge et al., 2023), improving the responsiveness of virtual assistants (Lu et al., 2023), optimizing group intelligence (Zhuge et al., 2024), and contributing to scientific research (Tang et al., 2024).


Recent studies have focused on improving the problem-solving capabilities of these agents by improving their reasoning process, aiming for greater sophistication and efficiency (Zhang et al., 2023; Besta et al., 2023; Sel et al., 2023; Yao et al., 2023; ., 2024; Wei et al., 2022). However, data-centric scientific problems, including machine learning, data analysis, and mathematical problem-solving, present unique challenges that remain to be addressed. The machine learning process involves complex, lengthy task-handling steps, characterized by intricate dependencies among multiple tasks. This requires expert intervention for process optimization and dynamic adjustment in the event of failure or a data update. It is often difficult for LLMs to provide the correct solution in a single attempt. Furthermore, these problems demand precise reasoning and thorough data verification (Romera-Paredes et al., 2023), which poses additional challenges to LLM-based agent frameworks.


Figure 1: Comparison with various open-source frameworks on machine learning tasks and real-world open-ended tasks.


Moreover, existing works such as (Qiao et al., 2023; OpenAI, 2023; Lucas, 2023) address data-centric problems through code-based problem-solving methods, known as the interpreter paradigm, which combines static requirement decomposition with code execution. However, several key challenges arise when employing these frameworks in practical data science tasks:

1) Data dependence intensity: The complexity inherent in data science arises from the intricate interplay among various steps, which are subject to real-time changes (Liu et al., 2021). For accurate results, data cleaning and comprehensive feature engineering are prerequisites before developing any machine learning model. It is therefore critical to monitor data changes and dynamically adjust to the transformed data and variables. The machine learning modeling process, encompassing feature selection, model training, and evaluation, involves a broad spectrum of processing operators and search spaces (Zheng et al., 2021). The challenge lies in generating and resolving the code for the entire process at once.

2) Refined domain knowledge: The specialized knowledge and coding practices of data scientists are pivotal in addressing data-related challenges. Typically embedded in proprietary code and data, this knowledge often remains inaccessible to current LLMs. For instance, generating code for data transformation in specific domains such as energy or geology may be difficult for LLMs without the requisite domain expertise. Existing methodologies depend predominantly on LLMs, a reliance that may streamline the process but potentially compromises performance.

3) Rigorous logic requirements: Currently, interpreters such as (Qiao et al., 2023; OpenAI, 2023; Lucas, 2023) incorporate code execution and error-capturing capabilities to enhance problem-solving performance. However, they often overlook error-free execution, wrongly treating any run that raises no error as correct. While basic programming tasks can be streamlined and can rely on immediate execution feedback when requirements are clearly delineated, data science problems often pose ambiguous, irregular, and poorly defined requirements, making them difficult for LLMs to understand. Consequently, LLM-generated code solutions may contain ambiguities that necessitate rigorous validation of logical soundness, extending beyond mere execution feedback.
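To make the third challenge concrete, here is a minimal Python sketch of code that executes without any error yet produces a wrong result, together with a check that goes beyond execution feedback. The function names (`normalize_buggy`, `normalize_fixed`, `runs_without_error`, `passes_logic_check`) are hypothetical and chosen only for illustration; they are not taken from the paper or from any library.

```python
def normalize_buggy(values):
    """Intended to scale values into [0, 1].

    Bug: divides by the maximum instead of the range (max - min),
    so the largest value does not map to 1. No exception is raised.
    """
    lo, hi = min(values), max(values)
    return [(v - lo) / hi for v in values]


def normalize_fixed(values):
    """Min-max scaling into [0, 1]; constant input maps to all zeros."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]


def runs_without_error(fn, data):
    """Execution feedback only: did the call raise an exception?"""
    try:
        fn(data)
        return True
    except Exception:
        return False


def passes_logic_check(fn, data):
    """Validation beyond execution: the output must span exactly [0, 1]."""
    out = fn(data)
    return min(out) == 0.0 and max(out) == 1.0


sample = [10.0, 20.0, 30.0]
buggy_runs = runs_without_error(normalize_buggy, sample)
buggy_valid = passes_logic_check(normalize_buggy, sample)
fixed_valid = passes_logic_check(normalize_fixed, sample)
```

On this sample, `buggy_runs` is `True` while `buggy_valid` is `False`: an interpreter that relies only on the absence of exceptions would accept the buggy version, whereas the property check rejects it and accepts the fixed one.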


To address the challenges above, we introduce an LLM-based agent, called the Data Interpreter, designed specifically for the field of data science. This agent follows a plan-code-verify approach to fulfill human requirements by breaking down tasks, executing code, and verifying feedback. Specifically, we propose:

1) Dynamic planning with hierarchical structure: Our Data Interpreter employs hierarchical graph structures to comprehend the inherent complexities of data science more effectively. A dynamic planning approach equips it with the adaptability to handle task variations, proving especially efficient in monitoring data changes and managing the intricate variable dependencies inherent in data science problems.

2) Tool utilization and generation: We enhance coding proficiency by integrating various human-authored code snippets and by creating custom tools for specific tasks, going beyond mere API-focused capabilities. This process involves the automatic combination of diverse tools with self-generated code. It uses task-level execution to independently build and expand its tool library, simplify tool usage, and restructure code as needed.

3) Enhancing reasoning with logic bug awareness: This is based on a confidence score derived from execution results and test-driven validation, which are essential in exception-free scenarios. It detects inconsistencies between the code solution and the test code execution, and compares multiple trials to reduce logic errors. Throughout the execution and reasoning process, task-level experiences, primarily comprising metadata and the runtime trajectory and including both successes and failures, are recorded.
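The idea of planning over a task graph and adjusting when a step fails or its data changes can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the Data Interpreter implementation: it uses a flat dependency graph rather than the hierarchical structure described above, and `Task`, `execution_order`, `invalidate`, `run_plan`, and `example_plan` are hypothetical names. The only library call is `graphlib.TopologicalSorter` from the Python standard library (3.9+).

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass
class Task:
    task_id: str
    instruction: str
    depends_on: list = field(default_factory=list)
    done: bool = False


def execution_order(tasks):
    """Return task ids so that every task comes after its dependencies."""
    graph = {t.task_id: set(t.depends_on) for t in tasks.values()}
    return list(TopologicalSorter(graph).static_order())


def invalidate(tasks, task_id):
    """Mark a task and everything downstream of it as pending again,
    for example after its input data has changed."""
    pending, frontier = {task_id}, [task_id]
    while frontier:
        current = frontier.pop()
        for t in tasks.values():
            if current in t.depends_on and t.task_id not in pending:
                pending.add(t.task_id)
                frontier.append(t.task_id)
    for tid in pending:
        tasks[tid].done = False
    return pending


def run_plan(tasks, execute):
    """Run pending tasks in dependency order.

    `execute(task)` is a caller-supplied callable returning True on success.
    Tasks whose upstream dependencies are not done are skipped.
    Returns the ids that still need re-planning, in execution order.
    """
    for task_id in execution_order(tasks):
        task = tasks[task_id]
        if task.done or not all(tasks[d].done for d in task.depends_on):
            continue
        task.done = bool(execute(task))
    return [tid for tid in execution_order(tasks) if not tasks[tid].done]


def example_plan():
    """A hypothetical four-step machine learning pipeline."""
    steps = [
        Task("load", "read the raw table"),
        Task("clean", "drop rows with missing values", ["load"]),
        Task("features", "encode categorical columns", ["clean"]),
        Task("train", "fit a baseline model", ["features"]),
    ]
    return {t.task_id: t for t in steps}
```

If the `features` step fails on the first pass, `run_plan` returns `["features", "train"]`: the completed upstream steps are kept, and only the failed step and its dependents are left for revision and another pass. Calling `invalidate(tasks, "clean")` models a data update by resetting `clean` and everything that depends on it.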


As shown in Figure 1, our Data Interpreter significantly surpasses existing open-source frameworks. Compared to these baselines, the Data Interpreter exhibits superior performance, with a 10.3% improvement (from 0.86 to 0.95) in machine learning tasks and a 26% improvement on the MATH dataset, demonstrating robust problem-solving capabilities. In open-ended tasks, its performance has more than doubled, marking a 112% increase and showing its effectiveness across a wide range of challenges.
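For readers who want to check how such percentages relate to the underlying scores, the short Python sketch below computes a relative improvement from two scores and converts a percentage increase into a multiplicative factor. The helper names are hypothetical. Using the rounded endpoints 0.86 and 0.95 gives roughly 10.5%, slightly different from the reported 10.3%; one plausible explanation, which is an assumption here and not stated in the text, is that the reported figure was computed from unrounded scores.

```python
def relative_improvement(baseline, new):
    """Relative change from baseline to new, in percent."""
    return (new - baseline) / baseline * 100.0


def factor_from_increase(percent):
    """Multiplicative factor corresponding to a percentage increase."""
    return 1.0 + percent / 100.0


# Rounded scores quoted in the text for machine learning tasks.
ml_gain = relative_improvement(0.86, 0.95)

# A 112% increase means the new score is about 2.12 times the old one,
# which is why the text describes performance as more than doubled.
open_ended_factor = factor_from_increase(112)

print(f"ML tasks: {ml_gain:.1f}% relative improvement")
print(f"Open-ended tasks: x{open_ended_factor:.2f}")
```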


We summarize our contributions as follows:


• We propose a dynamic planning framework with hierarchical structures, enhancing adaptability and problem-solving capabilities in data science tasks.


• We improve the proficiency and efficiency of coding in LLMs by introducing automated tool integration for tool utilization and generation.


• We improve reasoning by integrating verification and experience, thereby enhancing the accuracy and efficiency of problem-solving.


• Our experiments demonstrate that our Data Interpreter exceeds existing benchmarks in machine learning tasks, mathematical problems, and open-ended tasks, thus setting a new standard for performance.
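The third contribution, combining verification with recorded experience, can be illustrated with a minimal Python sketch. It assumes that a confidence score is simply the fraction of test cases a candidate solution passes, which is a simplification chosen for this example and not necessarily how the Data Interpreter computes it. `Trial`, `evaluate`, `select_best`, and the two median candidates are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    name: str
    passed: int
    total: int
    error: str = ""

    @property
    def confidence(self):
        """Assumed score: fraction of test cases passed."""
        return self.passed / self.total if self.total else 0.0


def evaluate(name, solution, test_cases):
    """Run one candidate against (args, expected) pairs."""
    passed, error = 0, ""
    for args, expected in test_cases:
        try:
            if solution(*args) == expected:
                passed += 1
        except Exception as exc:  # keep the failure for the experience log
            error = repr(exc)
    return Trial(name, passed, len(test_cases), error)


def select_best(candidates, test_cases, experience_log):
    """Compare multiple trials and keep a record of all of them."""
    trials = [evaluate(n, fn, test_cases) for n, fn in candidates.items()]
    experience_log.extend(trials)  # successes and failures alike
    return max(trials, key=lambda t: t.confidence)


def median_unsorted(values):
    """Runs without error but is wrong: forgets to sort the input."""
    return values[len(values) // 2]


def median_sorted(values):
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


tests = [(([3, 1, 2],), 2), (([1, 2, 3, 4],), 2.5)]
log = []
best = select_best(
    {"unsorted": median_unsorted, "sorted": median_sorted}, tests, log
)
```

Here both candidates execute without raising, but only `median_sorted` passes the tests, so `best.name` is `"sorted"` with a confidence of 1.0, and `log` retains both trials for later reuse.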


This paper is available on arXiv under the CC BY 4.0 DEED license.