Meet CulturaX: Training AI Models in 167 Languages for Multi-Language Tech
by @mikeyoung44

Too Long; Didn't Read

Researchers at the University of Oregon and Adobe Research have built CulturaX, an open multilingual dataset for training AI models. It provides text data for 167 languages, over 6 trillion words in total, with extensive cleaning and deduplication, and it is completely free and openly available. The aim is for the benefits of AI to be shared across diverse linguistic groups.

@mikeyoung44

Mike Young

Among other things, launching AIModels.fyi ... Find the right AI model for your project - https://aimodels.fyi

