The Rust ecosystem has seen tremendous growth in data processing libraries, with Polars leading the charge as a blazingly fast DataFrame library. However, a new contender has emerged that takes a fundamentally different approach to data engineering and analysis: Elusion.

While Polars focuses on pure performance and memory efficiency with its Apache Arrow-based columnar engine, Elusion is equally dedicated to performance and memory efficiency (it is also built on Apache Arrow, via DataFusion), but positions itself as a comprehensive data engineering platform that prioritizes flexibility, ease of use, and integration capabilities alongside high performance.

## Architecture Philosophy: Different Approaches to the Same Goals

### Polars: Performance-First Design

Polars is written from scratch in Rust, designed close to the machine and without external dependencies. It is based on Apache Arrow's memory model, providing very cache-efficient columnar data structures, and focuses on:

- Ultra-fast query execution with SIMD optimizations
- Memory-efficient columnar processing
- Lazy evaluation with query optimization
- Streaming for out-of-core processing

### Elusion: Flexibility-First Design

Elusion takes a different approach, prioritizing developer experience and integration capabilities.

Core philosophy: "Elusion wants you to be you!"

Unlike traditional DataFrame libraries that enforce specific patterns, Elusion offers flexibility in constructing queries without enforcing specific patterns or chaining orders. You can build your queries in ANY SEQUENCE that makes sense to you, writing functions in ANY ORDER, and Elusion ensures consistent results regardless of the function call order.
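To make the "any order" idea concrete, here is a minimal sketch of how such order independence can be achieved in principle: each method only records its clause, and the clauses are assembled in a fixed canonical order at build time. This is a hypothetical illustration, not Elusion's actual implementation — the `QueryBuilder` type and its methods are made up for this example.

```rust
// Hypothetical sketch (NOT Elusion internals): a builder that accepts
// clauses in any call order yet always emits the same query, because
// clause order is fixed only when the query is assembled.
#[derive(Default)]
struct QueryBuilder {
    select: Vec<String>,
    filter: Option<String>,
    group_by: Vec<String>,
    order_by: Option<String>,
}

impl QueryBuilder {
    fn select(mut self, cols: &[&str]) -> Self {
        self.select = cols.iter().map(|c| c.to_string()).collect();
        self
    }
    fn filter(mut self, cond: &str) -> Self {
        self.filter = Some(cond.to_string());
        self
    }
    fn group_by(mut self, cols: &[&str]) -> Self {
        self.group_by = cols.iter().map(|c| c.to_string()).collect();
        self
    }
    fn order_by(mut self, expr: &str) -> Self {
        self.order_by = Some(expr.to_string());
        self
    }
    // Clauses are emitted in canonical SQL order, so call order is irrelevant.
    fn build(&self, table: &str) -> String {
        let mut sql = format!("SELECT {} FROM {}", self.select.join(", "), table);
        if let Some(f) = &self.filter {
            sql.push_str(&format!(" WHERE {}", f));
        }
        if !self.group_by.is_empty() {
            sql.push_str(&format!(" GROUP BY {}", self.group_by.join(", ")));
        }
        if let Some(o) = &self.order_by {
            sql.push_str(&format!(" ORDER BY {}", o));
        }
        sql
    }
}

fn main() {
    // Filter-then-select...
    let a = QueryBuilder::default()
        .filter("amount > 100")
        .select(&["category", "amount"])
        .build("sales");
    // ...and select-then-filter produce the same query.
    let b = QueryBuilder::default()
        .select(&["category", "amount"])
        .filter("amount > 100")
        .build("sales");
    assert_eq!(a, b);
    println!("{}", a);
}
```

The key design choice is that the builder is declarative: methods describe *what* the query contains, and only `build` decides *how* the pieces are ordered.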
## Loading Files into DataFrames

Elusion offers two loading modes, benchmarked here on complex queries over 900k rows:

- Regular loading (`CustomDataFrame::new()`): ~4.95 seconds
- Streaming loading (`CustomDataFrame::new_with_stream()`): ~3.62 seconds

That is a 26.9% improvement with the streaming approach.

Polars approach:

```rust
let df = LazyFrame::scan_csv("data.csv", ScanArgsCSV::default())?
    .filter(col("amount").gt(100))
    .select([col("customer"), col("amount")])
    .collect()?;
```

Elusion approach, with flexible ordering:

```rust
let df = CustomDataFrame::new("data.csv", "sales").await?
    .filter("amount > 100")
    .select(["customer", "amount"])
    .elusion("result").await?;

// Or reorder as you see fit - same result
let df = CustomDataFrame::new("data.csv", "sales").await?
    .select(["customer", "amount"]) // Select first
    .filter("amount > 100")         // Filter second
    .elusion("result").await?;
```

### Polars: Basic File Loading

```rust
let df = LazyFrame::scan_csv("data.csv", ScanArgsCSV::default())?
    .collect()?;

// Parquet with options
let df = LazyFrame::scan_parquet("data.parquet", ScanArgsParquet::default())?
    .collect()?;
```

### Elusion: Comprehensive Data Sources

```rust
use elusion::prelude::*;
```

Local files with auto-recognition:

```rust
let df = CustomDataFrame::new("data.csv", "sales").await?;
let df = CustomDataFrame::new("data.xlsx", "sales").await?; // Excel support
let df = CustomDataFrame::new("data.parquet", "sales").await?;
```

Streaming for large files (currently only supports .csv files):

```rust
let df = CustomDataFrame::new_with_stream("large_data.csv", "sales").await?;
```

Load entire folders:

```rust
let df = CustomDataFrame::load_folder(
    "/path/to/folder",
    Some(vec!["csv", "xlsx"]), // Filter file types, or `None` for all types
    "combined_data"
).await?;
```

Azure Blob Storage (currently supports csv and json files):

```rust
let df = CustomDataFrame::from_azure_with_sas_token(
    "https://account.blob.core.windows.net/container",
    "sas_token",
    Some("folder/file.csv"), // or keep `None` to take everything from the folder
    "azure_data"
).await?;
```

SharePoint:

```rust
let df = CustomDataFrame::load_from_sharepoint(
    "tenant-id",
    "client-id",
    "https://company.sharepoint.com/sites/Site",
    "Documents/data.xlsx",
    "sharepoint_data"
).await?;
```

REST API to DataFrame:

```rust
let api = ElusionApi::new();
api.from_api_with_headers(
    "https://api.example.com/data",
    headers,
    "/path/to/output.json"
).await?;
let df = CustomDataFrame::new("/path/to/output.json", "api_data").await?;
```
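The streaming speedup reported above reflects a general principle: parsing rows as they are read keeps memory bounded and overlaps parsing with I/O, instead of slurping the whole file first. The sketch below illustrates that principle with plain standard-library Rust over a toy CSV — it is not Elusion's streaming implementation, and the file name and column layout are made up.

```rust
use std::fs::File;
use std::io::{self, BufRead, BufReader, Read, Write};
use std::path::Path;
use std::time::Instant;

// "Regular" load: read the entire file into memory, then parse every row.
fn sum_regular(path: &Path) -> io::Result<i64> {
    let mut whole = String::new();
    File::open(path)?.read_to_string(&mut whole)?;
    Ok(whole
        .lines()
        .skip(1) // header
        .filter_map(|l| l.rsplit(',').next()?.parse::<i64>().ok())
        .sum())
}

// "Streaming" load: parse rows as they come off a buffered reader,
// so memory use stays bounded regardless of file size.
fn sum_streaming(path: &Path) -> io::Result<i64> {
    let reader = BufReader::new(File::open(path)?);
    Ok(reader
        .lines()
        .skip(1) // header
        .filter_map(|l| l.ok()?.rsplit(',').next()?.parse::<i64>().ok())
        .sum())
}

fn main() -> io::Result<()> {
    // Build a small CSV with an `amount` column.
    let path = std::env::temp_dir().join("sales_demo.csv");
    let mut f = File::create(&path)?;
    writeln!(f, "customer,amount")?;
    for i in 0..100_000 {
        writeln!(f, "c{},{}", i, i % 100)?;
    }

    let t = Instant::now();
    let total_regular = sum_regular(&path)?;
    println!("regular:   {:?} (total {})", t.elapsed(), total_regular);

    let t = Instant::now();
    let total_stream = sum_streaming(&path)?;
    println!("streaming: {:?} (total {})", t.elapsed(), total_stream);

    assert_eq!(total_regular, total_stream);
    Ok(())
}
```

Both paths must of course produce the same totals; the difference is purely in peak memory and in how early processing can begin.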
Database connections:

```rust
let postgres_df = CustomDataFrame::from_postgres(&conn, query, "pg_data").await?;
let mysql_df = CustomDataFrame::from_mysql(&conn, query, "mysql_data").await?;
```

### Polars: Structured Approach

Polars requires logical ordering:

```rust
let result = df
    .lazy()
    .filter(col("amount").gt(100))
    .group_by([col("category")])
    .agg([col("amount").sum().alias("total")])
    .sort("total", SortMultipleOptions::default())
    .collect()?;
```

### Elusion: Any-Order Flexibility

All of these produce the same result.

Traditional order:

```rust
let result1 = df
    .select(["category", "amount"])
    .filter("amount > 100")
    .agg(["SUM(amount) as total"])
    .group_by(["category"])
    .order_by(["total"], ["DESC"])
    .elusion("result").await?;
```

Filter first:

```rust
let result2 = df
    .filter("amount > 100")
    .agg(["SUM(amount) as total"])
    .select(["category", "amount"])
    .group_by(["category"])
    .order_by(["total"], ["DESC"])
    .elusion("result").await?;
```

Aggregation first:

```rust
let result3 = df
    .agg(["SUM(amount) as total"])
    .filter("amount > 100")
    .group_by(["category"])
    .select(["category", "amount"])
    .order_by(["total"], ["DESC"])
    .elusion("result").await?;
```

All produce identical results!

## Advanced Features: Where Elusion Shines

### Built-in Visualization and Reporting

Create interactive dashboards:

```rust
let plots = [
    (&line_plot, "Sales Timeline"),
    (&bar_chart, "Category Performance"),
    (&histogram, "Distribution Analysis"),
];

let tables = [
    (&summary_table, "Summary Stats"),
    (&detail_table, "Transaction Details")
];

CustomDataFrame::create_report(
    Some(&plots),
    Some(&tables),
    "Sales Analysis Dashboard",
    "dashboard.html",
    Some(layout_config),
    Some(table_options)
).await?;
```

### Automated Pipeline Scheduling

Schedule data engineering pipelines:

```rust
let scheduler = PipelineScheduler::new("5min", || async {
    // Load from Azure
    let df = CustomDataFrame::from_azure_with_sas_token(
        azure_url,
        sas_token,
        Some("folder/"),
        "raw_data"
    ).await?;

    // Process data
    let processed = df
        .select(["date", "amount", "category"])
        .agg(["SUM(amount) as total", "COUNT(*) as transactions"])
        .group_by(["date", "category"])
        .order_by(["date"], ["ASC"])
        .elusion("processed").await?;

    // Write results
    processed.write_to_parquet(
        "overwrite",
        "output/processed_data.parquet",
        None
    ).await?;

    Ok(())
}).await?;
```
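Under the hood, an interval scheduler of this kind boils down to a run-sleep-repeat loop. The sketch below shows that loop in plain standard-library Rust — it is a hypothetical illustration of the concept, not Elusion's `PipelineScheduler` (which takes human-readable intervals like `"5min"` and an async job); the `run_every` helper and its run budget are made up for the example.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

// Hypothetical sketch (NOT Elusion's PipelineScheduler): run a job on a
// background thread, sleeping `interval` between runs, for `runs` iterations.
fn run_every<F>(interval: Duration, runs: usize, mut job: F) -> thread::JoinHandle<()>
where
    F: FnMut() + Send + 'static,
{
    thread::spawn(move || {
        for _ in 0..runs {
            job();
            thread::sleep(interval);
        }
    })
}

fn main() {
    let counter = Arc::new(AtomicUsize::new(0));
    let c = Arc::clone(&counter);

    // Run the "pipeline" three times, 10 ms apart.
    let handle = run_every(Duration::from_millis(10), 3, move || {
        // A real job would load, transform, and write data here.
        c.fetch_add(1, Ordering::SeqCst);
    });
    handle.join().unwrap();

    assert_eq!(counter.load(Ordering::SeqCst), 3);
    println!("pipeline ran {} times", counter.load(Ordering::SeqCst));
}
```

A production scheduler adds error handling, graceful shutdown, and an async runtime, but the core loop is this simple.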
### Advanced JSON Processing

Elusion can handle complex JSON structures with arrays and objects:

```rust
let df = CustomDataFrame::new("complex_data.json", "json_data").await?;
```

If you have JSON fields/columns in your files, you can explode them.

Extract simple JSON fields:

```rust
let simple = df.json([
    "metadata.'$timestamp' AS event_time",
    "metadata.'$user_id' AS user",
    "data.'$amount' AS transaction_amount"
]);
```

Extract from JSON arrays:

```rust
let complex = df.json_array([
    "events.'$value:id=purchase' AS purchase_amount",
    "events.'$timestamp:id=login' AS login_time",
    "events.'$status:type=payment' AS payment_status"
]);
```

## When to Choose Which

Choose Polars when:

- Pure performance is the top priority
- You prefer structured, optimized query patterns
- Memory efficiency is critical
- You need minimal dependencies

Choose Elusion when:

- You need integration flexibility (cloud storage, APIs, databases)
- Developer experience and query flexibility matter
- You want built-in visualization and reporting
- You need automated pipeline scheduling
- You work with diverse data sources (Excel, SharePoint, REST APIs)
- You prefer intuitive, any-order query building

## Installation and Getting Started

Polars:

```toml
[dependencies]
polars = { version = "0.50.0", features = ["lazy"] }
```

Elusion:

```toml
[dependencies]
elusion = "4.0.0"
tokio = { version = "1.45.0", features = ["rt-multi-thread"] }
```

Elusion with specific features:

```toml
elusion = { version = "4.0.0", features = ["dashboard", "azure", "postgres"] }
```

Rust version requirements:

- Polars: >= 1.80
- Elusion: >= 1.81

## Real-World Example: Sales Data Analysis

Polars implementation:

```rust
use polars::prelude::*;

let df = LazyFrame::scan_csv("sales.csv", ScanArgsCSV::default())?
    .filter(col("amount").gt(100))
    .group_by([col("category")])
    .agg([
        col("amount").sum().alias("total_sales"),
        col("amount").mean().alias("avg_sale"),
        col("customer_id").n_unique().alias("unique_customers")
    ])
    .sort("total_sales", SortMultipleOptions::default().with_order_descending(true))
    .collect()?;

println!("{}", df);
```
Elusion implementation:

```rust
use elusion::prelude::*;

#[tokio::main]
async fn main() -> ElusionResult<()> {
    // Load data (flexible source)
    let df = CustomDataFrame::new("sales.csv", "sales").await?;

    // Build the query in any order that makes sense to you
    let analysis = df
        .filter("amount > 100")
        .agg([
            "SUM(amount) as total_sales",
            "AVG(amount) as avg_sale",
            "COUNT(DISTINCT customer_id) as unique_customers"
        ])
        .group_by(["category"])
        .order_by(["total_sales"], ["DESC"])
        .elusion("sales_analysis").await?;

    // If you'd like to display the result
    analysis.display().await?;

    // Create a visualization
    let bar_chart = analysis.plot_bar(
        "category",
        "total_sales",
        Some("Sales by Category")
    ).await?;

    // Generate a report
    CustomDataFrame::create_report(
        Some(&[(&bar_chart, "Sales Performance")]),
        Some(&[(&analysis, "Summary Table")]),
        "Sales Analysis Report",
        "sales_report.html",
        None,
        None
    ).await?;

    Ok(())
}
```

## Conclusion

Elusion v4.0.0 represents a paradigm shift in DataFrame libraries, prioritizing developer experience, integration flexibility, and comprehensive data engineering capabilities. The choice between Polars and Elusion depends on your priorities:

- For raw computational performance and memory efficiency: Polars
- For comprehensive data engineering with flexible development: Elusion

Elusion's any-order query building, extensive integration capabilities, built-in visualization, and automated scheduling make it particularly attractive for teams that work with diverse data sources and want a more intuitive development experience.

Both libraries showcase the power of Rust in the data processing space, offering developers high-performance alternatives to traditional Python-based solutions. The Rust DataFrame ecosystem is thriving, and having multiple approaches ensures that different use cases and preferences are well served.

Try Elusion v4.0.0 today:

```shell
cargo add elusion@4.0.0
```

For more information and examples, visit the Elusion repository on GitHub and join the growing community of Rust data engineers discovering the flexibility and power of any-order DataFrame operations.