Example Usage¶
In this example some simple processing is performed on the example.csv
file.
Update Cargo.toml
¶
Add the following to your Cargo.toml
file:
datafusion = "11.0"
tokio = "1.0"
Run a SQL query against data stored in a CSV:¶
use datafusion::prelude::*;
#[tokio::main]
async fn main() -> datafusion::error::Result<()> {
// register the table
let ctx = SessionContext::new();
ctx.register_csv("example", "tests/data/example.csv", CsvReadOptions::new()).await?;
// create a plan to run a SQL query
let df = ctx.sql("SELECT a, MIN(b) FROM example GROUP BY a LIMIT 100").await?;
// execute and print results
df.show().await?;
Ok(())
}
Use the DataFrame API to process data stored in a CSV:¶
use datafusion::prelude::*;
#[tokio::main]
async fn main() -> datafusion::error::Result<()> {
// create the dataframe
let ctx = SessionContext::new();
let df = ctx.read_csv("tests/data/example.csv", CsvReadOptions::new()).await?;
let df = df.filter(col("a").lt_eq(col("b")))?
.aggregate(vec![col("a")], vec![min(col("b"))])?
.limit(0, Some(100))?;
// execute and print results
df.show().await?;
Ok(())
}
Output from both examples¶
+---+--------+
| a | MIN(b) |
+---+--------+
| 1 | 2 |
+---+--------+
Identifiers and Capitalization¶
Please be aware that all identifiers are effectively made lower-case in SQL, so if your csv file has capital letters (ex: Name
) you must put your column name in double quotes or the examples won’t work.
To illustrate this behavior, consider the capitalized_example.csv
file:
Run a SQL query against data stored in a CSV:¶
use datafusion::prelude::*;
#[tokio::main]
async fn main() -> datafusion::error::Result<()> {
// register the table
let ctx = SessionContext::new();
ctx.register_csv("example", "tests/data/capitalized_example.csv", CsvReadOptions::new()).await?;
// create a plan to run a SQL query
let df = ctx.sql("SELECT \"A\", MIN(b) FROM example GROUP BY \"A\" LIMIT 100").await?;
// execute and print results
df.show().await?;
Ok(())
}
Use the DataFrame API to process data stored in a CSV:¶
use datafusion::prelude::*;
#[tokio::main]
async fn main() -> datafusion::error::Result<()> {
// create the dataframe
let ctx = SessionContext::new();
let df = ctx.read_csv("tests/data/capitalized_example.csv", CsvReadOptions::new()).await?;
let df = df.filter(col("A").lt_eq(col("c")))?
.aggregate(vec![col("A")], vec![min(col("b"))])?
.limit(0, Some(100))?;
// execute and print results
df.show().await?;
Ok(())
}
Output from both examples¶
+---+--------+
| A | MIN(b) |
+---+--------+
| 2 | 1 |
| 1 | 2 |
+---+--------+