narwhals is the dataframe-agnostic compatibility layer that lets one codebase run on pandas, Polars, PyArrow, Modin, cuDF, DuckDB, and more. The mental model: wrap a native frame with nw.from_native, write the transformation once in the narwhals API (which mirrors Polars), then call nw.to_native to hand the result back in its original type. narwhals is a thin, zero-dependency translation layer, not a new dataframe engine, so you can drop it into the middle of existing code. This cheatsheet walks the daily loop in eight panels.
Wrap & Unwrap
narwhals never owns your data; it borrows it. nw.from_native puts a thin wrapper around a pandas, Polars, or PyArrow object, and nw.to_native returns the same native type when you are done.
nw.from_native(df) # wrap a native frame
nw.from_native(df, eager_only=True) # eager DataFrames only (reject LazyFrames)
nw.from_native(s, allow_series=True) # allow a Series through too
nw.to_native(df) # hand the result back, unchanged
nw.from_dict(data, backend="polars") # build a narwhals frame from scratch
See the top-level API.
Inspect & Metadata
Every frame answers the same metadata questions (shape, columns, schema) regardless of what is underneath, and implementation plus get_native_namespace let you branch on the real backend only when you truly need to.
df.shape # (rows, cols)
df.columns # the column names
df.schema # name -> dtype mapping
df.implementation # which engine is underneath
nw.get_native_namespace(df) # reach the native module
df.to_pandas() # convert to a concrete backend (also .to_polars())
See the DataFrame API.
Select, Columns & Expressions
narwhals borrows the Polars expression model: you describe columns with nw.col(...) and combine them lazily, then select (replace the frame) or with_columns (add to it). Expressions are reusable recipes, not eager values.
df.select(nw.col("a", "b")) # pick columns
df.with_columns((nw.col("a") + nw.col("b")).alias("c")) # add / overwrite
nw.lit(0) # a literal value
nw.when(nw.col("a") > 1).then(...).otherwise(...) # conditional column
nw.sum_horizontal("a", "b") # combine columns horizontally
df.select(ncs.numeric()) # pick by selector (import narwhals.selectors as ncs)
Filter, Sort & Transform Rows
Row-shaping verbs (filter, sort, unique, drop_nulls, head) read the same on a one-million-row Polars LazyFrame and a tiny pandas DataFrame, so you write the logic once and trust it everywhere.
df.filter(nw.col("a") > 1) # keep rows matching a predicate
df.sort("a", descending=True) # sort by one or more columns
df.head(5) # first / last N rows (df.tail(5))
df.unique(subset=["a"]) # drop duplicate rows
df.drop_nulls() # drop missing values
df.with_row_index() # add a row index
See the DataFrame API.
Group, Aggregate & Join
Split-apply-combine (group_by().agg()), relational joins (join), stacking (concat), and window functions (.over()) are the relational core; narwhals maps each to its backend’s native, optimized implementation rather than reimplementing it.
df.group_by("k").agg(nw.col("v").sum()) # group then aggregate
df.group_by("k").agg(nw.col("v").mean(), nw.len()) # multiple aggregations
df.join(other, on="id", how="left") # how: inner / left / full / semi / anti
df.join(other, how="cross") # cross join
nw.concat([df1, df2], how="vertical") # stack frames vertically
nw.col("v").sum().over("k") # window expression over groups
See the DataFrame API.
Expression Namespaces
Typed operations live under .str, .dt, .cat, .list, and .struct so string, datetime, and categorical work is uniform across engines that otherwise spell these very differently.
nw.col("name").str.to_uppercase() # transform text
nw.col("name").str.contains("a") # test a substring -> Boolean
nw.col("ts").dt.year() # extract date parts
nw.col("ts").dt.truncate("1d") # truncate / offset datetimes
nw.col("c").cat.get_categories() # categorical -> categories
nw.col("d").str.to_datetime(format="%Y-%m-%d") # parse strings to dates
See the Expr API (the .str/.dt/.cat sub-namespaces).
Lazy Frames & I/O
lazy() turns a frame into a query plan that does nothing until collect(), which lets the underlying engine optimize the whole pipeline. The read_*/scan_* helpers need an explicit backend= because narwhals refuses to guess which engine should own freshly loaded data.
df.lazy() # go lazy (defer execution)
lf.collect() # trigger the computation
nw.scan_csv("data.csv", backend="polars") # scan a CSV lazily
nw.read_csv("data.csv", backend="pandas") # read a CSV eagerly
lf.sink_parquet("out.parquet") # stream a lazy result to disk
df.write_parquet("out.parquet") # write a frame out
See the LazyFrame API.
Write Portable Functions
The whole point: decorate a function with @nw.narwhalify (or wrap manually with from_native/to_native) and it accepts and returns whatever dataframe library the caller uses. Import narwhals.stable.v1 when you ship a library and need the API frozen across narwhals releases.
@nw.narwhalify # auto-wrap/unwrap a function
def f(df):
    return df.with_columns(nw.col("a").mean()) # write the body once
# manual long-hand of the decorator:
# df = nw.from_native(df); ...; return nw.to_native(df)
import narwhals.stable.v1 as nw # version-locked API for libraries
nw.from_native(x, pass_through=True) # pass non-frames through untouched
nw.from_native(s, series_only=True) # accept a Series only
See the complete example and the top-level API.
Quick Reference
| Command | What it does |
|---|---|
| nw.from_native(df) | Wrap a native frame as a narwhals frame |
| nw.to_native(df) | Return the original native type |
| nw.from_dict(d, backend="polars") | Build a narwhals frame from a dict |
| @nw.narwhalify | Auto wrap/unwrap a whole function |
| import narwhals.stable.v1 as nw | Version-locked API for libraries |

| Flag | Meaning |
|---|---|
| eager_only=True | Reject LazyFrames; only eager DataFrames |
| series_only=True | Accept a Series only |
| allow_series=True | Allow a Series alongside frames |
| pass_through=True | Return non-frame objects unchanged |

| Namespace | Example |
|---|---|
| .str | nw.col("s").str.to_uppercase() |
| .dt | nw.col("t").dt.year() |
| .cat | nw.col("c").cat.get_categories() |
| .list | nw.col("x").list.len() |
| .struct | nw.col("x").struct.field("k") |
Appendix: Sample Code
The canonical narwhals function and an end-to-end pipeline.
The portable function (decorator form)
import narwhals as nw
@nw.narwhalify
def add_total(df):
    # Works whether `df` is a pandas, Polars, or PyArrow frame.
    return df.with_columns(
        (nw.col("value") * nw.col("qty")).alias("total")
    )
# Call it with ANY supported backend; you get the SAME type back:
# add_total(pandas_df) -> pandas.DataFrame
# add_total(polars_df) -> polars.DataFrame
The same thing, manual wrap/unwrap
import narwhals as nw
def add_total(df_native):
    df = nw.from_native(df_native)
    df = df.with_columns((nw.col("value") * nw.col("qty")).alias("total"))
    return nw.to_native(df)
Lazy + I/O with an explicit backend
scan_* and read_* require backend=; narwhals will not guess the engine:
import narwhals as nw
lf = nw.scan_csv("data.csv", backend="polars") # -> nw.LazyFrame (a plan)
out = (
    lf.filter(nw.col("amount") > 0)
    .group_by("category")
    .agg(nw.col("amount").sum())
    .collect() # -> nw.DataFrame (runs now)
)
out.write_parquet("summary.parquet")
References
narwhals documentation
- narwhals documentation (home) and installation
- Basics: DataFrame, Series, complete example, conversion
- API reference: top-level functions, DataFrame, LazyFrame, Expr, Series, selectors
- Overhead (why the layer is essentially free) and extending narwhals
Project