nanoparquet-package {nanoparquet}R Documentation

nanoparquet: Read and Write 'Parquet' Files

Description

Self-sufficient reader and writer for flat 'Parquet' files. Can read most 'Parquet' data types. Can write many 'R' data types, including factors and temporal types. See docs for limitations.

Details

nanoparquet is a reader and writer for a common subset of Parquet files.

Features:

Limitations:

Installation

Install the R package from CRAN:

install.packages("nanoparquet")

Usage

Read

Call read_parquet() to read a Parquet file:

df <- nanoparquet::read_parquet("example.parquet")

To see the columns of a Parquet file and how their types are mapped to R types by read_parquet(), call read_parquet_schema() first:

nanoparquet::read_parquet_schema("example.parquet")

Folders of similar-structured Parquet files (e.g. produced by Spark) can be read like this:

df <- data.table::rbindlist(lapply(
  Sys.glob("some-folder/part-*.parquet"),
  nanoparquet::read_parquet
))
Write

Call write_parquet() to write a data frame to a Parquet file:

nanoparquet::write_parquet(mtcars, "mtcars.parquet")

To see how the columns of the data frame will be mapped to Parquet types by write_parquet(), call infer_parquet_schema() first:

nanoparquet::infer_parquet_schema(mtcars)
Inspect

Call read_parquet_info(), read_parquet_schema(), or read_parquet_metadata() to see various kinds of metadata from a Parquet file:

nanoparquet::read_parquet_info("mtcars.parquet")
nanoparquet::read_parquet_schema("mtcars.parquet")
nanoparquet::read_parquet_metadata("mtcars.parquet")

If you find a file that should be supported but isn't, please open an issue here with a link to the file.

Options

See also ?parquet_options() for further details.

License

MIT

Author(s)

Maintainer: Gábor Csárdi csardi.gabor@gmail.com

Authors:

Other contributors:

See Also

Useful links:


[Package nanoparquet version 0.4.2 Index]