
Data Handling

DataFrames tutorials:

Useful packages:

  • DataSkimmer.jl produces a summary of tabular data (e.g. a DataFrame), including histograms.
  • Strapping.jl converts between structs and tables.
  • SplitApplyCombine.jl contains data manipulation routines such as splitdims (converting between vectors of vectors and matrices etc.), group, and innerjoin. It is similar to what DataFrames offers, but works on additional data types.
  • InvertedIndices.jl provides Not for selecting everything except the given indices or columns.

DataFrames

Chaining transformations

  • Use a single combine call with multiple transformations, as in

combine(df, :a => sum, :b => mean)

  • Several combine calls in a row do not work, even with @chain from Chain.jl. The result of each combine is fed into the next step, so the second call only sees the columns that the first one produced.
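
A minimal sketch of why the single call is needed. The data frame and its column values are made up for illustration; only DataFrames and the Statistics standard library (for mean) are assumed.

```julia
using DataFrames
using Statistics  # provides mean

# Hypothetical example data.
df = DataFrame(a = [1, 2, 3], b = [4.0, 5.0, 6.0])

# One combine call with both transformations: one row, two columns.
both = combine(df, :a => sum, :b => mean)

# A combine on :a alone keeps only the aggregated column, so a second
# combine applied to this result can no longer refer to :b.
first_step = combine(df, :a => sum)

println(names(both))        # column names of the combined result
println(names(first_step))  # :b is gone at this point
```

The default output names follow the source column and function name (a_sum, b_mean), which is why the intermediate result has nothing left to aggregate for b.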

Deleting columns:

  • Using Not from InvertedIndices: select!(df, Not(:x1)) drops column x1 in place.
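
A short sketch of dropping columns with Not, on a made-up data frame. It relies on DataFrames re-exporting Not, so InvertedIndices is not loaded separately here.

```julia
using DataFrames  # re-exports Not from InvertedIndices

# Hypothetical example data.
df = DataFrame(x1 = 1:3, x2 = 4:6, x3 = 7:9)

# Mutating: remove :x1 from df itself.
select!(df, Not(:x1))

# Non-mutating: return a new data frame without :x2, leaving df unchanged.
smaller = select(df, Not(:x2))

println(names(df))
println(names(smaller))
```

select! modifies df, while select returns a copy; which one fits depends on whether the original columns are still needed later.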

Missing Values

Missings.jl has convenience functions for dealing with missing values.
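
A few of these functions in use, as a hedged sketch. skipmissing and coalesce come from Base; passmissing, allowmissing and disallowmissing come from Missings.jl. The vectors and the name safe_sqrt are invented for illustration.

```julia
using Missings

v = [1, missing, 3]

# Base: iterate over the non-missing values only.
total = sum(skipmissing(v))

# Base: replace missing with a default value, element by element.
filled = coalesce.(v, 0)

# Missings.jl: wrap a function so that it returns missing for missing input.
safe_sqrt = passmissing(sqrt)
r1 = safe_sqrt(missing)
r2 = safe_sqrt(4.0)

# Missings.jl: widen or narrow the element type of a vector.
w = allowmissing([1, 2])   # element type now also allows missing
n = disallowmissing(w)     # back to a plain integer vector
```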