What happens when a tibble is printed? This vignette documents the control flow and the data flow, explains the design choices, and shows the default implementation for the "tbl" class. It is mainly of interest to implementers of table subclasses. Customizing the formatting of a vector class in a tibble is described in vignette("pillar", package = "vctrs"). The different customization options are showcased in vignette("extending").

Requirements

  • Fit into pre-specified width, distributing across multiple tiers if necessary

  • Optionally shrink and stretch individual columns

  • Header, body and footer for the tibble

    • Avoid recomputation of information
  • Custom components for the pillars in a tibble, top-aligned

    • The container, not the column vectors, determines the appearance
  • Customization of the entire output and of the pillars

  • Support for data frame columns (packed data frames) and matrix/array columns

  • Pillars are always shown from left to right, no “holes” in the colonnade

    • If the first column consumes all available space, the remaining columns are not shown, even if they would all fit if the first column were omitted.
  • Printing pillars should take time proportional to the number of characters printed, and be “fast enough”.

Overview

The overall control and data flow are illustrated in the diagram below. Boxes are functions and methods. Solid lines are function calls. Dotted lines represent information that a function receives via its arguments or, in the case of options, queries actively.

The pillar package uses debugme for debugging. Activating debugging for pillar is another way to track the control flow, see vignette("debugme") for details.

Initialization

A tibble is a list of columns of class "tbl_df" and "tbl". Printing is designed to work for non-data-frame table-like objects such as lazy tables. The print.tbl() method calls format() for the object and prints the output.

tbl <- tibble::tibble(a = 1:3, b = tibble::tibble(c = 4:6, d = 7:9), e = 10:12)
print(tbl, width = 23)
#> # A tibble: 3 x 3
#>       a   b$c    $d
#>   <int> <int> <int>
#> 1     1     4     7
#> 2     2     5     8
#> 3     3     6     9
#> # … with 1 more
#> #   variable: e <int>
str(tbl)
#> tibble [3 × 3] (S3: tbl_df/tbl/data.frame)
#>  $ a: int [1:3] 1 2 3
#>  $ b: tibble [3 × 2] (S3: tbl_df/tbl/data.frame)
#>   ..$ c: int [1:3] 4 5 6
#>   ..$ d: int [1:3] 7 8 9
#>  $ e: int [1:3] 10 11 12

Source code of pillar:::print.tbl()

print.tbl <- function(x, width = NULL, ..., n = NULL, n_extra = NULL) {
  writeLines(format(x, width = width, ..., n = n, n_extra = n_extra))
  invisible(x)
}

The format.tbl() method creates a setup object, and uses that object to format header, body and footer.

Source code of pillar:::format.tbl()

format.tbl <- function(x, width = NULL, ..., n = NULL, n_extra = NULL) {
  check_dots_empty(action = signal)

  # Reset local cache for each new output
  force(x)
  num_colors(forget = TRUE)

  setup <- tbl_format_setup(x,
    width = width, ...,
    n = n,
    max_extra_cols = n_extra
  )

  header <- tbl_format_header(x, setup)
  body <- tbl_format_body(x, setup)
  footer <- tbl_format_footer(x, setup)
  c(header, body, footer)
}

While it’s possible to extend or override these methods for your "tbl" subclass, often overriding the more specialized methods shown below is sufficient.
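As a minimal sketch of such an override, the following extends the header for a hypothetical subclass "my_tbl". The class name and the extra header entry are made up for illustration. The sketch assumes that pillar and tibble are installed, and relies on tbl_sum() being the generic that supplies the header information, as in the default setup method shown later in this vignette.

```r
library(pillar)

# Hypothetical subclass "my_tbl" on top of a regular tibble
x <- tibble::new_tibble(list(a = 1:3), nrow = 3L, class = "my_tbl")

# Keep the default summary and append one named entry
tbl_sum.my_tbl <- function(x, ...) {
  default_header <- NextMethod()
  c(default_header, "New" = "A new header")
}

# format() returns the lines that print() would write
out <- format(x)
writeLines(out)
```

Only the header changes here; the body and the footer still come from the default methods.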

Setup

Most of the work for formatting actually happens in tbl_format_setup(). The desired output width is baked into the setup object and must therefore be known when tbl_format_setup() is called. Setup objects print like a tibble but with a clear separation of header, body, and footer.

setup <- tbl_format_setup(tbl, width = 24)
setup
#> <pillar_tbl_format_setup>
#> <tbl_format_header(setup)>
#> # A tibble: 3 x 3
#> <tbl_format_body(setup)>
#>       a   b$c    $d
#>   <int> <int> <int>
#> 1     1     4     7
#> 2     2     5     8
#> 3     3     6     9
#> <tbl_format_footer(setup)>
#> # … with 1 more
#> #   variable: e <int>

A setup object is required here to avoid computing information twice. For instance, the dimensions shown in the header or the extra columns displayed in the footer are available only after the body has been computed.

The generic dispatches over the container, so that you can override it if necessary. It is responsible for assigning default values to arguments before passing them on to the method.

Source code of tbl_format_setup()

tbl_format_setup <- function(x, width = NULL, ...,
                             n = NULL, max_extra_cols = NULL) {
  "!!!!DEBUG tbl_format_setup()"

  width <- get_width_print(width)

  n <- get_n_print(n, nrow(x))

  max_extra_cols <- get_max_extra_cols(max_extra_cols)

  # Calls UseMethod("tbl_format_setup"),
  # allows using default values in S3 dispatch
  out <- tbl_format_setup_(x, width, ..., n = n, max_extra_cols = max_extra_cols)
  return(out)
  UseMethod("tbl_format_setup")
}
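The idea of resolving default values once, before dispatching, can be sketched in base R as follows. The names describe(), describe_() and the class "my_class" are hypothetical and serve only as an illustration of the pattern; this is not pillar's code.

```r
# Outer function: resolves defaults once, for all methods
describe <- function(x, width = NULL, ...) {
  if (is.null(width)) {
    width <- 80L
  }
  # Methods receive a width that is never NULL
  describe_(x, width = width, ...)
}

# Inner function: performs the actual S3 dispatch on `x`
describe_ <- function(x, width, ...) {
  UseMethod("describe")
}

describe.default <- function(x, width, ...) {
  paste0("default method, width = ", width)
}

describe.my_class <- function(x, width, ...) {
  paste0("my_class method, width = ", width)
}

describe(1)
describe(structure(list(), class = "my_class"), width = 20)
```

With this split, methods do not need to repeat the logic for default values, and a subclass can still override the method for its own class.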

The default implementation converts the input to a data frame via as.data.frame(head(x)), and returns an object constructed with new_tbl_format_setup() that contains the data frame and additional information. If you override this method, e.g. to incorporate more information, you can add new items to the default setup object, but you should not overwrite existing items.

Source code of pillar:::tbl_format_setup.tbl()

tbl_format_setup.tbl <- function(x, width, ...,
                                 n, max_extra_cols) {
  "!!!!DEBUG tbl_format_setup.tbl()"

  # Number of rows
  rows <- nrow(x)

  if (is.na(rows)) {
    df <- df_head(x, n + 1)
    if (nrow(df) <= n) {
      rows <- nrow(df)
    } else {
      df <- vec_head(df, n)
    }
  } else {
    df <- df_head(x, n)
  }

  if (is.na(rows)) {
    # Lazy table with too many rows
    needs_dots <- (nrow(df) >= n)
  } else {
    # Lazy table with few rows, or regular data frame
    needs_dots <- (rows > n)
  }

  if (needs_dots) {
    rows_missing <- rows - n
  } else {
    rows_missing <- 0L
  }

  # Header
  tbl_sum <- tbl_sum(x)

  # Body
  rownames(df) <- NULL

  colonnade <- ctl_colonnade(
    df,
    has_row_id = if (.row_names_info(x) > 0) "*" else TRUE,
    width = width,
    controller = x
  )

  body <- colonnade$body

  # Extra columns
  extra_cols <- colonnade$extra_cols
  extra_cols_total <- length(extra_cols)

  if (extra_cols_total > max_extra_cols) {
    length(extra_cols) <- max_extra_cols
  }

  # Result
  new_tbl_format_setup(
    x = x,
    df = df,
    width = width,
    tbl_sum = tbl_sum,
    body = body,
    rows_missing = rows_missing,
    rows_total = rows,
    extra_cols = extra_cols,
    extra_cols_total = extra_cols_total
  )
}
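The row-count logic above can be followed more easily with plain numbers. The helper below is a hypothetical base R sketch that mirrors the branch structure of the method. It uses counts instead of data: `rows` is the known number of rows (or NA for a lazy table), `n` the number of rows to show, and `available` the number of rows the table would actually yield.

```r
count_missing_rows <- function(rows, n, available) {
  if (is.na(rows)) {
    # Unknown size: try to fetch one row more than requested
    fetched <- min(available, n + 1L)
    if (fetched <= n) {
      # The table turned out to be short, its size is now known
      rows <- fetched
    } else {
      fetched <- n
    }
  } else {
    fetched <- min(rows, n)
  }

  if (is.na(rows)) {
    needs_dots <- (fetched >= n)
  } else {
    needs_dots <- (rows > n)
  }

  if (needs_dots) rows - n else 0L
}

count_missing_rows(rows = 10L, n = 3L, available = 10L)          # 7
count_missing_rows(rows = NA_integer_, n = 3L, available = 2L)   # 0
count_missing_rows(rows = NA_integer_, n = 3L, available = 100L) # NA
```

In the last call the result is NA: more rows exist than are shown, but their number is unknown.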

At the core, the internal function ctl_colonnade() composes the body. Its functionality and the customization points it offers are detailed in the “Colonnade” section below.

Colonnade

The internal function ctl_colonnade() composes the body. It performs the following tasks:

  1. Create a pillar object for every column that fits, using ctl_new_compound_pillar(), ctl_new_pillar() and ultimately pillar() and pillar_shaft().
  2. Determine the number of tiers and the width for each tier.
  3. Distribute the pillars across the tiers, assigning a width to each pillar.
  4. Format each pillar via its format() function, passing the now known width.
  5. Combine the formatted pillars horizontally.
  6. Combine the tiers vertically.
  7. Return the formatted body, and the columns that could not fit.
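To make steps 2 and 3 more concrete, here is a deliberately simplified base R sketch. It is not the algorithm used by ctl_colonnade(): it only assigns columns to tiers greedily from their minimum widths, assumes one space between neighboring columns, and does not shrink or stretch anything. The function name distribute_tiers() and its arguments are hypothetical.

```r
distribute_tiers <- function(min_widths, tier_width, max_tiers) {
  tier <- integer(length(min_widths))
  current <- 1L
  used <- 0L

  for (i in seq_along(min_widths)) {
    w <- min_widths[[i]]

    # Start a new tier if the column does not fit next to the previous ones
    if (used > 0L && used + 1L + w > tier_width) {
      current <- current + 1L
      used <- 0L
    }

    needed <- if (used > 0L) w + 1L else w
    if (current > max_tiers || used + needed > tier_width) {
      # No holes: this column and all columns to its right are dropped
      tier[i:length(min_widths)] <- NA_integer_
      break
    }

    tier[[i]] <- current
    used <- used + needed
  }

  tier
}

distribute_tiers(c(5L, 5L, 5L, 5L), tier_width = 11L, max_tiers = 2L) # 1 1 2 2
distribute_tiers(c(5L, 5L, 5L, 5L), tier_width = 11L, max_tiers = 1L) # 1 1 NA NA
distribute_tiers(c(20L, 3L), tier_width = 11L, max_tiers = 2L)        # NA NA
```

The last call illustrates the "no holes" requirement: the second column would fit on its own, but it is not shown because the column to its left does not fit.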

In the following, the first and the fourth steps are discussed.

Creating pillar objects

Each column in the tibble is passed to ctl_new_compound_pillar(), which eventually calls ctl_new_pillar() once or several times.

Compound pillars

The ctl_new_compound_pillar() generic dispatches on the container:

ctl_new_compound_pillar(tbl, tbl$a, width = 20)
#> <pillar>
#>                <int>
#>                    1
#>                    2
#>                    3
ctl_new_compound_pillar(tbl, tbl$b, width = 20)
#> <compound_pillar[2]>
#>         c
#>     <int>
#>         4
#>         5
#>         6
#> … and 1 more sub-pillars

The default method distinguishes between compound and simple pillars. Data frame, matrix, and array columns are decomposed into sub-pillars and returned as a compound pillar. Regular vectors are forwarded to ctl_new_pillar(). Implementers of "tbl" subclasses will rarely if ever need to extend or override this method.

Source code of pillar:::ctl_new_compound_pillar.tbl()

ctl_new_compound_pillar.tbl <- function(controller, x, width, ..., title = NULL) {
  "!!!!DEBUG ctl_new_compound_pillar.tbl(`v(width)`, `v(title)`)"

  if (is.data.frame(x)) {
    new_data_frame_pillar(x, controller, width, title = title)
  } else if (is.matrix(x)) {
    new_matrix_pillar(x, controller, width, title = title)
  } else if (is.array(x) && length(dim(x)) > 1) {
    new_array_pillar(x, controller, width, title = title)
  } else {
    ctl_new_pillar(controller, x, width, ..., title = prepare_title(title))
  }
}
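The conditions used in this method are plain base R predicates. The hypothetical helper column_kind() below applies the same tests to a few example columns, to show which branch each kind of column would take. It only classifies the input and does not create any pillars.

```r
column_kind <- function(x) {
  if (is.data.frame(x)) {
    "data frame"
  } else if (is.matrix(x)) {
    "matrix"
  } else if (is.array(x) && length(dim(x)) > 1) {
    "array"
  } else {
    "simple"
  }
}

column_kind(data.frame(c = 4:6, d = 7:9))   # "data frame"
column_kind(matrix(1:6, nrow = 3))          # "matrix"
column_kind(array(1:24, dim = c(3, 4, 2)))  # "array"
column_kind(1:3)                            # "simple"
column_kind(array(1:3, dim = 3))            # "simple"
```

Note that a one-dimensional array falls through to the last branch and is therefore treated like a regular vector.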

Simple pillars

The ctl_new_pillar() method is called for columns that are not data frames or arrays, and also dispatches over the container.

ctl_new_pillar(tbl, tbl$a, width = 20)
#> <pillar>
#>                <int>
#>                    1
#>                    2
#>                    3

Source code of pillar:::ctl_new_pillar.tbl()

ctl_new_pillar.tbl <- function(controller, x, width, ..., title = NULL) {
  "!!!!DEBUG ctl_new_pillar.tbl(`v(width)`, `v(title)`)"

  pillar(x, title, if (!is.null(width)) max0(width))
}

The default method calls pillar() directly, passing the maximum width available.

Source code of pillar()

pillar <- function(x, title = NULL, width = NULL, ...) {
  "!!!!DEBUG pillar(`v(class(x))`, `v(title)`, `v(width)`)"

  pillar_from_shaft(
    new_pillar_title(title),
    new_pillar_type(x),
    pillar_shaft(x, ...),
    width
  )
}

Formatting for title and type is provided by new_pillar_title() and new_pillar_type(). The body can be customized by implementing pillar_shaft() for a vector class, see vignette("pillar", package = "vctrs") for details. If title or type don’t fit the available width, pillar_shaft() is never called.

pillar() returns NULL if the width is insufficient to contain the data. It is possible to change the appearance of pillars by overriding or extending ctl_new_pillar().

Components

Both compound and simple pillar objects share the same structure and are ultimately constructed with new_pillar().

Source code of new_pillar()

new_pillar <- function(components, ..., width = NULL, class = NULL) {
  "!!!!DEBUG new_pillar(`v(width)`, `v(class)`)"

  check_dots_empty()
  if (length(components) > 0 && !is_named(components)) {
    abort("All components must have names.")
  }

  structure(
    components,
    width = width,
    class = c(class, "pillar")
  )
}

A pillar is stored as a list of components. For simple pillars each component has length one; for compound pillars all components have the same length. In the future, this restriction may be lifted to support nested components, e.g. for column titles spanning multiple sub-pillars of a compound pillar. The maximum width available for the simple pillar or for each sub-pillar of a compound pillar is also recorded.

Layout of the objects contained in a pillar

Formatting pillars

When a pillar object is constructed, it has a minimum and a desired (maximum) width. The final width is not known yet, because it depends on the number and width of other pillar objects that may not even have been constructed. The final width is passed to format(), which falls back to the desired width if no width is given:

Source code of pillar:::format.pillar()

format.pillar <- function(x, width = NULL, ...) {
  if (is.null(width)) {
    width <- get_width(x)
  }

  if (is.null(width)) {
    widths <- pillar_get_widths(x)
    width <- sum(widths) - length(widths) + 1L
  }

  new_vertical(pillar_format_parts_2(x, width))
}