This comprehensive toolkit provides a consistent and extensible framework for working with missing values in vectors. The companion package 'tidyimpute' provides similar functionality for list-like and table-like structures. Functions exist for detection, removal, replacement, imputation, recollection, etc. of 'NAs'.

**na.tools** is a comprehensive library for handling missing (NA) values.
It has several goals:

- extend the existing `stats::na.*()` functions,
- provide a single collection of functions for working with missing data, and
- provide a consistent and intuitive interface.

In this package, there are methods for the detection, removal, replacement,
imputation, *recollection*, etc. of missing values (`NA`s). This library's focus
is on vectors (atomics). For **tidy**/**dplyr**-compliant methods operating on
tables and lists, please use the **tidyimpute** package, which
depends on this package.

Install the development version from GitHub:

```
devtools::install_github("decisionpatterns/na.tools")
```

Install the released version from CRAN:

```
install.packages("na.tools")
```

- Over **70** functions for working with missing values (see the function list below).
- Standardizes and extends the `na.*` functions found in the *stats* package.
- Extensible S3 methods
- Calculate statistics on missing values: `n_na`, `pct_na`
- Remove missing values: `na.rm`
- Replacement/Imputation:
  - Type/class and length-safe replacement. (**na.tools** will never produce an object with a different length/nrow or type/class than its target.)
  - Three types of imputation:
    - constant
    - univariate commutative (order-independent)
    - univariate non-commutative (order-dependent), e.g. time series data
  - Replace using scalar, vector or function(s)
- Easy mnemonics:
  - functions beginning with `na.` return a transformed version of the input vector with missing values imputed
  - functions beginning with ...
- Recall/track which values have been replaced and how.

```
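library(na.tools)

x <- 1:3
x[2] <- NA_real_          # introduce a missing value at position 2

# Detection
any_na(x)                 # is any element missing?
all_na(x)                 # are all elements missing?
which_na(x)               # which elements are missing?
n_na(x)                   # count of missing values
pct_na(x)                 # pct of missing values

# Removal
na.rm(x)

# Replacement
na.replace(x, 2)          # replace with a scalar
try(na.replace(x, mean))  # error
na.replace(x, na.mean)    # works

# Imputation shortcuts
na.zero(x)
na.mean(x)
na.cumsum(x)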
```

- `na.n` - count missing values
- `na.pct` - calculate the percentage of missing values
- `which.na` - return logical or character indicating which elements are missing
- `all.na` (`na.all`) - test if all elements are missing
- `any.na` (`na.any`) - test if any elements are missing

- `na.rm` - remove `NA`s (with tables, this is equivalent to `drop_cols_all_na`)
- `na.trim` - remove `NA`s from the beginning or end (non-commutative/order matters)
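For readers who want to see what detection and trimming amount to, here is a minimal base R sketch. It does not use **na.tools** and is not the package's implementation; `trim_na` is a hypothetical helper written for illustration, and the fraction computed here may be scaled differently from what the package's own percentage function returns.

```r
x <- c(NA, 1, NA, 3, NA)

n_missing   <- sum(is.na(x))    # count of missing values
pct_missing <- mean(is.na(x))   # fraction of missing values (0 to 1)
idx_missing <- which(is.na(x))  # positions of missing values

# Hypothetical helper: drop NAs at the beginning and end only,
# keeping NAs that sit between observed values.
trim_na <- function(x) {
  keep <- which(!is.na(x))
  if (length(keep) == 0L) return(x[0])  # everything is missing
  x[min(keep):max(keep)]
}

trimmed <- trim_na(x)
```

Trimming depends on element order, which is why it is described as non-commutative: the interior `NA` survives only because of where it sits.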

There are two types of imputation methods for plain vectors. They are distinguished by their replacement values.

In "constant" imputation methods, missing values are replaced by an
*a priori* selected constant value. No calculations are performed to derive
replacement values, and all missing values assume the same replacement value.

- `na.zero`: replace `NA`s with 0
- `na.true` / `na.false`: replace `NA`s with `TRUE` / `FALSE`
- `na.inf` / `na.neginf`: replace `NA`s with `Inf` / `-Inf`
- `na.constant`: replace `NA`s with the constant value `.na`
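Constant imputation can be sketched in a few lines of base R. The helper below, `impute_constant`, is hypothetical and not part of **na.tools**; it assumes an atomic vector and coerces the replacement to the vector's storage mode so that the result keeps the input's type and length.

```r
# Hypothetical helper: replace every NA with one fixed value,
# without changing the type or length of the input.
impute_constant <- function(x, value) {
  stopifnot(is.atomic(x), length(value) == 1L)
  storage.mode(value) <- storage.mode(x)  # e.g. 0 becomes 0L for integer input
  x[is.na(x)] <- value
  x
}

x_int <- c(1L, NA, 3L)
y_int <- impute_constant(x_int, 0)   # still an integer vector of length 3
```

Coercing the replacement rather than the target is one way to honour the length-safe and type-safe guarantee described earlier; the package may achieve this differently.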

In functional imputation, the value is calculated from the vector containing the missing value(s), and only that vector. Missing values may impute to different values. Replacement values may (or may not) be affected by the ordering of the vector.

**Commutative functions**

Commutative functions provide the same result regardless of the ordering of the elements of the input vector.

(When imputing in a table, imputation by function is also called
*column-based imputation* since replacement values derive from the single
column. Table-based imputation is found in the **tidyimpute** package.)

- `na.max` - maximum
- `na.min` - minimum
- `na.mean` - mean
- `na.median` - median value
- `na.quantile` - quantile value
- `na.sample` / `na.random` - randomly sampled value
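A small base R sketch may help show what order-independence means here. `impute_mean` is a hypothetical helper, not the package's `na.mean`; it replaces each `NA` with the mean of the observed values.

```r
# Hypothetical helper: replace NAs with the mean of the non-missing values.
impute_mean <- function(x) {
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  x
}

x <- c(2, NA, 4, 6)
forward  <- impute_mean(x)        # 2 4 4 6
backward <- impute_mean(rev(x))   # 6 4 4 2
```

Reversing the input reverses the output, but the replacement value (4) is the same in both cases, because the mean does not depend on element order.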

**Non-commutative functions**

- `na.cummax` - cumulative max
- `na.cummin` - cumulative min
- `na.cumsum` - cumulative sum
- `na.cumprod` - cumulative prod
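To illustrate why cumulative imputations are order-dependent, here is a base R sketch under a stated assumption: each `NA` is replaced by the running sum of the observed values that precede it. This rule is chosen for illustration only and may not match what the package's `na.cumsum` does; `impute_cumsum` is a hypothetical name.

```r
# Hypothetical helper (assumed rule): replace each NA with the running sum
# of the non-missing values seen so far.
impute_cumsum <- function(x) {
  miss <- is.na(x)
  running <- cumsum(ifelse(miss, 0, x))  # NAs contribute nothing to the sum
  x[miss] <- running[miss]
  x
}

x <- c(1, NA, 3, NA)
forward  <- impute_cumsum(x)        # 1 1 3 4
backward <- impute_cumsum(rev(x))   # 0 3 3 1
```

The replacement values are (1, 4) in one direction and (0, 3) in the other, so the result depends on the ordering of the vector, unlike the commutative case.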

**General Imputation**

- `na.replace` / `na.explicit` - general replacement function (atomic vectors only)
- `na.unreplace` / `na.implicit` - turn explicit values back into `NA`s
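For categorical data, making missing values explicit means giving them their own level, and making them implicit again means removing that level. The base R sketch below shows the round trip; `make_explicit`, `make_implicit` and the `"(NA)"` label are assumptions for illustration, not the package's functions or its exported constant.

```r
# Hypothetical helpers for factors; the label "(NA)" is an assumed placeholder.
make_explicit <- function(f, label = "(NA)") {
  stopifnot(is.factor(f))
  if (!label %in% levels(f)) levels(f) <- c(levels(f), label)  # add level if absent
  f[is.na(f)] <- label
  f
}

make_implicit <- function(f, label = "(NA)") {
  stopifnot(is.factor(f))
  # Values whose level is dropped become NA again.
  factor(f, levels = setdiff(levels(f), label))
}

f <- factor(c("a", NA, "b"))
e <- make_explicit(f)   # levels: "a", "b", "(NA)"
r <- make_implicit(e)   # back to levels "a", "b" with an NA in position 2
```

Adding the level before assignment matters: assigning a value that is not an existing level of a factor produces `NA` with a warning instead of the intended placeholder.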

A number of other packages have methods for working with missing values and/or imputation. Here is a short, incomplete and growing list:

- `randomForest::na.roughfix()` - imputes with `median`
- `zoo::na.*` - collection of *non-commutative* imputation techniques for time series data
- CRAN Task View: Multivariate Statistics

- Add `na.true` and `na.false`
- Replace `na.most_freq` with `na.mode`
- README.md: Fix canonical link to multivariate task view
- Remove tools directory from task view

- Add `NA_explicit_` as an exported constant for explicit categorical values.
- Convert man to use markdown.
- Remove old aliases

- Fix `na_replace` (and `na_explicit`) to add levels for values if they do not already exist.
- Add tests
- Fix documentation

- Add na_explicit and na_implicit

- `na_replace`: revert from using `ifelse` because of edge cases
- Add `zzz.R`
- Add `NEWS.md`
- Add tests for `na_replace`

- `na_replace` now uses `ifelse` and prevents recycling of `value`