fastai.transforms¶
Introduction and Overview¶
The fastai transforms pipeline for images is designed to convert your images into a form ready to be batched by your DataLoader and passed to your model. It can be used to conveniently create transformations such as normalization/denormalization and various augmentations. The module contains several predefined pipelines that you can pass almost directly to your data object. This is especially convenient when working with pretrained models, because the images should be normalized with the statistics of the (pre)training dataset.
Normalization¶
One of the most common image transformations is normalizing the images. The transforms module provides predefined pipelines in which you can pass the statistics (mean and standard deviation of each channel) of your training data.
Important
Make sure you use only the statistics of the training data to normalize both training and validation data.
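The idea can be illustrated with a small, self-contained NumPy sketch (this is not the fastai implementation; channel_stats and normalize are hypothetical helper names):

```python
# Per-channel normalization using training-set statistics only.
import numpy as np

def channel_stats(images):
    """Mean and std per channel over a list of (h, w, 3) images."""
    data = np.stack(images).reshape(-1, 3)
    return data.mean(axis=0), data.std(axis=0)

def normalize(img, mean, std):
    return (img - mean) / std

rng = np.random.default_rng(0)
train = [rng.random((8, 8, 3)) for _ in range(4)]
valid = [rng.random((8, 8, 3)) for _ in range(2)]

mean, std = channel_stats(train)                    # statistics from the training set
train_n = [normalize(x, mean, std) for x in train]
valid_n = [normalize(x, mean, std) for x in valid]  # same statistics for validation
```

Note that valid_n is normalized with the training statistics, never with its own.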
Normalizing for Pretrained Models¶
The statistics for all pretrained models available via the fastai.conv_learner module are predefined and accessible via the tfms_from_model function. Here’s an example of how to use this function to create a pipeline for a pretrained resnet34 model:
from fastai.transforms import *
from fastai.dataset import *
arch = resnet34
sz = 224
data = ImageClassifierData.from_paths(PATH, tfms=tfms_from_model(arch, sz))
The tfms_from_model function returns separate image transformers for the training and validation sets. Internally it calls tfms_from_stats, the function used for custom image statistics, passing it the mean and standard deviation of the ImageNet dataset.

fastai.transforms.tfms_from_model(f_model, sz, aug_tfms=None, max_zoom=None, pad=0, crop_type=<CropType.RANDOM: 1>, tfm_y=None, sz_y=None, pad_mode=2, norm_y=True, scale=None)¶
Returns separate transformers of images for training and validation. The transformers are constructed according to the image statistics associated with the model (see tfms_from_stats).
 Arguments:
 f_model: model, pretrained or not pretrained
Normalizing with Custom Statistics¶
If you train a model from scratch, you can use the tfms_from_stats function and pass in the statistics of your training set. For each channel you have to supply the mean and standard deviation:
from fastai.transforms import *
from fastai.dataset import *
stats = A([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
sz = 224
data = ImageClassifierData.from_paths(PATH, tfms=tfms_from_stats(stats, sz))

fastai.transforms.tfms_from_stats(stats, sz, aug_tfms=None, max_zoom=None, pad=0, crop_type=<CropType.RANDOM: 1>, tfm_y=None, sz_y=None, pad_mode=2, norm_y=True, scale=None)¶
Given the statistics of the training image set, returns separate training and validation transform functions.
Adding Augmentation¶
Besides normalization, this module also contains transformations that can be used for image augmentation. Each of these transformations is defined as a class (see Available Transformations below). The functions tfms_from_model and tfms_from_stats accept lists of these classes that define which augmentations to perform.
Predefined Augmentations¶
The module contains predefined sets of transformations that can be used for common image augmentations:
transforms_basic: a random rotation of up to 10° and a random lighting change of up to 5%
transforms_side_on: transforms_basic plus a random flip (suitable for pictures of “everyday objects”)
transforms_top_down: transforms_basic plus a random rotation by multiples of 90° (suitable, for example, for satellite images)
To use one of these predefined sets of augmentations in tfms_from_model or tfms_from_stats, pass it as aug_tfms:
from fastai.transforms import *
from fastai.dataset import *
stats = A([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
sz = 224
data = ImageClassifierData.from_paths(PATH, tfms=tfms_from_stats(stats, sz, aug_tfms=transforms_basic))
Custom Augmentations¶
To define custom augmentations, simply pass your own combination of transformations to aug_tfms as a list. All available transformations, and how to define custom ones, are described in the next section.
Available Transformations¶
Each transformation is defined as a class inheriting from the abstract parent class Transform. To define a transformation, override the abstract method do_transform. A transform may include a random component. If it does, it will often need to transform y using the same random values as x (e.g. a horizontal flip in segmentation must be applied to the mask as well). Therefore, the method set_state is used to ensure all random state is calculated in one place.
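The pattern can be sketched as follows (a minimal illustration, not the actual fastai class; RandomFlipSketch is a hypothetical name):

```python
# The set_state pattern: draw random values once, then apply the same
# transform to both x and y.
import numpy as np

class RandomFlipSketch:
    def __init__(self, p=0.5, rng=None):
        self.p = p
        self.rng = rng or np.random.default_rng()

    def set_state(self):
        # All randomness is decided here, once per call...
        self.do_flip = self.rng.random() < self.p

    def __call__(self, x, y):
        self.set_state()
        # ...so x and y always receive the same decision.
        if self.do_flip:
            return x[:, ::-1], y[:, ::-1]
        return x, y

flip = RandomFlipSketch(p=1.0)        # always flip, for demonstration
img = np.arange(12).reshape(3, 4)
mask = img % 2 == 0
img2, mask2 = flip(img, mask)
```

Because the flip decision is made once in set_state, the image and its mask stay consistent.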

class fastai.transforms.Transform(tfm_y=<TfmType.NO: 1>)¶
A class that represents a transform. All other transforms should subclass it, and all subclasses should override do_transform.
 tfm_y : TfmType
 type of transform

do_transform(x, is_y)¶

set_state()¶
There’s also a class CoordTransform, inherited from Transform, which is used to represent all coordinate-based transformations such as cropping, rotating and scaling.

class fastai.transforms.CoordTransform(tfm_y=<TfmType.NO: 1>)¶
A coordinate transform.

static make_square(y, x)¶

map_y(y0, x)¶

transform_coord(x, ys)¶
The currently implemented transformations are:¶

class fastai.transforms.Normalize(m, s, tfm_y=<TfmType.NO: 1>)¶
Normalizes an image to zero mean and unit standard deviation, given the mean m and standard deviation s of the original image.

class fastai.transforms.Denormalize(m, s)¶
Denormalizes an image, returning it to its original format.

class fastai.transforms.ChannelOrder(tfm_y=<TfmType.NO: 1>)¶
Changes the image array shape from (h, w, 3) to (3, h, w). tfm_y determines the transformation applied to the y element.
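The reshaping ChannelOrder performs corresponds to a simple axis transposition, sketched here in NumPy (to_channel_first is a hypothetical helper, not part of fastai):

```python
# (h, w, 3) -> (3, h, w): move the channel axis to the front,
# as PyTorch models expect.
import numpy as np

def to_channel_first(img):
    return np.transpose(img, (2, 0, 1))

img = np.zeros((224, 224, 3))
chw = to_channel_first(img)
```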

class fastai.transforms.AddPadding(pad, mode=2, tfm_y=<TfmType.NO: 1>)¶
A class that represents adding padding to an image. The default padding mode is border reflect.
 Arguments:
 pad : int
 size of padding on top, bottom, left and right
 mode:
 type of cv2 padding mode (e.g. constant, reflect, wrap, replicate)
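Reflection padding of this kind can be sketched in NumPy (np.pad stands in for cv2's border modes here; add_padding is a hypothetical helper):

```python
# Reflection padding on top, bottom, left and right; channels untouched.
import numpy as np

def add_padding(img, pad):
    return np.pad(img, ((pad, pad), (pad, pad), (0, 0)), mode="reflect")

img = np.arange(2 * 2 * 3, dtype=float).reshape(2, 2, 3)
padded = add_padding(img, 1)
```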

class fastai.transforms.CenterCrop(sz, tfm_y=<TfmType.NO: 1>, sz_y=None)¶
A class that represents a center crop. This transform (optionally) transforms x and y with the same parameters.
 Arguments:
 sz : int
 size of the crop
 tfm_y : TfmType
 type of y transformation
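The crop arithmetic can be sketched as (center_crop is a hypothetical helper, not the fastai code):

```python
# Take a sz x sz window from the middle of the image.
import numpy as np

def center_crop(img, sz):
    h, w = img.shape[:2]
    top, left = (h - sz) // 2, (w - sz) // 2
    return img[top:top + sz, left:left + sz]

img = np.arange(6 * 8).reshape(6, 8)
crop = center_crop(img, 4)
```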

class fastai.transforms.RandomCrop(targ_sz, tfm_y=<TfmType.NO: 1>, sz_y=None)¶
A class that represents a random crop transformation. This transform (optionally) transforms x and y with the same parameters.
 Arguments:
 targ_sz : int
 target size of the crop
 tfm_y : TfmType
 type of y transformation

class fastai.transforms.NoCrop(sz, tfm_y=<TfmType.NO: 1>, sz_y=None)¶
A transformation that resizes to a square image without cropping. This transform (optionally) resizes x and y with the same parameters.
 Arguments:
 sz : int
 target size of the resize
 tfm_y : TfmType
 type of y transformation

class fastai.transforms.Scale(sz, tfm_y=<TfmType.NO: 1>, sz_y=None)¶
A transformation that scales the image so that its minimum side becomes sz, preserving the aspect ratio.
 Arguments:
 sz : int
 target size for the minimum side
 tfm_y : TfmType
 type of y transformation
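The target-size arithmetic can be sketched as follows (scale_dims is a hypothetical helper; the actual resizing step is library-specific and omitted):

```python
# Compute the output size so that the smaller side equals sz and the
# aspect ratio is preserved.
def scale_dims(h, w, sz):
    ratio = sz / min(h, w)
    return round(h * ratio), round(w * ratio)
```

For example, a 480x640 image scaled with sz=224 keeps its aspect ratio and becomes 224x299.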

class fastai.transforms.RandomScale(sz, max_zoom, p=0.75, tfm_y=<TfmType.NO: 1>, sz_y=None)¶
Scales an image so that its minimum size is a random number in the range [sz, sz*max_zoom]. This transform (optionally) scales x and y with the same parameters.
 Arguments:
 sz : int
 target size
 max_zoom : float
 maximum zoom factor (must be >= 1.0)
 p : float
 probability of applying the random sizing
 tfm_y : TfmType
 type of y transform

class fastai.transforms.RandomRotate(deg, p=0.75, mode=2, tfm_y=<TfmType.NO: 1>)¶
Rotates images and (optionally) the target y. Rotation of coordinates is treated differently for x and y in this transform.
 Arguments:
 deg : float
 degree to rotate
 p : float
 probability of rotation
 mode:
 type of border
 tfm_y : TfmType
 type of y transform

class fastai.transforms.RandomDihedral(tfm_y=<TfmType.NO: 1>)¶
Rotates images by random multiples of 90 degrees and/or applies a reflection. See D8, the dihedral group of order eight: the group of all symmetries of the square.
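The eight symmetries can be enumerated with NumPy as follows (an illustrative sketch, not the fastai implementation; dihedral is a hypothetical helper):

```python
# The eight symmetries of the square (the dihedral group D8):
# four rotations, each optionally followed by a horizontal flip.
import numpy as np

def dihedral(img, k):
    """Apply the k-th of the eight dihedral transforms, k in 0..7."""
    out = np.rot90(img, k % 4)
    if k >= 4:
        out = np.fliplr(out)
    return out

img = np.arange(9).reshape(3, 3)
variants = [dihedral(img, k) for k in range(8)]
```

For any image without rotational or mirror symmetry, the eight variants are all distinct.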

class fastai.transforms.RandomFlip(tfm_y=<TfmType.NO: 1>, p=0.5)¶
Randomly flips the image horizontally with probability p.

class fastai.transforms.RandomLighting(b, c, tfm_y=<TfmType.NO: 1>)¶
Randomly adjusts the lighting balance (up to b) and contrast (up to c) of the image.

class fastai.transforms.RandomRotateZoom(deg, zoom, stretch, ps=None, mode=2, tfm_y=<TfmType.NO: 1>)¶
Randomly selects between a rotation, a zoom, a stretch, or no transform.
 Arguments:
 deg : float
 maximum degrees of rotation
 zoom : float
 maximum fraction of zoom
 stretch : float
 maximum fraction of stretch
 ps : list of length 4
 probabilities for each transform, in the order listed above (the 4th probability is ‘no transform’)

class fastai.transforms.RandomStretch(max_stretch, tfm_y=<TfmType.NO: 1>)¶
Stretches the image along one axis by a random factor of up to max_stretch.

class fastai.transforms.PassThru(tfm_y=<TfmType.NO: 1>)¶
Applies no transformation; the image is passed through unchanged.

class fastai.transforms.RandomBlur(blur_strengths=5, probability=0.5, tfm_y=<TfmType.NO: 1>)¶
Adds a Gaussian blur to the image with the given probability. Multiple blur strengths can be configured; one of them is chosen at random.

class fastai.transforms.Cutout(n_holes, length, tfm_y=<TfmType.NO: 1>)¶
Randomly masks out n_holes square regions of side length from the image.
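The masking idea can be sketched in NumPy (cutout is a hypothetical helper; the real transform may handle borders differently):

```python
# Zero out n_holes square patches of side `length` at random positions.
import numpy as np

def cutout(img, n_holes, length, rng):
    out = img.copy()
    h, w = out.shape[:2]
    for _ in range(n_holes):
        y = rng.integers(0, h - length + 1)
        x = rng.integers(0, w - length + 1)
        out[y:y + length, x:x + length] = 0   # mask the patch
    return out

rng = np.random.default_rng(0)
img = np.ones((16, 16))
cut = cutout(img, n_holes=2, length=4, rng=rng)
```

Holes may overlap, so between 16 and 32 pixels end up masked in this example.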

class fastai.transforms.GoogleNetResize(targ_sz, min_area_frac=0.08, min_aspect_ratio=0.75, max_aspect_ratio=1.333, flip_hw_p=0.5, tfm_y=<TfmType.NO: 1>, sz_y=None)¶
Randomly crops an image with a random aspect ratio and returns a square image resized to targ_sz.
 Arguments:
 targ_sz : int
 target size
 min_area_frac : float (< 1.0)
 minimum fraction of the original image area to crop
 min_aspect_ratio : float
 minimum aspect ratio
 max_aspect_ratio : float
 maximum aspect ratio
 flip_hw_p : float
 probability of swapping the height and width magnitudes
 tfm_y : TfmType
 type of y transform
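The crop-sampling arithmetic can be sketched as follows (sample_crop is a hypothetical helper; the real transform also resizes the crop to targ_sz, which is omitted here):

```python
# Sample a crop whose area is a random fraction of the image (at least
# min_area_frac) and whose aspect ratio lies in a given range.
import math
import numpy as np

def sample_crop(h, w, rng, min_area_frac=0.08,
                min_aspect=0.75, max_aspect=1.333):
    area = h * w * rng.uniform(min_area_frac, 1.0)
    aspect = rng.uniform(min_aspect, max_aspect)   # width / height of the crop
    ch = round(math.sqrt(area / aspect))
    cw = round(math.sqrt(area * aspect))
    if ch > h or cw > w:                           # fall back to the full image
        return h, w
    return ch, cw

rng = np.random.default_rng(0)
ch, cw = sample_crop(300, 400, rng)
```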
Note
Transformations are often run in multiple threads. Therefore any state must be stored in thread-local storage. The Transform class provides a thread-local store attribute for you to use. See the RandomFlip class for an example of how to use random state safely in Transform subclasses.
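The thread-local pattern the note describes can be sketched as (TransformSketch is a hypothetical class, not the fastai Transform):

```python
# Each thread gets its own view of `store`, so transforms running
# concurrently do not share random state.
import threading

class TransformSketch:
    def __init__(self):
        self.store = threading.local()   # per-thread attribute storage

tfm = TransformSketch()
results = {}

def worker(i):
    tfm.store.flip = (i % 2 == 0)        # visible only inside this thread
    results[i] = tfm.store.flip

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each thread sees only the value it set itself; the main thread, which never set the attribute, sees none at all.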
The class TfmType defines the type of transformation:

class fastai.transforms.TfmType¶
Type of transformation (an IntEnum with predefined values):
 NO: the default; y is not transformed when x is transformed.
 PIXEL: x and y are images and should be transformed in the same way. Example: image segmentation.
 COORD: y consists of coordinates (e.g. bounding boxes).
 CLASS: y consists of class labels (same behaviour as PIXEL, except no normalization).