Preprocessing
Overview
Band provides a set of APIs to preprocess data. Preprocessing is a mandatory and often time-consuming step when running a machine learning model on Band. Band provides Buffer and BufferProcessor to help you build the preprocessing pipeline efficiently.
Buffer is a wrapper for arbitrary data. It wraps data such as an image, text, or a Band Tensor together with its metadata so that the data can be used in the preprocessing pipeline. BufferProcessor is a set of APIs to preprocess the data. It provides basic preprocessing operations such as resize, normalize, and crop. Currently only image preprocessing is supported; support for other data types such as text and audio is planned.
Example Usage - ImageProcessor
The example below shows how to use BufferProcessor to preprocess image data. It creates a Buffer from raw RGB data with (width, height) dimensions and 3 channels. Then it creates an ImageProcessor with Resize and Normalize operations. Finally, it preprocesses the Buffer and writes the result into a Tensor with (224, 224) dimensions, 3 channels, and the kFloat32 data type, normalized with a mean of 127.5 and a standard deviation of 127.5.
Tensor* tensor = ... // Create a tensor
// Create a buffer from raw RGB data
unsigned char* data = new unsigned char[width * height * 3];
... // Fill the data
Buffer* buffer = Buffer::CreateFromRaw(data, width, height, 3, BufferFormat::kRGB, DataType::kUInt8);
// Create an image processor that resizes to 224x224 and then normalizes
ImageProcessorBuilder builder;
builder.SetResize(224, 224);
builder.SetNormalize(127.5f, 127.5f);
absl::StatusOr<std::unique_ptr<BufferProcessor>> preprocessor =
    builder.Build();
if (!preprocessor.ok()) {
  ... // Handle the error reported by preprocessor.status()
}
// Preprocess the buffer (dereference the StatusOr to reach the processor)
(*preprocessor)->process(*buffer, *tensor);
... // Use the tensor
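To make the normalization step concrete, the following is a minimal standalone sketch of what a mean/std normalization of 8-bit data typically computes. It assumes the conventional formula (value - mean) / std, which the text above does not spell out, and NormalizeU8 is a hypothetical helper written for illustration, not a Band API.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of Band: converts 8-bit samples to float and
// applies (value - mean) / std_dev to each one (assumed convention).
std::vector<float> NormalizeU8(const std::vector<std::uint8_t>& input,
                               float mean, float std_dev) {
  assert(std_dev != 0.0f);  // A zero divisor would produce inf or NaN.
  std::vector<float> output;
  output.reserve(input.size());
  for (std::uint8_t value : input) {
    output.push_back((static_cast<float>(value) - mean) / std_dev);
  }
  return output;
}
```

Under this assumption, a mean and standard deviation of 127.5 map the 8-bit range [0, 255] onto [-1, 1], which is a common input range for image models.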
Buffer
Buffer can be created from the following data types and metadata:
- raw data, width, height, BufferFormat, DataType, and BufferOrientation (BufferFormat::kGrayScale, BufferFormat::kRGB, BufferFormat::kRGBA, and BufferFormat::kRaw only)
- y plane, u plane, v plane, width, height, row stride of y plane, row stride of uv plane, pixel stride of uv plane, BufferFormat, DataType, and BufferOrientation (BufferFormat::kYV12, BufferFormat::kYV21, BufferFormat::kNV21, and BufferFormat::kNV12 only)
- Tensor
Currently, every BufferFormat other than kRaw supports only the kUInt8 DataType.
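The stride parameters of the YUV variant can be illustrated with a small standalone sketch. It assumes the usual meanings: the row stride is the number of bytes between the starts of consecutive rows in a plane, and the pixel stride is the number of bytes between consecutive samples of the same plane. ChromaAt is a hypothetical helper for illustration and is not part of Band.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not part of Band: reads the chroma sample that covers
// the luma pixel at (x, y) in a 4:2:0 image.
std::uint8_t ChromaAt(const std::uint8_t* plane, std::size_t x, std::size_t y,
                      std::size_t row_stride_uv, std::size_t pixel_stride_uv) {
  assert(plane != nullptr);
  // 4:2:0 subsampling: one chroma sample per 2x2 block of luma pixels.
  const std::size_t cx = x / 2;
  const std::size_t cy = y / 2;
  return plane[cy * row_stride_uv + cx * pixel_stride_uv];
}
```

With an interleaved layout such as NV12 (U and V bytes alternate in one plane), the pixel stride is 2 and the V pointer starts one byte after the U pointer. With a planar layout, each chroma plane is separate and the pixel stride is 1.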
Enumeration Types
BufferFormat
- kGrayScale - 8-bit gray scale
- kRGB - 8-bit RGB
- kRGBA - 8-bit RGBA
- kNV21 - YUV 4:2:0, 8 bit per channel, interleaved
- kNV12 - YUV 4:2:0, 8 bit per channel, interleaved
- kYV12 - YUV 4:2:0, 8 bit per channel, planar
- kYV21 - YUV 4:2:0, 8 bit per channel, planar
- kRaw - raw data
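As a rough guide to how much memory each format needs, the sketch below computes the byte size of a tightly packed 8-bit image. SketchFormat is a local stand-in that mirrors the names listed above; it is not Band's BufferFormat definition, and PackedByteSize is a hypothetical helper. kRaw is left out because its layout is not specified here.

```cpp
#include <cassert>
#include <cstddef>

// Local stand-in that mirrors the names listed above; not Band's actual enum.
enum class SketchFormat { kGrayScale, kRGB, kRGBA, kNV21, kNV12, kYV12, kYV21 };

// Returns the byte count of a tightly packed 8-bit image (no row padding).
std::size_t PackedByteSize(SketchFormat format, std::size_t width,
                           std::size_t height) {
  switch (format) {
    case SketchFormat::kGrayScale:
      return width * height;
    case SketchFormat::kRGB:
      return width * height * 3;
    case SketchFormat::kRGBA:
      return width * height * 4;
    case SketchFormat::kNV21:
    case SketchFormat::kNV12:
    case SketchFormat::kYV12:
    case SketchFormat::kYV21: {
      // 4:2:0: a full-resolution Y plane plus U and V at half resolution
      // in each dimension (rounded up for odd sizes).
      const std::size_t chroma_w = (width + 1) / 2;
      const std::size_t chroma_h = (height + 1) / 2;
      return width * height + 2 * chroma_w * chroma_h;
    }
  }
  assert(false && "unknown format");
  return 0;
}
```

For even dimensions, the four YUV 4:2:0 formats all need width * height * 3 / 2 bytes; they differ only in how the chroma samples are arranged.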
DataType
kNoType
kFloat32
kInt32
kUInt8
kInt64
kString
kBool
kInt16
kComplex64
kInt8
kFloat16
kFloat64
BufferProcessor
ImageProcessor
ImageProcessor supports the following operations:
- Crop(int x0, int y0, int x1, int y1): crop from top-left corner, inclusive
- Resize(int width, int height): resize to a new size
- Rotate(float angle): counter-clockwise, between 0 and 360 in multiples of 90
- Flip(bool horizontal, bool vertical)
- ConvertColorSpace(BufferFormat target_format): convert the color space
- Normalize(float mean, float std)
- DataTypeConvert(): convert the data type to the output data type, e.g., convert from 8-bit RGB to 32-bit float RGB (tensor).
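The "inclusive" note on Crop affects the output size, so here is a minimal standalone sketch of an inclusive crop on a tightly packed, interleaved 8-bit image. CropInclusive is a hypothetical helper for illustration only; it assumes both corners are part of the result and does not reflect how Band implements the operation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not Band's implementation: crops a tightly packed,
// interleaved 8-bit image. Corners (x0, y0) and (x1, y1) are both inclusive.
std::vector<std::uint8_t> CropInclusive(const std::vector<std::uint8_t>& src,
                                        std::size_t width, std::size_t height,
                                        std::size_t channels, std::size_t x0,
                                        std::size_t y0, std::size_t x1,
                                        std::size_t y1) {
  assert(src.size() == width * height * channels);
  assert(x0 <= x1 && x1 < width && y0 <= y1 && y1 < height);
  const std::size_t out_width = x1 - x0 + 1;   // +1 because x1 is included
  const std::size_t out_height = y1 - y0 + 1;  // +1 because y1 is included
  std::vector<std::uint8_t> dst;
  dst.reserve(out_width * out_height * channels);
  for (std::size_t y = y0; y <= y1; ++y) {
    // Copy one row of the crop window.
    const std::uint8_t* row = src.data() + (y * width + x0) * channels;
    dst.insert(dst.end(), row, row + out_width * channels);
  }
  return dst;
}
```

Under this reading, cropping from (1, 1) to (2, 2) yields a 2x2 image rather than a 1x1 image.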
ImageProcessorBuilder provides a simple way to create an ImageProcessor. The user predefines the operations, and ImageProcessorBuilder creates an ImageProcessor with those operations.
By default, an ImageProcessorBuilder without any operations creates an ImageProcessor that provides a direct mapping from the entire Buffer to the Tensor without normalization. This covers the most common use case of the preprocessing.
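The builder flow described above (declare operations first, then build a processor that applies them) can be sketched in a few lines. This is a generic illustration of the builder pattern on plain float values, with hypothetical names throughout (PixelOp, SketchProcessor, SketchProcessorBuilder). It is not Band's ImageProcessorBuilder and it does not model Band's default Buffer-to-Tensor mapping.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical names throughout; a generic builder-pattern illustration.
using PixelOp = std::function<float(float)>;

class SketchProcessor {
 public:
  explicit SketchProcessor(std::vector<PixelOp> ops) : ops_(std::move(ops)) {}

  // Applies every configured operation, in insertion order, to each value.
  std::vector<float> Process(std::vector<float> values) const {
    for (float& v : values) {
      for (const PixelOp& op : ops_) v = op(v);
    }
    return values;
  }

 private:
  std::vector<PixelOp> ops_;
};

class SketchProcessorBuilder {
 public:
  // Records an operation; returns *this so calls can be chained.
  SketchProcessorBuilder& Add(PixelOp op) {
    assert(static_cast<bool>(op));  // Reject empty std::function objects.
    ops_.push_back(std::move(op));
    return *this;
  }

  // Creates a processor holding a copy of the recorded operations.
  SketchProcessor Build() const { return SketchProcessor(ops_); }

 private:
  std::vector<PixelOp> ops_;
};
```

Separating configuration from execution in this way lets a processor be built once and reused across many inputs, which is the main reason to prefer a builder over passing options on every call.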