Preprocessing
Overview
Band provides a set of APIs for preprocessing data. Preprocessing is a mandatory and often time-consuming step before a machine learning model can run on Band. Band provides Buffer and BufferProcessor to help build preprocessing pipelines efficiently.
Buffer is a generic wrapper around data. It pairs the data (such as an image, text, or a Band Tensor) with its metadata so that it can be used in the preprocessing pipeline. BufferProcessor is a set of APIs for preprocessing the data; it provides basic operations such as resize, normalize, and crop. Only image preprocessing is supported at the moment; support for other data types such as text and audio is planned.
Example Usage - ImageProcessor
The example below shows how to use BufferProcessor to preprocess image data. It creates a Buffer from raw RGB data with (width, height) dimensions and 3 channels, then creates an ImageProcessor with Resize and Normalize operations. Finally, it preprocesses the Buffer into a kFloat32 Tensor with (224, 224) dimensions and 3 channels, normalized with a mean of 127.5 and a standard deviation of 127.5.
Tensor* tensor = ... // Create a tensor
// Create a buffer
unsigned char* data = new unsigned char[width * height * 3];
... // Fill the data (write into the allocated array; do not reassign the pointer)
Buffer* buffer = Buffer::CreateFromRaw(data, width, height, 3, BufferFormat::kRGB, DataType::kUInt8);
// Create an image processor
ImageProcessorBuilder builder;
builder.SetResize(224, 224);
builder.SetNormalize(127.5f, 127.5f);
absl::StatusOr<std::unique_ptr<BufferProcessor>> preprocessor =
    builder.Build();
// Preprocess the buffer once the processor has been built successfully
if (preprocessor.ok()) {
  (*preprocessor)->process(*buffer, *tensor);
}
... // Use the tensor
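To make the Normalize step above concrete, here is a minimal standalone sketch. It assumes that normalization computes (value - mean) / std for each sample, which is the conventional definition; the formula is not stated in this document. `NormalizePixels` is a hypothetical helper for illustration and is not part of Band.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of Band: converts 8-bit samples to float
// and applies (value - mean) / std_dev to each one.
std::vector<float> NormalizePixels(const std::vector<std::uint8_t>& pixels,
                                   float mean, float std_dev) {
  assert(std_dev != 0.0f);  // a zero divisor would produce inf or NaN
  std::vector<float> out;
  out.reserve(pixels.size());
  for (std::uint8_t p : pixels) {
    out.push_back((static_cast<float>(p) - mean) / std_dev);
  }
  return out;
}
```

Under this assumption, a mean of 127.5 and a standard deviation of 127.5 map the 8-bit range [0, 255] onto [-1, 1]: an input of 0 becomes -1 and an input of 255 becomes 1.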
Buffer
Buffer can be created from the following data types and metadata:
- raw data, width, height, BufferFormat, DataType, and BufferOrientation (BufferFormat::kGrayScale, BufferFormat::kRGB, BufferFormat::kRGBA, and BufferFormat::kRaw only)
- y plane, u plane, v plane, width, height, row stride of y plane, row stride of uv plane, pixel stride of uv plane, BufferFormat, DataType, and BufferOrientation (BufferFormat::kYV12, BufferFormat::kYV21, BufferFormat::kNV21, and BufferFormat::kNV12 only)
- Tensor
Currently, every BufferFormat other than kRaw supports only the kUInt8 DataType.
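The stride parameters of the YUV variant can be illustrated with a small sketch. It assumes the common convention that a row stride is the number of bytes between the starts of consecutive rows in a plane, and a pixel stride is the number of bytes between consecutive samples in a row; this document does not define either term. `LumaOffset` and `ChromaOffset` are hypothetical helpers, not Band APIs.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helpers, not part of Band.

// Byte offset of the luma sample for pixel (x, y) inside the y plane.
std::size_t LumaOffset(std::size_t x, std::size_t y,
                       std::size_t row_stride_y) {
  assert(x < row_stride_y);  // a row cannot be shorter than the pixel index
  return y * row_stride_y + x;
}

// Byte offset of the chroma sample that covers pixel (x, y) inside a u or v
// plane. In YUV 4:2:0, one chroma sample is shared by a 2x2 block of pixels.
std::size_t ChromaOffset(std::size_t x, std::size_t y,
                         std::size_t row_stride_uv,
                         std::size_t pixel_stride_uv) {
  assert(pixel_stride_uv >= 1);
  return (y / 2) * row_stride_uv + (x / 2) * pixel_stride_uv;
}
```

For a tightly packed 4x4 image, a planar layout would typically use a uv row stride of 2 and a pixel stride of 1, so pixel (3, 3) reads chroma at offset 3. An interleaved layout would typically use a uv row stride of 4 and a pixel stride of 2, giving offset 6 for the same pixel.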
Enumeration Types
- BufferFormat
  - kGrayScale - 8-bit gray scale
  - kRGB - 8-bit RGB
  - kRGBA - 8-bit RGBA
  - kNV21 - YUV 4:2:0, 8 bit per channel, interleaved
  - kNV12 - YUV 4:2:0, 8 bit per channel, interleaved
  - kYV12 - YUV 4:2:0, 8 bit per channel, planar
  - kYV21 - YUV 4:2:0, 8 bit per channel, planar
  - kRaw - raw data
- DataType
  - kNoType
  - kFloat32
  - kInt32
  - kUInt8
  - kInt64
  - kString
  - kBool
  - kInt16
  - kComplex64
  - kInt8
  - kFloat16
  - kFloat64
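As a rough guide to how much memory each 8-bit format needs, the sketch below computes the size of a tightly packed image. It assumes no row padding and, for YUV 4:2:0, even width and height. `PixelLayout` is a local stand-in that only mirrors the format names for illustration; it is not Band's BufferFormat, and `PackedSizeBytes` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>

// Local stand-in for illustration only; not Band's BufferFormat.
// kYuv420 covers all four 4:2:0 variants, which hold the same number of
// samples and differ only in how the chroma samples are arranged.
enum class PixelLayout { kGrayScale, kRGB, kRGBA, kYuv420 };

// Bytes needed for a tightly packed 8-bit image (no row padding).
std::size_t PackedSizeBytes(PixelLayout layout, std::size_t width,
                            std::size_t height) {
  switch (layout) {
    case PixelLayout::kGrayScale:
      return width * height;
    case PixelLayout::kRGB:
      return width * height * 3;
    case PixelLayout::kRGBA:
      return width * height * 4;
    case PixelLayout::kYuv420:
      // Full-resolution Y plane plus U and V planes, each subsampled by 2
      // horizontally and vertically.
      assert(width % 2 == 0 && height % 2 == 0);
      return width * height + 2 * (width / 2) * (height / 2);
  }
  return 0;  // not reached for valid enum values
}
```

For example, a packed 224x224 RGB image takes 224 * 224 * 3 = 150528 bytes, while a 4x4 YUV 4:2:0 image takes 16 + 2 * 4 = 24 bytes.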
BufferProcessor
ImageProcessor
ImageProcessor supports the following operations:
- Crop(int x0, int y0, int x1, int y1): crop from top-left corner, inclusive
- Resize(int width, int height): resize to a new size
- Rotate(float angle): counter-clockwise, between 0 and 360 in multiples of 90
- Flip(bool horizontal, bool vertical)
- ConvertColorSpace(BufferFormat target_format): convert the color space
- Normalize(float mean, float std)
- DataTypeConvert(): convert the data type to the output data type, e.g., convert from 8-bit RGB to 32-bit float RGB (tensor).
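The output size of the Crop and Rotate operations can be worked out with a short sketch. It assumes that "inclusive" applies to both corners of the crop rectangle, so the pixel at (x1, y1) is part of the result, and it takes the angle as an integer for simplicity even though Rotate is documented with a float parameter. `CropSize` and `RotatedSize` are hypothetical helpers, not Band APIs.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helpers, not part of Band.

// (width, height) of a crop whose corners (x0, y0) and (x1, y1) are both
// inclusive.
std::pair<int, int> CropSize(int x0, int y0, int x1, int y1) {
  assert(x0 <= x1 && y0 <= y1);
  return {x1 - x0 + 1, y1 - y0 + 1};
}

// (width, height) after rotating by a multiple of 90 degrees. Odd multiples
// (90, 270) swap width and height; even multiples keep them.
std::pair<int, int> RotatedSize(int width, int height, int angle_degrees) {
  assert(angle_degrees % 90 == 0);
  const bool swap = (angle_degrees / 90) % 2 != 0;
  return swap ? std::make_pair(height, width) : std::make_pair(width, height);
}
```

Under these assumptions, cropping from (0, 0) to (223, 223) yields a 224x224 image, and rotating a 640x480 image by 90 degrees yields a 480x640 image.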
ImageProcessorBuilder provides a simple way to create an ImageProcessor. The user declares the desired operations, and ImageProcessorBuilder creates an ImageProcessor that applies them.
By default, an ImageProcessorBuilder with no operations creates an ImageProcessor that maps the entire Buffer directly to the Tensor without normalization. This covers the most common preprocessing use case.