Tuesday, August 2, 2022

Image Augmentation with Keras Preprocessing Layers and tf.image

Last Updated on July 20, 2022

When we work on a machine learning problem related to images, we not only need to collect some images as training data but also need to employ augmentation to create variations in the images. This is especially true for more complex object recognition problems.

There are many ways to do image augmentation. You may use some external libraries or write your own functions for that. There are some modules in TensorFlow and Keras for augmentation, too. In this post, you will discover how to use the Keras preprocessing layers as well as the tf.image module in TensorFlow for image augmentation.

After reading this post, you will know:

  • What the Keras preprocessing layers are and how to use them
  • What functions the tf.image module provides for image augmentation
  • How to use augmentation together with a tf.data dataset

Let’s get started.

Image Augmentation with Keras Preprocessing Layers and tf.image.
Photo by Steven Kamenar. Some rights reserved.


This article is split into five sections; they are:

  • Getting Images
  • Visualizing the Images
  • Keras Preprocessing Layers
  • Using tf.image API for Augmentation
  • Using Preprocessing Layers in Neural Networks

Getting Images

Before we see how we can do augmentation, we need to get the images. Ultimately, we need the images to be represented as arrays, for example, in H×W×3 with 8-bit integers for the RGB pixel values. There are many ways to get the images. Some can be downloaded as a ZIP file. If you are using TensorFlow, you may get some image datasets from the tensorflow_datasets library.

In this tutorial, we are going to use the citrus leaves images, which is a small dataset of less than 100MB. It can be downloaded from tensorflow_datasets as follows:

Running this code the first time will download the image dataset to your computer with the following output:

The function above returns the images as a tf.data dataset object together with the metadata. This is a classification dataset. We can print the training labels with the following:
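One way to read the label names, sketched under the assumption that the metadata object returned with with_info=True stores the class labels under the feature key "label". The helper name label_names() is hypothetical:

```python
def label_names(meta):
    """Return the class names stored in a dataset metadata object.

    Assumption: the label feature is stored under the key "label".
    """
    return meta.features["label"].names
```

With the metadata from the loading step, print(label_names(meta)) would list the classes in label-index order.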

and this prints:

If you run this code again at a later time, you will reuse the downloaded images. The other way to load the downloaded images into a tf.data dataset is to use the image_dataset_from_directory() function.

As we can see from the screen output above, the dataset is downloaded into the directory ~/tensorflow_datasets. If you look at the directory, you see the directory structure as follows:

The directories are the labels, and the images are files stored under their corresponding directory. We can let the function read the directory recursively into a dataset:

You may want to set batch_size=None if you do not want the dataset to be batched. Usually, we want the dataset to be batched for training a neural network model.
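The directory convention can be sketched without the real download. The example below writes a few random PNG files into a temporary folder with two made-up class directories (healthy and spotted are placeholders, not the citrus leaves labels) and reads them back. The helper make_toy_image_dir() is hypothetical:

```python
import os
import tempfile

import tensorflow as tf


def make_toy_image_dir(root, class_names=("healthy", "spotted"), per_class=4):
    """Write a few random PNG files under root/<class_name>/ (illustrative data)."""
    for name in class_names:
        os.makedirs(os.path.join(root, name), exist_ok=True)
        for i in range(per_class):
            pixels = tf.random.uniform((80, 80, 3), 0, 256, dtype=tf.int32)
            png = tf.io.encode_png(tf.cast(pixels, tf.uint8))
            tf.io.write_file(os.path.join(root, name, f"{i}.png"), png)


root = tempfile.mkdtemp()
make_toy_image_dir(root)

# Subdirectory names become the class labels, in sorted order
ds = tf.keras.utils.image_dataset_from_directory(
    root,
    image_size=(64, 64),   # every image is resized to this height and width
    batch_size=4,          # batch_size=None would give an unbatched dataset
    shuffle=True,
    seed=42,
)
print(ds.class_names)
for images, labels in ds.take(1):
    print(images.shape, images.dtype, labels.shape)
```

For the real data, root would be replaced by the image folder inside ~/tensorflow_datasets, and image_size would usually be set to the size the network expects.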

Visualizing the Images

It is important to visualize the augmentation result so we can verify that it is what we want it to be. We can use matplotlib for this.

In matplotlib, we have the imshow() function to display an image. However, for the image to be displayed correctly, it should be presented as an array of 8-bit unsigned integers (uint8).

Given that we have a dataset created using image_dataset_from_directory(), we can get the first batch (of 32 images) and display a few of them using imshow(), as follows:

Here we display nine images in a grid and label them with their corresponding classification label, using ds.class_names. The images should be converted to NumPy arrays in uint8 for display. This code displays an image like the following:
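A sketch of such a 3×3 grid, using a random stand-in batch so that it runs on its own. The function name show_grid() and the two class names are illustrative assumptions, not part of the dataset:

```python
import matplotlib.pyplot as plt
import numpy as np


def show_grid(images, labels, class_names, rows=3, cols=3):
    """Plot the first rows*cols images with their class names as titles."""
    fig, axes = plt.subplots(rows, cols, sharex=True, sharey=True, figsize=(5, 5))
    for i, ax in enumerate(axes.flat):
        # imshow() expects uint8 pixel values for 0-255 RGB images
        ax.imshow(np.asarray(images[i]).astype("uint8"))
        ax.set_title(class_names[int(labels[i])])
        ax.axis("off")
    return fig


# Stand-in batch; with a real dataset: for images, labels in ds.take(1): ...
images = np.random.uniform(0, 255, size=(32, 64, 64, 3)).astype("float32")
labels = np.random.randint(0, 2, size=32)
fig = show_grid(images, labels, class_names=["healthy", "spotted"])
```

Call plt.show() afterwards to open the window. With a dataset from image_dataset_from_directory(), the class names would come from ds.class_names.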

The complete code from loading the images to display is as follows.

Note that, if you are using tensorflow_datasets to get the images, the samples are presented as a dictionary instead of a tuple of (image, label). You should change your code slightly into the following:
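The difference can be sketched with a small stand-in dataset whose samples are dictionaries. The keys "image" and "label" are assumed here to match what tensorflow_datasets returns for this dataset:

```python
import tensorflow as tf

# Stand-in for a dataset from tfds.load(): each sample is a dictionary
ds = tf.data.Dataset.from_tensor_slices({
    "image": tf.zeros((6, 32, 32, 3), dtype=tf.uint8),
    "label": tf.constant([0, 1, 2, 3, 0, 1], dtype=tf.int64),
})

# Access fields by key instead of unpacking a tuple
for sample in ds.take(2):
    image = sample["image"].numpy().astype("uint8")
    label = int(sample["label"])
    print(image.shape, label)

# Optionally convert to (image, label) tuples so later code stays unchanged
ds_tuples = ds.map(lambda sample: (sample["image"], sample["label"]))
```

Mapping to tuples once up front is a design choice that lets the same augmentation functions serve both kinds of dataset.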

In the rest of this post, we assume the dataset is created using image_dataset_from_directory(). You may need to tweak the code slightly if your dataset is created differently.

Keras Preprocessing Layers

Keras comes with many neural network layers, such as convolution layers, that we need to train. There are also layers with no parameters to train, such as flatten layers to convert an array such as an image into a vector.

The preprocessing layers in Keras are specifically designed for use in the early stages of a neural network. We can use them for image preprocessing, such as to resize or rotate the image or to adjust the brightness and contrast. While the preprocessing layers are supposed to be part of a larger neural network, we can also use them as functions. Below is how we can use the resizing layer as a function to transform some images and display them side-by-side with the original:
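A minimal sketch of the call itself, leaving out the plotting and using a random stand-in batch in place of real images:

```python
import tensorflow as tf

# Stand-in batch of four 256x256 RGB images with values in 0-255
images = tf.random.uniform((4, 256, 256, 3), 0, 255)

# Resizing(height, width): the layer is called like an ordinary function
resize = tf.keras.layers.Resizing(256, 128)
resized = resize(images)
print(images.shape, "->", resized.shape)
```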

Our images are 256×256 pixels, and the resizing layer will make them 256×128 pixels. The output of the above code is as follows:

Since the resizing layer is a function itself, we can chain it to the dataset itself. For example,

The dataset ds has samples in the form of (image, label). Hence we created a function that takes in such a tuple and preprocesses the image with the resizing layer. We then passed this function as an argument to map() on the dataset. When we draw a sample from the new dataset created with the map() function, the image will be a transformed one.
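The pattern described above might look like the following sketch, where the dataset is a random stand-in and the function name augment() is an illustrative choice:

```python
import tensorflow as tf

# Stand-in batched dataset of (image, label) pairs
images = tf.random.uniform((8, 256, 256, 3), 0, 255)
labels = tf.constant([0, 1, 2, 3, 0, 1, 2, 3])
ds = tf.data.Dataset.from_tensor_slices((images, labels)).batch(4)

resize = tf.keras.layers.Resizing(256, 128)


def augment(image, label):
    """Transform the image and pass the label through unchanged."""
    return resize(image), label


resized_ds = ds.map(augment)
for image, label in resized_ds.take(1):
    print(image.shape, label.numpy())
```

The layer is created once outside augment() so that map() reuses it rather than building a new layer for every sample.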

There are more preprocessing layers available. Below, we demonstrate some.

As we saw above, we can resize the image. We can also randomly enlarge or shrink the height or width of an image. Similarly, we can zoom in or zoom out on an image. Below is an example that manipulates the image size in various ways for a maximum of 30% increase or decrease:

This code shows images as follows:

While we specified a fixed dimension in resize, we have a random amount of manipulation in other augmentations.
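As a hedged illustration of the random variants, the sketch below covers only zooming, using the RandomZoom layer with a factor of 0.3. It uses a random stand-in batch and passes training=True explicitly so that the random behavior is active when the layer is called as a function:

```python
import tensorflow as tf

images = tf.random.uniform((4, 256, 256, 3), 0, 255)

# One factor only: width follows height, so the aspect ratio is preserved
zoom_locked = tf.keras.layers.RandomZoom(0.3)
# Separate factors: height and width are zoomed independently, up to 30% each
zoom_free = tf.keras.layers.RandomZoom(height_factor=0.3, width_factor=0.3)

out_locked = zoom_locked(images, training=True)
out_free = zoom_free(images, training=True)
print(out_locked.shape, out_free.shape)
```

Zooming keeps the output the same size as the input; only the content inside the frame is scaled.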

We can also do flipping, rotation, cropping, and geometric translation using preprocessing layers:
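A sketch of these four layers applied to a random stand-in batch. The factor values are arbitrary choices for illustration:

```python
import tensorflow as tf

images = tf.random.uniform((4, 256, 256, 3), 0, 255)

layers = {
    "flip": tf.keras.layers.RandomFlip("horizontal_and_vertical"),
    # factor is a fraction of a full turn: up to 0.2 * 2*pi radians either way
    "rotate": tf.keras.layers.RandomRotation(0.2),
    # cut a random 128x128 window out of each image
    "crop": tf.keras.layers.RandomCrop(128, 128),
    # shift by up to 20% of the height and of the width
    "translate": tf.keras.layers.RandomTranslation(0.2, 0.2),
}

shapes = {}
for name, layer in layers.items():
    out = layer(images, training=True)
    shapes[name] = tuple(out.shape)
    print(name, out.shape)
```

Only the crop changes the image size; the other three keep the input dimensions and fill the uncovered area according to the layer's fill mode.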

This code shows the following images:

And finally, we can do augmentations on color adjustments as well:
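A sketch of two color-related layers on a random stand-in batch. It assumes a TensorFlow release recent enough to include RandomBrightness, and the factor values are arbitrary:

```python
import tensorflow as tf

images = tf.random.uniform((4, 256, 256, 3), 0, 255)

# factor=0.8: brightness shifts by up to 80% of the value range either way
brightness = tf.keras.layers.RandomBrightness(0.8, value_range=(0, 255))
# factor=0.2: contrast is scaled by a random factor between 0.8 and 1.2
contrast = tf.keras.layers.RandomContrast(0.2)

brighter = brightness(images, training=True)
contrasted = contrast(images, training=True)
print(brighter.shape, contrasted.shape)
```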

This shows the images as follows:

For completeness, below is the code to display the result of various augmentations:

Finally, it is important to point out that most neural network models work better if the input images are scaled. While we usually use 8-bit unsigned integers for the pixel values in an image (e.g., for display using imshow() as above), a neural network prefers the pixel values to be between 0 and 1 or between -1 and +1. This can be done with a preprocessing layer, too. Below is how we can update one of our examples above to add the scaling layer into the augmentation:
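The scaling itself can be sketched with the Rescaling layer on three sample pixel values. How it is combined with the other augmentations is left out here:

```python
import tensorflow as tf

pixels = tf.constant([[0.0, 127.5, 255.0]])

# output = input * scale + offset
to_unit = tf.keras.layers.Rescaling(1.0 / 255)                 # 0..255 -> 0..1
to_signed = tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1)  # 0..255 -> -1..+1

print(to_unit(pixels).numpy())
print(to_signed(pixels).numpy())
```

Remember to scale back to 0-255 (or skip the rescaling) before passing the result to imshow() as uint8.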

Using tf.image API for Augmentation

Besides the preprocessing layers, the tf.image module also provides some functions for augmentation. Unlike the preprocessing layers, these functions are intended to be used in a user-defined function and assigned to a dataset using map(), as we saw above.

The functions provided by tf.image are not duplicates of the preprocessing layers, although there is some overlap. Below is an example of using the tf.image functions to resize and crop images:

Below is the output of the above code:

While the displayed images match what we would expect from the code, the use of tf.image functions is quite different from that of the preprocessing layers. Every tf.image function is different. For example, the crop_to_bounding_box() function takes pixel coordinates, but the central_crop() function takes a fraction as its argument.
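The contrast between the argument styles can be seen in a short sketch on a random stand-in batch. The sizes and offsets are arbitrary:

```python
import tensorflow as tf

images = tf.random.uniform((4, 256, 256, 3), 0, 255)

# resize() takes the target [height, width] in pixels
resized = tf.image.resize(images, [128, 192])

# crop_to_bounding_box() takes pixel coordinates: top, left, height, width
boxed = tf.image.crop_to_bounding_box(images, 20, 40, 100, 150)

# central_crop() takes the fraction of each side to keep
center = tf.image.central_crop(images, 0.5)

print(resized.shape, boxed.shape, center.shape)
```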

These functions also differ in the way randomness is handled. Some of these functions do not have any random behavior. Therefore, for a random resize, the exact output size should be generated separately using a random number generator before calling the resize function. Some other functions, such as stateless_random_crop(), can do augmentation randomly, but a pair of random seeds in int32 needs to be specified explicitly.
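Both styles are sketched below on a single random stand-in image. The size range and the seed values are arbitrary:

```python
import tensorflow as tf

image = tf.random.uniform((256, 256, 3), 0, 255)

# resize() itself is deterministic, so draw the target size first
height = tf.random.uniform([], 128, 257, dtype=tf.int32)
width = tf.random.uniform([], 128, 257, dtype=tf.int32)
random_resized = tf.image.resize(image, tf.stack([height, width]))

# stateless_random_crop() is random but fully determined by its seed pair
seed = tf.constant([42, 7], dtype=tf.int32)
crop_a = tf.image.stateless_random_crop(image, size=(100, 100, 3), seed=seed)
crop_b = tf.image.stateless_random_crop(image, size=(100, 100, 3), seed=seed)
same_crop = bool(tf.reduce_all(crop_a == crop_b))
print(random_resized.shape, same_crop)
```

Because the stateless function depends only on its inputs, a different seed pair must be supplied for each sample to get different crops.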

To continue the example, there are functions for flipping an image and extracting the Sobel edges:
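A sketch of these calls on a random stand-in batch, with the plotting omitted:

```python
import tensorflow as tf

images = tf.random.uniform((4, 256, 256, 3), 0, 255)

flipped_lr = tf.image.flip_left_right(images)   # mirror left to right
flipped_ud = tf.image.flip_up_down(images)      # mirror top to bottom

# sobel_edges() expects a 4-D float batch and appends an axis of size 2
# holding the vertical (dy) and horizontal (dx) gradients per channel
edges = tf.image.sobel_edges(images)
edge_dy, edge_dx = edges[..., 0], edges[..., 1]
print(flipped_lr.shape, edges.shape)
```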

which shows the following:

And the following are the functions to manipulate the brightness, contrast, and colors:
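A few of these functions are sketched below on a random stand-in batch of float images scaled to the range 0 to 1. The adjustment amounts are arbitrary:

```python
import tensorflow as tf

images = tf.random.uniform((4, 256, 256, 3), 0, 1)

brighter = tf.image.adjust_brightness(images, delta=0.2)   # add 0.2 to every pixel
contrasted = tf.image.adjust_contrast(images, contrast_factor=1.5)
saturated = tf.image.adjust_saturation(images, saturation_factor=2.0)
hue_shifted = tf.image.adjust_hue(images, delta=0.1)       # delta must be in [-1, 1]

# Randomized counterparts draw the amount themselves, up to the given limit
rand_bright = tf.image.random_brightness(images, max_delta=0.2, seed=42)
print(brighter.shape, hue_shifted.shape, rand_bright.shape)
```

The results may fall outside the 0 to 1 range, so clipping with tf.clip_by_value() before display is usually worthwhile.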

This code shows the following:

Below is the complete code to display all of the above:

These augmentation functions should be enough for most uses. But if you have some specific ideas on augmentation, you would probably need a better image processing library. OpenCV and Pillow are common but powerful libraries that allow you to transform images better.

Using Preprocessing Layers in Neural Networks

We used the Keras preprocessing layers as functions in the examples above. But they can also be used as layers in a neural network, which is trivial to do. Below is an example of how we can incorporate a preprocessing layer into a classification network and train it using a dataset:
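A small sketch of this arrangement follows. The network architecture, image size, and random stand-in data are assumptions chosen to keep the example fast. It is not the model or the data from the original listing:

```python
import tensorflow as tf

# Stand-in data: 32 random 64x64 RGB images in 4 classes
images = tf.random.uniform((32, 64, 64, 3), 0, 255)
labels = tf.random.uniform((32,), 0, 4, dtype=tf.int32)
ds = (
    tf.data.Dataset.from_tensor_slices((images, labels))
    .batch(8)
    .cache()                      # keep prepared batches in memory after the first pass
    .prefetch(tf.data.AUTOTUNE)   # prepare the next batch while the model trains
)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(64, 64, 3)),
    # Augmentation layers are active during fit() and pass data through at inference
    tf.keras.layers.RandomFlip("horizontal_and_vertical"),
    tf.keras.layers.RandomRotation(0.2),
    tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(4),
])
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
history = model.fit(ds, epochs=1, verbose=0)
print(history.history.keys())
```

Placing the augmentation inside the model means the saved model carries its own preprocessing, at the cost of running the augmentation on the same device as training.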

Running this code gives the following output:

In the code above, we created the dataset with cache() and prefetch(). This is a performance technique that allows the dataset to prepare data asynchronously while the neural network is trained. This would be significant if the dataset has some other augmentation assigned using the map() function.

You will see some improvement in accuracy if you remove the RandomFlip and RandomRotation layers because you make the problem easier. However, as we want the network to predict well on a wide variation of image quality and properties, using augmentation can help make the resulting network more powerful.

Further Reading

Below is documentation from TensorFlow that is related to the examples above:


In this post, you have seen how we can use the tf.data dataset with image augmentation functions from Keras and TensorFlow.

Specifically, you learned:

  • How to use the preprocessing layers from Keras, both as a function and as part of a neural network
  • How to create your own image augmentation function and apply it to the dataset using the map() function
  • How to use the functions provided by the tf.image module for image augmentation
