In an effort to understand the inner workings of deep neural networks and what information these models learn after being trained on a specific task, a team at Google came up with what's known today as DeepDream. The results of the experiment were interesting: it turns out that a neural network trained to classify images (i.e. discriminate between different classes) contains enough information to generate new, artistic-looking images.

DeepDream artistic images

In this notebook, we will implement DeepDream in TensorFlow from scratch and test it on a couple of images.

The DeepDream model is built on a backbone model trained on ImageNet, and its outputs are a few activation layers picked from that backbone. We run an image through this model, compute the gradients of those activations with respect to the image, then modify the original image to increase the magnitude of the activations, which in turn magnifies the patterns in the image.
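To make the idea concrete before diving in, here is a minimal sketch of a single update step (dream_step is a hypothetical helper for illustration; the actual implementation is developed step by step below):

import tensorflow as tf

def dream_step(model, image, step_size):
    # One gradient-ascent step: nudge the image so that the chosen
    # activations fire more strongly
    with tf.GradientTape() as tape:
        tape.watch(image)  # the image is a plain tensor, so watch it explicitly
        loss = tf.reduce_mean(model(tf.expand_dims(image, axis=0)))  # assuming a single output layer
    gradients = tape.gradient(loss, image)
    return image + step_size * gradients  # ascend rather than descend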

In our case, the backbone model is Inception V3 (which you can read more about here). The following diagram shows an overview of the model architecture:

Inception V3 model architecture diagram

#collapse
import numpy as np
import tensorflow as tf
from tensorflow.keras import Model
from tensorflow.keras.applications.inception_v3 import *
from tensorflow.keras.preprocessing.image import *
from tqdm import tqdm
import matplotlib.pyplot as plt

To create the DeepDream model, we define the following helper function, which loads InceptionV3 from tf.keras.applications and uses a few of its intermediate activation layers as the outputs of the model. By default we pick the mixed3 and mixed5 activation layers.

def create_model(layers=None):
    if not layers:
        layers = ['mixed3', 'mixed5']
    # Load the InceptionV3 backbone without its classification head
    base = InceptionV3(weights='imagenet', include_top=False)
    # Expose the selected intermediate activations as the model outputs
    outputs = [base.get_layer(name).output for name in layers]
    return Model(base.input, outputs)
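If you want to experiment with other layers, you can list the candidate "mixed" blocks directly from the backbone (Keras' InceptionV3 names its main concatenation layers mixed0 through mixed10):

# Print the names of the concatenation layers we can choose from
base = InceptionV3(weights='imagenet', include_top=False)
print([layer.name for layer in base.layers if layer.name.startswith('mixed')])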

We also need to define a few utility functions that we will use to process the images, for example scaling an image by a factor or converting a tensor image into a NumPy array.

def convert_tensor_to_nparray(image):
    # Map the image from [-1, 1] (InceptionV3's input range) back to [0, 255]
    image = 255 * (image + 1.0) / 2.0
    image = tf.cast(image, tf.uint8)
    image = np.array(image)
    return image

def scale_image(image, base_shape, scale, factor):
    # Resize the image to base_shape * scale**factor, then normalize it
    # back to [-1, 1] with InceptionV3's preprocessing
    new_shape = tf.cast(base_shape * (scale ** factor), tf.int32)
    image = tf.image.resize(image, new_shape).numpy()
    image = preprocess_input(image)
    image = tf.convert_to_tensor(image)
    return image
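As a quick sanity check (with illustrative values), a tensor in InceptionV3's [-1, 1] input range should come back from convert_tensor_to_nparray as uint8 values in [0, 255]:

# Illustrative round trip through the converter
dummy = tf.random.uniform((4, 4, 3), minval=-1.0, maxval=1.0)
converted = convert_tensor_to_nparray(dummy)
print(converted.dtype, converted.min(), converted.max())  # uint8, within [0, 255]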

Next, we define a function to calculate the loss, which is simply the average of the activations resulting from a forward pass with the input image.

def calculate_loss(model, image):
    image_batch = tf.expand_dims(image, axis=0)
    activations = model(image_batch)
    # With a single output layer the model returns a tensor, not a list
    if len(activations) == 1:
        activations = [activations]

    # Average the activations of each selected layer...
    losses = []
    for activation in activations:
        loss = tf.math.reduce_mean(activation)
        losses.append(loss)

    # ...and sum the per-layer averages into a single scalar loss
    total_loss = tf.reduce_sum(losses)
    return total_loss
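For instance (purely illustrative, the exact number is arbitrary), evaluating the loss on random noise returns a single scalar, the sum of the per-layer mean activations:

# Illustrative: the loss on random noise is just a scalar baseline
noise = tf.random.uniform((224, 224, 3), minval=-1.0, maxval=1.0)
print(calculate_loss(create_model(), noise).numpy())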

To calculate the gradients, we need to perform the forward pass inside a tf.GradientTape() context. After that, we simply update the image to maximize the activations on the next run.

Note how we are using the tf.function decorator, which compiles the function into a TensorFlow graph and can improve performance significantly.

@tf.function
def forward_pass(model, image, steps, step_size):
    loss = tf.constant(0.0)
    for _ in range(steps):
        with tf.GradientTape() as tape:
            # The image is not a tf.Variable, so we watch it explicitly
            tape.watch(image)
            loss = calculate_loss(model, image)

        # Normalize the gradients to keep the update size stable
        gradients = tape.gradient(loss, image)
        gradients /= tf.math.reduce_std(gradients) + 1e-8

        # Gradient ascent: move the image toward higher activations
        image = image + gradients * step_size
        image = tf.clip_by_value(image, -1, 1)

    return image, loss

All the previous functions are combined in the following function, which takes an input image and a model and constructs the final dream-like picture.

The other input parameters to this function have the following purpose:

  • octave_scale: the scale by which we increase the size of the image at each octave
  • octave_power_factors: the factors applied as powers to the octave_scale parameter
  • steps: the number of iterations we run the image through the DeepDream model
  • step_size: used to scale the gradients before adding them to the image
def dream(dreamer_model, image, octave_scale=1.30, octave_power_factors=None, steps=100, step_size=0.01):
    if not octave_power_factors:
        octave_power_factors = [*range(-2, 3)]
    image = tf.constant(np.array(image))
    base_shape = tf.cast(tf.shape(image)[:-1], tf.float32)
    steps = tf.constant(steps)
    step_size = tf.constant(step_size)

    # Run the image through the model at several scales (octaves)
    for factor in octave_power_factors:
        image = scale_image(image, base_shape, octave_scale, factor)
        image, _ = forward_pass(dreamer_model, image, steps, step_size)
        image = convert_tensor_to_nparray(image)

    # Resize back to the original shape and convert to uint8
    base_shape = tf.cast(base_shape, tf.int32)
    image = tf.image.resize(image, base_shape)
    image = tf.image.convert_image_dtype(image / 255.0, dtype=tf.uint8)
    return np.array(image)

Now we can apply this DeepDream model to an image; you can pick any one you like.

!curl https://miro.medium.com/max/1750/1*E-S7Y80jIFuZ03xyc89fnA.jpeg -o image.jpeg
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 68250  100 68250    0     0   812k      0 --:--:-- --:--:-- --:--:--  812k
def load_image(path):
    image = load_img(path)
    image = img_to_array(image)
    return image

def show_image(image):
    plt.imshow(image)
    plt.show()

original_image = load_image('image.jpeg')
show_image(original_image / 255.0)

First, let's try the image with all the default parameters and activation layers.

model = create_model()
output_image = dream(model, original_image)
show_image(output_image)
WARNING:tensorflow:5 out of the last 5 calls to <function forward_pass at 0x7f095a587a70> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings could be due to (1) creating @tf.function repeatedly in a loop, (2) passing tensors with different shapes, (3) passing Python objects instead of tensors. For (1), please define your @tf.function outside of the loop. For (2), @tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. For (3), please refer to https://www.tensorflow.org/guide/function#controlling_retracing and https://www.tensorflow.org/api_docs/python/tf/function for  more details.
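This retracing warning is expected here: each octave passes an image with a different shape, so tf.function traces a new graph for every call. If you want to avoid it, one option (following the warning's own suggestion; note that newer TF releases rename this flag to reduce_retracing) is to relax the shape constraints when decorating forward_pass:

# Allow differently shaped octaves to reuse the same traced graph
@tf.function(experimental_relax_shapes=True)
def forward_pass(model, image, steps, step_size):
    ...  # body unchanged from the definition above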

Next, let's try different activation layers.

Note: the first layers tend to learn basic patterns (e.g. lines and simple shapes), while layers closer to the output learn more complex patterns, as they combine the earlier basic ones.
model = create_model(['mixed2', 'mixed5', 'mixed7'])
output_image = dream(model, original_image)
show_image(output_image)
WARNING:tensorflow:6 out of the last 6 calls to <function forward_pass at 0x7f095a587a70> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings could be due to (1) creating @tf.function repeatedly in a loop, (2) passing tensors with different shapes, (3) passing Python objects instead of tensors. For (1), please define your @tf.function outside of the loop. For (2), @tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. For (3), please refer to https://www.tensorflow.org/guide/function#controlling_retracing and https://www.tensorflow.org/api_docs/python/tf/function for  more details.

The image comes out softer as a result of adding more layers.

Finally, let's try some custom octave power factors.

model = create_model()
output_image = dream(model, original_image, octave_power_factors=[-3, -1, 0, 3])
show_image(output_image)

The resulting image seems to have less noise and more heterogeneous patterns, a mixture of both high- and low-level patterns, as well as a better color distribution.

As an exercise, try different parameters and you will see that the results vary widely (see the example after this list):

  • Play with different step_size values; a large value will add a lot of noise to the original image.
  • Use higher layers to obtain pictures with less noise and more nuanced patterns.
  • Use more octaves, which will result in more images passed to the model at different scales.
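For example, here is one such experiment with arbitrary values, combining higher layers with a larger step size:

# Arbitrary illustrative settings: higher layers plus a larger step size
model = create_model(['mixed7', 'mixed9'])
output_image = dream(model, original_image, steps=50, step_size=0.05)
show_image(output_image)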