Project Overview:
This is my first personal machine learning project: training a ProGAN to synthesize images of succulents!
Why I started this project
I had just finished the course COMP9444, which introduced GANs in the final few weeks of the term. I found the topic very interesting, but because there was little time left, the course only covered the theory. So during the term break, I decided to explore this area with a project. At the time, I was OBSESSED with succulents, so I thought: why not use a GAN to generate images of succulents and gain some experience with a practical machine learning project along the way?
Project Phases:
This project has three phases:
- Gathering the training data
- Training the model
- Inference from the model
Gathering the training data
At first, I tried to find an existing dataset of succulent images, but I couldn't find one that satisfied my requirements:
- The image resolution needs to be consistent and larger than 1024x1024
- All the images of succulents need to have a similar shape (a Cactus and an Echeveria have very different shapes)
- All the images need to have a similar structure (top-down view, succulent at the center)
- There should be a large number of images (>5000)
As a result, I decided to collect the data myself, and this part of the project took up the majority of my time. There were two potential sources: shopping websites and social media. I chose Taobao, where I buy my plants. Collecting from a shopping page was easier because the site's recommendation algorithm is highly likely to show 10-20 images of similar plants on the same page. To get the same result on social media platforms such as Instagram or RED, I would have had to search manually for the name of each variety.
I didn't write a scraper for this task since I was concerned that scraping Taobao might be illegal. Instead, I used a Chrome extension that downloads all the images from all open tabs at once and filters them by a threshold (larger than 800x800 resolution). If I were to redo this project, I would probably write a scraper, because collecting data manually is really time-consuming.
Cleaning the data
Using this method, I collected about 12000 images, then wrote a Python script to crop each image to 512x512 at the center to remove the noisy background. I then manually filtered out images that did not satisfy my requirements, reducing the training set to about 9000 images. In hindsight, there are certainly better ways to do this, such as using an unsupervised learning model to group similar images, but that may be unnecessary for a small project like this.
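The original cropping script is not shown here, so the following is only a minimal sketch of what a 512x512 center crop could look like. It assumes Pillow is installed and that the images are JPEG files; the function names and folder arguments are hypothetical.

```python
from pathlib import Path


def center_crop_box(width, height, size=512):
    """Return the (left, upper, right, lower) box of a size x size crop
    centered in a width x height image, or None if the image is too small."""
    if width < size or height < size:
        return None
    left = (width - size) // 2
    upper = (height - size) // 2
    return (left, upper, left + size, upper + size)


def crop_folder(src_dir, dst_dir, size=512):
    """Center-crop every .jpg in src_dir and save the result to dst_dir.

    Assumption: Pillow is available (pip install Pillow). It is imported
    here so the box arithmetic above can be used without it.
    """
    from PIL import Image

    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    saved = 0
    for path in sorted(Path(src_dir).glob("*.jpg")):
        with Image.open(path) as img:
            box = center_crop_box(img.width, img.height, size)
            if box is None:
                continue  # skip images smaller than the crop size
            img.convert("RGB").crop(box).save(dst / path.name)
            saved += 1
    return saved
```

Calling `crop_folder("raw", "cropped")` (both folder names are placeholders) would write one cropped copy per input image and return how many were saved. Images that pass the 800x800 download filter are always large enough for a 512x512 crop, so the size check only matters for other sources.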
Problem with the dataset
One reason I think this project did not work as intended is that the images are still too noisy compared to the CelebA dataset used by ProGAN.
- The background is often made up of highly textured material such as pebble ground cover or peat moss (in CelebA, the backgrounds are mostly smooth, textureless objects like walls or banners)
- Even though I limited the succulents to Echeveria and carefully filtered out plants with more than one head, they can still vary a lot in shape and color, making them much more diverse than the headshots in the CelebA dataset.
Training the model
I followed the implementation and training setup from this video, then I trained the model in Colab.
It would have been better if I had written all of the code myself, but that was not really the primary goal of this project.
I used the free K80 GPU to train the model and stopped at 512x512 resolution, as it took about 2 hours to train one epoch. I didn't count the total training time because there was a GPU quota of 4-6 hours per day, so I had to save the model often and resume training in the background whenever I had time.
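The save-and-resume routine described above can be sketched roughly as follows. This is not the code I actually ran: it is a dependency-free illustration in which only the epoch counter is checkpointed, and `run_session`, `train_one_epoch`, and the checkpoint path are hypothetical names. In a real PyTorch run, the model and optimizer weights would be saved alongside the counter (for example with `torch.save`).

```python
import json
import os
import time


def load_state(path):
    """Return the saved training state, or a fresh one if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"epoch": 0}


def save_state(state, path):
    """Write to a temporary file first so an interrupted save
    cannot corrupt the existing checkpoint."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def run_session(path, total_epochs, train_one_epoch,
                budget_hours, epoch_hours, clock=time.monotonic):
    """Resume from the checkpoint and train until total_epochs is reached
    or another full epoch would no longer fit in this session's quota."""
    state = load_state(path)
    start = clock()
    while state["epoch"] < total_epochs:
        elapsed_hours = (clock() - start) / 3600
        if elapsed_hours + epoch_hours > budget_hours:
            break  # not enough quota left for another full epoch
        train_one_epoch(state["epoch"])  # hypothetical training callback
        state["epoch"] += 1
        save_state(state, path)  # checkpoint after every epoch
    return state
```

With roughly 2 hours per epoch and a 4-6 hour daily quota, each session in this sketch fits two or three epochs before stopping, and the next session picks up from the saved epoch.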
To be honest, I don't think the results are very good. The background is blurry and you can't really see the details of the plants, only their general structure, unlike the results from ProGAN.
There are several improvements I could make to get better results:
- Train the model for longer and with a better GPU, as the loss was still dropping and the image quality was still changing during training
- Switch to a different model, or adjust ProGAN to better fit the succulent dataset
- Collect more high-quality data, as I only have ~9000 images while ProGAN used ~30000. I also think I need to remove some of the images, as they might introduce noise to the model
Inference from the model
Use the generator to generate images of succulents
Takeaways:
The project helped me learn more about what's involved in a machine learning project. In retrospect, if I could change anything, I would do more research on how to prepare the dataset, and maybe automate the collection process to gather more high-quality data.