Friday, January 12, 2024

Petals to the Metal - Flower Classification on TPU

What are TPUs?

    TPUs are powerful hardware accelerators specialized for deep learning tasks. They were developed (and first used) by Google to process large image databases, for example extracting all the text from Street View imagery. Google started using TPUs internally in 2015 and made them publicly available in 2018.

    TPUs are custom-built processing units designed to work with a specific framework: TensorFlow, an open-source machine learning platform with state-of-the-art tools, libraries, and a large community, which lets users quickly build and deploy ML applications.


TPU version 3.0

    The difference between a CPU, a GPU, and a TPU is that the CPU is a general-purpose processor that handles all of the computer's logic, calculations, and input/output. The GPU is an additional processor used to enhance the graphical interface and run demanding parallel tasks. TPUs are powerful custom-built processors designed to run projects built on a specific framework, i.e. TensorFlow.

    Different types of processors are suited to different types of machine learning models: TPUs are well suited to CNNs, GPUs have benefits for some fully connected neural networks, and CPUs can have advantages for RNNs.




How does a TPU work?

    Google designed Cloud TPUs as a matrix processor specialized for neural network workloads. TPUs can't run word processors, control rocket engines, or execute bank transactions, but they can handle massive matrix operations used in neural networks at fast speeds.
    The TPU host streams data into an infeed queue. The TPU loads the data from the infeed queue and stores it in HBM (high-bandwidth memory). When the computation is completed, the TPU loads the results into the outfeed queue; the host then reads the results from the outfeed queue and stores them in its own memory.
As a result, TPUs can achieve high computational throughput on neural network calculations.
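The host-to-TPU data path described above can be pictured as a pair of queues. Below is a toy simulation in plain Python; the queue names and the stand-in "computation" (doubling each value) are purely illustrative, not the real TPU runtime:

```python
from queue import Queue

# Toy model of the host <-> TPU data path: the host streams batches into
# an infeed queue, the device computes, and results come back via an
# outfeed queue into host memory.
infeed, outfeed = Queue(), Queue()

def host_stream(batches):
    """Host side: push batches into the infeed queue."""
    for batch in batches:
        infeed.put(batch)
    infeed.put(None)  # sentinel: no more data

def device_loop(compute):
    """Device side: load from infeed, compute, store to outfeed."""
    while True:
        batch = infeed.get()
        if batch is None:
            outfeed.put(None)
            break
        outfeed.put(compute(batch))

def host_read():
    """Host side: read results from the outfeed into host memory."""
    results = []
    while True:
        r = outfeed.get()
        if r is None:
            return results
        results.append(r)

host_stream([[1, 2], [3, 4]])
device_loop(lambda batch: [x * 2 for x in batch])  # stand-in computation
print(host_read())  # -> [[2, 4], [6, 8]]
```

Because the host only touches the queues, it can keep streaming new data while the device works, which is where the high throughput comes from.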
    The primary task for TPUs is matrix processing, which is a combination of multiply and accumulate operations. TPUs contain thousands of multiply-accumulators that are directly connected to each other to form a large physical matrix. This is called a systolic array architecture.
    To perform the matrix operations, the TPU loads the parameters from HBM memory into the Matrix Multiplication Unit (MXU). Then, the TPU loads data from HBM memory. As each multiplication is executed, the result is passed to the next multiply-accumulator. The output is the summation of all multiplication results between the data and parameters. No memory access is required during the matrix multiplication process.
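The multiply-and-accumulate pattern the MXU performs in hardware is just the inner loop of an ordinary matrix product. A minimal pure-Python sketch of that operation (the numbers in the example are invented for illustration):

```python
def matmul(params, data):
    """Multiply-accumulate: the core operation the MXU performs in hardware.
    Each output element is the sum of element-wise products between a row of
    `params` (weights loaded from HBM) and a column of `data`."""
    rows, inner, cols = len(params), len(params[0]), len(data[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0  # one multiply-accumulator cell
            for k in range(inner):
                acc += params[i][k] * data[k][j]  # multiply, then accumulate
            out[i][j] = acc
    return out

# A 2x2 example: weights times activations.
print(matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # -> [[19, 22], [43, 50]]
```

On a TPU, the thousands of multiply-accumulators in the systolic array compute these sums in parallel, passing partial results directly to their neighbours instead of looping in software.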

About the dataset

  In this competition, 104 types of flowers are classified based on their images drawn from five different public datasets. Some classes are very narrow, containing only a particular sub-type of flower (e.g. pink primroses) while other classes contain many sub-types (e.g. wild roses). The dataset contains imperfections - images of flowers in odd places, or as a backdrop to modern machinery.
    This competition provides its files in TFRecord format. The TFRecord format is a container format frequently used in TensorFlow to group and shard data files for optimal training performance. Each file contains the id and img (the actual pixels in array form) information for many images.
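As a sketch of how such files are typically read with the tf.data API (assuming the feature names id and img from the description above, that the images are JPEG-encoded bytes as is common in TFRecord datasets, and an illustrative train/*.tfrec path; the exact schema should be checked against the competition's files):

```python
import tensorflow as tf

# Feature schema assumed from the description above: each example stores
# an id and the image bytes. Verify against the actual competition files.
feature_description = {
    "id": tf.io.FixedLenFeature([], tf.string),
    "img": tf.io.FixedLenFeature([], tf.string),
}

def parse_example(serialized):
    """Decode one serialized example into (id, image tensor)."""
    example = tf.io.parse_single_example(serialized, feature_description)
    image = tf.io.decode_jpeg(example["img"], channels=3)  # assumes JPEG bytes
    return example["id"], image

# Stream the sharded files and decode in parallel for fast training input.
dataset = (
    tf.data.TFRecordDataset(tf.io.gfile.glob("train/*.tfrec"))
    .map(parse_example, num_parallel_calls=tf.data.AUTOTUNE)
    .batch(32)
    .prefetch(tf.data.AUTOTUNE)
)
```

Sharding the data across many files is what lets the TPU's infeed stay busy: several files can be read and decoded concurrently.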

Implementing the solution

    It’s difficult to fathom just how vast and diverse our natural world is. There are over 5,000 species of mammals, 10,000 species of birds, 30,000 species of fish – and astonishingly, over 400,000 different types of flowers. In this competition, we’re challenged to build a machine learning model that identifies the type of flowers in a dataset of images.
    We will then train the model and make predictions to reveal patterns in the kinds of images our model has trouble with. Before making our final predictions on the test set, we will evaluate the model's predictions on the validation set. This will help us diagnose problems in training or suggest ways our model can be improved.
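On Kaggle, the usual first step is to detect the TPU and create a distribution strategy, then build the model inside its scope. A minimal sketch (the architecture here is a placeholder to show the structure, not the final solution; only the 104-class output comes from the competition):

```python
import tensorflow as tf

# Detect a TPU if one is attached; otherwise fall back to the default
# (CPU/GPU) strategy so the same notebook runs anywhere.
try:
    tpu = tf.distribute.cluster_resolver.TPUClusterResolver.connect()
    strategy = tf.distribute.TPUStrategy(tpu)
except (ValueError, tf.errors.NotFoundError):
    strategy = tf.distribute.get_strategy()

print("Replicas:", strategy.num_replicas_in_sync)

# Variables created inside the scope are placed on the accelerator.
# The layers below are placeholders, not the final architecture.
with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(192, 192, 3)),
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(104, activation="softmax"),  # 104 flower classes
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
```

With a TPU attached, num_replicas_in_sync reports the number of TPU cores, and each training batch is split across them automatically.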








