All about Google Colaboratory you want to explore

8 minute read

URL: https://colab.research.google.com/

Colab H/W specifications:

Parameter Google Colab
GPU Nvidia K80/T4
GPU Memory 12GB/16GB
GPU Memory clock 0.82GHz/ 1.59GHz
Performance 4.1 TFlops
No. CPU cores 2
Available RAM 12GB
Max. Execution time 12 Hours
Max. idle time 90 min

Advantages of using Colab?

  1. Free GPU !!! (One of the main reason I believe for thriving the DL community is Open-source and Access to GPU’s*

  2. Sharing: We can share the notebooks easily with others having a Gmail account. We can also share the notebook URL for public access.

  3. Github: we can clone the git repos with a single command and save the notebooks with hassle-free UI.

  4. File downloads: In ML / DL, Datasets are crucial and, we know the file sizes are more for vision-related datasets. With moderate internet connectivity need ages to download but, with the colab, we can download in minutes (you heard right)

  5. Code snippets: Colab has tiny code snippets, You can explore these on your own

  6. Default pre-loaded python packages: Colab notebook is pre-installed with the libraries (up to date) for DL/ML

A small caution

  1. If we close the tab or lost the internet connection for more than 90 minutes, all the work will loose and, a new VM instance will be assigned.

  2. Colab will remove the running VM after 12 hours, no matter how much time remained to complete the training.

Colab UI

UI

Know allocated specs

GPU VM instance specs

Google colab has two different GPU models, namely K80 and T4. However, we cannot select which of them you want due to the availability issues.

!nvidia-smi

Know the CPU and RAM info

!cat /proc/cpuinfo
!cat /proc/meminfo

Setting up the libraries

1. command to check the installed packages

!pip list 

2. Package installation

Colab comes with the pre-installed packages, if you feel the need of other packages you’re free to install

#Option-1: Using pip command

!pip install opencv-contrib-python

#Option-2: using apt-get command

!apt-get -qq install -y graphviz && pip install -q pydot

3. Command to check the specific package version

import keras

keras.__version__

Upload / Download the Files/datasets

  1. Upload files from the local directory To upload data from the local machine Run the below cell (To run the cell use, (shift+enter). Click on Choose Files, local machine file directory window pops up, and then choose the file (image/video/document/ .etc) your need to upload.
from google.colab import files

files.upload()

2. Download from the Google drive

Data can be downloaded from our own Gdrive or we can also download drive files shared by others (public link)

Let’s get started

  1. Access the data from our Gdrive
from google.colab import drive

drive.mount('/content/gdrive')
  1. Accessing Gdrive files shared in public

Example:

  1. You’ll fing the drive URL something similar shared below

    https://drive.google.com/open?id=0B_URf9ZWjSC11Xzc4R2d0N2c

  2. Then copy the alpha numeric characters after ?id =
!gdown --id 0B_URf9ZWjAW7SC11Xzc4R2d0N2c

3. Downloading Files from the website

To download the files from the website then we use !wget command

!wget http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz

Zip/Unzip, untar the folders

1. Zip the folder

Syntax !zip (zip file name.zip) (source folder to zip)

!zip data.zip /content/sample_data 

2. Unzip the folder

syntax:

!unzip (path to the zip folder) -d (destination folder)

!unzip data.zip -d /content

3. Extract tar files

!tar -xvf  '/content/train-images-idx3-ubyte'

Copy / Move files

# To list files within the directory 
# !ls with the folder path

!ls /content/content/
#To copy files !cp <source file> <destination file>
!cp /content/data.zip /content/content
# command to move files
!mv '/content/data.zip' '/content/content/sample_data'

Delete a file/folder

To delete a file, !rm command is used

!rm '/content/Trolling Euclid.pdf'

To delete a folder, !rm -rf command is used

!rm -rf /content/check

Downloading datasets from Kaggle API

  1. Go to your account, Scroll to API section and Click Expire API Token to remove previous tokens

  2. Click on Create New API Token - It will download kaggle.json file on your machine.

  3. Go to your Google Colab project file and run the following commands:

! pip install -q kaggle
#upload the kaggle.json file that you downloaded by following the above steps
from google.colab import files

files.upload()
! mkdir ~/.kaggle
#Make directory named kaggle and copy kaggle.json file there.
! cp kaggle.json ~/.kaggle/
#change the permission of the file
! chmod 600 ~/.kaggle/kaggle.json
! kaggle datasets list
! kaggle datasets download -d emmarex/plantdisease
!unzip /content/plantdisease.zip

Cloning Git repo

#To clone the git repos we use !git clone repo

!git clone https://github.com/tensorflow/models.git

Running .py files

!py /content/loader.py

Changing working directories

#to know present working directory
!pwd
#Commands to find sub directories
!ls /content/sample_data

Running Tensorboard within colab

%load_ext tensorboard
%tensorboard --logdir logs

Switching between tensorflow versions

Colab is pre loaded tensorflow 2.x.

If your code works with tensorflow 1.x then we can choose 1.x version using the below command

%tensorflow_version 1.x

Playing Video inline

from IPython.display import HTML
from base64 import b64encode
mp4 = open('/content/snake_bot.mp4','rb').read()
data_url = "data:video/mp4;base64," + b64encode(mp4).decode()
HTML("""
<video width=400 controls>
      <source src="%s" type="video/mp4">
</video>
""" % data_url)

Thanks for going through the post

If I have missed mentioning any commands, please do let me know in the comments.

Updated: