All About Google Colab - 🐦#

Working with Data#

For this notebook, we will be focusing on using Google Colab [Goo].

A Temporary Solution#

You can create and interact with files when you access Google Colab, BUT they will disappear after your session ends. An example would be:

f = open("thanosSnapping.txt", "a")
f.write("you should've gone for the head")
f.close()

Warning

Super serious here. If files are not written to your local machine or your Google Drive, they will be gone the next time you log in.

We need a better solution for this. So, let’s take a look at these options.
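Before turning to Google Drive, one quick way to get a file onto your local machine is the `files` helper that ships in the `google.colab` package. The sketch below assumes it runs inside Colab; the function name `write_and_download` is only for illustration, and outside Colab it simply skips the download step.

```python
from pathlib import Path


def write_and_download(filename: str, text: str) -> bool:
    """Write text to a file, then offer it as a browser download in Colab.

    Returns True if a download was requested, False when not running in Colab.
    """
    Path(filename).write_text(text)
    try:
        # google.colab only exists inside a Colab runtime.
        from google.colab import files
    except ImportError:
        return False
    files.download(filename)  # asks your browser to download the file
    return True


downloaded = write_and_download("thanosSnapping.txt", "you should've gone for the head")
print("Download requested:", downloaded)
```

This only saves one file at a time, so it suits small outputs better than whole projects.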

Connecting to Your Google Drive#

There are two ways to connect to your Google Drive.

Selecting the Icon to Connect#

Icon for Connecting to Google Colab

Connecting Using Google Colab Code#

from google.colab import drive
drive.mount('/content/drive')

This will look something like this:

Connecting to Google Colab with Code

I am Connected, Now What?#

It can be a bit difficult to figure out where to place files and where your files are located in Google Colab. Once you connect, you will need to navigate the folder structure to find your files, and Colab does not make them very easy to find. Here is a quick screenshot.

file structure in Google Colab
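Once mounted with `drive.mount('/content/drive')`, the top level of your Drive usually shows up under `/content/drive/MyDrive`. The sketch below assumes that location; `save_note` and the `workshop` folder name are hypothetical, and the function falls back to a temporary folder when Drive is not mounted so it still runs elsewhere.

```python
import tempfile
from pathlib import Path

# Assumed location of "My Drive" after drive.mount('/content/drive').
DRIVE_ROOT = Path("/content/drive/MyDrive")


def save_note(text: str, filename: str, folder: str = "workshop") -> Path:
    """Save text under Drive if it is mounted, else under a temporary folder."""
    base = DRIVE_ROOT if DRIVE_ROOT.is_dir() else Path(tempfile.mkdtemp())
    target_dir = base / folder
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / filename
    target.write_text(text)
    return target


saved = save_note("you should've gone for the head", "thanosSnapping.txt")
print("Saved to:", saved)
```

Printing the returned path is a handy way to confirm where the file actually landed.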

Connecting to GitLab#

Another way to work with data outside of your Google Drive is using something like GitHub or, in our case, GitLab. A bit of a warning before we begin.

Warning

Public repositories on GitHub are open. Any file placed in a public repository is available for ANYONE to view and download, and anyone can copy it and edit their own copy.

Make sure you read the terms of service as well!

For accessing GitLab through Penn State, here is the link (https://git.psu.edu/). You do need to set up a username/password to start creating.

If you are interested in using Git, let me show you the basics.

Creating a Connection (SSH)#

The first step in this process is to create a connection between Google Colab and GitLab, and this starts with saying to Google Colab: You can connect to this resource because I gave you access. The way we do this is to create an SSH Key.

Note

Code will run in Google Colab but has been commented out to work on the website.

!ssh-keygen -t ed25519 -C "SSH key for google colab"
#!ssh-keygen -t ed25519 -C "SSH key for google colab"

To see this file, we will need to read the file using a UNIX command called 😸 (cat or concatenate).

!cat /root/.ssh/id_ed25519.pub
#!cat /root/.ssh/id_ed25519.pub

We now need to tell SSH that the PSU GitLab is fine to connect to (it is a known host).

!ssh-keyscan git.psu.edu >> /root/.ssh/known_hosts
#!ssh-keyscan git.psu.edu >> /root/.ssh/known_hosts
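To double-check that the previous step worked, you can look for the host in the `known_hosts` file. This is a small sketch, assuming the entries are stored unhashed (the default for `ssh-keyscan` without `-H`); `host_is_known` is a hypothetical helper name, and the sample key below is a placeholder, not a real key.

```python
def host_is_known(known_hosts_text: str, host: str) -> bool:
    """Return True if any unhashed known_hosts entry lists the given host."""
    for line in known_hosts_text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and hashed entries (which start with "|").
        if not line or line.startswith("#") or line.startswith("|"):
            continue
        fields = line.split()
        # Entries may begin with a marker such as @cert-authority.
        if fields[0].startswith("@") and len(fields) > 1:
            names = fields[1]
        else:
            names = fields[0]
        if host in names.split(","):
            return True
    return False


# Illustrative entry only; the key material is a placeholder.
sample = "git.psu.edu ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIplaceholder\n"
print(host_is_known(sample, "git.psu.edu"))  # True
```

In Colab, you could pass in the text of `/root/.ssh/known_hosts` instead of the sample string.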

Now in GitLab, we need to add this SSH key to our account.

GitLab adding SSH

Finally, let’s test this connection.

!ssh -T git@git.psu.edu
#!ssh -T git@git.psu.edu

You should receive a message that looks like this:

Warning: Permanently added the ECDSA host key for IP address ‘...’ to the list of known hosts. Welcome to GitLab, @you!

The Push and Pull of Git#

Git, basically, is a push and pull mechanism. You pull your project down from GitLab, make changes, and then push it back to GitLab. Here is an example of doing this with your file from before. First, we clone the repository.

!git clone git@git.psu.edu:pmd19/workshop_temp.git
#!git clone git@git.psu.edu:pmd19/workshop_temp.git

We then pull the current content from GitLab. Note that cd stands for Change Directory, and pwd stands for Print Working Directory.

%cd /content/workshop_temp
!pwd
!git pull origin master
#%cd /content/workshop_temp
#!pwd
#!git pull origin master
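The same two moves can be done from plain Python. Note that `%cd` is used above rather than `!cd` because a `!` command runs in its own temporary shell, so a directory change made there does not stick. This sketch uses a temporary folder as a stand-in for `/content/workshop_temp`, so the folder shown is an assumption rather than the cloned repository.

```python
import os
import tempfile

# Stand-in for /content/workshop_temp so the example runs anywhere.
project_dir = tempfile.mkdtemp()

start_dir = os.getcwd()    # like `pwd`: where are we now?
os.chdir(project_dir)      # like `%cd`: move into the project folder
current_dir = os.getcwd()
print("Now working in:", current_dir)

os.chdir(start_dir)        # go back to where we started
```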

We write our file to our directory and add it to the list of things we want to update on GitLab.

!pwd
f = open("thanosSnapping.txt", "w")
f.write("you should've gone for the head")
f.close()
!git add "thanosSnapping.txt"
#!pwd
#f = open("thanosSnapping.txt", "w")
#f.write("you should've gone for the head")
#f.close()
#!git add "thanosSnapping.txt"

Git wants to make sure it knows who is updating what, so it needs a person and an email address to associate the changes. We then commit to these changes, and finally, we push our changes to GitLab.

!git config --global user.email 'pmd19@psu.edu'
!git config --global user.name 'Patrick Dudas'
!git commit -m "testing"
!git push git@git.psu.edu:pmd19/workshop_temp.git
#!git config --global user.email 'pmd19@psu.edu'
#!git config --global user.name 'Patrick Dudas'
#!git commit -m "testing"
#!git push git@git.psu.edu:pmd19/workshop_temp.git
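To recap the whole round trip, here is a small sketch that only builds the Git commands in order; it does not run anything. The helper name `git_round_trip` is hypothetical, the `master` branch matches the pull used earlier, and `origin` is the name Git gives by default to the address you cloned from.

```python
from typing import List


def git_round_trip(filename: str, message: str, branch: str = "master") -> List[List[str]]:
    """Return the Git commands for a pull, add, commit, push cycle, in order."""
    return [
        ["git", "pull", "origin", branch],   # get the latest content
        ["git", "add", filename],            # stage the new or changed file
        ["git", "commit", "-m", message],    # record the change locally
        ["git", "push", "origin", branch],   # send the commit to GitLab
    ]


for command in git_round_trip("thanosSnapping.txt", "testing"):
    print(" ".join(command))
```

Inside the cloned folder, each list could be handed to `subprocess.run(command, check=True)` if you prefer Python over `!` commands.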

Things to be Aware of in Google Colab#

Warning

I am writing this on 2/2/2022 (lots of 2s there). So, if this is the ⌛ future ⌛, some of these options may have changed.

Keyboard Shortcuts#

Shown below is how you find keyboard shortcuts for any operating system.

Keyboard Shortcuts in Google Colab

Note

By far, the keyboard shortcut that I use the most is Ctrl+Enter (Windows) or ⌘+Enter (Mac). This command will run the content of the current cell.

Downloading or Reusing a Notebook in Another Environment#

Sometimes you want to download your notebook either as a notebook (.ipynb) or a Python (.py) file. Here is how.

How to Download as .py
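Under the hood, a `.ipynb` file is JSON, with each cell stored in a `cells` list. As a rough sketch of turning one into a script yourself, the helper below (a hypothetical name, `notebook_to_script`) keeps only the code cells and comments out lines that start with `!` or `%`, since those only work in a notebook. That commenting rule is my own assumption; Colab's export may format things differently.

```python
import json


def notebook_to_script(notebook_json: str) -> str:
    """Join the code cells of a notebook (nbformat 4 JSON) into one script."""
    notebook = json.loads(notebook_json)
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # "source" may be a single string or a list of line strings.
        text = source if isinstance(source, str) else "".join(source)
        lines = []
        for line in text.splitlines():
            # Shell (!) and magic (%) lines are notebook-only, so comment them out.
            if line.lstrip().startswith(("!", "%")):
                line = "# " + line
            lines.append(line)
        chunks.append("\n".join(lines))
    return "\n\n".join(chunks) + "\n"


# A tiny made-up notebook to try it on.
demo = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 0,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {"cell_type": "code", "metadata": {}, "source": ["!pwd\n", "x = 1 + 1\n"]},
        {"cell_type": "code", "metadata": {}, "source": "print(x)"},
    ],
})
print(notebook_to_script(demo))
```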

Renaming Your Notebook#

How to rename your notebook

The figure above shows how you rename your notebook. To be honest, I do not care for how this is implemented, because…

Warning

When you rename your notebook, do not delete the .ipynb. Doing so will remove the file extension. DO BETTER, GOOGLE COLAB!

Sharing a Notebook#

Like other Google Workspace applications, you can also share notebooks. Just select the Share button in the top-right.

How to share your notebook

Warning

Another warning?! Unlike Google Docs, you cannot have multiple people work on the same notebook. In fact, sharing could lead to people overwriting each other’s work. DO BETTER, GOOGLE COLAB!