Prerequisites
Thanks to generous support from Google, each of you will receive $50 of Google Cloud credits to use for this course. You will receive an email from the instructor by the end of the second week with instructions on how to request a Google Cloud Platform coupon. Once you have redeemed the coupon, you are ready to create your VM.
Note that if you haven’t tried Google Cloud before, you may get a credit for signing up. In that case, you can use your sign-up credit first before your course coupon.
You will also need the following software on your laptop:
- A web browser.
- A command-line “shell” for your laptop (which is different from the VM shell). For MacOS, it’s simply the Terminal application. If you use Windows, we recommend that you install and use Git BASH.
- The VNC Viewer, if you want to access the Google VM through a remote desktop rather than a shell.
Creating a VM
- Head to the Google Cloud Platform Console, authenticate yourself, and create a “project” if you haven’t already.
- Click on the top-left corner menu button, and select “Compute Engine” (or go there directly). You may have to first enable this API by clicking ‘enable’. Go to “VM instances” and click “Create an instance.”
- For machine type, e2-small will be enough for this course, but feel free to explore more expensive options should the need arise for your course project.
- For boot disk, select Ubuntu 20.04 LTS Minimal, and a minimum of 20GB standard persistent disk.
- For firewall, you can allow HTTP/HTTPS traffic, which may be useful for your course project.
The other options can be left at default for now.
- Once the VM has been created, you will see it listed as one of your “VM instances”. When it is running, you will see an “SSH” option near the end of the line. Click on that to connect, and you will be inside a shell of your VM. Follow the instructions for Help/Readying VM for the Course.
- You now need to reboot your VM. Exit from your VM shell (using the exit command), go back to the Compute Engine Console, select your running VM, stop it, and then start it using the buttons at the top of the page.
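For those who prefer the command line, the console choices above can be approximated with the gcloud CLI. This is a minimal sketch, not part of the course instructions: the VM name and zone are hypothetical placeholders, and the image family ubuntu-minimal-2004-lts in project ubuntu-os-cloud is assumed to correspond to the "Ubuntu 20.04 LTS Minimal" console option. The script only prints the command so that you can review it before running it yourself.

```shell
#!/usr/bin/env bash
# Hypothetical placeholders: pick your own VM name and zone.
VM_NAME="dbcourse-vm"
ZONE="us-east1-b"

# Build the command as an array so each flag stays a separate argument.
create_cmd=(
  gcloud compute instances create "$VM_NAME"
  --zone="$ZONE"
  --machine-type=e2-small
  --image-family=ubuntu-minimal-2004-lts   # assumed match for "Ubuntu 20.04 LTS Minimal"
  --image-project=ubuntu-os-cloud
  --boot-disk-size=20GB
  --boot-disk-type=pd-standard
  --tags=http-server,https-server          # tags the console uses for HTTP/HTTPS traffic
)

# Dry run: print the command for review instead of executing it.
echo "${create_cmd[*]}"
```

Note that the tags by themselves may not create the matching firewall rules; the console checkboxes do that for you, so double-check the firewall settings if you create the VM this way.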
Managing VMs
BILLING:
To redeem your credits with the coupon code, use this form after creating your account. You will have to enter a credit card when signing up, but we will not add this to any billing options so you will not be charged unless you have other projects external to this class.
After creating the project, ensure that you aren’t being billed personally for it: go into ‘Billing’ in the left-hand navigation bar, then navigate to “Manage Billing Accounts” > “My Projects” and double-check that your project shows ‘Billing Account for Education’. If it doesn’t, you can change billing under the “Actions” drop-down menu on the right-hand side of the table.
You can stop, start, and SSH into your VM all conveniently from the Compute Engine Console. Make a habit of stopping your VM when not in use, because a running VM costs you credits!
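If you have the gcloud CLI installed, stopping and starting can also be done from your laptop's shell. The helper below is a sketch under that assumption; vm_power, the VM name, and the zone are hypothetical names used for illustration. It prints the command rather than running it.

```shell
#!/usr/bin/env bash
# vm_power ACTION NAME ZONE
# Prints the gcloud command that stops or starts a VM (dry run only).
vm_power() {
  local action="$1" name="$2" zone="$3"
  case "$action" in
    stop|start)
      echo "gcloud compute instances $action $name --zone=$zone"
      ;;
    *)
      echo "usage: vm_power stop|start NAME ZONE" >&2
      return 1
      ;;
  esac
}

# Example with placeholder values.
vm_power stop dbcourse-vm us-east1-b
```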
OTHER:
Note that you can click on the “SSH” button multiple times to get multiple shells for multitasking.
When your VM is running, you will see an external IP address assigned to it, which you can use to connect to your VM in other ways (using your own SSH client, or accessing a website running on your VM, for example).
You can also delete your VM, but you will lose everything in your VM. Do NOT do this unless you really want to get rid of the VM and all its data (e.g., when you finish this course).
Setting up an SSH Key Pair to Access VMs
First, to generate a key pair, get a shell on your laptop, and issue the following command:
ssh-keygen -t rsa
It will ask for a location and a passphrase. Accept the default location, and take a note of it. You can use an empty passphrase (assuming you have exclusive access to your own laptop).
After you generate the keys, issue the following command to show your public key (replace /path/to below with the location you noted in the above step):
cat /path/to/id_rsa.pub
The output should start with ssh-rsa, followed by a long sequence of random characters, and then a user@machine string (where user here is likely your user name in your laptop’s operating system).
Then, point your browser to the Compute Engine Console and click on “Metadata” and then the “SSH keys” section. You should see some default SSH keys here already with the same user name (likely your Google user name); take a note of this name—let’s call it vm_user_name; this is your user name on the VM. Now, edit the list of SSH keys to add your public key, as follows:
- Copy and paste your public key, i.e., the output of the cat command above, into the text box.
- Edit the user in the last part of your public key to be vm_user_name, i.e., consistent with other keys that already exist. The user name that appears to the left of the text box will be automatically updated to reflect the change.
- Leave the @machine part of your public key as is.
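If you would rather not edit the key by hand in the text box, the user substitution can be done in your laptop's shell before pasting. This is a small sketch: set_key_user is a hypothetical helper, the key body in the example is a fake placeholder, and the function assumes the public key is a single line with three space-separated fields ending in user@machine.

```shell
#!/usr/bin/env bash
# set_key_user VM_USER_NAME
# Reads one public-key line on stdin and replaces the user part of the
# trailing user@machine comment, leaving the @machine part as is.
set_key_user() {
  awk -v user="$1" '{
    split($3, parts, "@")            # parts[1] = user, parts[2] = machine
    print $1, $2, user "@" parts[2]
  }'
}

# Example with a fake key body (not a real key).
echo "ssh-rsa AAAAB3NzaEXAMPLEKEY alice@laptop" | set_key_user vm_user_name
```

With a real key you could run set_key_user vm_user_name < /path/to/id_rsa.pub and paste the result.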
Once the key is accepted, and assuming your Google VM is up and running, you can test by accessing your VM from the shell on your laptop as follows (vm_user_name refers to your user name on the VM as noted in the above step; replace vm_external_addr below by the external IP address assigned to your Google VM, which you can find in your Compute Engine Console):
ssh vm_user_name@vm_external_addr
If everything was set up correctly, you should be inside your VM shell. Typing exit will disconnect you from the VM and put you back into the shell on your laptop.
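To avoid retyping the user name and address, you could optionally add an entry to ~/.ssh/config on your laptop. The sketch below prints such an entry; ssh_config_entry and the alias dbvm are hypothetical names, and the key path assumes you accepted the default location in ssh-keygen. Keep in mind that the external IP address may change when the VM is stopped and started, in which case the entry needs updating.

```shell
#!/usr/bin/env bash
# ssh_config_entry ALIAS VM_USER_NAME VM_EXTERNAL_ADDR
# Prints a stanza suitable for appending to ~/.ssh/config.
ssh_config_entry() {
  local alias="$1" user="$2" addr="$3"
  printf 'Host %s\n' "$alias"
  printf '    HostName %s\n' "$addr"
  printf '    User %s\n' "$user"
  printf '    IdentityFile ~/.ssh/id_rsa\n'   # assumes the default key location
}

# Example with the placeholders used in this guide.
ssh_config_entry dbvm vm_user_name vm_external_addr
```

After appending the printed stanza to ~/.ssh/config, ssh dbvm should behave like ssh vm_user_name@vm_external_addr.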
Setting up GUI Access to VM
GUI access to your VM gives you a familiar desktop interface. Before proceeding, make sure you have already set up an SSH key pair to access your VM and tested it as instructed above. Then, you will need to install software both on your laptop and on your VM:
- On your laptop, make sure you have the VNC Viewer installed.
- Get a VM shell and issue the following command on your VM:
/opt/dbcourse/install/install-gui.sh
This command will take some time. It will install a lightweight desktop as well as remote desktop software called VNC. Then, use the following command to set your remote desktop password, and remember it:
vncpasswd
(There is no need for a second view-only password.) Once it’s done, exit out of your VM shell, and then reboot the VM.
Every time you want GUI access to your VM, follow the steps below:
- Open a shell on your laptop, and issue the following command to connect to your VM:
ssh -L 5901:localhost:5901 vm_user_name@vm_external_addr
- Once connected, type the following command to start the remote desktop server:
tightvncserver :1 -geometry 1600x900 -depth 24
You can replace 1600x900 with other standard resolutions (a higher resolution will look better, but may make your remote desktop less responsive on a slow network connection). The command will print out a bunch of information (including where to find the log file in case something goes wrong) and exit. If all goes well, the remote desktop server will be up and running in the background.
- Now, run the VNC Viewer app. To connect, use address localhost:5901, and the remote desktop password you set using vncpasswd earlier. The app will warn you about an unencrypted connection, but you can safely ignore it because we are tunneling the connection through a secure SSH connection (that was what the extra -L ... was for in your ssh command).
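The port 5901 in these steps follows from the VNC convention that display :N listens on TCP port 5900 + N, so display :1 maps to 5901. The sketch below makes that relationship explicit and prints the matching tunnel command; vnc_port and tunnel_cmd are hypothetical helper names, not part of the course tooling.

```shell
#!/usr/bin/env bash
# vnc_port DISPLAY
# VNC display :N listens on TCP port 5900 + N.
vnc_port() {
  echo $((5900 + $1))
}

# tunnel_cmd DISPLAY VM_USER_NAME VM_EXTERNAL_ADDR
# Prints the ssh command that forwards the local VNC port to the VM.
tunnel_cmd() {
  local port
  port="$(vnc_port "$1")"
  echo "ssh -L ${port}:localhost:${port} $2@$3"
}

# Display :1, as used in this guide.
tunnel_cmd 1 vm_user_name vm_external_addr
```

If you ever start the server on a different display (say :2), both the tunnel and the address you give the viewer would need to use the corresponding port.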
For help on how to use the desktop and what applications are useful, please refer to Linux Basics.
To stop the GUI access, simply disconnect the VNC Viewer, and go back to the shell where you started the remote desktop server and issue the following command:
tightvncserver -kill :1
Then you can exit the VM shell and then the shell on your laptop.
File Access
When working on your VM, remember that it is a machine separate from your own computer, with its own file system and disk. Before proceeding to gain file access, make sure you have already set up an SSH key pair to access your VM and tested it as instructed above.
To transfer individual files, the easiest way is to open a shell on your laptop, and use the scp command. To download a file from your remote VM:
scp vm_user_name@vm_external_addr:path/to/remote/file path/to/local/file
You may need to familiarize yourself with files and paths in Linux (see Linux Basics for more help). For example, scp vm_user_name@vm_external_addr:hw1/1-query.txt ~/hw1/1-query.txt downloads the file 1-query.txt located under the hw1 folder in your home directory on the remote VM, and stores it as 1-query.txt under the hw1 folder in your home directory on your laptop.
To upload a file to your remote VM:
scp path/to/local/file vm_user_name@vm_external_addr:path/to/remote/file
You can also use scp -r to copy entire folders recursively with just a single command, e.g.:
scp -r vm_user_name@vm_external_addr:path/to/remote_dir/ path/to/local_parent_dir/
will download the remote_dir underneath local_parent_dir/. Things get trickier, e.g., if you already have another directory of the same name as remote_dir/ under local_parent_dir/, so use this command with care.
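One way to exercise that care is to check for a name clash before copying. The following is a minimal sketch, assuming you want the download refused whenever a same-named entry already exists under the local parent directory; safe_download_dir is a hypothetical helper, and it prints the scp command instead of running it.

```shell
#!/usr/bin/env bash
# Placeholder login string; substitute your own user name and address.
VM="vm_user_name@vm_external_addr"

# safe_download_dir REMOTE_DIR LOCAL_PARENT_DIR
# Refuses if LOCAL_PARENT_DIR already contains an entry named like
# REMOTE_DIR; otherwise prints the scp -r command for review.
safe_download_dir() {
  local remote_dir="${1%/}" local_parent="${2%/}"   # drop one trailing slash
  local target
  target="$local_parent/$(basename "$remote_dir")"
  if [ -e "$target" ]; then
    echo "refusing: $target already exists" >&2
    return 1
  fi
  echo "scp -r $VM:$remote_dir/ $local_parent/"
}

# Example with the placeholder paths from above.
safe_download_dir path/to/remote_dir/ path/to/local_parent_dir/
```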
Alternatively, if you prefer a simpler user interface, we recommend the following:
- For Mac, you can simply use your Finder, by following these instructions; your private key file is likely ~/.ssh/id_rsa in this case.
- For Windows, install and use WinSCP, by following these instructions; your private key file is likely named id_rsa in the .ssh folder under what Git BASH considers your home directory (likely C:\Users\your_windows_username).
As another (and more powerful) alternative, adventurous/experienced users can consider installing FUSE/SSHFS for MacOS or SSHFS-Win for Windows. These tools will allow you to “mount” the remote files as if they reside directly on your laptop.
Network Access
If you followed the above instructions to create your VM, you should be able to log in remotely via ssh and access any website hosted on your VM on its port 80. Access to other network ports on your VM, however, is generally blocked by default. If you want to open up access to a particular port (say 1234), follow these instructions:
- Go to the Compute Engine Console and click on your VM’s name. You should see a list of information about your VM. Click on the “default” link under “Network.”
- Under the “Network details” page, find the section on “Firewall rules,” and click on “Add firewall rule.”
- Under the “Create a firewall rule” page:
- Under “Name,” enter a short, meaningful name to help you remember what this rule is for.
- Under “Source IP ranges,” enter 0.0.0.0/0.
- Under “Allowed protocols and ports,” enter tcp:1234 (replace 1234 with the particular port number you want access to).
- For “Target,” select “All instances in the network.”
- Other settings can be left at their default values.
- Click “Create.”
- You may need to stop and then restart your VM in order for the changes to take effect.
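The same rule can be sketched with the gcloud CLI, assuming it is installed and your VM is on the default network. firewall_cmd and the rule name are hypothetical; the function validates the port and prints the command rather than running it. Omitting target tags is assumed here to correspond to the "All instances in the network" choice.

```shell
#!/usr/bin/env bash
# firewall_cmd RULE_NAME PORT
# Prints a gcloud command that opens TCP PORT to all source addresses
# on the default network (dry run only).
firewall_cmd() {
  local name="$1" port="$2"
  case "$port" in
    ''|*[!0-9]*)
      echo "port must be a number" >&2
      return 1
      ;;
  esac
  if [ "$port" -lt 1 ] || [ "$port" -gt 65535 ]; then
    echo "port must be between 1 and 65535" >&2
    return 1
  fi
  echo "gcloud compute firewall-rules create $name --network=default --direction=INGRESS --allow=tcp:$port --source-ranges=0.0.0.0/0"
}

# Example for port 1234 with a placeholder rule name.
firewall_cmd allow-tcp-1234 1234
```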