I’ve been playing around with some AI stuff, trying to train some models using pytorch
(more on this later, stay tuned!). I pretty quickly ran into a road block where my poor lil 2017 MacBook Pro couldn’t take the heat, and I needed to figure out how to set up a virtual machine to run my models on. Here are the steps I took to set up a remote GPU compute engine using the Google Cloud Platform and to connect to it with VS Code.
Steps
1. Setting up cloud account
There are several different cloud providers that have remote development environment options. I chose Google just because I already had a Google Cloud Platform account. First step would be to create that account, and then:
- Create a new project in Google Cloud Platform (GCP)
- Go to “API and Services” → “Enable APIs and Services”
- Search for “Compute Engine API” and enable it
2. Creating a virutal machine instance
- In GCP, go to Compute Engine → VM Instances
- Click “Create Instance” and configure with these specifications:
- Name: Choose something descriptive (mine is called “vm-old-french-ai”)
- Region: Choose one close to your location whcih has this type of capacity available
- Machine Configuration:
- Click “GPUs”
- Select series: N1
- GPU: Add 1 NVIDIA T4
- Machine type: n1-standard-4 (4 vCPU, 15GB memory)
- Boot disk: Ubuntu 20.04 LTS (100GB)
I had to do some trial and error creating a VM. Some errors I got were:
- Region availability: Not all types of VMs are available in all regions. Choose a region that’s as close to you as possible but still has the type of VM you’re choosing
- Region capacity errors: Some regions have the type of VM available but don’t have capacity at that particular moment. You can try again later or again just try choosing a different region
- GPU quota error: After resolving region errors, I then got an error “The GPUS-ALL-REGIONS-per-project quota maximum has been exceeded.” Click “Request Quota” in the notification, then “Edit Quota”. Request a quota of 1 GPU. Give some details about the project, and then request. This request is reviewed by the Google team. In my case, approval came within minutes.
3. SSH key setup
After the VM is set up, you next have to set up the connection to the VM using SSH:
- Generate an SSH key by opening your terminal on your local machine and running:
ssh-keygen -t rsa -b 4096 -C "your-email@example.com"
Press Enter to accept the default file location, and optionally set a passphrase.
- Copy your public key:
cat ~/.ssh/id_rsa.pub
- Add the key to your VM:
- Stop the VM instance
- Click on the VM name
- Click “Edit”
- Scroll down to “SSH Keys”
- Click “Add Item”
- Paste your public key
Note the username at the end of your SSH key (format: YOUR_USERNAME@computer
) - you’ll need this later.
4. VS Code remote setup
We’ll use VS Code as the development environment to connect to the new VM with:
- In VS Code, install Microsoft’s “Remote Development” extension pack
- Open VS Code command palette (Ctrl/Cmd + Shift + P)
- Type “Remote-SSH: Open Configuration File”
- Select the first configuration file option
- Add this configuration (replace the placeholders):
Host YOUR_VM_NAME
HostName YOUR_EXTERNAL_IP
User YOUR_USERNAME
IdentityFile ~/.ssh/id_rsa
YOUR_VM_NAME
: Whatever you want to call this connection. I used the name of the VM I created in GoogleYOUR_EXTERNAL_IP
: Find this in your Google Cloud VM instances listYOUR_USERNAME
: The username from your SSH key
- Open VS Code command palette (Ctrl/Cmd + Shift + P) again and search for “Remote-SSH: Connect to Host”
- Select the configuration you just created
- If it asks you anything, say “continue” or “yes”
If it works without error, you’re in!
If it fails to connect, first try connecting via terminal:
ssh -v YOUR_USERNAME@YOUR_EXTERNAL_IP -i ~/.ssh/id_rsa
Accept the authenticity prompt by typing “yes”. You should see a prompt like this if it worked:
YOUR_USERNAME@vm-name:~$
You can also verify the connection by running this in the remote terminal:
pwd # Print working directory
whoami # Should show your username
hostname # Should show your VM name
If this terminal connection works but VS Code doesn’t, close VS Code completely and try again.
5. Python environment setup
After connecting to the VM, now we have to set up the python development environment so that we can actually run stuff.
- Install Miniconda:
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
chmod +x Miniconda3-latest-Linux-x86_64.sh
./Miniconda3-latest-Linux-x86_64.sh
Follow the prompts and say ‘yes’ to initialization.
- If you have an existing project with an
environment.yml
file:
conda env create -f environment.yml
conda activate YOUR_ENV_NAME
6. Git setup
My code is hosted on GitHub so I also have to set up git to work with this code on the VM:
- Install git:
sudo apt update
sudo apt install git -y
- Configure git:
git config --global user.name "Your Name"
git config --global user.email "your.email@example.com"
- For repository access, create a fine-grained Personal Access Token (PAT) on GitHub:
- Go to GitHub Settings → Developer Settings → Personal Access Tokens → Fine-grained tokens
- Set Repository Access to “Only select repositories”
- Choose your specific repository
- Under Permissions → Repository permissions, set:
- Contents: “Read and write”
- Commit statuses: “Read and write”
- Metadata: “Read-only”
- Pull requests: “Read and write”
- Workflows: “Read and write”
- Clone your repository:
git clone YOUR_REPO_URL
Use your GitHub username and PAT when prompted for credentials.
7. NVIDIA and CUDA setup
Next we have to install some stuff that lets you use the GPUs. Don’t ask me what a GPU is or what all this does, but it worked:
- Verify that we’re on Ubuntu:
lsb_release -a
- Install NVIDIA drivers and CUDA:
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
sudo apt install -y ubuntu-drivers-common
sudo ubuntu-drivers devices # Shows available drivers
sudo apt install -y nvidia-driver-560 #Choose the driver from the above command that is listed as recommended
sudo apt install -y nvidia-cuda-toolkit
- Reset your VM by going to GCP list of VM instances and clicking “…” -> “Reset”. Verify installation of drivers worked with:
nvidia-smi
nvcc --version
8. PyTorch GPU setup
Finally we have to check that our installation of pytorch is able to access the GPU:
- Verify GPU access:
-c "import torch; print('PyTorch version:', torch.__version__); print('CUDA available:', torch.cuda.is_available()); print('CUDA version PyTorch was built with:', torch.version.cuda); print('GPU device:', torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'None')" python
- Test GPU functionality:
-c "import torch; x = torch.randn(1000, 1000).cuda(); y = torch.randn(1000, 1000).cuda(); z = torch.matmul(x, y); print('Test completed on GPU')" python
If you get any failures after running this test code, you may need to fiddle with some versions of packages you’re installing in your environment.yml
file. Best advice I can give is to ask ChatGPT/Claude/whatever. This is something that worked for me at one point–uninstalling some package versions and installing other versions with a CUDA toolkit:
conda remove pytorch torchvision torchaudio # Remove CPU version if present
conda install pytorch torchvision torchaudio cudatoolkit=10.1 -c pytorch
If you see a version mismatch between CUDA and PyTorch, it should probably be okay as long as these tests run successfully.
Conclusion
After all of that, your remote environment should be ready to go to run some AI models at non-sluggish speeds. Remember to turn your compute off when you aren’t using it or prepare for 💸💸💸