GCP Developer Certification Preparation Guide

Professional Developer Certification

I recently passed the Google Professional Developer Certification. During the preparation I went through a lot of resources about the exam. I also used this book, which is a good read and covers most of the exam topics. It is a very good starting point for the preparation if you have little knowledge of Google Cloud services.

Keep in mind that Google updates its services very often, so any source of information other than the official documentation may become outdated.

The exam is at roughly the same difficulty level as the Data Engineer certification exam.

The exam focuses on the following areas:

I could not find a comprehensive resource that covers all aspects of the exam when I started preparing. I had to go over a lot of Google Cloud product pages and general Machine Learning resources, and at no point did I feel ready, as both topics are huge. Here I will try to provide a summary of the resources I found helpful for passing the exam.

Storage

You need to know the different storage classes (see link) and which one fits your workload, so that you can save costs without sacrificing performance by storing data across different storage classes.

The following table summarizes the different storage classes and how they compare to each other.

| Class | Storage Cost | Access Cost | Access Frequency | Description |
| --- | --- | --- | --- | --- |
| Standard | High | Low | Accessed frequently | Hot or frequently accessed data: websites, streaming videos, and mobile apps. |
| Nearline | Low | High | About once a month | Data stored for at least 30 days, including data backup and long-tail multimedia content. |
| Coldline | Very low | Very high | About once a quarter | Data stored for at least 90 days, including disaster recovery. |
| Archive | Lowest | Highest | Less than once a year | Data stored for at least 365 days, including regulatory archives. |
| Multi-Regional Storage | High | High | Accessed frequently | Equivalent to Standard Storage, except it can only be used for objects stored in multi-regions or dual-regions. |
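
To make the minimum storage durations in the table more concrete, here is a minimal sketch that returns the coldest class you could use without deleting data before its minimum duration is over. The function name `coldest_class_for` and the idea of choosing by retention time alone are assumptions for illustration. A real decision should also weigh how often the data is read, since colder classes charge more for access.

```python
# Minimum storage durations in days, taken from the table above.
MIN_STORAGE_DAYS = {
    "STANDARD": 0,
    "NEARLINE": 30,
    "COLDLINE": 90,
    "ARCHIVE": 365,
}


def coldest_class_for(retention_days: int) -> str:
    """Return the coldest class whose minimum storage duration is covered.

    Hypothetical helper: it only looks at how long the data will be kept,
    not at how often it will be read.
    """
    if retention_days < 0:
        raise ValueError("retention_days must be non-negative")
    eligible = [
        name for name, days in MIN_STORAGE_DAYS.items() if days <= retention_days
    ]
    # The class with the longest minimum duration is the coldest eligible one.
    return max(eligible, key=MIN_STORAGE_DAYS.__getitem__)


print(coldest_class_for(120))
```

For data kept 120 days, the helper picks Coldline, because Archive would require keeping the data for at least 365 days.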

Other important topics related to Cloud Storage:

For general best practices related to Google Storage check this link.

Databases

You need to know the different databases offered in GCP and which one to use for a given use case.

SQL

Cloud SQL provides hosted relational databases (PostgreSQL, MySQL, SQL Server), while Cloud Spanner provides a multi-region SQL database. You need to know:

NoSQL

GCP offers a variety of NoSQL databases. You need to know the difference between those services and when to use each one.

Bigtable

Bigtable is a hosted NoSQL database and an alternative to Cassandra and HBase. It stores rows sorted by a single row key, which makes it suitable for low-latency access and time-series data (e.g. financial market data).
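
Bigtable keeps rows sorted lexicographically by row key, which is one reason it suits time-series data: a well-chosen key turns "all ticks for one symbol in a time window" into a contiguous range scan. The sketch below imitates that idea with a plain sorted list, not the Bigtable client. The key layout `symbol#timestamp` and the helper names are assumptions for illustration.

```python
import bisect


def make_row_key(symbol: str, epoch_seconds: int) -> str:
    """Build a hypothetical row key; zero-padding keeps string order == time order."""
    return f"{symbol}#{epoch_seconds:010d}"


def range_scan(sorted_keys, start_key, end_key):
    """Return keys in [start_key, end_key), like a range read over sorted rows."""
    lo = bisect.bisect_left(sorted_keys, start_key)
    hi = bisect.bisect_left(sorted_keys, end_key)
    return sorted_keys[lo:hi]


# An in-memory stand-in for a table holding ticks of two symbols.
keys = sorted(
    make_row_key(symbol, ts)
    for symbol in ("AAPL", "GOOG")
    for ts in (1700000000, 1700000060, 1700000120)
)

goog_window = range_scan(
    keys,
    make_row_key("GOOG", 1700000000),
    make_row_key("GOOG", 1700000120),
)
print(goog_window)
```

Because the symbol comes first in the key, all rows of one symbol sit next to each other and the scan never touches rows of other symbols.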

Firestore

Firestore is a fully managed, scalable, and serverless document database for developing rich applications.

Firestore in Native mode is the next generation of Datastore. It is recommended for storing user-session information and is a natural choice for this test.

Memorystore

Memorystore is an in-memory database suitable as cache for fast data access - link.
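
A common way to use such a cache is the cache-aside pattern: look in the cache first, fall back to the database on a miss, then store the result with an expiry. The sketch below is an assumption-laden illustration: `TTLCache` is a tiny in-process stand-in for a Redis-style cache (it is not the Memorystore or Redis client), and `get_user` and `load_from_db` are hypothetical names.

```python
import time


class TTLCache:
    """In-process stand-in for a Redis-style cache with per-key expiry."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        item = self._data.get(key)
        if item is None:
            return None
        value, expires_at = item
        if time.monotonic() >= expires_at:
            # Entry expired: drop it and report a miss.
            del self._data[key]
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._data[key] = (value, time.monotonic() + ttl_seconds)


def get_user(user_id, cache, load_from_db, ttl_seconds=60.0):
    """Cache-aside read: try the cache, fall back to the database on a miss."""
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = load_from_db(user_id)
    cache.set(key, value, ttl_seconds)
    return value
```

With a real Memorystore instance the `get` and `set` calls would go over the network to Redis or Memcached, but the control flow stays the same.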

Data warehouse

BigQuery is a hosted, serverless data warehouse. It has limited update/delete capabilities for inserted rows but is very performant for analytic workloads.

Syntax

You need to know basic SQL syntax to use BigQuery, for instance the different types of JOIN operations - link

| Join | Description | Example |
| --- | --- | --- |
| [INNER] JOIN | An INNER JOIN, or simply JOIN, effectively calculates the Cartesian product of the two from_items and discards all rows that do not meet the join condition. | FROM A INNER JOIN B ON A.w = B.y |
| CROSS JOIN | Returns the Cartesian product of the two from_items. In other words, it combines each row from the first from_item with each row from the second from_item. | FROM A CROSS JOIN B |
| FULL [OUTER] JOIN | A FULL OUTER JOIN (or simply FULL JOIN) returns all fields for all rows in both from_items that meet the join condition. | FROM A FULL OUTER JOIN B ON A.w = B.y |
| LEFT [OUTER] JOIN | A LEFT OUTER JOIN (or simply LEFT JOIN) for two from_items always retains all rows of the left from_item in the JOIN operation, even if no rows in the right from_item satisfy the join predicate. | FROM A LEFT OUTER JOIN B ON A.w = B.y |
| RIGHT [OUTER] JOIN | A RIGHT OUTER JOIN (or simply RIGHT JOIN) is similar and symmetric to that of LEFT OUTER JOIN. | FROM A RIGHT OUTER JOIN B ON A.w = B.y |
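
To see the difference between some of these joins on real rows, here is a small sketch that runs them with Python's built-in `sqlite3` module. SQLite is used only as a stand-in that runs anywhere; it is not BigQuery, and the sample rows in tables `A` and `B` are made up. RIGHT and FULL joins are left out because older SQLite versions do not support them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE A (w INTEGER, x TEXT);
    CREATE TABLE B (y INTEGER, z TEXT);
    INSERT INTO A VALUES (1, 'a1'), (2, 'a2'), (3, 'a3');
    INSERT INTO B VALUES (2, 'b2'), (3, 'b3'), (4, 'b4');
    """
)


def run(sql):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()


# Only rows that satisfy the join condition survive.
inner = run("SELECT A.w, B.y FROM A INNER JOIN B ON A.w = B.y ORDER BY A.w")

# Every row of A is kept; unmatched rows get NULL (None) for B's columns.
left = run("SELECT A.w, B.y FROM A LEFT OUTER JOIN B ON A.w = B.y ORDER BY A.w")

# Every row of A combined with every row of B.
cross = run("SELECT A.w, B.y FROM A CROSS JOIN B")

print(inner)
print(left)
print(len(cross))
```

The row with `w = 1` has no match in `B`, so it disappears from the inner join but shows up in the left join with `None` in place of `B.y`.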

Compute

GCP offers many ways to run application logic, from Compute Engine, which offers a lot of freedom and control, to App Engine or Cloud Functions, which offer less flexibility but take care of operational complexity.


App Engine

App Engine is one of the earliest services in GCP. It lets you build monolithic applications or websites in a range of development languages and takes care of scaling them for you.

Compute Engine

Compute Engine is the Infrastructure as a Service (IaaS) offering in GCP. It is a hosted service that lets you create and run virtual machines on Google infrastructure.

Cloud Functions

Cloud Functions is the functions as a service (FaaS) offering on GCP. It lets you run code without having to manage servers or containers. It is best suited for event-driven services, and it scales the number of function instances to handle load increases.
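
As an illustration of the event-driven style, the sketch below shows what a Python background function reacting to a Cloud Storage upload might look like. It assumes the `(event, context)` signature of background functions and the `bucket` and `name` fields of a Cloud Storage event. The function name `on_file_uploaded` is hypothetical, and the return value is there only to make the sketch easy to try locally.

```python
def on_file_uploaded(event, context):
    """Handle a Cloud Storage event describing a newly uploaded object.

    `event` is a dict with the object's metadata; `context` carries event
    metadata and is unused in this sketch.
    """
    bucket = event["bucket"]
    name = event["name"]
    message = f"Processing gs://{bucket}/{name}"
    # Anything printed ends up in the function's logs.
    print(message)
    return message


# Local try-out with a hand-written event; no GCP resources are involved.
on_file_uploaded({"bucket": "my-bucket", "name": "report.csv"}, None)
```

Because the function is plain Python, it can be unit-tested by passing a dictionary, as in the last line.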

Cloud Run

Cloud Run is a serverless service that lets you run containers on GCP without having to manage any infrastructure.

Kubernetes Engine

Google Kubernetes Engine (GKE) is the hosted Kubernetes service offering on GCP.

Auto-scaling

Auto-scaling in GKE is an important topic
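
One of the mechanisms behind it is the Kubernetes Horizontal Pod Autoscaler, which scales the number of replicas in proportion to how far the observed metric is from its target. The sketch below reproduces that core calculation only. The function name and the default bounds are assumptions, and details of the real controller such as its tolerance band and stabilization windows are left out.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,
    max_replicas: int = 10,
) -> int:
    """Core Horizontal Pod Autoscaler rule, clamped to the configured bounds.

    desired = ceil(current_replicas * current_metric / target_metric)
    """
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# 4 pods at 90% average CPU with a 60% target.
print(desired_replicas(4, 90, 60))
```

With 4 pods running at 90% CPU against a 60% target, the rule asks for 6 pods; at 30% CPU it would scale down to 2.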

Anthos

Networking

Cloud Interconnect

Cloud Interconnect extends your on-premises network to Google’s network through a highly available, low latency connection. You can use Dedicated Interconnect to connect directly to Google or use Partner Interconnect to connect to Google through a supported service provider.

| Solution | Capacity | Description | Connectivity |
| --- | --- | --- | --- |
| Dedicated Interconnect | 10-Gbps or 100-Gbps circuits with flexible VLAN attachment capacities from 50 Mbps to 50 Gbps. | A direct connection to Google; you must meet Google’s network in a colocation facility. | Not through the public internet. |
| Partner Interconnect | Flexible capacities from 50 Mbps to 50 Gbps. | Connectivity through one of Google’s supported service providers. | Not through the public internet. |

DevOps

Container Registry

Container Registry is a hosted service for securely storing and managing Docker container images.

Cloud Build

Cloud Build is a hosted Continuous Integration service that lets you continuously build, test, and deploy applications.

Good to know

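There is a persistent file system (the /workspace directory) that is shared between steps in a Cloud Build. This makes it possible to test a deployed Cloud Function and still clean it up when the test fails, by ordering the build steps as follows: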

  1. Deploy the Cloud Function.
  2. Save the results of calling the Cloud Function to a file.
  3. Delete the Cloud Function.
  4. Test the content of the file. Since step 2 can now never fail, step 3 is executed and step 4 defines the outcome of the build as a whole.
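
The four steps can be imitated locally to see why the cleanup always runs. The sketch below is not a Cloud Build configuration: it models each step in plain Python, uses a temporary directory as a stand-in for the shared file system, and the names `run_build` and `call_function` are hypothetical.

```python
import tempfile
from pathlib import Path


def run_build(call_function):
    """Run the four steps in order against one shared workspace directory."""
    log = []
    with tempfile.TemporaryDirectory() as workspace:
        result_file = Path(workspace) / "result.txt"

        # Step 1: deploy the function (only recorded here).
        log.append("deploy")

        # Step 2: save the result to a file; errors are captured, not raised,
        # so this step can never fail the build.
        try:
            result_file.write_text(str(call_function()))
        except Exception as exc:
            result_file.write_text(f"ERROR: {exc}")

        # Step 3: delete the function; always reached because step 2 cannot fail.
        log.append("delete")

        # Step 4: the file content decides the outcome of the whole build.
        passed = result_file.read_text() == "ok"
    return passed, log


def broken_function():
    raise RuntimeError("function returned HTTP 500")


print(run_build(lambda: "ok"))
print(run_build(broken_function))
```

In both runs the log contains `delete`, so the function is removed even when the call fails, and only the final check marks the build as failed.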

Logging

Audit

Know the different types of events that Logging agents can capture:

| Log Type | Description | Documentation |
| --- | --- | --- |
| Admin activity | Shows destroy, create, modify, etc. events for a VM instance. | documentation |
| Data access | Shows read activities. | documentation |
| Syslog | A service running in systemd that outputs to stdout will have logs in syslog and will be scraped by the logging agent. | documentation |
| System event | Tells you about live migration, etc. | documentation |
| VPC flow logs | Uses the substrate-specific logging to capture everything. | documentation and CloudAcademy course |

Export

Logging retains app and audit logs for a limited period of time. You might need to retain logs for longer periods to meet compliance obligations. Alternatively, you might want to keep logs for historical analysis.

You can route logs to Cloud Storage, BigQuery, and Pub/Sub. Using filters, you can include or exclude resources from the export. For example, you can export all Compute Engine logs but exclude high-volume logs from Cloud Load Balancing.
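
The way inclusion and exclusion work together can be sketched in a few lines: an entry is exported when it matches the inclusion filter and matches none of the exclusion filters. The code below is a plain-Python model of that rule, not the Cloud Logging API. The helper `make_sink`, the set of resource types treated as Compute-related, and the sample entries are all assumptions for illustration.

```python
def make_sink(inclusion, exclusions=()):
    """Return a predicate telling whether a log entry would be exported."""

    def routes(entry):
        return inclusion(entry) and not any(ex(entry) for ex in exclusions)

    return routes


# Hypothetical grouping of resource types for this example.
COMPUTE_RELATED = {"gce_instance", "gce_disk", "http_load_balancer"}


def is_compute(entry):
    return entry["resource_type"] in COMPUTE_RELATED


def is_load_balancer(entry):
    # Roughly what a filter such as resource.type="http_load_balancer" expresses.
    return entry["resource_type"] == "http_load_balancer"


sink = make_sink(is_compute, exclusions=[is_load_balancer])

entries = [
    {"resource_type": "gce_instance", "message": "VM started"},
    {"resource_type": "http_load_balancer", "message": "GET /healthz 200"},
    {"resource_type": "cloud_function", "message": "Function invoked"},
]
exported = [e["message"] for e in entries if sink(e)]
print(exported)
```

Only the VM entry is exported: the load balancer entry is dropped by the exclusion, and the Cloud Functions entry never matched the inclusion filter.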

Monitoring

It is very important to put a monitoring strategy in place before pushing an application live to production. GCP offers a suite of tools to help with monitoring, such as Cloud Trace, Cloud Profiler and Cloud Debugger.

It is also good to know about alternative open-source services that can be used:

The following video provides a good summary of the different services offered on GCP for monitoring applications:

Cloud Trace

The following picture depicts how traces are visualized in Cloud Trace:

Cloud Profiler

Cloud Profiler is a statistical, low-overhead profiler that continuously gathers CPU usage and memory-allocation information from your production applications. It attributes that information to the application’s source code, helping you identify the parts of the application consuming the most resources, and otherwise illuminating the performance characteristics of the code.

The following picture depicts how profiles are visualized against application stack traces:

Cloud Debugger

Cloud Debugger is a hosted service that makes debugging live applications very easy. The service seems to be deprecated, but it’s still possible to see a question on it during the exam.

Deployments

Know the different application deployment strategies and how to use tools like Spinnaker for Continuous Deployment.

Some useful resources on deployments:

Security

Security is a broad topic that covers every aspect of your Cloud. Each of the previous services has its own built-in security features. Some areas to know:

Resource Manager

Google Cloud provides container resources such as organizations and projects that allow you to group and hierarchically organize other Google Cloud resources. This hierarchical organization helps you manage common aspects of your resources, such as access control and configuration settings. The Resource Manager API enables you to programmatically manage these container resources. Check the documentation to learn more about this service - link.

Permission

Cloud Endpoints

Endpoints is an API management system that helps you secure, monitor, analyze, and set quotas on your APIs using the same infrastructure Google uses for its own APIs.

You have three options, depending on where your API is hosted and the type of communications protocol your API uses:

| Option | Limitation |
| --- | --- |
| OpenAPI | |
| gRPC | Not supported on App Engine or Cloud Functions |
| Endpoints Frameworks | Supported only on App Engine standard Python 2.7 and Java 8 |

Other

Certification SWAG

After passing the exam, you can choose one of the official certification swags:

developer-certification-swags

That’s all folks

Check the following preparation tips for passing other Google certifications:

Feel free to leave a comment or reach out on Twitter @bachiirc