Google Pub/Sub Source Integration

GlassFlow provides a managed connector for Google Pub/Sub.

Google Pub/Sub is a messaging service for sending and receiving messages between independent applications. It is commonly used to stream events and data between systems in real time. By using Google Pub/Sub as a source, you can ingest messages from a Pub/Sub subscription directly into your GlassFlow pipelines for further processing.

Prerequisites

Before you begin, ensure you have the following:

  1. A Google Cloud account and access to the Google Pub/Sub service.
  2. A Google Pub/Sub topic and subscription that will be used as the data source.
  3. A GlassFlow account and access to the GlassFlow WebApp or Python SDK for pipeline creation.
  4. Google Cloud credentials (service account key) for authentication.

Step 1: Setting Up Google Pub/Sub

Follow Google's step-by-step guidance to set up a Google Cloud console project and perform basic Pub/Sub tasks in the console. Then collect the following values, which you will need in Step 2:

  1. Project ID: You can find your Project ID in the Google Cloud Console under the "Project Info" section on the dashboard.
  2. Pub/Sub Subscription ID:
    • Go to the Google Cloud Console.
    • Navigate to the Pub/Sub section.
    • Select your topic and find the subscription ID under the "Subscriptions" tab.
  3. Service Account Key:
    • Go to the Service Accounts page in the Google Cloud Console.
    • Select your project.
    • Click Create Service Account.
    • Fill in the required details and click Create.
    • Assign the Pub/Sub Subscriber role to the service account.
    • Click Done and then Manage Keys for the service account you created.
    • Click Add Key > Create New Key.
    • Choose JSON and click Create. A JSON file will be downloaded to your computer.
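
Before pasting the values into GlassFlow, it can help to sanity-check what you collected. The following is a minimal sketch using only the Python standard library. It assumes the downloaded key follows the usual service account key layout (with type, project_id, private_key, and client_email fields). The helper names check_service_account_key and subscription_path are illustrative and are not part of any GlassFlow or Google library.

```python
import json

# Fields a Google Cloud service account key file is expected to contain.
REQUIRED_KEY_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(key_text: str) -> dict:
    """Parse key file content and verify it looks like a service account key."""
    key = json.loads(key_text)
    missing = [field for field in REQUIRED_KEY_FIELDS if field not in key]
    if missing:
        raise ValueError(f"key file is missing fields: {missing}")
    if key["type"] != "service_account":
        raise ValueError(f"unexpected credential type: {key['type']!r}")
    return key


def subscription_path(project_id: str, subscription_id: str) -> str:
    """Build the fully qualified Pub/Sub subscription resource name."""
    return f"projects/{project_id}/subscriptions/{subscription_id}"


if __name__ == "__main__":
    # Placeholder values for illustration only; use your own key file content.
    sample_key = json.dumps({
        "type": "service_account",
        "project_id": "my-project",
        "private_key": "<redacted>",
        "client_email": "reader@my-project.iam.gserviceaccount.com",
    })
    key = check_service_account_key(sample_key)
    print(subscription_path(key["project_id"], "my-subscription"))
```

The fully qualified name is how Pub/Sub identifies a subscription internally. The GlassFlow form in Step 2 asks for the project ID and the subscription ID as separate values.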

Step 2: Integrating Google Pub/Sub as a Source in GlassFlow

You can integrate Google Pub/Sub with GlassFlow either through the WebApp or by using the Python SDK. Below are the instructions for both methods.

Using WebApp

  1. Log in to the GlassFlow WebApp and navigate to the "Pipelines" section.
  2. Create a new pipeline.
  3. Go to the Source section on the pipeline setup page and select Google Pub/Sub.
  4. Enter the following details:
    • Project ID: Your Google Cloud project ID.
    • Subscription ID: The ID of the Pub/Sub subscription to pull messages from.
    • Credentials JSON: Copy and paste the content of the credentials JSON file you downloaded in Step 1.
  5. Click Continue to save your source settings and configure other parts of your pipeline.

Once created, this source will be available for use in your GlassFlow pipelines. Data will be ingested in real time as it arrives in the Pub/Sub subscription.
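
The three source fields can also be gathered and checked in code before they are entered. This is a rough sketch, not GlassFlow's API: the function name build_pubsub_source_settings and the dictionary keys project_id, subscription_id, and credentials_json are hypothetical names chosen to mirror the form labels above. It also assumes the Subscription ID field expects the short ID, so a full resource name is trimmed down to its last segment.

```python
import json


def build_pubsub_source_settings(
    project_id: str, subscription_id: str, credentials_text: str
) -> dict:
    """Collect the three source fields into one dictionary (hypothetical keys)."""
    project_id = project_id.strip()
    subscription_id = subscription_id.strip()
    if not project_id or not subscription_id:
        raise ValueError("project ID and subscription ID must not be empty")
    # A full resource name ends with the short ID; keep only that part.
    if subscription_id.startswith("projects/"):
        subscription_id = subscription_id.rsplit("/", 1)[-1]
    # json.loads raises ValueError if the pasted credentials are not valid JSON.
    json.loads(credentials_text)
    return {
        "project_id": project_id,
        "subscription_id": subscription_id,
        "credentials_json": credentials_text.strip(),
    }
```

Catching a malformed key or an empty field here is cheaper than finding out after the pipeline has been created and no messages arrive.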

Using Python SDK

If you are using the GlassFlow Python SDK to deploy and manage the pipeline, the following code shows how to configure the Google Pub/Sub connector via the SDK.

A fully functional example that creates a pipeline with Google Pub/Sub as a managed source connector is available in our examples repo as a Jupyter Notebook.

Using GitHub Actions

If you are using GitHub Actions to deploy and manage the pipeline, the following snippet shows the YAML configuration of the Google Pub/Sub connector component.