Google Pub/Sub Source Integration
This page provides detailed instructions on how to integrate Google Pub/Sub as a data source in GlassFlow.
Google Pub/Sub is a messaging service that allows you to send and receive messages between independent applications. It is commonly used to stream events and data between systems in real-time. By using Google Pub/Sub as a source, you can ingest data from topics into your GlassFlow pipelines for further processing.
Prerequisites
Before you begin, ensure you have the following:
- A Google Cloud account and access to the Google Pub/Sub service.
- A Google Pub/Sub topic and subscription that will be used as the data source.
- A GlassFlow account and access to the GlassFlow WebApp or Python SDK for pipeline creation.
- Google Cloud credentials (service account key) for authentication.
Step 1: Setting Up Google Pub/Sub
Follow Google's step-by-step guidance to set up a Google Cloud console project and perform basic tasks in Pub/Sub using the Google Cloud console. Then collect the following values, which you will need in Step 2:
- Project ID: You can find your Project ID in the Google Cloud Console under the "Project Info" section on the dashboard.
- Pub/Sub Subscription ID:
  - Go to the Google Cloud Console.
  - Navigate to the Pub/Sub section.
  - Select your topic and find the subscription ID under the "Subscriptions" tab.
- Service Account Key:
  - Go to the Service Accounts page in the Google Cloud Console.
  - Select your project.
  - Click Create Service Account.
  - Fill in the required details and click Create.
  - Assign the Pub/Sub Subscriber role to the service account.
  - Click Done and then Manage Keys for the service account you created.
  - Click Add Key > Create New Key.
  - Choose JSON and click Create. A JSON file will be downloaded to your computer.
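Before pasting the downloaded key anywhere, it can help to confirm that the file is intact and really is a service account key. The sketch below is a minimal sanity check, not part of GlassFlow or of any Google library: the function name `check_service_account_key` is hypothetical, and the list of required fields is an assumption based on the fields a service account key file normally contains.

```python
import json

# Fields a Google service account key file normally contains.
# This list is an assumption used for a quick sanity check, not a full schema.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(raw_json):
    """Return a list of problems found in the key text; an empty list means it looks fine."""
    try:
        key = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(key, dict):
        return ["top-level JSON value is not an object"]

    # Report every required field that is absent or empty.
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not key.get(name)]

    # Keys created through "Add Key" for a service account carry this type.
    key_type = key.get("type")
    if key_type and key_type != "service_account":
        problems.append(f"unexpected type: {key_type!r}")
    return problems
```

To check a downloaded file, read it as text and pass the content in, for example `check_service_account_key(open("key.json").read())`, where `key.json` stands for wherever you saved the file. The check only looks at the structure of the file; it does not contact Google or verify that the key has the Pub/Sub Subscriber role.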
Step 2: Integrating Google Pub/Sub as a Source in GlassFlow
You can integrate Google Pub/Sub with GlassFlow either through the WebApp or by using the Python SDK. Below are the instructions for both methods.
Using WebApp
- Log in to the GlassFlow WebApp and navigate to the "Pipelines" section.
- Create a new pipeline.
- Go to the Source section on the pipeline setup page and select Google Pub/Sub.
- Enter the following details:
  - Project ID: Your Google Cloud project ID.
  - Subscription ID: The ID of the Pub/Sub subscription from which to pull messages.
  - Credentials JSON: Copy and paste the content of the service account key JSON file you downloaded above.
- Click Continue to save your source settings and configure other parts of your pipeline.
Once created, this source will be available for use in your GlassFlow pipelines. Data will be ingested in real-time as it arrives in the Pub/Sub subscription.
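The form asks for the project ID and the subscription ID as separate values, while Google tools often display a subscription by its full resource name, `projects/<project-id>/subscriptions/<subscription-id>`. The following sketch splits such a name into the two values. The helper name `split_subscription` is hypothetical and is not part of GlassFlow or the Google client libraries; it only does string handling.

```python
import re

# Full Pub/Sub subscription names have the form
# projects/<project-id>/subscriptions/<subscription-id>.
_FULL_NAME = re.compile(r"^projects/([^/]+)/subscriptions/([^/]+)$")


def split_subscription(value, default_project=None):
    """Return (project_id, subscription_id) from a full name or a short ID."""
    value = value.strip()
    match = _FULL_NAME.match(value)
    if match:
        return match.group(1), match.group(2)

    # Anything else containing a slash is not a subscription name we recognize.
    if not value or "/" in value:
        raise ValueError(f"not a subscription ID or full subscription name: {value!r}")

    # A short ID carries no project information, so the caller must supply it.
    if default_project is None:
        raise ValueError("short subscription ID given without a project ID")
    return default_project, value
```

For example, `split_subscription("projects/my-proj/subscriptions/my-sub")` returns `("my-proj", "my-sub")`, where both names are placeholders for your own project and subscription.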
Using Python SDK
- Log in to the GlassFlow WebApp and copy your personal access token from the profile page. You need it to authenticate the Python SDK.
- Download the Jupyter notebook from the GlassFlow GitHub examples repository. It is a step-by-step guide to creating a pipeline with Google Pub/Sub as a data source.
- Install the GlassFlow Python SDK locally and run the Jupyter notebook.
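When running the notebook, one option is to keep the personal access token out of the notebook file by reading it from an environment variable. The sketch below uses only the Python standard library. The variable name `GLASSFLOW_ACCESS_TOKEN` and the helper `read_access_token` are assumptions made for illustration; how the token is then passed to the SDK is shown in the notebook itself and is not reproduced here.

```python
import os


def read_access_token(env_var="GLASSFLOW_ACCESS_TOKEN", environ=None):
    """Return the token stored in env_var, or raise if it is missing or blank."""
    # Allow a custom mapping to be passed in so the function is easy to test.
    env = os.environ if environ is None else environ
    token = env.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"set {env_var} to your GlassFlow personal access token")
    return token
```

Set the variable in the shell that launches Jupyter, then call `read_access_token()` in the notebook cell where the token is needed.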