Glossary
This glossary provides definitions for key terms and concepts related to GlassFlow.
A
API:
The entry point for all requests to the GlassFlow platform. The REST API handles account creation, authentication, and data pipeline management.
B
Batch Processing:
The processing of data in large, discrete batches, typically at scheduled intervals. Often used when real-time processing is not required, allowing for more efficient handling of large volumes of data.
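The core idea can be sketched in plain Python (no GlassFlow APIs involved): records are accumulated and handled in fixed-size chunks rather than one at a time.

```python
from typing import Iterable, Iterator, List


def batches(records: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Yield fixed-size batches from a stream of records."""
    batch: List[dict] = []
    for record in records:
        batch.append(record)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # flush the final, possibly smaller batch
        yield batch


# Five records in batches of two: two full batches plus one remainder.
records = [{"id": i} for i in range(5)]
print([len(b) for b in batches(records, size=2)])  # [2, 2, 1]
```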
C
CLI (Command Line Interface):
A tool provided by GlassFlow for users to interact with the platform through command line commands, enabling pipeline management and other operations such as deploying, monitoring, and troubleshooting.
Custom Transformation Logic:
User-defined Python code that specifies how data should be transformed as it moves through a pipeline. This allows users to perform complex, personalized data transformations according to their needs.
D
Data Pipeline:
A configured sequence of operations in GlassFlow that processes and routes data from sources, through transformations, to destinations. Pipelines automate the movement and transformation of data for analytics or other use cases.
Data Transformation:
The process of converting data from one format, structure, or schema into another within a pipeline. This is a critical step for ensuring data is usable and compatible with downstream systems or analytics.
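For instance, a transformation might map a source record's schema onto the shape a destination expects. A standalone sketch (all field names here are invented for illustration, not part of any GlassFlow schema):

```python
def to_destination_schema(record: dict) -> dict:
    """Map a hypothetical source record onto a destination schema:
    rename fields, normalize types, and drop what the sink does not need."""
    return {
        "user_id": int(record["id"]),       # string id -> integer
        "email": record["email"].lower(),   # normalize case
        "signup_ts": record["created_at"],  # rename field
    }


source_record = {
    "id": "42",
    "email": "Ada@Example.COM",
    "created_at": "2024-01-01T00:00:00Z",
    "internal_flag": True,  # dropped: the destination does not need it
}
print(to_destination_schema(source_record))
# {'user_id': 42, 'email': 'ada@example.com', 'signup_ts': '2024-01-01T00:00:00Z'}
```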
Destination:
The endpoint in a data pipeline where processed data is sent, such as a database, cloud storage service, or another application. Common destinations include systems like Amazon S3, PostgreSQL databases, or external APIs.
E
Event-Driven Architecture:
A software architecture paradigm where operations are triggered by events or changes in state rather than by linear workflows. This approach allows for real-time data processing and responsiveness to data changes.
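The pattern can be illustrated in a few lines of plain Python (the event names are made up; this is not GlassFlow code): handlers register for an event type and run only when that event occurs.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

# event type -> list of handlers triggered by it
handlers: DefaultDict[str, List[Callable[[dict], None]]] = defaultdict(list)
seen = []


def on(event_type: str):
    """Register the decorated function as a handler for an event type."""
    def register(fn):
        handlers[event_type].append(fn)
        return fn
    return register


def emit(event_type: str, payload: dict) -> None:
    """Invoke every handler registered for this event type."""
    for fn in handlers[event_type]:
        fn(payload)


@on("order.created")
def reserve_stock(payload: dict) -> None:
    seen.append(("reserve", payload["order_id"]))


emit("order.created", {"order_id": 7})    # triggers reserve_stock
emit("order.cancelled", {"order_id": 7})  # no handler registered: nothing runs
print(seen)  # [('reserve', 7)]
```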
F
Function:
In the context of GlassFlow, a Python script containing a handle function that defines custom data transformation logic. Functions are used to process and modify data within pipelines.
H
Handle Function:
The mandatory function within a transformation script that GlassFlow invokes to process data. It is the main entry point for custom transformation logic and must accept and return data in a specified format.
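A minimal sketch of such a script. The `handler(data, log)` shape follows GlassFlow's documented pattern as best understood here; check the official documentation for the exact contract, and note the derived field below is invented for illustration:

```python
import logging


def handler(data: dict, log: logging.Logger) -> dict:
    """Entry point the platform invokes per event:
    receive one record, return the transformed record."""
    log.info("processing event %s", data.get("id"))
    data["amount_cents"] = int(round(data["amount"] * 100))  # derive a field
    return data


# Simulating how the platform might invoke the function:
event = {"id": "evt-1", "amount": 12.5}
print(handler(event, logging.getLogger("pipeline")))
# {'id': 'evt-1', 'amount': 12.5, 'amount_cents': 1250}
```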
I
Integration:
The connection and communication between GlassFlow and external services or tools, enhancing the platform's capabilities. This includes integrations with data sources, destinations, and external APIs for seamless data flow.
L
Logging:
The recording of events and operations within GlassFlow, useful for monitoring and debugging pipelines. Logs provide detailed insights into pipeline performance, errors, and system behavior.
M
Message Broker:
In the context of GlassFlow, a message broker facilitates the efficient routing, processing, and delivery of data streams between sources and destinations within data pipelines. It supports various messaging patterns such as publish/subscribe, request/reply, and queuing.
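As a rough, dependency-free sketch of the queuing pattern (a toy model, not GlassFlow's actual broker): a broker decouples producers from consumers via named subjects.

```python
from collections import defaultdict, deque


class TinyBroker:
    """Toy in-memory broker: producers publish to a subject,
    consumers pull from it in FIFO order."""

    def __init__(self):
        self.queues = defaultdict(deque)

    def publish(self, subject: str, message: dict) -> None:
        self.queues[subject].append(message)

    def consume(self, subject: str):
        """Pop the oldest message for a subject, or None if empty."""
        q = self.queues[subject]
        return q.popleft() if q else None


broker = TinyBroker()
broker.publish("orders", {"id": 1})
broker.publish("orders", {"id": 2})
print(broker.consume("orders"))  # {'id': 1}  (FIFO: oldest first)
```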
N
NATS JetStream:
An advanced, scalable messaging system integrated with GlassFlow for efficient data streaming and event handling. NATS JetStream enables high-throughput, low-latency messaging with persistent storage, supporting real-time data processing workflows.
O
Organization:
Represents a collective group or entity that encompasses multiple users, spaces, and pipelines. It serves as the primary account structure for managing access, projects, and resources within the GlassFlow platform, facilitating collaboration and resource sharing among team members.
P
Pipeline Configuration:
The process of defining the settings and operations of a data pipeline in GlassFlow, including its sources, transformations, and destinations. Pipeline configuration can be done via the GlassFlow WebApp or programmatically using the GlassFlow Python SDK.
PostgreSQL:
An open-source relational database system integrated with GlassFlow for data storage and management. It supports complex queries and is commonly used as a destination for storing processed data.
R
Real-Time Data Processing:
The analysis and processing of data immediately as it is received, enabling instant insights and actions. Real-time processing is crucial for applications that require timely responses, such as fraud detection, monitoring, and live analytics.
S
SDK (Software Development Kit):
A set of tools and libraries provided by GlassFlow for developing applications and services that interact with the platform in Python. The SDK allows users to manage pipelines, data sources, and other GlassFlow resources programmatically.
Serverless Architecture:
A cloud computing execution model where the cloud provider dynamically manages the allocation of machine resources, scaling automatically to match the demands of the application. GlassFlow leverages serverless architecture to provide scalability and flexibility without the need for users to manage infrastructure.
Source:
The origin point in a data pipeline from which data is ingested. Sources can include databases, APIs, file systems, or message brokers.
Space:
A workspace or environment in GlassFlow where related pipelines are organized and managed. Spaces allow for the segregation of different projects, environments, or teams within an organization, providing a way to manage resources and permissions effectively.
T
Throughput:
The amount of data processed by a pipeline or system in a given period. High throughput is important for real-time data processing and large-scale data pipelines.
Transformation Function:
A Python function used in GlassFlow to define custom data transformations. This function is part of the larger pipeline and is invoked to modify the data between its source and destination.
U
Unit Test:
A type of software testing that verifies individual units or components of a system in isolation. In the context of GlassFlow, unit tests ensure that individual pipeline components (such as transformation functions) work as expected before being integrated into a larger workflow.
User Role:
A defined set of permissions and access rights assigned to users within an organization or Space. Roles allow for granular access control, ensuring that team members have the appropriate level of access to pipelines and other resources.
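A transformation function can be exercised in isolation with Python's built-in `unittest` module; the `handler` below is a stand-in written for this sketch, not a real pipeline component.

```python
import unittest


def handler(data: dict, log=None) -> dict:
    """Stand-in transformation: uppercase the 'name' field."""
    data["name"] = data["name"].upper()
    return data


class HandlerTest(unittest.TestCase):
    def test_uppercases_name(self):
        out = handler({"name": "ada"})
        self.assertEqual(out["name"], "ADA")

    def test_preserves_other_fields(self):
        out = handler({"name": "ada", "id": 1})
        self.assertEqual(out["id"], 1)


# Run the tests explicitly so this snippet works outside a test runner.
suite = unittest.TestLoader().loadTestsFromTestCase(HandlerTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())  # True
```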
W
Webhook:
A mechanism for one system to send real-time data to another system by making an HTTP request. In GlassFlow, webhooks can be used to trigger actions or notifications based on pipeline events or external changes in data.
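The mechanics reduce to an HTTP POST with a JSON body. This sketch only constructs the request without sending it; the endpoint URL and payload fields are made up for illustration.

```python
import json
import urllib.request


def build_webhook_request(url: str, event: dict) -> urllib.request.Request:
    """Prepare an HTTP POST carrying the event as JSON.
    Actually sending it would be urllib.request.urlopen(req)."""
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_webhook_request(
    "https://example.com/hooks/pipeline",
    {"event": "pipeline.finished", "status": "ok"},
)
print(req.get_method(), req.get_full_url())
# POST https://example.com/hooks/pipeline
```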
X
X-Request-ID:
A unique identifier used to track and correlate API requests in GlassFlow. The X-Request-ID helps in debugging and tracing requests through the system by providing a way to link logs and operations associated with a particular request.
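Clients commonly either receive such a header from the platform or attach their own; generating one is a one-liner (the header name comes from the entry above, the rest is generic Python):

```python
import uuid


def with_request_id(headers: dict) -> dict:
    """Attach a fresh X-Request-ID so logs across services can be correlated."""
    return {**headers, "X-Request-ID": str(uuid.uuid4())}


headers = with_request_id({"Content-Type": "application/json"})
print(headers["X-Request-ID"])  # a unique UUID4 string per request
```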
This glossary serves as a reference for the key terms and concepts in GlassFlow, helping users better understand the platform's features and functionality. For further details, please refer to the full documentation.