Data transformation
This page outlines data transformation concepts in GlassFlow.
What is Data Transformation?
Data transformation converts data from its original format or structure into one better suited to analysis, processing, or storage.
It often involves cleaning, enriching, and otherwise manipulating data using various libraries and functions.
Common Data Transformations that you can do with GlassFlow
- Data Cleaning by removing unwanted columns
- Data Enrichment with external data from APIs
- Data Validation to check for schema consistency
- Data Anomaly Detection
- Data Quality Check with custom rules
- Data Normalization to adhere to destination schemas
- Data Conversion from one format to another
- Real-time API integration to enrich data
- LLM integration via any custom Python library
- Integration of trained ML models from Hugging Face or other providers
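To make the first item concrete, a minimal sketch of data cleaning by removing unwanted columns, assuming events arrive as plain Python dicts (the field names are illustrative):

```python
# Columns (keys) we never want to forward downstream.
UNWANTED = {"internal_id", "debug_info"}

def clean_event(event: dict) -> dict:
    """Return a copy of the event without the unwanted columns."""
    return {k: v for k, v in event.items() if k not in UNWANTED}

event = {"user": "alice", "amount": 42, "internal_id": "x-1", "debug_info": "trace"}
print(clean_event(event))  # → {'user': 'alice', 'amount': 42}
```

The same pattern extends to validation and normalization: inspect the incoming dict, adjust it, and return the result.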
Transforming data in Python with GlassFlow
In GlassFlow, you transform data by writing a custom transformation function in a Python script.
You implement your transformation logic inside the handler function.
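A minimal transformation script following the handler pattern described above. The exact signature shown here, `handler(data, log)` with a runtime-supplied logger, is an assumption; check the GlassFlow documentation for the current interface:

```python
def handler(data, log):
    """Receive one event as a dict and return the transformed event."""
    # Illustrative logic: normalize a field and add a derived one.
    data["email"] = data.get("email", "").strip().lower()
    data["has_email"] = bool(data["email"])
    return data
```

Whatever the handler returns is what the pipeline forwards to the destination.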
Deploy transformation function
To deploy and run the transformation function you defined in GlassFlow, you create a pipeline and provide the function along with a requirements.txt file and any environment variables.
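Concretely, a deployment bundles a few artifacts. The layout below is an illustrative sketch; the script filename is an assumption, and environment variables are supplied through the pipeline configuration rather than a file:

```
transform.py        # Python script containing the handler function
requirements.txt    # third-party dependencies, one per line, e.g. requests==2.31.0
```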
GlassFlow packages the function as a container and executes it on its Serverless Execution Engine for every event entering the pipeline.
Python dependencies for transformation
Each import statement in your transformation function script brings in a Python dependency.
GlassFlow needs to install those dependencies to build and run the function successfully.
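A sketch of how imports map to dependencies: standard-library imports (like json below) need no requirements.txt entry, while each third-party import (e.g. `import requests`) needs its own pinned line such as `requests==2.31.0` (the version is illustrative). The handler signature and field names are assumptions:

```python
import json  # standard library: no requirements.txt entry needed

def handler(data, log):
    # Parse a JSON-encoded payload field using only the standard library.
    data["parsed"] = json.loads(data["payload"])
    return data
```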