Pipeline Execution Errors
If your pipeline fails or behaves unexpectedly because of errors in data transformation or processing, the troubleshooting steps below can help you identify and resolve the issue.
Note:
Always check the pipeline logs in the GlassFlow WebApp for detailed error messages. Logs provide valuable insights into the source of issues, helping you diagnose problems quickly.
Transformation Script Errors
- Symptom: Your pipeline fails during execution, showing errors related to transformation scripts (e.g., syntax errors, missing dependencies).
- Solution:
- Double-check the transformation script (transform.py) for any syntax errors or unresolved variables.
- Ensure that all necessary dependencies are listed in the requirements.txt file. You can test locally by creating a virtual environment and running the script with the specified dependencies.
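The local check described above can be partly automated. The following is a minimal sketch, not part of GlassFlow itself: the helper name check_script is hypothetical, and it assumes you run it inside the virtual environment created from your requirements.txt. It parses the script to catch syntax errors and then reports any top-level import that cannot be found in the current environment.

```python
import ast
import importlib.util


def check_script(source: str) -> list[str]:
    """Return a list of problems found in a transformation script's source.

    Hypothetical helper for local testing; it only inspects the source
    and never executes the script.
    """
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error on line {exc.lineno}: {exc.msg}"]

    problems = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]
        else:
            continue
        for name in names:
            # Only the top-level package needs to be installed.
            top_level = name.split(".")[0]
            if importlib.util.find_spec(top_level) is None:
                problems.append(f"module not installed: {top_level}")
    return problems
```

To use it, read the script and print the result, for example check_script(open("transform.py").read()). An empty list means the script parses and its imports resolve locally; it does not catch unresolved variables or errors that only occur at runtime.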
Missing Dependencies in the Environment
- Symptom: The pipeline fails due to missing dependencies or incompatible versions.
- Solution:
- Verify that your requirements.txt includes all libraries used in the transformation script. Pin specific version numbers where compatibility requires it.
- Update the requirements.txt file in the pipeline settings. The pipeline then automatically fetches the latest dependencies so that everything listed is installed.
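If version incompatibilities are suspected, it can help to see which entries in requirements.txt are not pinned to an exact version. The sketch below is illustrative only: unpinned_requirements is a hypothetical helper, and it handles just the simple "name==version" style of line, not every form that pip accepts.

```python
def unpinned_requirements(text: str) -> list[str]:
    """Return requirement lines that do not pin an exact version with '=='."""
    unpinned = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blank lines and pip options
            continue
        if "==" not in line:
            unpinned.append(line)
    return unpinned


# Sample content for illustration; package names and versions are examples.
sample = "pandas==2.2.2\nrequests>=2.0\nnumpy\n"
print(unpinned_requirements(sample))  # ['requests>=2.0', 'numpy']
```

Unpinned entries are not errors in themselves. They are simply the first candidates to pin when a pipeline that worked before starts failing after dependencies are refreshed.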
Data Schema Mismatches
- Symptom: The pipeline errors out or produces unexpected results due to data schema mismatches between input and output.
- Solution:
- Verify the schema of the input data matches the expected structure within your transformation function.
- If you’re using nested or complex data structures, consider adding validation steps in your script to catch schema issues early on.
- Test with a small subset of data locally to confirm the schema before deploying the full pipeline.
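A validation step for nested data might look like the sketch below. It is one possible approach under stated assumptions: validate_event and EXPECTED_SCHEMA are hypothetical names, the field names are invented for illustration, and the schema is expressed as a plain dictionary that maps each field to a Python type or to a nested dictionary.

```python
def validate_event(event, schema, path=""):
    """Return a list of mismatches between an event and an expected schema.

    `schema` maps field names to either a type or a nested schema dict.
    """
    if not isinstance(event, dict):
        return [f"{path or 'event'}: expected object, got {type(event).__name__}"]
    errors = []
    for key, expected in schema.items():
        field = f"{path}.{key}" if path else key
        if key not in event:
            errors.append(f"{field}: missing")
        elif isinstance(expected, dict):
            # Recurse into nested structures, tracking the dotted path.
            errors.extend(validate_event(event[key], expected, field))
        elif not isinstance(event[key], expected):
            errors.append(
                f"{field}: expected {expected.__name__}, "
                f"got {type(event[key]).__name__}"
            )
    return errors


# Hypothetical schema; replace the fields with those your pipeline expects.
EXPECTED_SCHEMA = {"user": {"id": int, "email": str}, "amount": float}

bad_event = {"user": {"id": "1"}, "amount": 9.5}
print(validate_event(bad_event, EXPECTED_SCHEMA))
# ['user.id: expected int, got str', 'user.email: missing']
```

Calling a check like this at the start of your transformation function, and logging or raising on a non-empty result, makes schema problems show up in the pipeline logs with the exact field path. Running it over a small local sample first follows the same idea as the last step above.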