Pipeline Execution Errors

If your pipeline fails or behaves unexpectedly because of errors in data transformation or processing, the troubleshooting steps below can help you identify and resolve the issue.

Note:

Always check the pipeline logs in the GlassFlow WebApp for detailed error messages. Logs provide valuable insights into the source of issues, helping you diagnose problems quickly.

Transformation Script Errors

  • Symptom: Your pipeline fails during execution, showing errors related to transformation scripts (e.g., syntax errors, missing dependencies).
  • Solution:
    • Check the transformation script (transform.py) for syntax errors and undefined variables.
    • Ensure that all necessary dependencies are listed in the requirements.txt file. You can test locally by creating a virtual environment and running the script with the specified dependencies.
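
One way to catch syntax errors before deploying is to parse the script locally with Python's standard ast module. The following is a minimal sketch, not part of GlassFlow itself; the helper name check_syntax is hypothetical, and it only detects syntax errors, not undefined variables or missing packages.

```python
import ast
from typing import Optional


def check_syntax(source: str, filename: str = "transform.py") -> Optional[str]:
    """Return a short error description if `source` is not valid Python, else None.

    Hypothetical helper for local checks; the code is parsed, never executed.
    """
    try:
        ast.parse(source, filename=filename)
    except SyntaxError as exc:
        return f"{filename}:{exc.lineno}: {exc.msg}"
    return None


# Example with inline snippets; in practice, pass the contents of transform.py.
valid_result = check_syntax("x = 1\n")
broken_result = check_syntax("def handler(data:\n    return data\n")

print(valid_result)   # None, because the snippet parses cleanly
print(broken_result)  # a message that points at the offending line
```

To check a real script, read it first, for example with pathlib.Path("transform.py").read_text(), and pass the result to check_syntax.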

Missing Dependencies in the Environment

  • Symptom: The pipeline fails due to missing dependencies or incompatible versions.
  • Solution:
    • Verify that your requirements.txt includes all libraries used in the transformation script. Be specific with version numbers if required for compatibility.
    • Update the requirements.txt file in the pipeline settings. The pipeline then automatically fetches the latest dependencies so that everything listed is installed.
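
To see which libraries a script actually imports, and which of them are not installed in your local environment, you can inspect its import statements. The sketch below uses only the standard library; the function names imported_modules and missing_modules are hypothetical. It reports import names, which can differ from the package names used in requirements.txt (for example, the yaml module is provided by the PyYAML package), so treat the result as a starting point rather than a complete check.

```python
import ast
import importlib.util
from typing import List, Set


def imported_modules(source: str) -> Set[str]:
    """Collect the top-level module names imported by `source` (hypothetical helper)."""
    modules: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 skips relative imports such as "from . import helpers"
            modules.add(node.module.split(".")[0])
    return modules


def missing_modules(source: str) -> List[str]:
    """Return imported modules that cannot be found in the current environment."""
    return sorted(
        name for name in imported_modules(source)
        if importlib.util.find_spec(name) is None
    )


# Example with an inline snippet; in practice, pass the contents of transform.py.
script = "import json\nimport hypothetical_pkg_not_installed_xyz\n"
print(missing_modules(script))
```

Running this inside a fresh virtual environment that contains only the packages from requirements.txt shows which imports the file does not yet cover.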

Data Schema Mismatches

  • Symptom: The pipeline errors out or produces unexpected results due to data schema mismatches between input and output.
  • Solution:
    • Verify the schema of the input data matches the expected structure within your transformation function.
    • If you’re using nested or complex data structures, consider adding validation steps in your script to catch schema issues early on.
    • Test with a small subset of data locally to confirm the schema before deploying the full pipeline.
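
The validation step suggested above can be sketched as a small recursive check run at the start of the transformation. Everything here is an assumption for illustration: the field names in EXPECTED_SCHEMA, the validate_event helper, and the sample events are hypothetical and should be replaced with your pipeline's actual structure.

```python
from typing import Any, Dict, List

# Hypothetical expected schema: field name -> Python type,
# or a nested dict describing a nested object.
EXPECTED_SCHEMA: Dict[str, Any] = {
    "user_id": str,
    "amount": float,
    "address": {"city": str, "zip": str},
}


def validate_event(event: Any, schema: Dict[str, Any], path: str = "") -> List[str]:
    """Return a list of schema problems; an empty list means the event matches."""
    if not isinstance(event, dict):
        return [f"{path or 'event'}: expected object, got {type(event).__name__}"]
    errors: List[str] = []
    for key, expected in schema.items():
        field = f"{path}.{key}" if path else key
        if key not in event:
            errors.append(f"{field}: missing")
        elif isinstance(expected, dict):
            # Recurse into nested objects, extending the dotted field path.
            errors.extend(validate_event(event[key], expected, field))
        elif not isinstance(event[key], expected):
            errors.append(
                f"{field}: expected {expected.__name__}, got {type(event[key]).__name__}"
            )
    return errors


# Try it locally on a small sample before deploying the full pipeline.
sample = {"user_id": 7, "address": {"city": "Berlin"}}
for problem in validate_event(sample, EXPECTED_SCHEMA):
    print(problem)
```

Returning a list of problems instead of raising on the first one makes it easier to log every mismatch for an event in a single pass.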