Updated on 2025-09-18 GMT+08:00

Development Notes

  • DataArts Fabric SQL offers three ways to register UDFs: Explicit Registration of Scalar UDF, Implicit Registration of Scalar UDF, and SQL DDL Registration. Explicit registration through the ibis-fabric SDK is recommended, for two reasons:
    • The registration interface provided by the SDK follows the DataArts Fabric SQL DDL rules. Calling the interface correctly avoids UDFs that fail to work because they violate DDL constraints.
    • The SDK performs additional checks and preparation steps, such as verifying the Python version and automatically organizing the UDF code package.
  • The third-party dependencies of a Python UDF must be compatible with one another, and you are advised to explicitly pin a stable version for every dependency. When debugging UDFs locally, set up the debugging environment with third-party packages from Huawei sources. This avoids packages that cannot be installed, or version mismatches between the local environment and the environment in which the UDF code runs in DataArts Fabric SQL.
  • When preparing the compressed package and using CLOUDPICKLE to serialize the main function, ensure that the local Python version is 3.11.9 and the cloudpickle version is 3.0.0. If these versions differ from those in the DataArts Fabric SQL runtime environment, the function body may fail to parse.
  • UDF code compressed packages can be stored only in buckets of the OBS parallel file system, and you must grant read permissions to IAM users on LakeFormation.
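
The version and dependency notes above can be checked locally before a UDF package is built. The following is a minimal sketch, not part of the ibis-fabric SDK: the helper names check_versions and unpinned_requirements are hypothetical, and the package names and versions in the sample requirement list are illustrative only. It uses only the Python standard library.

```python
import sys
from importlib import metadata

# Versions stated in the notes above for CLOUDPICKLE serialization.
REQUIRED_PYTHON = (3, 11, 9)
REQUIRED_CLOUDPICKLE = "3.0.0"


def check_versions(python_version=None, cloudpickle_version=None):
    """Return a list of mismatch messages; an empty list means both versions match.

    Hypothetical helper. Arguments default to the values detected in the
    current interpreter so the function can also be tested with fixed inputs.
    """
    if python_version is None:
        python_version = tuple(sys.version_info[:3])
    if cloudpickle_version is None:
        try:
            cloudpickle_version = metadata.version("cloudpickle")
        except metadata.PackageNotFoundError:
            cloudpickle_version = None  # cloudpickle is not installed

    problems = []
    if tuple(python_version) != REQUIRED_PYTHON:
        found = ".".join(str(part) for part in python_version)
        problems.append(f"Python is {found}, expected 3.11.9")
    if cloudpickle_version != REQUIRED_CLOUDPICKLE:
        found = cloudpickle_version or "not installed"
        problems.append(f"cloudpickle is {found}, expected {REQUIRED_CLOUDPICKLE}")
    return problems


def unpinned_requirements(lines):
    """Return requirement lines that do not pin an exact version with '=='.

    Hypothetical helper. Blank lines and '#' comments are ignored.
    """
    unpinned = []
    for line in lines:
        spec = line.split("#", 1)[0].strip()
        if spec and "==" not in spec:
            unpinned.append(spec)
    return unpinned


if __name__ == "__main__":
    # Report problems in the current environment without stopping the script.
    for message in check_versions():
        print("version check:", message)

    # Illustrative requirement list; the packages and versions are examples only.
    sample = ["numpy==1.26.4", "pandas>=2.0"]
    for spec in unpinned_requirements(sample):
        print("not pinned to an exact version:", spec)
```

Running such a check before packaging surfaces a mismatched interpreter or an unpinned dependency locally. Note that it does not test whether the pinned versions are compatible with one another.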