Connect to BigQuery

Prerequisites

  • Procore Analytics 2.0 SKU.

  • Python 3.8 or higher.

  • Access to Google Cloud Platform (GCP).

  • Required permissions on both Delta Share and BigQuery.

  • The zipped package from the company-level Procore Analytics tool (Procore Analytics > Getting Started > Connection Options > BigQuery).

Steps

Set Up Configuration

Delta Share Configuration

  1. Create a file named config.share with your Delta Share credentials in JSON format.

  2. Add the following required fields to the file.
    Note: You can obtain these values from the Procore Analytics web application.

    • bearerToken: Your Delta Share access token.

    • endpoint: Your Delta Share endpoint URL.

    • shareCredentialsVersion: Version number (currently 1).

Example config.share File

{
  "shareCredentialsVersion": 1,
  "bearerToken": "",
  "endpoint": ""
}
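Before moving on, it can help to confirm that config.share is valid JSON and that none of the three fields is missing or empty. The following is a minimal sketch using only the Python standard library. The function name check_share_profile is hypothetical and is not part of the Procore package.

```python
import json

# The three fields listed in the steps above.
REQUIRED_FIELDS = ("shareCredentialsVersion", "bearerToken", "endpoint")


def check_share_profile(text):
    """Return a list of problems found in the text of a config.share file."""
    try:
        profile = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(profile, dict):
        return ["top-level value must be a JSON object"]

    problems = []
    for field in REQUIRED_FIELDS:
        if field not in profile:
            problems.append(f"missing field: {field}")
        elif profile[field] in ("", None):
            # The example file ships with empty strings that must be filled in.
            problems.append(f"empty field: {field}")
    return problems
```

To use it, read the file and print the result, for example: print(check_share_profile(open("config.share").read())). An empty list means no problems were detected by this check.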

BigQuery Configuration

  1. Download the bigquery.zip file from the Procore Analytics web application.
    Note: The package is available in the company-level Procore Analytics tool (Procore Analytics > Getting Started > Connection Options > BigQuery).

  2. Extract the package to a directory of your choice.

  3. Open the config.yaml file and modify the following parameters:

    • source_config.config_path: Path to the Delta Share configuration file (config.share).

    • source_config.tables: Optional list of specific tables to process. Leave it empty to process all tables.

    • target_config.project_id: GCP project ID for BigQuery.

    • target_config.dataset: BigQuery dataset name.

    • target_config.threads: Number of tables to process concurrently.

Example config.yaml File

source_config:
  config_path: "<path_to_delta_share_config>"
  tables: # Optional - list of specific tables to process
    - "table1"
    - "table2"

target_config:
  project_id: "<your-gcp-project-id>"
  dataset: "<bigquery-dataset-name>"
  target_type: bigquery
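A quick check of the parsed configuration can catch missing values before the application runs. The sketch below assumes config.yaml has already been parsed into a Python dictionary (for example with yaml.safe_load from PyYAML). Which fields the application actually requires is an assumption here, and check_config is a hypothetical helper, not part of the Procore package.

```python
def check_config(config):
    """Return a list of problems in a parsed config.yaml mapping.

    Assumption: config_path, project_id, and dataset are treated as
    required; tables and threads are treated as optional.
    """
    problems = []
    source = config.get("source_config") or {}
    target = config.get("target_config") or {}

    if not source.get("config_path"):
        problems.append("source_config.config_path is required")

    tables = source.get("tables")
    if tables is not None and not isinstance(tables, list):
        problems.append("source_config.tables must be a list or left empty")

    for key in ("project_id", "dataset"):
        if not target.get(key):
            problems.append(f"target_config.{key} is required")

    threads = target.get("threads")
    if threads is not None and (not isinstance(threads, int) or threads < 1):
        problems.append("target_config.threads must be a positive integer")

    return problems
```

An empty list means the values covered by this check are present and have a plausible type.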

Upload Configuration File

  1. Upload both the config.yaml and config.share files to a Google Cloud Storage (GCS) bucket.

    • Reference the uploaded files using the format gs://bucket-name/path/to/config.yaml.
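If you prefer to script the upload, the following sketch splits a gs:// path into its bucket and object parts and uploads a local file with the google-cloud-storage client library. It assumes that library is installed and that your environment is already authenticated to GCP. The helper names split_gs_uri and upload_file are hypothetical.

```python
def split_gs_uri(uri):
    """Split 'gs://bucket-name/path/to/file' into (bucket, object_path)."""
    prefix = "gs://"
    if not uri.startswith(prefix):
        raise ValueError(f"expected a gs:// path, got: {uri}")
    bucket, _, object_path = uri[len(prefix):].partition("/")
    if not bucket or not object_path:
        raise ValueError(f"expected gs://bucket-name/path/to/file, got: {uri}")
    return bucket, object_path


def upload_file(local_path, gs_uri):
    """Upload a local file to the given gs:// path."""
    # Imported here so the path helper above works without the library.
    from google.cloud import storage  # requires google-cloud-storage

    bucket_name, object_path = split_gs_uri(gs_uri)
    client = storage.Client()
    client.bucket(bucket_name).blob(object_path).upload_from_filename(local_path)
```

For example, upload_file("config.yaml", "gs://bucket-name/path/to/config.yaml") uploads the local config.yaml to that location. Uploading through the Google Cloud console works equally well.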

Run the BigQuery Application

  1. Create a Python notebook and install the following packages:

    • %pip install delta-sharing

    • %pip install -U pandas-gbq

  2. Copy the code from delta_share_to_bq.py, paste it into your notebook, update the configuration path (config.yaml), and run it.
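To illustrate what the two packages do together, here is a minimal sketch that copies a single shared table into BigQuery. It is not the contents of delta_share_to_bq.py. It assumes a locally readable config.share file, and the share, schema, and table names passed in are placeholders you would replace with your own. Replacing the destination table on each run (if_exists="replace") is also an assumption.

```python
def table_url(profile_path, share, schema, table):
    """Build the '<profile>#<share>.<schema>.<table>' address used by delta-sharing."""
    return f"{profile_path}#{share}.{schema}.{table}"


def copy_table(profile_path, share, schema, table, project_id, dataset):
    """Load one shared table into pandas and write it to BigQuery."""
    # Imported here so table_url can be used without the packages installed.
    import delta_sharing
    import pandas_gbq

    # Read the whole shared table into memory as a DataFrame.
    df = delta_sharing.load_as_pandas(table_url(profile_path, share, schema, table))

    # Write it to <dataset>.<table> in the given GCP project.
    pandas_gbq.to_gbq(
        df,
        f"{dataset}.{table}",
        project_id=project_id,
        if_exists="replace",  # assumption: overwrite the table on each run
    )
    return len(df)
```

Because load_as_pandas reads the full table into memory, this approach is best suited to tables that fit comfortably in the notebook's available RAM.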

Monitoring and Logging

The application provides detailed logging with:

  • Processing status for each table.

  • Error messages and exceptions.

  • Information about concurrent processing.
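The kind of logging described above can be sketched with the Python standard library as follows. This is an illustration of the pattern, not the application's actual code. The worker argument is a hypothetical function that processes one table, and threads plays the role of target_config.threads.

```python
import logging
from concurrent.futures import ThreadPoolExecutor, as_completed

log = logging.getLogger("delta_share_to_bq_sketch")


def process_tables(tables, worker, threads=4):
    """Run worker(table) for each table concurrently and log each outcome.

    Returns a dict mapping each table name to "ok" or "failed".
    """
    results = {}
    with ThreadPoolExecutor(max_workers=threads) as pool:
        futures = {pool.submit(worker, table): table for table in tables}
        for future in as_completed(futures):
            table = futures[future]
            try:
                future.result()
            except Exception:
                # Logs the message together with the full traceback.
                log.exception("Table %s failed", table)
                results[table] = "failed"
            else:
                log.info("Table %s processed", table)
                results[table] = "ok"
    return results
```

One failing table does not stop the others in this design, so a single run reports the status of every table. Call logging.basicConfig(level=logging.INFO) beforehand to see the informational messages.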

Best Practices

  • Performance Optimization

    • Adjust thread count based on system resources.

    • Monitor memory usage with large tables.

    • Consider table sizes when setting concurrent processes.

  • Error Management

    • Monitor application logs.

    • Set up appropriate alerting.

    • Maintain backup configurations.

Troubleshooting

Common issues and solutions:

  • Connection Failures

    • Verify network connectivity.

    • Check credential validity.

    • Confirm service account permissions.

  • Processing Errors

    • Verify table existence.

    • Check table access permissions.

    • Validate configuration settings.

  • Performance Issues

    • Reduce concurrent threads.

    • Monitor system resources.
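For the "Verify table existence" check, one option is to compare the tables named in config.yaml with the tables the share actually exposes. The sketch below uses the delta-sharing client to list shared tables. The helper names are hypothetical, and it assumes config.share is readable from where the code runs.

```python
def missing_tables(configured, available):
    """Return the configured table names that are not in the available names."""
    available_set = set(available)
    return [name for name in configured if name not in available_set]


def list_shared_table_names(profile_path):
    """List the names of all tables visible through the Delta Share profile."""
    # Imported here so missing_tables can be used without the package installed.
    import delta_sharing

    client = delta_sharing.SharingClient(profile_path)
    return [table.name for table in client.list_all_tables()]
```

For example, missing_tables(["table1", "table2"], list_shared_table_names("config.share")) returns any configured tables that the share does not expose. A failure in list_shared_table_names itself usually points to a connection or credential problem rather than a missing table.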

Support

For additional help:

  • Review application logs for error details.

  • Verify configuration settings.

  • Ensure all prerequisites are met.

  • Contact your system administrator for permission-related issues.