Onboarding Steps

Bluesky continuously monitors your Snowflake environment, surfacing insights into workloads, performance, bottlenecks, and more. Unlike simple dashboards or point solutions for query acceleration and warehouse optimization, Bluesky synthesizes signals from multiple layers (query, warehouse, storage, serverless) to deliver visibility into spend and performance, then provides actionable recommendations to remediate the most impactful performance issues.

Summary

Bluesky connects to the shared SNOWFLAKE database in your account, which contains metadata such as historical queries, usage metrics, organizations, and accounts. Bluesky uses this metadata to analyze your workloads, provide granular reporting, and suggest efficiency optimizations that improve the performance and ROI of Snowflake.

Bluesky requires only read-only access to the SNOWFLAKE database and therefore cannot mutate any data in it. Bluesky requires no access to the actual business data (e.g. tables, materialized views, schemas) in your Snowflake instance.
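You can audit the scope of this access yourself at any time. As a quick check (assuming the role name `BLUESKY_READONLY` created in the setup script below), the following command lists every grant the role holds:

```sql
-- List every privilege granted to the Bluesky role; the output should show
-- only metadata/usage grants, never SELECT on your business tables.
SHOW GRANTS TO ROLE BLUESKY_READONLY;
```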

Built with Security in Mind

Bluesky pulls Snowflake metadata from your instance via an Airbyte process. The metadata is stored and analyzed in Bluesky's multi-tenant Snowflake instance, where each customer's data is logically separated. The Bluesky platform is SOC 2 compliant.

By default, Bluesky does not have read or write access to any of the customer's data that is stored in Snowflake. This access is tightly controlled during the onboarding process where customers create a new user for Bluesky with an extremely limited set of permissions.

Bluesky goes further than most tools when it comes to privacy. We have developed an opt-in feature that checks query text for identifiable data. All such data is scrubbed on the customer side and is never transmitted outside the account.

Bluesky does not process any of your business data; it has only read-only access to your Snowflake metadata. Bluesky data is also encrypted both in transit and at rest. The following instructions describe how to create a read-only user with the appropriate level of access. These steps must be completed by a user with the ACCOUNTADMIN role.

Set Up Instructions

In the following script, we use the ACCOUNTADMIN role to create a user and a role for Bluesky with minimal privileges. ACCOUNTADMIN is required because it is the only role that can grant access to the Snowflake metadata that Bluesky reads for its analysis. The Bluesky user will only be able to read usage data, and you can track all of its activity in the Snowflake UI and in the query history.
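For example, one way to audit that activity (assuming the user name `BLUESKY_USER` created in the script below) is to query the `ACCOUNT_USAGE` query history directly:

```sql
-- Audit everything the Bluesky user has run in the last 7 days
-- (ACCOUNT_USAGE views can lag behind real time by up to 45 minutes).
SELECT query_text, warehouse_name, start_time
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE user_name = 'BLUESKY_USER'
  AND start_time > DATEADD('day', -7, CURRENT_TIMESTAMP())
ORDER BY start_time DESC;
```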

Your organization might have multiple accounts. Please make sure you run the script both in the account you want to onboard to Bluesky and in the account that has the ORGADMIN role; these may be the same account.
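If you are unsure which account you are connected to, the following commands can help; the `SHOW ORGANIZATION ACCOUNTS` step only works from the account holding ORGADMIN:

```sql
-- Confirm which account and region you are connected to
SELECT CURRENT_ORGANIZATION_NAME(), CURRENT_ACCOUNT(), CURRENT_REGION();

-- From the ORGADMIN account, list all accounts in the organization
USE ROLE ORGADMIN;
SHOW ORGANIZATION ACCOUNTS;
```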

Script to Copy:

Copy and paste the following into the Snowflake Console:

SQL Script


-- Step 1. Switch to the Account Admin role
USE ROLE ACCOUNTADMIN;

-- Step 2. Create a new role for Bluesky Data
CREATE ROLE BLUESKY_READONLY
   COMMENT = 'This role is used by Bluesky Data';
   
-- Step 3. Give our role privileges on the internal SNOWFLAKE db to allow us to analyze historical queries and metadata
GRANT IMPORTED PRIVILEGES
  ON DATABASE SNOWFLAKE
  TO ROLE BLUESKY_READONLY;

-- Step 4. Create a small sized Warehouse
CREATE WAREHOUSE BLUESKY_WH WITH
  WAREHOUSE_SIZE = 'SMALL'
  WAREHOUSE_TYPE = 'STANDARD'
  AUTO_SUSPEND = 30
  AUTO_RESUME = TRUE
  MIN_CLUSTER_COUNT = 1
  MAX_CLUSTER_COUNT = 2
  SCALING_POLICY = 'STANDARD';

-- Step 5. Give our role USAGE on the Warehouse
GRANT USAGE ON WAREHOUSE BLUESKY_WH TO ROLE BLUESKY_READONLY;

-- Step 6. Grant all privileges on the warehouse
GRANT ALL PRIVILEGES ON WAREHOUSE BLUESKY_WH TO ROLE BLUESKY_READONLY;

-- Step 7. Create a user for Bluesky. Replace the placeholder with a strong
-- password of your own; Snowflake rejects weak passwords (minimum 8
-- characters, with at least one digit, one uppercase, and one lowercase letter).
CREATE USER BLUESKY_USER
  PASSWORD = '<REPLACE_WITH_STRONG_PASSWORD>'
  DEFAULT_ROLE = BLUESKY_READONLY
  MUST_CHANGE_PASSWORD = TRUE;

-- Step 8. Grant the role to user
GRANT ROLE BLUESKY_READONLY TO USER BLUESKY_USER;

-- Step 9. Set the default role as read_only for safety reasons
ALTER USER BLUESKY_USER SET DEFAULT_ROLE = BLUESKY_READONLY;

-- Step 10. Create a DB for us and grant us write access to the DB. This step allows us to define our own views (e.g., sampling, or filters) for analysis purposes.
CREATE DATABASE BLUESKY_DB;

-- Step 11. Grant all privileges for the newly created DB (BLUESKY_DB) TO ROLE BLUESKY_READONLY
GRANT ALL PRIVILEGES ON DATABASE BLUESKY_DB TO ROLE BLUESKY_READONLY;

-- Step 12. Grant monitoring permission to role BLUESKY_READONLY
GRANT MONITOR USAGE ON ACCOUNT TO ROLE BLUESKY_READONLY;

-- Step 13. Allow access to query operator stats
GRANT MONITOR ON ACCOUNT TO ROLE BLUESKY_READONLY;

-- Step 14. Create a place for us to store the account metadata
CREATE SCHEMA BLUESKY_DB.DATA;

-- Step 15: Grant ownership to the DATA schema
GRANT OWNERSHIP ON SCHEMA BLUESKY_DB.DATA TO ROLE BLUESKY_READONLY;

-- Step 16. Grant privileges to the DATA schema
GRANT ALL PRIVILEGES ON SCHEMA BLUESKY_DB.DATA TO ROLE BLUESKY_READONLY;

-- Step 17. Explicitly grant privilege to create a stage on the newly created schema
GRANT CREATE STAGE ON SCHEMA BLUESKY_DB.DATA TO ROLE BLUESKY_READONLY;

-- Step 18. Create a storage integration
CREATE OR REPLACE STORAGE INTEGRATION BLUESKY_GCS_INTEGRATION
       TYPE = EXTERNAL_STAGE
       ENABLED = TRUE
       STORAGE_PROVIDER = 'GCS'
       STORAGE_ALLOWED_LOCATIONS = ('*');

-- Step 19. Grant ownership to the BLUESKY_GCS_INTEGRATION storage integration
GRANT OWNERSHIP ON INTEGRATION BLUESKY_GCS_INTEGRATION TO ROLE BLUESKY_READONLY;

-- Step 20. Grant all privileges to the BLUESKY_GCS_INTEGRATION storage integration
GRANT ALL PRIVILEGES ON INTEGRATION BLUESKY_GCS_INTEGRATION TO ROLE BLUESKY_READONLY;

-- Step 21. Grant privilege to create storage integrations
GRANT CREATE INTEGRATION ON ACCOUNT TO ROLE BLUESKY_READONLY;
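As an optional sanity check after running the script, you can confirm the new role can read account metadata. This sketch assumes you temporarily grant the role to your own user first (`<your_user>` is a placeholder):

```sql
-- Run as ACCOUNTADMIN so you can assume the new role yourself:
-- GRANT ROLE BLUESKY_READONLY TO USER <your_user>;

-- Switch to the Bluesky role and warehouse, then read shared metadata.
USE ROLE BLUESKY_READONLY;
USE WAREHOUSE BLUESKY_WH;
SELECT COUNT(*)
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE start_time > DATEADD('day', -1, CURRENT_TIMESTAMP());
```

If the final query returns a row count, the role has the metadata access Bluesky needs.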




Neither we nor you need to manage authentication credentials, since we use first-party Snowflake integrations with Google Cloud Platform. The account used to move data is created and maintained by Snowflake directly, and the keys to that account are never exposed. This means that key rotation and management does not need to be handled by either you or Bluesky. More context about how these Snowflake-managed accounts are created and used can be found in Snowflake's documentation. This built-in account exists regardless of which cloud your Snowflake account is deployed on.

On our end, we will take the name of the built-in Google Cloud Storage service account and grant it access to write to the bucket specified in the Storage Integration and Stages. This bucket is specific to your Snowflake account and is isolated such that only your account can write to that location.
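You can retrieve the name of that Snowflake-managed service account yourself; it appears in the `STORAGE_GCP_SERVICE_ACCOUNT` property of the integration created in the setup script:

```sql
-- The STORAGE_GCP_SERVICE_ACCOUNT row of the output shows the GCS service
-- account that Snowflake created and manages for this integration.
DESC STORAGE INTEGRATION BLUESKY_GCS_INTEGRATION;
```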

If your Snowflake account is deployed on Google Cloud Platform, all data stays within Google's private network. If your Snowflake account is deployed on AWS or Azure, data is sent across the public internet using TLS 1.2 encryption. All GCS storage is encrypted at rest by default using AES-256. More information on cross-cloud data transfers can be found in an FAQ provided by Snowflake.

Network Policies (where applicable)

If you are restricting access to your account based on user IP address, please add the following Bluesky server IPs to the Allowed IP Addresses List:

35.162.112.43

52.37.151.111

44.240.212.236

35.193.205.184

35.199.151.33

54.144.234.126
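As a sketch of how to add these (assuming an existing policy named `MY_NETWORK_POLICY`; substitute your actual policy name):

```sql
-- SET ALLOWED_IP_LIST replaces the entire list, so include your existing
-- allowed IPs alongside the Bluesky server IPs below.
ALTER NETWORK POLICY MY_NETWORK_POLICY SET ALLOWED_IP_LIST = (
  '35.162.112.43', '52.37.151.111', '44.240.212.236',
  '35.193.205.184', '35.199.151.33', '54.144.234.126'
);
```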

Final Steps 

To complete onboarding, email the Bluesky team your Snowflake URL and the credentials created above. If in doubt, send the email to prodan@getbluesky.io.
