Master Snowflake with DBT
- I) SnowSQL
- Introduction to SnowSQL
- Overview of SnowSQL
- Installing and setting up SnowSQL
- Configuring SnowSQL
- Basic Commands
- Connecting to Snowflake
- Executing basic SQL commands
- Data types and literals
- Data Definition Language (DDL)
- Creating and managing databases
- Creating and managing schemas
- Creating and managing tables
- Altering and dropping tables
- Data Manipulation Language (DML)
- Inserting data into tables
- Updating data in tables
- Deleting data from tables
- Querying data from tables
- Filtering and Sorting Data
- Using the WHERE clause
- Using logical operators
- Sorting data using ORDER BY
- String Functions
- CONCAT
- SUBSTRING
- LENGTH
- UPPER
- LOWER
- TRIM
- REPLACE
- SPLIT
- POSITION
- Common Table Expressions (CTEs)
- Introduction to CTEs
- Using CTEs in queries
- Recursive CTEs
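As a preview of the recursive-CTE topic, a minimal Snowflake sketch (the `employees` table and its columns are hypothetical):

```sql
-- Walk an employee hierarchy from the root manager down.
-- Assumes a hypothetical employees(id, name, manager_id) table.
WITH RECURSIVE org AS (
    SELECT id, name, manager_id, 1 AS depth
    FROM employees
    WHERE manager_id IS NULL          -- anchor: top of the hierarchy
    UNION ALL
    SELECT e.id, e.name, e.manager_id, o.depth + 1
    FROM employees e
    JOIN org o ON e.manager_id = o.id -- recursive step
)
SELECT * FROM org ORDER BY depth;
```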
- Set Operators
- UNION
- UNION ALL
- INTERSECT
- MINUS/EXCEPT
- Conditional Expressions
- CASE statements
- DECODE function
- IFNULL
- COALESCE
- Handling NULLs
- Understanding NULL values
- NULL handling functions: ISNULL, IFNULL, NVL, NVL2
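The conditional and NULL-handling topics above can be previewed in one query; the `orders` table and its columns are hypothetical:

```sql
-- Shows CASE, COALESCE, IFNULL, and NVL2 side by side
-- against a hypothetical orders table.
SELECT order_id,
       COALESCE(discount, 0)                      AS discount_or_zero,
       IFNULL(ship_date, order_date)              AS effective_date,
       NVL2(coupon_code, 'coupon', 'full price')  AS price_type,
       CASE WHEN amount IS NULL THEN 'missing'
            ELSE 'present' END                    AS amount_status
FROM orders;
```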
- Aggregate Functions
- COUNT
- SUM
- AVG
- MAX
- MIN
- GROUP BY clause
- HAVING clause
- Analytical Functions
- ROW_NUMBER
- RANK
- DENSE_RANK
- LEAD
- LAG
- LISTAGG
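A short sketch of the window functions listed above, against a hypothetical `employees` table:

```sql
-- Rank salaries within each department and peek at the next row.
SELECT department,
       name,
       salary,
       ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS rn,
       DENSE_RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dense_rnk,
       LAG(salary)  OVER (PARTITION BY department ORDER BY salary DESC) AS next_higher,
       LISTAGG(name, ', ') WITHIN GROUP (ORDER BY name)
           OVER (PARTITION BY department)                               AS dept_roster
FROM employees;
```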
- Joins
- Inner joins
- Left, right, and full outer joins
- Cross joins
- Self joins
- Subqueries
- Inline subqueries
- Correlated subqueries
- EXISTS and NOT EXISTS
- Table Constraints
- Primary keys
- Foreign keys
- Unique constraints
- Check constraints
- Views
- Creating and managing views
- Materialized views
- Difference between views and tables
- Indexes
- Creating indexes
- Using indexes to optimize queries
- Dropping indexes
- Performance Tuning
- Query optimization techniques
- Understanding execution plans
- Using Snowflake’s Query Profiler
- Introduction to Data Warehousing and Snowflake
- What is a Data Warehouse
- Limitations of Traditional Data Warehouses
- Evolution of Data Warehouses
- Advantages of Cloud over On-Prem
- Create Snowflake Trial Account
- Snowflake Editions
- Overview of Key Features
- Core Concepts
- Snowflake Architecture
- Explore – Databases, Schemas, and Tables
- Different Roles in Snowflake
- Virtual Warehouses (Compute Engine) – Introduction
- Overview of Snowflake Virtual Warehouses
- Snowflake Pricing
- Snowflake Pricing – Overview
- What are Snowflake Credits
- Snowflake Editions
- Storage Cost
- Resource Monitoring
- Resource Monitors – Introduction
- Properties of Resource Monitor
- Suspension or Resumption
- Detailed Example – How Resource Monitors Work
- Create a Resource Monitor using WebUI
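Alongside the WebUI lab, a resource monitor can also be created in SQL; the monitor, quota, and warehouse names below are hypothetical:

```sql
-- A monthly monitor: notify at 75% of a 100-credit quota,
-- suspend the attached warehouse at 100%.
CREATE RESOURCE MONITOR monthly_limit
  WITH CREDIT_QUOTA = 100
  FREQUENCY = MONTHLY
  START_TIMESTAMP = IMMEDIATELY
  TRIGGERS ON 75  PERCENT DO NOTIFY
           ON 100 PERCENT DO SUSPEND;

ALTER WAREHOUSE my_wh SET RESOURCE_MONITOR = monthly_limit;
```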
- Micro-Partitioning and Clustering
- Partitioning in Traditional Data Warehouses
- Micro-Partitioning in Snowflake
- Advantages of Micro-Partitioning
- Structure of Micro-Partition
- Query Processing in Snowflake
- Clustering Introduction
- Clustering Keys
- Clustering Metadata
- Clustering Depth
- Automatic Re-Clustering
- How Clustering Works & Its Advantages
- How to Choose the Right Keys for Clustering
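The clustering topics above boil down to two statements in practice; table and column names here are hypothetical:

```sql
-- Cluster on columns commonly used in range filters,
-- then inspect how well the table is clustered.
ALTER TABLE sales CLUSTER BY (sale_date, region);

SELECT SYSTEM$CLUSTERING_INFORMATION('sales', '(sale_date, region)');
```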
- Performance Tuning
- Query History and Caching
- Detailed Lab – Query History
- SQL Queries to Fetch Query History
- Caching in Snowflake
- Caching – Detailed using Examples
- Using Snowflake
- Web Interface
- Virtual Warehouses
- Databases, Tables, & Views
- Queries
- Date & Time Data
- Semi-Structured Data
- Hands-On Coding
- Worksheets – Databases, Schemas, Tables, etc.
- Setting up Context/Parameters for Worksheet
- Run First SQL Statement
- More about Worksheet’s Results Section
- Load Data from AWS
- What is AWS S3 and How to Create Buckets
- What is IAM (Identity & Access Management)
- Create an IAM Group
- Create IAM User
- Create AWS IAM Role
- Upload Data to AWS S3 Using Web Console
- Loading Data – Create Table Schema, Integration Object
- Loading CSV Data to Snowflake Using COPY Command
- Load Complete CSV Data from S3
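The S3 loading steps above typically end in a stage plus a `COPY INTO`; every name below (bucket, integration, stage, table) is hypothetical:

```sql
-- A stage backed by an S3 bucket via a storage integration,
-- then a bulk CSV load.
CREATE STAGE my_s3_stage
  URL = 's3://my-bucket/data/'
  STORAGE_INTEGRATION = my_s3_integration;

COPY INTO raw.customers
FROM @my_s3_stage
FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1 FIELD_OPTIONALLY_ENCLOSED_BY = '"')
ON_ERROR = 'ABORT_STATEMENT';
```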
- Snowpipe
- Continuous Data Ingestion Process
- Introduction to Snowpipe
- Load CSV Data from S3 Using Snowpipe
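A minimal Snowpipe sketch for the continuous-ingestion lab; the pipe, stage, and table names are hypothetical, and the S3 event notifications that trigger auto-ingest must be configured separately:

```sql
-- A pipe that auto-ingests new files as they land in the stage.
CREATE PIPE raw.customers_pipe
  AUTO_INGEST = TRUE
AS
  COPY INTO raw.customers
  FROM @my_s3_stage
  FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1);
```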
- Different Types of Tables in Snowflake
- Different Types of Tables
- Create Permanent & Transient Database
- Create Permanent & Transient Schema
- Create Temporary, Permanent & Transient Table
- Convert Table to a Different Type
- Time Travel and Working with History of Objects
- What is Time Travel – Detailed
- Set & Alter Retention Time Property
- Query Historical Results
- Clone Historical Objects
- Restore the Dropped Objects
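The Time Travel labs above map onto a handful of statements; the `orders` table is hypothetical:

```sql
-- Query a table as of 30 minutes ago, clone that state,
-- then demonstrate restoring a dropped object.
SELECT * FROM orders AT (OFFSET => -60 * 30);

CREATE TABLE orders_30min_ago CLONE orders AT (OFFSET => -60 * 30);

DROP TABLE orders;
UNDROP TABLE orders;
```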
- Fail Safe
- What is Fail Safe – Detailed Overview
- Check Fail Safe Storage Consumption
- Tasks – Scheduling Service
- Introduction to Tasks
- What is a Tree of Tasks
- How to Implement a Standalone Task
- Check Tasks History
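A standalone-task sketch covering the topics above; warehouse, task, and table names are hypothetical. Note that tasks are created suspended and must be resumed:

```sql
-- A nightly task scheduled via CRON.
CREATE TASK nightly_refresh
  WAREHOUSE = my_wh
  SCHEDULE = 'USING CRON 0 2 * * * UTC'
AS
  INSERT INTO daily_summary
  SELECT CURRENT_DATE, COUNT(*) FROM orders;

ALTER TASK nightly_refresh RESUME;

-- Inspect recent runs:
SELECT *
FROM TABLE(INFORMATION_SCHEMA.TASK_HISTORY())
ORDER BY scheduled_time DESC;
```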
- Streams – CDC (Change Data Capture)
- Introduction to Streams
- How Streams Work Under the Hood
- Capture & Process INSERT Operations
- Capture & Process UPDATE Operations
- Capture & Process DELETE Operations
- How Streams Work Within a Transaction
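A minimal stream sketch for the CDC topics above; the `orders` and `orders_audit` tables are hypothetical:

```sql
-- A stream records row-level changes on a table.
CREATE STREAM orders_stream ON TABLE orders;

-- After some INSERT/UPDATE/DELETE activity on orders,
-- each change row carries metadata columns:
SELECT metadata$action, metadata$isupdate, *
FROM orders_stream;

-- Reading the stream inside a DML statement consumes its offset:
INSERT INTO orders_audit
SELECT *, metadata$action FROM orders_stream;
```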
- Zero-Copy Cloning
- What is Zero-Copy Cloning
- Clone Database, Pipes, Streams, Tasks & Stages
- Clone Schema & Tables
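Zero-copy cloning at each level is a one-liner; the object names below are hypothetical:

```sql
-- Clones share micro-partitions with the source, so they are
-- near-instant and initially consume no extra storage.
CREATE DATABASE dev_db        CLONE prod_db;
CREATE SCHEMA   sales_backup  CLONE sales;
CREATE TABLE    orders_test   CLONE orders;
```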
- Secure Data Sharing
- Data Sharing – Definition
- Provider and Consumer Accounts
- Allowed List of Objects to Share
- How Data Sharing Works on Snowflake
- Data Sharing – Architecture
- Materialized Views
- Materialized Views – Introduction
- When to Create Materialized Views
- Advantages of Materialized Views
- Limitations of Materialized Views
- Secure Views
- Secure Views – Introduction
- Secure Views – Hands-On
- Snowflake Task Feature
- Tasks Introduction
- Tasks Demonstration
- Snowflake Task Dependency
- Dependency Demonstration
- Snowflake Continuous Data Pipelines
- Continuous Stream Data Integrations
- Streams, Tasks & Snowpipe
- Lab – Streams & Tasks Exercises
- Security – Dynamic Data Masking Column Level
- Data Masking – Introduction
- Data Masking – Detailed Overview
- Data Masking – Hands-On
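A column-level masking sketch for the hands-on above; policy, role, and table names are hypothetical:

```sql
-- Mask email addresses for everyone except a privileged role.
CREATE MASKING POLICY email_mask AS (val STRING) RETURNS STRING ->
  CASE
    WHEN CURRENT_ROLE() IN ('PII_ADMIN') THEN val
    ELSE '***MASKED***'
  END;

ALTER TABLE customers MODIFY COLUMN email
  SET MASKING POLICY email_mask;
```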
- Partner Connect
- Introduction to Partner Connect
- Available Tools
- SnowSQL (CLI Client) – Installation, Configuring, and Usage
- Understanding Snowflake Billing
- Monitoring Credit and Storage Usage
- Resource Monitors
- User Management
- II) DBT Cloud
- What is DBT?
- Overview of DBT as a data transformation tool.
- Importance and benefits of using DBT Cloud for data transformations.
- DBT Account Setup
- Creating and configuring a DBT Cloud account.
- Setting up connections to your data warehouse
- Models
- Writing SQL queries to create models.
- Transforming raw data into structured data within the warehouse.
- Tests
- Writing tests in SQL or YAML to validate data integrity.
- Ensuring transformations produce expected results.
- Materializations
- Understanding different materialization strategies (views, tables, incremental loads).
- Implementing appropriate materializations for different use cases.
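The materialization strategies above can be previewed with a small dbt model; the model, source, and column names are hypothetical:

```sql
-- models/orders_incremental.sql (hypothetical incremental model)
{{ config(materialized='incremental', unique_key='order_id') }}

SELECT order_id, customer_id, amount, updated_at
FROM {{ source('raw', 'orders') }}
{% if is_incremental() %}
  -- only pull rows newer than what's already in this model
  WHERE updated_at > (SELECT MAX(updated_at) FROM {{ this }})
{% endif %}
```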
- Seeds and Sources
- Loading static CSV files (seeds) into your warehouse.
- Defining and using sources to reference raw data tables in your warehouse.
- Cloud Features
- Utilizing DBT Cloud’s web-based IDE.
- Setting up job scheduling for automated transformations.
- Collaborating with team members using DBT Cloud’s tools.
- Advanced Concepts and Documentation
- Using Jinja templating for reusable and dynamic SQL.
- Generating and managing documentation for models, sources, and tests.
- Macros
- Writing custom SQL functions with Jinja.
- Creating reusable logic for complex SQL generation.
- Snapshots
- Capturing and storing historical data changes over time.
- Implementing snapshots to track data changes.
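A snapshot sketch for the topics above, using dbt's timestamp strategy; names are hypothetical:

```sql
-- snapshots/customers_snapshot.sql (hypothetical snapshot)
{% snapshot customers_snapshot %}
{{ config(
    target_schema='snapshots',
    unique_key='customer_id',
    strategy='timestamp',
    updated_at='updated_at'
) }}
SELECT * FROM {{ source('raw', 'customers') }}
{% endsnapshot %}
```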
- Hooks
- Writing custom scripts or SQL statements to run at specific points in the DBT run lifecycle.
- Adding additional custom logic or operations to your DBT workflows.
- Analyses
- Creating ad-hoc analyses that do not get materialized in the database.
- Using analyses for exploratory data analysis and reporting.
- Understand the Requirements
- Set Up Users and Custom Roles
- Grant Privileges to Custom Roles
- Tips for Interview Preparation
- Real-Time Practice Scenarios
