The Data Guy @UCQq79zHGZJNzm3SPOfLNmrw@youtube.com

12K subscribers

Your one stop shop for all your Data needs! Have a hard prob


17:15
How to Build a Micro-Batching Pipeline with Apache Airflow and BigQuery or Snowflake!
12:02
What is a Data Catalog and Why Are They Useful? Data Catalogues Explained for Beginners!
11:12
What Are AI Agents? Artificial Intelligence Agents Explained for Beginners!
11:00
Infrastructure as Code Explained! IAC Explained for Beginners!
11:58
The Future of Generative AI in Data Engineering! GenAI's Future in Data Engineering Careers!
10:56
Rust Vs. Scala Vs. Python For Data Engineering!
08:13
What is DeepSeek? Open-Source Chinese ChatGPT Competitor Explained!
10:40
Data Engineering Interview Questions Answered and Explained!
12:13
How to Use Apache Spark, Hadoop, and Hive for Data Processing and Model Training/Tuning!
15:30
The Future of Data Engineering! Where is Data Engineering Heading in 2025 and Beyond!
10:20
All Apache Data Formats Explained! Apache Feather Vs. Avro Vs. ORC V. Parquet!
14:29
How to Build an ELT Pipeline with Postgres, Apache Airflow, and dbt with Cosmos!
18:33
How to Choose the Right Tools for Your Data Tech Stack! Data Tools Decision Making Guidelines!
16:58
How to Build a Reverse ETL Pipeline with Airflow, Snowflake, and Salesforce!
08:50
How to Get Started with LakeSail & PySail for Spark! Spark Compute Framework Lakesail Explained!
13:52
How to Become a Sales Engineer! Sales Engineering Explained!
11:39
How to Use FlinkML and MLLib for ML Model Training and Retraining!
14:19
How to Clean Your Data! Data Cleaning Techniques and Examples for Beginners!
11:36
Beginner's Guide to Ray! Ray Explained
10:17
Apache Iceberg Vs. Delta Lake Vs. Apache Hudi! Data Lake Storage Solutions Compared!
15:51
Data Scientist Zero to Hero Guide! Everything You Need to Learn to Get a Job in Data Science!
12:06
How to Build an ELT Pipeline with Google BigQuery, Apache Airflow, and dbt!
13:30
RabbitMQ Vs. Apache Kafka! RabbitMQ and Apache Kafka Explained, Compared and Contrasted!
10:14
How to Run Java Applications with Apache Airflow! Learn to Trigger & Monitor Remote VMs from Airflow
13:16
How to Create a Medallion Data Architecture! Medallion Architecture Guide & Best Practices!
16:30
How to Run Apache Airflow in Production! Best Practices for Running Apache Airflow at Scale!
15:07
Snowflake Vs. AWS RedShift Vs. GCP BigQuery Vs. Azure Synapse for Data Warehousing!
11:31
How to Run Talend Tasks Using Apache Airflow and Create a Talend Operator!
10:28
How to Collect and Visualize Lineage Data from your Data Pipelines with Apache Airflow!
16:16
How to Set Up a Data Lake in Production! Data Lake Best Practices Guide
15:12
How to Use Polars, the Modern Pandas Alternative! Getting Started with Polars for Python!
17:26
How to Build a Production ML Pipeline with Apache Airflow, Databricks, Kafka, and MLFlow!
11:39
How to Use Apache Flink and Apache Kafka to Do Real Time Stream Processing!
18:05
Data Engineer Zero to Hero Guide!
10:08
How to Develop Spark Scripts Locally Before Deploying Them to a Databricks Cluster!
10:16
How to Build an ETL Pipeline with Airbyte, Apache Airflow, and Snowflake!
11:47
How to Use AWS Lambda and Apache Airflow to Create an ETL and Machine Learning Pipeline!
10:47
How to Build an ELT Pipeline with AWS Redshift, Apache Airflow and dbt!
16:01
dbt Core Vs. SQLMesh for SQL Transformations!
12:30
How to Build Auto-Refreshing Analytics Pipelines with Microsoft SQL Server, PowerBI & Apache Airflow
16:35
Apache Flink Vs. Apache Spark Vs. Apache Storm: Which Data Processing Tool is Right for You!
18:05
Collibra Vs. Monte Carlo Vs. Atlan: Data Lineage/Catalog Tools Compared!
10:59
How to Build an ETL Pipeline to Process Google Ads Data with Apache Airflow and BigQuery!
10:44
How to Build an ETL Pipeline with a Couchbase Database and Apache Airflow!
14:02
Apache Cassandra Vs. Redis Vs. MongoDB: NoSQL Databases Compared!
10:30
How to Build an ETL Pipeline with Apache Kafka and RedPanda Connect!
13:35
OLTP vs. OLAP Databases! OLTP and OLAP Databases Explained and Compared!
11:01
How to Get Started with SQLMesh, a dbt Alternative!
11:03
How to Build an ETL Pipeline with Google BigQuery and Apache Airflow!
11:37
How to Use Ray and Apache Airflow for Heavy ML/AI Processing Workloads!
11:48
How to Get Started with Soda for Data Quality Checks!
18:42
Data Engineering Interview Guide! How to Get a Data Engineering Job!
11:35
How to Build a BigQuery Ingestion Pipeline from API's, SFTP Servers, and Pub/Sub with Airflow!
12:48
End-to-End Parallel ETL Pipeline with Airflow, Snowflake, and S3 Buckets!
13:10
Best Practices for dbt With Real World Examples!
11:50
Everything New in Airflow 2.10!
10:25
End to End Micro-Batching Pipeline With Apache Airflow and Kafka!
11:27
What is a Delta Lake? Delta Lakes Explained!
12:13
Apache Hudi Vs Apache Iceberg! Apache Hudi and Iceberg Comparison!
17:29
Data Modeling Best Practices! Best Practices for Designing Your Data Models!