Knowledge Sharing Hub

by SumitSingh • Contributor

07-19-2024 8:25:47 AM

3824 Views
7 replies
12 kudos

From Associate to Professional: My Learning Plan to ace all Databricks Data Engineer Certifications

In today’s data-driven world, the role of a data engineer is critical in designing and maintaining the infrastructure that allows for the efficient collection, storage, and analysis of large volumes of data. Databricks certifications hold significan...

Latest Reply

sandeepmankikar
New Contributor III

03-12-2025 8:32:21 PM

12 kudos

As an additional tip for those working towards both the Associate and Professional certifications, I recommend avoiding a long gap between the two exams to maintain your momentum. If possible, try to schedule them back-to-back with just a few days in...

6 More Replies

by MichTalebzadeh • Valued Contributor

04-27-2024 12:00:43 AM

1363 Views
3 replies
0 kudos

Financial Crime detection with the help of Apache Spark, Data Mesh and Data Lake

For those interested in Data Mesh and Data Lakes for FinCrime detection: Data mesh is a relatively new architectural concept for data management that emphasizes domain-driven data ownership and self-service data availability. It promotes the decentral...

data lakes

Data Mesh

financial crime

spark

Latest Reply

carrolbeau
Visitor

3 hours ago

0 kudos

It's great that you're focusing on financial crime detection with advanced technologies like Apache Spark, Data Mesh, and Data Lake. For those looking to dive deeper into criminal records and related data, tools like KY criminal lookup can provide es...

2 More Replies

by ThomazRossito • Contributor

04-14-2024 4:31:33 PM

2252 Views
1 reply
1 kudos

Post: Lakehouse Federation - Databricks

In the world of data, innovation is constant. And the most recent revolution comes with Lakehouse Federation, a fusion between data lakes and data warehouses, taking data manipulation to a new level. This advancement...

data engineer

Lakehouse

SQL Analytics

Latest Reply

Freshman
New Contributor III

Monday

1 kudos

Hey, quick question: can we use it in production? Our application server is SQL Server, and we are planning to use Lakehouse Federation so we can avoid creating and maintaining 100s of workflows. As we have a small dataset, I am not too sure o...

by Shahram • New Contributor II

Friday

44 Views
0 replies
1 kudos

Hub Star Modeling 2.0 for Medallion Architecture

Excited to share my latest publication on arXiv! “Hub Star Modeling 2.0 for Medallion Architecture” https://arxiv.org/abs/2504.08788 This new version builds on the original Hub Star Modeling approach, published last year, and is now tailored for the Meda...

by genevive_mdonça • Databricks Employee

2 weeks ago

257 Views
1 reply
5 kudos

Handling Complex Nested JSON in Databricks Using schemaHints

When I first got into managing schemas in Databricks, it took me a while to realize that putting in a little planning up front could save me a ton of headaches later on. I was working with these deeply nested, constantly changing JSON files. At first,...

Latest Reply

Advika
Databricks Employee

2 weeks ago

5 kudos

Great tip @genevive_mdonça! schemaHints help avoid issues with evolving JSON data, making data processing more reliable and easier to maintain. Thanks for sharing.

by techgeorge • New Contributor II

3 weeks ago

281 Views
1 reply
0 kudos

Understanding Coalesce, Skewed Joins, and Why AQE Doesn't Always Intervene

In Spark, data skew can be the silent killer of performance. One wide partition pulling in 90% of the data? But even with AQE (Adaptive Query Execution) turned on in Databricks, skewness isn't always automatically identified, and here’s why. What Is co...

Latest Reply

BigRoux
Databricks Employee

3 weeks ago

0 kudos

@mark_ott, this question seems right up your alley. Care to comment?

by Emil_Kaminski • Contributor II

12-20-2023 1:55:42 PM

9689 Views
3 replies
5 kudos

Materials to pass Databricks Data Engineering Associate Exam

Hi guys, I passed it some time ago, but only recently summarized all the materials that helped me do it. Pay special attention to the GitHub repository, which contains many great exercises prepared by the Databricks team. https://youtu.be...

Latest Reply

Alexa_Wadee
New Contributor II

3 weeks ago

5 kudos

I passed my Databricks Data Engineering Associate exam after studying with https://bit.ly/4iaflcm. Their extensive collection of mock tests and Practice Software significantly boosted my score to 93%.

2 More Replies

by Yuki • New Contributor III

3 weeks ago

196 Views
0 replies
0 kudos

One solution for the [FAILED_READ_FILE.NO_HINT] "Error while reading file" error when using display() or SELECT

I got stuck on the above error when using `spark.read.table().display()` or querying the table directly with %sql. While the display method is just one...

by marvin-alpura • New Contributor II

4 weeks ago

417 Views
0 replies
1 kudos

Power BI to Databricks Semantic Layer Generator (DAX → SQL/PySpark)

Hi everyone! I’ve just released an open-source tool that generates a semantic layer in Databricks notebooks from a Power BI dataset using the Power BI REST API. I’m not an expert yet, but it gets the job done and instead of using AtScale/dbt/or the PBI Sem...

by techgeorge • New Contributor II

04-04-2025 1:44:08 AM

270 Views
0 replies
0 kudos

How to train a Convolutional Neural Network on Databricks with TensorFlow and Keras

Here is how I trained a lightweight Convolutional Neural Network (CNN) to detect pneumonia from chest X-ray images on Azure Databricks. I promise no LLMs, no hype, just real-world deep learning: 1. Built it with TensorFlow & Keras on Databricks 2...

by shubham_meshram • New Contributor II

03-31-2025 3:18:21 PM

351 Views
0 replies
0 kudos

When Did the Data Go Wrong? Using Delta Lake Time Travel for Investigation in Databricks

I. Introduction: Data pipelines are the lifeblood of modern data-driven organizations. However, even the most robust pipelines can experience unexpected issues: data corruption, erroneous updates, or sudden data drops. When these problems occur, quickl...

by Brahmareddy • Honored Contributor III

03-25-2025 5:33:16 PM

714 Views
0 replies
1 kudos

Real Lessons in Databricks Schema, Streaming, and Unity Catalog

Hey Databricks community, I wanted to take a moment to share some things I’ve learned while working with Databricks in real projects, especially around schema management, Unity Catalog, Auto Loader, and streaming jobs. These are the kinds of small detai...

by pradeepvatsvk • New Contributor III

03-21-2025 1:58:03 AM

527 Views
0 replies
1 kudos

Inclusion of special characters while saving or downloading as a csv

Hi all, I have data that looks like this: High Corona40% 50cl Pm £13.29. But when I save it as a CSV, it gets converted into High Corona40% 50cl Pm Â£13.29 wherever we have the pound sign (£). One thing to note here is that while displaying the data i...
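
The stray "Â" in front of "£" is the pattern that usually appears when UTF-8 bytes are decoded with a single-byte encoding such as Windows-1252. The sketch below reproduces the symptom in plain Python and shows one common workaround, writing the file with a UTF-8 byte order mark. It assumes the file is opened by a tool that guesses a legacy code page; the post is truncated and does not say which tool is involved. The helper `to_csv_bytes` is hypothetical and not part of any Databricks API.

```python
import csv
import io

original = "High Corona40% 50cl Pm £13.29"

# UTF-8 stores "£" as two bytes (0xC2 0xA3). Decoding those bytes as
# Windows-1252 yields two characters, "Â" and "£".
utf8_bytes = original.encode("utf-8")
misread = utf8_bytes.decode("cp1252")
print(misread)  # High Corona40% 50cl Pm Â£13.29

# Decoding the same bytes as UTF-8 round-trips cleanly, so the data
# itself is intact; only the reader's assumed encoding is wrong.
roundtrip = utf8_bytes.decode("utf-8")


def to_csv_bytes(rows, encoding="utf-8-sig"):
    """Hypothetical helper: serialize rows to CSV bytes.

    The "utf-8-sig" codec prepends a byte order mark (BOM), which many
    spreadsheet tools use as a signal that the file is UTF-8.
    """
    buffer = io.StringIO(newline="")
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue().encode(encoding)


csv_bytes = to_csv_bytes(
    [["product", "price"], ["High Corona40% 50cl Pm", "£13.29"]]
)
```

If the consuming tool cannot handle a BOM, the alternative is to keep plain UTF-8 and tell the reader explicitly which encoding to use when it opens the file.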

by Brahmareddy • Honored Contributor III

03-20-2025 7:26:35 PM

683 Views
0 replies
1 kudos

Use Query Patterns to Suggest Indexes Dynamically

Hey folks, ever notice how a query that used to run super fast suddenly starts dragging? We’ve all been there. As data grows, those little inefficiencies in your SQL start showing up, and they show up hard. That’s where something cool comes in: using...

by Brahmareddy • Honored Contributor III

08-12-2024 1:28:15 PM

3663 Views
6 replies
4 kudos

My Journey with Schema Management in Databricks

When I first started handling schema management in Databricks, I realized that a little bit of planning could save me a lot of headaches down the road. Here’s what I’ve learned and some simple tips that helped me manage schema changes effectively. On...

Latest Reply

Brahmareddy
Honored Contributor III

03-19-2025 8:26:19 PM

4 kudos

Haha, glad it made sense, Joao! Try it out, and if you run into any issues, just let me know. Always happy to help! And best friends? You got it!

5 More Replies

by DataDarvish • New Contributor II

03-18-2025 9:10:52 PM

575 Views
0 replies
1 kudos

Unit Testing for Data Engineering: How to Ensure Production-Ready Data Pipelines

In today’s data-driven world, the success of any business use case relies heavily on trust in the data. This trust is built upon key pillars such as data accuracy, consistency, freshness, and overall quality. When organizations release data into prod...
