Discussions
Engage in dynamic conversations covering diverse topics within the Databricks Community. Explore discussions on data engineering, machine learning, and more. Join the conversation and expand your knowledge base with insights from experts and peers.

Browse the Community

Community Discussions — 3671 Posts

Activity in Discussions

rodneyc8063 (New Contributor II) · 22 Views · 1 reply · 0 kudos

Why are there 3 courses (SQL Analytics on Databricks) for the Databricks Data Analyst Certification?

Hi folks, I was interested in doing the certification for the "Databricks Certified Data Analyst Associate" here: https://www.databricks.com/learn/certification/data-analyst-associate. Looking at the "Related Training" section I see recommended training i...

Latest Reply — Advika (Databricks Employee)

Hello @rodneyc8063! Yes, all three courses cover the same content; the difference lies in the format and access:
  • 2-hour Self-Paced: Free, video-only
  • 3-hour Self-Paced: Paid, includes hands-on labs
  • 4-hour Instructor-Led: Paid, includes labs and a live i...

Klusener (New Contributor III) · 67 Views · 9 replies · 5 kudos

Smaller dataset causing OOM on large cluster

I have a PySpark job reading only ~50-55 GB of Parquet data from a Delta table on Databricks. The job runs on n2-highmem-4 GCP VMs with 1-15 workers autoscaling on Databricks. Each worker VM of type n2-highmem-4 has 32 GB memory and...

Latest Reply — Klusener (New Contributor III)

Much appreciated, @mark_ott and @BigRoux, for the prompt response. The job uses the following cluster/settings. Cluster/Spark version - Driver: n2-highmem-4 · Workers: n2-highmem-4 · 5-15 workers · DBR: 15.4 LTS (includes Apache Spark 3.5.0, Scala 2.12) on GCPP...

8 More Replies
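
A frequent culprit for OOM on modest inputs is a shuffle that piles data onto a few tasks. The actual fix for this thread is in the hidden replies; what follows is only a hedged sketch of the usual mitigations, with illustrative config values and placeholder table/column names:

    # Hedged sketch, not the thread's confirmed fix: shrink input splits and
    # raise shuffle parallelism so no single task holds too much in memory.
    spark.conf.set("spark.sql.files.maxPartitionBytes", "64m")   # default is 128m
    spark.conf.set("spark.sql.shuffle.partitions", "400")        # illustrative value

    df = spark.read.table("catalog.schema.source_table")         # placeholder table
    # Repartition on the aggregation/join key before any wide transform so
    # skewed keys are spread across more tasks.
    df = df.repartition(400, "some_key")                         # placeholder column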
VVM (New Contributor III) · 19278 Views · 14 replies · 3 kudos

Resolved! Databricks SQL - Unable to Escape Dollar Sign ($) in Column Name

It seems that due to how Databricks processes SQL cells, it's impossible to escape the $ when it comes to a column name. I would expect the following to work: %sql SELECT 'hi' `$id` The backticks ought to escape everything. And indeed that's exactly wha...

Latest Reply — rgower (New Contributor II)

+1 here - hoping to hear any updates.

13 More Replies
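
The thread is marked resolved, but the resolution isn't visible in this excerpt. One commonly cited workaround (a hedged sketch, not necessarily the accepted answer): the $-substitution is a notebook feature of %sql cells, not of Spark SQL itself, so issuing the statement through spark.sql() in a Python cell sidesteps it.

    # Hedged workaround sketch: spark.sql() does not apply the notebook's
    # $-parameter substitution, so the backtick-quoted column name survives.
    df = spark.sql("SELECT 'hi' AS `$id`")
    df.show()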
DaPo (New Contributor II) · 21 Views · 0 replies · 0 kudos

Model Serving Endpoint: Cuda-OOM for Custom Model

Hello all, I am tasked with evaluating a new LLM for some use cases. In particular, I need to build a POC for a chatbot based on that model. To that end, I want to create a custom serving endpoint for an LLM pulled from Hugging Face. The model itself is...

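
No replies yet. A common first step for CUDA OOM at model-load time (a hedged sketch assuming the transformers library; the model name is a placeholder, not the one from the post):

    # Hedged sketch: load the model in half precision and let accelerate place
    # layers across available devices, which often avoids load-time CUDA OOM.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "some-org/some-llm"  # placeholder; the post's model is unknown
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # halves weight memory vs. float32
        device_map="auto",          # requires the accelerate package
    )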
anmol-aidora (New Contributor) · 146 Views · 6 replies · 0 kudos

Resolved! Serverless: ERROR: Could not install packages due to an OSError: [Errno 13] Permission denied

Hello guys! I am getting this error when running a job: ERROR: Could not install packages due to an OSError: [Errno 13] Permission denied: '/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/some-python-package' I have lis...

Latest Reply — anmol-aidora (New Contributor)

Thanks for clarifying, Isi, really appreciate it.

5 More Replies
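
The accepted explanation is in the hidden replies; as a hedged sketch of the usual remedy, notebook-scoped installation writes to a user-writable environment rather than the cluster-libraries path in the error:

    # Hedged sketch: install at notebook scope on serverless compute, then
    # restart the Python process so the package becomes importable.
    %pip install some-python-package   # placeholder name taken from the error text
    dbutils.library.restartPython()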
aravind-ey (New Contributor) · 875 Views · 5 replies · 1 kudos

vocareum lab access

Hi, I am doing a data engineering course in Databricks (Partner Labs) and would like to have access to the Vocareum workspace to practice using the demo sessions. Can you please help me get access to this workspace? Regards, Aravind

Latest Reply — twnlBO (New Contributor II)

Can you please provide links? A screenshot? More info? This answer is not specific enough. I'm taking the Data Analysis learning path; there are different demos I'd like to practice, and there are no SP Lab environment links as mentioned in the videos.

4 More Replies
soumiknow (Contributor II) · 3035 Views · 22 replies · 1 kudos

Resolved! BQ partition data deleted fully even though 'spark.sql.sources.partitionOverwriteMode' is DYNAMIC

We have a date (DD/MM/YYYY) partitioned BQ table. We want to update a specific partition's data in 'overwrite' mode using PySpark. To do this, I set 'spark.sql.sources.partitionOverwriteMode' to 'DYNAMIC' as per the Spark BQ connector documentat...

Latest Reply — VZLA (Databricks Employee)

@soumiknow, just checking if there are any further questions; did my last comment help?

21 More Replies
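
The resolution lives in the hidden replies; below is only a hedged sketch of the setup the question describes (table and bucket names are placeholders):

    # Hedged sketch of the question's intent: dynamic partition overwrite
    # with the Spark BigQuery connector, so only touched partitions change.
    spark.conf.set("spark.sql.sources.partitionOverwriteMode", "DYNAMIC")

    (df.write.format("bigquery")
        .option("table", "my-project.my_dataset.my_table")  # placeholder table
        .option("writeMethod", "indirect")                  # indirect writes go via GCS
        .option("temporaryGcsBucket", "my-tmp-bucket")      # placeholder bucket
        .mode("overwrite")
        .save())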
MaartenH (Visitor) · 46 Views · 3 replies · 1 kudos

Lakehouse federation for SQL server: database name with spaces

We're currently using Lakehouse Federation for various sources (Snowflake, SQL Server), usually successfully. However, we've encountered a case where one of the databases on the SQL Server has spaces in its name, e.g. 'My Database Name'. We've tried vari...

Latest Reply — SP_6721 (New Contributor III)

Hi @MaartenH, can you try creating the foreign catalog like this?

    CREATE FOREIGN CATALOG your_catalog_name
    USING CONNECTION your_connection_name
    OPTIONS (database '[My Database Name]');

(Do check that the foreign catalog name must follow Unity Catalog...

2 More Replies
moseb (New Contributor II) · 440 Views · 2 replies · 0 kudos

Problem with ipywidgets and plotly on Databricks

Hi everyone, I am encountering a problem when using ipywidgets with plotly on Databricks. I am trying to pass interactive arguments to a function and then plot with plotly. When I do the following:

    def f(m, b):
        plt.figure(2)
        x = np.linspace(-10,...

Latest Reply — moseb (New Contributor II)

Thanks for the suggestion! You're absolutely right. The code was already all in my message, but I can make it easier to copy-paste (and add the imports):

    from ipywidgets import interactive
    import matplotlib.pyplot as plt
    import numpy as np

    def f(m, b...

1 More Replies
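
Both code excerpts above are truncated. The visible fragment matches the classic ipywidgets interactive example, so a hedged reconstruction (the function body past the truncation is an assumption) would be:

    # Hedged reconstruction of the (truncated) example from the thread,
    # following the standard ipywidgets `interactive` pattern.
    from ipywidgets import interactive
    import matplotlib.pyplot as plt
    import numpy as np

    def f(m, b):
        plt.figure(2)
        x = np.linspace(-10, 10, num=1000)
        plt.plot(x, m * x + b)
        plt.ylim(-5, 5)
        plt.show()

    interactive_plot = interactive(f, m=(-2.0, 2.0), b=(-3, 3, 0.5))
    interactive_plot  # render the slider widget in the notebook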
M_S (Visitor) · 59 Views · 2 replies · 0 kudos

Dataframe is getting empty during execution of daily job with random pattern

Hello, I have a daily ETL job that adds new records to a table for the previous day. However, from time to time, it does not produce any output. After investigating, I discovered that one table is sometimes loaded as empty during execution. As a resul...

Latest Reply — M_S (Visitor)

Thank you very much, @BigRoux, for such a detailed and insightful answer! All tables used in this processing are managed Delta tables loaded through Unity Catalog. I will try running it with spark.databricks.io.cache.enabled set to false just to see i...

1 More Replies
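
A hedged sketch of the experiment the reply describes (the table name is a placeholder):

    # Hedged sketch: disable the Databricks disk (IO) cache for this session
    # to rule out stale cached files as the cause of the empty reads.
    spark.conf.set("spark.databricks.io.cache.enabled", "false")

    df = spark.table("main.etl.daily_source")  # placeholder table name
    print(df.count())                          # force a fresh scan and check row count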
Kaz (New Contributor II) · 7464 Views · 3 replies · 1 kudos

Show full logs on job log

Is it possible to show the full logs of a Databricks job? Currently, the logs are skipped with: *** WARNING: max output size exceeded, skipping output. *** However, I don't believe our log files are more than 20 MB. I know you can press the logs button...

Latest Reply — Isi (Contributor III)

Hey @Kaz, unfortunately, the output truncation limit in the Databricks job UI cannot be changed. Once that limit is exceeded, the rest of the logs are skipped, and the full logs become accessible only through the "Logs" button, which, as you mentione...

2 More Replies
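
Beyond the Logs button the reply mentions, cluster log delivery keeps a full copy outside the UI. A hedged fragment of a job cluster spec (the destination path is a placeholder):

    # Hedged sketch: a new_cluster fragment with log delivery enabled, so the
    # complete driver/executor logs land in DBFS regardless of UI truncation.
    cluster_spec_fragment = {
        "cluster_log_conf": {
            "dbfs": {"destination": "dbfs:/cluster-logs/my-job"}  # placeholder path
        }
    }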
SeekingSolution (New Contributor II) · 24 Views · 0 replies · 0 kudos

Unity Catalog Enablement

Hello, after scouring documentation yesterday, I was finally able to get Unity Catalog enabled and assigned to my workspace. Or so I thought. When I run the CURRENT METASTORE() command I get the below error. However, when I look at my catalog I can see...

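
No replies yet. One hedged guess worth checking (the error screenshots aren't available here): the SQL function is current_metastore(), written as a single identifier; "CURRENT METASTORE()" with a space would not parse.

    # Hedged sketch: verify the metastore assignment from a notebook.
    # current_metastore() is the Unity Catalog SQL function (no space).
    display(spark.sql("SELECT current_metastore()"))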
bjn (Visitor) · 34 Views · 3 replies · 0 kudos

Trigger bad records in databricks

I use bad records while reading a CSV as follows:

    df = (spark.read.format("csv")
          .schema(schema)
          .option("badRecordsPath", bad_records_path)
          ...

Since bad records are not written immediately, I want to know how I can trigger the write...

Latest Reply — Isi (Contributor III)

Hey @bjn, 1) Yes, if you run both df.write.format("noop")... and df.write.format("delta").saveAsTable(...), you're triggering two separate actions, and Spark will evaluate the DataFrame twice. That includes parsing the CSV and, importantly, processin...

2 More Replies
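
A hedged sketch of the pattern in the reply (schema variable and paths are placeholders): any action that scans every row, such as a noop write, makes Spark materialize the bad-records output.

    # Hedged sketch: badRecordsPath is only populated once an action forces a
    # full read of the CSV; a noop write scans everything without producing output.
    df = (spark.read.format("csv")
          .schema(schema)                              # schema defined elsewhere
          .option("badRecordsPath", bad_records_path)  # placeholder path variable
          .load("dbfs:/input/data.csv"))               # placeholder input path

    df.write.format("noop").mode("overwrite").save()   # triggers the full scan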
chinmay0924 (New Contributor II) · 204 Views · 5 replies · 0 kudos

Spark connect client and server versions should be same for executing UDFs

I am trying to execute a pandas UDF in Databricks. It gives me the following error on serverless compute: File /local_disk0/.ephemeral_nfs/envs/pythonEnv-b11ff17c-9b25-4ccb-927d-06a7d1ca7221/lib/python3.11/site-packages/pyspark/sql/connect/client/core.p...

Latest Reply — BigRoux (Databricks Employee)

Serverless is management-free, which means you cannot choose the image. Hope this helps. Lou.

4 More Replies
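
A hedged first diagnostic for this class of error: compare the Spark Connect client and server versions, since UDF execution requires them to be compatible.

    # Hedged sketch: print both sides of the Spark Connect session so a
    # client/server version mismatch is easy to spot.
    import pyspark

    print("client (pyspark):", pyspark.__version__)
    print("server (spark):  ", spark.version)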
harshajain (Visitor) · 27 Views · 1 reply · 0 kudos

how to get databricks host name

How do I get the Databricks hostname from a trial account?

Latest Reply — Renu_ (Contributor)

Hi @harshajain, follow these steps:
  • Log in to the Databricks trial portal.
  • Access the workspace provided upon login.
  • The hostname is part of the URL. For example, if the URL is https://trial-1234568.cloud.databricks.com, the hostname is trial-1234568.

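
From inside a workspace notebook there is also a programmatic route (a hedged sketch; availability of this conf key on trial/serverless compute is an assumption):

    # Hedged sketch: many Databricks runtimes expose the workspace URL as a
    # Spark conf, which contains the hostname.
    print(spark.conf.get("spark.databricks.workspaceUrl"))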