Databricks Platform Discussions
Hi folks, I was interested in doing the certification for the "Databricks Certified Data Analyst Associate" here: https://www.databricks.com/learn/certification/data-analyst-associate. Looking at the "Related Training" section, I see recommended training i...
Hello @rodneyc8063! Yes, all three courses cover the same content; the difference lies in the format and access:
- 2-hour Self-Paced: Free, video-only
- 3-hour Self-Paced: Paid, includes hands-on labs
- 4-hour Instructor-Led: Paid, includes labs and a live i...
I have a PySpark job reading an input volume of just ~50-55 GB of Parquet data from a Delta table on Databricks. The job uses n2-highmem-4 GCP VMs and 1-15 workers with autoscaling on Databricks. Each worker VM of type n2-highmem-4 has 32 GB of memory and...
Much appreciated, @mark_ott and @BigRoux, for the prompt response. The job uses the cluster/settings below. Cluster/Spark version - Driver: n2-highmem-4 · Workers: n2-highmem-4 · 5-15 workers · DBR: 15.4 LTS (includes Apache Spark 3.5.0, Scala 2.12) on GCP. P...
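Since the question itself is truncated, here is only a generic first-pass sketch for checking how the ~50 GB scan splits into tasks; the table name and the 128 MB cap are placeholders, not tuned recommendations:

# Check how many tasks the scan produces, then cap the bytes each task
# reads. Values and names here are illustrative placeholders.
df = spark.read.table("my_catalog.my_schema.my_delta_table")
print(df.rdd.getNumPartitions())

spark.conf.set("spark.sql.files.maxPartitionBytes", str(128 * 1024 * 1024))  # 128 MB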
It seems that, due to how Databricks processes SQL cells, it's impossible to escape the $ when it comes to a column name. I would expect the following to work:

%sql
SELECT 'hi' `$id`

The backticks ought to escape everything. And indeed that's exactly wha...
+1 here - hoping to hear any updates.
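One workaround that may help while waiting (a sketch, not an official fix): issue the statement through spark.sql() from a Python cell, where $ inside a backticked identifier is not treated as a parameter marker the way it is in a %sql cell:

# In a Python cell, no $-substitution happens before parsing, so the
# backticked identifier keeps its literal $ (the column name is illustrative).
spark.sql("SELECT 'hi' AS `$id`").show()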
Hello all, I am tasked with evaluating a new LLM for some use cases. In particular, I need to build a POC for a chatbot based on that model. To that end, I want to create a custom Serving Endpoint for an LLM pulled from Hugging Face. The model itself is...
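A rough sketch of one common starting point (hypothetical names throughout; assumes mlflow and transformers are installed on the cluster): log the Hugging Face pipeline with MLflow so it can be registered and then exposed via a Databricks Model Serving endpoint.

# Hypothetical sketch: the model id and registry name are placeholders,
# not from this thread. After registration, the registered model can be
# attached to a Model Serving endpoint from the Serving UI or API.
import mlflow
from transformers import pipeline

chat = pipeline("text-generation", model="some-org/some-llm")  # placeholder model

with mlflow.start_run():
    mlflow.transformers.log_model(
        transformers_model=chat,
        artifact_path="model",
        registered_model_name="my_llm_poc",  # hypothetical registry name
    )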
Hello guys! I am getting this error when running a job:

ERROR: Could not install packages due to an OSError: [Errno 13] Permission denied: '/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/some-python-package'

I have lis...
Thanks for clarifying, Isi, really appreciate it.
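Since the suggestion itself is truncated above, one common way around this class of cluster-library permission error (an assumption about the root cause, not necessarily what Isi proposed) is a notebook-scoped install:

# Notebook-scoped install: writes into a per-notebook environment the user
# owns, avoiding writes to the shared cluster_libraries path.
%pip install some-python-package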
Hi, I am doing a data engineering course in Databricks (Partner labs) and would like to have access to the Vocareum workspace to practice using the demo sessions. Can you please help me get access to this workspace? Regards, Aravind
Can you please provide links? Screenshots? More info? This answer is not specific enough. I'm taking the Data Analysis learning path; there are different demos I'd like to practice, and there are no SP Lab environment links as mentioned in the videos.
We have a date-partitioned (DD/MM/YYYY) BQ table. We want to update a specific partition's data in 'overwrite' mode using PySpark. To do this, I set 'spark.sql.sources.partitionOverwriteMode' to 'DYNAMIC' as per the Spark BQ connector documentat...
@soumiknow, just checking if there are any further questions, and whether my last comment helped.
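For reference, a sketch of the pattern the question describes (option names per the spark-bigquery connector docs; the project/dataset/bucket values are placeholders — dynamic partition overwrite needs the indirect write method):

spark.conf.set("spark.sql.sources.partitionOverwriteMode", "DYNAMIC")

(df.write.format("bigquery")
    .option("table", "my_project.my_dataset.my_table")   # placeholder
    .option("temporaryGcsBucket", "my-temp-bucket")       # placeholder
    .option("writeMethod", "indirect")
    .mode("overwrite")
    .save())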
We're currently using Lakehouse Federation for various sources (Snowflake, SQL Server), usually successfully. However, we've encountered a case where one of the databases on the SQL Server has spaces in its name, e.g. 'My Database Name'. We've tried vari...
Hi @MaartenH, can you try creating the foreign catalog like this?

CREATE FOREIGN CATALOG your_catalog_name
USING CONNECTION your_connection_name
OPTIONS (database '[My Database Name]');

(Do check that the foreign catalog name must follow Unity Catalog...
Hi everyone, I am encountering a problem when using ipywidgets with plotly on Databricks. I am trying to pass interactive arguments to a function and then plot with plotly. When I do the following:

def f(m, b):
    plt.figure(2)
    x = np.linspace(-10,...
Thanks for the suggestion! You're absolutely right. The code was already all in my message, but I can make it easier to copy-paste (and add the imports):

from ipywidgets import interactive
import matplotlib.pyplot as plt
import numpy as np

def f(m, b...
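The snippet being rebuilt above matches the standard interactive-slider demo from the ipywidgets documentation; a runnable reconstruction follows, where the linspace bounds and slider ranges are assumptions since both posts are truncated:

from ipywidgets import interactive
import matplotlib.pyplot as plt
import numpy as np

def f(m, b):
    # Draw the line y = m*x + b on a fixed axis range.
    plt.figure(2)
    x = np.linspace(-10, 10, num=1000)
    plt.plot(x, m * x + b)
    plt.ylim(-5, 5)
    plt.show()

# Sliders for slope m and intercept b; the ranges are illustrative.
interactive_plot = interactive(f, m=(-2.0, 2.0), b=(-3.0, 3.0, 0.5))
interactive_plot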
Hello, I have a daily ETL job that adds new records to a table for the previous day. However, from time to time it does not produce any output. After investigating, I discovered that one table is sometimes loaded as empty during execution. As a resul...
Thank you very much, @BigRoux, for such a detailed and insightful answer! All tables used in this processing are managed Delta tables loaded through Unity Catalog. I will try running it with spark.databricks.io.cache.enabled set to false just to see i...
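For anyone wanting to run the same experiment, the setting mentioned can be flipped per session (a sketch; this disables the Databricks disk/IO cache, not Spark's own DataFrame caching):

# Turn off the disk (IO) cache for this session, to rule out stale
# cached Parquet files as the cause of the intermittently empty reads.
spark.conf.set("spark.databricks.io.cache.enabled", "false")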
Is it possible to show the full logs of a Databricks job? Currently, the logs are cut off with:

*** WARNING: max output size exceeded, skipping output. ***

However, I don't believe our log files are more than 20 MB. I know you can press the logs button...
Hey @Kaz, unfortunately the output truncation limit in the Databricks job UI cannot be changed. Once that limit is exceeded, the rest of the logs are skipped, and the full logs become accessible only through the “Logs” button, which, as you mentione...
Hello, after scouring documentation yesterday, I was finally able to get Unity Catalog enabled and assigned to my workspace. Or so I thought. When I run the CURRENT_METASTORE() command I get the below error. However, when I look at my catalog I can see...
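For anyone reproducing the check, the command under discussion can be run from a notebook attached to a UC-enabled cluster (a minimal sketch):

# current_metastore() returns the assigned metastore id; an error here
# usually means the cluster or workspace is not actually UC-enabled.
spark.sql("SELECT current_metastore()").show(truncate=False)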
I use badRecordsPath while reading a CSV as follows:

df = (spark.read.format("csv")
      .schema(schema)
      .option("badRecordsPath", bad_records_path))

Since bad records are not written immediately, I want to know how I can trigger the write...
Hey @bjn,

1) Yes, if you run both df.write.format("noop")... and df.write.format("delta").saveAsTable(...), you’re triggering two separate actions, and Spark will evaluate the DataFrame twice. That includes parsing the CSV and, importantly, processin...
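A sketch of the point being made (schema and paths are the placeholders from the question): a "noop" write is a real action, so it forces the full CSV scan, and the badRecordsPath files get written, without producing any actual output.

df = (spark.read.format("csv")
      .schema(schema)
      .option("badRecordsPath", bad_records_path)
      .load(input_path))

# No-op sink: evaluates the whole DataFrame but writes nothing.
df.write.format("noop").mode("overwrite").save()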
I am trying to execute a pandas UDF in Databricks. It gives me the following error on serverless compute:

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-b11ff17c-9b25-4ccb-927d-06a7d1ca7221/lib/python3.11/site-packages/pyspark/sql/connect/client/core.p...
Serverless is management-free, which means you cannot choose the image. Hope this helps. Lou.
How to get the Databricks host name from a trial account?
Hi @harshajain, follow the steps below:
1. Log in to the Databricks trial portal.
2. Access the workspace provided upon login.
3. The hostname is part of the URL. For example, if the URL is https://trial-1234568.cloud.databricks.com, the hostname is trial-1234568.cloud.databricks.com.