Hey @flashmav, keep in mind that operations in Delta Lake often occur at the file level rather than the row level. For example, if two sessions attempt to update data in the same file (even if they're not updating the same row), you can hit a write conflict, and one of the sessions will fail with a concurrent modification error. It's important to remember that Delta Lake is not designed for OLTP (Online Transaction Processing) scenarios; it's optimized for analytics use cases, and the ACID transactions it supports are scoped to a single table, with conflict detection at the file level. With this context, here are some suggestions to consider:
The errors you are encountering during concurrent MERGE operations on a liquid clustered table with row-level tracking and deletion vectors enabled are expected behavior under certain circumstances.
Analysis of the Situation:
1. Concurrency Conflict Context: Even with row-level tracking and deletion vectors, certain conditions can still lead to concurrent modification errors:
- If both jobs attempt operations that result in overlapping file modifications, they might still conflict despite targeting non-overlapping rows. Delta Lake's concurrent modification detection operates at the granularity of data files rather than individual rows when those files are reused or rewritten.
- Operations like MERGE involve both reads and writes, which can lead to conflicts if file modification timestamps or metadata tracking indicates overlapping changes.
2. Liquid Clustering and Row-Level Concurrency:
- Liquid clustering and row-level concurrency (enabled by delta.enableRowTracking=true and delta.enableDeletionVectors=true) improve conflict management but do not completely eliminate the possibility of conflicts.
- Certain operations (e.g., complex conditional clauses in MERGE or DELETE commands) can still lead to exceptions such as ConcurrentDeleteReadException or ConcurrentDeleteDeleteException.
3. Isolation Level:
- Your table is set to the Serializable isolation level (delta.isolationLevel=Serializable). While this enforces the strictest serializability guarantees for the transaction history, it increases the likelihood of conflict detection when concurrent jobs attempt simultaneous write operations. A quick way to double-check these table properties is sketched just below.
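As a quick sanity check, here is a minimal sketch for confirming how the table is currently configured. The three-level table name my_catalog.my_schema.target_tbl is a placeholder; on Databricks the spark session already exists, so the getOrCreate() call is only there to keep the snippet self-contained:

```python
# Minimal sketch: list the Delta table properties that govern concurrency
# behaviour. "my_catalog.my_schema.target_tbl" is a placeholder table name.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # already defined in a Databricks notebook

props = spark.sql("SHOW TBLPROPERTIES my_catalog.my_schema.target_tbl").collect()

interesting = {
    "delta.enableRowTracking",
    "delta.enableDeletionVectors",
    "delta.isolationLevel",
}

for row in props:
    if row["key"] in interesting:
        print(f'{row["key"]} = {row["value"]}')
```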
Recommendations to Mitigate the Issue:
1. Explicit Predicate Design:
- Refactor your MERGE operations to include explicit predicates that clearly denote non-overlapping data regions in the target table. For example, use additional filters based on distinct partitions or ranges to limit potential overlaps (see the MERGE sketch after this list).
2. Scheduling Optimization:
- Stagger the execution of the concurrent jobs to ensure minimal overlap in transactional operations affecting the same table. This can mitigate conflicts caused by simultaneous write attempts.
3. Optimize File Layout:
- Keep file sizes healthy by occasionally running OPTIMIZE if your table is undergoing heavy ingestion or transactional churn. This reduces the potential for multiple transactions performing concurrent rewrites on the same file (see the OPTIMIZE sketch after this list).
4. Switch to the WriteSerializable Isolation Level:
- Consider temporarily switching to the WriteSerializable isolation level (delta.isolationLevel=WriteSerializable) to relax conflict detection if strict serializability is not a hard requirement. Note, however, that this trade-off allows certain operations to be reordered in the commit history (see the isolation-level sketch after this list).
5. Monitor and Troubleshoot Conflicts:
- Review the specific exceptions thrown during job failures (e.g., ConcurrentDeleteReadException, ConcurrentTransactionException) to fine-tune job parameters and logic further. A small retry wrapper that logs the conflict class is sketched at the end of this list.
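To illustrate the first recommendation, here is a sketch of a MERGE where each concurrent job is pinned to its own, non-overlapping slice of the target. The table name, the updates_a source view, and the region column are hypothetical stand-ins for your actual schema:

```python
# Sketch: constrain each concurrent MERGE job to a disjoint slice of the
# target table. target_tbl, updates_a and the region column are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

region_for_this_job = "EMEA"  # each job is launched with a different, disjoint value

spark.sql(f"""
    MERGE INTO my_catalog.my_schema.target_tbl AS t
    USING updates_a AS s
      ON  t.id = s.id
      AND t.region = '{region_for_this_job}'  -- explicit, non-overlapping predicate
      AND s.region = '{region_for_this_job}'
    WHEN MATCHED THEN UPDATE SET *
    WHEN NOT MATCHED THEN INSERT *
""")
```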
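For the file-layout recommendation, the compaction itself is a one-liner; ideally run it in a quiet window rather than alongside the MERGE jobs (the table name is again a placeholder):

```python
# Sketch: compact/recluster the table so concurrent transactions are less
# likely to rewrite the same small files. On a liquid clustered table,
# OPTIMIZE reclusters according to the clustering keys (no ZORDER BY needed).
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

spark.sql("OPTIMIZE my_catalog.my_schema.target_tbl")
```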
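If you decide to try the relaxed isolation level, the change is just a table property, and it can be reverted the same way (table name is a placeholder):

```python
# Sketch: switch the table to WriteSerializable, and back again if needed.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

spark.sql("""
    ALTER TABLE my_catalog.my_schema.target_tbl
    SET TBLPROPERTIES ('delta.isolationLevel' = 'WriteSerializable')
""")

# To revert:
# spark.sql("ALTER TABLE my_catalog.my_schema.target_tbl "
#           "SET TBLPROPERTIES ('delta.isolationLevel' = 'Serializable')")
```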
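And for monitoring, a common pattern is to wrap the MERGE in a small retry loop that logs the exact conflict class before retrying. The exception classes below are the ones exposed by the delta-spark Python package (delta.exceptions); depending on your runtime, the conflict may instead surface as a generic Spark/Py4J error, in which case you would match on the class name in the message. The merge_fn callable and the retry/backoff numbers are placeholders:

```python
# Sketch: retry a MERGE a few times when it fails with a Delta concurrency
# conflict, logging which conflict class was hit. merge_fn is a placeholder
# for whatever function actually runs your MERGE statement.
import time

from delta.exceptions import (
    ConcurrentAppendException,
    ConcurrentDeleteDeleteException,
    ConcurrentDeleteReadException,
    ConcurrentTransactionException,
)

RETRYABLE = (
    ConcurrentAppendException,
    ConcurrentDeleteDeleteException,
    ConcurrentDeleteReadException,
    ConcurrentTransactionException,
)


def merge_with_retry(merge_fn, max_attempts=3, backoff_seconds=30):
    """Run merge_fn(), retrying on Delta concurrency conflicts with backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return merge_fn()
        except RETRYABLE as exc:
            print(f"Attempt {attempt} failed with {type(exc).__name__}: {exc}")
            if attempt == max_attempts:
                raise
            time.sleep(backoff_seconds * attempt)
```

You would call it with something like merge_with_retry(lambda: spark.sql(your_merge_statement)).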
If these steps do not resolve the issue, it may be worth experimenting with additional optimizations or configurations based on your workload's specific architecture and data access patterns.
Cheers, Lou.