Menu

#86 Feature Request: Use DuckDB mssql extension for bulk INSERT to SQL Server

open
nobody
6 days ago
2026-06-22
Anonymous
No

Originally created by: cola3k
Originally owned by: SouravRoy-ETL

The problem

Currently, when exporting data to SQL Server, Duckle appears to use row-by-row INSERT operations via the SQL Server connector. For large datasets (e.g., 1M+ rows), this approach is extremely slow and becomes a major bottleneck in data pipeline workflows.

I have verified that DuckDB itself supports a community mssql extension that connects natively via TDS protocol and supports high-performance bulk writes. Switching to this method would solve the performance issue entirely.

Proposed solution

Replace the current row-by-row insert logic with DuckDB's community mssql extension, leveraging the COPY TO statement with BCP protocol for batch/bulk writes.

According to the extension's documentation, this approach can achieve ~1.2M rows/sec load performance, which would be a massive improvement over the current incremental insert method.

Additional context

DuckDB mssql extension docs: https://duckdb.org/community_extensions/extensions/mssql

The extension requires no ODBC/JDBC drivers — it uses pure TDS protocol over TCP, which is lightweight and easy to bundle.

Discussion

  • Anonymous

    Anonymous - 2026-06-22
     
  • Anonymous

    Anonymous - 2026-06-22

    Originally posted by: SouravRoy-ETL

    Let me try that

     
  • Anonymous

    Anonymous - 7 days ago

    Originally posted by: SouravRoy-ETL

    Done in v0.5.0. snk.sqlserver (and snk.synapse) now bulk-load through the DuckDB mssql community extension over TDS - it ATTACHes the target and writes via COPY/INSERT instead of row-by-row, which is far faster on big loads. It's on by default (the Bulk write toggle); flip it off to fall back to the row-by-row driver for fully offline runs. Verified live against SQL Server 2022.

    Please download - https://github.com/slothflowlabs/duckle/releases/tag/v0.5.0 and let me know!

     
  • Anonymous

    Anonymous - 6 days ago

    Originally posted by: cola3k

    First of all, thank you for implementing the SQL Server bulk write via the mssql extension in v0.5.0. I've tested it and it works great.

    However, I noticed two configuration gaps in the GUI regarding the new bulk write mode:

    1. Batch size is not configurable
      The bulk insert currently uses a hardcoded batch size of 1,000 rows per batch. For large datasets, this is too small and impacts performance. It would be helpful to expose this as a user-configurable setting in the Sink node properties.

    2. TLS configuration
      The current connection string for the mssql extension defaults to TrustServerCertificate=true. When connecting to an older SQL Server instance that does not support TLS, it fails with:

      SQL Server: Invalid Input Error: MSSQL connection validation failed: TLS handshake failed to <ip>:1433 - check TLS configuration</ip>

    The GUI already has two options for the legacy driver mode:

    • Trust TLS cert (dev / self-signed)
    • Insert batch size (driver mode only)

    However, these settings currently only apply to the legacy row-by-row driver. It would be great if they could also apply to the new DuckDB mssql extension bulk write mode, so users can configure TLS behavior and batch size regardless of which write mode is used.

    Thank you for the great work on this release!

     

Log in to post a comment.

Auth0 Logo