
Releases of StarRocks Connector for Spark

Notifications

User guide:

Source code: starrocks-connector-for-apache-spark

Naming format of the JAR file: starrocks-spark-connector-${spark_version}_${scala_version}-${connector_version}.jar

Methods to obtain the JAR file:

  • Directly download the Spark connector JAR file from the Maven Central Repository.
  • Add the Spark connector as a dependency in your Maven project's pom.xml file and download it. For specific instructions, see the user guide. (A coordinate sketch follows this list.)
  • Compile the source code into the Spark connector JAR file. For specific instructions, see the user guide.
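
As an alternative to editing pom.xml, the same coordinate can be declared in sbt form. This is a sketch only: the group ID com.starrocks and the artifact name starrocks-spark-connector-3.4_2.12 are assumptions derived from the naming format above, so verify the exact coordinates on the Maven Central Repository before use.

```scala
// build.sbt -- a sketch; the group ID and artifact name are assumed from the
// naming format starrocks-spark-connector-${spark_version}_${scala_version}-${connector_version}
// and should be verified on Maven Central.
libraryDependencies += "com.starrocks" % "starrocks-spark-connector-3.4_2.12" % "1.1.2"
```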

Version requirements:

Spark connector | Spark            | StarRocks     | Java | Scala
--------------- | ---------------- | ------------- | ---- | -----
1.1.1           | 3.2, 3.3, or 3.4 | 2.5 and later | 8    | 2.12
1.1.0           | 3.2, 3.3, or 3.4 | 2.5 and later | 8    | 2.12

Release notes

1.1

1.1.2

Features

  • Supports Spark v3.5. #89
  • Supports the starrocks.filter.query parameter when Spark SQL is used to read data from StarRocks, as sketched after this list. #92
  • Supports reading columns of JSON type from StarRocks. #100
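
For the read-side items above, a minimal Scala sketch looks roughly as follows. The FE addresses, credentials, and table name are placeholders, and the option names follow the connector's documented read parameters; confirm them against the user guide for your version.

```scala
import org.apache.spark.sql.SparkSession

object ReadFromStarRocksSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("starrocks-read-sketch")
      .master("local[*]") // for a local trial run; drop when submitting to a cluster
      .getOrCreate()

    // Placeholder addresses, credentials, and table name -- replace with your own.
    val df = spark.read
      .format("starrocks")
      .option("starrocks.table.identifier", "test_db.orders")
      .option("starrocks.fe.http.url", "http://127.0.0.1:8030")
      .option("starrocks.fe.jdbc.url", "jdbc:mysql://127.0.0.1:9030")
      .option("starrocks.user", "root")
      .option("starrocks.password", "")
      // The filter added in this release is pushed down to StarRocks.
      .option("starrocks.filter.query", "order_date >= '2024-01-01'")
      .load()

    df.show()
    spark.stop()
  }
}
```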

Improvements

  • Optimized error messages. When the Spark connector reads data from StarRocks and columns specified in the starrocks.columns parameter do not exist in the StarRocks table, the returned error message explicitly lists the nonexistent columns. #97
  • If an exception occurs when the Spark connector requests a query plan from a StarRocks FE over HTTP, the FE now returns the exception information to the Spark connector through the HTTP status and response entity. #98

1.1.1

This release mainly includes features and improvements for loading data into StarRocks.

NOTICE

Take note of some changes when you upgrade the Spark connector to this version. For details, see Upgrade Spark connector.

Features

  • The sink supports retries. #61
  • Supports loading data into BITMAP and HLL columns (a write sketch follows this list). #67
  • Supports loading ARRAY-type data. #74
  • Supports flushing based on the number of buffered rows. #78
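
For the load-side items above, a minimal Scala sketch looks roughly as follows. Addresses, credentials, the table name, and the sample rows are placeholders; the commented BITMAP/HLL hint names a parameter that is an assumption here, so check the user guide before relying on it.

```scala
import org.apache.spark.sql.SparkSession

object WriteToStarRocksSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("starrocks-write-sketch")
      .master("local[*]") // for a local trial run; drop when submitting to a cluster
      .getOrCreate()
    import spark.implicits._

    // Placeholder rows, addresses, and table name -- replace with your own.
    val df = Seq((1, "alice"), (2, "bob")).toDF("id", "name")

    df.write
      .format("starrocks")
      .option("starrocks.table.identifier", "test_db.users")
      .option("starrocks.fe.http.url", "http://127.0.0.1:8030")
      .option("starrocks.fe.jdbc.url", "jdbc:mysql://127.0.0.1:9030")
      .option("starrocks.user", "root")
      .option("starrocks.password", "")
      // For BITMAP/HLL target columns, the user guide describes a column-type
      // hint (assumed name: "starrocks.column.types"), e.g.:
      // .option("starrocks.column.types", "visit_users BITMAP")
      .mode("append")
      .save()

    spark.stop()
  }
}
```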

Improvements

  • Removed unused dependencies to make the Spark connector JAR file lightweight. #55 #57
  • Replaced fastjson with jackson. #58
  • Added the missing Apache license header. #60
  • The MySQL JDBC driver is no longer packaged in the Spark connector JAR file. #63
  • Supports configuring the timezone parameter and is compatible with the Spark Java 8 datetime API. #64
  • Optimized the row-to-string converter to reduce CPU cost. #68
  • The starrocks.fe.http.url parameter now accepts URLs with an HTTP scheme. #71
  • Implemented the interface BatchWrite#useCommitCoordinator so that the connector runs on Databricks 13.1. #79
  • Added a hint in the error log to check privileges and parameters. #81

Bug fixes

  • Parses escape characters in the CSV-related parameters column_separator and row_delimiter (see the sketch below). #85
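
These two parameters are passed to Stream Load through the connector's write properties. The sketch below continues the write sketch above and assumes the starrocks.write.properties.* prefix described in the user guide; the escaped separator values are illustrative.

```scala
import org.apache.spark.sql.DataFrame

// Continues the write sketch above; property names follow the
// "starrocks.write.properties.*" pass-through described in the user guide.
def writeCsvWithEscapedSeparators(df: DataFrame): Unit = {
  df.write
    .format("starrocks")
    .option("starrocks.table.identifier", "test_db.users")
    .option("starrocks.fe.http.url", "http://127.0.0.1:8030")
    .option("starrocks.fe.jdbc.url", "jdbc:mysql://127.0.0.1:9030")
    .option("starrocks.user", "root")
    .option("starrocks.password", "")
    // Escape sequences such as \x01 and \x02 in these two parameters are now parsed correctly.
    .option("starrocks.write.properties.column_separator", "\\x01")
    .option("starrocks.write.properties.row_delimiter", "\\x02")
    .mode("append")
    .save()
}
```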

Doc

  • Refactored the docs. #66
  • Added examples of loading data into BITMAP and HLL columns. #70
  • Added examples of Spark applications written in Python. #72
  • Added examples of loading ARRAY-type data. #75
  • Added examples of performing partial updates and conditional updates on Primary Key tables (a partial-update sketch follows this list). #80
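
For reference, a partial update on a Primary Key table roughly takes the following shape. The parameter names starrocks.columns and starrocks.write.properties.partial_update are taken from the user guide, and the table, columns, and addresses are placeholders; treat this as a sketch rather than the documented example itself.

```scala
import org.apache.spark.sql.DataFrame

// A sketch of a partial update on a Primary Key table: only the listed columns
// are written, and the remaining columns keep their current values.
def partialUpdateSketch(updates: DataFrame): Unit = {
  updates.write
    .format("starrocks")
    .option("starrocks.table.identifier", "test_db.users")
    .option("starrocks.fe.http.url", "http://127.0.0.1:8030")
    .option("starrocks.fe.jdbc.url", "jdbc:mysql://127.0.0.1:9030")
    .option("starrocks.user", "root")
    .option("starrocks.password", "")
    .option("starrocks.columns", "id,email")                     // columns carried by `updates`
    .option("starrocks.write.properties.partial_update", "true") // enable partial update
    .mode("append")
    .save()
}
```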

1.1.0

Features

  • Supports loading data into StarRocks.

1.0

Features

  • Supports unloading data from StarRocks.