# aliyun-odps-jdbc **Repository Path**: mirrors_aliyun/aliyun-odps-jdbc ## Basic Information - **Project Name**: aliyun-odps-jdbc - **Description**: JDBC Driver for ODPS - **Primary Language**: Unknown - **License**: Apache-2.0 - **Default Branch**: master - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 0 - **Forks**: 0 - **Created**: 2020-08-08 - **Last Updated**: 2026-05-16 ## Categories & Tags **Categories**: Uncategorized **Tags**: None ## README # ODPS JDBC ![build](https://github.com/aliyun/aliyun-odps-jdbc/actions/workflows/maven.yml/badge.svg) [![Maven Central](https://maven-badges.herokuapp.com/maven-central/com.aliyun.odps/odps-jdbc/badge.svg)](https://maven-badges.herokuapp.com/maven-central/com.aliyun.odps/odps-jdbc) ## Chinese Docs [MaxCompute JDBC介绍](https://help.aliyun.com/document_detail/143408.html) ## Installation Generally, there are two ways to use ODPS JDBC driver in your project. 1.The first one is to use the standalone library: * Download the with-dependencies-jar from [release page](https://github.com/aliyun/aliyun-odps-jdbc/releases). * Checkout the [CHANGELOG](https://github.com/aliyun/aliyun-odps-jdbc/blob/master/CHANGELOG.md). 2.The second is to rely on maven to resolve the dependencies for you: ```xml com.aliyun.odps odps-jdbc VERSION ``` ## Getting Started Using ODPS JDBC driver is just as using other JDBC drivers. It contains the following few steps: 1\. Explictly load the ODPS JDBC driver using `Class.forName()`: ```java Class.forName("com.aliyun.odps.jdbc.OdpsDriver"); ``` 2\. Connect to the ODPS by creating a `Connection` object with the JDBC driver: ```java Connection conn=DriverManager.getConnection(url,accessId,accessKey); ``` The ODPS server works with RESTful API, so the url looks like: ```java String url="jdbc:odps:ENDPOINT?project=PROJECT_NAME&charset=UTF-8"; ``` The connection properties can also be passed through `Properties`. For example: ```java Properties config=new Properties(); config.put("access_id","..."); config.put("access_key","..."); config.put("project_name","..."); config.put("charset","..."); Connection conn=DriverManager.getConnection("jdbc:odps:",config); ``` 3\. Submit SQL to ODPS by creating `Statement` object and using its `executeQuery()` method: ```java Statement stmt=conn.createStatement(); ResultSet rs=stmt.executeQuery("SELECT foo FROM bar"); ``` 4\. Process the result set. For example: ```java while(rs.next()){ ... } ``` ### Connection String Parameters It is recommended that the key and value in URL should be encoded by using java.net.URLEncoder#encode(java.lang.String). #### Basic | URL key | Property Key | Required | Default value | Description | |:------------------:|:--------------------:|:--------:|:--------------:|:-------------------------------------------------------------------------------------| | `endpoint` | `end_point` | True | | The endpoint of your MaxCompute service | | `project` | `project_name` | True | | The name of your MaxCompute project | | `accessId` | `access_id` | True | | Your Alibaba Cloud access key ID | | `accessKey` | `access_key` | True | | Your Alibaba Cloud access key secret | | `interactiveMode` | `interactive_mode` | False | false | Enable MCQA (interactive mode: maxqa/mcqa/offline, or true/false for simple boolean) | | `logview` | `logview_host` | False | Provided by MC | The endpoint of MaxCompute Logview | | `tunnelEndpoint` | `tunnel_endpoint` | False | Provided by MC | The endpoint of the MaxCompute Tunnel service | | `enableOdpsLogger` | `enable_odps_logger` | False | false | Enable MaxCompute JDBC logger | #### Advanced | URL key | Property Key | Required | Default value | Description | |:-------------------------:|:----------------------------:|:--------:|:-----------------------:|:----------------------------------------------------------------------------------------------------------------------------------------------------| | `stsToken` | `sts_token` | False | | The Alibaba Cloud STS token | | `logConfFile` | `log_conf_file` | False | | The configuration path for SLF4J | | `charset` | `charset` | False | UTF-8 | The charset of the inputs and outputs | | `executeProject` | `execute_project_name` | False | | For MCQA, the name of the MaxCompute project in which actually execute the queries | | `alwaysFallback` | `always_fallback` | False | false | For MCQA, fall back to regular mode if any exception happened | | `instanceTunnelMaxRecord` | `instance_tunnel_max_record` | False | -1 (unlimited) | For MCQA, max number of records within a result set, enableLimit option should set to false | | `instanceTunnelMaxSize` | `instance_tunnel_max_size` | False | -1 (unlimited) | For MCQA, max size of a result set in byte | | `enableLimit` | `enable_limit` | False | true(limited) | Download permission won't be checked if enableLimit is set true, but your result record count will be limited to 10000 | | `autoLimitFallback` | `auto_limit_fallback` | False | False(no auto fallback) | For non-MCQA mode, result record count will be limited to 10000 when no download permission exception happened and autoLimitFallback is set to true | #### MCQA | URL key | Property Key | Required | Default value | Description | |:-------------------------------:|:---------------------------------:|:--------:|:----------------:|:-------------------------------------------------------------------------------------------| | `interactiveMode` | `interactive_mode` | False | false | Enable MCQA (interactive mode: maxqa/mcqa/offline, or true/false for simple boolean) | | `interactiveServiceName` | `interactive_service_name` | False | `public.default` | The interactive service name to use (for MCQA) | | `majorVersion` | `major_version` | False | `default` | The major version to use for execution | | `autoSelectLimit` | `auto_select_limit` | False | -1(limited) | Set limit for number of records returned in selects without explicit limit conditions | | `fallbackForUnknownError` | `fallback_for_unknownerror` | False | true | Whether to fall back for unknown errors | | `fallbackForResourceNotEnough` | `fallback_for_resourcenotenough` | False | true | Whether to fall back for resource not enough errors | | `fallbackForUpgrading` | `fallback_for_upgrading` | False | true | Whether to fall back for service upgrading errors | | `fallbackForRunningTimeout` | `fallback_for_runningtimeout` | False | true | Whether to fall back for running timeout errors | | `fallbackForUnsupportedFeature` | `fallback_for_unsupportedfeature` | False | true | Whether to fall back for unsupported feature errors | | `disableFallback` | `disable_fallback` | False | false | Disable all fallback behavior | | `fallbackQuota` | `fallback_quota` | False | | The fallback quota to use | | `attachTimeout` | `attach_timeout` | False | -1 | The timeout for attaching to running jobs, unit depends on specific timeout implementation | #### Tunnel and Network Settings | URL key | Property Key | Required | Default value | Description | |:-------------------------------:|:-----------------------------------:|:--------:|:-------------:|:--------------------------------------------------------------------------------------| | `tunnelRetryTime` | `tunnel_retry_time` | False | 6 | The number of times to retry tunnel operations upon failure | | `useInstanceTunnel` | `use_instance_tunnel` | False | true | Whether to use instance tunnel for reading result sets, not recommend set it to false | | `tunnelReadTimeout` | `tunnel_read_timeout` | False | -1 | Tunnel read timeout in milliseconds, -1 means no timeout | | `tunnelConnectTimeout` | `tunnel_connect_timeout` | False | -1 | Tunnel connection timeout in milliseconds, -1 means no timeout | | `tunnelDownloadUseSingleReader` | `tunnel_download_use_single_reader` | False | `true` | Whether to use a single reader for tunnel downloads | | `fetchResultSplitSize` | `fetch_result_split_size` | False | 10000 | The size of result splits for fetching | | `fetchResultPreloadSplitNum` | `fetch_result_preload_split_num` | False | `5` | Number of splits to preload when fetching results | | `fetchResultThreadNum` | `fetch_result_thread_num` | False | 5 | Number of threads to use for result fetching | #### Security and Connection Settings | URL key | Property Key | Required | Default value | Description | |:--------------------------:|:----------------------------:|:--------:|:-------------:|:------------------------------------------------------------| | `stsToken` | `sts_token` | False | | STS token for temporary security credentials | | `disableConnectionSetting` | `disable_connection_setting` | False | false | Disable connection setting updates | | `enableCommandApi` | `enable_command_api` | False | false | Enable command API functionality | | `httpsCheck` | `https_check` | False | false | Enable HTTPS check for connection security | | `skipSqlRewrite` | `skip_sql_rewrite` | False | false | Skip SQL rewriting and optimization during query processing | | `skipSqlInjectCheck` | `skip_sql_inject_check` | False | false | Skip SQL injection check during query processing | | `skipCheckIfSelect` | `skipCheckIfSelect` | False | true | Skip use antlr4 to check if query is select | | `quotaName` | `quota_name` | False | Not set | The name of the query quota class to use for execution | #### Timeouts and Networking | URL key | Property Key | Required | Default value | Description | |:----------------:|:-----------------:|:--------:|:-------------:|:--------------------------------------------------------------------| | `readTimeout` | `read_timeout` | False | -1 | Read timeout in milliseconds, -1 means no timeout | | `connectTimeout` | `connect_timeout` | False | -1 | Connect timeout in milliseconds, -1 means no timeout | | `retryTime` | `retry_time` | False | -1 | Number of retry attempts for failed operations, -1 means no retries | #### Schema and Catalog Settings | URL key | Property Key | Required | Default value | Description | |:---------------------:|:-----------------------:|:--------:|:-------------------------:|:-----------------------------------------------------------------------| | `odpsNamespaceSchema` | `odps_namespace_schema` | False | Judge by project settings | Whether to use ODPS namespace schema feature (true/false) | | `schema` | `schema` | False | default | Default schema to use for connection (used with namespace schema mode) | | `useProjectTimeZone` | `use_project_time_zone` | False | false | Use project time zone instead of client time zone | | `timeZone` | `timezone` | False | JVM default | Explicit time zone setting for the connection | #### Logging and Debugging | URL key | Property Key | Required | Default value | Description | |:------------------:|:--------------------:|:--------:|:-------------:|:-------------------------------------| | `enableOdpsLogger` | `enable_odps_logger` | False | false | Enable MaxCompute JDBC logger | | `logLevel` | `log_level` | False | INFO | Set the logging level for the driver | | `verbose` | `verbose` | False | false | Enable verbose output for debugging | #### SQL Settings and Feature Control | URL key | Property Key | Required | Default value | Description | |:-------------------------:|:----------------------------:|:--------:|:-------------:|:-----------------------------------------------------------------------------| | `enableLimit` | `enable_limit` | False | true | Enable result limits for MCQA mode | | `autoLimitFallback` | `auto_limit_fallback` | False | false | Automatically fallback when auto limit is reached | | `settings` | `settings` | False | Not set | A JSON string with additional settings to apply on connection | | `tableList` | `table_list` | False | Not set | CSV list of table names to preload: project.table[,project.table] | | `logviewVersion` | `logview_version` | False | 1 | Set the LogView version to use, 1 or 2 | | `async` | `async` | False | false | Enable async execution mode for queries | | `longJobWarningThreshold` | `long_job_warning_threshold` | False | -1 | Threshold in milliseconds for long-running job warnings, -1 disables warning | #### ODPS Configuration File | URL key | Description | |:-------------:|:---------------------------------------------------------| | `odps_config` | Path to an external ODPS configuration file to load from | ## Example ### JDBC Client Sample Code ```java import java.sql.SQLException; import java.sql.Connection; import java.sql.ResultSet; import java.sql.Statement; import java.sql.DriverManager; public class OdpsJdbcClient { private static String driverName = "com.aliyun.odps.jdbc.OdpsDriver"; /** * @param args * @throws SQLException */ public static void main(String[] args) throws SQLException { try { Class.forName(driverName); } catch (ClassNotFoundException e) { e.printStackTrace(); System.exit(1); } // fill in the information here String accessId = "your_access_id"; String accessKey = "your_access_key"; Connection conn = DriverManager.getConnection( "jdbc:odps:https://service.odps.aliyun.com/api?project=", accessId, accessKey); Statement stmt = conn.createStatement(); String tableName = "testOdpsDriverTable"; stmt.execute("drop table if exists " + tableName); stmt.execute("create table " + tableName + " (key int, value string)"); String sql; ResultSet rs; // insert a record sql = String.format( "insert into table %s select 24 key, 'hours' value from (select count(1) from %s) a", tableName, tableName); System.out.println("Running: " + sql); int count = stmt.executeUpdate(sql); System.out.println("updated records: " + count); // select * query sql = "select * from " + tableName; System.out.println("Running: " + sql); rs = stmt.executeQuery(sql); while (rs.next()) { System.out.println(String.valueOf(rs.getInt(1)) + "\t" + rs.getString(2)); } // regular query sql = "select count(1) from " + tableName; System.out.println("Running: " + sql); rs = stmt.executeQuery(sql); while (rs.next()) { System.out.println(rs.getString(1)); } // do not forget to close stmt.close(); conn.close(); } } ``` ### Setting SQL task properties ```java stmt.execute("set biz_id=xxxxxx;"); stmt.execute("set odps.sql.mapper.split.size=512;"); ``` ## Third-party Integration It is also recommended to use ODPS by using other third-party BI tools or DB visualizer that supports JDBC. For example: * [SQLWorkbench/J]() * [Squrriel SQL]() * [Pentaho]() ## Getting Involved The project is under construction (and not fully JDBC-compliant). If you dicover any good features which have not been implemented, please fire me an [Email](mailto:zhangdingxin.zdx@alibaba-inc.com) or just pull a request. ### Architecture [![](mindmap-thumb.png)](mindmap.pdf) ### Build and run unitest 1.Build from source locally: ``` git clone .... cd odps-jdbc mvn package -DskipTests ``` 2.Copy out a configuration file: ``` cp ./src/test/resources/conf.properties.example ./src/test/resources/conf.properties ``` 3.Fill in your connection strings: ``` access_id=... access_key=... end_point=... project_name=... logview_host=... charset=UTF-8 ``` 4.Run maven test command (or just test it in IntelliJ IDEA): ``` mvn test ``` ### Data Type Mapping Currently, 16 ODPS data types are supported. Please see the following table for supported ODPS data types and corresponding JDBC interfaces. | ODPS Type | JDBC Interface | JDBC Type | |:-------------:|:--------------------------------:|:---------:| | TINYINT | java.sql.ResultSet.getByte | TINYINT | | SMALLINT | java.sql.ResultSet.getShort | SMALLINT | | INT | java.sql.ResultSet.getInt | INTEGER | | BIGINT | java.sql.ResultSet.getLong | BIGINT | | FLOAT | java.sql.ResultSet.getFloat | FLOAT | | DOUBLE | java.sql.ResultSet.getDouble | DOUBLE | | DECIMAL | java.sql.ResultSet.getBigDecimal | DECIMAL | | VARCHAR | java.sql.ResultSet.getString | VARCHAR | | CHAR | java.sql.ResultSet.getString | CHAR | | STRING | java.sql.ResultSet.getString | VARCHAR | | BOOLEAN | java.sql.ResultSet.getBoolean | BOOLEAN | | DATE | java.sql.ResultSet.getDate | DATE | | DATETIME | java.sql.ResultSet.getTimestamp | TIMESTAMP | | TIMESTAMP | java.sql.ResultSet.getTimestamp | TIMESTAMP | | TIMESTAMP_NTZ | java.sql.ResultSet.getTimestamp | TIMESTAMP | | BINARY | java.sql.ResultSet.getBytes | BINARY | | ARRAY | java.sql.ResultSet.getArray | ARRAY | When the `getObject()` method is called, what is obtained is the Java type directly corresponding to each ODPS type without any conversion operation. Please see the following table for he correspondence between ODPS types and Java types. | ODPS Type | Java Type | |:-------------:|:-----------------------------:| | TINYINT | java.lang.Byte | | SMALLINT | java.lang.Short | | INT | java.lang.Integer | | BIGINT | java.lang.Long | | FLOAT | java.lang.Float | | DOUBLE | java.lang.Double | | DECIMAL | java.math.BigDecimal | | VARCHAR | com.aliyun.odps.data.Varchar | | CHAR | com.aliyun.odps.data.Char | | STRING | byte[] | | BOOLEAN | java.sql.ResultSet.getBoolean | | DATE | java.time.LocalDate | | DATETIME | java.time.ZonedDateTime | | TIMESTAMP | java.time.Instant | | TIMESTAMP_NTZ | java.time.LocalDateTime | | BINARY | com.aliyun.odps.data.Binary | | ARRAY | java.util.List | NOTE: Possible timezone issue DATETIME in MaxCompute is actually defined as EPOCH in milliseconds, which is UTC, and so is TIMESTAMP in JDBC. This driver fill the DATETIME value directly into JDBC TIMESTAMP and do no parse or format action. When application that using JDBC display a DATETIME as a human-readable string format, it is the application itself did the format using application defined or OS defined timezone. It is suggested to keep your application/OS timezone setting same to MaxCompute to avoid inconsistent datetime parse/format. ### Type Conversion Implicit type conversion happens when accessing a ODPS data type with JDBC interfaces other than the recommended one. Please see the following table for supported implicit conversions. | JAVA\ODPS | TINYINT | SMALLINT | INT | BIGINT | FLOAT | DOUBLE | DECIMAL | CHAR | VARCHAR | STRING | DATE | DATETIME | TIMESTAMP | TIMESTAMP_NTZ | BOOLEAN | BINARY | |:----------:|:-------:|:--------:|:---:|:------:|:-----:|:------:|:-------:|:----:|:-------:|:------:|:----:|:--------:|:---------:|:-------------:|:-------:|:------:| | byte | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | | | | | Y | Y | | short | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | | | | | Y | Y | | int | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | | | | | Y | Y | | long | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | | | | | Y | Y | | float | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | | | | | Y | Y | | double | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | | | | | Y | Y | | BigDecimal | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | | | | Y | Y | | String | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | | byte\[\] | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | | Date | | | | | | | | | | Y | Y | Y | Y | Y | | Y | | Time | | | | | | | | | | Y | | Y | Y | Y | | Y | | Timestamp | | | | | | | | | | Y | | Y | Y | Y | | Y | | boolean | Y | Y | Y | Y | Y | Y | Y | | | Y | | | | | Y | | ## Known Issues ### Thread Leak in v3.9.0 ~ v3.10.4 Versions v3.9.0 through v3.10.4 have a thread leak bug. The internal thread pool used for multi-threaded result fetching (`InstanceDataIterator`) is not properly shut down in the following scenarios: 1. **Reusing a `Statement` to execute multiple queries** — the most common case in connection pool scenarios. Each new query execution leaks the previous thread pool. 2. **Accessing a `ResultSet` in scrollable mode** — the thread pool is abandoned when converting to `OdpsScrollResultSet`. Each leaked query accumulates `fetchResultThreadNum` threads (default: `min(fetchResultPreloadSplitNum, availableProcessors * 2)`). In long-running applications, this causes continuous thread count growth, eventually leading to resource exhaustion. **Recommendation:** Upgrade to v3.10.5 or later. If upgrading is not immediately possible, ensure that each `Statement` object is only used once (create a new `Statement` for each query) as a temporary workaround. ## MaxCompute Service Compatibility and Recommended JDBC version Since Sprint27, MaxCompute tunnel service supported a feature named instance tunnel that allowing client read query result set through tunnel endpoint, to release client from creating temporary table. And this JDBC driver began adopt using instance tunnel since version 2.0. However, for users using MaxCompute deploy that is earlier than Sprint27 (especially Private Cloud cases), please stick to the latest version before 2.0. | MaxCompute | JDBC | |:----------------------:|:------:| | Public Service | latest | | Non PRC Public Service | latest | | <= Sprint27 | 1.9.2 | ## License licensed under the [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0.html)