User Guide - Orange Business Services

Transcription

Data Warehouse ServiceUser GuideDate2021-10-22

Data Warehouse ServiceUser GuideContentsContents1 Service Overview. 11.1 What Is GaussDB(DWS)?. 11.2 Advantages. 41.3 Application Scenarios. 61.4 Functions. 81.5 Concepts. 121.5.1 GaussDB(DWS) Management Concepts . 121.5.2 GaussDB(DWS) Database Concepts. 121.6 Related Services. 131.7 GaussDB(DWS) Permissions Management.141.8 GaussDB(DWS) Access. 191.9 Restrictions. 211.10 Technical Specifications. 212 Getting Started. 242.1 Step 1: Starting Preparations. 242.2 Step 2: Creating a Cluster. 252.3 Step 3: Connecting to a Cluster. 282.4 Step 4: Importing Sample Data and Performing Queries. 322.4.1 Checkpoint Vehicle Analysis.322.4.2 Supply Chain Requirement Analysis (TPC-H Data Set).362.4.3 Operations Status Analysis of a Retail Department Store. 432.5 Step 5: Viewing Other Documents and Deleting Resources. 513 Process for Using GaussDB(DWS). 534 Preparations.575 Cluster Configuration. 585.1 Accessing the GaussDB(DWS) Management Console. 585.2 Creating a Cluster. 586 Cluster Connection.696.1 Methods of Connecting to a Cluster. 696.2 Obtaining the Cluster Connection Address. 696.3 Using the gsql CLI Client to Connect to a Cluster. 722021-10-22ii

Data Warehouse ServiceUser GuideContents6.3.1 Downloading the Client. 726.3.2 Preparing an ECS as the gsql Client Host. 746.3.3 Using the gsql Client to Connect to a Cluster. 756.3.4 Establishing Secure TCP/IP Connections in SSL Mode. 786.3.5 (Optional) Configuring SSL Connection. 826.3.6 (Optional) Downloading the SSL Certificate. 856.4 Using the Data Studio GUI Client to Connect to a Cluster. 866.5 Using the JDBC and ODBC Drivers to Connect to a Cluster. 906.5.1 Development Specifications. 906.5.2 Downloading the JDBC or ODBC Driver. 906.5.3 Using a JDBC Driver to Connect to a Database. 916.5.4 Using an ODBC Driver to Connect to a Database. 1026.6 Managing Database Connections. 1087 Clusters. 1117.1 Checking the Cluster Status. 1117.2 Viewing Basic Cluster Information. 1147.3 Managing Access Domain Names. 1177.4 Modifying Database Parameters.1197.5 MRS Data Sources. 1207.5.1 Importing Data from MRS to GaussDB(DWS).1207.5.2 Creating an MRS Data Source Connection. 1217.5.3 Updating the MRS Data Source Configuration. 1237.6 Managing Cluster Workloads. 1247.6.1 Workload Management Overview. 1247.6.2 Adding Workload Queues. 1267.6.3 Modifying Workload Queues. 1287.6.4 Workload Queue Query. 1317.6.5 Deleting Workload Queues. 1317.7 Managing Tags. 1327.7.1 Overview. 1327.7.2 Tag Management. 1337.8 Deleting Clusters.1357.9 Parameter Templates. 1357.10 Managing Clusters That Fail to Be Created. 1408 Cluster O&M.1428.1 Snapshots. 1428.1.1 Overview. 1428.1.2 Manually Creating Snapshots. 1438.1.3 Configuring the Automated Snapshot Policy. 1458.1.4 Viewing Snapshot Information. 1468.1.5 Copying Automated Snapshots. 1478.1.6 Restoring a Snapshot to a New Cluster. 1482021-10-22iii

Data Warehouse ServiceUser GuideContents8.1.7 Deleting Manual Snapshots. 1508.2 Cluster Scale-out. 1508.3 Cluster Upgrade. 1528.4 Read-only Status.1548.5 Password Reset. 1548.6 Cluster Restart. 1559 Monitoring and Alarms. 1579.1 Cluster Monitoring. 1579.2 Event Notifications. 1679.2.1 Event Notifications Overview.1679.2.2 Subscribing to Event Notifications. 1709.2.3 Viewing Events. 17210 Audit Logs. 17310.1 Audit Log Overview. 17310.2 Viewing Audit Logs of Key Operations on the Management Console. 17410.3 Configuring the Database Audit Logs.17710.4 Dumping the Database Audit Logs. 18011 Security.18411.1 Configuring Cluster Security Settings. 18411.1.1 Configuring Separation of Permissions. 18411.2 Encrypting Databases. 18711.2.1 Overview. 18711.2.2 Rotating Encryption Keys. 18912 FAQs. 19012.1 General Problems. 19012.1.1 What Is a Data Warehouse?. 19012.1.2 Is the Data Warehouse Kernel GaussDB an In-House Product?. 19112.1.3 Why Are Data Warehouses Necessary?. 19112.1.4 What Scenarios Use Data Warehouses?. 19112.1.5 What Are the Differences Between a Data Warehouse and the Hadoop Big Data Platform?. 19212.1.6 Why Should I Use Public Cloud GaussDB(DWS)?. 19312.1.7 What Are the Benefits of GaussDB(DWS)?. 19412.1.8 Should I Choose Public Cloud GaussDB(DWS) or RDS?.19412.1.9 When Should I Use GaussDB(DWS) and MRS?. 19512.1.10 Can GaussDB(DWS) SQL on OBS Replace MRS?. 19512.1.11 What Are the Differences Between GaussDB(DWS) and Hive in Functions?. 19612.1.12 What Is the User Quota?. 19612.1.13 How Can I View When the User Was Created?. 19612.1.14 Regions and AZs. 19712.1.15 Is My Data Secure in GaussDB(DWS)?. 19812.1.16 How Is GaussDB(DWS) Secured?. 1992021-10-22iv

Data Warehouse ServiceUser GuideContents12.1.17 Can I Modify the Security Group of a GaussDB(DWS) Cluster?. 19912.1.18 What Is a Database/Data Warehouse/Data Lake/Lakehouse?. 19912.2 Cluster Management. 20212.2.1 Can I Batch Create Data Warehouses?. 20212.2.2 What Do I Do If Creating a GaussDB(DWS) Cluster Failed?. 20212.2.3 How Can I Upgrade or Downgrade GaussDB(DWS)?.20312.2.4 How Can I Clear and Reclaim Storage?. 20312.2.5 Can I Scale a Cluster?. 20412.2.6 Can I Use the Data Warehouse Cluster When It Is Scaled Out?. 20412.2.7 Why Did the Used Storage Shrink After Scale-out?. 20412.2.8 How Do I View Node Metrics (CPU, Memory, and Disk Usage)?.20412.2.9 Does GaussDB(DWS) Support Disk Capacity Expansion?. 20512.2.10 Does GaussDB(DWS) Support a Single Node for a Learning Environment?. 20512.2.11 How Is the Disk Space or Capacity of GaussDB(DWS) Calculated?.20512.2.12 Can I Change the Node IP Address in a GaussDB(DWS) Cluster?. 20512.2.13 Can I Change the CPU Architecture of GaussDB(DWS)?. 20512.2.14 What Are the Differences Between the Kunpeng Architecture and x86 Architecture of aGaussDB(DWS) Cluster?.20512.3 Cluster Backup and Restoration. 20612.3.1 Why Is Creating an Automated Snapshot So Slow?.20612.4 Database Connections. 20612.4.1 Does GaussDB(DWS) Support Third-Party Clients and JDBC and ODBC Drivers?. 20612.4.2 Can I Use Navicat to Connect to GaussDB(DWS)?. 20612.4.3 Can I Connect to GaussDB(DWS) Cluster Nodes Using SSH?.20612.4.4 What Should I Do If I Cannot Connect to a Data Warehouse Cluster?.20712.4.5 Why Was I Not Notified of Failure Unbinding the EIP When GaussDB(DWS) Is Connected Over theInternet?. 20712.4.6 Are There Any Restrictions on the Platform and OS of the Downloaded Client?.20812.4.7 How Can a Data Warehouse and a Database in Different AZs of the Same Region Communicate?. 20812.4.8 Do I Need the Same VPC and Subnet to Connect to a GaussDB(DWS) Cluster?. 20812.5 Data Import and Export. 20812.5.1 What Are the Differences Between Data Formats Supported by OBS and GDS Foreign Tables?.20812.5.2 How Can I Import Data to GaussDB(DWS)?. 20912.5.3 How Much Service Data Can a Data Warehouse Store?. 20912.5.4 How Do I Use COPY to Import Data to a Cloud Database?.20912.5.5 Can I Export GaussDB(DWS) Data from/to OBS Across Regions?. 21012.5.6 How Do I Import GaussDB(DWS)/MySQL/SQL Server Data to GaussDB(DWS) (Whole DatabaseMigration)?. 21012.5.7 Can I Import Data over the Public/External Network Using GDS?. 21012.6 Account, Password, and Permissions.21012.6.1 How Do I Change My Expired Database Password?. 21012.6.2 How Do I View All Database Users and Their Permissions?. 2112021-10-22v

Data Warehouse ServiceUser GuideContents12.6.3 How Do I Revoke the CONNECT ON DATABASE Permission from a User?. 21112.7 Database Usage.21212.7.1 Do I Need to Set a Distribution Key After Setting a Primary Key?. 21212.7.2 Is GaussDB(DWS) Compatible with PostgreSQL Stored Procedures?. 21212.7.3 Can I Set the Distribution Key After a Table Is Created?. 21212.7.4 How Can I Export the Table Structure?.21212.7.5 How Can I Export All Tables and Views from a Database?. 21312.7.6 How Can I Change the Character Encoding Format of the Database?. 21312.7.7 How Can I Delete Table Data Efficiently?. 21412.7.8 Must I Run VACUUM FULL and ANALYZE on Common Tables Periodically?. 21512.7.9 How Do I View Foreign Table Information?. 21612.8 Database Performance. 21612.8.1 Why Is SQL Execution Slow After Long GaussDB(DWS) Usage?.21612.8.2 Why Does GaussDB(DWS) Perform Worse Than a Single-Server Database in Extreme Scenarios?. 21712.8.3 How Can I View SQL Execution Records of A Certain Period When Service Read and Write AreBlocked?. 21712.9 Logs. 21712.9.1 How Can I View Database Operation Logs?. 218A Change History. 2192021-10-22vi

Data Warehouse ServiceUser Guide1 Service Overview1Service Overview1.1 What Is GaussDB(DWS)?GaussDB(DWS) is an online data processing database that runs on the cloudinfrastructure to provide scalable, fully-managed, and out-of-the-box analyticdatabase service, freeing you from complex database management andmonitoring. It is a native cloud service based on the converged data warehouseGaussDB, and is fully compatible with the standard ANSI SQL 99 and SQL 2003, aswell as the PostgreSQL and Oracle ecosystems. GaussDB(DWS) providescompetitive solutions for PB-level big data analysis in various industries.ArchitectureGaussDB(DWS) employs the shared-nothing architecture and the massivelyparallel processing (MPP) engine, and consists of numerous independent logicalnodes that do not share the system resources such as CPUs, memory, and storage.In such a system architecture, service data is separately stored on numerousnodes. Data analysis tasks are executed in parallel on the nodes where data isstored. The massively parallel data processing significantly improves responsespeed.Figure 1-1 Architecture2021-10-221

Data Warehouse ServiceUser Guide 1 Service OverviewApplication layerData loading tools, extract, transform, and load (ETL) tools, businessintelligence (BI) tools, as well as data mining and analysis tools, can beintegrated with GaussDB(DWS) through standard APIs. GaussDB(DWS) iscompatible with the PostgreSQL ecosystem, and the SQL syntax is compatiblewith Oracle and Teradata. Applications can be smoothly migrated toGaussDB(DWS) with few changes. APIApplications can connect to GaussDB(DWS) through the standard JavaDatabase Connectivity (JDBC) 4.0 and Open Database Connectivity (ODBC)3.5. GaussDB(DWS) (MPP cluster)A GaussDB(DWS) cluster contains nodes of the same flavor in the samesubnet. These nodes jointly provide services. Datanodes (DNs) in a clusterstore data on disks. Coordinators (CNs) receive access requests fromapplication

1 Service Overview 1.1 What Is GaussDB(DWS)? GaussDB(DWS) is an online data processing database that runs on the cloud infrastructure to provide scalable, fully-managed, and out-of-the-box analytic