SAP HANA Network Requirements

Transcription

SAP HANA Network RequirementsAs an in-memory database, SAP HANA uses multiple network connections to transfer data, forexample, from clients to the database during standard operations, between the nodes in ascale-out solution, and between data centers and to persistent storage in a disaster tolerantsolution. This paper discusses the SAP HANA network requirements using an SAP BW on SAPHANA system as an example.SAP HANA Development Teammechthild.bore-wuesthof@sap.comV1.0, October 2014

ContentsLegal Disclaimer . 3Change history . 31Introduction . 42SAP HANA network zones . 5Client zone . 5SQL Client Communication . 5HTTP Client Communication. 8Internal zone . 9Internode Network . 9System Replication Network . 11Storage Replication Network. 12Storage zone . 13Fibre Channel Storage using SAN . 13Network attached storage using NFS . 13Shared-nothing architecture with Cluster File Systems . 14Backup storage . 143Scenario: SAP BW on SAP HANA with system replication . 15SAP BW Scenarios . 15SAP BW Network Traffic . 16Technical Requirements . 19Recommendations for the Client Zone Network . 22Client reconnect with the Secure User Store (hdbuserstore) . 22Client reconnect in SAP HANA system replication . 23IP redirection (or Virtual IP). 23Connectivity suspend of the application server . 25Client reconnect for SAP HANA XS applications (http/https) . 25Recommendations for the Internal Zone Networks . 26Internode network . 26Internal host name resolution . 27System replication network. 27System replication host name resolution. 29Recommendations for the Storage Zone Networks . 31 2014 SAP SEpage 2/32

4Summary. 315References . 32Legal DisclaimerTHIS DOCUMENT IS PROVIDED FOR INFORMATION PURPOSES ONLY AND DOES NOT MODIFY THE TERMS OF ANY AGREEMENT. THE CONENT OF THISDOCUMENT IS SUBJECT TO CHANGE AND NO THIRD PARTY MAY LAY LEGAL CLAIM TO THE CONTENT OF THIS DOCUMENT. IT IS CLASSIFIED AS “CUSTOMER”AND MAY ONLY BE SHARED WITH A THIRD PARTY IN VIEW OF AN ALREADY EXISTING OR FUTURE BUSINESS CONNECTION WITH SAP. IF THERE IS NO SUCHBUSINESS CONNECTION IN PLACE OR INTENDED AND YOU HAVE RECEIVED THIS DOCUMENT, WE STRONGLY REQUEST THAT YOU KEEP THE CONTENTSCONFIDENTIAL AND DELETE AND DESTROY ANY ELECTRONIC OR PAPER COPIES OF THIS DOCUMENT. THIS DOCUMENT SHALL NOT BE FORWARDED TO ANYOTHER PARTY THAN THE ORIGINALLY PROJECTED ADDRESSEE.This document outlines our general product direction and should not be relied on in making a purchase decision. This document is not subject to your licenseagreement or any other agreement with SAP. SAP has no obligation to pursue any course of business outlined in this presentation or to develop or release anyfunctionality mentioned in this document. This document and SAP's strategy and possible future developments are subject to change and may be changed bySAP at any time for any reason without notice. This document is provided without a warranty of any kind, either express or implied, including but not limitedto, the implied warranties of merchantability, fitness for a particular purpose, or non-infringement. SAP assumes no responsibility for errors or omissions in thisdocument and shall have no liability for damages of any kind that may result from the use of these materials, except if such damages were caused by SAPintentionally or grossly negligent. Copyright 2014 SAP SE. All rights reserved.Change historyVersion1.01.1DateOctober 2014October 2014 2014 SAP SEDescriptionInitial releaseMinor addition in backup chapter – storage snapshots addedpage 3/32

1IntroductionThe components of a SAP HANA system communicate via different network channels. It is recommended practice to have a well-defined network topology to control and limit network access to SAPHANA to only those communication channels required for your scenario, and to apply appropriateadditional security and performance measures as necessary.Clients require access to the SAP HANA database, and while SAP HANA is an in-memory databasewhich stores and processes its data in memory, the data is also saved in persistent storage locationsto provide protection against data loss. In distributed (scale-out) systems, communication must bepossible between hosts; in disaster recovery setups, data is replicated over the network to otherdata centers.The components belonging to SAP HANA communicate over several logical network zones: Client zone. Different clients, such as SQL clients on SAP application servers, browserapplications using HTTP/S to the SAP HANA XS server and other data sources (such as BI)need a network communication channel to the SAP HANA database. Internal zone. The internal zone covers the communication between hosts in a distributedSAP HANA system as well as the communication used by SAP HANA system replicationbetween two SAP HANA sites. Storage zone. Although SAP HANA holds the bulk of its data in memory, the data is also savedin persistent storage locations – which probably need to be accessed via a network – toprovide protection against power failures or other events that cause hosts to becomeunavailable.Each communication zone needs special attention with regard to configuration, setup, security andperformance requirements. Redundancy is recommended for the internal and the storage networks,but is also important for high availability requirements.The following illustration shows the three logical zones of a distributed SAP HANA system (in thisexample1 with 2 hosts plus 1 standby host) on the primary site, which replicates to a secondary site(with 2 hosts) using SAP HANA system replication.1This is just an example. For SAP BW systems, at least 3 worker nodes are recommended. See the SAP Community Network bloghttp://scn.sap.com/docs/DOC-39682 as well as SAP Note 1702409 ( http://service.sap.com/sap/support/notes/1702409). 2014 SAP SEpage 4/32

Client zone:Internal zone:Storage zone:Network for SAP application servers, HTTP and SQL clientsInternode and System Replication NetworkEnterprise Storage Network and Backup NetworkNetwork zones of SAP HANA2SAP HANA network zonesThe aforementioned logical communication zones of SAP HANA are explained in greater detail in thissection.Client zoneSQL Client CommunicationClient applications communicate with an SAP HANA system via a client library (SQLDBC, JDBC, ODBC,DBSL, ODBO .) for SQL or MDX access from different platforms and types of clients. In distributedsystems, the application has a logical connection to the SAP HANA system; the client library may usemultiple connections to different hosts or change the underlying connection over time. The followingfigure provides an overview of some of the clients used on SAP HANA systems and the interfacesprovided by the client library. 2014 SAP SEpage 5/32

Client library (interfaces)The SQL client libraries will connect to the first available host specified in the connect string, whichcan contain a single host or a list of hosts. All hosts that could become the active master, becausethey are one of the three configured master candidates, should be listed in the connect string toallow initial connection to any of the master candidates in the event of a host auto-failover. As aresult, the client libraries receive a list of all hosts. During operations, statements may be sent to anyof these hosts. The client connection code uses a "round-robin" approach to reconnect and ensuresthat these clients can reach the SAP HANA database. In the event of failover, a list of hosts is parseduntil the first responding host is found.Please refer to the SAP HANA Master Guide [2] for further details on connect strings with multiplehost names.To ensure transactional consistency in distributed setups, SAP HANA supports distributedtransactions, which are transactions that involve data that is stored on different index servers. In thiscontext, the client library supports load balancing by connection selection and minimizescommunication overhead using statement routing based on information about the location of thedata. Connection selection and/or statement routing are executed based on the setting of the iniparameter client distribution mode in the indexserver.ini file (the default settingis statement for statement routing). Statement routing helps to optimize performance by reducing the amount of data that must besent over the internal network zone. In principle, clients can send requests to any index server ina distributed system. However, this could lead to a performance decrease if, for example, a queryinvolving data located on server1 is sent to server2. Server2 would have to forward the query toserver1 and forward the response from server1 to the client. This overhead can be avoided if the 2014 SAP SEpage 6/32

client sends the query to server1 directly. This can be achieved through the use of preparedstatements.Statement routing using prepared statementA prepared statement is an SQL statement (typically parameterized) that is sent to the serverwhere it is precompiled (1). The prepared statement is returned to the client (2). To execute thestatement, the client sends the prepared statement (usually filled with the actual parameters) tothe server where (most of) the data is located (3) by opening a new connection.This two-step protocol (prepare and execute) is used for statement routing in a distributedsystem, where the “prepare” request may return information about the preferred locations forexecuting the statement. Based on this information, the client library sends requests to executethe prepared statement directly to one of the indicated index servers. If the situation haschanged since the “prepare” request was processed, the response for an “execute” request mayalso contain location information (4). Statement routing is done by the client library and istransparent to the application. With statement routing, different statements of the sametransaction may be executed on different servers. Connection selection is transparent to the client application. Whenever a new connection iscreated, the client library chooses one of the servers in a round-robin manner. During a longliving connection, the client library may connect to a different server to improve the workloaddistribution. If a connected server fails, the client library switches to a different server. Command splitting describes the ability of the client interface to split up the execution of bulkstatements that allow, for example, the insertion of a big array of rows with one command. Theclient library can be configured to split the execution of such a command into batches that aresent using multiple connections. The servers are chosen according to the client-side landscapeinformation. 2014 SAP SEpage 7/32

Command splittingHTTP Client CommunicationIn addition, the communication with SAP HANA servers can be initiated by a web browser or a mobileapplication. These use the HTTP/S protocol to access SAP HANA Extended Application Services (SAPHANA XS).As shown in the following picture, each SAP HANA instance includes an SAP HANA XS server and aninternal SAP web dispatcher – referred to as HDB web dispatcher.Browser application running against the SAP HANA XS server 2014 SAP SEpage 8/32

You can have multiple SAP HANA XS servers for purposes of scale-out in a distributed SAP HANAsystem. In this case, it is recommended that you install an external, itself fault-protected HTTP loadbalancer (HLB) to support HTTP (web) clients, such as the SAP Web Dispatcher (webdisp) or asimilar product from another vendor. In scale-out or system replication landscapes, the HLBs areconfigured to monitor the web servers on all hosts on both the primary and secondary sites. In theevent that an SAP HANA instance fails, the HLB which serves as a reverse web-proxy redirects theHTTP clients to a SAP HANA XS instance running on an active host. HTTP clients are configured to usethe IP address of the HLB itself, which is obtained via DNS, and remain unaware of any SAP HANAfailover activity.An internal HDB web dispatcher runs between the external HTTP load balancer and the SAP HANA XSserver running on each host. This internal web dispatcher provides client access to the SAP HANA XSserver via HTTP in a 1:1 mapping.For more information, please see the SAP HANA Master Guide [2].Internal zoneOnce the physical limits of a single host have been reached, it is possible to scale out over multiplehosts to create a distributed SAP HANA system. One technique is to distribute different schemas andtables to different hosts (complete data and user separation), but that is not always possible, forinstance, when a single table is larger than the host’s RAM size. SAP HANA supports distribution of itsserver components across multiple hosts – for example, for scalability.Internode NetworkA distributed system can overcome the hardware limitations of a single physical host, and it candistribute the load between multiple hosts. The most important strategy for data scaling is datapartitioning. Partitioning supports the creation of very large tables2 (billions of rows) by breakingthem into smaller chunks that can be placed on different machines, as illustrated below:Table partitioned over multiple hostsIn a distributed system, each index server is assigned to one host to attain maximum performance. Itis possible to assign different tables to different hosts (by partitioning the database), to split a singletable between hosts (by partitioning the table), or to replicate tables to multiple hosts.2Master data tables should not be partitioned, since they are to be joined to all other tables. Network traffic is reduced, if tables in thesame query should have the same partition criteria (first-level) to benefit from query optimizations, e.g. collocated joins of OLAP engine forBW. 2014 SAP SEpage 9/32

In such a system, the name server acts as a central component that “knows” the topology and howthe data is distributed. It “knows” which tables, table replicas or partitions of tables are located onwhich index server. When processing a query, the index servers ask the name server about thelocations of the involved data. To prevent a negative impact on performance, the topology anddistribution information is replicated and cached on each host. Each SAP HANA system has onemaster name server that owns the topology and distribution data. This data is replicated to all theother name servers, which are referred to as the slave name servers. The slave name servers writethe replicated data to a cache in shared memory from where the index servers of the same instancecan read it.Topology and data location information in a distributed SAP HANA databaseThe master name server has its own persistence where it stores name server data (topology,distribution, data). The slave name servers have no persistence as they hold only replicated data inmemory.In a distributed (scale-out) system, all network zones are present with additional traffic going overthe internal network due to table distribution and partitioning. For example, index server A processesa query that involves tables located on different servers. When processing the query, the optimizerasks the name server for the location information and chooses the optimal execution plan takingdistribution into account. A distributed execution plan is created that also contains the locationinformation. The executor of the distributed plan also needs to be aware of the distribution. It sendsparts of the execution plan to remote servers, triggers the execution of the remote operations, andsynchronizes distributed execution based on the availability of intermediate results as input for thenext steps. Finally, it combines the results provided by the individual servers to generate the finalresult.SAP HANA supports three ways of distributing data between multiple index servers in a system: 2014 SAP SEpage 10/32

Different tables can be assigned to different index servers (which normally run on differenthosts). Here the entire table is assigned to a specific index server that resides on the master,on the slave, or on all hosts.A table can be split so that different rows of the table are stored on different index servers.Splitting tables is currently supported for column store tables only.A table can be replicated to multiple index servers, for better query and join performance.For more information on table placement, please have a look in the SAP HANA Administ

HTTP clients to a SAP HANA XS instance running on an active host. HTTP clients are configured to use the IP address of the HLB itself, which is obtained via DNS, and remain unaware of any SAP HANA failover activity. An internal HDB web dispatcher runs between the external HTTP l