IBM PureData System For Analytics N3001 - BITanium


Data Sheet
Analytics

IBM PureData System for Analytics N3001
Powered by Netezza technology

Highlights
- Easy to deploy and manage; dramatically simplifies your data warehouse and analytic infrastructure
- Arrives ready to go with IBM Fluid Query, plus data integration, business intelligence and Hadoop starter kits
- Protection of all data from unauthorized access
- Integrated platform supporting thousands of users, unifying data warehouse, Hadoop, and business intelligence with advanced analytics
- Delivered with data integration, business intelligence and Hadoop starter kits
- Powered by Netezza technology

To gain competitive advantage, organizations must rely on sophisticated analytics mining large volumes of data. Yet many companies need faster time-to-value for new analytical capability while maintaining service level agreements on existing analytics applications. IBM PureData System for Analytics N3001, powered by Netezza technology, provides faster performance, is big data and business intelligence (BI) ready, and provides advanced security, all in a wider range of appliance models. With IBM PureData System for Analytics N3001, IBM is again changing the game for data warehouse appliances.

IBM PureData System for Analytics is a high-performance, scalable, massively parallel system that enables clients to gain insight from their data and perform analytics on enormous data volumes. Realizing business value from today's volumes is made simpler and faster, because the data is more easily accessible. The N3001 model comes ready to deliver extra value with included business intelligence and Hadoop starter kits. These tools complement the IBM PureData System for Analytics N3001's ability to run complex analytics on very large data volumes, at much faster rates than competing solutions.

IBM PureData System for Analytics is a purpose-built, standards-based data warehouse and analytic appliance that integrates database, server, storage and advanced analytic capabilities into a single, easy-to-manage system. Designed for rapid and deep analysis of data volumes scaling into the petabytes, it delivers insight never before thought possible, at a low cost of ownership. The IBM PureData System for Analytics N3001 family ranges widely from entry point to petabyte scale, covering a broad range of data capacity needs.

PureData System for Analytics delivers the proven performance, scalability, intelligence, and simplicity your business needs. It requires minimal administration and tuning, both for the initial deployment and for ongoing maintenance, which translates into a lower total cost of ownership (TCO).

The analytics opportunity

Deep, sophisticated analytics on large data volumes are integral to enterprises in a competitive economy, giving them an edge over the competition. However, most organizations are challenged both by the time-to-market for new analytic capability and by maintaining service level agreements on existing analytics. IBM PureData System for Analytics shifts the focus to simplicity, instead of unproductively managing complexity.

Strategic analytics should not be complicated to deliver and difficult to manage.

IBM PureData System for Analytics is a scalable, hardware-accelerated, massively parallel system that enables clients to gain insight from enormous data volumes, 10 to 100 times faster than they can with traditional systems (see footnote 1), without the need to copy the data into a separate analytics server.

Figure 1: PureData System for Analytics

Best practice: Routing queries to the data

IBM Fluid Query is the capability that unifies data access across the logical data warehouse and big data ecosystems. Users and analytic applications need access to data in a variety of data repositories and platforms without concern for the data's location or access method, or the need to rewrite a query. IBM Fluid Query is the capability for a data store to route a query (or even part of a query) to the correct data store within the logical data warehouse, so that the query can flow to the data, not the data flow to the query.

No matter where a user connects within the logical data warehouse, they can access all data through the same, standard API/SQL access. IBM Fluid Query is the foundation of the logical data warehouse, giving users the ability to combine their data, even if spread across various sources, in a fast, agile manner to drive analytics and deeper insight, without understanding how to connect multiple data stores, use different syntaxes or APIs, or change their application.

IBM Fluid Query 1.0, included with IBM PureData System for Analytics, provides access to data in Hadoop from IBM PureData System for Analytics appliances. IBM Fluid Query 1.0 enables the fast movement of data between Hadoop and IBM PureData System for Analytics appliances. Enabling query and data movement, IBM Fluid Query 1.0 connects those appliances to common Hadoop systems: IBM BigInsights for Apache Hadoop, Cloudera, and Hortonworks. IBM Fluid Query 1.0 allows queries against PureData System for Analytics, Hadoop or both by merging results from PureData System for Analytics database tables and Hadoop data sources, thus creating powerful analytic combinations.

IBM Fluid Query 1.0 enables your existing PureData System for Analytics applications to gain insight from even more data. You can now run your existing queries, reports, and analytics against data on Hadoop, in addition to the data in your appliance.
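As a rough illustration of the routing idea described in this section (a query, or part of one, is sent to the store that holds the data), here is a minimal Python sketch. It is not the IBM Fluid Query API: the catalog, the store names, the table names and the Fragment type are all hypothetical and exist only to make the idea concrete.

```python
from dataclasses import dataclass

# Hypothetical catalog mapping each table to the store that holds it.
# These names are illustrative assumptions, not part of IBM Fluid Query.
CATALOG = {
    "sales": "appliance",
    "customers": "appliance",
    "clickstream": "hadoop",
}


@dataclass(frozen=True)
class Fragment:
    """One part of a query: a table plus the predicate applied to it."""
    table: str
    predicate: str


def route(fragments):
    """Group query fragments by the store that holds each table."""
    plan = {}
    for frag in fragments:
        store = CATALOG[frag.table]  # raises KeyError for an unknown table
        plan.setdefault(store, []).append(frag)
    return plan


# A query touching one appliance table and one Hadoop table.
query = [
    Fragment("sales", "year = 2015"),
    Fragment("clickstream", "event = 'purchase'"),
]
plan = route(query)

for store, frags in plan.items():
    print(store, [f.table for f in frags])
```

In this toy model each fragment runs where its table lives and only the partial results would need to be merged, which mirrors the "query flows to the data" principle in the text.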

Fast. Scalable. Smart. Simple. Completely integrated.

IBM PureData System for Analytics is designed specifically for running complex analytics on very large data volumes with faster execution times than competing solutions. It delivers the proven performance, scalability, intelligence, and simplicity that organizations need to leverage their data.

Fast

IBM PureData System for Analytics N3001 delivers a performance advantage over other analytic options. This comes from its unique asymmetric massively parallel processing (AMPP) architecture that combines open IBM blade servers and disk storage with IBM's patented, hardware-accelerated data filtering, using field programmable gate arrays (FPGAs). This combination delivers fast query performance on analytic workloads supporting thousands of business intelligence and data warehouse users, providing sophisticated analytics for satisfying business requirements.

Scalable

With the IBM PureData System for Analytics solution, organizations can deploy the right-sized environments for their data volumes and workloads, and be confident that as data volumes grow, larger systems can be deployed quickly and easily. The IBM PureData System for Analytics N3001 family of seven different configurations starts with a data capacity of 16 TB (new N3001-001) and can grow to well over a petabyte for an eight-rack system (new N3001-080), assuming a 4X compression rate.

IBM PureData System for Analytics provides near linear performance scalability as the size of the appliance grows, which means that organizations can pick the appropriately sized appliance to meet both their data volume and performance requirements. This is accomplished with predictable, scalable performance, with no need to add significant resources to manage and maintain the appliance as data volumes grow.

Smart

IBM PureData System for Analytics dramatically simplifies analytics by consolidating all analytic activity in one place, where the data resides. Moving analytics to the IBM PureData System is straightforward with IBM's embedded analytic platform. With support for PMML 4.0 models, data modelers and quantitative teams can operate on the data directly inside the appliance instead of having to offload massive data volumes to a separate infrastructure, and then have to deal with the associated data preprocessing, transformation, and movement.

Data scientists can build their models using all the enterprise data, and then iterate through different models much faster to arrive at the best solution. Once the model is developed, it can be seamlessly executed against the relevant data in the appliance. Prediction and scoring can be done where the data resides. Users can get their predictive scores in near real-time, helping operationalize advanced analytics and making it available throughout the enterprise.

Included with every PureData System for Analytics system is IBM Netezza Analytics software. IBM Netezza Analytics offers a built-in analytical infrastructure and extensive library of statistical and mathematical functions, supporting a breadth of analytic tools and programming languages, including Open Source R. It is delivered with a library of more than 200 prebuilt, scalable, in-database analytic functions that execute analytics in parallel while abstracting away the complexity of parallel programming from the developers, users and DBAs.

The Netezza Analytics functionality also includes in-database geospatial analytics that are compatible with the industry standard ESRI GIS formats. This enables easy integration with existing geospatial analytic environments. In addition, if models are developed using SPSS Modeler or SAS, IBM Netezza Analytics will accelerate the development and scoring of these models.

The IBM PureData System for Analytics N3001 brings advanced security to your data in this insecure world. Building on the appliance simplicity model, all data is stored on self-encrypting disk (SED) drives, providing security while not impacting performance. The protection provided by the SED implementation supports the leading industries in security compliance: health care, government, and the financial sectors. This system utilizes strong authentication that prevents threats due to unauthorized access, based on the industry-standard Kerberos protocol.

Simple and completely integrated

IBM PureData System for Analytics is architected for high availability. All components are internally redundant, and the failure of a processing node (S-Blade) causes no significant performance degradation, for a robust, production-ready environment from the moment the appliance is installed in your data center.

The IBM PureData System for Analytics N3001 also offers a great value bundle as complementary software licenses to use in conjunction with the appliance. Data movement, reporting, analytic tools, and Hadoop licenses make for a full service offering.

Included software entitlements:
- IBM Cognos Business Intelligence: five Analytics User licenses, one Analytics Administrator license.
- IBM DataStage (280 PVUs): 2 concurrent Designer Client licenses and IBM InfoSphere Data Click (with PureData System for Analytics as a source or target).
- IBM BigInsights for Apache Hadoop: software licenses to manage around 100 TB of Hadoop data.
- Two non-production user licenses for the IBM InfoSphere Streams Developer Edition.

All of these new features are delivered with the same simplicity and ease of use that distinguish all IBM PureSystems family offerings and that set the IBM PureData System for Analytics apart.

IBM eliminates complexity at every step so you can redirect valuable resources to initiatives that will positively impact the bottom line.

The best value

IBM PureData System for Analytics is a cost-effective analytics option. It requires minimal ongoing administration or tuning, minimizing internal resources as well as implementation costs, for an extremely low total cost of ownership. The performance and scalability of IBM appliances are available immediately, without requiring tuning, indexing, or aggregated tables.

IBM offers your company fast time-to-value for important BI and analytic initiatives. Your organization is armed with more accurate intelligence to react quickly and accurately to opportunities and risks as they present themselves.

As an appliance, the integration of hardware, software and storage is done for you, leading to shorter deployment cycles and industry-leading time-to-value for business intelligence and analytic initiatives. The appliance is delivered ready-to-go for immediate data loading and query execution. The appliance integrates with leading ETL, BI and analytic applications through standard ODBC, JDBC and OLE DB interfaces.

At a time when companies need flexibility to react to changing market conditions and growing analytic demands, an uncomplicated, easy-to-maintain system that runs fast and analyzes your growing data volumes makes sense.

Included with every system, the PureData System for Analytics Performance Portal provides a web-based GUI that helps administrators monitor and manage hardware, administer database objects, configure workload management, view active sessions and monitor system resource utilization for capacity planning. The portal provides a consolidated administrative interface supporting PureData Systems for Analytics from one easy-to-use access point.

The original data warehouse appliance

Patented hardware acceleration

IBM PureData System for Analytics N3001 adheres to IBM's basic principle of moving processing to the data, not moving the data to the processors. Each IBM PureData System for Analytics contains multiple snippet blades, or S-Blades, where SQL query code segments (or "snippets") and complex analytic processes are executed. The S-Blades are intelligent processing nodes that make up the massively parallel processing appliance engine. Each S-Blade is an independent server that contains multi-core Intel CPUs, IBM's unique FPGAs, and gigabytes of RAM. Additionally, dedicated storage devices work concurrently with the blades to deliver peak performance.
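The S-Blade description in this section (query snippets execute on the nodes that hold the data, and only results travel back) can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Netezza code: the function names `run_snippet` and `merge`, the row layout and the sample data are all hypothetical, and the blades are simulated sequentially rather than in parallel.

```python
def run_snippet(rows, min_amount):
    """Filter rows where they are stored and return a partial (sum, count).

    Stands in for the work one S-Blade does on its own slice of the data.
    """
    kept = [r["amount"] for r in rows if r["amount"] >= min_amount]
    return sum(kept), len(kept)


def merge(partials):
    """Combine the small partial results returned by every blade."""
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total, count


# Each inner list stands in for the rows held by one S-Blade (made-up data).
slices = [
    [{"amount": 10}, {"amount": 250}],
    [{"amount": 400}, {"amount": 5}],
    [{"amount": 120}],
]

partials = [run_snippet(s, min_amount=100) for s in slices]
total, count = merge(partials)
print(total, count)
```

The point of the sketch is the shape of the data flow: the filter runs next to the data, and only one small tuple per slice is passed on for merging.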

Software

Database: IBM Netezza Platform Software (NPS) v7.2 or greater
Operating system: Red Hat Linux Advanced Server 6.5
IBM Fluid Query 1.0: Run PureData System for Analytics queries against Hadoop data; move PureData System for Analytics data quickly to Hadoop file systems, and quickly move Hadoop data to PureData System for Analytics
Supported APIs: SQL, OLE DB, ODBC 3.5, JDBC 3.0 Type 4
SQL standards: SQL-92 compliant, with SQL-99 extensions
Programming languages: Java, Python, Open Source R, R3, Fortran, C/C++, Perl, Lua
Netezza Analytics foundation: In-Database Analytics, Open Source R, R2, Matrix, MapReduce, Geospatial technology with ESRI support
Additional tools: Windows and web-based DB Admin GUI; CLI and high-speed loading/unloading for IBM AIX, HP-UX, Linux, Solaris and Windows
High-speed load/unload: Interoperable with ETL and EAI tools at rates of 10 TB/hour
Backup and restore: Interoperable with IBM Tivoli, EMC Legato and Symantec Netbackup
Database portability: From IBM DB2, IBM Informix, Microsoft SQL Server, MySQL, Oracle Database, Red Brick, Sybase IQ, Teradata

Included software entitlements

Business intelligence: IBM Cognos Business Intelligence (see footnote 4): five analytics user licenses plus one analytics administrator license; the IBM PureData System for Analytics N3001 must be the data source
Data integration: IBM InfoSphere DataStage (see footnote 4) (280 PVUs), Designer Client (two concurrent users) and InfoSphere Data Click; all must work with the IBM PureData System for Analytics N3001 as the data source or target
Hadoop data services: IBM BigInsights for Apache Hadoop (see footnote 4): five virtual servers to manage 100 TB of Hadoop data; IBM PureData System for Analytics appliances must be a source or target (see footnote 2)
Real-time analytics: IBM InfoSphere Streams Developer Edition (see footnote 4): developer license, two users, not for production use; must work with IBM PureData System for Analytics N3001 appliances

The IBM PureData System for Analytics is supported by a wide range of market-leading business partners, including complementary technology partners, resellers, systems integrators, and service providers. For a complete list, or to find out if a particular company or solution is part of our program, please contact your IBM representative.

For clients interested in a smaller entry-point model, please review the IBM PureData System for Analytics N3001-001 data sheet.

Specifications

IBM PureData System for Analytics
                                      Single rack systems                Multiple rack systems
Model                                 N3001-002  N3001-005  N3001-010    N3001-020  N3001-040  N3001-080
Racks                                 1          1          1            2          4          8
Active S-Blades                       2          4          7            14         28         56
CPU cores                             40         80         140          280        560        1,120
FPGA cores                            32         64         112          224        448        896
User data in TB (assumes 4X
compression)                          32         96         192          384        768        1,536
Power (Watts maximum)/rack            3,200      4,200      7,600        7,600      7,600      7,600
Cooling - BTU/hour                    11,000     14,400     27,000       54,000     108,000    216,000
Rack weight Kg                        620        771        907          907        907        907
Height/rack cm                        202        202        202          202        202        202
Depth/rack cm                         110        110        110          110        110        110
Width/rack cm                         64.8       64.8       64.8         64.8       64.8       64.8

Power (all models: 200-240 V, 50 Hz/60 Hz, single phase)
N3001-002: 24A
N3001-005: 24A
N3001-010: 2X 24A
N3001-020: 2X 24A, per rack
N3001-040: 2X 24A, per rack
N3001-080: 2X 24A, per rack

FCC Part 15, ICES-003, AUS/NZ CISPR 22, VCCI and EN55022 Class A; European immunity: EN55024
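The specification figures above lend themselves to a quick consistency check. The following Python sketch copies the rack, blade, core and capacity values from the specifications and derives cores per S-Blade and the capacity implied before compression. Dividing user data by the 4X factor to get an "implied uncompressed" figure is an interpretation made here for illustration, not a number stated in the data sheet, and the names `MODELS`, `COMPRESSION` and `derived` are hypothetical.

```python
# Values copied from the specifications; dictionary layout is illustrative.
MODELS = {
    "N3001-002": {"racks": 1, "s_blades": 2, "cpu_cores": 40, "fpga_cores": 32, "user_tb": 32},
    "N3001-005": {"racks": 1, "s_blades": 4, "cpu_cores": 80, "fpga_cores": 64, "user_tb": 96},
    "N3001-010": {"racks": 1, "s_blades": 7, "cpu_cores": 140, "fpga_cores": 112, "user_tb": 192},
    "N3001-020": {"racks": 2, "s_blades": 14, "cpu_cores": 280, "fpga_cores": 224, "user_tb": 384},
    "N3001-040": {"racks": 4, "s_blades": 28, "cpu_cores": 560, "fpga_cores": 448, "user_tb": 768},
    "N3001-080": {"racks": 8, "s_blades": 56, "cpu_cores": 1120, "fpga_cores": 896, "user_tb": 1536},
}

# The data sheet states user data "assumes 4X compression".
COMPRESSION = 4

derived = {}
for name, m in MODELS.items():
    derived[name] = {
        "cpu_cores_per_blade": m["cpu_cores"] / m["s_blades"],
        "fpga_cores_per_blade": m["fpga_cores"] / m["s_blades"],
        # Assumption: capacity before compression is user data / 4.
        "implied_uncompressed_tb": m["user_tb"] / COMPRESSION,
    }

for name, d in derived.items():
    print(name, d)
```

Under these assumptions every model works out to the same number of CPU cores and FPGA cores per active S-Blade, which is consistent with the near linear scaling the data sheet describes.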

About IBM PureData System for Analytics

The IBM PureData System for Analytics, powered by Netezza technology, integrates database, server and storage into a single, easy-to-manage appliance that requires minimal setup and ongoing administration while producing faster and more consistent analytic performance. The IBM PureData System for Analytics simplifies business analytics dramatically by consolidating all analytic activity in the appliance, right where the data resides, for industry-leading performance. Visit: ibm.com/PureData to see how our family of expert integrated systems eliminates complexity at every step and helps you drive true business value for your organization.

The IBM PureData System for Analytics, powered by Netezza technology, integrates database, server and storage into a single, easy-to-manage appliance that requires minimal setup and ongoing administration while producing faster and more consistent analytic performance. The IBM PureData System for Analytics simplifies business analytics dramatically by consolidating all analytic activity in the appliance, where the data resides. Visit: ibm.com/PureSystems to see how our family of expert integrated systems eliminates complexity and helps you drive true business value for your organization.

About IBM Data Warehousing and Analytics Solutions

IBM provides the most comprehensive portfolio of data warehousing, information management and business analytic software, hardware and solutions to help clients maximize the value of their information assets and discover new insights to make better and faster decisions and optimize their business outcomes.

Why IBM?

IBM PureSystems offerings combine the flexibility of a general purpose system, the elasticity of cloud and the simplicity of an appliance. They are integrated by design and come with built-in expertise gained from decades of experience to deliver a simplified IT experience. Members of the PureSystems family include: IBM PureFlex System, IBM PureApplication System, IBM PureData System for Transactions, IBM PureData System for Operational Analytics and IBM PureData System for Analytics.

For more information

Help IT make the shift to the strategic center of your business. Leverage proven expertise to take the lead. To learn more about IBM PureSystems and the PureData System for Analytics, contact your IBM representative or IBM Business Partner, or visit the following website: ibm.com/PureSystems/PureData

Additionally, IBM Global Financing can help you acquire the software capabilities that your business needs in the most cost-effective and strategic way possible. We'll partner with credit-qualified clients to customize a financing solution to suit your business and development goals, enable effective cash management, and improve your total cost of ownership. Fund your critical IT investment and propel your business forward with IBM Global Financing. For more information, visit: ibm.com/financing

Copyright IBM Corporation 2015

IBM Corporation
New Orchard Road
Armonk, NY 10504

Produced in the United States of America
April 2015

IBM, the IBM logo, ibm.com, Tivoli, DB2, Informix, AIX, PureSystems, PureFlex, and PureApplication are trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at www.ibm.com/legal/copytrade.shtml.

Netezza is a registered trademark of IBM International Group B.V., an IBM Company.

Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.

Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.

Java and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or its affiliates.

Microsoft, Windows and Windows NT are trademarks of Microsoft Corporation in the United States, other countries, or both.

This document is current as of the initial date of publication and may be changed by IBM at any time. Not all offerings are available in every country in which IBM operates.

The performance data discussed herein is presented as derived under specific operating conditions. Actual results may vary. It is the user's responsibility to evaluate and verify the operation of any other products or programs with IBM products and programs.

THE INFORMATION IN THIS DOCUMENT IS PROVIDED "AS IS" WITHOUT ANY WARRANTY, EXPRESS OR IMPLIED, INCLUDING WITHOUT ANY WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND ANY WARRANTY OR CONDITION OF NON-INFRINGEMENT. IBM products are warranted according to the terms and conditions of the agreements under which they are provided.

Statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only.

Actual available storage capacity may be reported for both uncompressed and compressed data and will vary and may be less than stated.

1 Based on reported results from IBM customers, "traditional custom systems" refers to systems that are not professionally prebuilt, pretested, and optimized. Individual results may vary.

2 Based on 4 data nodes + 1 master node. 12 TB uncompressed per data node with 4 TB drives. 12 TB x 4 nodes = 48 TB uncompressed. Using 2-2.5x compression yields 96-120 TB compressed data. Capacity will depend on hardware configuration selected.

3 IBM PureData System for Analytics supports both open source R and Revolution R Enterprise. Open source R is available from IBM developerWorks: ibm.com/developerworks
Revolution R Enterprise for IBM PureData System for Analytics is available for additional purchase from Revolution Analytics.

4 Please refer to IBM for the specific software version of the entitled products included: ibm.biz/N3001 license

Please Recycle

WAD12369-USEN-03
