Using Oracle Big Data Cloud


Oracle Cloud
Using Oracle Big Data Cloud
E70336-24
September 2019

Oracle Cloud Using Oracle Big Data Cloud, E70336-24

Copyright 2017, 2019, Oracle and/or its affiliates. All rights reserved.

This software and related documentation are provided under a license agreement containing restrictions on use and disclosure and are protected by intellectual property laws. Except as expressly permitted in your license agreement or allowed by law, you may not use, copy, reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform, publish, or display any part, in any form, or by any means. Reverse engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is prohibited.

The information contained herein is subject to change without notice and is not warranted to be error-free. If you find any errors, please report them to us in writing.

If this is software or related documentation that is delivered to the U.S. Government or anyone licensing it on behalf of the U.S. Government, then the following notice is applicable:

U.S. GOVERNMENT END USERS: Oracle programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation, delivered to U.S. Government end users are "commercial computer software" pursuant to the applicable Federal Acquisition Regulation and agency-specific supplemental regulations. As such, use, duplication, disclosure, modification, and adaptation of the programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation, shall be subject to license terms and license restrictions applicable to the programs. No other rights are granted to the U.S. Government.

This software or hardware is developed for general use in a variety of information management applications. It is not developed or intended for use in any inherently dangerous applications, including applications that may create a risk of personal injury. If you use this software or hardware in dangerous applications, then you shall be responsible to take all appropriate fail-safe, backup, redundancy, and other measures to ensure its safe use. Oracle Corporation and its affiliates disclaim any liability for any damages caused by use of this software or hardware in dangerous applications.

Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.

Intel and Intel Xeon are trademarks or registered trademarks of Intel Corporation. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. AMD, Opteron, the AMD logo, and the AMD Opteron logo are trademarks or registered trademarks of Advanced Micro Devices. UNIX is a registered trademark of The Open Group.

This software or hardware and documentation may provide access to or information about content, products, and services from third parties. Oracle Corporation and its affiliates are not responsible for and expressly disclaim all warranties of any kind with respect to third-party content, products, and services unless otherwise set forth in an applicable agreement between you and Oracle. Oracle Corporation and its affiliates will not be responsible for any loss, costs, or damages incurred due to your access to or use of third-party content, products, or services, except as set forth in an applicable agreement between you and Oracle.

Contents

Preface
  Audience
  Related Resources
  Conventions

1 Get Started with Big Data Cloud
  About Big Data Cloud
  Before You Begin with Big Data Cloud
  How to Begin with Big Data Cloud Subscriptions
  About Big Data Cloud Roles and Users
  Typical Workflow for Big Data Cloud
  About Big Data Cloud Clusters on Oracle Cloud Infrastructure
  About Installing Additional Software

2 Access Big Data Cloud
  Access the Service Console for Big Data Cloud
  Access the Big Data Cloud Console
  Access Big Data Cloud Using the REST API
  Access Big Data Cloud Using the CLI
  Access Big Data Cloud Using Ambari
  About Accessing Thrift
  Connect to a Cluster Node Through Secure Shell (SSH)
    Connect to a Node by Using SSH on UNIX
    Connect to a Node by Using PuTTY on Windows

3 Manage the Life Cycle of Big Data Cloud
  About Cluster Topology
  Cluster Components
  Cluster Extensions
  Create a Cluster
  Create a Cluster with Oracle Cloud Stack
  View All Clusters
  View Details for a Cluster
  View Activities for Clusters
  View Cluster Component Status
  Monitor the Health of a Cluster
  Scale a Cluster Out
  Scale a Cluster In
  Stop, Start, and Restart a Cluster
  Delete a Cluster
  Stop, Start, and Restart a Node
  Manage Tags
    Create, Assign, and Unassign Tags
    Find Tags and Instances Using Search Expressions

4 Use Identity Cloud Service for Cluster Authentication
  About Cluster Authentication
  Connect to Identity Cloud Service from the Service Console
  Add Identity Cloud Service Users for Clusters
  Make REST API Calls to Clusters That Use Identity Cloud Service
  Update the Identity Cloud Service Password for Big Data Cloud

5 Manage Network Access
  About Network Access
  Enable Access Rules
  Create Access Rules
  Generate a Secure Shell (SSH) Public/Private Key Pair
    Generate an SSH Key Pair on UNIX and UNIX-Like Platforms Using the ssh-keygen Utility
    Generate an SSH Key Pair on Windows Using the PuTTYgen Program
  System Properties of Big Data Cloud

6 Patch Big Data Cloud
  About Operating System Patching
  View Available Patches
  Check Patch Prerequisites
  Apply a Patch
  Roll Back a Patch or Failed Patch

7 Manage Credentials
  Change the Cluster Password
  Replace the SSH Keys for a Cluster
  Update Cloud Storage Credentials
  Use the Cluster Credential Store
  Manage Certificates Used for the Cluster Console
  Update the Security Key for Big Data Cloud on Oracle Cloud Infrastructure

8 Manage Data
  Load Data Into Cloud Storage
  Upload Files Into HDFS
  Browse Data
  About the Big Data File System (BDFS)

9 Connect to Oracle Database
  Use the Oracle Shell for Hadoop Loaders Interface (OHSH)
    About Oracle Shell for Hadoop Loaders
    Configure Big Data Cloud for Oracle Shell for Hadoop Loaders
    Get Started with Oracle Shell for Hadoop Loaders
  Use Oracle Loader for Hadoop
    About Oracle Loader for Hadoop
    Get Started With Oracle Loader for Hadoop
  Use Copy to Hadoop
    About Copy to Hadoop
    First Look: Loading an Oracle Table Into Hive and Storing the Data in Hadoop

10 Work with Jobs
  Create a Job
  Run a Job
  About MapReduce Jobs
  Stop a Job
  View Jobs and Job Details
  View Job Logs
  Monitor and Troubleshoot Jobs
  Manage Work Queue Capacity
  Create Work Queues

11 Work with Notebook
  Create a Note in a Notebook
  Run a Note
  View and Edit a Note
  Import a Note
  Export a Note
  Delete a Note
  Organize Notes
  Manage Notebook Settings
  Interpreters Available for Big Data Cloud

12 Work with Oracle R Advanced Analytics for Hadoop (ORAAH)
  About ORAAH in Big Data Cloud
  Use ORAAH in Big Data Cloud

13 Troubleshoot Big Data Cloud
  Problems with Administering Clusters
    I get a warning that the object store credentials are out of sync
    I need to view the status of running services
    Services aren't being restarted properly after life cycle operations
    I need to modify the Ambari Web inactivity timeout
    I need to control the Ambari-agent service
    I need to control the Ambari-server service
  Problems with Patching and Rollback
    I can't apply a patch
    Patching fails due to disk space

A Oracle Cloud Pages for Big Data Cloud
  Service Console: Instances Page
  Service Console Create Instance: Instance Page
  Service Console Create Instance: Service Details Page
  Service Console Create Instance: Confirmation Page
  Service Console: Activity Page
  Service Console: SSH Access Page
  Service Console: Instance Overview Page
  Service Console: Access Rules Page
  Big Data Cloud Console: Overview Page
  Big Data Cloud Console: Jobs Page
  Big Data Cloud Console New Job: Details Page
  Big Data Cloud Console New Job: Configuration Page
  Big Data Cloud Console New Job: Driver File Page
  Big Data Cloud Console New Job: Confirmation Page
  Big Data Cloud Console: Notebook Page
  Big Data Cloud Console: Data Stores Page
  Big Data Cloud Console: Status Page
  Big Data Cloud Console: Settings Page

B Customize Clusters
  About the Cluster Bootstrap Script
  Bootstrap Script Execution and Logging
  Sample Bootstrap Script
  Big Data Cloud Convenience Functions

Preface

This document describes how to administer and use Oracle Big Data Cloud and provides references to related documentation.

Topics:
- Audience
- Related Resources
- Conventions

Audience

This document is intended for users who want to quickly spin up elastic Apache Spark or Apache Hadoop clusters and use the clusters to analyze data.

Related Resources

For related information, see these Oracle resources:
- Getting Started with Oracle Cloud
- Getting Started with Oracle Platform Services in the Oracle Cloud Infrastructure documentation
- Getting Started with Object Storage Classic in Using Oracle Cloud Infrastructure Object Storage Classic
- REST API for Oracle Big Data Cloud
- REST API to Manage Oracle Big Data Cloud
- Using the Command Line Interface in PaaS Service Manager Command Line Interface Reference
- Big Data Cloud on the Oracle Cloud

Conventions

The following text conventions are used in this document:

Convention | Meaning
boldface | Boldface type indicates graphical user interface elements associated with an action, or terms defined in text or the glossary.
italic | Italic type indicates book titles, emphasis, or placeholder variables for which you supply particular values.
monospace | Monospace type indicates commands within a paragraph, URLs, code in examples, text that appears on the screen, or text that you enter.

1 Get Started with Big Data Cloud

This section describes how to get started with Oracle Big Data Cloud.

Topics
- About Big Data Cloud
- Before You Begin with Big Data Cloud
- How to Begin with Big Data Cloud Subscriptions
- About Big Data Cloud Roles and Users
- Typical Workflow for Big Data Cloud
- About Big Data Cloud Clusters on Oracle Cloud Infrastructure
- About Installing Additional Software

About Big Data Cloud

Big Data Cloud leverages Oracle's Infrastructure Cloud Services to deliver a secure, elastic, integrated platform for all Big Data workloads. You can:
- Spin up multiple Hadoop or Spark clusters in minutes
- Use built-in tools such as Apache Zeppelin to understand your data, or use the jobs API to run non-interactive jobs
- Use open interfaces to integrate third-party tools to analyze your data
- Launch multiple clusters against a centralized data lake to achieve data sharing without compromising on job isolation
- Create small clusters or extremely large ones based on workload and use cases
- Elastically scale the compute and storage tiers independently of one another, either manually or in an automated fashion
- Pause a cluster when not in use
- Use REST APIs to monitor, manage, and utilize the service

For information about the open source components used in Big Data Cloud, see Cluster Components.

Before You Begin with Big Data Cloud

Before you start using Oracle Big Data Cloud, you should be familiar with the following technologies:
- The Apache Hadoop ecosystem
- Apache Spark
- OpenStack Swift Object Storage

Before you create a cluster:
- Subscribe to Oracle Cloud Infrastructure Object Storage Classic, the persistent data lake for Big Data Cloud
- Subscribe to Oracle Big Data Cloud
- (Optional) Create an Oracle Cloud Infrastructure Object Storage Classic container for your data
- (Optional) Create a Secure Shell (SSH) public/private key pair to provide when you create a cluster

How to Begin with Big Data Cloud Subscriptions

To get started with Oracle Big Data Cloud subscriptions:
1. Sign up for a free credit promotion or purchase a subscription.
   See Request and Manage Free Oracle Cloud Promotions and Buy an Oracle Cloud Subscription in Getting Started with Oracle Cloud.
2. Access Oracle Big Data Cloud.
   See Access Big Data Cloud.

Note: Be sure to review Before You Begin with Big Data Cloud before you create your first cluster.

If you want to grant others access to Big Data Cloud, start by reviewing About Big Data Cloud Roles and Users. Then, create accounts for users and assign them appropriate privileges and roles. For instructions, see Add Users and Assign Roles in Getting Started with Oracle Cloud.

About Big Data Cloud Roles and Users

Oracle Big Data Cloud uses roles to control access to tasks and resources. A role assigned to a user gives certain privileges to that user.

In addition to the roles and privileges described in Learn About Cloud Account Roles in Getting Started with Oracle Cloud, the following role is created for Big Data Cloud: BDCSCE Administrator.

When the Big Data Cloud account is first set up, the service administrator is given the BDCSCE Administrator role. User accounts with this role must be added before anyone else can access and use the service.

A user with the BDCSCE Administrator role has complete administrative control over the service. This user can create and terminate clusters, add and delete nodes, monitor cluster health, stop and start clusters, and manage other life cycle events. In a typical workflow, the administrator spins up a cluster that users can use to do their work. When the cluster is no longer needed, the administrator terminates it.

The identity domain administrator can create more Big Data Cloud administrators by creating user accounts and assigning the role to the user. Only the identity domain administrator is allowed to create user accounts and assign roles. See Add Users and Assign Roles in Getting Started with Oracle Cloud.

Typical Workflow for Big Data Cloud

To start using Oracle Big Data Cloud, refer to the following tasks as a guide. Some of these tasks are performed only by administrators.

Task | Description | More Information
Sign up for a free credit promotion or purchase a subscription | Provide your information, and sign up for a free credit promotion or purchase a subscription to Oracle Big Data Cloud. | How to Begin with Big Data Cloud Subscriptions
Add and manage users and roles | Create accounts for your users and assign them appropriate privileges. Assign the necessary Oracle Big Data Cloud roles. | Add Users and Assign Roles in Getting Started with Oracle Cloud, and About Big Data Cloud Roles and Users
Create an SSH key pair | Create SSH public/private key pairs to facilitate secure access to all virtual machines in your service. | Generate a Secure Shell (SSH) Public/Private Key Pair
Create a cluster | Use a wizard to create a cluster. | Create a Cluster
Enable network access | Permit access to network services associated with your clusters. | About Network Access
Load data | Load the data you'll be using for your analysis. | Manage Data
Create and manage jobs | Use jobs to analyze data. | Work with Jobs
Create and manage notes | Use notes to analyze data. | Work with Notebook
Monitor clusters | Check on the health and performance of individual clusters. | Monitor the Health of a Cluster
Monitor the service | Check on the day-to-day operation of your service, monitor performance, and review important notifications. | Performing Service-Specific Tasks in Managing and Monitoring Oracle Cloud

About Big Data Cloud Clusters on Oracle Cloud Infrastructure

You can create Oracle Big Data Cloud clusters on Oracle Cloud Infrastructure and on Oracle Cloud Infrastructure Classic.

The infrastructure a cluster gets created on depends on the region you select when you create the cluster.
If you see the Availability Domain and Subnet fields when you select a region for the cluster you're creating, that means the cluster will be created on Oracle Cloud Infrastructure. Otherwise, the cluster is created on Oracle Cloud Infrastructure Classic.

To determine which infrastructure your cluster is running on after the cluster has been created, click the Instance Details icon for the cluster, and then locate the Region information. If the value is us-phoenix-1, us-ashburn-1, eu-frankfurt-1, or uk-london-1, then the instance is running on Oracle Cloud Infrastructure.

Prerequisites on Oracle Cloud Infrastructure

Oracle Big Data Cloud clusters on Oracle Cloud Infrastructure require certain networking and storage resources that you must create on Oracle Cloud Infrastructure before you create your first cluster.

To learn about these resources, see Prerequisites for Oracle Platform Services in the Oracle Cloud Infrastructure documentation.

For step-by-step instructions to create these resources, see Creating the Infrastructure Resources Required for Oracle Platform Services.

Note: Oracle Big Data Cloud uses the native Oracle Cloud Infrastructure object storage API rather than the Swift API. As such, an API signing key is required for authentication to Oracle Cloud Infrastructure Object Storage, not a Swift user name and password as described in the Prerequisites documentation above.

Differences Between Clusters on Oracle Cloud Infrastructure and Oracle Cloud Infrastructure Classic

The cluster environment on either type of infrastructure is substantially the same. A few differences exist in the underlying infrastructure components and in the supported capabilities.
Awareness of these differences will help you choose an appropriate infrastructure when creating a cluster.

The following table lists differences between Big Data Cloud clusters on Oracle Cloud Infrastructure and on Oracle Cloud Infrastructure Classic.

Feature | Oracle Cloud Infrastructure Classic | Oracle Cloud Infrastructure
Availability domains | Not applicable | Each region has multiple isolated availability domains, with separate power and cooling. The availability domains within a region are interconnected using a low-latency network. When creating a cluster, you can select the availability domain that the cluster should be placed in.
Subnets and IP networks | You can attach clusters to IP networks defined on Oracle Cloud Infrastructure Compute Classic. | You must attach each cluster to a subnet, which is a part of a virtual cloud network that you create on Oracle Cloud Infrastructure.
Compute shapes | Standard and high memory shapes. The list of available shapes may vary by region. For information about shapes, see About Shapes in Using Oracle Cloud Infrastructure Compute Classic. | VM.Standard and BM.Standard shapes. The list of available shapes may vary by region. For information about shapes, see Overview of the Compute Service in the Oracle Cloud Infrastructure documentation.
IP reservations | Not supported | Not supported
Network access to clusters | Use the Oracle Big Data Cloud interfaces to configure access rules. Note that these access rules prohibit access by default (with the exception of SSH access on port 22), and you must enable them to provide access to other ports. | Use Oracle Cloud Infrastructure interfaces to configure security rules.
Scaling clusters | Supported | Not supported. You cannot scale the shape of a cluster's compute nodes; you can scale only the storage. The minimum size of a new storage volume on Oracle Cloud Infrastructure is 50 GB.
Using Oracle Identity Cloud Service to control access to applications deployed on the cluster | In accounts that use Oracle Identity Cloud Service, while creating a cluster, you can enable Oracle Identity Cloud Service as the identity provider for applications deployed on the cluster. | Not supported
Load balancer options | While creating a cluster, if you enable Oracle Identity Cloud Service as the identity provider, an Oracle-managed load balancer is created and configured automatically for the cluster. If you don't enable Oracle Identity Cloud Service, then you can use Oracle Traffic Director. | Uses a custom load balancer.
Object storage | You can create the object storage container either before or during cluster creation. | You must create the object storage bucket on Oracle Cloud Infrastructure before creating the cluster.

About Installing Additional Software

You can install additional software on Oracle Big Data Cloud, but do so at your own risk. Certain software installations can affect the proper functioning of the service.

Note the following:
- Using Ambari to install and manage additional services will cause patching to fail.
- Changing the default Python version can have adverse effects on lifecycle operations such as start, stop, restart, scale-in, scale-out, and patching.
- If you choose to install third-party products, they should be installed on edge nodes and not directly in a Big Data Cloud cluster.
- You are responsible for the maintenance, operation, and support of any additional software you install on Big Data Cloud.
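The Region check described in About Big Data Cloud Clusters on Oracle Cloud Infrastructure can be sketched in a few lines of Python. This is an illustration only, not part of the service: the function name infrastructure_for_region is hypothetical, the region list contains only the four values named in this chapter (so it may be incomplete), and treating every other value as Oracle Cloud Infrastructure Classic is an assumption.

```python
# Region values this chapter names as running on Oracle Cloud Infrastructure.
# Assumption: copied from the text; the real list may be longer.
OCI_REGIONS = frozenset(
    {"us-phoenix-1", "us-ashburn-1", "eu-frankfurt-1", "uk-london-1"}
)


def infrastructure_for_region(region: str) -> str:
    """Return the infrastructure implied by a cluster's Region value.

    Hypothetical helper: any region not in OCI_REGIONS is assumed to be
    Oracle Cloud Infrastructure Classic.
    """
    if region.strip().lower() in OCI_REGIONS:
        return "Oracle Cloud Infrastructure"
    return "Oracle Cloud Infrastructure Classic"


if __name__ == "__main__":
    # "example-classic-region" is a placeholder, not a real region name.
    for region in ("us-ashburn-1", "example-classic-region"):
        print(region, "->", infrastructure_for_region(region))
```

The Region value shown on the Instance Details page remains the authoritative source. The sketch only mirrors the rule stated in the text.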

2 Access Big Data Cloud

This section describes how to access the consoles and interfaces available for Oracle Big Data Cloud.

Topics
- Access the Service Console for Big Data Cloud
- Access the Big Data Cloud Console
- Access Big Data Cloud Using the REST API
- Access Big Data Cloud Using the CLI
- Access Big Data Cloud Using Ambari
- About Accessing Thrift
- Connect to a Cluster Node Through Secure Shell (SSH)

Access the Service Console for Big Data Cloud

Oracle Big Data Cloud can be accessed through a web console. Access to this console is limited to administrators.

To access the service console for Oracle Big Data Cloud:
1. Sign in to Oracle Cloud.
   If you received a welcome email, use it to identify the URL, your user name, and your temporary password. After signing in, you'll be prompted to change your password.
2. From the Infrastructure Console, click the navigation menu in the top left corner, expand Classic Data Management Services, and then click Big Data Compute Edition.
   The service console opens on the Instances page. For information about the details on the page, see Service Console: Instances Page. If this is the first time Oracle Big Data Cloud has been accessed for the account, a Welcome page is displayed.

Access the Big Data Cloud Console

Clusters in Oracle Big Data Cloud can be accessed through a web-based console. The Big Data Cloud Console (also referred to as the cluster console in this document) is used to create, terminate, monitor, and manage Apache Spark jobs; create and manage notes and notebooks; browse Hadoop Distributed File System (HDFS) and Cloud storage; and manage work queue configurations.

After administrators create a cluster, they give users the information they need to connect to the cluster console. Administrators also provide information about the Oracle Cloud Infrastructure Object Storage Classic container associated with the cluster when the cluster was created. Oracle Cloud Infrastructure Object Storage Classic is the persistent data lake for Big Data Cloud and is typically where the data used for analysis is stored. Job logs are also stored there.

The cluster console can be accessed in several different ways, depending on whether you have administrator privileges, and whether the cluster uses Basic authentication or uses Oracle Identity Cloud Service (IDCS) for authentication.

Access the Big Data Cloud Console - Administrators

To access the cluster as an administrator:
1. Open the service console. See Access the Service Console for Big Data Cloud.
2. From the menu for the cluster you want to access, select Big Data Cloud Console and log in with the appropriate credentials:
   - For clusters that use HTTP Basic authentication, log in with the administrative user name and password specified for the cluster when the cluster was created.
   - For clusters that use IDCS for authentication, log in with your existing IDCS user name and password. When IDCS is enabled as the authentication mechanism for a cluster, anyone who can authenticate to IDCS can log in and access all cluster services.

After the cluster console opens, make note of the URL. This is the URL you'll provide to users who need to access the cluster. The URL and connection information differ depending on whether the cluster uses Basic authentication or IDCS for authentication.

For clusters that use Basic authentication, provide users with the cluster URL and with the credentials specified for the cluster when the cluster was created. The URL is in the form of https://address:1080/, where address is the public IP address of the MASTER-1 node on the cluster. Note that the console can be accessed on port 1080 on all master nodes in a cluster.
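As a small illustration of the https://address:1080/ form for Basic authentication clusters, the following Python sketch builds one candidate console URL per master node. The function name console_urls is hypothetical, and the IP addresses are placeholders from a documentation-only range. Substitute the public IP addresses shown for your own cluster.

```python
import ipaddress

CONSOLE_PORT = 1080  # console port named in the text above


def console_urls(master_ips, port=CONSOLE_PORT):
    """Build candidate cluster console URLs, one per master node.

    master_ips: public IPv4 addresses as strings, MASTER-1 first.
    Raises ValueError if an entry is not a valid IPv4 address.
    """
    urls = []
    for ip in master_ips:
        address = ipaddress.IPv4Address(ip)  # validates the address
        urls.append(f"https://{address}:{port}/")
    return urls


if __name__ == "__main__":
    # Placeholder addresses (TEST-NET-3), not real cluster nodes.
    for url in console_urls(["203.0.113.10", "203.0.113.11"]):
        print(url)
```

Listing MASTER-1 first keeps the preferred address at the front, with the remaining master nodes available as alternatives on the same port.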
If you can't access the console on the MASTER-1 node, try accessing it on another master node.

For clusters that use IDCS for authentication, provide users with just the cluster URL. The URL is in the form of https://cluster name-load balancing server URI, where cluster name is the name of the cluster, and load balancing server URI is the URI assigned to the cluster by the load balancing service. Because authentication is managed by IDCS, you won't send a user name and password. Users will log in to the cluster using their own IDCS credentials, which should have already been provisioned before you send the cluster URL. For information about adding users, see Add Identity Cloud Service Users for Clusters.

For both cluster types, also provide users with the URL and credentials for the Oracle Cloud Infrastructure Object Storage Classic container that was associated with the cluster when the cluster was created.

Access the Big Data Cloud Console - Cluster Users

To access the cluster if you are not an administrator:
1. Obtain the information you need from your administrator:
   - For clusters that use Basic authentication, the administrator will give you the cluster URL and the user name and password for the cluster.
   - For clusters that use IDCS for authentication, the administrator will give you just the cluster URL. An administrator should have already added you as a user in IDCS, and you should have received an email with your IDCS login information. You'll log in with your IDCS user name and password.
2. Access the cluster URL in your browser and log in when prompted:
   - For clusters that use Basic authentication, you're presented with a basic login dialog. Log in with the user name and password provided by your administrator.
   - For clusters that use IDCS for authentication, you're presented with the Identity Cloud Service login screen. Log in with your IDCS user name and password. All that's required to access an IDCS-enabled cluster is a valid IDCS account.
   The Big Data Cloud Console opens.

Administrators should also give you the URL and credentials for the Oracle Cloud Infrastructure Object Storage Classic container associated with the cluster when the cluster was created.

Access Big Data Cloud Using the REST API

You can use the REST API to create and manage Oracle Big Data Cloud clusters and perform many other tasks you can perform using the web-based consoles. See:
- REST API for Oracle Big Data Cloud
- REST API to Manage Oracle Big Data Cloud

You can also access the API Catalog for Big Data Cloud from the user name menu in the Big Data Cloud Console. See Access the Big Data Cloud Console.

Access Big Data Cloud Using the CLI

You can use a command line interface (CLI) to create and manage Oracle Big Data Cloud clusters and perform many other tasks you can perform using the web-based consoles.

The Oracle PaaS Service Manager (PSM) CLI enables you to manage the lifecycle of various services in Oracle Public Cloud, including Big Data Cloud. See Using the Command Line Interface in PaaS Service Manager Command Line Interface Reference.

Access Big Data Cloud Using Ambari

This topic does not apply to Oracle Cloud Infrastructure. On Oracle Cloud Infrastructure, the Ambari port is already accessible and nothing else needs to be done.

You can use Apache Ambari to access and manage Oracle Big Data Cloud clusters. While Ambari isn't needed for normal operations with the cluster, it's useful to open Ambari access to help with troubleshooting and certain administrative actions.

To access a cluster using Ambari, you enable an access rule to open the port for Ambari, and then use the Ambari URL:
1. Open the service console. See Access the Service Console for Big Data Cloud.
2. From the menu for the cluster you want to access using Ambari, select Access Rules. Access rules control which ports can be accessed on the VMs that are part of a cluster.
3. In the list of access rules, find the Ambari REST rule, which is associated with port 8080, the port that needs to be open.
4. From the menu for the Ambari REST rule, select Enable.
   The Enable Access Rule window is displayed.
5. Select Enable.
   The Enable Access Rule window closes and the rule is displayed as enabled in the list of rules. The given port on the cluster is opened to the public internet.
6. After the rule is enabled, click the link for the cluster at the top of the page to return to the cluster overview page.
7. On the cluster overview page, under Resources, find the MASTER-1 host, copy the Public IP address, and paste it into your browser address bar, adding port 8080 if necessary. For example, https://Public IP address:8080/. You must use https or you won't be able to connect.
8. If you're prompted for credentials, enter the user name and password specified for the cluster when the cluster was created.

You should now be connected to the Ambari management console on the cluster.

For information about using Ambari to upload files into HDFS, see Upload Files Into HDFS. For general information about using Ambari, see the Ambari 2.4 documentation.

About Accessing Thrift

Oracle Big Data Cloud deploys two Thrift servers to provide JDBC connectivity to Hive and Spark: Spark Thrift Server and Hive Thrift Server.

JDBC clients can connect to Hive or Spark serve
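Returning to step 7 of Access Big Data Cloud Using Ambari, the two requirements on the Ambari address (the https scheme and port 8080) can be checked before you paste it into a browser. The Python sketch below is a hypothetical helper, not an Oracle tool: the name check_ambari_url is invented for illustration, the IP address is a documentation placeholder, and the check always expects an explicit port, which is stricter than the "if necessary" wording in the step.

```python
from urllib.parse import urlsplit

AMBARI_PORT = 8080  # port opened by the Ambari REST access rule


def check_ambari_url(url: str) -> list:
    """Return a list of problems found in an Ambari URL (empty if none)."""
    parts = urlsplit(url)
    problems = []
    if parts.scheme != "https":
        problems.append("use https")
    if not parts.hostname:
        problems.append("missing host")
    if parts.port != AMBARI_PORT:
        problems.append(f"expected port {AMBARI_PORT}")
    return problems


if __name__ == "__main__":
    # 203.0.113.10 is a placeholder for the MASTER-1 public IP address.
    for candidate in ("https://203.0.113.10:8080/", "http://203.0.113.10:8080/"):
        print(candidate, check_ambari_url(candidate) or "looks fine")
```

A passing check only confirms the shape of the URL. It does not confirm that the Ambari REST access rule has been enabled or that the cluster is reachable.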
