NANODEGREE PROGRAM SYLLABUS Data Analyst


Overview

This program prepares you for a career as a data analyst by helping you learn to organize data, uncover patterns and insights, draw meaningful conclusions, and clearly communicate critical findings. You’ll develop proficiency in Python and its data analysis libraries (NumPy, pandas, Matplotlib) and SQL as you build a portfolio of projects to showcase in your job search.

The time required varies depending on how quickly you work through the material. We have included an hourly estimate for each section of the program. The program covers one term of three months (approx. 13 weeks). If you spend about 10 hours per week working through the program, you should finish the term within 13 weeks. Students will have an additional four weeks beyond the end of the term to complete all projects.

In order to succeed in this program, we recommend having experience working with data in Python (NumPy and pandas) and SQL.

Estimated Time: 4 months at 10 hrs/week
Prerequisites: Python & SQL
Flexible Learning: Self-paced, so you can learn on the schedule that works best for you
Technical Mentor Support: Our knowledgeable mentors guide your learning and are focused on answering your questions, motivating you, and keeping you on track

Course 1: Introduction to Data Analysis

Learn the data analysis process of wrangling, exploring, analyzing, and communicating data. Work with data in Python, using libraries like NumPy and pandas.

Course Project: Explore Weather Trends
This project will introduce you to SQL and how to download data from a database. You’ll analyze local and global temperature data and compare the temperature trends where you live to overall global temperature trends.

Course Project: Investigate a Dataset
In this project, you’ll choose one of Udacity’s curated datasets and investigate it using NumPy and pandas. You’ll complete the entire data analysis process, starting by posing a question and finishing by sharing your findings.

LEARNING OUTCOMES

LESSON ONE: Anaconda
- Learn to use Anaconda to manage packages and environments for use with Python.

LESSON TWO: Jupyter Notebooks
- Learn to use this open-source web application to combine explanatory text, math equations, code, and visualizations in one sharable document.

LESSON THREE: Data Analysis Process
- Learn about the key steps of the data analysis process.
- Investigate multiple datasets using Python and pandas.

LESSON FOUR: Pandas and NumPy: Case Study 1
- Perform the entire data analysis process on a dataset.
- Learn to use NumPy and pandas to wrangle, explore, analyze, and visualize data.

LESSON FIVE: Pandas and NumPy: Case Study 2
- Perform the entire data analysis process on a dataset.
- Learn more about using NumPy and pandas to wrangle, explore, analyze, and visualize data.

LESSON SIX: Programming Workflow for Data Analysis
- Learn how to carry out analysis outside the Jupyter Notebook using IPython or the command line interface.

Course 2: Practical Statistics

Learn how to apply inferential statistics and probability to real-world scenarios, such as analyzing A/B tests and building supervised learning models.

Course Project: Analyze Experiment Results
In this project, you will be provided a dataset reflecting data collected from an experiment. You’ll use statistical techniques to answer questions about the data and present your conclusions and recommendations in a report.

LEARNING OUTCOMES

LESSON ONE: Simpson’s Paradox
- Examine a case study to learn about Simpson’s Paradox.

LESSON TWO: Probability
- Learn the fundamental rules of probability.

LESSON THREE: Binomial Distribution
- Learn about the binomial distribution, where each observation represents one of two outcomes.
- Derive the probability of a binomial distribution.

LESSON FOUR: Conditional Probability
- Learn about conditional probability, i.e., when events are not independent.

LESSON FIVE: Bayes Rule
- Build on conditional probability principles to understand Bayes’ rule.
- Derive Bayes’ theorem.

LESSON SIX: Standardizing
- Convert distributions into the standard normal distribution using the Z-score.
- Compute proportions using standardized distributions.

LESSON SEVEN: Sampling Distributions and Central Limit Theorem
- Use normal distributions to compute probabilities.
- Use the Z-table to look up the proportions of observations above, below, or in between values.

LESSON EIGHT: Confidence Intervals
- Estimate population parameters from sample statistics using confidence intervals.

LESSON NINE: Hypothesis Testing
- Use critical values to make decisions on whether or not a treatment has changed the value of a population parameter.

LESSON TEN: T-Tests and A/B Tests
- Test the effect of a treatment or compare the difference in means for two groups when we have small sample sizes.

LESSON ELEVEN: Regression
- Build a linear regression model to understand the relationship between independent and dependent variables.
- Use linear regression results to make a prediction.

LESSON TWELVE: Multiple Linear Regression
- Use multiple linear regression results to interpret coefficients for several predictors.

LESSON THIRTEEN: Logistic Regression
- Use logistic regression results to make a prediction about the relationship between categorical dependent variables and predictors.
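Several of the Course 2 outcomes (binomial probabilities, Bayes' rule, Z-scores, confidence intervals) reduce to short formulas. The sketch below is illustrative only and is not taken from the course materials: the function names and parameters are assumptions chosen for this example, and it relies solely on the Python standard library.

```python
from math import comb, sqrt
from statistics import NormalDist, mean, stdev


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials,
    each succeeding with probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def bayes_posterior(prior, likelihood, false_positive_rate):
    """P(A | B) computed from P(A), P(B | A), and P(B | not A)."""
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma


def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for a population mean.

    Assumption: the sample is large enough for the normal approximation;
    small samples would call for a t-distribution instead.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    center = mean(sample)
    half_width = z * stdev(sample) / sqrt(len(sample))
    return center - half_width, center + half_width
```

For example, `binomial_pmf(2, 4, 0.5)` gives the chance of exactly two heads in four fair coin flips, and `NormalDist().cdf(z_score(130, 100, 15))` gives the proportion of a normal population (mean 100, standard deviation 15) that falls below 130, which is the lookup a Z-table performs.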

Course 3: Data Wrangling

Learn the data wrangling process of gathering, assessing, and cleaning data. Learn to use Python to wrangle data programmatically and prepare it for analysis.

Course Project: Wrangle and Analyze Data
Real-world data rarely comes clean. Using Python, you’ll gather data from a variety of sources, assess its quality and tidiness, then clean it. You’ll document your wrangling efforts in a Jupyter Notebook, plus showcase them through analyses and visualizations using Python and SQL.

LEARNING OUTCOMES

LESSON ONE: Intro to Data Wrangling
- Identify each step of the data wrangling process (gathering, assessing, and cleaning).
- Wrangle a CSV file downloaded from Kaggle using fundamental gathering, assessing, and cleaning code.

LESSON TWO: Gathering Data
- Gather data from multiple sources, including gathering files, programmatically downloading files, web-scraping data, and accessing data from APIs.
- Import data of various file formats into pandas, including flat files (e.g. TSV), HTML files, TXT files, and JSON files.
- Store gathered data in a PostgreSQL database.

LESSON THREE: Assessing Data
- Assess data visually and programmatically using pandas.
- Distinguish between dirty data (content or “quality” issues) and messy data (structural or “tidiness” issues).
- Identify data quality issues and categorize them using metrics: validity, accuracy, completeness, consistency, and uniformity.

LESSON FOUR: Cleaning Data
- Identify each step of the data cleaning process (defining, coding, and testing).
- Clean data using Python and pandas.
- Test cleaning code visually and programmatically using Python.
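The assess-then-clean loop described above (define the issues, code the fix, test the result) can be sketched in a few lines. The course itself works in pandas; this hedged example sticks to the standard library so it runs without extra installs. The sample data, column names, and the `assess` and `clean` helpers are all hypothetical.

```python
import csv
import io

# Hypothetical raw data with three quality issues:
# a missing age, a duplicated row, and an invalid (negative) age.
RAW = """name,age,city
Ana,34,Lisbon
Ben,,Oslo
Ana,34,Lisbon
Cy,-5,Rome
"""


def load_rows(text):
    """Gather: parse CSV text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def assess(rows):
    """Assess: count missing ages, invalid ages, and duplicate rows."""
    seen = set()
    issues = {"missing_age": 0, "invalid_age": 0, "duplicates": 0}
    for row in rows:
        key = tuple(row.values())
        if key in seen:
            issues["duplicates"] += 1
        seen.add(key)
        if row["age"] == "":
            issues["missing_age"] += 1
        elif int(row["age"]) < 0:
            issues["invalid_age"] += 1
    return issues


def clean(rows):
    """Clean: drop duplicates and rows with a missing or negative age,
    and store age as an integer."""
    cleaned, seen = [], set()
    for row in rows:
        key = tuple(row.values())
        if key in seen or row["age"] == "" or int(row["age"]) < 0:
            continue
        seen.add(key)
        cleaned.append({**row, "age": int(row["age"])})
    return cleaned
```

Running `assess` again on the output of `clean` is the "test" step: every issue count should drop to zero.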

Course 4: Data Visualization with Python

Learn to apply visualization principles to the data analysis process. Explore data visually at multiple levels to find insights and create a compelling story.

Course Project: Communicate Data Findings
Real-world data rarely comes clean. Using Python, you’ll gather data from a variety of sources, assess its quality and tidiness, then clean it. You’ll document your wrangling efforts in a Jupyter Notebook, plus showcase them through analyses and visualizations using Python and SQL.

LEARNING OUTCOMES

LESSON ONE: Data Visualization in Data Analysis
- Understand why visualization is important in the practice of data analysis.
- Know what distinguishes exploratory analysis from explanatory analysis, and the role of data visualization in each.

LESSON TWO: Design of Visualizations
- Interpret features in terms of level of measurement.
- Know different encodings that can be used to depict data in visualizations.
- Understand various pitfalls that can affect the effectiveness and truthfulness of visualizations.

LESSON THREE: Univariate Exploration of Data
- Use bar charts to depict distributions of categorical variables.
- Use histograms to depict distributions of numeric variables.
- Use axis limits and different scales to change how your data is interpreted.

LESSON FOUR: Bivariate Exploration of Data
- Use scatterplots to depict relationships between numeric variables.
- Use clustered bar charts to depict relationships between categorical variables.
- Use violin and bar charts to depict relationships between categorical and numeric variables.
- Use faceting to create plots across different subsets of the data.

LESSON FIVE: Multivariate Exploration of Data
- Use encodings like size, shape, and color to encode values of a third variable in a visualization.
- Use plot matrices to explore relationships between multiple variables at the same time.
- Use feature engineering to capture relationships between variables.

LESSON SIX: Explanatory Visualizations
- Understand what it means to tell a compelling story with data.
- Choose the best plot type, encodings, and annotations to polish your plots.
- Create a slide deck using a Jupyter Notebook to convey your findings.

LESSON SEVEN: Visualization Case Study
- Apply your knowledge of data visualization to a dataset involving the characteristics of diamonds and their prices.

Our Classroom Experience

REAL-WORLD PROJECTS
Build your skills through industry-relevant projects. Get personalized feedback from our network of 900 project reviewers. Our simple interface makes it easy to submit your projects as often as you need and receive unlimited feedback on your work.

KNOWLEDGE
Find answers to your questions with Knowledge, our proprietary wiki. Search questions asked by other students, connect with technical mentors, and discover in real time how to solve the challenges that you encounter.

WORKSPACES
See your code in action. Check the output and quality of your code by running it in workspaces that are a part of our classroom.

QUIZZES
Check your understanding of concepts learned in the program by answering simple, auto-graded quizzes. Easily go back to the lessons to brush up on concepts anytime you get an answer wrong.

CUSTOM STUDY PLANS
Create a custom study plan to suit your personal needs and use this plan to keep track of your progress toward your goal.

PROGRESS TRACKER
Stay on track to complete your Nanodegree program with useful milestone reminders.

Learn with the Best

Josh Bernhard
DATA SCIENTIST AT NERDWALLET
Josh has been sharing his passion for data for nearly a decade at all levels of university, and as Lead Data Science Instructor at Galvanize. He’s used data science for work ranging from cancer research to process automation.

Sebastian Thrun
PRESIDENT OF UDACITY
As the founder and president of Udacity, Sebastian’s mission is to democratize education. He is also the founder of Google X, where he led projects including the Self-Driving Car, Google Glass, and more.

Derek Steer
CEO AT MODE
Derek is the CEO of Mode Analytics. He developed an analytical foundation at Facebook and Yammer and is passionate about sharing it with future analysts. He authored SQL School and is a mentor at Insight Data Science.

Juno Lee
CURRICULUM LEAD AT UDACITY
Juno is the curriculum lead for the School of Data Science. She has been sharing her passion for data and teaching, building several courses at Udacity. As a data scientist, she built recommendation engines, computer vision and NLP models, and tools to analyze user behavior.

Mike Yi
DATA ANALYST INSTRUCTOR
Mike is a Content Developer with a multidisciplinary academic background, including math, statistics, physics, and psychology. Previously, he worked on Udacity’s Data Analyst Nanodegree program as a support lead.

David Venturi
DATA ANALYST INSTRUCTOR
Formerly a chemical engineer and data analyst, David created a personalized data science master’s program using online resources. He has studied hundreds of online courses and is excited to bring the best to Udacity students.

Sam Nelson
PRODUCT LEAD
Sam is the Product Lead for Udacity’s Data Analyst, Business Analyst, and Data Foundations programs. He’s worked as an analytics consultant on projects in several industries, and is passionate about helping others improve their data skills.

All Our Nanodegree Programs Include:

EXPERIENCED PROJECT REVIEWERS
REVIEWER SERVICES
- Personalized feedback & line-by-line code reviews
- 1600 reviewers with a 4.85/5 average rating
- 3-hour average project review turnaround time
- Unlimited submissions and feedback loops
- Practical tips and industry best practices
- Additional suggested resources to improve

TECHNICAL MENTOR SUPPORT
MENTORSHIP SERVICES
- Questions answered quickly by our team of technical mentors
- 1000 mentors with a 4.7/5 average rating
- Support for all your technical questions

PERSONAL CAREER SERVICES
CAREER SUPPORT
- GitHub portfolio review
- LinkedIn profile optimization

Frequently Asked Questions

PROGRAM OVERVIEW

WHY SHOULD I ENROLL?
The Data Analyst Nanodegree program offers you the opportunity to master data skills that are in demand by top employers, such as Python and statistics. By the end of the program you will have created a portfolio of work demonstrating your ability to solve complex data problems. After graduating, you will have the skills needed to join a large corporation or a small firm, or even go independent as a freelance data analyst.

You’ll have personalized support as you master in-demand skills that qualify you for high-value jobs in the data field. You’ll also receive career support via profile and portfolio reviews to help make sure you’re ready to establish a successful career in data, and land a job you love.

WHAT JOBS WILL THIS PROGRAM PREPARE ME FOR?
Graduates will be well prepared to fill a wide array of data-related roles. These include: Data Analyst, Analytics Consultant, Product Manager, and Management Consultant.

HOW DO I KNOW IF THIS PROGRAM IS RIGHT FOR ME?
This program is ideal for you if you want to make data-driven decisions, work with various types of data to conduct analyses, or become a data analyst. You’ll learn applied statistics, data wrangling with Python, and data visualization with Matplotlib, which will enable you to work with any data set and find and showcase meaningful insights. This will qualify you for roles such as Data Analyst and Analytics Consultant. You’ll need to have some experience with Python and pandas to succeed in this program. If that’s you, and you’re ready to apply those skills to real-world projects, then we encourage you to enroll today.

WHAT IS THE SCHOOL OF DATA SCIENCE, AND HOW DO I KNOW WHICH PROGRAM TO CHOOSE?
Udacity’s School of Data consists of several different Nanodegree programs, each of which offers the opportunity to build data skills and advance your career. These programs are organized around three main career roles: Business Analyst, Data Analyst, and Data Scientist.

The School of Data currently offers two clearly defined career paths. These paths are differentiated by whether they focus on developing programming skills or not. Whether you are just getting started in data, are looking to augment your existing skill set with in-demand data skills, or intend to pursue advanced studies and career roles, Udacity’s School of Data has the right path for you! Visit How to Choose the Data Science Program That’s Right for You to learn more.

ENROLLMENT AND ADMISSION

DO I NEED TO APPLY? WHAT ARE THE ADMISSION CRITERIA?
No. This Nanodegree program accepts all applicants regardless of experience and specific background.

WHAT ARE THE PREREQUISITES FOR ENROLLMENT?
In order to succeed in this program, we recommend having the following experience:
- Python programming, including common data analysis libraries (e.g., NumPy and pandas)
- SQL programming
You should also be able to read and write in English.

TUITION AND TERM OF PROGRAM

HOW IS THIS NANODEGREE PROGRAM STRUCTURED?
The Data Analyst Nanodegree program is comprised of content and curriculum to support five (5) projects. We estimate that students can complete the program in four (4) months working 10 hours per week.

Each project will be reviewed by the Udacity reviewer network and platform. Feedback will be provided, and if you do not pass the project, you will be asked to resubmit the project until it passes.

HOW LONG IS THIS NANODEGREE PROGRAM?
Access to this Nanodegree program runs for the length of time specified in the payment card above. If you do not graduate within that time period, you will continue learning with month-to-month payments. See the Terms of Use and FAQs for other policies regarding the terms of access to our Nanodegree programs.

I HAVE GRADUATED FROM THE DATA ANALYST PROGRAM AND I WANT TO KEEP LEARNING. WHERE SHOULD I GO FROM HERE?
Check out our Data Scientist Nanodegree program to take the concepts you have learned in Data Analyst and build upon them using machine learning and neural networks. Learning these advanced concepts will not only enhance your knowledge but also make you a more attractive candidate to be hired as an analyst or data scientist.

WHAT IS THE SCHOOL OF DATA SCIENCE, AND HOW DO I KNOW WHICH PROGRAM TO CHOOSE?
Udacity’s School of Data consists of several different Nanodegree programs, each of which offers the opportunity to build data skills and advance your career. These programs are organized around career roles like Business Analyst, Data Analyst, Data Scientist, and Data Engineer.

The School of Data currently offers three clearly defined career paths in Business Analytics, Data Science, and Data Engineering. Whether you are just getting started in data, are looking to augment your existing skill set with in-demand data skills, or intend to pursue advanced studies and career roles, Udacity’s School of Data has the right path for you! Visit How to Choose the Data Science Program That’s Right for You to learn more.

SOFTWARE AND HARDWARE

WHAT SOFTWARE AND VERSIONS WILL I NEED IN THIS PROGRAM?
For this Nanodegree program you will need access to the Internet and a 64-bit computer.

Additional software such as Python and its common data analysis libraries (e.g., NumPy and pandas) will be required, but the program will guide students on how to download it once the course has begun.
