Data science for dummies / Lillian Pierson
- Pierson, Lillian
- Hoboken, NJ : For Dummies, 2021.
- Third edition.
- Physical Description:
- 1 online resource
- Introduction
    About This Book; Foolish Assumptions; Icons Used in This Book; Beyond the Book; Where to Go from Here
  Part 1: Getting Started with Data Science
  Chapter 1: Wrapping Your Head Around Data Science
    Seeing Who Can Make Use of Data Science; Inspecting the Pieces of the Data Science Puzzle; Collecting, querying, and consuming data; Applying mathematical modeling to data science tasks; Deriving insights from statistical methods; Coding, coding, coding -- it's just part of the game; Applying data science to a subject area; Communicating data insights; Exploring Career Alternatives That Involve Data Science; The data implementer; The data leader; The data entrepreneur
  Chapter 2: Tapping into Critical Aspects of Data Engineering
    Defining Big Data and the Three Vs; Grappling with data volume; Handling data velocity; Dealing with data variety; Identifying Important Data Sources; Grasping the Differences among Data Approaches; Defining data science; Defining machine learning engineering; Defining data engineering; Comparing machine learning engineers, data scientists, and data engineers; Storing and Processing Data for Data Science; Storing data and doing data science directly in the cloud; Storing big data on-premise; Processing big data in real-time
  Part 2: Using Data Science to Extract Meaning from Your Data
  Chapter 3: Machine Learning Means Using a Machine to Learn from Data
    Defining Machine Learning and Its Processes; Walking through the steps of the machine learning process; Becoming familiar with machine learning terms; Considering Learning Styles; Learning with supervised algorithms; Learning with unsupervised algorithms; Learning with reinforcement; Seeing What You Can Do; Selecting algorithms based on function; Using Spark to generate real-time big data analytics
  Chapter 4: Math, Probability, and Statistical Modeling
    Exploring Probability and Inferential Statistics; Probability distributions; Conditional probability with Naïve Bayes; Quantifying Correlation; Calculating correlation with Pearson's r; Ranking variable-pairs using Spearman's rank correlation; Reducing Data Dimensionality with Linear Algebra; Decomposing data to reduce dimensionality; Reducing dimensionality with factor analysis; Decreasing dimensionality and removing outliers with PCA; Modeling Decisions with Multiple Criteria Decision-Making; Turning to traditional MCDM; Focusing on fuzzy MCDM; Introducing Regression Methods; Linear regression; Logistic regression; Ordinary least squares (OLS) regression methods; Detecting Outliers; Analyzing extreme values; Detecting outliers with univariate analysis; Detecting outliers with multivariate analysis; Introducing Time Series Analysis; Identifying patterns in time series; Modeling univariate time series data
  Chapter 5: Grouping Your Way into Accurate Predictions
    Starting with Clustering Basics; Getting to know clustering algorithms; Examining clustering similarity metrics; Identifying Clusters in Your Data; Clustering with the k-means algorithm; Estimating clusters with kernel density estimation (KDE); Clustering with hierarchical algorithms; Dabbling in the DBScan neighborhood; Categorizing Data with Decision Tree and Random Forest Algorithms; Drawing a Line between Clustering and Classification; Introducing instance-based learning classifiers; Getting to know classification algorithms; Making Sense of Data with Nearest Neighbor Analysis; Classifying Data with Average Nearest Neighbor Algorithms; Classifying with K-Nearest Neighbor Algorithms; Understanding how the k-nearest neighbor algorithm works; Knowing when to use the k-nearest neighbor algorithm; Exploring common applications of k-nearest neighbour algorithms; Solving Real-World Problems with Nearest Neighbor Algorithms; Seeing k-nearest neighbor algorithms in action; Seeing average nearest neighbor algorithms in action
  Chapter 6: Coding Up Data Insights and Decision Engines
    Seeing Where Python and R Fit into Your Data Science Strategy; Using Python for Data Science; Sorting out the various Python data types; Putting loops to good use in Python; Having fun with functions; Keeping cool with classes; Checking out some useful Python libraries; Using Open Source R for Data Science; Comprehending R's basic vocabulary; Delving into functions and operators; Iterating in R; Observing how objects work; Sorting out R's popular statistical analysis packages; Examining packages for visualizing, mapping, and graphing in R
  Chapter 7: Generating Insights with Software Applications
    Choosing the Best Tools for Your Data Science Strategy; Getting a Handle on SQL and Relational Databases; Investing Some Effort into Database Design; Defining data types; Designing constraints properly; Normalizing your database; Narrowing the Focus with SQL Functions; Making Life Easier with Excel; Using Excel to quickly get to know your data; Reformatting and summarizing with PivotTables; Automating Excel tasks with macros
  Chapter 8: Telling Powerful Stories with Data
    Data Visualizations: The Big Three; Data storytelling for decision makers; Data showcasing for analysts; Designing data art for activists; Designing to Meet the Needs of Your Target Audience; Step 1: Brainstorm (All about Eve); Step 2: Define the purpose; Step 3: Choose the most functional visualization type for your purpose; Picking the Most Appropriate Design Style; Inducing a calculating, exacting response; Eliciting a strong emotional response; Selecting the Appropriate Data Graphic Type; Standard chart graphics; Comparative graphics; Statistical plots; Topology structures; Spatial plots and maps; Testing Data Graphics; Adding Context; Creating context with data; Creating context with annotations; Creating context with graphical elements
  Part 3: Taking Stock of Your Data Science Capabilities
  Chapter 9: Developing Your Business Acumen
    Bridging the Business Gap; Contrasting business acumen with subject matter expertise; Defining business acumen; Traversing the Business Landscape; Seeing how data roles support the business in making money; Leveling up your business acumen; Fortifying your leadership skills; Surveying Use Cases and Case Studies; Documen
- Monetize your company's data and data science expertise without spending a fortune on hiring independent strategy consultants to help. What if there was one simple, clear process for ensuring that all your company's data science projects achieve a high return on investment? What if you could validate your ideas for future data science projects, and select the one idea that's most prime for achieving profitability while also moving your company closer to its business vision? There is. Industry-acclaimed data science consultant Lillian Pierson shares her proprietary STAR Framework, a simple, proven process for leading profit-forming data science projects. Not sure what data science is yet? Don't worry! Parts 1 and 2 of Data Science For Dummies will get all the bases covered for you. And if you're already a data science expert? Then you really won't want to miss the data science strategy and data monetization gems that are shared in Part 3 onward throughout this book. Data Science For Dummies demonstrates: the only process you'll ever need to lead profitable data science projects; secret, reverse-engineered data monetization tactics that no one's talking about; the shocking truth about how simple natural language processing can be; and how to beat the crowd of data professionals by cultivating your own unique blend of data science expertise. Whether you're new to the data science field or already a decade in, you're sure to learn something new and incredibly valuable from Data Science For Dummies. Discover how to generate massive business wins from your company's data by picking up your copy today.
- 9781119811619 (ePub ebook)
catkey: 37459916