46 Best 「big data」 Books of 2024 | Books Explorer
- Naked Statistics: Stripping the Dread from the Data
- Build a Career in Data Science
- Storytelling with Data: A Data Visualization Guide for Business Professionals
- Big Data: A Revolution That Will Transform How We Live, Work, and Think
- Data Science for Beginners: 4 Books in 1: Python Programming, Data Analysis, Machine Learning. A Complete Overview to Master The Art of Data Science From Scratch Using Python for Business (Data Science Mastery)
- Data Science for Dummies
- Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy
- Introduction to Probability
- Algorithms of Oppression: How Search Engines Reinforce Racism
- R for Data Science: Import, Tidy, Transform, Visualize, and Model Data
A New York Times bestseller. "Brilliant, funny…the best math teacher you never had." (San Francisco Chronicle) Once considered tedious, the field of statistics is rapidly evolving into a discipline Hal Varian, chief economist at Google, has actually called "sexy." From batting averages and political polls to game shows and medical research, the real-world application of statistics continues to grow by leaps and bounds. How can we catch schools that cheat on standardized tests? How does Netflix know which movies you’ll like? What is causing the rising incidence of autism? As best-selling author Charles Wheelan shows us in Naked Statistics, the right data and a few well-chosen statistical tools can help us answer these questions and more. For those who slept through Stats 101, this book is a lifesaver. Wheelan strips away the arcane and technical details and focuses on the underlying intuition that drives statistical analysis. He clarifies key concepts such as inference, correlation, and regression analysis, reveals how biased or careless parties can manipulate or misrepresent data, and shows us how brilliant and creative researchers are exploiting the valuable data from natural experiments to tackle thorny questions. And in Wheelan’s trademark style, there’s not a dull page in sight. You’ll encounter clever Schlitz Beer marketers leveraging basic probability, an International Sausage Festival illuminating the tenets of the central limit theorem, and a head-scratching choice from the famous game show Let’s Make a Deal, and you’ll come away with insights each time. With the wit, accessibility, and sheer fun that turned Naked Economics into a bestseller, Wheelan defies the odds yet again by bringing another essential, formerly unglamorous discipline to life.
You are going to need more than technical knowledge to succeed as a data scientist. Build a Career in Data Science teaches you what school leaves out, from how to land your first job to the lifecycle of a data science project, and even how to become a manager. Build a Career in Data Science is your guide to landing your first data science job and developing into a valued senior employee. By following clear and simple instructions, you'll learn to craft an amazing resume and ace your interviews. In this demanding, rapidly changing field, it can be challenging to keep projects on track, adapt to company needs, and manage tricky stakeholders. You'll love the insights on how to handle expectations, deal with failures, and plan your career path in the stories from seasoned data scientists included in the book.
Table of Contents:
PART 1 - GETTING STARTED WITH DATA SCIENCE
1. What is data science?
2. Data science companies
3. Getting the skills
4. Building a portfolio
PART 2 - FINDING YOUR DATA SCIENCE JOB
5. The search: Identifying the right job for you
6. The application: Résumés and cover letters
7. The interview: What to expect and how to handle it
8. The offer: Knowing what to accept
PART 3 - SETTLING INTO DATA SCIENCE
9. The first months on the job
10. Making an effective analysis
11. Deploying a model into production
12. Working with stakeholders
PART 4 - GROWING IN YOUR DATA SCIENCE ROLE
13. When your data science project fails
14. Joining the data science community
15. Leaving your job gracefully
16. Moving up the ladder
Don't simply show your data: tell a story with it! Storytelling with Data teaches you the fundamentals of data visualization and how to communicate effectively with data. You'll discover the power of storytelling and the way to make data a pivotal point in your story. The lessons in this illuminative text are grounded in theory, but made accessible through numerous real-world examples, ready for immediate application to your next graph or presentation. Storytelling is not an inherent skill, especially when it comes to data visualization, and the tools at our disposal don't make it any easier. This book demonstrates how to go beyond conventional tools to reach the root of your data, and how to use your data to create an engaging, informative, compelling story. Specifically, you'll learn how to:
- Understand the importance of context and audience
- Determine the appropriate type of graph for your situation
- Recognize and eliminate the clutter clouding your information
- Direct your audience's attention to the most important parts of your data
- Think like a designer and utilize concepts of design in data visualization
- Leverage the power of storytelling to help your message resonate with your audience
Together, the lessons in this book will help you turn your data into high impact visual stories that stick with your audience. Rid your world of ineffective graphs, one exploding 3D pie chart at a time. There is a story in your data, and Storytelling with Data will give you the skills and power to tell it.
A revelatory exploration of the hottest trend in technology and the dramatic impact it will have on the economy, science, and society at large. Which paint color is most likely to tell you that a used car is in good shape? How can officials identify the most dangerous New York City manholes before they explode? And how did Google searches predict the spread of the H1N1 flu outbreak? The key to answering these questions, and many more, is big data. “Big data” refers to our burgeoning ability to crunch vast collections of information, analyze it instantly, and draw sometimes profoundly surprising conclusions from it. This emerging science can translate myriad phenomena (from the price of airline tickets to the text of millions of books) into searchable form, and uses our increasing computing power to unearth epiphanies that we never could have seen before. A revolution on par with the Internet or perhaps even the printing press, big data will change the way we think about business, health, politics, education, and innovation in the years to come. It also poses fresh threats, from the inevitable end of privacy as we know it to the prospect of being penalized for things we haven’t even done yet, based on big data’s ability to predict our future behavior. In this brilliantly clear, often surprising work, two leading experts explain what big data is, how it will change our lives, and what we can do to protect ourselves from its hazards. Big Data is the first big book about the next big thing. www.big-data-book.com
Product Description: Did you know that according to Harvard Business Review the Data Scientist is the sexiest job of the 21st century? And for a reason! If "sexy" means having rare qualities that are much in demand, data scientists are already there. They are expensive to hire and, given the very competitive market for their services, difficult to retain. There simply aren't a lot of people with their combination of scientific background and computational and analytical skills. Data Science is all about transforming data into business value using math and algorithms. And needless to say, Python is the must-know programming language of the 21st century. If you are interested in coding and Data Science, then you must know Python to succeed in these industries! Data Science for Beginners is the perfect place to start learning everything you need to succeed. Contained within these four essential books are the methods, concepts, and important practical examples to help build your foundation for excelling at the discipline that is shaping the modern world. This bundle is perfect for programmers, software engineers, project managers and those who just want to keep up with technology. With these books in your hands, you will:
- Learn Python from scratch including the basic operations, how to install it, data structures and functions, and conditional loops
- Build upon the fundamentals with advanced techniques like Object-Oriented Programming (OOP), Inheritance, and Polymorphism
- Discover the importance of Data Science and how to use it in real-world situations
- Learn the 5 steps of Data Analysis so you can comprehend and analyze data sitting right in front of you
- Increase your income by learning a new, valuable skill that only a select handful of people take the time to learn
- Discover how companies can improve their business through practical examples and explanations
- And Much More!
This bundle is essential for anyone who wants to study Data Science and learn how the world is moving to an open-source platform. Whether you are a software engineer or a project manager, jump to the next level by developing a data-driven approach and learning how to define a data-driven vision of your business! Order Your Copy of the Bundle and Start Your New Career Path Today!
Review: This 4 book set focuses on Python programming. The 1st book details how easy Python is to learn and use as opposed to something like C++. I liked that terms are explained in great detail and there are tons of examples. Every step has a screen grab to make sure you are following along in the learning process. Book 2 focuses on data analysis with Python. So anyone in the business world will definitely benefit from this book. The ways Python can streamline and speed up data analysis are examined. Again there are tons of examples and it is easy to follow. Book 3 focuses on Machine Learning. Python seems to be the future of AI, and one who hopes to build a future in that arena needs a strong foundation in Python. The author dispels myths and fears that are common when AI is discussed, especially that it will take people's jobs. Instead the author believes AI will create just as many jobs, just in different areas. Book 4's subject is data science and the benefits Python can bring to that area of business and math. Again, this book is loaded with examples and every formula and chart is explained so well. I was not lost at all. The author comes across as a highly knowledgeable expert imparting credible info. I recommend this to anyone wanting to explore Python more. - J. Mielke
Discover how data science can help you gain in-depth insight into your business - the easy way! Jobs in data science abound, but few people have the data science skills needed to fill these increasingly important roles. Data Science For Dummies is the perfect starting point for IT professionals and students who want a quick primer on all areas of the expansive data science space. With a focus on business cases, the book explores topics in big data, data science, and data engineering, and how these three areas are combined to produce tremendous value. If you want to pick up the skills you need to begin a new career or initiate a new project, reading this book will help you understand which technologies, programming languages, and mathematical methods to focus on. While this book serves as a wildly fantastic guide through the broad, sometimes intimidating field of big data and data science, it is not an instruction manual for hands-on implementation. Here’s what to expect:
- Provides a background in big data and data engineering before moving on to data science and how it's applied to generate value
- Includes coverage of big data frameworks like Hadoop, MapReduce, Spark, MPP platforms, and NoSQL
- Explains machine learning and many of its algorithms as well as artificial intelligence and the evolution of the Internet of Things
- Details data visualization techniques that can be used to showcase, summarize, and communicate the data insights you generate
It's a big, big data world out there: let Data Science For Dummies help you harness its power and gain a competitive edge for your organization.
Longlisted for the National Book Award | New York Times Bestseller. A former Wall Street quant sounds an alarm on the mathematical models that pervade modern life and threaten to rip apart our social fabric. We live in the age of the algorithm. Increasingly, the decisions that affect our lives (where we go to school, whether we get a car loan, how much we pay for health insurance) are being made not by humans, but by mathematical models. In theory, this should lead to greater fairness: Everyone is judged according to the same rules, and bias is eliminated. But as Cathy O’Neil reveals in this urgent and necessary book, the opposite is true. The models being used today are opaque, unregulated, and uncontestable, even when they’re wrong. Most troubling, they reinforce discrimination: If a poor student can’t get a loan because a lending model deems him too risky (by virtue of his zip code), he’s then cut off from the kind of education that could pull him out of poverty, and a vicious spiral ensues. Models are propping up the lucky and punishing the downtrodden, creating a “toxic cocktail for democracy.” Welcome to the dark side of Big Data. Tracing the arc of a person’s life, O’Neil exposes the black box models that shape our future, both as individuals and as a society. These “weapons of math destruction” score teachers and students, sort résumés, grant (or deny) loans, evaluate workers, target voters, set parole, and monitor our health. O’Neil calls on modelers to take more responsibility for their algorithms and on policy makers to regulate their use. But in the end, it’s up to us to become more savvy about the models that govern our lives. This important book empowers us to ask the tough questions, uncover the truth, and demand change.
This text is designed for an introductory probability course at the university level for sophomores, juniors, and seniors in mathematics, physical and social sciences, engineering, and computer science. It presents a thorough treatment of ideas and techniques necessary for a firm understanding of the subject. The text is also recommended for use in discrete probability courses. The material is organized so that the discrete and continuous probability discussions are presented in a separate, but parallel, manner. This organization does not emphasize an overly rigorous or formal view of probability and therefore offers some strong pedagogical value. Hence, the discrete discussions can sometimes serve to motivate the more abstract continuous probability discussions. Features:
- Key ideas are developed in a somewhat leisurely style, providing a variety of interesting applications to probability and showing some nonintuitive ideas.
- Over 600 exercises provide the opportunity for practicing skills and developing a sound understanding of ideas.
- Numerous historical comments deal with the development of discrete probability.
- The text includes many computer programs that illustrate the algorithms or the methods of computation for important problems.
A revealing look at how negative biases against women of color are embedded in search engine results and algorithms. Run a Google search for “Black girls”: what will you find? “Big Booty” and other sexually explicit terms are likely to come up as top search terms. But, if you type in “white girls,” the results are radically different. The suggested porn sites and un-moderated discussions about “why Black women are so sassy” or “why Black women are so angry” present a disturbing portrait of Black womanhood in modern society. In Algorithms of Oppression, Safiya Umoja Noble challenges the idea that search engines like Google offer an equal playing field for all forms of ideas, identities, and activities. Data discrimination is a real social problem; Noble argues that the combination of private interests in promoting certain sites, along with the monopoly status of a relatively small number of Internet search engines, leads to a biased set of search algorithms that privilege whiteness and discriminate against people of color, specifically women of color. Through an analysis of textual and media searches as well as extensive research on paid online advertising, Noble exposes a culture of racism and sexism in the way discoverability is created online. As search engines and their related companies grow in importance (operating as a source for email, a major vehicle for primary and secondary school learning, and beyond), understanding and reversing these disquieting trends and discriminatory practices is of utmost importance. An original, surprising and, at times, disturbing account of bias on the internet, Algorithms of Oppression contributes to our understanding of how racism is created, maintained, and disseminated in the 21st century.
Data science libraries, frameworks, modules, and toolkits are great for doing data science, but they’re also a good way to dive into the discipline without actually understanding data science. In this book, you’ll learn how many of the most fundamental data science tools and algorithms work by implementing them from scratch. If you have an aptitude for mathematics and some programming skills, author Joel Grus will help you get comfortable with the math and statistics at the core of data science, and with hacking skills you need to get started as a data scientist. Today’s messy glut of data holds answers to questions no one’s even thought to ask. This book provides you with the know-how to dig those answers out.
- Get a crash course in Python
- Learn the basics of linear algebra, statistics, and probability, and understand how and when they're used in data science
- Collect, explore, clean, munge, and manipulate data
- Dive into the fundamentals of machine learning
- Implement models such as k-nearest neighbors, Naive Bayes, linear and logistic regression, decision trees, neural networks, and clustering
- Explore recommender systems, natural language processing, network analysis, MapReduce, and databases
To really learn data science, you should not only master the tools (data science libraries, frameworks, modules, and toolkits) but also understand the ideas and principles underlying them. Updated for Python 3.6, this second edition of Data Science from Scratch shows you how these tools and algorithms work by implementing them from scratch. If you have an aptitude for mathematics and some programming skills, author Joel Grus will help you get comfortable with the math and statistics at the core of data science, and with the hacking skills you need to get started as a data scientist. Packed with new material on deep learning, statistics, and natural language processing, this updated book shows you how to find the gems in today’s messy glut of data.
- Get a crash course in Python
- Learn the basics of linear algebra, statistics, and probability, and how and when they’re used in data science
- Collect, explore, clean, munge, and manipulate data
- Dive into the fundamentals of machine learning
- Implement models such as k-nearest neighbors, Naïve Bayes, linear and logistic regression, decision trees, neural networks, and clustering
- Explore recommender systems, natural language processing, network analysis, MapReduce, and databases
This book is written for high school and college students learning about probability for the first time. It will appeal to the reader who has a healthy level of enthusiasm for understanding how and why the various results of probability come about. All of the standard introductory topics in probability are covered: combinatorics, the rules of probability, Bayes’ theorem, expectation value, variance, probability density, common distributions, the law of large numbers, the central limit theorem, correlation, and regression. Calculus is not a prerequisite, although a few of the problems do involve calculus. These are marked clearly. The book features 150 worked-out problems in the form of examples in the text and solved problems at the end of each chapter. These problems, along with the discussions in the text, will be a valuable resource in any introductory probability course, either as the main text or as a helpful supplement.
Use R to turn data into insight, knowledge, and understanding. With this practical book, aspiring data scientists will learn how to do data science with R and RStudio, along with the tidyverse, a collection of R packages designed to work together to make data science fast, fluent, and fun. Even if you have no programming experience, this updated edition will have you doing data science quickly. You'll learn how to import, transform, and visualize your data and communicate the results. And you'll get a complete, big-picture understanding of the data science cycle and the basic tools you need to manage the details. Updated for the latest tidyverse features and best practices, new chapters show you how to get data from spreadsheets, databases, and websites. Exercises help you practice what you've learned along the way. You'll understand how to:
- Visualize: Create plots for data exploration and communication of results
- Transform: Discover variable types and the tools to work with them
- Import: Get data into R and in a form convenient for analysis
- Program: Learn R tools for solving data problems with greater clarity and ease
- Communicate: Integrate prose, code, and results with Quarto
Peter Norvig, Research Director at Google, co-author of AIMA, the most popular AI textbook in the world: "Burkov has undertaken a very useful but impossibly hard task in reducing all of machine learning to 100 pages. He succeeds well in choosing the topics (both theory and practice) that will be useful to practitioners, and for the reader who understands that this is the first 100 (or actually 150) pages you will read, not the last, provides a solid introduction to the field."
Aurélien Géron, Senior AI Engineer, author of the bestseller Hands-On Machine Learning with Scikit-Learn and TensorFlow: "The breadth of topics the book covers is amazing for just 100 pages (plus few bonus pages!). Burkov doesn't hesitate to go into the math equations: that's one thing that short books usually drop. I really liked how the author explains the core concepts in just a few words. The book can be very useful for newcomers in the field, as well as for old-timers who can gain from such a broad view of the field."
Karolis Urbonas, Head of Data Science at Amazon: "A great introduction to machine learning from a world-class practitioner."
Chao Han, VP, Head of R&D at Lucidworks: "I wish such a book existed when I was a statistics graduate student trying to learn about machine learning."
Sujeet Varakhedi, Head of Engineering at eBay: "Andriy's book does a fantastic job of cutting the noise and hitting the tracks and full speed from the first page."
Deepak Agarwal, VP of Artificial Intelligence at LinkedIn: "A wonderful book for engineers who want to incorporate ML in their day-to-day work without necessarily spending an enormous amount of time."
Vincent Pollet, Head of Research at Nuance: "The Hundred-Page Machine Learning Book is an excellent read to get started with Machine Learning."
Gareth James, Professor of Data Sciences and Operations, co-author of the bestseller An Introduction to Statistical Learning, with Applications in R: "This is a compact “how to do data science” manual and I predict it will become a go-to resource for academics and practitioners alike. At 100 pages (or a little more), the book is short enough to read in a single sitting. Yet, despite its length, it covers all the major machine learning approaches, ranging from classical linear and logistic regression, through to modern support vector machines, deep learning, boosting, and random forests. There is also no shortage of details on the various approaches and the interested reader can gain further information on any particular method via the innovative companion book wiki. The book does not assume any high level mathematical or statistical training or even programming experience, so should be accessible to almost anyone willing to invest the time to learn about these methods. It should certainly be required reading for anyone starting a PhD program in this area and will serve as a useful reference as they progress further. Finally, the book illustrates some of the algorithms using Python code, one of the most popular coding languages for machine learning. I would highly recommend “The Hundred-Page Machine Learning Book” for both the beginner looking to learn more about machine learning and the experienced practitioner seeking to extend their knowledge base."
Everything you really need to know in Machine Learning in a hundred pages.
Through a series of recent breakthroughs, deep learning has boosted the entire field of machine learning. Now, even programmers who know close to nothing about this technology can use simple, efficient tools to implement programs capable of learning from data. This practical book shows you how. By using concrete examples, minimal theory, and two production-ready Python frameworks (Scikit-Learn and TensorFlow), author Aurélien Géron helps you gain an intuitive understanding of the concepts and tools for building intelligent systems. You’ll learn a range of techniques, starting with simple linear regression and progressing to deep neural networks. With exercises in each chapter to help you apply what you’ve learned, all you need is programming experience to get started.
- Explore the machine learning landscape, particularly neural nets
- Use Scikit-Learn to track an example machine-learning project end-to-end
- Explore several training models, including support vector machines, decision trees, random forests, and ensemble methods
- Use the TensorFlow library to build and train neural nets
- Dive into neural net architectures, including convolutional nets, recurrent nets, and deep reinforcement learning
- Learn techniques for training and scaling deep neural nets
Summary: Grokking Deep Learning teaches you to build deep learning neural networks from scratch! In his engaging style, seasoned deep learning expert Andrew Trask shows you the science under the hood, so you grok for yourself every detail of training neural networks. Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.
About the Technology: Deep learning, a branch of artificial intelligence, teaches computers to learn by using neural networks, technology inspired by the human brain. Online text translation, self-driving cars, personalized product recommendations, and virtual voice assistants are just a few of the exciting modern advancements possible thanks to deep learning.
About the Book: Using only Python and its math-supporting library, NumPy, you'll train your own neural networks to see and understand images, translate text into different languages, and even write like Shakespeare! When you're done, you'll be fully prepared to move on to mastering deep learning frameworks.
What's inside:
- The science behind deep learning
- Building and training your own neural networks
- Privacy concepts, including federated learning
- Tips for continuing your pursuit of deep learning
About the Reader: For readers with high school-level math and intermediate programming skills.
About the Author: Andrew Trask is a PhD student at Oxford University and a research scientist at DeepMind. Previously, Andrew was a researcher and analytics product manager at Digital Reasoning, where he trained the world's largest artificial neural network and helped guide the analytics roadmap for the Synthesys cognitive computing platform.
Table of Contents:
- Introducing deep learning: why you should learn it
- Fundamental concepts: how do machines learn?
- Introduction to neural prediction: forward propagation
- Introduction to neural learning: gradient descent
- Learning multiple weights at a time: generalizing gradient descent
- Building your first deep neural network: introduction to backpropagation
- How to picture neural networks: in your head and on paper
- Learning signal and ignoring noise: introduction to regularization and batching
- Modeling probabilities and nonlinearities: activation functions
- Neural learning about edges and corners: intro to convolutional neural networks
- Neural networks that understand language: king - man + woman == ?
- Neural networks that write like Shakespeare: recurrent layers for variable-length data
- Introducing automatic optimization: let's build a deep learning framework
- Learning to write like Shakespeare: long short-term memory
- Deep learning on unseen data: introducing federated learning
- Where to go from here: a brief guide
An Introduction to Statistical Learning provides an accessible overview of the field of statistical learning, an essential toolset for making sense of the vast and complex data sets that have emerged in fields ranging from biology to finance to marketing to astrophysics in the past twenty years. This book presents some of the most important modeling and prediction techniques, along with relevant applications. Topics include linear regression, classification, resampling methods, shrinkage approaches, tree-based methods, support vector machines, clustering, deep learning, survival analysis, multiple testing, and more. Color graphics and real-world examples are used to illustrate the methods presented. Since the goal of this textbook is to facilitate the use of these statistical learning techniques by practitioners in science, industry, and other fields, each chapter contains a tutorial on implementing the analyses and methods presented in R, an extremely popular open source statistical software platform. Two of the authors co-wrote The Elements of Statistical Learning (Hastie, Tibshirani and Friedman, 2nd edition 2009), a popular reference book for statistics and machine learning researchers. An Introduction to Statistical Learning covers many of the same topics, but at a level accessible to a much broader audience. This book is targeted at statisticians and non-statisticians alike who wish to use cutting-edge statistical learning techniques to analyze their data. The text assumes only a previous course in linear regression and no knowledge of matrix algebra. This Second Edition features new chapters on deep learning, survival analysis, and multiple testing, as well as expanded treatments of naïve Bayes, generalized linear models, Bayesian additive regression trees, and matrix completion. R code has been updated throughout to ensure compatibility.
If you know how to program, you have the skills to turn data into knowledge, using tools of probability and statistics. This concise introduction shows you how to perform statistical analysis computationally, rather than mathematically, with programs written in Python. By working with a single case study throughout this thoroughly revised book, you’ll learn the entire process of exploratory data analysis, from collecting data and generating statistics to identifying patterns and testing hypotheses. You’ll explore distributions, rules of probability, visualization, and many other tools and concepts. New chapters on regression, time series analysis, survival analysis, and analytic methods will enrich your discoveries.
- Develop an understanding of probability and statistics by writing and testing code
- Run experiments to test statistical behavior, such as generating samples from several distributions
- Use simulations to understand concepts that are hard to grasp mathematically
- Import data from most sources with Python, rather than rely on data that’s cleaned and formatted for statistics tools
- Use statistical inference to answer questions about real-world data
Organizations can make data science a repeatable, predictable tool, which business professionals use to get more value from their data.
Enterprise data and AI projects are often scattershot, underbaked, siloed, and not adaptable to predictable business changes. As a result, the vast majority fail. These expensive quagmires can be avoided, and this book explains precisely how.
Data science is emerging as a hands-on tool for not just data scientists, but business professionals as well. Managers, directors, IT leaders, and analysts must expand their use of data science capabilities for the organization to stay competitive. Smarter Data Science helps them achieve their enterprise-grade data projects and AI goals. It serves as a guide to building a robust and comprehensive information architecture program that enables sustainable and scalable AI deployments.
When an organization manages its data effectively, its data science program becomes a fully scalable function that’s both prescriptive and repeatable. With an understanding of data science principles, practitioners are also empowered to lead their organizations in establishing and deploying viable AI. They employ the tools of machine learning, deep learning, and AI to extract greater value from data for the benefit of the enterprise.
By following a ladder framework that promotes prescriptive capabilities, organizations can make data science accessible to a range of team members, democratizing data science throughout the organization. Companies that collect, organize, and analyze data can move forward to additional data science achievements:
- Improving time-to-value with infused AI models for common use cases
- Optimizing knowledge work and business processes
- Utilizing AI-based business intelligence and data visualization
- Establishing a data topology to support general or highly specialized needs
- Successfully completing AI projects in a predictable manner
- Coordinating the use of AI from any compute node, from inner edges to outer edges: cloud, fog, and mist computing
When they climb the ladder presented in this book, businesspeople and data scientists alike will be able to improve and foster repeatable capabilities. They will have the knowledge to maximize their AI and data assets for the benefit of their organizations.
Master the math needed to excel in data science and machine learning. If you're a data scientist who lacks a math or scientific background or a developer who wants to add data domains to your skillset, this is your book. Author Hadrien Jean provides you with a foundation in math for data science, machine learning, and deep learning. Through the course of this book, you'll learn how to use mathematical notation to understand new developments in the field, communicate with your peers, and solve problems in mathematical form. You'll also understand what's under the hood of the algorithms you're using.
Learn how to:
- Use Python and Jupyter notebooks to plot data, represent equations, and visualize space transformations
- Read and write math notation to communicate ideas in data science and machine learning
- Perform descriptive statistics and preliminary observation on a dataset
- Manipulate vectors, matrices, and tensors to use machine learning and deep learning libraries such as TensorFlow or Keras
- Explore reasons behind a broken model and be prepared to tune and fix it
- Choose the right tool or algorithm for the right data problem
This best-selling textbook for a second course in linear algebra is aimed at undergrad math majors and graduate students. The novel approach taken here banishes determinants to the end of the book. The text focuses on the central goal of linear algebra: understanding the structure of linear operators on finite-dimensional vector spaces. The author has taken unusual care to motivate concepts and to simplify proofs. A variety of interesting exercises in each chapter helps students understand and manipulate the objects of linear algebra.
The third edition contains major improvements and revisions throughout the book. More than 300 new exercises have been added since the previous edition. Many new examples have been added to illustrate the key ideas of linear algebra. New topics covered in the book include product spaces, quotient spaces, and dual spaces. Beautiful new formatting creates pages with an unusually pleasant appearance in both print and electronic versions.
No prerequisites are assumed other than the usual demand for suitable mathematical maturity. Thus the text starts by discussing vector spaces, linear independence, span, basis, and dimension. The book then deals with linear maps, eigenvalues, and eigenvectors. Inner-product spaces are introduced, leading to the finite-dimensional spectral theorem and its consequences. Generalized eigenvectors are then used to provide insight into the structure of a linear operator.
This is the first textbook on pattern recognition to present the Bayesian viewpoint. The book presents approximate inference algorithms that permit fast approximate answers in situations where exact answers are not feasible. It uses graphical models to describe probability distributions when no other books apply graphical models to machine learning. No previous knowledge of pattern recognition or machine learning concepts is assumed. Familiarity with multivariate calculus and basic linear algebra is required, and some experience in the use of probabilities would be helpful though not essential as the book includes a self-contained introduction to basic probability theory.
Peter Norvig, Research Director at Google, co-author of AIMA, the most popular AI textbook in the world: "Burkov has undertaken a very useful but impossibly hard task in reducing all of machine learning to 100 pages. He succeeds well in choosing the topics, both theory and practice, that will be useful to practitioners, and for the reader who understands that this is the first 100 (or actually 150) pages you will read, not the last, provides a solid introduction to the field."
Aurélien Géron, Senior AI Engineer, author of the bestseller Hands-On Machine Learning with Scikit-Learn and TensorFlow: "The breadth of topics the book covers is amazing for just 100 pages (plus few bonus pages!). Burkov doesn't hesitate to go into the math equations: that's one thing that short books usually drop. I really liked how the author explains the core concepts in just a few words. The book can be very useful for newcomers in the field, as well as for old-timers who can gain from such a broad view of the field."
Karolis Urbonas, Head of Data Science at Amazon: "A great introduction to machine learning from a world-class practitioner."
Chao Han, VP, Head of R&D at Lucidworks: "I wish such a book existed when I was a statistics graduate student trying to learn about machine learning."
Sujeet Varakhedi, Head of Engineering at eBay: "Andriy's book does a fantastic job of cutting the noise and hitting the tracks and full speed from the first page."
Deepak Agarwal, VP of Artificial Intelligence at LinkedIn: "A wonderful book for engineers who want to incorporate ML in their day-to-day work without necessarily spending an enormous amount of time."
Vincent Pollet, Head of Research at Nuance: "The Hundred-Page Machine Learning Book is an excellent read to get started with Machine Learning."
Gareth James, Professor of Data Sciences and Operations, co-author of the bestseller An Introduction to Statistical Learning, with Applications in R: "This is a compact “how to do data science” manual and I predict it will become a go-to resource for academics and practitioners alike. At 100 pages (or a little more), the book is short enough to read in a single sitting. Yet, despite its length, it covers all the major machine learning approaches, ranging from classical linear and logistic regression, through to modern support vector machines, deep learning, boosting, and random forests. There is also no shortage of details on the various approaches and the interested reader can gain further information on any particular method via the innovative companion book wiki. The book does not assume any high level mathematical or statistical training or even programming experience, so should be accessible to almost anyone willing to invest the time to learn about these methods. It should certainly be required reading for anyone starting a PhD program in this area and will serve as a useful reference as they progress further. Finally, the book illustrates some of the algorithms using Python code, one of the most popular coding languages for machine learning. I would highly recommend “The Hundred-Page Machine Learning Book” for both the beginner looking to learn more about machine learning and the experienced practitioner seeking to extend their knowledge base."
Everything you really need to know in Machine Learning in a hundred pages.
Printed in full color! Unlock the groundbreaking advances of deep learning with this extensively revised new edition of the bestselling original. Learn directly from the creator of Keras and master practical Python deep learning techniques that are easy to apply in the real world.
In Deep Learning with Python, Second Edition you will learn:
- Deep learning from first principles
- Image classification and image segmentation
- Timeseries forecasting
- Text classification and machine translation
- Text generation, neural style transfer, and image generation
- Full color printing throughout
Deep Learning with Python has taught thousands of readers how to put the full capabilities of deep learning into action. This extensively revised full color second edition introduces deep learning using Python and Keras, and is loaded with insights for both novice and experienced ML practitioners. You’ll learn practical techniques that are easy to apply in the real world, and important theory for perfecting neural networks.
Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.
About the technology
Recent innovations in deep learning unlock exciting new software capabilities like automated language translation, image recognition, and more. Deep learning is quickly becoming essential knowledge for every software developer, and modern tools like Keras and TensorFlow put it within your reach, even if you have no background in mathematics or data science. This book shows you how to get started.
About the book
Deep Learning with Python, Second Edition introduces the field of deep learning using Python and the powerful Keras library. In this revised and expanded new edition, Keras creator François Chollet offers insights for both novice and experienced machine learning practitioners. As you move through this book, you’ll build your understanding through intuitive explanations, crisp color illustrations, and clear examples. You’ll quickly pick up the skills you need to start developing deep-learning applications.
What's inside
- Deep learning from first principles
- Image classification and image segmentation
- Time series forecasting
- Text classification and machine translation
- Text generation, neural style transfer, and image generation
- Full color printing throughout
About the reader
For readers with intermediate Python skills. No previous experience with Keras, TensorFlow, or machine learning is required.
About the author
François Chollet is a software engineer at Google and creator of the Keras deep-learning library.
Table of Contents
1 What is deep learning?
2 The mathematical building blocks of neural networks
3 Introduction to Keras and TensorFlow
4 Getting started with neural networks: Classification and regression
5 Fundamentals of machine learning
6 The universal workflow of machine learning
7 Working with Keras: A deep dive
8 Introduction to deep learning for computer vision
9 Advanced deep learning for computer vision
10 Deep learning for timeseries
11 Deep learning for text
12 Generative deep learning
13 Best practices for the real world
14 Conclusions
Summary
Dask is a native parallel analytics tool designed to integrate seamlessly with the libraries you're already using, including Pandas, NumPy, and Scikit-Learn. With Dask you can crunch and work with huge datasets, using the tools you already have. And Data Science with Python and Dask is your guide to using Dask for your data projects without changing the way you work!
Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications. You'll find registration instructions inside the print book.
About the Technology
An efficient data pipeline means everything for the success of a data science project. Dask is a flexible library for parallel computing in Python that makes it easy to build intuitive workflows for ingesting and analyzing large, distributed datasets. Dask provides dynamic task scheduling and parallel collections that extend the functionality of NumPy, Pandas, and Scikit-learn, enabling users to scale their code from a single laptop to a cluster of hundreds of machines with ease.
About the Book
Data Science with Python and Dask teaches you to build scalable projects that can handle massive datasets. After meeting the Dask framework, you'll analyze data in the NYC Parking Ticket database and use DataFrames to streamline your process. Then, you'll create machine learning models using Dask-ML, build interactive visualizations, and build clusters using AWS and Docker.
What's inside
- Working with large, structured and unstructured datasets
- Visualization with Seaborn and Datashader
- Implementing your own algorithms
- Building distributed apps with Dask Distributed
- Packaging and deploying Dask apps
About the Reader
For data scientists and developers with experience using Python and the PyData stack.
About the Author
Jesse Daniel is an experienced Python developer. He taught Python for Data Science at the University of Denver and leads a team of data scientists at a Denver-based media technology company.
Table of Contents
PART 1 - The Building Blocks of scalable computing
- Why scalable computing matters
- Introducing Dask
PART 2 - Working with Structured Data using Dask DataFrames
- Introducing Dask DataFrames
- Loading data into DataFrames
- Cleaning and transforming DataFrames
- Summarizing and analyzing DataFrames
- Visualizing DataFrames with Seaborn
- Visualizing location data with Datashader
PART 3 - Extending and deploying Dask
- Working with Bags and Arrays
- Machine learning with Dask-ML
- Scaling and deploying Dask
This textbook provides a single-source introduction to the primary approaches to machine learning. It is intended for advanced undergraduate and graduate students, as well as for developers and researchers in the field. No prior background in artificial intelligence or statistics is assumed. Several key algorithms, example data sets, and project-oriented homework assignments discussed in the book are accessible through the World Wide Web.
Several new chapters are available from the author's website.
Data is at the center of many challenges in system design today. Difficult issues need to be figured out, such as scalability, consistency, reliability, efficiency, and maintainability. In addition, we have an overwhelming variety of tools, including relational databases, NoSQL datastores, stream or batch processors, and message brokers. What are the right choices for your application? How do you make sense of all these buzzwords?
In this practical and comprehensive guide, author Martin Kleppmann helps you navigate this diverse landscape by examining the pros and cons of various technologies for processing and storing data. Software keeps changing, but the fundamental principles remain the same. With this book, software engineers and architects will learn how to apply those ideas in practice, and how to make full use of data in modern applications.
- Peer under the hood of the systems you already use, and learn how to use and operate them more effectively
- Make informed decisions by identifying the strengths and weaknesses of different tools
- Navigate the trade-offs around consistency, scalability, fault tolerance, and complexity
- Understand the distributed systems research upon which modern databases are built
- Peek behind the scenes of major online services, and learn from their architectures
An introduction to a broad range of topics in deep learning, covering mathematical and conceptual background, deep learning techniques used in industry, and research perspectives.
“Written by three experts in the field, Deep Learning is the only comprehensive book on the subject.” (Elon Musk, cochair of OpenAI; cofounder and CEO of Tesla and SpaceX)
Deep learning is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts. Because the computer gathers knowledge from experience, there is no need for a human computer operator to formally specify all the knowledge that the computer needs. The hierarchy of concepts allows the computer to learn complicated concepts by building them out of simpler ones; a graph of these hierarchies would be many layers deep. This book introduces a broad range of topics in deep learning.
The text offers mathematical and conceptual background, covering relevant concepts in linear algebra, probability theory and information theory, numerical computation, and machine learning. It describes deep learning techniques used by practitioners in industry, including deep feedforward networks, regularization, optimization algorithms, convolutional networks, sequence modeling, and practical methodology; and it surveys such applications as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames. Finally, the book offers research perspectives, covering such theoretical topics as linear factor models, autoencoders, representation learning, structured probabilistic models, Monte Carlo methods, the partition function, approximate inference, and deep generative models.
Deep Learning can be used by undergraduate or graduate students planning careers in either industry or research, and by software engineers who want to begin using deep learning in their products or platforms. A website offers supplementary material for both readers and instructors.
This introductory textbook provides an inexpensive, brief overview of statistics to help readers gain a better understanding of how statistics work and how to interpret them correctly. Each chapter describes a different statistical technique, ranging from basic concepts like central tendency and describing distributions to more advanced concepts such as t tests, regression, repeated measures ANOVA, and factor analysis. Each chapter begins with a short description of the statistic and when it should be used. This is followed by a more in-depth explanation of how the statistic works. Finally, each chapter ends with an example of the statistic in use, and a sample of how the results of analyses using the statistic might be written up for publication. A glossary of statistical terms and symbols is also included. Using the author’s own data and examples from published research and the popular media, the book is a straightforward and accessible guide to statistics.
New features in the fourth edition include:
- sets of work problems in each chapter with detailed solutions and additional problems online to help students test their understanding of the material
- new "Worked Examples" to walk students through how to calculate and interpret the statistics featured in each chapter
- new examples from the author’s own data and from published research and the popular media to help students see how statistics are applied and written about in professional publications
- many more examples, tables, and charts to help students visualize key concepts, clarify concepts, and demonstrate how the statistics are used in the real world
- a more logical flow, with correlation directly preceding regression, and a combined glossary appearing at the end of the book
- a Quick Guide to Statistics, Formulas, and Degrees of Freedom at the start of the book, plainly outlining each statistic and when students should use them
- greater emphasis on (and description of) effect size and confidence interval reporting, reflecting their growing importance in research across the social science disciplines
- an expanded website at www.routledge.com/cw/urdan with PowerPoint presentations, chapter summaries, a new test bank, interactive problems and detailed solutions to the text’s work problems, SPSS datasets for practice, links to useful tools and resources, and videos showing how to calculate statistics, how to calculate and interpret the appendices, and how to understand some of the more confusing tables of output produced by SPSS
Statistics in Plain English, Fourth Edition is an ideal guide for statistics, research methods, and/or for courses that use statistics taught at the undergraduate or graduate level, or as a reference tool for anyone interested in refreshing their memory about key statistical concepts. The research examples are from psychology, education, and other social and behavioral sciences.
The Data Science Handbook contains candid interviews with 25 of the world’s best data scientists. We sat down with them and had in-depth conversations about their careers, personal stories, perspectives on data science, and life advice.
In The Data Science Handbook, you will find war stories from DJ Patil, US Chief Data Officer and one of the founders of the field. You’ll learn from industry veterans such as Kevin Novak and Riley Newman, who head the data science teams at Uber and Airbnb respectively. You’ll also read about rising data scientists such as Clare Corthell, who crafted her own open source data science masters program.
This book is perfect for aspiring or current data scientists to learn from the best. It’s a reference book packed full of strategies, suggestions and recipes to launch and grow your own data science career.
This book contains insight and interviews with data scientists from established companies such as Facebook, LinkedIn, Pandora, Intuit, and The New York Times. We also spoke with data scientists at fast-growing startups such as Uber, Airbnb, Mattermark, Quora, Square and Khan Academy.
Data Science and Big Data Analytics is about harnessing the power of data for new insights. The book covers the breadth of activities and methods and tools that Data Scientists use. The content focuses on concepts, principles and practical applications that are applicable to any industry and technology environment, and the learning is supported and explained with examples that you can replicate using open-source software.
This book will help you:
- Become a contributor on a data science team
- Deploy a structured lifecycle approach to data analytics problems
- Apply appropriate analytic techniques and tools to analyzing big data
- Learn how to tell a compelling story with data to drive business action
- Prepare for EMC Proven Professional Data Science Certification
Get started discovering, analyzing, visualizing, and presenting data in a meaningful way today!
Wouldn't it be great if there were a statistics book that made histograms, probability distributions, and chi-square analysis more enjoyable than going to the dentist? Head First Statistics brings this typically dry subject to life, teaching you everything you want and need to know about statistics through engaging, interactive, and thought-provoking material, full of puzzles, stories, quizzes, visual aids, and real-world examples. Whether you're a student, a professional, or just curious about statistical analysis, Head First's brain-friendly formula helps you get a firm grasp of statistics so you can understand key points and actually use them. Learn to present data visually with charts and plots; discover the difference between taking the average with mean, median, and mode, and why it's important; learn how to calculate probability and expectation; and much more.
Written by renowned data science experts Foster Provost and Tom Fawcett, Data Science for Business introduces the fundamental principles of data science, and walks you through the "data-analytic thinking" necessary for extracting useful knowledge and business value from the data you collect. This guide also helps you understand the many data-mining techniques in use today.
Based on an MBA course Provost has taught at New York University over the past ten years, Data Science for Business provides examples of real-world business problems to illustrate these principles. You’ll not only learn how to improve communication between business stakeholders and data scientists, but also how to participate intelligently in your company’s data science projects. You’ll also discover how to think data-analytically, and fully appreciate how data science methods can support business decision-making.
- Understand how data science fits in your organization, and how you can use it for competitive advantage
- Treat data as a business asset that requires careful investment if you’re to gain real value
- Approach business problems data-analytically, using the data-mining process to gather good data in the most appropriate way
- Learn general concepts for actually extracting knowledge from data
- Apply data science principles when interviewing data science job candidates
Through a series of recent breakthroughs, deep learning has boosted the entire field of machine learning. Now, even programmers who know close to nothing about this technology can use simple, efficient tools to implement programs capable of learning from data. This practical book shows you how.
By using concrete examples, minimal theory, and two production-ready Python frameworks (scikit-learn and TensorFlow), author Aurélien Géron helps you gain an intuitive understanding of the concepts and tools for building intelligent systems. You'll learn a range of techniques, starting with simple linear regression and progressing to deep neural networks. With exercises in each chapter to help you apply what you've learned, all you need is programming experience to get started.
- Explore the machine learning landscape, particularly neural nets
- Use scikit-learn to track an example machine-learning project end-to-end
- Explore several training models, including support vector machines, decision trees, random forests, and ensemble methods
- Use the TensorFlow library to build and train neural nets
- Dive into neural net architectures, including convolutional nets, recurrent nets, and deep reinforcement learning
- Learn techniques for training and scaling deep neural nets
- Apply practical code examples without acquiring excessive machine learning theory or algorithm details
The long-anticipated revision of Artificial Intelligence: A Modern Approach explores the full breadth and depth of the field of artificial intelligence (AI). The 4th Edition brings readers up to date on the latest technologies, presents concepts in a more unified manner, and offers new or expanded coverage of machine learning, deep learning, transfer learning, multi-agent systems, robotics, natural language processing, causality, probabilistic programming, privacy, fairness, and safe AI.
A great building requires a strong foundation. This book teaches basic Artificial Intelligence algorithms such as dimensionality, distance metrics, clustering, error calculation, hill climbing, Nelder Mead, and linear regression. These are not just foundational algorithms for the rest of the series, but are very useful in their own right. The book explains all algorithms using actual numeric calculations that you can perform yourself. Artificial Intelligence for Humans is a book series meant to teach AI to those without an extensive mathematical background. The reader needs only a knowledge of basic college algebra or computer programming—anything more complicated than that is thoroughly explained. Every chapter also includes a programming example. Examples are currently provided in Java, C#, R, Python and C. Other languages planned.
Statistical approaches to processing natural language text have become dominant in recent years. This foundational text is the first comprehensive introduction to statistical natural language processing (NLP) to appear. The book contains all the theory and algorithms needed for building NLP tools. It provides broad but rigorous coverage of mathematical and linguistic foundations, as well as detailed discussion of statistical methods, allowing students and researchers to construct their own implementations. The book covers collocation finding, word sense disambiguation, probabilistic parsing, information retrieval, and other applications.
Written with the aim of becoming the primary resource for students of business analytics, this book provides a holistic perspective of analytics with theoretical foundations and applications of the theory using examples across several industries.
Major changes in this edition include the substitution of probabilistic arguments for combinatorial artifices, and the addition of new sections on branching processes, Markov chains, and the De Moivre-Laplace theorem.