Learn how to identify the unusual, interesting, extreme, or inaccurate parts of your data.
Data scientists have two main tasks: finding patterns in data and finding the exceptions. These outliers are often the most informative parts of data, revealing hidden insights, novel patterns, and potential problems. Outlier Detection in Python is a practical guide to spotting the parts of a dataset that deviate from the norm, even when they're hidden or intertwined among the expected data points.
In Outlier Detection in Python, you'll learn how to:
- Use standard Python libraries to identify outliers
- Select the most appropriate detection methods
- Combine multiple outlier detection methods for improved results
- Interpret your results effectively
- Work with numeric, categorical, time series, and text data
Outlier detection is a vital tool for modern business, whether it's discovering new products, expanding markets, or flagging fraud and other suspicious activities. This guide presents the core tools for outlier detection, as well as techniques that use the Python data stack familiar to data scientists. To get started, you need only a basic understanding of statistics and the Python data ecosystem.
About the technology
Outliers—values that appear inconsistent with the rest of your data—can be the key to identifying fraud, performing a security audit, spotting bot activity, or just assessing the quality of a dataset. This unique guide introduces the outlier detection tools, techniques, and algorithms you’ll need to find, understand, and respond to the anomalies in your data.
About the book
Outlier Detection in Python illustrates the principles and practices of outlier detection (OD) with diverse real-world examples drawn from social media, finance, network logs, and other important domains. You’ll explore a comprehensive set of statistical methods and machine learning approaches to identify and interpret the unexpected values in tabular, text, time series, and image data. Along the way, you’ll work with scikit-learn and PyOD, apply key OD algorithms, and add some high-value techniques for real-world OD scenarios to your toolkit.
What's inside
- Use Python libraries to identify outliers
- Combine outlier detection methods
- Interpret your results
About the reader
For Python programmers familiar with tools like pandas and NumPy, and the basics of statistics.
About the author
Brett Kennedy is a data scientist with over thirty years’ experience in software development and data science.
Ebook license
End-user warranty and license agreement
1. Grant of license
Manning has authorized the download by you of an unrestricted number of copies of the electronic book (ebook) in any of the available formats. Manning grants you a nonexclusive, nontransferable license to use the ebook according to the terms and conditions herein. This license agreement permits you to install the ebook on any and all your devices for your personal use only.
2. Restrictions
You shall not: (1) share, resell, rent, assign, timeshare, distribute, or transfer all or part of the ebook or any rights granted hereunder to any other person; (2) duplicate the ebook, except for a single backup or archival copy; (3) remove any proprietary notices, labels, or marks from the ebook; (4) transfer or sublicense title to the ebook to any other party.
3. Intellectual property protection
The ebook is owned by Manning and is protected by United States and international copyright and other intellectual property laws. Manning reserves all rights in the ebook not expressly granted herein. This license and your right to use the ebook terminate automatically if you violate any part of this agreement. In the event of termination, you must remove the original and any copies of the ebook from all your devices.
4. Source code supplementary material
Any source code files provided as a supplement to the book are freely available to the public for download. Reuse of the code is permitted, in whole or in part, including the creation of derivative works, provided that you acknowledge that you are using it and identify the source: title, publisher and year.
5. Limited warranty
Manning warrants that the ebook files, a copy of which you are authorized to download, are free from defects in the operational sense that they can be read by a PDF reader or EPUB reader, or other. Except for this express limited warranty, Manning makes and you receive no warranties, express, implied, statutory or in any communication with you, and Manning specifically disclaims any other warranty including the implied warranty of merchantability or fitness for a particular purpose. Manning does not warrant that the operation of the ebook will be uninterrupted or error free. If the ebook was purchased in the United States, the above exclusions may not apply to you as some states do not allow the exclusion of implied warranties. In addition to the above warranty rights, you may also have other rights that vary from state to state.
6. Limitation of liability
In no event will Manning be liable for any damages, whether arising for tort or contract, including loss of data, lost profits, or other special, incidental, consequential, or indirect damages arising out of the use or inability to use the ebook.
7. General
This agreement constitutes the entire agreement between you and Manning and supersedes any prior agreement concerning the ebook. This agreement is governed by the laws of the State of New York.