Python Feature Engineering Cookbook

Title Python Feature Engineering Cookbook PDF eBook
Author Soledad Galli
Publisher Packt Publishing Ltd
Pages 364
Release 2020-01-22
Genre Computers
ISBN 1789807824

Extract accurate information from data to train and improve machine learning models using the NumPy, SciPy, pandas, and scikit-learn libraries.

Key Features
- Discover solutions for feature generation, feature extraction, and feature selection
- Uncover the end-to-end feature engineering process across continuous, discrete, and unstructured datasets
- Implement modern feature extraction techniques using Python's pandas, scikit-learn, SciPy, and NumPy libraries

Book Description
Feature engineering is invaluable for developing and enriching your machine learning models. In this cookbook, you will work with the best tools to streamline your feature engineering pipelines and techniques, and to simplify and improve the quality of your code. Using Python libraries such as pandas, scikit-learn, Featuretools, and Feature-engine, you’ll learn how to work with both continuous and discrete datasets and be able to transform features from unstructured datasets. You will develop the skills necessary to select the best features as well as the most suitable extraction techniques. This book will cover Python recipes that will help you automate feature engineering to simplify complex processes. You’ll also get to grips with different feature engineering strategies, such as the Box-Cox transform, power transform, and log transform, across machine learning, reinforcement learning, and natural language processing (NLP) domains. By the end of this book, you’ll have discovered tips and practical solutions to all of your feature engineering problems.

What you will learn
- Simplify your feature engineering pipelines with powerful Python packages
- Get to grips with imputing missing values
- Encode categorical variables with a wide set of techniques
- Extract insights from text quickly and effortlessly
- Develop features from transactional data and time series data
- Derive new features by combining existing variables
- Understand how to transform, discretize, and scale your variables
- Create informative variables from date and time

Who this book is for
This book is for machine learning professionals, AI engineers, data scientists, and NLP and reinforcement learning engineers who want to optimize and enrich their machine learning models with the best features. Knowledge of machine learning and Python coding will assist you with understanding the concepts covered in this book.

Python Feature Engineering Cookbook

Title Python Feature Engineering Cookbook PDF eBook
Author Soledad Galli
Publisher Packt Publishing Ltd
Pages 386
Release 2022-10-31
Genre Computers
ISBN 1804615390

Create end-to-end, reproducible feature engineering pipelines that can be deployed into production using open-source Python libraries.

Key Features
- Learn and implement feature engineering best practices
- Reinforce your learning with the help of multiple hands-on recipes
- Build end-to-end feature engineering pipelines that are performant and reproducible

Book Description
Feature engineering, the process of transforming variables and creating features, albeit time-consuming, ensures that your machine learning models perform seamlessly. This second edition of Python Feature Engineering Cookbook will take the struggle out of feature engineering by showing you how to use open source Python libraries to accelerate the process via a plethora of practical, hands-on recipes. This updated edition begins by addressing fundamental data challenges such as missing data and categorical values, before moving on to strategies for dealing with skewed distributions and outliers. The concluding chapters show you how to develop new features from various types of data, including text, time series, and relational databases. With the help of numerous open source Python libraries, you'll learn how to implement each feature engineering method in a performant, reproducible, and elegant manner. By the end of this Python book, you will have the tools and expertise needed to confidently build end-to-end and reproducible feature engineering pipelines that can be deployed into production.

What you will learn
- Impute missing data using various univariate and multivariate methods
- Encode categorical variables with one-hot, ordinal, and count encoding
- Handle highly cardinal categorical variables
- Transform, discretize, and scale your variables
- Create variables from date and time with pandas and Feature-engine
- Combine variables into new features
- Extract features from text as well as from transactional data with Featuretools
- Create features from time series data with tsfresh

Who this book is for
This book is for machine learning and data science students and professionals, as well as software engineers working on machine learning model deployment, who want to learn more about how to transform their data and create new features to train machine learning models in a better way.
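
For readers unfamiliar with two of the techniques named in this listing, univariate imputation and one-hot encoding, here is a minimal sketch. It is not taken from the book: it uses plain standard-library Python rather than pandas or Feature-engine, and the function names and sample values are hypothetical.

```python
from statistics import median


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def one_hot(values):
    """Return (categories, rows); each row has a 1 in its category's column."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


ages = [22, None, 35, 58]           # hypothetical numeric column with a gap
colours = ["red", "blue", "red"]    # hypothetical categorical column

ages_filled = impute_median(ages)
categories, encoded = one_hot(colours)
```

In a real pipeline the fill value and the category list would normally be learned from the training data only and then reused on new data.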

Feature Engineering for Machine Learning

Title Feature Engineering for Machine Learning PDF eBook
Author Alice Zheng
Publisher O'Reilly Media, Inc.
Pages 218
Release 2018-03-23
Genre Computers
ISBN 1491953195

Feature engineering is a crucial step in the machine-learning pipeline, yet this topic is rarely examined on its own. With this practical book, you’ll learn techniques for extracting and transforming features (the numeric representations of raw data) into formats for machine-learning models. Each chapter guides you through a single data problem, such as how to represent text or image data. Together, these examples illustrate the main principles of feature engineering. Rather than simply teach these principles, authors Alice Zheng and Amanda Casari focus on practical application with exercises throughout the book. The closing chapter brings everything together by tackling a real-world, structured dataset with several feature-engineering techniques. Python packages including NumPy, pandas, scikit-learn, and Matplotlib are used in code examples.

You’ll examine:
- Feature engineering for numeric data: filtering, binning, scaling, log transforms, and power transforms
- Natural text techniques: bag-of-words, n-grams, and phrase detection
- Frequency-based filtering and feature scaling for eliminating uninformative features
- Encoding techniques of categorical variables, including feature hashing and bin-counting
- Model-based feature engineering with principal component analysis
- The concept of model stacking, using k-means as a featurization technique
- Image feature extraction with manual and deep-learning techniques
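
As a rough illustration of two of the numeric techniques named in this listing, log transforms and binning, here is a minimal sketch. It is not taken from the book: it uses plain standard-library Python rather than NumPy or pandas, and the function names and sample counts are hypothetical.

```python
import math


def log_transform(values):
    """Compress a heavy-tailed, non-negative variable with log(1 + x)."""
    return [math.log1p(v) for v in values]


def fixed_width_bins(values, width):
    """Map each value to the index of the fixed-width bin it falls in."""
    return [int(v // width) for v in values]


counts = [0, 9, 99, 999]                      # hypothetical heavy-tailed counts
logged = log_transform(counts)                # spreads out the small values
binned = fixed_width_bins(counts, width=100)  # bins of width 100
```

The log transform keeps the ordering of the values while shrinking the gap between large ones; fixed-width binning discards detail within each bin, so the width is a modelling choice.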

Practical Data Science Cookbook

Title Practical Data Science Cookbook PDF eBook
Author Prabhanjan Tattar
Publisher Packt Publishing Ltd
Pages 428
Release 2017-06-29
Genre Computers
ISBN 178712326X

Over 85 recipes to help you complete real-world data science projects in R and Python.

About This Book
- Tackle every step in the data science pipeline and use it to acquire, clean, analyze, and visualize your data
- Get beyond the theory and implement real-world projects in data science using R and Python
- Easy-to-follow recipes will help you understand and implement the numerical computing concepts

Who This Book Is For
If you are an aspiring data scientist who wants to learn data science and numerical programming concepts through hands-on, real-world project examples, this is the book for you. Whether you are brand new to data science or you are a seasoned expert, you will benefit from learning about the structure of real-world data science projects and the programming examples in R and Python.

What You Will Learn
- Learn and understand the installation procedure and environment required for R and Python on various platforms
- Prepare data for analysis by implementing various data science concepts such as acquisition, cleaning, and munging through R and Python
- Build a predictive model and an exploratory model
- Analyze the results of your model and create reports on the acquired data
- Build various tree-based methods and random forests

In Detail
As increasing amounts of data are generated each year, the need to analyze and create value out of it is more important than ever. Companies that know what to do with their data and how to do it well will have a competitive advantage over companies that don't. Because of this, there will be an increasing demand for people who possess both the analytical and technical abilities to extract valuable insights from data and create valuable solutions that put those insights to use. Starting with the basics, this book covers how to set up your numerical programming environment, introduces you to the data science pipeline, and guides you through several data projects in a step-by-step format. By sequentially working through the steps in each chapter, you will quickly familiarize yourself with the process and learn how to apply it to a variety of situations with examples using the two most popular programming languages for data analysis: R and Python.

Style and approach
This step-by-step guide to data science is full of hands-on examples of real-world data science tasks. Each recipe focuses on a particular task involved in the data science pipeline, ranging from readying the dataset to analytics and visualization.

Feature Engineering and Selection

Title Feature Engineering and Selection PDF eBook
Author Max Kuhn
Publisher CRC Press
Pages 266
Release 2019-07-25
Genre Business & Economics
ISBN 1351609467

The process of developing predictive models includes many stages. Most resources focus on the modeling algorithms but neglect other critical aspects of the modeling process. This book describes techniques for finding the best representations of predictors for modeling and for finding the best subset of predictors for improving model performance. A variety of example data sets are used to illustrate the techniques, along with R programs for reproducing the results.

Artificial Intelligence with Python

Title Artificial Intelligence with Python PDF eBook
Author Alberto Artasanchez
Publisher Packt Publishing Ltd
Pages 619
Release 2020-01-31
Genre Computers
ISBN 1839216077

New edition of the bestselling guide to artificial intelligence with Python, updated to Python 3.x, with seven new chapters that cover RNNs, AI and Big Data, fundamental use cases, chatbots, and more.

Key Features
- Completely updated and revised to Python 3.x
- New chapters for AI on the cloud, recurrent neural networks, deep learning models, and feature selection and engineering
- Learn more about deep learning algorithms, machine learning data pipelines, and chatbots

Book Description
Artificial Intelligence with Python, Second Edition is an updated and expanded version of the bestselling guide to artificial intelligence using the latest version of Python 3.x. Not only does it provide you with an introduction to artificial intelligence, this new edition goes further by giving you the tools you need to explore the amazing world of intelligent apps and create your own applications. This edition also includes seven new chapters on more advanced concepts of Artificial Intelligence, including fundamental use cases of AI; machine learning data pipelines; feature selection and feature engineering; AI on the cloud; the basics of chatbots; RNNs and DL models; and AI and Big Data. Finally, this new edition explores various real-world scenarios and teaches you how to apply relevant AI algorithms to a wide swath of problems, starting with the most basic AI concepts and progressively building from there to solve more difficult challenges. By the end, you will have gained a solid understanding of these many artificial intelligence techniques and of when best to use them.

What you will learn
- Understand what artificial intelligence, machine learning, and data science are
- Explore the most common artificial intelligence use cases
- Learn how to build a machine learning pipeline
- Assimilate the basics of feature selection and feature engineering
- Identify the differences between supervised and unsupervised learning
- Discover the most recent advances and tools offered for AI development in the cloud
- Develop automatic speech recognition systems and chatbots
- Apply AI algorithms to time series data

Who this book is for
The intended audience for this book is Python developers who want to build real-world Artificial Intelligence applications. Basic Python programming experience and awareness of machine learning concepts and techniques are mandatory.

Python for Finance Cookbook

Title Python for Finance Cookbook PDF eBook
Author Eryk Lewinson
Publisher Packt Publishing Ltd
Pages 426
Release 2020-01-31
Genre Computers
ISBN 1789617324

Solve common and not-so-common financial problems using Python libraries such as NumPy, SciPy, and pandas.

Key Features
- Use powerful Python libraries such as pandas, NumPy, and SciPy to analyze your financial data
- Explore unique recipes for financial data analysis and processing with Python
- Estimate popular financial models such as CAPM and GARCH using a problem-solution approach

Book Description
Python is one of the most popular programming languages used in the financial industry, with a huge set of accompanying libraries. In this book, you'll cover different ways of downloading financial data and preparing it for modeling. You'll calculate popular indicators used in technical analysis, such as Bollinger Bands, MACD, and RSI, and backtest automatic trading strategies. Next, you'll cover time series analysis and models, such as exponential smoothing, ARIMA, and GARCH (including multivariate specifications), before exploring the popular CAPM and the Fama-French three-factor model. You'll then discover how to optimize asset allocation and use Monte Carlo simulations for tasks such as calculating the price of American options and estimating the Value at Risk (VaR). In later chapters, you'll work through an entire data science project in the financial domain. You'll also learn how to solve the credit card fraud and default problems using advanced classifiers such as random forest, XGBoost, LightGBM, and stacked models. You'll then be able to tune the hyperparameters of the models and handle class imbalance. Finally, you'll focus on learning how to use deep learning (PyTorch) for approaching financial tasks. By the end of this book, you’ll have learned how to effectively analyze financial data using a recipe-based approach.

What you will learn
- Download and preprocess financial data from different sources
- Backtest the performance of automatic trading strategies in a real-world setting
- Estimate financial econometrics models in Python and interpret their results
- Use Monte Carlo simulations for a variety of tasks such as derivatives valuation and risk assessment
- Improve the performance of financial models with the latest Python libraries
- Apply machine learning and deep learning techniques to solve different financial problems
- Understand the different approaches used to model financial time series data

Who this book is for
This book is for financial analysts, data analysts, and Python developers who want to learn how to implement a broad range of tasks in the finance domain. Data scientists looking to devise intelligent financial strategies to perform efficient financial analysis will also find this book useful. Working knowledge of the Python programming language is mandatory to grasp the concepts covered in the book effectively.