Natural Language Processing and Machine Learning for Developers

(NLP-DEV.AJ1) / ISBN : 979-8-90059-036-3

About This Course

Natural Language Processing and Machine Learning for Developers tackles the messy reality of text data. You'll run into issues quickly if you don't understand how to preprocess, encode, and model language. It's not just about running a library function; it's about knowing why it sometimes blows up. This material helps you get past the initial hurdles. We cover the foundational pieces, using 17 Hands-on Labs and 18 Comprehensive Chapters to build up from basic data handling.

There are 86 Practice Quizzes and 100 Flashcards for reinforcing concepts. It won't make you a research scientist overnight, nor does it replace deep theoretical study. But it gives you a practical footing. Expect to grapple with things like dirty input and model selection. We've included 13 Practice Exercises and 100 Key Terms to help with that.

Skills You’ll Get

  • Data Preprocessing for NLP: Without this, models ingest garbage, producing results that are useless or actively misleading.
  • Text Vectorization: Ignoring this means your algorithms can't 'understand' text, leading to poor model performance and wasted compute cycles.
  • Model Selection for Text Tasks: Picking the wrong algorithm for a text problem often means spending days debugging a fundamentally flawed approach.
  • Handling Real-world Data Mess: If you can't clean and structure raw text, your entire ML pipeline stalls before it even begins.

1

Preface

  • Does This Course Contain Production-Level Code Samples?
  • What Are the Non-Technical Prerequisites for This Course?
  • How Do I Set Up a Command Shell?
2

Introduction to NumPy

  • What is NumPy?
  • What are NumPy Arrays?
  • Working with Loops
  • Appending Elements to Arrays (1)
  • Appending Elements to Arrays (2)
  • Multiply Lists and Arrays
  • Doubling the Elements in a List
  • Lists and Exponents
  • Arrays and Exponents
  • Math Operations and Arrays
  • Working with “-1” Subranges with Arrays
  • Other Useful NumPy Methods
  • Arrays and Vector Operations
  • NumPy and Dot Products (1)
  • NumPy and Dot Products (2)
  • NumPy and the “Norm” of Vectors
  • NumPy and Other Operations
  • NumPy and the reshape() Method
  • Calculating the Mean and Standard Deviation
  • Working with Lines in the Plane (Optional)
  • Plotting a Line with NumPy and Matplotlib
  • Plotting a Quadratic with NumPy and Matplotlib
  • What is Linear Regression?
  • The MSE Formula
  • Calculating the MSE Manually
  • Find the Best-Fitting Line with NumPy
  • Calculating MSE by Successive Approximation (1)
  • Calculating MSE by Successive Approximation (2)
  • What is Jax?
  • Google Colaboratory
  • Summary
3

Introduction to Pandas

  • What is Pandas?
  • A Pandas Data Frame with NumPy Example
  • Describing a Pandas Data Frame
  • Pandas Boolean Data Frames
  • Pandas Data Frames and Random Numbers
  • Reading CSV Files in Pandas
  • The loc() and iloc() Methods in Pandas
  • Converting Categorical Data to Numeric Data
  • Matching and Splitting Strings in Pandas
  • Converting Strings to Dates in Pandas
  • Merging and Splitting Columns in Pandas
  • Combining Pandas Data Frames
  • Data Manipulation with Pandas Data Frames (1)
  • Data Manipulation with Pandas Data Frames (2)
  • Data Manipulation with Pandas Data Frames (3)
  • Pandas Data Frames and CSV Files
  • Managing Columns in Data Frames
  • Managing Rows in Pandas
  • Handling Missing Data in Pandas
  • Sorting Data Frames in Pandas
  • Working with groupby() in Pandas
  • Working with apply() and applymap() in Pandas
  • Handling Outliers in Pandas
  • Pandas Data Frames and Scatterplots
  • Pandas Data Frames and Simple Statistics
  • Aggregate Operations in Pandas Data Frames
  • Aggregate Operations with the titanic.csv Dataset
  • Save Data Frames as CSV Files and Zip Files
  • Pandas Data Frames and Excel Spreadsheets
  • Working with JSON-based Data
  • Pandas and Regular Expressions (Optional)
  • Useful One-Line Commands in Pandas
  • What is Method Chaining?
  • Pandas Profiling
  • What is Texthero?
  • Summary
4

NLP Concepts (I)

  • The Origin of Languages
  • The Complexity of Natural Languages
  • Japanese Grammar
  • Phonetic Languages
  • Multiple Ways to Pronounce Consonants
  • English Pronouns and Prepositions
  • What is NLP?
  • A Wide-Angle View of NLP
  • Information Extraction and Retrieval
  • Word Sense Disambiguation
  • NLP Techniques in ML
  • Text Normalization and Tokenization
  • Handling Stop Words
  • What is Stemming?
  • What is Lemmatization?
  • Working with Text: POS
  • Working with Text: NER
  • What is Topic Modeling?
  • Keyword Extraction, Sentiment Analysis, and Text Summarization
  • Summary
5

NLP Concepts (II)

  • What is Word Relevance?
  • What is Text Similarity?
  • Sentence Similarity
  • Working with Documents
  • Techniques for Text Similarity
  • What is Text Encoding?
  • Text Encoding Techniques
  • The BoW Algorithm
  • What are n-grams?
  • Calculating tf, idf, and tf-idf
  • The Context of Words in a Document
  • What is Cosine Similarity?
  • Text Vectorization (aka Word Embeddings)
  • Overview of Word Embeddings and Algorithms
  • What is Word2vec?
  • The CBoW Architecture
  • What are Skip-grams?
  • What is GloVe?
  • Working with GloVe
  • What is FastText?
  • Comparison of Word Embeddings
  • What is Topic Modeling?
  • Language Models and NLP
  • Vector Space Models
  • NLP and Text Mining
  • Relation Extraction and Information Extraction
  • What is a BLEU Score?
  • Summary
6

Algorithms and Toolkits (I)

  • Cleaning Data with Regular Expressions
  • Handling Contracted Words
  • Python Code Samples of BoW
  • One-Hot Encoding Examples
  • Sklearn and Word Embedding Examples
  • What is BeautifulSoup?
  • Web Scraping with Pure Regular Expressions
  • What is Scrapy?
  • What is SpaCy?
  • SpaCy and Stop Words
  • SpaCy and Tokenization
  • SpaCy and Lemmatization
  • SpaCy and NER
  • SpaCy Pipelines
  • SpaCy and Word Vectors
  • The scispaCy Library (Optional)
  • Summary
7

Algorithms and Toolkits (II)

  • What is NLTK?
  • NLTK and BoW
  • NLTK and Stemmers
  • NLTK and Lemmatization
  • NLTK and Stop Words
  • What is Wordnet?
  • NLTK, lxml, and XPath
  • NLTK and n-grams
  • NLTK and POS (1)
  • NLTK and POS (2)
  • NLTK and Tokenizers
  • NLTK and Context-Free Grammars (Optional)
  • What is Gensim?
  • An Example of Topic Modeling
  • A Brief Comparison of Popular Python-Based NLP Libraries
  • Miscellaneous Libraries
  • Summary
8

Introduction to Machine Learning

  • What is Machine Learning?
  • Types of Machine Learning Algorithms
  • Preparing a Dataset and Training a Model
  • Feature Engineering, Selection, and Extraction
  • Working with Datasets
  • Overfitting versus Underfitting
  • Data Normalization Techniques
  • Metrics in Machine Learning
  • What is Linear Regression?
  • Other Types of Regression
  • Working with Lines in the Plane (Optional)
  • Scatter Plots with NumPy and Matplotlib (1)
  • Scatter Plots with NumPy and Matplotlib (2)
  • A Quadratic Scatterplot with NumPy and Matplotlib
  • The Mean Squared Error (MSE) Formula
  • Calculating the MSE Manually
  • Approximating Linear Data with np.linspace()
  • What are Ensemble Methods?
  • Four Types of Ensemble Methods
  • Common Boosting Algorithms
  • Hyperparameter Optimization
  • AutoML, AutoML-Zero, and AutoNLP
  • Miscellaneous Topics
  • Summary
9

Classifiers in Machine Learning

  • What is Classification?
  • What are Linear Classifiers?
  • What is kNN?
  • What are Decision Trees?
  • Decision Tree Code Samples
  • Decision Trees, Gini Impurity, and Entropy
  • What are Random Forests?
  • What are Support Vector Machines?
  • What is a Bayesian Classifier?
  • Training Classifiers
  • Evaluating Classifiers
  • Trade-offs for ML Algorithms
  • What are Activation Functions?
  • Common Activation Functions
  • The ReLU and ELU Activation Functions
  • Sigmoid, Softmax, and Hardmax Similarities
  • Sigmoid, Softmax, and HardMax Differences
  • Hyperparameters for Neural Networks
  • What is Logistic Regression?
  • Keras, Logistic Regression, and Iris Dataset
  • Sklearn and Linear Regression
  • SciPy and Linear Regression
  • Keras and Linear Regression
  • Summary
10

NLP Applications

  • What is Text Summarization?
  • Text Summarization with gensim and SpaCy
  • What are Recommender Systems?
  • Content-Based Recommendation Systems
  • Collaborative Filtering Algorithm
  • Recommender Systems and Reinforcement Learning (Optional)
  • What is Sentiment Analysis?
  • Sentiment Analysis with Naïve Bayes
  • Sentiment Analysis in NLTK and VADER
  • Sentiment Analysis with Textblob
  • Sentiment Analysis with Flair
  • Detecting Spam
  • Logistic Regression and Sentiment Analysis
  • Working with COVID-19
  • What are Chatbots?
  • Summary
11

NLP and TF2/Keras

  • Term-Document Matrix
  • Text Classification Algorithms in Machine Learning
  • A Keras-Based Tokenizer
  • TF2 and Tokenization
  • TF2 and Encoding
  • A Keras-Based Word Embedding
  • An Example of BoW with TF2
  • The 20newsgroup Dataset
  • Text Classification with the kNN Algorithm
  • Text Classification with a Decision Tree Algorithm
  • Text Classification with a Random Forest Algorithm
  • Text Classification with the SVC Algorithm
  • Text Classification with the Naïve Bayes Algorithm
  • Text Classification with the kMeans Algorithm
  • TF2/Keras and Word Tokenization
  • TF2/Keras and Word Encodings
  • Text Summarization with TF2/Keras and Reuters Dataset
  • Summary
12

Transformer, BERT, and GPT

  • What is Attention?
  • An Overview of the Transformer Architecture
  • What is T5?
  • What is BERT?
  • The Inner Workings of BERT
  • Subword Tokenization
  • Sentence Similarity in BERT
  • Generating BERT Tokens (1)
  • Generating BERT Tokens (2)
  • The BERT Family
  • Introduction to GPT
  • Working with GPT-2
  • What is GPT-3?
  • The Switch Transformer: One Trillion Parameters
  • Looking Ahead
  • Summary
A

Appendix A: Data and Statistics

  • What are Datasets?
  • Preparing Datasets
  • Missing Data, Anomalies, and Outliers
  • What is Imbalanced Classification?
  • What is SMOTE?
  • Analyzing Classifiers
  • What is a Probability?
  • Random Variables
  • Fundamental Concepts in Statistics
  • The Moments of a Function (Optional)
  • Data and Statistics
  • The Bias-Variance Trade-off
  • Gini Impurity, Entropy, and Perplexity
  • Cross-Entropy and KL Divergence
  • Covariance and Correlation Matrices
  • Principal Component Analysis (PCA)
  • Dimensionality Reduction
  • Dimensionality Reduction Techniques
  • Linear Versus Nonlinear Reduction Techniques
  • Types of Distance Metrics
  • Other Well-Known Distance Metrics
  • What is Sklearn?
  • What is Bayesian Inference?
  • What are Vector Spaces?
  • Summary
B

Appendix B: Introduction to Python

  • Tools for Python
  • Python Installation
  • Setting the PATH Environment Variable (Windows Only)
  • Launching Python on Your Machine
  • Python Identifiers
  • Lines, Indentation, and Multilines
  • Quotation and Comments in Python
  • Saving Your Code in a Module
  • Some Standard Modules in Python
  • The help() and dir() Functions
  • Compile Time and Runtime Code Checking
  • Simple Data Types in Python
  • Working with Numbers
  • Working with Fractions
  • Unicode and UTF-8
  • Working with Unicode
  • Working with Strings
  • Uninitialized Variables and the Value None in Python
  • Slicing and Splicing Strings
  • Search and Replace a String in Other Strings
  • Remove Leading and Trailing Characters
  • Printing Text without NewLine Characters
  • Text Alignment
  • Working with Dates
  • Exception Handling in Python
  • Handling User Input
  • Python and Emojis (Optional)
  • Command-Line Arguments
  • Summary
C

Appendix C: Introduction to Regular Expressions

  • What are Regular Expressions?
  • Metacharacters in Python
  • Character Sets in Python
  • Character Classes in Python
  • Matching Character Classes with the re Module
  • Using the re.match() Method
  • Options for the re.match() Method
  • Matching Character Classes with the re.search() Method
  • Matching Character Classes with the re.findall() Method
  • Additional Matching Function for Regular Expressions
  • Grouping with Character Classes in Regular Expressions
  • Using Character Classes in Regular Expressions
  • Modifying Text Strings with the re Module
  • Splitting Text Strings with the re.split() Method
  • Splitting Text Strings Using Digits and Delimiters
  • Substituting Text Strings with the re.sub() Method
  • Matching the Beginning and the End of Text Strings
  • Compilation Flags
  • Compound Regular Expressions
  • Counting Character Types in a String
  • Regular Expressions and Grouping
  • Simple String Matches
  • Additional Topics for Regular Expressions
  • Summary
  • Exercises
D

Appendix D: Introduction to Keras

  • What is Keras?
  • Creating a Keras-Based Model
  • Keras and Linear Regression
  • Keras, MLPs, and MNIST
  • Keras, CNNs, and cifar10
  • Resizing Images in Keras
  • Keras and Early Stopping (1)
  • Keras and Early Stopping (2)
  • Keras and Metrics
  • Saving and Restoring Keras Models
  • Summary
E

Appendix E: Introduction to TensorFlow 2

  • What is TF 2?
  • Other TF 2-Based Toolkits
  • TF 2 Eager Execution
  • TF 2 Tensors, Data Types, and Primitive Types
  • Constants in TF 2
  • Variables in TF 2
  • The tf.rank() API
  • The tf.shape() API
  • Variables in TF 2 (Revisited)
  • What is @tf.function in TF 2?
  • Working with @tf.function in TF 2
  • Arithmetic Operations in TF 2
  • Caveats for Arithmetic Operations in TF 2
  • TF 2 and Built-In Functions
  • Calculating Trigonometric Values in TF 2
  • Calculating Exponential Values in TF 2
  • Working with Strings in TF 2
  • Working with Tensors and Operations in TF 2
  • Second-Order Tensors in TF 2 (1)
  • Second-Order Tensors in TF 2 (2)
  • Multiplying Two Second-Order Tensors in TF
  • Convert Python Arrays to TF Tensors
  • Differentiation and tf.GradientTape in TF 2
  • Examples of tf.GradientTape
  • What is Trax?
  • Google Colaboratory
  • Other Cloud Platforms
  • TF2 and tf.data.Dataset
  • The TF 2 tf.data.Dataset
  • What are Lambda Expressions?
  • Working with Generators in TF 2
  • Summary
F

Appendix F: Data Visualization

  • What is Data Visualization?
  • What is Matplotlib?
  • Horizontal Lines in Matplotlib
  • Slanted Lines in Matplotlib
  • Parallel Slanted Lines in Matplotlib
  • A Grid of Points in Matplotlib
  • A Dotted Grid in Matplotlib
  • Lines in a Grid in Matplotlib
  • A Colored Grid in Matplotlib
  • A Colored Square in an Unlabeled Grid in Matplotlib
  • Randomized Data Points in Matplotlib
  • A Histogram in Matplotlib
  • A Set of Line Segments in Matplotlib
  • Plotting Multiple Lines in Matplotlib
  • Trigonometric Functions in Matplotlib
  • Display IQ Scores in Matplotlib
  • Plot a Best-Fitting Line in Matplotlib
  • Introduction to Sklearn (scikit-learn)
  • The Digits Dataset in Sklearn
  • The Iris Dataset in Sklearn
  • The Iris Dataset in Sklearn (Optional)
  • The faces Dataset in Sklearn (Optional)
  • Working with Seaborn
  • Seaborn Built-in Datasets
  • The Iris Dataset in Seaborn
  • The Titanic Dataset in Seaborn
  • Extracting Data from the Titanic Dataset in Seaborn (1)
  • Extracting Data from the Titanic Dataset in Seaborn (2)
  • Visualizing a Pandas Dataset in Seaborn
  • Data Visualization in Pandas
  • Summary

1

Introduction to NumPy

  • Using Lists and Arrays
  • Performing Mathematical Operations with Arrays
  • Performing Statistical Operations
  • Creating Line Charts
  • Performing Linear Regression
2

Introduction to Pandas

  • Creating Data Frames
  • Working with Data Frames - I
  • Working with Data Frames - II
  • Analyzing JSON Data Structures
3

Algorithms and Toolkits (I)

  • Cleaning Data and Handling Contracted Words
  • Analyzing Encoding and Embedding Techniques
  • Performing Web Scraping
  • Performing Text Processing and Entity Recognition
4

Algorithms and Toolkits (II)

  • Using Gensim for Text Processing
  • Using NLTK for Text Processing
5

Introduction to Machine Learning

  • Evaluating Regression Models Using Error Metrics
6

Classifiers in Machine Learning

  • Selecting the Best Classification Model
7

NLP Applications

  • Summarizing Text with Gensim and SpaCy
  • Building a Recommender System
  • Creating a Spam Classifier
8

NLP and TF2/Keras

9

Transformer, BERT, and GPT

Any questions?
Check out the FAQs

It's heavily skewed towards practical application. You'll work through code examples, but understanding the underlying mechanisms is still necessary.

Many labs can run in cloud environments like Google Colaboratory. Some larger datasets or complex models might hit resource limits on typical laptops.

This course provides a strong foundation for development. Production systems involve additional engineering concerns like scaling, which aren't the primary focus here.

Yes. We skip the intro to programming and dive straight into the technical mess. If you don't know your way around a Python script, you'll want to brush up before jumping in.

Build Real-World NLP Skills That Actually Work

Develop the skills needed to preprocess text, train effective models, and troubleshoot NLP pipelines with confidence using hands-on, job-ready learning.

$167.99
