ML with Python
Questions - Answer
Unit-1
Q - 2 Explain Types of Machine Learning Algorithms
Machine Learning algorithms are categorized based on the type of task they perform:
1. Regression Algorithms
Used for predicting continuous values.
Examples: Linear Regression, Ridge Regression, Lasso Regression.
Use Case: Predicting house prices, stock price trends.
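As a minimal sketch of how a regression model predicts a continuous value, the pure-Python example below fits a straight line by ordinary least squares. The house sizes and prices are made-up illustrative data, and fit_line is a hypothetical helper name; in practice a library such as scikit-learn would normally be used.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data: house size (square metres) and price (in thousands)
sizes = [50, 60, 80, 100]
prices = [150, 180, 240, 300]

slope, intercept = fit_line(sizes, prices)
predicted = slope * 70 + intercept  # predicted price for a 70 square-metre house
print(slope, intercept, predicted)
```

The output is a number on a continuous scale, which is what separates regression from classification.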
2. Classification Algorithms
Used for predicting categorical outcomes.
Examples: Logistic Regression, Random Forest, Support Vector Machines (SVM), K-Nearest Neighbors (KNN).
Use Case: Spam detection, fraud detection.
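To illustrate predicting a categorical outcome, here is a small pure-Python sketch of K-Nearest Neighbors. The two features (number of links, number of exclamation marks) and the training emails are assumptions chosen only for illustration, and knn_predict is a hypothetical helper name.

```python
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points closest to query."""
    # Squared Euclidean distance preserves the ordering of true distances
    distances = [
        (sum((a - b) ** 2 for a, b in zip(point, query)), label)
        for point, label in zip(train_points, train_labels)
    ]
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Made-up features per email: (number of links, number of exclamation marks)
emails = [(0, 0), (1, 0), (0, 1), (5, 4), (6, 5), (4, 6)]
labels = ["ham", "ham", "ham", "spam", "spam", "spam"]

print(knn_predict(emails, labels, (5, 5)))  # spam
print(knn_predict(emails, labels, (1, 1)))  # ham
```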
3. Clustering Algorithms
Used to group data points into clusters based on similarity.
Examples: K-Means, Hierarchical Clustering, DBSCAN.
Use Case: Customer segmentation, recommendation systems.
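The grouping idea can be sketched with a simplified one-dimensional K-Means. This is an illustrative version with fixed starting centroids (real implementations usually choose them randomly); the spending values and the name k_means_1d are assumptions.

```python
def k_means_1d(values, centroids, iterations=10):
    """Simplified 1-D K-Means: assign each value to its nearest centroid, then re-average."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty)
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Made-up monthly spend per customer
spend = [10, 12, 14, 80, 85, 90]
centroids, clusters = k_means_1d(spend, centroids=[10, 90])
print(centroids)  # [12.0, 85.0]
print(clusters)   # [[10, 12, 14], [80, 85, 90]]
```

No labels are supplied; the two customer segments emerge from similarity alone.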
4. Dimensionality Reduction Algorithms
Used to reduce the number of input features while retaining important information.
Examples: Principal Component Analysis (PCA), t-SNE, Autoencoders.
Use Case: Image compression, feature extraction, data visualization.
5. Ensemble Algorithms
Combine multiple models to improve performance and accuracy.
Examples: Random Forest, Gradient Boosting, XGBoost.
Use Case: Predicting customer churn, product recommendations.
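The idea of combining models can be sketched with hard (majority) voting. This is a simpler ensemble technique than Random Forest or Gradient Boosting, shown only to illustrate the principle. The three rule-based "models", their thresholds, and the customer record are all hypothetical.

```python
from collections import Counter


def majority_vote(models, sample):
    """Ask every model for a prediction and return the most common answer."""
    votes = [model(sample) for model in models]
    return Counter(votes).most_common(1)[0][0]


# Three hypothetical weak rules for churn prediction
def low_usage(customer):
    return "churn" if customer["monthly_logins"] < 3 else "stay"


def many_complaints(customer):
    return "churn" if customer["complaints"] >= 2 else "stay"


def short_tenure(customer):
    return "churn" if customer["months_active"] < 6 else "stay"


models = [low_usage, many_complaints, short_tenure]
customer = {"monthly_logins": 1, "complaints": 0, "months_active": 4}
print(majority_vote(models, customer))  # churn (two of three rules vote churn)
```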
Unit-4
Q - 3 What is a stemmer? Explain the types of stemming in NLP
Stemming is the process of reducing a word to its root or base form, known as the stem.
A stemmer (or stemming algorithm) is the Natural Language Processing (NLP) tool that performs this reduction.
Stemming helps normalize text, which improves the efficiency of text processing tasks such as search, information retrieval, and training machine learning models.
Types of Stemming in NLP:
1. Porter Stemmer
One of the most commonly used stemming algorithms.
Developed by Martin Porter in 1980.
Uses a set of rules to iteratively remove suffixes (e.g., "running" → "run", "flies" → "fli").
Produces stems that are not necessarily valid words.
Fast but sometimes over-stems (removes too much).
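The rule-based suffix stripping described above can be sketched as follows. This is a toy, single-pass stemmer with a handful of hand-picked rules and is not the Porter algorithm itself, which applies several rule phases with extra conditions; libraries such as NLTK ship a full Porter implementation. The rule table and the name simple_stem are assumptions.

```python
# Ordered (suffix, replacement) rules, longer suffixes first. Illustrative only.
RULES = [
    ("sses", "ss"),
    ("ies", "i"),
    ("ing", ""),
    ("ed", ""),
    ("ss", "ss"),
    ("s", ""),
]


def simple_stem(word):
    """Apply the first matching suffix rule; leave very short words untouched."""
    word = word.lower()
    for suffix, replacement in RULES:
        # Require at least two characters to remain after stripping
        if word.endswith(suffix) and len(word) - len(suffix) >= 2:
            stem = word[: len(word) - len(suffix)] + replacement
            # Undouble a trailing consonant ("runn" -> "run"), except l, s, z
            if (
                suffix in ("ing", "ed")
                and len(stem) >= 2
                and stem[-1] == stem[-2]
                and stem[-1] not in "aeioulsz"
            ):
                stem = stem[:-1]
            return stem
    return word


for w in ["running", "flies", "caresses", "cats", "sing"]:
    print(w, "->", simple_stem(w))
```

As with the Porter stemmer, a result such as "fli" is not a valid word; the stem only needs to be consistent across related word forms.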
2. Lovins Stemmer
One of the earliest stemmers (developed in 1968).
Uses a large set of rules for suffix removal: it strips the longest matching suffix and then recodes the result (e.g., "sitting" → "sitt" → "sit").
More complex, and less commonly used now due to its older design.
3. Dawson Stemmer
An extension of the Lovins stemmer in which suffixes are stored in reversed order, indexed by their length and last letter.
4. Krovetz Stemmer
Proposed in 1993 by Robert Krovetz. It works in the following steps:
1) Convert the plural form of a word to its singular form.
2) Convert the past tense of a word to its present tense and remove the suffix ‘ing’.
Example: ‘children’ → ‘child’
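The steps above can be sketched as a small inflectional stemmer that only accepts a candidate stem if it is a known word. This is an illustrative approximation, not Krovetz's actual implementation: the mini-lexicon, the irregular-form table, and the name inflectional_stem are all hypothetical.

```python
# Hypothetical mini-lexicon and irregular-form table, for illustration only
LEXICON = {"child", "cat", "walk", "run", "bus", "make"}
IRREGULAR = {"children": "child", "ran": "run"}


def inflectional_stem(word):
    """Strip plural, past-tense, and 'ing' endings, keeping only stems found in LEXICON."""
    word = word.lower()
    if word in IRREGULAR:
        return IRREGULAR[word]
    if word in LEXICON:
        return word
    # Plural endings first, then past tense, then 'ing'
    for suffix in ("es", "s", "ed", "ing"):
        if word.endswith(suffix):
            base = word[: -len(suffix)]
            # Candidates: bare base, base + "e" ("making" -> "make"),
            # and base with an undoubled final letter ("running" -> "run")
            candidates = [base, base + "e"]
            if len(base) >= 2 and base[-1] == base[-2]:
                candidates.append(base[:-1])
            for candidate in candidates:
                if candidate in LEXICON:
                    return candidate
    # No valid stem found: return the word unchanged rather than a non-word
    return word


for w in ["children", "cats", "buses", "walked", "running", "making", "glass"]:
    print(w, "->", inflectional_stem(w))
```

Because every result is checked against the lexicon, the output is always a real word or the unchanged input, at the cost of needing a word list for the language.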
5. Xerox Stemmer
It can process large datasets and produces valid words, but it has a tendency to over-stem.
Because it relies on lexicons, it is language-dependent, so its effectiveness is limited to specific languages.
Example:
‘children’ → ‘child’
‘understood’ → ‘understand’
