Underfitting in artificial intelligence (AI) and machine learning occurs when a model is too simple to capture the underlying patterns in the training data, resulting in poor performance on both the training set and new, unseen data. The model fails to learn the important relationships within the data and therefore cannot make accurate predictions. It is the counterpart to overfitting, where a model learns the training data too closely, including its noise, and fails to generalize.

Common Causes of Underfitting

Underfitting typically occurs for one or more of the following reasons:

• Model Simplicity: The model architecture is too basic to represent the complexity of the data (e.g., using a linear model for data that has a non-linear relationship).
• Insufficient Training: The model has not been trained for enough iterations, so it hasn’t had the opportunity to learn from the data.
• Poor Feature Selection: The chosen input features do not provide enough information for the model to learn the target variable.
• Insufficient Data: There is not enough data to capture the full range of patterns in the problem.
• Too Much Regularization: Excessive constraints on the model can prevent it from learning the data’s true structure.
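The last cause is easy to demonstrate. The sketch below (an illustration using NumPy; the data, penalty values, and variable names are invented for this example) fits ridge regression in closed form to genuinely linear data. A mild penalty recovers the true slope, while an extreme penalty shrinks the coefficients toward zero and the model underfits:

```python
import numpy as np

# Synthetic data with a genuinely linear relationship (true slope = 2).
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, 200)
y = 2.0 * x + rng.normal(0, 0.3, 200)

# Design matrix with an intercept column.
X = np.column_stack([np.ones_like(x), x])

def ridge_fit(X, y, alpha):
    # Closed-form ridge regression: w = (X^T X + alpha * I)^-1 X^T y
    return np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)

w_mild = ridge_fit(X, y, alpha=0.1)   # mild penalty: slope stays near 2
w_heavy = ridge_fit(X, y, alpha=1e6)  # extreme penalty: slope shrunk toward 0
```

Here `w_heavy[1]` ends up near zero: the regularization term dominates the data term, so the fitted line is nearly flat regardless of the true relationship.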

How to Detect Underfitting

• High Error on Training and Test Data: If the model performs poorly on both training and validation/test data, underfitting is likely.
• Oversimplified Predictions: The model’s predictions are too simplistic and do not reflect the complexity of the real data.
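These two signals can be turned into a rough diagnostic. The helper below is a sketch, not a standard API; the function name and thresholds are invented for illustration, and in practice "high error" depends on the problem's noise floor:

```python
def diagnose(train_err, val_err, high=0.1, gap=0.5):
    """Rough heuristic (illustrative thresholds, not standard values):
    high error on both splits suggests underfitting; low training error
    with a large train/validation gap suggests overfitting."""
    if train_err > high and val_err > high:
        return "underfitting"
    if val_err - train_err > gap * max(train_err, 1e-12):
        return "overfitting"
    return "ok"

print(diagnose(train_err=0.40, val_err=0.42))  # -> underfitting
print(diagnose(train_err=0.01, val_err=0.30))  # -> overfitting
```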

Example

If you use a straight line (linear regression) to fit data that actually follows a curve, the model will miss the curvature entirely and perform poorly on both the training set and new data.
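This example can be sketched with NumPy's polynomial fitting (the synthetic quadratic data here is invented for illustration). The straight line's training error is orders of magnitude larger than that of a model with the right shape:

```python
import numpy as np

# Data that actually follows a curve: y = x^2 plus a little noise.
rng = np.random.default_rng(0)
x = np.linspace(-2, 2, 100)
y = x**2 + rng.normal(0, 0.05, 100)

line = np.polyfit(x, y, deg=1)   # straight line: underfits the curve
curve = np.polyfit(x, y, deg=2)  # matches the data's true shape

def mse(coefs):
    # Mean squared error of the fitted polynomial on the data.
    return float(np.mean((np.polyval(coefs, x) - y) ** 2))

print(mse(line), mse(curve))  # the line's error is far larger
```

No matter how long the linear model is trained, its error cannot drop below the gap between a line and the parabola; only a more expressive model closes it.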

How to Address Underfitting

• Use a more complex model or algorithm that can capture more intricate patterns.
• Train the model for more epochs or iterations.
• Add more relevant features or improve feature selection.
• Increase the size and diversity of the training dataset.
• Reduce the amount of regularization if it is set too high.
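The first remedy, increasing model capacity, can be sketched by sweeping polynomial degree on synthetic cubic data (the data and degree choices here are invented for illustration). Training error drops sharply once the model is expressive enough:

```python
import numpy as np

# Synthetic data with a cubic relationship plus noise.
rng = np.random.default_rng(2)
x = rng.uniform(-2, 2, 200)
y = x**3 - 2.0 * x + rng.normal(0, 0.1, 200)

def fit_mse(deg):
    # Raising the polynomial degree = using a more complex model.
    coefs = np.polyfit(x, y, deg)
    return float(np.mean((np.polyval(coefs, x) - y) ** 2))

errs = {d: fit_mse(d) for d in (1, 3, 5)}
# Error falls sharply from degree 1 to degree 3, then plateaus near
# the noise level: capacity has left the underfitting regime.
```

Note the plateau: once the model is complex enough, further capacity mainly risks overfitting, so the goal is to add just enough complexity, not as much as possible.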