Before we begin, let’s address a question that often comes up when studying a new concept.
Why are we studying this?
- Firstly, expressing a relationship as numbers makes it precise instead of leaving it as a vague impression.
- Secondly, it lets us quantify how strongly one variable changes as another changes.
- Thirdly, it lets us predict an outcome from one or more predictor variables.
Now, let’s delve into the concept of Simple Linear Regression.
- Regression analysis is a statistical method used to model the relationship between two or more variables and to predict one of them from the others. Simple linear regression is the simplest case: one predictor, one outcome, and a straight line relating them.
- Imagine you have a set of paired observations. For example, let’s say you recorded how many hours you studied before each exam and the score you got, and you want to figure out how study time affects your exam scores.
- Regression analysis helps you find a mathematical equation that best fits these data points, forming a line (or curve) that shows the general trend of the relationship between the two variables. In our example, the two variables are the time spent studying (let’s call it X) and the exam scores (let’s call it Y).
- Once you have this line, you can use it to make predictions. For instance, if you know how much time you’ll spend studying (X), you can use the equation to estimate what your exam score might be (Y).
- Keep in mind that regression analysis doesn’t necessarily prove that one variable causes changes in the other. It just helps us understand if there’s a relationship between them and how strong that relationship is.
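For a single predictor, the "equation that best fits" is usually written as a straight line with an intercept and a slope. The sketch below uses the ordinary least-squares estimates, which is the standard choice for a straight-line fit. The symbols β0 and β1 match the labels printed by the code later in this post; the subscripted x_i, y_i and the bars (sample means) are notation introduced here for illustration.

```latex
\hat{Y} = \beta_0 + \beta_1 X, \qquad
\beta_1 = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}, \qquad
\beta_0 = \bar{y} - \beta_1 \bar{x}
```

Here β1 is the expected change in Y for a one-unit increase in X, and β0 is the predicted Y when X is zero.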
Let’s dive into the code:
import numpy as np
from sklearn.linear_model import LinearRegression
# Sample data - hours spent studying and corresponding exam scores
hours_studied = np.array([2, 4, 6, 8, 10, 12, 14, 16, 18, 20])
exam_scores = np.array([65, 70, 75, 80, 85, 88, 90, 92, 95, 98])
# Reshape the data to be 2D (required for scikit-learn)
X = hours_studied.reshape(-1, 1)
Y = exam_scores
# Create a linear regression model
model = LinearRegression()
# Fit the model to the data
model.fit(X, Y)
# Get the coefficients (intercept and slope)
intercept = model.intercept_
slope = model.coef_[0]
# Print the coefficients
print("Intercept (β0):", intercept)
print("Slope (β1):", slope)
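# Make predictions
# Note: 25 and 30 hours lie outside the observed range (2 to 20 hours),
# so those two predictions are extrapolations and can exceed a score of 100
hours_to_predict = np.array([15, 25, 30]).reshape(-1, 1)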
predicted_scores = model.predict(hours_to_predict)
# Print the predictions
print("Predicted scores:", predicted_scores)
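To see what the model is doing under the hood, and to put a number on "how strong that relationship is", here is a minimal sketch that recomputes the line by hand with NumPy and reports R², the share of the variation in scores that the line explains. The helper names `fit_line` and `r_squared` are made up for this illustration; they are not part of scikit-learn.

```python
import numpy as np

# Same sample data as in the example above
hours_studied = np.array([2, 4, 6, 8, 10, 12, 14, 16, 18, 20], dtype=float)
exam_scores = np.array([65, 70, 75, 80, 85, 88, 90, 92, 95, 98], dtype=float)


def fit_line(x, y):
    """Hypothetical helper: least-squares (intercept, slope) for one predictor."""
    x_mean, y_mean = x.mean(), y.mean()
    # Slope = co-variation of x and y divided by the variation of x
    slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # The fitted line always passes through the point (x_mean, y_mean)
    intercept = y_mean - slope * x_mean
    return intercept, slope


def r_squared(x, y, intercept, slope):
    """Hypothetical helper: fraction of the variance in y explained by the line."""
    residuals = y - (intercept + slope * x)
    ss_res = np.sum(residuals ** 2)        # variation the line does not explain
    ss_tot = np.sum((y - y.mean()) ** 2)   # total variation in y
    return 1.0 - ss_res / ss_tot


b0, b1 = fit_line(hours_studied, exam_scores)
r2 = r_squared(hours_studied, exam_scores, b0, b1)
print(f"Intercept: {b0:.2f}, slope: {b1:.2f}, R^2: {r2:.3f}")
```

On this sample data the slope comes out at roughly 1.79 points per extra hour of study, with an R² of about 0.97. The intercept and slope should agree with what `LinearRegression` prints above, since both use ordinary least squares.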