If you have ever wondered how two or more pieces of data relate to each other (for example, how changes in unemployment and inflation affect GDP), or if your boss has ever asked you to create a forecast or analysis based on the relationship between variables, then learning regression analysis will be worth your time.

In this article, you will learn the basics of simple linear regression, sometimes called "ordinary least squares" or OLS regression, a tool commonly used for forecasting and financial analysis. We will start with the core principles of regression, covering covariance and correlation first, and then move on to building and interpreting regression output. Popular business software such as Microsoft Excel can do all the regression calculations and output for you, but it is still important to understand the underlying mechanics.

Key points

- Simple linear regression is often used for forecasting and financial analysis—for example, companies can judge how changes in GDP affect sales.
- Microsoft Excel and other software can complete all calculations, but it is good to understand the mechanism of simple linear regression.

## Variables

The core of the regression model is the relationship between two different variables, called the dependent variable and the independent variable. For example, suppose you want to predict your company’s sales, and you have concluded that your company’s sales will rise and fall as GDP changes.

Your predicted sales will be the dependent variable because their value “depends” on the value of GDP, and GDP will be the independent variable. Then, you need to determine the strength of the relationship between these two variables to predict sales. If GDP increases/decreases by 1%, how much will your sales increase or decrease?

## Covariance

$$
\mathrm{Cov}(x, y) = \sum \frac{(x_n - x_u)(y_n - y_u)}{N}
$$

The formula for measuring how two variables move together is called covariance. In it, x_n and y_n are the individual observations, x_u and y_u are the means of x and y, and N is the number of observations. This calculation shows you the direction of the relationship. If one variable increases and the other also tends to increase, the covariance will be positive. If one variable rises and the other tends to fall, the covariance will be negative.

The covariance value itself can be hard to interpret because it is not standardized: its size depends on the units of the two variables. For example, a covariance of 5 indicates a positive relationship, but all you can say about its strength is that it is stronger than if the number were 4 and weaker than if it were 6.
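As a rough illustration of the covariance formula, the sketch below applies it in plain Python to the made-up sales and GDP figures from the Excel example later in this article. The function name `covariance` and the variable names are assumptions chosen for this example, and the calculation divides by N as in the formula above (a population covariance).

```python
# Made-up example data taken from the table in the Excel section.
gdp_change = [1.0, 1.9, 2.4, 2.6, 2.9]  # x: GDP change, in percent
sales = [100, 250, 275, 200, 300]       # y: sales, in units


def covariance(x, y):
    """Population covariance: the average product of deviations from the means."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / n


cov_xy = covariance(gdp_change, sales)
print(f"Cov(x, y) = {cov_xy:.2f}")
```

For this data the result is positive (39.00), which only tells us that sales and GDP tend to move in the same direction, not how strong that relationship is.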

## Correlation coefficient

$$
\text{Correlation} = \rho_{xy} = \frac{\mathrm{Cov}_{xy}}{s_x s_y}
$$

We need to standardize the covariance so that we can interpret it better and use it in forecasting. The result is the correlation coefficient, which simply takes the covariance and divides it by the product of the standard deviations of the two variables (s_x and s_y). This bounds the correlation between -1 and +1.

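A correlation of +1 means that the two variables are perfectly positively correlated, and -1 means that they are perfectly negatively correlated. In our example, a correlation of +1 would mean that sales move in lockstep with GDP along a straight upward-sloping line, so every rise in GDP comes with a rise in sales. A correlation of -1 would mean just the opposite: every rise in GDP comes with a fall in sales. Note that correlation describes only the direction and consistency of the relationship. How much sales change for a 1% change in GDP is given by the slope of the regression line, not by the correlation itself.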
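To make the standardization step concrete, here is a minimal sketch that divides the covariance by the product of the two standard deviations. It uses the made-up sales and GDP figures from the Excel example later in this article; the function name `correlation` is an assumption for illustration, and both the covariance and the standard deviations divide by N so that the two are consistent.

```python
import math

# Made-up example data taken from the table in the Excel section.
gdp_change = [1.0, 1.9, 2.4, 2.6, 2.9]  # x: GDP change, in percent
sales = [100, 250, 275, 200, 300]       # y: sales, in units


def correlation(x, y):
    """Correlation coefficient: covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / n
    s_x = math.sqrt(sum((xi - mean_x) ** 2 for xi in x) / n)
    s_y = math.sqrt(sum((yi - mean_y) ** 2 for yi in y) / n)
    return cov / (s_x * s_y)


rho = correlation(gdp_change, sales)
print(f"Correlation = {rho:.4f}")
```

The result, about 0.83, always lies between -1 and +1 regardless of the units of sales or GDP, which is what makes it easier to interpret than the raw covariance.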

## Regression equation

Now that we know how the relationship between two variables is measured, we can develop a regression equation to forecast or predict the variable we want. Below is the formula for simple linear regression. "y" is the value we are trying to predict, "b" is the slope of the regression line, "x" is the value of our independent variable, and "a" is the y-intercept. The regression equation simply describes the relationship between the dependent variable (y) and the independent variable (x).

$$
y = bx + a
$$

The intercept, "a", is the value of y (the dependent variable) when the value of x (the independent variable) is zero, which is why it is sometimes simply called the "constant". So if there is no change in GDP, your company would still make some sales, and that value is the intercept. See the figure below for a graphical depiction of the regression equation. In that graph, there are only five data points, represented by the five dots. Linear regression attempts to estimate the line that best fits the data (the line of best fit), and the equation of that line is the regression equation.
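As a sketch of how the line of best fit can be estimated, the code below uses the standard least-squares formulas for a single independent variable: the slope b is the sum of products of deviations divided by the sum of squared deviations of x, and the intercept a follows from the two means. The data are the made-up figures from the Excel example later in this article, and the function name `fit_line` is an assumption for illustration.

```python
# Made-up example data taken from the table in the Excel section.
gdp_change = [1.0, 1.9, 2.4, 2.6, 2.9]  # x: GDP change, in percent
sales = [100, 250, 275, 200, 300]       # y: sales, in units


def fit_line(x, y):
    """Return (a, b) for the least-squares line y = b*x + a."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations, and sum of squared deviations of x.
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    b = s_xy / s_xx          # slope
    a = mean_y - b * mean_x  # intercept: the line passes through the point of means
    return a, b


a, b = fit_line(gdp_change, sales)
print(f"y = {b:.2f}x + {a:.2f}")
```

With this data the slope comes out at roughly 88.16 and the intercept at roughly 34.58, which agrees with the coefficients in the Excel output shown later.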

## Regression in Excel

Now that you understand some of the background of regression analysis, let's work through a simple example using Excel's regression tool. Building on the previous example, we will try to forecast next year's sales based on changes in GDP. The following table lists some made-up data points, but figures like these are easy to obtain in real life.

| Year | Sales | GDP change |
| --- | --- | --- |
| 2015 | 100 | 1.00% |
| 2016 | 250 | 1.90% |
| 2017 | 275 | 2.40% |
| 2018 | 200 | 2.60% |
| 2019 | 300 | 2.90% |

Just by looking at the table, you can see that there is a positive correlation between sales and GDP. Both tend to rise together. In Excel, click the *Tools* drop-down menu, select *Data Analysis*, and then choose *Regression*. The pop-up box is easy to fill in from there: your Input Y Range is your "Sales" column, and your Input X Range is the GDP change column. Select the output range where you want the results to appear on the spreadsheet, then press OK. You should see something similar to the table below:

| Regression statistics | | Coefficients | |
| --- | --- | --- | --- |
| Multiple R | 0.8292243 | Intercept | 34.58409 |
| R Square | 0.687613 | GDP | 88.15552 |
| Adjusted R Square | 0.583484 | | |
| Standard Error | 51.021807 | | |
| Observations | 5 | | |

## Interpretation

For simple linear regression, the main outputs you need to focus on are the R-squared, the intercept (constant), and the beta (b) coefficient for GDP. The R-squared in this example is 68.7%. It shows how well our model fits the data: the explanatory variable in the model accounts for 68.7% of the variation in the dependent variable. Next, the intercept of 34.58 tells us that if the GDP change is forecast to be zero, our sales would be about 35 units. Finally, the GDP beta, or regression coefficient, of 88.16 tells us that if GDP increases by 1%, sales will likely rise by about 88 units.
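The goodness-of-fit figures in the output can be checked by hand. The sketch below is one way to do that in plain Python, assuming the usual textbook definitions: R-squared as one minus the ratio of residual to total sum of squares, adjusted R-squared with one explanatory variable, and the standard error as the square root of the residual sum of squares divided by n - 2. All variable names are assumptions for illustration.

```python
import math

# Made-up example data from the table in the Excel section.
gdp_change = [1.0, 1.9, 2.4, 2.6, 2.9]  # x: GDP change, in percent
sales = [100, 250, 275, 200, 300]       # y: sales, in units

n = len(sales)
mean_x = sum(gdp_change) / n
mean_y = sum(sales) / n

# Least-squares slope and intercept.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(gdp_change, sales))
s_xx = sum((x - mean_x) ** 2 for x in gdp_change)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

# Residual and total sums of squares.
predicted = [intercept + slope * x for x in gdp_change]
ss_res = sum((y - p) ** 2 for y, p in zip(sales, predicted))
ss_tot = sum((y - mean_y) ** 2 for y in sales)

r_squared = 1 - ss_res / ss_tot
# Adjusted R-squared with k = 1 explanatory variable: denominator is n - k - 1.
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - 2)
# Standard error of the regression (residual standard error).
std_error = math.sqrt(ss_res / (n - 2))

print(f"R-squared:          {r_squared:.6f}")
print(f"Adjusted R-squared: {adj_r_squared:.6f}")
print(f"Standard error:     {std_error:.6f}")
```

Under these definitions the three values agree with the R Square, Adjusted R Square, and Standard Error rows of the output table above.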

## Bottom line

So how will you use this simple model in your business? Well, if your research convinces you that the next GDP change will be a certain percentage, you can plug that percentage into the model and generate sales forecasts. This can help you develop a more objective plan and budget for the coming year.
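Plugging a GDP forecast into the model is a one-line calculation. The sketch below uses the intercept and GDP coefficient from the regression output above; the function name `forecast_sales` and the 2.0% GDP change are hypothetical values chosen purely for illustration, not a real forecast.

```python
# Coefficients taken from the regression output in this article.
INTERCEPT = 34.58409
GDP_COEFFICIENT = 88.15552


def forecast_sales(gdp_change_pct):
    """Predict sales (in units) from a GDP change expressed in percent, e.g. 2.0 for 2%."""
    return INTERCEPT + GDP_COEFFICIENT * gdp_change_pct


# Hypothetical input: suppose your research points to a 2.0% change in GDP.
expected_gdp_change = 2.0
forecast = forecast_sales(expected_gdp_change)
print(f"Forecast sales: {forecast:.1f} units")
```

For a 2.0% GDP change this gives about 210.9 units. Keep in mind that the model was fitted on only five observations with GDP changes between 1.0% and 2.9%, so inputs far outside that range deserve extra caution.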

Of course, this is just a simple regression. You can also build models with several independent variables, which is called multiple linear regression. But multiple linear regression is more complicated and raises several issues that would need another article to discuss.
