Construction of \(S(\boldsymbol{\beta})\)
Consider the sum of squares for a linear model with \(p\) explanatory variables, where the data and model are as follows.
Data: \((y_i,x_{i, 1},x_{i, 2},\ldots,x_{i, \,p}), \quad i=1,\dots,n\)
Model: \(\mathrm{E}(y_{i})=\alpha+\beta_{1}x_{i,1}+\beta_{2}x_{i,2}+\ldots+\beta_{p}x_{i,p}, \quad i=1,\ldots,n.\)
We have defined the response vector \(\mathbf{Y}\), the coefficient vector \(\boldsymbol{\beta}=(\alpha,\beta_{1},\ldots,\beta_{p})^T\) and the design matrix \(\mathbf{X}\) such that \[\mathrm{E}(\mathbf{Y}) = \mathbf{X}\boldsymbol{\beta}.\]
Let \(\mathbf{X_i}\) be the \(i^{th}\) row of the \(\mathbf{X}\) matrix. Therefore, \[\begin{aligned} S(\boldsymbol{\beta}) &=\sum_{i=1}^n(y_i-\mathrm{E}(y_i))^2\\ \mathrm{or}\quad S(\boldsymbol{\beta}) &=\sum_{i=1}^n(y_i-\mathbf{X_i}\boldsymbol{\beta})^{2}. \end{aligned}\]
The sum of squares function for the parameters \(\boldsymbol{\beta}\) can be written in vector-matrix form as:
\[S(\boldsymbol{\beta}) = \sum_{i=1}^n(y_i-\mathrm{E}(y_i))^2 =(\mathbf{Y}-\mathrm{E}(\mathbf{Y}))^T(\mathbf{Y}-\mathrm{E}(\mathbf{Y})) = (\mathbf{Y}-\mathbf{X}\boldsymbol{\beta})^T(\mathbf{Y}-\mathbf{X}\boldsymbol{\beta}).\]
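As a quick numerical illustration, the following minimal sketch computes \(S(\boldsymbol{\beta})\) both as a row-by-row sum and in vector-matrix form, and the two values should agree up to floating-point error. It assumes NumPy is available; the function names, the simulated data and the chosen coefficient values are hypothetical and are not part of the notes.

```python
import numpy as np


def sum_of_squares_rowwise(y, X, beta):
    """S(beta) as the sum over i of (y_i - X_i beta)^2, one row at a time."""
    total = 0.0
    for i in range(X.shape[0]):
        total += (y[i] - X[i] @ beta) ** 2
    return float(total)


def sum_of_squares_matrix(y, X, beta):
    """S(beta) as (Y - X beta)^T (Y - X beta)."""
    residual = y - X @ beta
    return float(residual @ residual)


# Hypothetical data: n = 5 observations, p = 2 explanatory variables.
rng = np.random.default_rng(0)
n, p = 5, 2

# Design matrix with a leading column of ones for the intercept alpha.
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
beta = np.array([1.0, 2.0, -0.5])  # (alpha, beta_1, beta_2), chosen arbitrarily
y = rng.normal(size=n)

s_row = sum_of_squares_rowwise(y, X, beta)
s_mat = sum_of_squares_matrix(y, X, beta)
print(np.isclose(s_row, s_mat))
```

Here one-dimensional arrays stand in for the column vectors \(\mathbf{Y}\) and \(\boldsymbol{\beta}\), so `residual @ residual` plays the role of the product of the \(1 \times n\) and \(n \times 1\) vectors.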
To check the vector-matrix definition \(S(\boldsymbol{\beta}) = (\mathbf{Y}-\mathbf{X}\boldsymbol{\beta})^T(\mathbf{Y}-\mathbf{X}\boldsymbol{\beta})\), we need every matrix operation to be valid and the dimensions on both sides of the equation to match.
\(\mathbf{X}\) has dimension \(n \times [p+1]\) and \(\boldsymbol{\beta}\) has dimension \([p+1] \times 1\).
\(\mathbf{X}\boldsymbol{\beta}\) is a valid operation and the dimension of \(\mathbf{X}\boldsymbol{\beta}\) is \(n \times 1\).
\(\mathbf{Y}\) is another \(n \times 1\) vector.
\(\mathbf{Y}-\mathbf{X}\boldsymbol{\beta}\) is a valid operation because both terms are \(n \times 1\).
\((\mathbf{Y}-\mathbf{X}\boldsymbol{\beta})\) is \(n \times 1\) and \((\mathbf{Y}-\mathbf{X}\boldsymbol{\beta})^T\) is \(1\times n\).
\((\mathbf{Y}-\mathbf{X}\boldsymbol{\beta})^T(\mathbf{Y}-\mathbf{X}\boldsymbol{\beta})\) is a valid operation and has dimension \(1 \times 1\), which is just a scalar.
\(S(\boldsymbol{\beta})\) is also a scalar, so the dimensions match on both sides of the equation.
Note that, given how the model was defined, we have \(p\) explanatory variables but \(p+1\) regression coefficients, because \(\boldsymbol{\beta}\) includes the intercept \(\alpha\) as well as \(\beta_{1},\ldots,\beta_{p}\).
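The dimension checks above can also be reproduced numerically. The sketch below, which assumes NumPy and uses hypothetical sizes \(n = 6\) and \(p = 3\) with placeholder values, stores \(\mathbf{Y}\) and \(\boldsymbol{\beta}\) as explicit column vectors so that each intermediate shape can be inspected.

```python
import numpy as np

n, p = 6, 3  # hypothetical sizes, chosen only for illustration

X = np.ones((n, p + 1))      # design matrix: n x (p+1)
beta = np.zeros((p + 1, 1))  # coefficient column vector: (p+1) x 1
Y = np.ones((n, 1))          # response column vector: n x 1

fitted = X @ beta            # X beta: n x 1
residual = Y - fitted        # Y - X beta: n x 1
S = residual.T @ residual    # (1 x n) times (n x 1): 1 x 1

print(X.shape, beta.shape, fitted.shape)
print(residual.shape, residual.T.shape, S.shape)
print(S.item())              # the single entry of the 1 x 1 result
```

With these placeholder values every residual equals 1, so the scalar extracted by `S.item()` is simply \(n\). Changing `beta` to shape `(p, 1)` would make `X @ beta` fail, which mirrors the requirement of \(p+1\) coefficients.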