# nwxtregress

**Repository Path**: fangyebing/nwxtregress

## Basic Information

- **Project Name**: nwxtregress
- **Description**: Network Regressions in Stata
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 2
- **Created**: 2025-03-31
- **Last Updated**: 2025-03-31

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# nwxtregress
Network Regressions in Stata with unbalanced panel data and time varying network structures or spatial weight matrices.

__Table of Contents__
1. [Syntax](#1-syntax)
2. [Description](#2-description)
3. [Options](#3-options)
4. [Postestimation (Predict, Direct, indirect and total effects)](#4-postestimation)
5. [Saved Values](#5-saved-values)
6. [Examples](#6-examples)
7. [References](#7-references)
8. [How to install](#8-how-to-install)
9. [Questions?](#9-questions?)
10. [About](#10-authors)


# 1. Syntax

### SAR

```
nwxtregress depvar  indepvars [if], 
        ivarlag(W1[, sparse timesparse mata id(string)])
        [mcmcoptions nosparse]
```

### SDM

```
nwxtregress depvar  indepvars [if], 
        ivarlag(W1[, sparse timesparse mata id(string)])
        dvarlag(Ws:varlist[, sparse timesparse mata id(string)]
        [mcmcoptions nosparse]
```

Data has to be ```xtset``` before use. W1 and Ws define the spatial weight matrix, default is ***Sp*** object.
```dvarlag()``` and ```ivarlag()``` define the spatial lag of the dependent and independent variables.
 ```dvarlag()``` is repeatable and multiple spatial weight matrices are supported.

#### Options for ```ivarlag()``` and ```dvarlag()```

option | Description
--- | ---
**mata** | declares weight matrix is ```mata``` matrix.
**sparse** | if weight matrix is sparse.
**timesparse** | weight matrix is sparse and varying over time.
**id(string)** | vector of IDs if W is a non sparse mata matrix
**normalize(string)** | which normalization to use.

#### General Options

options | Description
--- | ---
**nosparse** | not convert weight matrix internally to a sparse matrix
**asarray(name)** | change name of array estimation results and info

#### MCMC Options

mcmcoptions | Description
--- | ---
**draws()** | number of griddy gibs draws, default 2000
**gridlength()** | grid length, default 1000
**nomit()** | number of omitted draws, default 500
**barrypace(numlist)** | settings for BarryPace Trick. Order is iterations, maxorder. Default is 50 and 100
**usebp** | use BarryPace trick instead of LUD for inverse of (I−ρW).
**seed(#)** | sets the seed

#### Maintenance:

```
nwxtregress , [update version]
```

***nwxtregress, version*** displays the current version.
***nwxtregress, update*** updates ***nwxtregress*** from GitHub.


# 2. Description

```nwxtregress``` estimates Spatial Autoregressive (SAR) or Spatial Durbin (SDM) models. The spatial weight matrices are allowed to be time varying and the dataset can be unbalanced.

The SAR is:

```
y = rho W1 y + beta X + eps

```

The SDM is:

```
Y = rho W1 Y + beta X + gamma W2 X + eps

```

where **W1** and **W2** are spatial weight matrices, Y the dependent and X the independent variables.

```nwxtregress``` can handle spatial weights in three formats: 1. square matrix, 2. sparse and 3. time sparse.
Sparse matrices have the advantage that they save space and thus computational time
and allow for time varying weights.
The [Sp environment](http://www.stata.com/manuals/sp.pdf) only supports the square matrix format. 
```nwxtregress``` can read **square**, **sparse** and **time sparse** formats if the 
data for the weights is in ``mata`` or saved in a ``frame``.{p_end}

#### 1. Square matrix format

The spatial weights are a matrix with dimension N_g x N_g. It is time constant. An Example with a 5 x 5 matrix is:

        0    0.1  0.2  0
        0    0    0.1  0.2
        0.3  0.1  0    0
        0.2  0    0.2  0

#### 2. Sparse matrix format

The sparse matrix format is a **v x 3** matrix, where **v** is the number of non-zero elements in the spatial weight matrix. The weight matrix is time constant. The first column indicates the destination, the second the origin of the flow.
A sparse matrix of the matrix from above is:

           Destination  Origin    Flow
           1            2         0.1
           1            3         0.2
           2            3         0.1
           2            4         0.2
           3            1         0.3
           3            2         0.1
           4            1         0.2
           4            3         0.2

#### 3. Time-Sparse format

The time sparse format can handle time varying spatial weights. The first column indicates the time period, the remaining are the same as for the sparse matrix. 
For example, if there are two time periods and we have the matrix from 
above for the first and the square for the second period:


           Time    Destination    Origin    Flow
           1       1              2         0.1
           1       1              3         0.2
           1       2              3         0.1
           1       2              4         0.2
           1       3              1         0.3
           1       3              2         0.1
           1       4              1         0.2
           1       4              3         0.2
              (next time period)
           2       1              2         0.1
           2       1              3         0.4
           2       2              3         0.1
           2       2              4         0.4
           2       3              1         0.9
           2       3              2         0.1
           2       4              1         0.4
           2       4              3         0.4

Internally, nextregress will always use the time sparse format. This ensures that unbalanced panels do not pose a problem.  nextregress comes with functions for creating sparse matrices, coplying a sparse matrix into a squared format, and functions for mathematical operations (transpose and multiplication).


# 3. Options

#### Options

Option | Description
 --- | --- 
**frame(name)** | declares weight matrix is saved in a ```frame```. Default is to use a spatial weight matrix from the **Sp** environment. If a frame is used, data can be in sparse, timesparse or square matrix format.
**mata** | declares weight matrix is ```mata``` matrix. Default is to use a spatial weight matrix from the **Sp** environment. If a mata matrix is used, data can be in sparse, time sparse or square matrix format.
**sparse** | if weight matrix is in sparse format. Sparse format implies that the first two column define the origin and the destination of the flow, the third column the value of the flow.
**timesparse** | weight matrix is sparse and varying over time. As **sparse** but first column includes the time period.
**id(string)** | vector of IDs if W is a non sparse mata matrix. If a frame is used, then **id()** contains the varible names of the time indicator (if applicable), the origin and destination of the flows.
**normalize(string)** |  which normalization to use for spatial weight matrix.  Default is row normalisation.  Can be none, row (default), column, spectral or minmax, see normalisation option of [spmat creat](http://www.stata.com/manuals/spspmatrixcreate.pdf). The normalisation is done for each time period individually.
**nosparse** | not convert weight matrix internally to a sparse matrix.
**asarray(name)** | nwxtregress saves intermediate results such as the spatial weight matrix in an internal time sparse format, residuals and results from the MCMC in an array, see stored values.  It is not recommended to change contents of the array and the option to change the name should only be rarely used. The default name is NWXTREG_OBJECT#, where # is a counter if the array already existed.
**draws()** | number of griddy gibs draws, default 2000.
**gridlength()** | grid length, default 1000.
**nomit()** | number of omitted draws, default 500.
**barrypace(numlist)** | settings for BarryPace Trick. Order is iterations, maxorder. Default is 50 and 100.
**usebp** | use BarryPace trick instead of LUD for inverse of (I−ρW).
**seed(#)** | sets the seed.
**version** | display version.
**update** | update from Github.

# 4. Postestimation

## 4.1 Direct, indirect and total effects.

Direct, indirect and total effects. can be calculated using ```estat impact```. The syntax is

```
 estat impact [varlist] [, options]
```

Option | Description
 --- | --- 
seed(#) | set seed for Barry Pace matrix inversion.
array(name) | name of array with saved contents from nwxtregress, see stored results.

``varlist`` defines the variables for which the direct, indirect and total effects are displayed.  If not specified, then estat impact will calculate the effects for all explanatory variables (indepvars).

``estat impact`` saves the following in r():


Matrix | Description
 --- | --- 
**r(b_direct)** | Coefficient Matrix of direct effects
**r(V_direct)** | Variance covariance matrix of direct effects
**r(b_indirect)** | Coefficient Matrix of indirect effects
**r(V_indirect)** |  Variance covariance matrix of indirect effects 
**r(b_total)** | Coefficient Matrix of total effects
**r(V_total)** | Variance covariance matrix of total effects

## 4.2 Predict

``predict`` can be used after nwxtregress. The syntax for predict is:

```
predict [type] varname [, options]
```

Option | Description
 --- | --- 
xb | calculate linear prediction.
res | calculate residuals.
replace | replace if varname exists.
array(name) | name of array with saved contents from nwxtregress, see stored results.


# 5. Saved Values

***nwxtregress*** saves the following in ***e()***

#### Matrices

Matrices | Description
---|---
***b*** | Coefficient Matrix
***V*** | Variance-Covariance Matrix

#### Scalars

Scalars | Description
---|---
N | Number of observations
N_g | Number of groups
T | Number of time periods
Tmin | Minimum number of time periods
Tavg | Average number of time periods
Tmax | Maximum number of time periods
K | Number of regressors excluding spatial lags
Kfull | Number of regressors including spatial lags
r2 | R-squared
r2_a | adjusted R-squared
MCdraws | Number of MCMC draws

#### Macros

Macro | Description
---|---
sample | sample


#### mata arrays

 In addition to e() and r() nwxtregress saves informations about the estimation in a mata array.  The contents are the weight matrix in time sparse format, residuals and results from the MCMC.  Storing those saves time for ``estat impact`` and ``predict``.  The name default name of the array is _NWXTREG_OBJECT#, but can be set with the option **asarray()**.  In general it is not recommended to change this setting.


# 6. Examples

An example dataset with USE/MAKE table data from the BEA’s website and links between industries is available [GitHub](https://github.com/JanDitzen/nwxtregress/tree/main/examples). The dataset IO.dta contains the linkages (spatial weights) and the dataset VA.dta the firm data.  We want to estimate capital consumption by using compensation and net surplus as explanatory variables.


First we load the data from the W dataset and convert into a **SP** object for the year 1998.

```
use https://janditzen.github.io/nwxtregress/examples/IO.dta
keep if Year == 1998
replace sam = 0 if sam < 0
replace sam = 0 if ID1==ID2
keep ID1 ID2 sam
reshape wide sam, i(ID1) j(ID2)
spset ID1
spmatrix fromdata WSpmat = sam* , replace
```

Next, we load the dataset with the firm data and estimate a SAR with a time constant spatial weight matrix.  We also obtain the total, direct and indirect effects using estat impact.  For reproducibility we set a seed.

```
use https://janditzen.github.io/nwxtregress/examples/VA.dta
nwxtregress cap_cons compensation net_surplus , dvarlag(WSpmat) seed(1234)
estat impact

```

The disadvantage is that the spatial weight are constant across time and we had to get rid of all negative numbers.  To allow for time varying spatial weights, we load the W dataset again and but load it into the frame IO:

```
frame create IO
frame IO: use https://janditzen.github.io/nwxtregress/examples/IO.dta
```

Using the VA dataset again, we can estimate the SAR model with time varying spatial weights.  To do so we use the options frame(name), where name indicates the frame and the weight matrix name corresponds to the variable names.  The data is in timesparse format so we need to use the option timesparse.  Finally it is nessary to define the year identifier and the origin and destination of the flows using the id() option:


```
nwxtregress cap_cons compensation net_surplus ,
dvarlag(sam, frame(IO) id(Year ID1 ID2)  timesparse) 
seed(1234) 
```

Alternatively we can load the spatial weight matrix into mata:

```
frame IO: putmata Wt = (Year ID1 ID2 sam), replace
nwxtregress cap_cons compensation net_surplus , 
dvarlag(Wt, mata timesparse) seed(1234)
```

If we want to estimate an SDM by adding the option ivarlag():

```
nwxtregress cap_cons compensation net_surplus , 
dvarlag(Wt,mata timesparse) ivarlag(Wt: compensation,mata timesparse )  
seed(1234)
```

We can also define two different spatial weight matrices:

```
mata: Wt2 = Wt[selectindex(Wt[.,4]:>2601.996),.]
nwxtregress cap_cons compensation net_surplus , 
dvarlag(Wt, mata timesparse) 
ivarlag(Wt: net_surplus, mata timesparse) 
ivarlag(Wt2: compensation, mata timesparse) seed(1234)
```

Total, direct and indirect effects can be calculated using estat impact:

```
estat impact
```

To predict fitted values and residuals predict can be used:

```
predict xb
predict residuals, residual
```

# 7. References

# 8. How to install


The latest version of the ***nwxtregress*** package can be obtained by typing in Stata:

```
net from https://janditzen.github.io/nwxtregress/
``` 

or 

```
net install nwxtregress , from(https://janditzen.github.io/nwxtregress/)

```

# 9. Questions?

Questions? Feel free to write us an email, open an [issue](https://github.com/JanDitzen/nwxtregress/issues) or [start a discussion](https://github.com/JanDitzen/nwxtregress/discussions).


# 10. Authors

#### Jan Ditzen (Free University of Bozen-Bolzano)

Email: jan.ditzen@unibz.it

Web: www.jan.ditzen.net

#### William Grieser (Texas Christian University)

Email: w.grieser@tcu.edu

Web: https://www.williamgrieser.com/

#### Morad Zekhnini (Michigan State University)

Email: zekhnini@msu.edu

Web: https://sites.google.com/view/moradzekhnini/home