Compute the similarity test.

This function computes the nonparametric test for spatial independence, based on symbolic analysis, for categorical/qualitative spatial processes.

Usage

similarity.test(formula = NULL, data = NULL, fx = NULL, listw = listw,
alternative = "two.sided", distr = "asymptotic", nsim = NULL, control = list())

Arguments

formula: a symbolic description of the factor (optional).
data: an (optional) data frame or sf object containing the variable to be tested.
fx: a factor (optional).
listw: a listw object. The examples below also pass knn and nb objects created with spdep.
alternative: a character string specifying the alternative hypothesis. The default, as shown in the usage above, is "two.sided".
distr: a character string giving the distribution used for the test, either "asymptotic" (default) or "bootstrap".
nsim: number of permutations.
control: list of additional control arguments.

Value

An object of class htest with the following components:

`data.name`	a character string giving the names of the data.
`statistic`	value of the similarity test.
`N`	total number of observations.
`Zmlc`	elements in the most likely cluster.
`alternative`	a character string describing the alternative hypothesis.
`p.value`	p-value of the similarity test.
`similiarity.mc`	values of the similarity test in each permutation.

Details

This is a testing approach for spatial independence that extends some of the properties of the join count statistic. The premise of the test is similar to that of the join count statistic in that it uses the concept of similarity between neighbouring spatial entities (Dacey 1968; Cliff and Ord 1973, 1981). However, it differs by taking into consideration the number of joins belonging locally to each spatial unit, rather than the total number of joins in the entire spatial system. The approach is applicable to spatial and network contiguity structures, and the number of neighbours of each observation need not be constant.

We define an equivalence relation $\sim$ on the set of locations $S$. An equivalence relation satisfies the following properties:

Reflexive: $s_i \sim s_i$ for all $s_i \in S$,
Symmetric: if $s_i \sim s_j$, then $s_j \sim s_i$ for all $s_i,\ s_j \in S$, and
Transitive: if $s_i \sim s_j$ and $s_j \sim s_k$, then $s_i \sim s_k$ for all $s_i, \ s_j, \ s_k \in S$.

We call the relation $\sim$ a similarity relation. The null hypothesis of interest is then $$H_0: \{X_s\}_{s \in S} \ \text{is iid}$$ Assume that, under the null hypothesis, $P(s_i \sim s_{ji}) = p_i$ for all $s_{ji} \in N_{s_i}$, where $N_{s_i}$ denotes the set of neighbours of $s_i$ and $s_{ji}$ its $j$-th element.
Define
$$I_{ij} = \begin{cases} 1 & \text{if } s_i \sim s_{ji} \\ 0 & \text{otherwise} \end{cases}$$
Then, for a fixed degree $d$ and for every location $s_i$ with degree $d$, the variable
$$\Lambda_{(d,i)}=\sum_{j=1}^d I_{ij}$$ gives the number of nearest neighbours of $s_i$ that are similar to $s_i$. Therefore, under the null hypothesis $H_0$, $\Lambda_{(d,i)}$ follows a binomial distribution $B(d, p_i)$.
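
The counting step above can be illustrated with a minimal base R sketch. It is not the package's internal code. It assumes that two locations are similar when they share the same category, and it uses hypothetical values throughout: the sample size `n`, the degree `d`, the category probabilities `p`, and randomly drawn neighbour sets `nbrs` with no underlying geometry. Under these assumptions, conditional on the category of $s_i$, the count of similar neighbours is binomial with size `d`, so its mean within category `k` should be close to `d * p[k]`.

```r
# Illustrative sketch only; all names and values here are assumptions.
set.seed(42)

n <- 200                 # number of locations
d <- 3                   # fixed degree: every location has d neighbours
p <- c(0.5, 0.3, 0.2)    # category probabilities under H0

# iid categorical process, as assumed under the null hypothesis
x <- sample(seq_along(p), n, replace = TRUE, prob = p)

# Hypothetical neighbour sets: d distinct neighbours per location,
# drawn at random among the other locations (n x d matrix of indices)
nbrs <- t(sapply(seq_len(n), function(i) sample(setdiff(seq_len(n), i), d)))

# I[i, j] is TRUE when location i and its j-th neighbour share a category
I <- matrix(x[nbrs] == x, nrow = n, ncol = d)

# Lambda[i] counts the neighbours of location i that are similar to it
Lambda <- rowSums(I)

# Compare the observed mean of Lambda within each category
# with the binomial mean d * p[k]
obs_mean <- tapply(Lambda, factor(x, levels = seq_along(p)), mean)
exp_mean <- d * p
print(round(cbind(observed = obs_mean, expected = exp_mean), 3))
```

With real data the neighbour sets would come from the spatial weights object rather than from random draws, and the degree may differ between locations, in which case the comparison is made separately for each degree.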

Control arguments

seedinit: numerical value for the seed (used only when distr = "bootstrap"). The default is seedinit = 123.

References

Farber, S., Marin, M. R., & Paez, A. (2015). Testing for spatial independence using similarity relations. Geographical Analysis, 47(2), 97-120.

Author

Fernando López	fernando.lopez@upct.es
Román Mínguez	roman.minguez@uclm.es
Antonio Páez	paezha@gmail.com
Manuel Ruiz	manuel.ruiz@upct.es

Examples


# Case 1: test with a simulated factor (dgp.spq) and knn
library(spqdep)
N <- 100
cx <- runif(N)
cy <- runif(N)
listw <- spdep::knearneigh(cbind(cx,cy), k = 3)
p <- c(1/4,1/4,1/4,1/4)
rho <- 0.5
fx <- dgp.spq(p = p, listw = listw, rho = rho)
W <- (spdep::nb2mat(spdep::knn2nb(listw)) >0)*1
# fx is simulated above, so no data argument is needed
similarity <- similarity.test(fx = fx, listw = listw)
print(similarity)
#> 
#> 	Similarity test of spatial dependence for qualitative data.
#> 	Distribution: asymptotic
#> 
#> data:  fx
#> Similarity-test = 1.9113, p-value = 0.05596
#> alternative hypothesis: two.sided
#> 

# Case 2: test with formula, a sf object (points) and knn
data("FastFood.sf")
coor <- sf::st_coordinates(sf::st_centroid(FastFood.sf))
listw <- spdep::knearneigh(coor, k = 4)
formula <- ~ Type
similarity <- similarity.test(formula = formula, data = FastFood.sf, listw = listw)
#> Warning: neighbour object has 11 sub-graphs
print(similarity)
#> 
#> 	Similarity test of spatial dependence for qualitative data.
#> 	Distribution: asymptotic
#> 
#> data:  Type
#> Similarity-test = -5.4476, p-value = 5.105e-08
#> alternative hypothesis: two.sided
#> 

# Case 3: test with formula, a sf object (polygons) and a contiguity-based nb object
data(provinces_spain)
listw <- spdep::poly2nb(as(provinces_spain,"Spatial"), queen = FALSE)
#> although coordinates are longitude/latitude, st_intersects assumes that they
#> are planar
#> Warning: some observations have no neighbours;
#> if this seems unexpected, try increasing the snap argument.
#> Warning: neighbour object has 4 sub-graphs;
#> if this sub-graph count seems unexpected, try increasing the snap argument.
provinces_spain$Mal2Fml <- factor(provinces_spain$Mal2Fml > 100)
levels(provinces_spain$Mal2Fml) <- c("men","women")
formula <- ~ Mal2Fml
similarity <- similarity.test(formula = formula, data = provinces_spain, listw = listw)
#> Warning: Out-of-range p-value: reconsider test arguments
print(similarity)
#> 
#> 	Similarity test of spatial dependence for qualitative data.
#> 	Distribution: asymptotic
#> 
#> data:  Mal2Fml
#> Similarity-test = NaN, p-value = NA
#> alternative hypothesis: two.sided
#>