Compute the similarity test.
similarity.test.Rd
This function computes a nonparametric test for spatial independence, based on symbolic analysis, for categorical/qualitative spatial processes.
Usage
similarity.test(formula = NULL, data = NULL, fx = NULL, listw = listw,
alternative = "two.sided", distr = "asymptotic", nsim = NULL, control = list())
Arguments
- formula
a symbolic description of the factor (optional).
- data
an (optional) data frame or sf object containing the variable to be tested.
- fx
a factor (optional).
- listw
a listw object
- alternative
a character string specifying the alternative hypothesis. The default is "two.sided", as shown in Usage.
- distr
a character string giving the distribution used for the test: "asymptotic" (default) or "bootstrap".
- nsim
Number of permutations.
- control
List of additional control arguments.
Value
An object of class htest with the following components:
data.name | a character string giving the names of the data. |
statistic | Value of the similarity test |
N | total number of observations. |
Zmlc | Elements in the Most Likely Cluster. |
alternative | a character string describing the alternative hypothesis. |
p.value | p-value of the similarity test |
similiarity.mc | values of the similarity test in each permutation. |
Details
This is a testing approach for spatial independence that extends some of the properties
of the join count statistic. The premise of the tests is similar to the join count
statistic in that they use the concept of similarity between neighbouring spatial
entities (Dacey 1968; Cliff and Ord 1973, 1981). However, it differs by taking
into consideration the number of joins belonging locally to each spatial unit,
rather than the total number of joins in the entire spatial system. The approach
proposed here is applicable to spatial and network contiguity structures, and
the number of neighbors belonging to observations need not be constant.
We define an equivalence relation \(\sim\) on the set of locations \(S\). An equivalence
relation satisfies the following properties:
Reflexive: \(s_i \sim s_i\) for all \(s_i \in S\),
Symmetric: if \(s_i \sim s_j\), then \(s_j \sim s_i\) for all \(s_i,\ s_j \in S\), and
Transitive: if \(s_i \sim s_j\) and \(s_j \sim s_k\), then \(s_i \sim s_k\)
for all \(s_i, \ s_j, \ s_k \in S\).
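As an illustration, one natural choice of relation for categorical data is "belongs to the same category". The sketch below checks the three properties numerically in base R; the toy vector `x` and the helper names are hypothetical and are not part of the package.

```r
# Hypothetical toy data: categories observed at five locations.
x <- c("a", "b", "a", "c", "b")

# sim[i, j] is TRUE when locations i and j share a category.
sim <- outer(x, x, FUN = "==")

# Reflexive: every location is similar to itself.
reflexive <- all(diag(sim))

# Symmetric: i ~ j implies j ~ i.
symmetric <- all(sim == t(sim))

# Transitive: whenever some j exists with i ~ j and j ~ k, then i ~ k.
s <- sim * 1                       # numeric copy for matrix multiplication
transitive <- all(sim[(s %*% s) > 0])

c(reflexive = reflexive, symmetric = symmetric, transitive = transitive)
```

Any relation that passes these three checks partitions the locations into classes, which is what allows neighbouring units to be labelled as similar or not.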
We call the relation \(\sim\) a similarity relation. Then, the null hypothesis that
we are interested in is
$$H_0: \{X_s\}_{s \in S} \text{ is iid}$$
Assume that, under the null hypothesis, \(P(s_i \sim s_{ji}) = p_i\) for all
\(s_{ji} \in N_{s_i}\), where \(N_{s_i}\) denotes the set of neighbours of \(s_i\)
and \(s_{ji}\) its \(j\)-th neighbour.
Define
$$I_{ij} = \begin{cases} 1 & \text{if } s_i \sim s_{ji} \\ 0 & \text{otherwise} \end{cases}$$
Then, for a fixed degree \(d\) and for every location \(s_i\) with degree \(d\), the variable
$$\Lambda_{(d,i)}=\sum_{j=1}^d I_{ij}$$
gives the number of nearest neighbours of \(s_i\) that are similar to \(s_i\).
Therefore, under the null hypothesis \(H_0\), \(\Lambda_{(d,i)}\) follows a binomial
distribution \(B(d, p_i)\).
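The construction of \(I_{ij}\) and \(\Lambda_{(d,i)}\) can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's internal code: the relation is taken to be "same category", the neighbourhood is the \(d\) nearest neighbours, and the names `coords`, `nn`, `Iij`, `Lambda` and `p_sim` are hypothetical. With four equiprobable categories drawn iid, each neighbour is similar with probability 1/4, so \(\Lambda_{(d,i)}\) should behave like \(B(d, 1/4)\).

```r
set.seed(1)                          # illustrative seed only
n <- 200                             # hypothetical number of locations
d <- 4                               # fixed degree (number of neighbours)
coords <- cbind(runif(n), runif(n))  # random point locations
p <- rep(1/4, 4)                     # equiprobable categories
x <- sample(seq_along(p), n, replace = TRUE, prob = p)  # iid draws, so H0 holds

# Row i of nn holds the indices of the d nearest neighbours of s_i.
dist_mat <- as.matrix(dist(coords))
diag(dist_mat) <- Inf                # a location is not its own neighbour
nn <- t(apply(dist_mat, 1, function(row) order(row)[seq_len(d)]))

# Iij[i, j] = 1 if s_i and its j-th neighbour share a category, 0 otherwise.
Iij <- (matrix(x[nn], nrow = n) == x) * 1

# Lambda[i] = number of the d neighbours of s_i that are similar to s_i.
Lambda <- rowSums(Iij)

# Under H0 with this relation, P(s_i ~ s_ji) = sum(p^2) = 1/4 for every i.
p_sim <- sum(p^2)
c(observed_mean = mean(Lambda), binomial_mean = d * p_sim)
table(Lambda)                        # empirical distribution over 0..d
```

Comparing `table(Lambda) / n` with `dbinom(0:d, d, p_sim)` gives a rough visual check of the binomial claim; note that neighbourhoods overlap, so the values of `Lambda` are not independent across locations.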
Control arguments
seedinit | Numerical value for the seed (used only by the bootstrap version). Default value seedinit=123 |
References
Farber, S., Marin, M. R., & Paez, A. (2015). Testing for spatial independence using similarity relations. Geographical Analysis, 47(2), 97-120.
Author
Fernando López | fernando.lopez@upct.es |
Román Mínguez | roman.minguez@uclm.es |
Antonio Páez | paezha@gmail.com |
Manuel Ruiz | manuel.ruiz@upct.es |
Examples
# Case 1: test with a simulated factor (fx) and knn
N <- 100
cx <- runif(N)
cy <- runif(N)
listw <- spdep::knearneigh(cbind(cx,cy), k = 3)
p <- c(1/4,1/4,1/4,1/4)
rho <- 0.5
fx <- dgp.spq(p = p, listw = listw, rho = rho)
W <- (spdep::nb2mat(spdep::knn2nb(listw)) >0)*1
similarity <- similarity.test(fx = fx, listw = listw)
print(similarity)
#>
#> Similarity test of spatial dependence for qualitative data.
#> Distribution: asymptotic
#>
#> data: fx
#> Similarity-test = 2.7599, p-value = 0.005783
#> alternative hypothesis: two.sided
#>
# Case 2: test with formula, a sf object (points) and knn
data("FastFood.sf")
coor <- sf::st_coordinates(sf::st_centroid(FastFood.sf))
#> Warning: st_centroid assumes attributes are constant over geometries of x
#> Warning: bounding box has potentially an invalid value range for longlat data
#> Warning: st_centroid does not give correct centroids for longitude/latitude data
listw <- spdep::knearneigh(coor, k = 4)
formula <- ~ Type
similarity <- similarity.test(formula = formula, data = FastFood.sf, listw = listw)
print(similarity)
#>
#> Similarity test of spatial dependence for qualitative data.
#> Distribution: asymptotic
#>
#> data: Type
#> Similarity-test = -5.4476, p-value = 5.105e-08
#> alternative hypothesis: two.sided
#>
# Case 3: test with formula, a sf object (polygons) and rook contiguity
data(provinces_spain)
listw <- spdep::poly2nb(as(provinces_spain,"Spatial"), queen = FALSE)
#> although coordinates are longitude/latitude, st_intersects assumes that they are planar
provinces_spain$Male2Female <- factor(provinces_spain$Male2Female > 100)
levels(provinces_spain$Male2Female) <- c("men","woman")
formula <- ~ Male2Female
similarity <- similarity.test(formula = formula, data = provinces_spain, listw = listw)
#> Warning: Out-of-range p-value: reconsider test arguments
print(similarity)
#>
#> Similarity test of spatial dependence for qualitative data.
#> Distribution: asymptotic
#>
#> data: Male2Female
#> Similarity-test = NaN, p-value = NA
#> alternative hypothesis: two.sided
#>