[PySAL] 空间自相关

作者: 红薯铺子 | 来源:发表于2018-04-08 15:47 被阅读0次

[PySAL] 空间自相关
[PySAL] 空间权重
Coding and Paper Letter（五）
基本操作
【自招相关】
空间向量相关问题
oracle 表空间相关
非因解读 | 玩转顶流DSP空间组技术，最大化利用病理样本快速发
向量空间相关概念总结-向量空间
UWP 文件系统接口

原文

1. 介绍

空间自相关描述了空间上一系列点的属性分布模式。正相关反映了属性值在空间上的相似性，负相关则反映了差异性。这种自相关性显然不是随机产生的。

空间自相关有两种分析方式——全局自相关和局部自相关。顾名思义，全局自相关研究整个地图上属性分布模式的聚类情况。局部自相关则将关注点放在了整体的内部，去识别聚类或是热点，它可能和整体聚类模式相符，也可能反映了和整体模式相反的异质性。

2. 全局自相关

PySAL有5种全局自相关检验：Gamma值、Join Count、Moran’s I、Geary’s C、和Getis and Ord’s G。

2.1 Gamma值

Gamma

2.2 Join Count

2.3 Moran's I

对n个空间单位中，属性y的的I值计算公式如下：

Moran's I

其中w_ij是空间权重，z_i = y_i-\bar{y}，S_0=ΣΣw_ij。

下面以圣路易斯周围78个郡的蓄意谋杀率的案例研究为例：
先读取数据：

>>> import pysal
>>> import numpy as np
>>> f = pysal.open(pysal.examples.get_path("stl_hom.txt"))
>>> y = np.array(f.by_col['HR8893'])

下面创建权重矩阵：

>>> w = pysal.open(pysal.examples.get_path("stl.gal")).read()

获取Moran's I的实例：

>>> mi = pysal.Moran(y, w, two_tailed=False)
>>> "%.3f"%mi.I
'0.244'
>>> mi.EI
-0.012987012987012988
>>> "%.5f"%mi.p_norm
'0.00014'

从结果可以看出，在假定谋杀率服从正态分布的情况下，观察值远高于期望值。

下面我们来看看mi对象的参数和属性：

class Moran(builtins.object)
 |  Moran's I Global Autocorrelation Statistic
 |  
 |  Parameters
 |  ----------
 |  
 |  y               : array
 |                    variable measured across n spatial units
 |  w               : W
 |                    spatial weights instance
 |  transformation  : string
 |                    weights transformation,  default is row-standardized "r".
 |                    Other options include "B": binary,  "D":
 |                    doubly-standardized,  "U": untransformed
 |                    (general weights), "V": variance-stabilizing.
 |  permutations    : int
 |                    number of random permutations for calculation of
 |                    pseudo-p_values
 |  two_tailed      : boolean
 |                    If True (default) analytical p-values for Moran are two
 |                    tailed, otherwise if False, they are one-tailed.
 |  
 |  Attributes
 |  ----------
 |  y            : array
 |                 original variable
 |  w            : W
 |                 original w object
 |  permutations : int
 |                 number of permutations
 |  I            : float
 |                 value of Moran's I
 |  EI           : float
 |                 expected value under normality assumption
 |  VI_norm      : float
 |                 variance of I under normality assumption
 |  seI_norm     : float
 |                 standard deviation of I under normality assumption
 |  z_norm       : float
 |                 z-value of I under normality assumption
 |  p_norm       : float
 |                 p-value of I under normality assumption
 |  VI_rand      : float
 |                 variance of I under randomization assumption
 |  seI_rand     : float
 |                 standard deviation of I under randomization assumption
 |  z_rand       : float
 |                 z-value of I under randomization assumption
 |  p_rand       : float
 |                 p-value of I under randomization assumption
 |  two_tailed   : boolean
 |                 If True p_norm and p_rand are two-tailed, otherwise they
 |                 are one-tailed.
 |  sim          : array
 |                 (if permutations>0)
 |                 vector of I values for permuted samples
 |  p_sim        : array
 |                 (if permutations>0)
 |                 p-value based on permutations (one-tailed)
 |                 null: spatial randomness
 |                 alternative: the observed I is extreme if
 |                 it is either extremely greater or extremely lower
 |                 than the values obtained based on permutations
 |  EI_sim       : float
 |                 (if permutations>0)
 |                 average value of I from permutations
 |  VI_sim       : float
 |                 (if permutations>0)
 |                 variance of I from permutations
 |  seI_sim      : float
 |                 (if permutations>0)
 |                 standard deviation of I under permutations.
 |  z_sim        : float
 |                 (if permutations>0)
 |                 standardized I based on permutations
 |  p_z_sim      : float
 |                 (if permutations>0)
 |                 p-value based on standard normal approximation from
 |                 permutations

哦，原来我们的推断不一定要基于正态假设，也可以基于随机置换的一组值。

>>> np.random.seed(10)
>>> mir = pysal.Moran(y, w, permutations = 9999)
>>> // 得到伪p值：
>>> print(mir.p_sim)
0.0022

关于置换检验
这里观察到21个置换值和实际的I值一样极端，因此p值就是(21+1)/(9999+1)=0.0022。或者，我们也可使用Z变换，将置换检验中得到的I值和原来的I值比较：

>>> print(mir.EI_sim)
-0.0118217511619
>>> print(mir.z_sim)
4.55451777821
>>> print(mir.p_z_sim)
2.62529422013e-06

如果我们的研究变量y是基于不同数量总体的比率，那么y的I值就需要调整，以解释总体间的差异。
为了进行这项调整，我们需要创建一个Moran_Rate的实例，而不是Moran的实例。比方说，我们打算估计死于婴儿瘁死综合症的新生儿比率。下面先读取数据，再创建W实例和Moran_Rate实例。

>>> f = pysal.open(pysal.examples.get_path("sids2.dbf"))
>>> b = np.array(f.by_col('BIR79'))
>>> e = np.array(f.by_col('SID79'))

>>> w = pysal.open(pysal.examples.get_path("sids2.gal")).read()

>>> mi = pysal.esda.moran.Moran_Rate(e, b, w, two_tailed=False)
>>> "%6.4f" % mi.I
'0.1662'
>>> "%6.4f" % mi.EI
'-0.0101'
>>> "%6.4f" % mi.p_norm
'0.0042'

显然，在调整后，I值远高于期望值。

2.4 Geary's C

2.5 Getis and Ord's G

3. 局部自相关

要定量地度量局部自相关性，PySAL提供了Moran's I和Getis and Ord's G的空间关联局部指标(LISA)。

3.1 Local Moran's I

先给出局部自相关I值的公式：

Local Moran's I

可以计算出n个局部空间自相关I值，每个空间单元1个。依然拿圣路易斯为例，LISA统计量可以这样获得：

>>> f = pysal.open(pysal.examples.get_path("stl_hom.txt"))
>>> y = np.array(f.by_col['HR8893'])
>>> w = pysal.open(pysal.examples.get_path("stl.gal")).read()
>>> np.random.seed(12345)
>>> lm = pysal.Moran_Local(y,w)
>>> lm.n
78
>>> len(lm.Is)
78

78个LISA存储在向量lm.ls中。对这些值的推断由条件随机获得（除了i以外的n-1个空间单元被用来生成i的LISA统计量），从而得到每个LISA的伪p值。

>>> lm.p_sim
array([ 0.176,  0.073,  0.405,  0.267,  0.332,  0.057,  0.296,  0.242,
        0.055,  0.062,  0.273,  0.488,  0.44 ,  0.354,  0.415,  0.478,
        0.473,  0.374,  0.415,  0.21 ,  0.161,  0.025,  0.338,  0.375,
        0.285,  0.374,  0.208,  0.3  ,  0.373,  0.411,  0.478,  0.414,
        0.009,  0.429,  0.269,  0.015,  0.005,  0.002,  0.077,  0.001,
        0.088,  0.459,  0.435,  0.365,  0.231,  0.017,  0.033,  0.04 ,
        0.068,  0.101,  0.284,  0.309,  0.113,  0.457,  0.045,  0.269,
        0.118,  0.346,  0.328,  0.379,  0.342,  0.39 ,  0.376,  0.467,
        0.357,  0.241,  0.26 ,  0.401,  0.185,  0.172,  0.248,  0.4  ,
        0.482,  0.159,  0.373,  0.455,  0.083,  0.128])

numpy的索引可以帮助我们得到显著的p值：

>>> sig = lm.p_sim<0.05
>>> lm.p_sim[sig]
array([ 0.025,  0.009,  0.015,  0.005,  0.002,  0.001,  0.017,  0.033,
        0.04 ,  0.045])

也可以获得这些显著值在莫兰指数散点图中的象限：
（1：HH高聚类，2：LH高包围低，3：LL低值聚类，4：HL低包围高）

>>> lm.q[sig]
array([4, 1, 3, 1, 3, 1, 1, 3, 3, 3])

和全局莫兰指数一样，如果是比率，也要进行调整，同样以新生儿数据为例：

>>> f = pysal.open(pysal.examples.get_path("sids2.dbf"))
>>> b = np.array(f.by_col('BIR79'))
>>> e = np.array(f.by_col('SID79'))
>>> w = pysal.open(pysal.examples.get_path("sids2.gal")).read()

>>> np.random.seed(12345)
>>> lm = pysal.esda.moran.Moran_Local_Rate(e, b, w)
>>> lm.Is[:10]
array([-0.13452366, -1.21133985,  0.05019761,  0.06127125, -0.12627466,
        0.23497679,  0.26345855, -0.00951288, -0.01517879, -0.34513514])

获取显著的p值：

>>> sig = lm.p_sim<0.05
>>> lm.p_sim[sig]
array([ 0.021,  0.04 ,  0.047,  0.015,  0.001,  0.017,  0.032,  0.031,
        0.019,  0.014,  0.004,  0.048,  0.003])

[PySAL] 空间自相关
原文 1. 介绍空间自相关描述了空间上一系列点的属性分布模式。正相关反映了属性值在空间上的相似性，负相关则反映了...
[PySAL] 空间权重
原文here 1. 介绍空间权重是许多空间分析的基础，可以用多种方式来表示。PySAL能够创建、处理和分析三类空...
Coding and Paper Letter（五）
资源整理 1 Coding： 1.Python模块libpysal，pysal的核心组件。与pysal的区别目前没...
基本操作
用户相关查看所有用户修改用户名/密码删除用户数据泵相关表空间新建表空间空间表相关查看所有表
【自招相关】
新概念作文大赛（清华北大复旦要求全国一等奖，但并不像楼上所说的就等同于“预录取”，这只能提供进入自主招生材料初审环...
空间向量相关问题
一、相关知识点二、各类题型准备工作：求平面的法向量类型一证明问题类型二角度问题类型三距离问题类型四探索问题
oracle 表空间相关
非因解读 | 玩转顶流DSP空间组技术，最大化利用病理样本快速发
DSP空间组技术助力ANCA-GN肾小球包曼氏囊破裂机制研究抗中性粒细胞胞浆自抗体相关肾小...
向量空间相关概念总结-向量空间
什么是向量空间特点：① 包含向量比如向量组，而且向量组内部的向量维数相同② 包含向量的运动向量的加法->生成新的...
UWP 文件系统接口
UWP 文件系统接口本文章翻译自 MSDN UWP 文件系统的相关接口都放在下面三个名称空间中：Windows....