Videos

SciPy 2014

Video recording and production done by Enthought.

Spatial weights matrices, $W$, play an essential role in many spatial analysis tasks including measures of spatial association, regionalization, and spatial regression. A spatial weight $w_{i,j}$ represents potential interaction between each $i,j$ pair in a set of $n$ spatial units. The weights are generally defined as either binary $w_{i,j}=\{1,0\}$, depending on whether or not $i$ and $j$ are considered neighbors, or a continuous value reflecting some general distance relationship between $i$ and $j$. This work focuses on the case of binary weights using a contiguity criteria where $i$ and $j$ are rook neighbors when sharing an edge and queen neighbors when sharing a vertex. Population of the $W$ is computationally expensive, requiring, in the naive case, $O(n^2)$ point or edge comparisons. To improve efficiency data decomposition techniques, in the form of regular grids and quad-trees, as well as spatial indexing techniques using r-trees have be utilized to reduce the total number of local point or edge comparisons. Unfortunately, these algorithms still scale quadratically. Recent research has also shown that even with the application of parallel processing techniques, the gridded decomposition method does not scale as $n$ increases. This work presents the development and testing of a high performance implementation, written in pure Python, using time constant and $O(n)$ operations, by leveraging high performance containers and a vertex comparison method. The figures below depict results of initial testing using synthetically generated lattices of triangles, squares, and hexagons with rook contiguity in black and queen contiguity in gray. These geometries were selected to control for average neighbour cardinality and average vertex count per geometry. From these initial tests, we report a significant speedup over r-tree implementations and a more modest speedup over gridded decomposition methods. In addition to scaling linearly, while existing methods scale quadratically, this method is also more memory efficient. Ongoing work is focusing on testing using randomly distributed data, and U.S. Census data, the application of parallelization techniques to test further performance improvements, and the use of fuzzy operator to account for spatial error .

Rated: Everyone
Viewed 17 times
Tags: There are no tags for this video.