region.maxp – Spatially constrained regionalization

New in version 1.0.

Max p regionalization

Heuristically form the maximum number (p) of regions given a set of n areas and a floor constraint.

class pysal.region.maxp.Maxp(w, z, floor, floor_variable, verbose=False, initial=100, seeds=[])

Try to find the maximum number of regions for a set of areas such that each region combines contiguous areas that satisfy a given threshold constraint.
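The two conditions a region has to meet, contiguity and the floor, can be illustrated with a minimal pure-Python sketch. It is not the library's implementation: the helper names region_is_contiguous and region_meets_floor, the hand-built neighbors dictionary, and the 2x3 lattice are all hypothetical and chosen only for illustration.

```python
from collections import deque

def region_is_contiguous(region, neighbors):
    """Return True if the areas in `region` form a single connected component."""
    members = set(region)
    if not members:
        return False
    start = next(iter(members))
    seen = {start}
    queue = deque([start])
    while queue:
        area = queue.popleft()
        for nb in neighbors[area]:
            if nb in members and nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return seen == members

def region_meets_floor(region, floor_variable, floor):
    """Return True if the floor variable summed over the region reaches the floor."""
    return sum(floor_variable[a] for a in region) >= floor

# Hypothetical 2x3 rook lattice, ids laid out row by row:
#   0 1 2
#   3 4 5
neighbors = {0: [1, 3], 1: [0, 2, 4], 2: [1, 5],
             3: [0, 4], 4: [1, 3, 5], 5: [2, 4]}
floor_variable = {a: 1.0 for a in neighbors}  # one unit per area
floor = 3

feasible = [0, 1, 2]      # connected, total of 3.0
too_small = [3, 4]        # connected, total of 2.0
disconnected = [0, 2, 4]  # total of 3.0, but no two members are neighbors

print(region_is_contiguous(feasible, neighbors),
      region_meets_floor(feasible, floor_variable, floor))
print(region_meets_floor(too_small, floor_variable, floor))
print(region_is_contiguous(disconnected, neighbors))
```

With a floor variable of ones, as in the examples on this page, the floor is simply the minimum number of areas per region.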

Parameters:

w : W

spatial weights object

z : array

n*m array of observations on m attributes across n areas. This is used to calculate intra-regional homogeneity.

floor : int

a minimum bound for a variable that has to be obtained in each region

floor_variable : array

n*1 vector of observations on variable for the floor

initial : int

number of initial solutions to generate

verbose : bool

if True, debugging information is printed

seeds : list

ids of observations to form initial seeds. If len(seeds) is less than the number of observations, the complementary ids are appended to the end of seeds, so the specified seeds get priority in the solution.
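The seed completion described for the seeds parameter can be sketched as follows. This is an illustration only: complete_seeds is a hypothetical helper, and appending the remaining ids in their original id order is an assumption, since the text says only that they are added to the end.

```python
def complete_seeds(seeds, all_ids):
    """Return the user-supplied seeds followed by every id not already listed."""
    chosen = set(seeds)
    # Assumption: the complementary ids keep the order they have in all_ids.
    return list(seeds) + [i for i in all_ids if i not in chosen]

all_ids = list(range(6))
order = complete_seeds([4, 1], all_ids)
print(order)  # [4, 1, 0, 2, 3, 5]
```

Areas 4 and 1 are tried first as region seeds; the other areas follow only after them.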

Examples

>>> import random
>>> import numpy as np
>>> import pysal
>>> from pysal.region.maxp import Maxp
>>> random.seed(100)
>>> np.random.seed(100)
>>> w=pysal.lat2W(10,10)
>>> z=np.random.random_sample((w.n,2))
>>> p=np.random.random(w.n)*100
>>> p=np.ones((w.n,1),float)
>>> floor=3
>>> solution=Maxp(w,z,floor,floor_variable=p,initial=100)
>>> solution.p
30
>>> solution.regions[0]
[49, 39, 29]

Attributes

area2region : dict

mapping of areas to region. key is area id, value is region id

regions : list

list of lists of regions (each list has the ids of areas in that region)

swap_iterations : int

number of swap iterations

total_moves : int

number of moves into internal regions
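The relationship between regions and area2region can be shown with a small sketch. The two regions below are made-up sample data, and using a region's position in the regions list as its region id is an assumption made for illustration.

```python
# Hypothetical solution with two regions of three areas each.
regions = [[49, 39, 29], [0, 1, 10]]

# Assumption: a region's id is its index in the `regions` list.
area2region = {area: region_id
               for region_id, members in enumerate(regions)
               for area in members}

p = len(regions)  # number of regions in the solution

print(area2region[39])
print(area2region[10])
print(p)
```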

Methods

check_floor
cinference
inference
initial_solution
objective_function
swap

cinference(nperm=99, maxiter=1000)

Compare the within sum of squares for the solution against conditional simulated solutions where areas are randomly assigned to regions that maintain the cardinality of the original solution and respect contiguity relationships.

Parameters:

nperm : int

number of random permutations for calculation of pseudo-p_values

maxiter : int

maximum number of attempts to find each permutation

Notes

It is possible for the number of feasible solutions (feas_sols) to be less than the number of permutations requested (nperm); an exception is raised if this occurs.
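How nperm, maxiter, and feas_sols interact can be sketched with a generic retry loop. This is a sketch under stated assumptions, not the library's code: collect_feasible, draw_candidate, and is_feasible are hypothetical names, the stand-in candidates are plain random numbers rather than regionalizations, and RuntimeError is an assumed exception type because the text does not name one.

```python
import random

def collect_feasible(nperm, maxiter, draw_candidate, is_feasible):
    """Collect up to nperm feasible candidates, allowing maxiter draws for each."""
    feasible = []
    for _ in range(nperm):
        for _ in range(maxiter):
            candidate = draw_candidate()
            if is_feasible(candidate):
                feasible.append(candidate)
                break  # this permutation succeeded; move on to the next one
    if len(feasible) < nperm:
        # Assumption: the exception type is not stated in the documentation.
        raise RuntimeError("found %d feasible solutions, needed %d"
                           % (len(feasible), nperm))
    return feasible

rng = random.Random(100)
# Stand-in draw and test: a candidate is a number, feasible when below 0.5.
draws = collect_feasible(nperm=9, maxiter=100,
                         draw_candidate=rng.random,
                         is_feasible=lambda x: x < 0.5)
feas_sols = len(draws)
print(feas_sols)
```

Raising maxiter makes it more likely that every permutation finds a feasible solution, at the cost of more draws when feasible solutions are rare.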

Examples

>>> import random
>>> import numpy as np
>>> import pysal
>>> from pysal.region.maxp import Maxp
>>> random.seed(100)
>>> np.random.seed(100)
>>> w=pysal.weights.lat2W(5,5)
>>> z=np.random.random_sample((w.n,2))
>>> p=np.random.random(w.n)*100
>>> p=np.ones((w.n,1),float)
>>> floor=3
>>> solution=Maxp(w,z,floor,floor_variable=p,initial=100)
>>> solution.cinference(nperm=9, maxiter=100)
>>> solution.cpvalue
0.10000000000000001

Attributes

cpvalue : float

pseudo p-value

feas_sols : int

number of feasible solutions found

inference(nperm=99)

Compare the within sum of squares for the solution against simulated solutions where areas are randomly assigned to regions that maintain the cardinality of the original solution.

Parameters:

nperm : int

number of random permutations for calculation of pseudo-p_values
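The comparison can be sketched in pure Python as follows. It is an illustration under assumptions, not the library's implementation: the helper names are hypothetical, the within sum of squares is taken as squared deviations from each region's mean summed over attributes, and the pseudo p-value uses the common (1 + count) / (1 + nperm) form, which the documentation does not spell out.

```python
import random

def within_sum_of_squares(regions, z):
    """Sum, over regions and attributes, of squared deviations from the region mean."""
    total = 0.0
    for region in regions:
        rows = [z[a] for a in region]
        for column in zip(*rows):  # one attribute at a time
            mean = sum(column) / len(column)
            total += sum((v - mean) ** 2 for v in column)
    return total

def random_regions(regions, rng):
    """Shuffle the area ids, then cut them into groups with the original sizes."""
    ids = [a for region in regions for a in region]
    rng.shuffle(ids)
    out, start = [], 0
    for region in regions:
        out.append(ids[start:start + len(region)])
        start += len(region)
    return out

def pseudo_p_value(regions, z, nperm=99, seed=0):
    """Share of solutions (observed one included) with WSS at or below the observed."""
    rng = random.Random(seed)
    observed = within_sum_of_squares(regions, z)
    as_low = sum(
        1 for _ in range(nperm)
        if within_sum_of_squares(random_regions(regions, rng), z) <= observed
    )
    return (1 + as_low) / (1 + nperm)

# Made-up data: two tight clusters of three areas, two attributes each.
z = {0: (0.0, 0.1), 1: (0.1, 0.0), 2: (0.0, 0.0),
     3: (5.0, 5.1), 4: (5.1, 5.0), 5: (5.0, 5.0)}
regions = [[0, 1, 2], [3, 4, 5]]

wss = within_sum_of_squares(regions, z)
pvalue = pseudo_p_value(regions, z, nperm=9)
print(wss, pvalue)
```

Under this form, nperm=9 can only yield multiples of 0.1, which matches the granularity of the values shown in the examples on this page. Contiguity is ignored here; respecting it is what distinguishes cinference.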

Examples

>>> import random
>>> import numpy as np
>>> import pysal
>>> from pysal.region.maxp import Maxp
>>> random.seed(100)
>>> np.random.seed(100)
>>> w=pysal.weights.lat2W(5,5)
>>> z=np.random.random_sample((w.n,2))
>>> p=np.random.random(w.n)*100
>>> p=np.ones((w.n,1),float)
>>> floor=3
>>> solution=Maxp(w,z,floor,floor_variable=p,initial=100)
>>> solution.inference(nperm=9)
>>> solution.pvalue
0.29999999999999999

Attributes

pvalue : float

pseudo p-value

class pysal.region.maxp.Maxp_LISA(w, z, y, floor, floor_variable, initial=100)

Max-p regionalization using LISA seeds

Parameters:

w : W

spatial weights object

z : array

nxk array of n observations on k variables used to measure similarity between areas within the regions.

y : array

nx1 array used to calculate the LISA statistics and to set the initial seed order

floor : float

value that each region must obtain on floor_variable

floor_variable : array

nx1 array of values for regional floor threshold

initial : int

number of initial feasible solutions to generate prior to swapping

Notes

We sort the observations based on the value of the LISAs. This ordering then gives the priority for seeds forming the p regions. The initial priority seeds are not guaranteed to be separated in the final solution.
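The seed ordering can be sketched with a hand-rolled local Moran statistic. Several choices here are assumptions rather than documented behavior: the standard local Moran form, row-standardized weights, and a descending sort, since the text says only that observations are sorted by their LISA values. The function name local_moran, the five-area chain, and the y values are hypothetical.

```python
def local_moran(y, neighbors):
    """Local Moran statistic per area, using row-standardized neighbor weights."""
    n = len(y)
    mean = sum(y.values()) / n
    dev = {i: y[i] - mean for i in y}
    m2 = sum(d * d for d in dev.values()) / n  # second moment of the deviations
    lisa = {}
    for i, nbrs in neighbors.items():
        lag = sum(dev[j] for j in nbrs) / len(nbrs)  # spatial lag of the deviations
        lisa[i] = dev[i] * lag / m2
    return lisa

# Hypothetical chain of five areas: 0 - 1 - 2 - 3 - 4
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
y = {0: 10, 1: 8, 2: 5, 3: 1, 4: 1}

lisa = local_moran(y, neighbors)
# Assumption: areas with the largest LISA values are used as seeds first.
seed_order = sorted(lisa, key=lambda i: lisa[i], reverse=True)
print(seed_order)
```

Areas at the ends of the chain sit inside clusters of similar values, so they receive the largest statistics and the highest seed priority in this sketch.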

Examples

>>> import random
>>> import numpy as np
>>> import pysal
>>> from pysal.region.maxp import Maxp_LISA
>>> random.seed(100)
>>> np.random.seed(100)
>>> w=pysal.lat2W(10,10)
>>> z=np.random.random_sample((w.n,2))
>>> y=np.arange(w.n)
>>> p=np.ones((w.n,1),float)
>>> mpl=Maxp_LISA(w,z,y,floor=3,floor_variable=p)

Note: the random components result in slightly different values across architectures, so the results have been removed from the doctests and will be moved into unit tests that are conditional on architecture.

Attributes

area2region : dict

mapping of areas to region. key is area id, value is region id

regions : list

list of lists of regions (each list has the ids of areas in that region)

swap_iterations : int

number of swap iterations

total_moves : int

number of moves into internal regions