Complexity penalized support estimation
Introduction
We present a method for estimating the support of a probability density function in the two-dimensional case. The method also works for estimating the support of the intensity function of a Poisson process. The estimator is spatially flexible, which allows it to estimate supports that consist of disconnected components. The method is described in the article "Complexity penalized support estimation".

We provide an R package called "densup", which contains programs for estimating the support and plotting the estimate. R is a language and environment for statistical computing and graphics. It can be downloaded from the Comprehensive R Archive Network (CRAN).

Density support estimation may be applied to detect abnormal behaviour of a system, plant, or machine. The estimator may be used to define a nonparametric multivariate method for statistical quality control, which extends the Shewhart methodology based on tolerance regions. Support estimation may also be applied to measure the performance of an enterprise in terms of technical efficiency, that is, the distance from the observed productivity to the boundary.

The estimator also applies to the support of a Poisson intensity. For example, it may be used to estimate the boundary of a forest when the locations of individual trees are distributed according to a planar Poisson process with unknown intensity function.

The package "densup" is designed by Jussi Klemelä. I am grateful for bug reports.
Installation
The programs are provided as an R package.
Documentation
Here is a listing of the procedures that the package provides.
Tutorial
Below we show a session which uses the main features of the package.
#First load the library
library(densup)
0th example
#Generate a sample of size 10 from a standard Gaussian distribution
dendat<-matrix(rnorm(20),10)
#Grow the tree
N<-c(8,8)
h<-0.1
tree<-grow(dendat,N,h)
#Prune the tree
alpha<-0.00065
lambda<-0.1
ps<-plotsupport(dendat,tree,alpha,h,lambda,data=TRUE)
1st example: Gaussian density
#Generate a sample of size 100 from a standard Gaussian distribution
set.seed(1)
dendat<-matrix(rnorm(200),100)
#Grow the tree
N<-c(64,64)
h<-0.1
tree64<-grow(dendat,N,h)
#Prune the tree
alpha<-0.00065
lambda<-0.1
ps065<-plotsupport(dendat,tree64,alpha,h,lambda,data=TRUE)
#Try other colors
colonum<-2
#colo<-rainbow(colonum,s=1,v=1,start=0,end=max(1,colonum-1)/colonum)
#colo<-heat.colors(colonum)
#colo<-terrain.colors(colonum)
#colo<-topo.colors(colonum)
colo<-cm.colors(colonum)
colo[1]<-"white"
image(ps065$x,ps065$y,ps065$z,col=colo,xlab="",ylab="")
points(dendat,pch=20)
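For the quality-control use mentioned in the introduction, one may want to check whether a new observation falls inside the estimated support. The following is only a minimal sketch and is not part of "densup". It assumes that the estimate is stored in the x, y, z layout accepted by image(), as the call above suggests, that x and y are increasing, equally spaced cell midpoints, and that a positive z value marks a cell inside the support. The helper "in_support" and the toy grid are hypothetical.

```r
# Hypothetical helper, not part of densup.
# Assumptions: est$x and est$y are increasing, equally spaced cell midpoints,
# est$z is a length(x)-by-length(y) matrix, and z > 0 marks the support.
in_support <- function(est, pts) {
  pts <- matrix(pts, ncol = 2)
  half_x <- (est$x[2] - est$x[1]) / 2
  half_y <- (est$y[2] - est$y[1]) / 2
  # Points outside the grid are treated as outside the support
  inside_grid <- pts[, 1] >= min(est$x) - half_x &
    pts[, 1] <= max(est$x) + half_x &
    pts[, 2] >= min(est$y) - half_y &
    pts[, 2] <= max(est$y) + half_y
  # Index of the nearest grid midpoint in each coordinate
  ix <- vapply(pts[, 1], function(u) which.min(abs(est$x - u)), integer(1))
  iy <- vapply(pts[, 2], function(u) which.min(abs(est$y - u)), integer(1))
  inside_grid & est$z[cbind(ix, iy)] > 0
}

# Toy grid: indicator of the unit disc on [-2,2] x [-2,2]
gx <- seq(-2, 2, by = 0.5)
gy <- seq(-2, 2, by = 0.5)
toy <- list(x = gx, y = gy,
            z = outer(gx, gy, function(a, b) as.numeric(a^2 + b^2 <= 1)))
res <- in_support(toy, rbind(c(0, 0), c(1.9, 1.9), c(5, 5)))
print(res)  # TRUE FALSE FALSE
```

If the object returned by plotsupport does follow this layout, a call such as in_support(ps065, c(0,0)) would flag whether the origin lies in the estimate; this should be checked against the package documentation.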
2nd example: mixture of two Gaussians
#Generate a sample of size 125 from a mixture of two standard
#two-dimensional Gaussians. We use a program "simmix" to generate data.
source("~/denpro/R/simmix.R")
d<-2
mixnum<-2
M<-matrix(0,mixnum,d)
D<-8
M[1,]<-c(0,0)
M[2,]<-c(D,0)
sig<-matrix(1,mixnum,d)
p0<-1/mixnum
p<-p0*rep(1,mixnum)
n<-125
dendat<-simmix(n,d,M,sig,p,seed=2)
N<-c(64,64)
h<-0.1
tree64<-grow(dendat,N,h)
alpha<-0.0005
lambda<-0.1
ps070<-plotsupport(dendat,tree64,alpha,h,lambda,data=TRUE)
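The example above sources "simmix.R" from a local copy of denpro, which a reader may not have. As a rough stand-in, a Gaussian mixture sample can be drawn with base R alone. This is a sketch under assumptions: the function name "simmix_sketch" is hypothetical, the argument order simply mirrors the call above, and it is assumed that the rows of M are component means, the rows of sig are per-coordinate standard deviations, and p holds the mixing weights. The real "simmix" may behave differently, and the same seed will not reproduce its output.

```r
# Hypothetical stand-in for simmix(n,d,M,sig,p,seed); not the denpro program.
# Assumptions: rows of M are means, rows of sig are per-coordinate standard
# deviations, p holds the mixing weights.
simmix_sketch <- function(n, d, M, sig, p, seed) {
  set.seed(seed)
  # Draw a component label for each observation
  comp <- sample(seq_along(p), n, replace = TRUE, prob = p)
  # Standard normal noise, scaled and shifted according to the component
  noise <- matrix(rnorm(n * d), n, d)
  M[comp, , drop = FALSE] + sig[comp, , drop = FALSE] * noise
}

# Two unit-variance components centred at (0,0) and (8,0)
M <- rbind(c(0, 0), c(8, 0))
sig <- matrix(1, 2, 2)
dat <- simmix_sketch(125, 2, M, sig, c(0.5, 0.5), seed = 2)
print(dim(dat))  # 125 2
```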
3rd example: mixture of three Gaussians
#Generate a sample of size 150 from a mixture of three standard
#two-dimensional Gaussians. We use a program "simmix" to generate data.
#Means are on vertices of a triangle.
#Distance between vertices is D.
source("~/denpro/R/simmix.R")
d<-2
mixnum<-3
M<-matrix(0,mixnum,d)
D<-8
M[1,]<-c(0,0)
M[2,]<-c(D,0)
M[3,]<-c(D/2,sqrt(3)/2*D)
sig<-matrix(1,mixnum,d)
p0<-1/mixnum
p<-p0*rep(1,mixnum)
n<-150
dendat<-simmix(n,d,M,sig,p,seed=2)
N<-c(64,64)
h<-0.1
tree64<-grow(dendat,N,h)
alpha<-0.0006
lambda<-0.1
ps7<-plotsupport(dendat,tree64,alpha,h,lambda,data=TRUE)
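The comments of this example state that the means lie on the vertices of a triangle with side length D. A quick check with base R, using only the matrix of means defined above, confirms that the three pairwise distances are equal:

```r
# Means of the three components, as in the 3rd example
D <- 8
M <- matrix(0, 3, 2)
M[1, ] <- c(0, 0)
M[2, ] <- c(D, 0)
M[3, ] <- c(D / 2, sqrt(3) / 2 * D)
# Pairwise Euclidean distances between the rows of M
side <- as.numeric(dist(M))
print(all.equal(side, rep(D, 3)))  # TRUE
```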