Spatial assessment of soil pollutants: a compositional approach

Quintas Campillejo, David

dc.contributor	Corral López, Jesús
dc.contributor	Ortego Martínez, María Isabel
dc.contributor.author	Quintas Campillejo, David
dc.contributor.other	Universitat Politècnica de Catalunya. Departament de Matemàtica Aplicada III
dc.date.accessioned	2018-06-10T20:10:09Z
dc.date.issued	2017-06-15
dc.identifier.uri	http://hdl.handle.net/2117/117993
dc.description.abstract	This project studies the spatial variability of arsenic (As) concentration in the soil of Northern Ireland. The main objective is to assess the risk of high As concentration within the study domain, which may pose a hazard to human health and the environment. Although most of the recorded samples do not pose any risk to human health, it is important to assess and monitor the areas where the As concentration is high. Interpolated maps giving the probability of each status category are a good way to detect the areas with higher exposure. In addition, the project applies the statistical methods available for compositional data and geospatial prediction, so that the techniques used here can be applied to other data sets, regardless of the physical problem underpinning them, as long as the data are compositional. To assess the potential danger, we defined three categories of As concentration according to the guideline values available for the country: (i) low, (ii) medium and (iii) high. A logistic regression was used to classify the sampling points into these categories. The chemical elements most strongly correlated with As were used as explanatory variables in the linear predictor of the model (i.e. to define the discriminant function). The model parameters were then tested for significance on the dependent variable (i.e. the As category) through hypothesis testing, and those for which the null hypothesis (i.e. that the parameter is zero) could not be rejected were discarded. Once the categorisation was made, the vector of probabilities associated with the categories became the focus of the spatial prediction analysis.
The spatial prediction starts with the structural analysis, which aims to identify and characterise the spatial variability structure of the data. However, because of the compositional nature of the data (i.e. the probability vector), they first had to be expressed in orthogonal coordinates. This was achieved by means of isometric log-ratio (ilr) coordinates obtained from a sequential binary partition (SBP), following the approach suggested by Egozcue J.J. These new coordinates, called balances, allow us to apply standard statistical tools, particularly the geostatistical methods used for spatial interpolation. Next, the structural analysis was carried out on the balances in the following steps: (i) the sample variogram was calculated along several directions in order to assess the anisotropy/isotropy of the data; (ii) a parametric variogram family was selected from the output of the previous step; (iii) a variogram model was fitted by minimising the weighted sum of squared errors. The next step was to interpolate the results spatially, using the information contained in the sampling points. This was done by means of kriging, which is the best linear unbiased estimator. The kriging estimator was fed with the fitted variogram model, which contains the spatial dependence used to estimate values at unsampled locations, and interpolated maps of the defined balances were obtained. The results were then back-transformed to obtain interpolated maps of the probabilities, and the interpolation errors were assessed through cross-validation. Finally, several simulations were run in order to see how much different realisations of the random field that generates the data may vary, given the available spatial structure.
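As an illustration of the ilr step described above, the following is a minimal sketch of how a three-part probability vector (low, medium, high) could be mapped to two balances and back-transformed. The partition used here (low against medium and high, then medium against high) and all function and variable names are assumptions made for the example; the record does not state which SBP the thesis actually used.

```python
import math

# Hypothetical SBP for the parts (low, medium, high); the thesis's actual
# partition is not given in this record.
# Row 1: low (+) against medium and high (-). Row 2: medium (+) against high (-).
SBP = [
    [+1, -1, -1],
    [0, +1, -1],
]


def contrast_matrix(sbp):
    """Build the orthonormal ilr contrast matrix from an SBP sign matrix."""
    rows = []
    for signs in sbp:
        r = sum(1 for s in signs if s > 0)  # number of parts in the + group
        t = sum(1 for s in signs if s < 0)  # number of parts in the - group
        scale = math.sqrt(r * t / (r + t))
        rows.append([
            scale / r if s > 0 else (-scale / t if s < 0 else 0.0)
            for s in signs
        ])
    return rows


def ilr(p, psi):
    """Balances of a composition p (all parts must be strictly positive)."""
    logs = [math.log(v) for v in p]
    return [sum(w * l for w, l in zip(row, logs)) for row in psi]


def ilr_inverse(b, psi):
    """Back-transform balances to a composition that sums to 1."""
    n_parts = len(psi[0])
    clr = [sum(psi[k][j] * b[k] for k in range(len(b))) for j in range(n_parts)]
    expd = [math.exp(v) for v in clr]
    total = sum(expd)
    return [v / total for v in expd]


PSI = contrast_matrix(SBP)
probs = [0.7, 0.2, 0.1]            # illustrative probabilities, not thesis data
balances = ilr(probs, PSI)         # two unconstrained real coordinates
recovered = ilr_inverse(balances, PSI)
```

Because the balances are unconstrained real numbers, they can be interpolated with ordinary geostatistical tools, and the back-transform guarantees that the resulting probabilities are positive and sum to one.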
Regarding the results, the logistic regression that categorises the As concentration proved to be meaningful and fairly accurate, given its simplicity. The kriged maps, however, showed significant errors due to the weak spatial dependence of the variable under study.
dc.description.abstract	This thesis focuses on the analysis of the spatial variability of arsenic concentration in the soil of Northern Ireland. The main objective is to detect the areas at risk of high As concentration, since this may pose a threat to human beings and the environment. Most of the samples used pose no danger, as they are below the reference threshold. Nevertheless, it is important to monitor the areas where the concentration is high. In addition, the thesis seeks to adapt and use the statistical methods available for the treatment of compositional data and for spatial prediction, and to propose a methodology that can be applied to other data. To this end, three categories of As concentration were defined, based on the values available in the reference guidelines: (i) low, (ii) medium and (iii) high. Logistic regression was used to classify the samples into the previously defined categories. The available chemical elements showing the highest correlation with As were used as explanatory variables of the model. The model parameters were then subjected to hypothesis testing in order to discard the non-significant variables. Once the samples were classified, the interest lies in predicting over space the vector of probabilities associated with the categories. The spatial prediction starts with the structural analysis, which aims to identify and characterise the spatial variability structure of the data. Given the compositional nature of the data, however, the variables must first be expressed in orthogonal coordinates. Here we use isometric log-ratio (ilr) coordinates, defined by means of a sequential binary partition (SBP), following the methodology suggested by Egozcue J.J.
These new coordinates, called balances, allow us to apply classical statistical tools, in particular the geostatistical methods available for spatial interpolation. The structural analysis was then carried out on the balances in the following steps: (i) the experimental variogram was computed along different directions in order to analyse the anisotropy/isotropy of the data; (ii) a parametric variogram family was selected according to the output of the first step; (iii) a model was fitted by minimising the weighted sum of squared errors. The next step is the spatial interpolation at the unmeasured points. This was carried out by means of kriging (i.e. the best linear unbiased estimator). The kriging model is fed with the variogram model, which contains the spatial dependence information used to estimate the values at the unsampled points. The result is a set of interpolated maps of the balances. The results were then back-transformed to obtain the probability maps, and the interpolation errors were analysed by cross-validation. Regarding the results, the logistic regression model used to categorise As proved significant and fairly accurate, given its simplicity. The maps obtained by kriging show acceptable results for central values, although for extreme values the method should be revised.
dc.language.iso	eng
dc.publisher	Universitat Politècnica de Catalunya
dc.subject	Àrees temàtiques de la UPC::Enginyeria civil
dc.subject.lcsh	Arsenic
dc.subject.lcsh	Soil pollution
dc.title	Spatial assessment of soil pollutants: a compositional approach
dc.type	Master thesis
dc.subject.lemac	Arsènic
dc.subject.lemac	Sòls -- Contaminació
dc.identifier.slug	PRISMA-120877
dc.rights.access	Restricted access - author's decision
dc.date.lift	10000-01-01
dc.date.updated	2017-07-21T12:31:30Z
dc.audience.educationlevel	Màster
dc.audience.mediator	Escola Tècnica Superior d'Enginyers de Camins, Canals i Ports de Barcelona

Files in this item

Name: 20170614 thesis vF.pdf
Size: 4.972 MB
Format: PDF

This item appears in the following collections

Màster universitari en Enginyeria de Camins, Canals i Ports [639]
