The hiring problem and its algorithmic applications
Collaborator: Martínez Parra, Conrado; Universitat Politècnica de Catalunya. Departament de Llenguatges i Sistemes Informàtics
Document type: Doctoral thesis
Publisher: Universitat Politècnica de Catalunya
Rights access: Open Access
The hiring problem is a simple model for on-line decision-making under uncertainty, recently introduced in the literature. Although some related work dates back to 2000, the name and the first extensive studies date from 2007 and 2008. The problem was first introduced explicitly by Broder et al. in 2008 as a natural extension of the well-known secretary problem. Soon afterwards, in 2009, Archibald and Martínez introduced a discrete (combinatorial) model of the hiring problem, in which the candidates seen so far can be ranked from best to worst without the need to know their absolute quality scores. This thesis presents an extensive study of the hiring problem under the formulation given by Archibald and Martínez, explores the connections with other on-line selection processes in the literature, and develops one interesting application of our results to the field of data streaming algorithms. In the hiring problem we are interested in the design and analysis of hiring strategies. We study in detail two hiring strategies, namely hiring above the median and hiring above the m-th best. Hiring above the median hires the first interviewed candidate; thereafter, an incoming candidate is hired if and only if his relative rank is better than the median rank of the already hired staff, and is discarded otherwise. Hiring above the m-th best hires the first m candidates in the sequence; thereafter, an incoming candidate is hired if and only if his relative rank is larger than that of the m-th best among all hired candidates, and is discarded otherwise. For both strategies, we obtain exact and asymptotic distributional results for various quantities of interest (which we call hiring parameters).
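The two strategies described above can be sketched in a few lines of Python. This is only an illustrative simulation, not the thesis's formal model: the function names are hypothetical, candidates are represented by distinct numeric scores (larger is better) so that relative ranks can be compared, and for a staff of even size the median is assumed to be the lower of the two middle scores, a convention the abstract does not specify.

```python
import bisect
import heapq


def hire_above_median(scores):
    """Return the indices of hired candidates under 'hiring above the median'.

    Assumption: with an even number of hired candidates, the median is the
    lower of the two middle scores.
    """
    hired = []        # scores of the hired staff, kept sorted in ascending order
    hired_idx = []    # positions in the input sequence of the hired candidates
    for i, s in enumerate(scores):
        # The first candidate is always hired; later ones must beat the median.
        if not hired or s > hired[(len(hired) - 1) // 2]:
            bisect.insort(hired, s)
            hired_idx.append(i)
    return hired_idx


def hire_above_mth_best(scores, m):
    """Return the indices of hired candidates under 'hiring above the m-th best'."""
    top = []          # min-heap holding the m best hired scores; top[0] is the m-th best
    hired_idx = []
    for i, s in enumerate(scores):
        if len(top) < m:
            # The first m candidates are hired unconditionally.
            heapq.heappush(top, s)
            hired_idx.append(i)
        elif s > top[0]:
            # Better than the current m-th best: hire and update the top m.
            heapq.heapreplace(top, s)
            hired_idx.append(i)
    return hired_idx


scores = [5, 3, 8, 6, 9, 1, 7]
print(hire_above_median(scores))       # [0, 2, 3, 4, 6]
print(hire_above_mth_best(scores, 2))  # [0, 1, 2, 3, 4]
```

The length of each returned list is the number of hired candidates, the fundamental hiring parameter; with m = 1 the second strategy hires exactly the left-to-right records of the sequence.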
Our fundamental parameter is the number of hired candidates; together with other parameters such as the waiting time, the index of the last hired candidate and the distance between the last two hirings, it gives us a clear picture of the hiring rate or the dynamics of the hiring process for the particular strategy under study. Another group of parameters, such as the score of the last hired candidate, the score of the best discarded candidate and the number of replacements, gives us an indication of the quality of the hired staff. For hiring above the median, we study further quantities, such as the number of hired candidates conditioned on the first one and the probability that the candidate with score q is hired. We also study the 1/2-percentile selection rule introduced by Krieger et al. in 2007 and the seating plan (1/2,1) of the Chinese restaurant process (CRP) introduced by Pitman, both of which are very similar to hiring above the median. The connections between hiring above the m-th best and the notion of m-records, as well as the seating plan (0,m) of the CRP, are also investigated. We report preliminary results on the number of hired candidates for a generalization of hiring above the median, called hiring above the alpha-quantile (of the hired staff). The explicit results for the number of hired candidates enable us to design an estimator, called RECORDINALITY, for the number of distinct elements in a large sequence of data which may contain repetitions; this problem is known in the literature as the cardinality estimation problem. We show that another hiring parameter, the score of the best discarded candidate, can also be used to design a new cardinality estimator, which we call DISCARDINALITY. Most of the results presented here have been published or submitted for publication. The thesis leaves some open questions, as well as many promising ideas for future work.
One interesting question is how to compare two different strategies; this requires a suitable definition of the notion of optimality, which is still missing in the context of the hiring problem. We are also interested in investigating other variants of the problem, such as probabilistic hiring strategies, that is, strategies whose hiring criterion is not deterministic, unlike all the strategies studied here.
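To make the cardinality estimation application more concrete, the following is a rough sketch of how an estimator in the spirit of RECORDINALITY might look. Several points are assumptions not stated in this abstract: that the estimator applies hiring above the m-th best (with parameter k) to hash values of the stream elements, that SHA-256 is an acceptable hash, and that the estimate is k(1 + 1/k)^(R - k + 1) - 1, where R is the number of hired values; this formula is recalled from the published description of RECORDINALITY and should be checked against the thesis. The function names are hypothetical.

```python
import hashlib
import heapq


def _hash64(item):
    """Map an item to a 64-bit integer (assumption: SHA-256 behaves like a random hash)."""
    digest = hashlib.sha256(repr(item).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def count_k_records(stream, k):
    """Count how many hash values get 'hired' by hiring above the k-th best."""
    top = []          # min-heap with the k largest hash values seen so far
    members = set()   # same values as in `top`, for fast duplicate detection
    hired = 0
    for item in stream:
        h = _hash64(item)
        if h in members:
            continue  # repetition of an element currently among the top k
        if len(top) < k:
            heapq.heappush(top, h)
            members.add(h)
            hired += 1
        elif h > top[0]:
            evicted = heapq.heapreplace(top, h)
            members.discard(evicted)
            members.add(h)
            hired += 1
        # Otherwise h does not beat the k-th best and is discarded. A repeated
        # element that was discarded or evicted earlier stays below the
        # threshold, because the threshold never decreases.
    return hired


def recordinality_estimate(stream, k):
    """Estimate the number of distinct elements (formula is an assumption, see above)."""
    r = count_k_records(stream, k)
    if r < k:
        return float(r)  # fewer than k distinct elements: the count is exact
    return k * (1 + 1 / k) ** (r - k + 1) - 1


data = [i % 1000 for i in range(5000)]  # 1000 distinct values, each repeated 5 times
print(recordinality_estimate(data, 64))
```

Because equal elements hash to the same value, repetitions never change the count of hired values, which is what lets the number of hired candidates serve as a statistic of the number of distinct elements only.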