Shopper intent prediction from clickstream e-commerce data with minimal browsing information

Cite as: hdl:2117/341936
Document type: Article
Defense date: 2020-10-12
Publisher: Nature
Rights access: Open Access
Except where otherwise noted, content on this work is licensed under a Creative Commons license: Attribution-NonCommercial-NoDerivs 3.0 Spain
Project: THE DISCOVERIES CTR - Implementation of The Discoveries Centre for Regenerative and Precision Medicine, a new Centre of Excellence in Portugal (EC-H2020-739572)
LASERLAB-EUROPE - The Integrated Initiative of European Laser Research Infrastructures (EC-H2020-654148)
OPTOlogic - Optical Topologic Logic (EC-H2020-899794)
Abstract
We address the problem of user intent prediction from clickstream data of an e-commerce website via two conceptually different approaches: a hand-crafted feature-based classification and a deep learning-based classification. In both approaches, we deliberately coarse-grain a new proprietary clickstream dataset to produce symbolic trajectories with minimal information. Then, we tackle the problem of classifying trajectories of arbitrary length and, ultimately, early prediction from limited-length trajectories, for both balanced and unbalanced datasets. Our analysis shows that k-gram statistics with visibility graph motifs produce fast and accurate classifications, highlighting that purchase prediction is reliable even for extremely short observation windows. In the deep learning case, we benchmarked previous state-of-the-art (SOTA) models on the new dataset, and improved classification accuracy over SOTA performance with our proposed LSTM architecture. We conclude with an in-depth error analysis and a careful evaluation of the pros and cons of the two approaches when applied to realistic industry use cases.
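As a rough illustration of the k-gram statistics mentioned in the abstract, the following minimal sketch counts the relative frequency of every length-k window in a symbolic trajectory. The function name `kgram_frequencies` and the three-symbol alphabet (V, D, A) are assumptions made for this example only; they are not taken from the article, whose actual symbol set and feature construction may differ.

```python
from collections import Counter


def kgram_frequencies(trajectory, k):
    """Return the relative frequency of each length-k window in a symbolic trajectory."""
    if k < 1:
        raise ValueError("k must be >= 1")
    # Slide a window of size k over the sequence of symbols.
    windows = [tuple(trajectory[i:i + k]) for i in range(len(trajectory) - k + 1)]
    total = len(windows)
    if total == 0:
        # Trajectory shorter than k: no k-grams to count.
        return {}
    counts = Counter(windows)
    return {gram: n / total for gram, n in counts.items()}


# Hypothetical coarse-grained session (illustrative alphabet, not the paper's):
# V = page view, D = product detail, A = add to cart.
session = ["V", "D", "V", "D", "A"]
freqs = kgram_frequencies(session, 2)
```

Frequencies of this kind could serve as a fixed-length feature vector for a conventional classifier, which is one plausible reading of the hand-crafted approach the abstract describes.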
Citation: Requena, B. [et al.]. Shopper intent prediction from clickstream e-commerce data with minimal browsing information. "Scientific reports", 12 October 2020, vol. 10, no. 16983, p. 1-23.
ISSN: 2045-2322
Publisher version: https://www.nature.com/articles/s41598-020-73622-y
Files | Description | Size | Format |
---|---|---|---|
SciRep_proofs.pdf | Pre-print proofs | 5.987 MB | PDF |