State-of-the-art tourist mobility prediction using Large Language Models
LLM-Mob is a production-ready system that predicts tourists' next destinations using Large Language Models on HPC infrastructure. Built on the VeronaCard dataset (370K+ tourists, 2014-2023), it reaches a 64.14% Top-5 hit rate in its best configuration (Qwen2.5 14B, middle anchor, geospatial context).
- 64.14% Top-5 Hit Rate | 24.99% Top-1 (Qwen2.5 14B, best configuration)
- 1.45-2.85 s Mean Response Time | Measured over 1.2M+ predictions per model
- 6 LLMs Evaluated | Qwen2.5 (7B/14B), Mistral 7B, Llama3.1 8B, Mixtral 8×7B, DeepSeek-Coder 33B
- +113% to +400% Relative Hit-Rate Boost | Geospatial context vs base version
- 10-Year Validation | 2014-2023 with COVID-19 impact analysis (-32.7% relative Top-1 drop in 2020)
- Production-Ready | 98.5% data utilization, fault-tolerant architecture
- Infrastructure: 4× NVIDIA A100 64GB on Leonardo HPC (CINECA)
- LLM Engine: Ollama multi-instance cluster with intelligent load balancing
- Models: Qwen2.5 (7B/14B), Mistral 7B, Llama3.1 8B, Mixtral 8×7B, DeepSeek-Coder 33B
- Dataset: VeronaCard tourist mobility (370K+ visits, 70 POIs, 10 years)
- Processing: Parallel GPU inference with circuit breaker and checkpoint system
# System Requirements
- Python 3.9-3.11 (⚠️ Python 3.12+ not supported)
- CUDA 11.8+ for GPU acceleration
- 16GB+ RAM recommended
# HPC Environment (Leonardo CINECA)
- SLURM job scheduler
- 4× NVIDIA A100 64GB GPUs
- Ollama multi-instance setup
# Clone repository
git clone https://github.com/simo-hue/LLM-Mob-As-Mobility-Interpreter.git
cd LLM-Mob-As-Mobility-Interpreter
# Setup environment
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
# Full geospatial + temporal analysis (RECOMMENDED)
python veronacard_mob_with_geom_time_parrallel.py --file dati_2014.csv
# Resume from checkpoint
python veronacard_mob_with_geom_time_parrallel.py --append
# Custom anchor point strategy
python veronacard_mob_with_geom_time_parrallel.py --anchor penultimate
# HPC Job Submission
sbatch time_4_GPU.sh # Submits to SLURM with 4× A100 allocation
Data Source: All results are computed from the metrics/ directory (2014-2023 VeronaCard dataset).
Best Configurations (10-year average, 2014-2023):
Rank | Model | Anchor | Strategy | Top-1 | Top-5 | Avg Time |
---|---|---|---|---|---|---|
1st | Qwen2.5 14B | Middle | Geospatial | 24.99% | 64.14% | 1.85s |
2nd | Qwen2.5 14B | Middle | Geospatial + Temporal | 24.34% | 61.57% | 2.15s |
3rd | Mistral 7B | Middle | Geospatial | 24.54% | 57.11% | 1.95s |
4th | Mixtral 8×7B | Middle | Geospatial | 12.87% | 56.71% | 2.25s |
5th | Qwen2.5 14B | Penultimate | Geospatial | 19.42% | 54.07% | 1.85s |
Key Findings:
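- Qwen2.5 14B is the most accurate model evaluated (64.14% Top-5) while keeping a 1.85 s average response time
- The middle anchor point outperforms penultimate by roughly 5-10 percentage points (Qwen2.5 14B, geospatial: 24.99% vs 19.42% Top-1, 64.14% vs 54.07% Top-5)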
- Geospatial context is the most critical feature for accuracy
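Top-1 and Top-5 in the table above are hit rates: the share of test cases whose true next POI appears among the first k predictions. The repository's evaluation code is not shown in this README, so the following is only a minimal sketch of that standard definition. The function name and the toy data are illustrative, not taken from the project.

```python
from typing import Sequence


def top_k_hit_rate(
    predictions: Sequence[Sequence[str]], truths: Sequence[str], k: int
) -> float:
    """Share of cases whose true next POI is among the first k predicted POIs."""
    if len(predictions) != len(truths):
        raise ValueError("predictions and truths must have the same length")
    if not truths:
        return 0.0
    hits = sum(1 for ranked, truth in zip(predictions, truths) if truth in ranked[:k])
    return hits / len(truths)


# Toy example (made-up data): ranked predictions for three tourists.
predictions = [
    ["Arena", "Casa di Giulietta", "Torre Lamberti"],
    ["Casa di Giulietta", "Arena", "Castelvecchio"],
    ["Castelvecchio", "Torre Lamberti", "Arena"],
]
truths = ["Arena", "Arena", "Duomo"]

top1 = top_k_hit_rate(predictions, truths, k=1)
top5 = top_k_hit_rate(predictions, truths, k=5)
print(f"Top-1: {top1:.2%}, Top-5: {top5:.2%}")
```

Top-5 can never be lower than Top-1 for the same predictions, which is why the two columns are best read together.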
All Models Evaluated (ranked by performance score):
Model | Organization | Top-5 Hit Rate | Success Rate | Performance Score |
---|---|---|---|---|
Qwen2.5 14B | Alibaba | 62.3% | 100.0% | 73.6 |
Qwen2.5 7B | Alibaba | 44.8% | 99.9% | 61.3 |
Mistral 7B | Mistral AI | 38.2% | 99.8% | 56.7 |
Llama3.1 8B | Meta | 35.7% | 99.6% | 55.0 |
Mixtral 8×7B | Mistral AI | 34.5% | 99.7% | 54.0 |
DeepSeek-Coder 33B | DeepSeek | 32.1% | 99.5% | 52.4 |
Context Enrichment Effectiveness (Hit Rate % by Strategy):
Model | Base Version | With Geospatial | Geospatial + Temporal | Boost (Geospatial vs Base) |
---|---|---|---|---|
Qwen2.5 14B | 30.9% | 65.7% ⭐ | 63.8% | +113% |
Qwen2.5 7B | 15.8% | 48.2% | 46.1% | +205% |
Mistral 7B | 8.5% | 42.5% | 40.2% | +400% |
Llama3.1 8B | 11.2% | 38.7% | 36.9% | +245% |
Mixtral 8×7B | 10.8% | 37.2% | 35.1% | +244% |
DeepSeek-Coder 33B | 9.7% | 34.2% | 32.5% | +253% |
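Insights:
- Geospatial context improves the hit rate by +113% to +400% relative to the base version across all models
- Adding temporal features on top of geospatial context does not help: the hit rate drops by about 2 percentage points for every model (e.g., 65.7% to 63.8% for Qwen2.5 14B)
- The low base-version hit rates (8.5% to 30.9%) show that contextual enrichment is necessary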
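The Boost column appears to be the relative improvement of the geospatial hit rate over the base version, i.e. (geospatial - base) / base. The sketch below recomputes it from the rounded values in the table. Because those inputs are already rounded, a recomputed figure can differ from the published column by about one point.

```python
# Hit rates (%) copied from the table above: (base version, with geospatial).
HIT_RATES = {
    "Qwen2.5 14B": (30.9, 65.7),
    "Qwen2.5 7B": (15.8, 48.2),
    "Mistral 7B": (8.5, 42.5),
    "Llama3.1 8B": (11.2, 38.7),
    "Mixtral 8x7B": (10.8, 37.2),
    "DeepSeek-Coder 33B": (9.7, 34.2),
}


def relative_boost(base: float, enriched: float) -> float:
    """Relative improvement of `enriched` over `base`, in percent."""
    return (enriched - base) / base * 100.0


boosts = {model: relative_boost(base, geo) for model, (base, geo) in HIT_RATES.items()}
for model, boost in boosts.items():
    print(f"{model}: {boost:+.1f}%")
```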
Response Time Performance (1.2M+ predictions per model):
Model | Strategy | Mean Time | Min | Max | Efficiency |
---|---|---|---|---|---|
Qwen2.5 7B | Geospatial | 1.45s | 0.76s | 25.25s | Fastest |
Qwen2.5 14B | Geospatial + Temporal | 1.62s | 0.83s | 13.61s | Best |
Qwen2.5 14B | Geospatial | 1.85s | 0.76s | 25.25s | Balanced |
Mistral 7B | Geospatial | 1.95s | - | - | Good |
Llama3.1 8B | Geospatial | 2.15s | - | - | Moderate |
Mixtral 8×7B | Geospatial | 2.25s | - | - | Slower |
DeepSeek-Coder 33B | Geospatial | 2.85s | - | - | Slowest |
Processing Insights:
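- Qwen2.5 14B: Best accuracy (64.14% Top-5) at 1.85 s per prediction, second only to Qwen2.5 7B among models run with geospatial context
- Base Version Paradox: Despite its simpler prompts, the base version is 48% slower than the enriched-context versions
- Model Size Impact: The largest model (DeepSeek-Coder 33B, 2.85 s) is about 97% slower than the fastest 7B model (Qwen2.5 7B, 1.45 s)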
Year-over-Year Performance (Qwen2.5 14B - Middle - Geospatial):
Year | Top-1 | Top-5 | Notable Events |
---|---|---|---|
2014 | 27.52% | 65.75% | Strong baseline |
2015 | 26.54% | 65.14% | Consistent |
2016 | 26.68% | 65.65% | Stable |
2017 | 25.49% | 65.10% | Slight decline |
2018 | 27.09% | 64.86% | Recovery |
2019 | 27.35% | 65.11% | Last pre-pandemic year |
2020 | 18.42% | 60.34% | COVID-19 impact |
2021 | 21.91% | 62.28% | Gradual recovery |
2022 | 25.79% | 63.71% | Near-full recovery (94%) |
2023 | 23.14% | 63.41% | Stabilization |
COVID-19 Impact: Top-1 accuracy fell by 32.7% in relative terms in 2020 (27.35% → 18.42%) and recovered to 94% of the 2019 level by 2022 (25.79%).
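The headline figures can be re-derived from the yearly table. The short check below uses only the values printed above and recomputes the 10-year averages, the relative Top-1 drop from 2019 to 2020, and the 2022 recovery ratio. Variable names are illustrative.

```python
# Yearly results copied from the table above: year -> (Top-1 %, Top-5 %).
YEARLY = {
    2014: (27.52, 65.75),
    2015: (26.54, 65.14),
    2016: (26.68, 65.65),
    2017: (25.49, 65.10),
    2018: (27.09, 64.86),
    2019: (27.35, 65.11),
    2020: (18.42, 60.34),
    2021: (21.91, 62.28),
    2022: (25.79, 63.71),
    2023: (23.14, 63.41),
}

# 10-year averages (compare with 24.99% Top-1 and 64.14% Top-5).
avg_top1 = sum(top1 for top1, _ in YEARLY.values()) / len(YEARLY)
avg_top5 = sum(top5 for _, top5 in YEARLY.values()) / len(YEARLY)

# Relative Top-1 change from 2019 to 2020, and 2022 as a share of 2019.
top1_2019, top1_2020, top1_2022 = YEARLY[2019][0], YEARLY[2020][0], YEARLY[2022][0]
relative_drop = (top1_2020 - top1_2019) / top1_2019 * 100.0
recovery_ratio = top1_2022 / top1_2019 * 100.0

print(f"Average Top-1: {avg_top1:.2f}%, average Top-5: {avg_top5:.2f}%")
print(f"2020 relative Top-1 change: {relative_drop:.1f}%")
print(f"2022 Top-1 as share of 2019: {recovery_ratio:.0f}%")
```

Note that the drop is relative: in absolute terms Top-1 fell by 8.93 percentage points.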
This work was inspired by the paper "Where Would I Go Next? Large Language Models as Human Mobility Predictors" (Wang et al., 2023).
Important: After initial exploration of the original LLM-Mob repository, I completely rebuilt the system from scratch with:
- Independence from OpenAI API keys (replaced with open-source Ollama)
- Custom HPC-optimized architecture for Leonardo/CINECA infrastructure
- Novel prompt engineering framework with advanced geospatial/temporal features
- Production-grade reliability (circuit breaker, checkpointing, fault tolerance)
- Comprehensive multi-model evaluation (6 LLMs vs original single model)
This is a completely independent implementation with different architecture, models, and optimizations.
Multi-Context Prompt Template:
PROMPT = """
TOURIST PROFILE:
- Cluster: {cluster_id} (behavioral pattern group)
- Visit History: {previous_visits}
- Current Location: {current_poi}
GEOSPATIAL CONTEXT:
- Nearby POIs: {pois_within_walking_distance}
- Distances: {poi_distances_km}
TEMPORAL CONTEXT:
- Current Time: {day_name} {hour}:{minute}
- User Pattern: Typical hours {usual_visit_times}
TASK: Predict next 5 most likely destinations.
OUTPUT FORMAT: JSON
"""
Strategies Evaluated:
- Base Version: Tourist profile only (minimal context)
- With Geospatial: + distance calculations, nearby POIs
- Geospatial + Temporal: + time patterns, seasonal context
Anchor Points:
- Middle: Uses the central visit in the sequence as the prediction anchor
- Penultimate: Uses the second-to-last visit as the prediction anchor
In the results above, the middle anchor outperforms penultimate by roughly 5-10 percentage points.
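As an illustration of the two anchor rules, the sketch below splits a visit sequence into history, current POI, and prediction target. It rests on assumptions that this README does not confirm: that "middle" means index `len(visits) // 2`, and that the target is the visit immediately after the anchor. The function name `split_at_anchor` is hypothetical.

```python
from typing import List, Sequence, Tuple


def split_at_anchor(
    visits: Sequence[str], anchor: str = "middle"
) -> Tuple[List[str], str, str]:
    """Return (history, current_poi, target_poi) for the chosen anchor rule.

    Assumption: the prediction target is the visit right after the anchor.
    """
    if len(visits) < 3:
        raise ValueError("need at least 3 visits: history, anchor, and target")
    if anchor == "middle":
        idx = len(visits) // 2      # central visit (assumed definition)
    elif anchor == "penultimate":
        idx = len(visits) - 2       # second-to-last visit
    else:
        raise ValueError(f"unknown anchor: {anchor!r}")
    return list(visits[:idx]), visits[idx], visits[idx + 1]


visits = ["A", "B", "C", "D", "E"]
print(split_at_anchor(visits, "middle"))
print(split_at_anchor(visits, "penultimate"))
```

Under these assumptions the middle anchor sees a shorter history but predicts a mid-itinerary move, while the penultimate anchor sees nearly the whole itinerary and predicts the final stop.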
Configuration (Leonardo CINECA - 4× NVIDIA A100 64GB):
# GPU Optimization
MAX_CONCURRENT_REQUESTS = 12
MAX_CONCURRENT_PER_GPU = 3
REQUEST_TIMEOUT = 900 # 15 min for HPC latency
CIRCUIT_BREAKER_THRESHOLD = 50 # Failure tolerance
# Ollama Payload (optimized for A100)
{
    "num_ctx": 1024, # Context window
    "num_predict": 64, # Response tokens
    "num_thread": 56, # Sapphire Rapids cores per GPU
    "num_batch": 512, # Conservative batch size
    "temperature": 0.1, # Low temperature for near-deterministic predictions
    "cache_type_k": "f16" # FP16 for A100 speed
}
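To show where these options end up, here is a minimal sketch of a non-streaming call to Ollama's `/api/generate` endpoint using only the standard library. It carries over the context, sampling, and threading options from the configuration above. The model tag `qwen2.5:14b`, the helper names, and the use of `"format": "json"` are assumptions for illustration, and the project's actual client code may be organized differently.

```python
import json
import urllib.request

# Subset of the options listed above, passed through Ollama's "options" field.
OLLAMA_OPTIONS = {
    "num_ctx": 1024,
    "num_predict": 64,
    "num_thread": 56,
    "num_batch": 512,
    "temperature": 0.1,
}
REQUEST_TIMEOUT = 900  # seconds, as in the configuration above


def build_generate_payload(model: str, prompt: str) -> dict:
    """Build the JSON body for a single non-streaming generation request."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,    # return one JSON object instead of a token stream
        "format": "json",   # ask the model for JSON output (assumed setting)
        "options": dict(OLLAMA_OPTIONS),
    }


def generate(prompt: str, model: str = "qwen2.5:14b",
             host: str = "http://localhost:11434") -> str:
    """Send the prompt to one Ollama instance and return the generated text."""
    body = json.dumps(build_generate_payload(model, prompt)).encode("utf-8")
    request = urllib.request.Request(
        f"{host}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=REQUEST_TIMEOUT) as response:
        return json.loads(response.read().decode("utf-8"))["response"]


payload = build_generate_payload("qwen2.5:14b", "TASK: Predict next 5 most likely destinations.")
print(json.dumps(payload, indent=2))
```

Calling `generate(...)` requires a running Ollama instance with the model already pulled; building the payload does not.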
Features:
- Multi-instance Ollama: 4 instances (ports 11434-11437), one per GPU
- Circuit Breaker: CLOSED/OPEN/HALF_OPEN states with automatic recovery
- Checkpoint System: Saves progress every 500 processed cards so interrupted runs can resume
- Health Monitoring: Real-time GPU utilization and adaptive load balancing
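The three circuit breaker states can be sketched as a small state machine. This is not the project's implementation: it assumes the breaker counts consecutive failures, opens at the `CIRCUIT_BREAKER_THRESHOLD` of 50 listed in the configuration, and allows trial requests after a 60-second cool-down. The class and method names are hypothetical.

```python
import time
from typing import Callable


class CircuitBreaker:
    """Minimal CLOSED / OPEN / HALF_OPEN breaker (illustrative sketch)."""

    def __init__(self, failure_threshold: int = 50, recovery_timeout: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self._clock = clock
        self.state = "CLOSED"
        self.failures = 0       # consecutive failures (assumed counting rule)
        self.opened_at = 0.0

    def allow_request(self) -> bool:
        """Return True if a request may be sent right now."""
        if self.state == "OPEN":
            if self._clock() - self.opened_at >= self.recovery_timeout:
                self.state = "HALF_OPEN"   # cool-down elapsed: allow trial requests
                return True
            return False
        return True

    def record_success(self) -> None:
        """A success resets the counter and closes the breaker."""
        self.failures = 0
        self.state = "CLOSED"

    def record_failure(self) -> None:
        """A failed trial, or too many failures in a row, opens the breaker."""
        self.failures += 1
        if self.state == "HALF_OPEN" or self.failures >= self.failure_threshold:
            self.state = "OPEN"
            self.opened_at = self._clock()


breaker = CircuitBreaker()
if breaker.allow_request():
    breaker.record_success()   # replace with the outcome of a real request
print(breaker.state)
```

Injecting the clock keeps the class testable without sleeping through the recovery timeout.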
# RECOMMENDED: Full geospatial + temporal analysis
python veronacard_mob_with_geom_time_parrallel.py
# Geospatial only
python veronacard_mob_with_geom_parrallel.py
# Base version (minimal context)
python veronacard_mob_versione_base_parrallel.py
# Process specific file with user limit
python veronacard_mob_with_geom_time_parrallel.py \
--file dati_2014.csv \
--max-users 1000
# Resume from checkpoint (critical for long runs)
python veronacard_mob_with_geom_time_parrallel.py --append
# Force complete reprocessing
python veronacard_mob_with_geom_time_parrallel.py --force
# Custom anchor point
python veronacard_mob_with_geom_time_parrallel.py --anchor penultimate
# Debug mode (limited dataset)
DEBUG_MODE=True python veronacard_mob_with_geom_time_parrallel.py --max-users 100
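For orientation, the flags used above could be declared with `argparse` roughly as follows. This is a hypothetical sketch, not the scripts' actual parser: the default anchor of `middle` and the mutual exclusion of `--append` and `--force` are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the command-line flags shown in the usage examples (sketch)."""
    parser = argparse.ArgumentParser(description="VeronaCard next-POI prediction")
    parser.add_argument("--file", help="process a single yearly CSV, e.g. dati_2014.csv")
    parser.add_argument("--max-users", type=int, default=None,
                        help="limit the number of cards processed")
    parser.add_argument("--anchor", choices=["middle", "penultimate"], default="middle",
                        help="anchor point strategy (default is an assumption)")
    mode = parser.add_mutually_exclusive_group()  # assumed to be exclusive
    mode.add_argument("--append", action="store_true", help="resume from checkpoint")
    mode.add_argument("--force", action="store_true", help="force complete reprocessing")
    return parser


args = build_parser().parse_args(["--file", "dati_2014.csv", "--max-users", "1000"])
print(args)
```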
# Submit job to SLURM
sbatch time_4_GPU.sh # Full temporal+geospatial (RECOMMENDED)
sbatch geom_4_GPU.sh # Geospatial only
sbatch base_4_GPU.sh # Base version
# Monitor job
squeue -u $USER
tail -f slurm-<JOBID>.out
# Check computational budget
saldo -b IscrC_LLM-Mob
# Cancel job
scancel <JOBID>
results/
└── {model_name}/                             # e.g., qwen2.5_14b/
    └── {strategy}/                           # e.g., with_geom_time/
        └── {anchor}/                         # e.g., middle/
            ├── dati_2014_pred_20250930.csv   # Predictions with hit rates
            └── dati_2014_checkpoint.txt      # Processing state
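The checkpoint file is what makes `--append` possible. Its real format is not documented in this README, so the sketch below assumes the simplest workable layout: one processed card ID per line, rewritten every 500 cards. The function names are hypothetical, and only the directory layout and the 500-card interval come from the text above.

```python
from pathlib import Path
from typing import Callable, Iterable, Set

CHECKPOINT_EVERY = 500  # cards between checkpoint writes


def checkpoint_path(root: str, model: str, strategy: str, anchor: str, stem: str) -> Path:
    """Build results/{model}/{strategy}/{anchor}/{stem}_checkpoint.txt."""
    return Path(root) / model / strategy / anchor / f"{stem}_checkpoint.txt"


def load_done(path: Path) -> Set[str]:
    """Read already-processed card IDs (assumed format: one ID per line)."""
    if not path.exists():
        return set()
    lines = path.read_text(encoding="utf-8").splitlines()
    return {line.strip() for line in lines if line.strip()}


def save_done(path: Path, done: Set[str]) -> None:
    """Write to a temporary file, then rename it over the old checkpoint."""
    path.parent.mkdir(parents=True, exist_ok=True)
    tmp = path.with_suffix(".tmp")
    tmp.write_text("\n".join(sorted(done)) + "\n", encoding="utf-8")
    tmp.replace(path)


def process_cards(card_ids: Iterable[str], path: Path,
                  handle: Callable[[str], object]) -> int:
    """Process cards not yet in the checkpoint; return how many were handled."""
    done = load_done(path)
    processed = 0
    for card_id in card_ids:
        if card_id in done:
            continue            # finished in an earlier run
        handle(card_id)
        done.add(card_id)
        processed += 1
        if processed % CHECKPOINT_EVERY == 0:
            save_done(path, done)
    save_done(path, done)
    return processed
```

Writing to a temporary file and renaming it reduces the chance of a half-written checkpoint if the job is killed mid-write.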
# View pre-computed metrics
cd metrics/
# Strategy-based metrics (10-year data per model/strategy)
ls strategy/middle/qwen2.5_14b/with_geom/
# Inter-model comparison
ls inter_model_comparison/
# Processing time analysis
ls time_analysis/
# Run Jupyter analysis notebooks
jupyter notebook notebook/singole_metriche_canva.ipynb
Specifications:
- Time Range: 2014-2023 (10 years)
- Records: 370,000+ tourist visits
- POIs: 70 Points of Interest with GPS coordinates
- Location: Verona, Italy (UNESCO World Heritage Site)
- Completeness: 99.2% records with complete temporal data
Structure:
# Visit Records (dati_YYYY.csv)
date,time,poi_name,card_id,entrance_type
15-08-14,10:30:45,Arena,0403E98ABF3181,standard
15-08-14,14:15:30,Casa di Giulietta,0403E98ABF3181,priority
# Points of Interest (vc_site.csv)
name_short,latitude,longitude,category
Arena,45.4394,10.9947,Monument
Casa di Giulietta,45.4419,10.9988,Museum
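The two files can be joined to obtain the distances used as geospatial context. The sketch below parses the sample rows shown above, groups visits by card, and computes the great-circle (haversine) distance between consecutive POIs. It assumes that dates are in `dd-mm-yy` order and that `poi_name` matches `name_short`. The repository may compute distances differently.

```python
import csv
import io
import math
from collections import defaultdict
from datetime import datetime

VISITS_CSV = """date,time,poi_name,card_id,entrance_type
15-08-14,10:30:45,Arena,0403E98ABF3181,standard
15-08-14,14:15:30,Casa di Giulietta,0403E98ABF3181,priority
"""
SITES_CSV = """name_short,latitude,longitude,category
Arena,45.4394,10.9947,Monument
Casa di Giulietta,45.4419,10.9988,Museum
"""


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two WGS84 points."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


# POI name -> (latitude, longitude)
coords = {
    row["name_short"]: (float(row["latitude"]), float(row["longitude"]))
    for row in csv.DictReader(io.StringIO(SITES_CSV))
}

# card_id -> list of (timestamp, POI name)
sequences = defaultdict(list)
for row in csv.DictReader(io.StringIO(VISITS_CSV)):
    # Assumption: dd-mm-yy, so 15-08-14 is 15 August 2014.
    stamp = datetime.strptime(f"{row['date']} {row['time']}", "%d-%m-%y %H:%M:%S")
    sequences[row["card_id"]].append((stamp, row["poi_name"]))

for card_id, visits in sequences.items():
    visits.sort()  # chronological order
    for (_, src), (_, dst) in zip(visits, visits[1:]):
        distance = haversine_km(*coords[src], *coords[dst])
        print(f"{card_id}: {src} -> {dst}: {distance:.2f} km")
```

For the sample rows this gives a distance of roughly 0.4 km between Arena and Casa di Giulietta, well within walking range.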
Ethics & Privacy:
- IRB Approval: University of Verona Ethics Committee
- Data Protection: GDPR compliant
- Privacy: Fully anonymized with pseudonymous card IDs
- License: Academic research only (CC-BY-NC)
LLM-Mob-As-Mobility-Interpreter/
├── veronacard_mob_with_geom_time_parrallel.py   # Main: Geospatial + Temporal
├── veronacard_mob_with_geom_parrallel.py        # Geospatial only
├── veronacard_mob_versione_base_parrallel.py    # Base version
├── data/
│   └── verona/
│       ├── vc_site.csv                          # 70 POIs with GPS
│       ├── dati_2014.csv                        # ~370K visits per year
│       └── dati_2015-2023.csv
├── results/                                     # Predictions output
│   └── {model}/{strategy}/{anchor}/
├── metrics/                                     # Pre-computed metrics
│   ├── strategy/                                # Per model/strategy/anchor
│   ├── inter_model_comparison/                  # Cross-model analysis
│   └── time_analysis/                           # Processing time stats
├── notebook/                                    # Jupyter analysis
│   └── singole_metriche_canva.ipynb
├── time_4_GPU.sh                                # SLURM job script (RECOMMENDED)
├── geom_4_GPU.sh
├── base_4_GPU.sh
├── ollama_ports.txt                             # Multi-instance config
└── requirements.txt
GPU Out of Memory:
# Reduce batch size in script
num_batch: 512 → 256
# Or reduce GPU memory fraction
GPU_MEMORY_FRACTION = 0.90
Ollama Connection Timeout:
# Check Ollama instances
curl http://localhost:11434/api/tags
curl http://localhost:11435/api/tags
curl http://localhost:11436/api/tags
curl http://localhost:11437/api/tags
# Restart if needed
pkill ollama
./start_ollama_cluster.sh
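The same check can be scripted. The sketch below probes `/api/tags` on each of the four ports and reports which instances answer. The helper names and the 5-second timeout are illustrative choices, not part of the project.

```python
import json
import urllib.request
from http.client import HTTPException
from typing import Iterable, List

OLLAMA_PORTS = range(11434, 11438)  # 11434-11437, one instance per GPU


def instance_urls(host: str = "localhost",
                  ports: Iterable[int] = OLLAMA_PORTS) -> List[str]:
    """Base URLs of the Ollama instances."""
    return [f"http://{host}:{port}" for port in ports]


def is_healthy(base_url: str, timeout: float = 5.0) -> bool:
    """True if the instance answers /api/tags with HTTP 200 and valid JSON."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            json.loads(resp.read().decode("utf-8"))
            return resp.status == 200
    except (OSError, ValueError, HTTPException):
        # Connection refused, timeout, HTTP error, or malformed response.
        return False


if __name__ == "__main__":
    for url in instance_urls():
        print(url, "OK" if is_healthy(url) else "DOWN")
```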
Circuit Breaker Open:
# Wait 60s for automatic recovery, or check GPU health
nvidia-smi -q -d TEMPERATURE,ECC,PERFORMANCE
Checkpoint Corruption:
# Delete checkpoint and restart with --force
find results -name '*_checkpoint.txt' -delete
python veronacard_mob_with_geom_time_parrallel.py --force
- Current System: Ollama multi-instance cluster (4× A100)
- Future Plan: vLLM with tensor parallelism for higher throughput
- Timeline: Planned for version 3.0
If you use LLM-Mob in your research, please cite:
@software{mattioli2025llm_mob,
author = {Mattioli, Simone},
title = {LLM-Mob: Tourist Mobility Prediction using Large Language Models on HPC Infrastructure},
url = {https://github.com/simo-hue/LLM-Mob-As-Mobility-Interpreter},
year = {2025},
note = {Independent implementation with custom HPC architecture}
}
Original inspiration (different implementation):
@article{wang2023llm_mobility,
title={Where Would I Go Next? Large Language Models as Human Mobility Predictors},
author={Wang, Xinglei and Fang, Meng and Zeng, Zichao and Cheng, Tao},
journal={arXiv preprint arXiv:2308.15197},
year={2023}
}
- CINECA - Leonardo HPC Infrastructure & Computational Resources
- University of Verona - VeronaCard Dataset & Research Support
Creative Commons Attribution-NonCommercial (CC BY-NC)
- Academic research use permitted
- Modification and redistribution permitted with attribution
- Commercial use prohibited without permission
- VeronaCard dataset redistribution requires University of Verona approval
Simone Mattioli
- Email: mattioli.simone.10@gmail.com
- GitHub: @simo-hue
- LinkedIn: Simone Mattioli
Made with ❤️ for the Tourism Analytics and AI Research Community
Keywords: Large Language Models, LLM Tourism Prediction, Next Destination Forecasting, Mobility Analytics, HPC Machine Learning, NVIDIA A100, Ollama Inference, Geospatial AI, Temporal Analysis, VeronaCard Dataset, Tourist Behavior Prediction, Leonardo CINECA, Qwen2.5, Mistral AI