	---
	license: apache-2.0
	language:
	- en
	tags:
	- scene-graph-generation
	- object-detection
	- visual-relationship-detection
	- pytorch
	- yolo
	pipeline_tag: object-detection
	library_name: sgg-benchmark
	model-index:
	- name: REACT++ yolo12m
	  results:
	  - task:
	      type: object-detection
	      name: Scene Graph Detection
	    dataset:
	      name: VG150
	      type: vg150
	    metrics:
	    - type: mR@20
	      value: 10.81
	      name: mR@20
	    - type: R@20
	      value: 18.76
	      name: R@20
	    - type: mR@50
	      value: 14.42
	      name: mR@50
	    - type: R@50
	      value: 24.63
	      name: R@50
	    - type: mR@100
	      value: 16.78
	      name: mR@100
	    - type: R@100
	      value: 28.47
	      name: R@100
	    - type: F1@20
	      value: 13.72
	      name: F1@20
	    - type: F1@50
	      value: 18.19
	      name: F1@50
	    - type: F1@100
	      value: 21.11
	      name: F1@100
	    - type: e2e_latency_ms
	      value: 20.5
	      name: e2e_latency_ms
	- name: REACT++ yolo26m
	  results:
	  - task:
	      type: object-detection
	      name: Scene Graph Detection
	    dataset:
	      name: VG150
	      type: vg150
	    metrics:
	    - type: mR@20
	      value: 10.81
	      name: mR@20
	    - type: R@20
	      value: 21.12
	      name: R@20
	    - type: mR@50
	      value: 14.6
	      name: mR@50
	    - type: R@50
	      value: 28.34
	      name: R@50
	    - type: mR@100
	      value: 18.36
	      name: mR@100
	    - type: R@100
	      value: 33.7
	      name: R@100
	    - type: F1@20
	      value: 14.3
	      name: F1@20
	    - type: F1@50
	      value: 19.27
	      name: F1@50
	    - type: F1@100
	      value: 23.77
	      name: F1@100
	    - type: e2e_latency_ms
	      value: 19.8
	      name: e2e_latency_ms
	- name: REACT++ yolov8m
	  results:
	  - task:
	      type: object-detection
	      name: Scene Graph Detection
	    dataset:
	      name: VG150
	      type: vg150
	    metrics:
	    - type: mR@20
	      value: 12.22
	      name: mR@20
	    - type: R@20
	      value: 22.89
	      name: R@20
	    - type: mR@50
	      value: 16.31
	      name: mR@50
	    - type: R@50
	      value: 29.96
	      name: R@50
	    - type: mR@100
	      value: 18.45
	      name: mR@100
	    - type: R@100
	      value: 34.09
	      name: R@100
	    - type: F1@20
	      value: 15.93
	      name: F1@20
	    - type: F1@50
	      value: 21.12
	      name: F1@50
	    - type: F1@100
	      value: 23.94
	      name: F1@100
	    - type: e2e_latency_ms
	      value: 18.7
	      name: e2e_latency_ms
	---

	# REACT++ Scene Graph Generation — VG150 (yolo12m, yolo26m, yolov8m)

	This repository contains REACT++ model checkpoints for scene graph generation (SGG)
	on the VG150 benchmark, one for each of three YOLO backbone variants (yolo12m, yolo26m, yolov8m).

	REACT++ is a parameter-efficient, attention-augmented relation predictor built on top of
	a YOLO backbone. It uses:

	- DAMP (Detection-Anchored Multi-Scale Pooling), a new simple pooling algorithm for one-stage object detectors such as YOLO
	- SwiGLU gated MLP for all feed-forward blocks (½ the params of ReLU-MLP at equal capacity)
	- Visual x Semantic cross-attention — visual tokens attend to GloVe prototype embeddings
	- Geometry RoPE — box-position encoded as a rotary frequency bias on the Q matrix
	- Prototype Momentum Buffer — per-class EMA prototype bank
	- P5 Scene Context — AIFI-enhanced P5 tokens provide global context via cross-attention
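The Prototype Momentum Buffer item above can be read as a per-class exponential moving average (EMA). The snippet below is a minimal sketch of a generic EMA prototype bank in plain Python, not the REACT++ implementation. The class name `PrototypeBank`, the momentum value 0.9, and the choice to initialise a prototype from the first feature seen are all assumptions made for illustration.

```python
from typing import Dict, List


class PrototypeBank:
    """Hypothetical per-class EMA prototype bank (illustrative only)."""

    def __init__(self, momentum: float = 0.9) -> None:
        # momentum close to 1.0 means prototypes change slowly (assumed value)
        self.momentum = momentum
        self.prototypes: Dict[int, List[float]] = {}

    def update(self, class_id: int, feature: List[float]) -> List[float]:
        """Blend a new feature vector into the stored prototype of its class."""
        old = self.prototypes.get(class_id)
        if old is None:
            # First observation of this class: start from the feature itself.
            new = list(feature)
        else:
            m = self.momentum
            new = [m * o + (1.0 - m) * f for o, f in zip(old, feature)]
        self.prototypes[class_id] = new
        return new


bank = PrototypeBank(momentum=0.9)
bank.update(3, [1.0, 0.0])   # initialises the prototype for class 3
bank.update(3, [0.0, 1.0])   # moves it 10% of the way towards the new feature
print(bank.prototypes[3])
```

With a momentum of 0.9, a single outlier feature shifts a class prototype by only a tenth of the distance, which is the usual motivation for keeping prototypes in an EMA buffer.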

	The models were trained with the
	[SGG-Benchmark](https://github.com/Maelic/SGG-Benchmark) framework and are described in the
	[REACT++ paper (Neau et al., 2026)](https://arxiv.org/abs/2603.06386).

	---

	## Results — SGDet on VG150 test split (CUDA, max_det=100, batch_size=1)

	> Metrics from end-to-end evaluation (`tools/evaluate.py`). Latency = model forward only.

	| Backbone | R@20 | R@50 | R@100 | mR@20 | mR@50 | mR@100 | F1@20 | F1@50 | F1@100 | Lat. (ms) |
	|----------|-----:|-----:|------:|------:|------:|-------:|------:|------:|-------:|----------:|
	| yolo12m | 18.76 | 24.63 | 28.47 | 10.81 | 14.42 | 16.78 | 13.72 | 18.19 | 21.11 | 20.5 |
	| yolo26m | 21.12 | 28.34 | 33.7 | 10.81 | 14.6 | 18.36 | 14.3 | 19.27 | 23.77 | 19.8 |
	| yolov8m | 22.89 | 29.96 | 34.09 | 12.22 | 16.31 | 18.45 | 15.93 | 21.12 | 23.94 | 18.7 |
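The F1@K columns are numerically consistent with the harmonic mean of R@K and mR@K. The check below is a small sketch that recomputes a few of them from the table. The helper name `f1_at_k` is ours, and the harmonic-mean definition is inferred from the reported numbers rather than stated in this card.

```python
def f1_at_k(recall: float, mean_recall: float) -> float:
    """Harmonic mean of R@K and mR@K, both given in percent."""
    total = recall + mean_recall
    if total == 0:
        return 0.0
    return 2.0 * recall * mean_recall / total


# (R@K, mR@K, reported F1@K) copied from the results table
rows = {
    "yolo12m @20": (18.76, 10.81, 13.72),
    "yolo26m @50": (28.34, 14.6, 19.27),
    "yolov8m @100": (34.09, 18.45, 23.94),
}

for name, (recall, mean_recall, reported) in rows.items():
    computed = f1_at_k(recall, mean_recall)
    print(f"{name}: computed {computed:.2f}, reported {reported:.2f}")
```

Because F1@K is a harmonic mean, it is pulled towards the lower of the two values, so a backbone cannot score well on it through high R@K alone.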

	---

	## Checkpoints

	| Variant | Sub-folder | Checkpoint files |
	|---------|------------|------------------|
	| yolo12m | `yolo12m/` | `yolo12m/model.onnx` (ONNX) · `yolo12m/best_model_epoch_19.pth` (PyTorch) |
	| yolo26m | `yolo26m/` | `yolo26m/model.onnx` (ONNX) · `yolo26m/best_model_epoch_18.pth` (PyTorch) |
	| yolov8m | `yolov8m/` | `yolov8m/model.onnx` (ONNX) · `yolov8m/best_model_epoch_6.pth` (PyTorch) |
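Since the PyTorch checkpoint name differs per variant (the epoch number is part of the file name), it can help to keep the table as a lookup. The sketch below simply transcribes the table. `CHECKPOINT_FILES` and `checkpoint_filename` are hypothetical names for illustration, not part of SGG-Benchmark.

```python
# File names transcribed from the checkpoint table; paths are relative to the repo root.
CHECKPOINT_FILES = {
    "yolo12m": {"onnx": "yolo12m/model.onnx", "pytorch": "yolo12m/best_model_epoch_19.pth"},
    "yolo26m": {"onnx": "yolo26m/model.onnx", "pytorch": "yolo26m/best_model_epoch_18.pth"},
    "yolov8m": {"onnx": "yolov8m/model.onnx", "pytorch": "yolov8m/best_model_epoch_6.pth"},
}


def checkpoint_filename(variant: str, fmt: str = "onnx") -> str:
    """Return the in-repo path for a backbone variant and format ('onnx' or 'pytorch')."""
    try:
        return CHECKPOINT_FILES[variant][fmt]
    except KeyError:
        raise ValueError(f"unknown variant/format: {variant!r}/{fmt!r}") from None


print(checkpoint_filename("yolov8m", "pytorch"))
```

The returned string can be passed as the `filename` argument of `hf_hub_download`.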

	---

	## Usage

	### ONNX (recommended — no Python dependencies beyond onnxruntime)

	```python
	from huggingface_hub import hf_hub_download

	# File name as listed in the Checkpoints table
	onnx_path = hf_hub_download(
	    repo_id="maelic/REACTPlusPlus_VG150",
	    filename="yolo12m/model.onnx",
	    repo_type="model",
	)
	# Run with tools/eval_onnx_psg.py or load directly via onnxruntime
	```
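To load the file directly via onnxruntime, a reasonable first step is to open a session and inspect the graph's inputs and outputs, since this card does not document the tensor names or shapes. The helper below is a minimal sketch under that assumption. `describe_onnx_io` is a hypothetical name, and the CPU provider is chosen only so the snippet runs without a GPU.

```python
from typing import Dict, List, Tuple


def describe_onnx_io(model_path: str) -> Dict[str, List[Tuple[str, list, str]]]:
    """Open an ONNX model and list (name, shape, dtype) for each input and output."""
    # Imported lazily so that defining the helper does not require onnxruntime.
    import onnxruntime as ort

    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    return {
        "inputs": [(arg.name, arg.shape, arg.type) for arg in session.get_inputs()],
        "outputs": [(arg.name, arg.shape, arg.type) for arg in session.get_outputs()],
    }


# Example, using the `onnx_path` downloaded above:
#   for kind, args in describe_onnx_io(onnx_path).items():
#       print(kind, args)
```

The reported input names and shapes are what a later `session.run(...)` call needs. Preprocessing details such as image size and normalisation are not specified here and should be taken from the SGG-Benchmark repository.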

	### PyTorch

	```python
	# 1. Clone the repository
	# git clone https://github.com/Maelic/SGG-Benchmark

	# 2. Install dependencies
	# pip install -e .

	# 3. Download checkpoint + config
	from huggingface_hub import hf_hub_download

	# Checkpoint file name as listed in the Checkpoints table
	ckpt_path = hf_hub_download(
	    repo_id="maelic/REACTPlusPlus_VG150",
	    filename="yolo12m/best_model_epoch_19.pth",
	    repo_type="model",
	)
	cfg_path = hf_hub_download(
	    repo_id="maelic/REACTPlusPlus_VG150",
	    filename="yolo12m/config.yml",
	    repo_type="model",
	)

	# 4. Run evaluation from the root of the cloned SGG-Benchmark repository
	#    (the script path below is relative to it)
	import subprocess
	subprocess.run([
	    "python", "tools/relation_eval_hydra.py",
	    "--config-path", str(cfg_path),
	    "--task", "sgdet",
	    "--eval-only",
	    "--checkpoint", str(ckpt_path),
	])
	```

	---

	## Citation

	```bibtex
	@article{neau2026reactpp,
	  title  = {REACT++: Efficient Cross-Attention for Real-Time Scene Graph Generation},
	  author = {Neau, Maëlic and Falomir, Zoe},
	  year   = {2026},
	  url    = {https://arxiv.org/abs/2603.06386},
	}
	```