Task Graph
Determine how each task should be performed (using keysteps) based on a given video segment.
Task Definition
Given a video segment and its segment history , models have to:
- Determine the list of previous keysteps to be performed before ;
- Infer if is an optional keystep, i.e. the procedure can be completed even skipping this keystep;
- Infer if is a procedural mistake, i.e. a mistake due to incorrect keystep ordering;
- Predict a list of missing keysteps. These are keysteps which should have been performed before but have not been performed;
- Forecast next keysteps. These are keysteps for which dependencies are satisfied after the execution of and hence can be executed next.
The task is weakly supervised, with two versions based on two different levels of supervision: 1) instance-level: segments and their keystep labels are available during training and inference; 2) procedure-level: unlabeled segments and procedure-specific keystep names are given for training and inference.
Note that, when the procedure-level supervision is considered, the input to the model excludes keystep labels both at training and test time. At both the procedure and instance levels of supervision, models are required to process the video in a causal fashion, meaning that predictions made at time only depend on observations made at time .
Metrics
Baselines are evaluated using calibrated Average Precision (cAP), as defined in this research paper. Note that, according to this measure, a random baseline would on average achieve a performance of .
Baseline
Coming Soon!