This benchmark operates either in full mode for registered users (U) or in a restricted anonymous mode (no custom mosaic creation, no database entries for test results or algorithm data). To register, go to the registration page. Individual hot spots are explained via the contextual help in the status bar.


Welcome to the Prague Texture Segmentation Benchmark, whose purpose is:
- to mutually compare and rank different (dynamic/static) texture segmenters (supervised or unsupervised),
- to support the development of new segmentation and classification methods and to track progress during an algorithm's development,
- to track and measure the progress toward human-level segmentation performance over time.
This server allows users:
- to obtain customized experimental texture mosaics and their corresponding ground-truths (U),
- to obtain the benchmark texture mosaic set with the corresponding ground-truths,
- to evaluate their working segmenters and compare them with state-of-the-art algorithms (U),
- to include their algorithm's details (reference, abstract, benchmark results) in the benchmark database (U),
- to check the evaluation details of individual mosaics (the criteria values and the resulting thematic maps),
- to rank segmentation algorithms according to the most common benchmark criteria,
- to assess robustness with respect to noise,
- to obtain LaTeX-coded resulting criteria tables or export data in MATLAB format (U) (see the loading sketch after this list),
- to select a user-defined subset of the criteria (U).
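
For users who prefer Python, a MATLAB-format export can be read with SciPy. The following is a minimal sketch; the file name and the loop over variables are illustrative assumptions, not the benchmark's documented export layout.

```python
# Minimal sketch: reading a MATLAB-format export of benchmark criteria in Python.
# The file name "benchmark_results.mat" is a hypothetical placeholder.
from scipy.io import loadmat

data = loadmat("benchmark_results.mat")   # returns a dict: variable name -> NumPy array
for name, value in data.items():
    if not name.startswith("__"):         # skip MATLAB file metadata entries
        print(name, getattr(value, "shape", value))
```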
Dataset
- Computer-generated texture mosaics and benchmarks are composed of the following image types:
- monospectral textures,
- multispectral textures,
- BTF (bidirectional texture function) textures [UTIA BTF, BTF Bonn],
- ALI hyperspectral satellite images [Earth Observing 1],
- very-high-resolution RGB images [GeoEye],
- dynamic textures [DynTex],
- rotation invariant texture set,
- scale invariant texture set,
- illumination invariant texture set.
- All generated texture mosaics can be corrupted with additive Gaussian, Poisson, or salt-and-pepper noise (see the sketch after this list).
- The corresponding training (hold-out) sets are supplied in the classification (supervised) mode.
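
The noise corruption is applied on the server side when the mosaics are generated; purely as an illustration of the three noise types listed above, here is a minimal NumPy sketch (the function name and parameter values are assumptions, not the benchmark's actual settings):

```python
import numpy as np

rng = np.random.default_rng(0)

def corrupt(image, kind, sigma=10.0, sp_fraction=0.05):
    """Corrupt an 8-bit image array with one of three noise types.

    sigma (Gaussian std.) and sp_fraction (fraction of salt & pepper pixels)
    are illustrative defaults only, not the benchmark's settings.
    """
    img = image.astype(np.float64)
    if kind == "gaussian":                       # additive Gaussian noise
        noisy = img + rng.normal(0.0, sigma, img.shape)
    elif kind == "poisson":                      # Poisson (shot) noise
        noisy = rng.poisson(np.clip(img, 0, None)).astype(np.float64)
    elif kind == "salt_pepper":                  # salt & pepper noise
        noisy = img.copy()
        mask = rng.random(img.shape)
        noisy[mask < sp_fraction / 2] = 0        # pepper
        noisy[mask > 1 - sp_fraction / 2] = 255  # salt
    else:
        raise ValueError(f"unknown noise kind: {kind}")
    return np.clip(noisy, 0, 255).astype(np.uint8)
```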
Benchmark evaluation
- Submitted results are stored in the server database and used to rank algorithms according to a selected criterion from the following criteria set (a small illustrative computation of a few of these criteria follows at the end of this section):
- region-based (including the sensitivity graphs):
- CS - correct detection,
- OS - over-segmentation,
- US - under-segmentation,
- ME - missed error,
- NE - noise error,
- pixel-wise:
- O - omission error,
- C - commission error,
- CA - class accuracy,
- CO - recall = correct assignment,
- CC - precision = object accuracy,
- I - type I error,
- II - type II error,
- EA - mean class accuracy estimate,
- MS - mapping score,
- RM - root mean square proportion estimation error,
- CI - comparison index,
- F-measure (weighted harmonic mean of precision and recall) graph,
- consistency measures:
- GCE - global consistency error,
- LCE - local consistency error,
- BCE - bidirectional consistency error,
- GBCE - global bidirectional consistency error,
- clustering comparison:
- BGM - bipartite graph matching,
- VD - Van Dongen,
- L - Larsen,
- SC, SSC - segmentation covering,
- information:
- AVI - variation of information (normalized w.r.t. number of pixels),
- NVI - variation of information (normalized w.r.t. number of classes/regions),
- NMI - normalized mutual information,
- set:
- JC - Jaccard coefficient,
- DC - Dice coefficient,
- FMI - Fowlkes and Mallows index,
- ARI - adjusted Rand index,
- WI, WII - Wallace indices,
- M - Mirkin metric,
- boundary:
- NBDE - normalized boundary displacement error,
- meta-criteria:
- RANK - average of ranks (over displayed criteria),
- AVG - average of values (over displayed criteria),
- NORM - average of z-scores (over displayed criteria).
- All criteria are displayed after multiplication by 100.
- Result values for the dynamic benchmark are averages of the criteria values over all video frames.
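
To make a few of the pixel-wise and set criteria concrete, the sketch below computes precision (CC), recall (CO), the F-measure (here the unweighted F1), and the Jaccard (JC) and Dice (DC) coefficients for a single class from two boolean masks, scaled by 100 as on the result pages. This is an illustrative single-class simplification, not the benchmark's official multi-class, region-matched evaluation code.

```python
import numpy as np

def single_class_criteria(segmentation, ground_truth):
    """Precision, recall, F-measure, Jaccard and Dice for one class.

    Both arguments are boolean masks of equal shape marking the pixels
    assigned to the class. Illustrative only; the benchmark evaluates all
    classes/regions with its own matching scheme.
    """
    seg = np.asarray(segmentation, dtype=bool)
    gt = np.asarray(ground_truth, dtype=bool)
    tp = np.count_nonzero(seg & gt)     # correctly assigned pixels
    fp = np.count_nonzero(seg & ~gt)    # commission-error pixels
    fn = np.count_nonzero(~seg & gt)    # omission-error pixels

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    jaccard = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    dice = 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0
    # The benchmark displays all criteria multiplied by 100.
    return {name: 100.0 * value for name, value in [
        ("CC (precision)", precision), ("CO (recall)", recall),
        ("F-measure", f_measure), ("JC (Jaccard)", jaccard),
        ("DC (Dice)", dice)]}
```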