.. _benchmark:

Benchmarking
============

This module contains code for benchmarking attribution methods, including
reproducing several published results. In addition to implementations of
benchmarking protocols (:mod:`.pointing_game`), the module also provides
implementations of *reference datasets* and *reference models* used in prior
research work, properly converted to PyTorch. Overall, these implementations
closely reproduce prior results, notably the ones in the [EBP]_ paper.

A standard benchmarking suite is included in this library as
:mod:`examples.standard_suite`. For slow methods, a computer cluster may be
required for evaluation (we do not include explicit support for clusters, but
it is easy to add one on top of this example code).

It is also recommended to turn on logging (see
:mod:`torchray.benchmark.logging`), which allows the driver to use MongoDB to
store partial benchmarking results as it goes. Computations can then be cached
and reused to resume the calculations after a crash or other issue. In order
to start the logging server, use

.. code:: shell

    $ python -m torchray.benchmark.server

The server parameters (address, port, etc.) can be configured by writing a
``.torchrayrc`` file in your current or home directory. The package contains
an example configuration file. The server creates a regular MongoDB database
(by default in ``./data/db``), which can be explored manually by means of the
MongoDB shell.

By default, the driver writes data in the ``./data/`` subfolder. You can
change that via the configuration file or, possibly more easily, add a
symbolic link pointing to where you want to store the data.

The data include the *datasets* (PASCAL VOC, COCO, ImageNet; see
:mod:`torchray.benchmark.datasets`). These must be downloaded manually and
stored in ``./data/datasets/{voc,coco,imagenet}`` unless this is changed via
the configuration file. Note that these datasets can be very large (many GBs).

The data also include *reference models* (see
:mod:`torchray.benchmark.models`).

.. automodule:: torchray.benchmark
    :members:
    :show-inheritance:

Pointing Game
-------------

The *Pointing Game* [EBP]_ assesses the quality of an attribution method by
testing how well it can extract from a predictor a response correlated with
the presence of known object categories in the image.

Given an input image :math:`x` containing an object of category :math:`c`,
the attribution method is applied to the predictor in order to find the part
of the image responsible for predicting :math:`c`. The attribution method
usually returns a saliency heatmap, which must then be converted into a
single point :math:`(u,v)` that is "most likely" to be contained by an object
of that class. The specific way the point is obtained is method-dependent.
The attribution method then scores a hit if the point is within a *tolerance*
:math:`\tau` (set to 15 pixels by default) of the image region :math:`\Omega`
containing that object:

.. math::

    \operatorname{hit}(u,v|\Omega) =
    [\exists (u',v') \in \Omega : \|(u,v) - (u',v')\| \leq \tau].

The point coordinates :math:`(u,v)` are also indices :math:`x_{ncvu}` in the
input image tensor :math:`x`.

RISE [RISE]_ and Extremal Perturbation [EP]_ results are averaged over 3 runs.

.. csv-table:: Pointing game results
    :widths: auto
    :header-rows: 2
    :stub-columns: 1
    :file: pointing.csv

.. automodule:: torchray.benchmark.pointing_game
    :members:
    :show-inheritance:

Datasets
--------

.. automodule:: torchray.benchmark.datasets
    :members:
    :show-inheritance:
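
For orientation, the following is a minimal sketch of how a reference dataset
might be loaded once it has been downloaded to ``./data/datasets`` (or to the
directory set in ``.torchrayrc``). The loader name and the dataset/subset
strings below are assumptions for illustration only; check the API reference
above for the exact names and signatures.

.. code:: python

    # Minimal sketch: the loader name and the 'voc_2007'/'test' strings are
    # assumptions; check the datasets API reference for the exact signature.
    from torchray.benchmark.datasets import get_dataset, VOC_CLASSES

    # Expects PASCAL VOC to have been downloaded to ./data/datasets/voc
    # (or to the directory configured in .torchrayrc).
    dataset = get_dataset('voc_2007', 'test')

    print(f'{len(dataset)} images, {len(VOC_CLASSES)} classes')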

.. autodata:: IMAGENET_CLASSES
    :annotation:

.. autodata:: VOC_CLASSES
    :annotation:

.. autodata:: COCO_CLASSES
    :annotation:

Reference models
----------------

.. automodule:: torchray.benchmark.models
    :members:
    :show-inheritance:

Logging with MongoDB
--------------------

.. automodule:: torchray.benchmark.logging
    :members:
    :show-inheritance:
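
Besides the helpers in this module, the MongoDB database created by
``torchray.benchmark.server`` is a regular database and can be inspected
directly, for instance with ``pymongo``. The sketch below assumes the default
MongoDB host and port; substitute the values from your ``.torchrayrc``, and
note that the database and collection names depend on the experiments you
have actually run.

.. code:: python

    # Minimal sketch for inspecting the logging database with pymongo.
    # Host, port, and the database layout are assumptions: use the values
    # from your .torchrayrc and the names created by your experiment runs.
    from pymongo import MongoClient

    client = MongoClient('localhost', 27017)

    # List the databases and collections created by the benchmark driver,
    # together with the number of cached result documents in each.
    for db_name in client.list_database_names():
        db = client[db_name]
        print(db_name)
        for collection_name in db.list_collection_names():
            count = db[collection_name].count_documents({})
            print(f'  {collection_name}: {count} documents')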