Scenes

The original version of the following list of visual stimulus sets was compiled by Johanna Margret Sigurdardottir and will be updated as needed. We neither host nor provide copies of the stimuli. Researchers who wish to use a particular stimulus set should seek further information, including on possible licences, e.g. by following the provided web links, reading the referenced papers, and/or emailing the listed contact person(s) for that stimulus set. If you notice an error, know of a stimulus set that should be included, or have any other questions or comments, please contact Heida Maria Sigurdardottir (heidasi(Replace this parenthesis with the @ sign)hi.is). The list is provided as is without any warranty whatsoever.

These data sets contain images of scenes categorized into “Natural scenes”, “Man-made scenes”, and “Various scenes”. Natural scenes include images of nature (mostly) without buildings or other man-made objects. Man-made scenes include images of the inside or outside of man-made structures. Various scenes include pictures of building interiors as well as natural scenes with and without man-made structures or objects.

Table of Contents

Natural scenes

Natural Scenes Dataset (NSD)

Description: The main component of the Natural Scenes Dataset (NSD) is high-resolution whole-brain 7T fMRI responses to 70,000+ natural scene images across 8 human observers. These images are taken from Microsoft’s Common Objects in Context (COCO) image database.

License: People who would like to access the NSD dataset should fill out a short NSD Data Access Agreement.

Link: Images, further documentation on the dataset, and accompanying paper.

Reference: Allen, E. J., St-Yves, G., Wu, Y., Breedlove, J. L., Prince, J. S., Dowdle, L. T., … & Kay, K. (2021). A massive 7T fMRI dataset to bridge cognitive neuroscience and artificial intelligence. Nature Neuroscience, 1-11.

Nature Scene Collection

Description: This set contains 1204 pictures of real natural scenes taken in Austin, Texas, USA. The pictures do not show any people or man-made objects.

License: https://creativecommons.org/licenses/by/3.0/

Link: http://natural-scenes.cps.utexas.edu/db.shtml#nature_scene_collection    

References:

Geisler, W. S., & Perry, J. S. (2011). Statistics for optimal point prediction in natural images. Journal of Vision, 11(12), article 14.

http://jov.arvojournals.org/article.aspx?articleid=2121019

UPenn Natural Image Database

Description: This set contains 4000 photos of natural scenes taken at a baboon habitat in Botswana. Some photos contain baboons and other animals.

License: Creative Commons Attribution-NonCommercial

Link: http://tofu.psych.upenn.edu/~upennidb/

Reference: Tkačik, G., Garrigan, P., Ratliff, C., Milčinski, G., Klein, J. M., Seyfarth, L. H., Sterling, P., et al. (2011). Natural images from the birthplace of the human eye. PLoS ONE, 6(6), e20409. doi:10.1371/journal.pone.0020409

Pacific Labelled Corals

Description: This dataset includes 5090 coral reef survey pictures from 4 Pacific monitoring projects: Moorea in French Polynesia, Heron Reef in Australia, Nanwan Bay in Taiwan, and the northern Line Islands. All pictures have been annotated by coral reef experts using a random point annotation tool.

License: see website.

Link: http://mcr.lternet.edu/cgi-bin/showDataset.cgi?docid=knb-lter-mcr.5013

Reference: Edmunds, P., of Moorea Coral Reef LTER (2015). MCR LTER: Coral Reef: Computer Vision: Multi-annotator Comparison of Coral Photo Quadrat Analysis. knb-lter-mcr.5013.3

Man-made scenes

Indoor scenes

Change Blindness (CB) Database

Description: This set was created to study change blindness, i.e. the failure to notice a change in a scene. According to Sareen, Ehinger, and Wolfe (2015), the images are 1024×768 pixels in JPEG format. The database is divided into 5 subsets:

  1. “The main dataset” includes 130 real-life scenes, mostly indoor environments. There are about 4 images of each scene: the same object is either present or absent, and each version also appears horizontally reversed.
  2. “The Window change dataset” consists of 12 distinct indoor scenes, with 4 pictures of each. In one image an object is positioned outside the room and seen through a window; in another, the same object is seen inside the room.
  3. “Mirror change dataset” includes 24 scenes with a “Mirror condition” and a “Disjoint condition.” In the Disjoint condition the altered object is seen either only as a reflection in a mirror or only in the room itself; it is never seen in the room and in the mirror at the same time. In the Mirror condition the object is seen both in the room and as a reflection in the mirror: for the room change, the object disappears from the room but can still be seen as a reflection in the mirror; for the mirror change, the object remains in the room but its reflection in the mirror disappears (Sareen, Ehinger, & Wolfe, 2015).
  4. “Additional CB images” contains two images of each of 62 different scenes, identical except that one random object differs in color between the two pictures. This subset also includes 2 images of each of 50 further scenes, with a random object visible in one but not the other.
  5. “Shadow Change Sets” consists of pictures of 96 places where an object or an object’s shadow changes.

License: see website.

Link: http://search.bwh.harvard.edu/new/CBDatabase.html

Reference:

  1. Sareen, P., Ehinger, K.A., & Wolfe, J.M. (2015). CB database: A change blindness database for objects in natural indoor scenes. Behavior Research Methods. [PDF].
  2. Ehinger, K. A., Allen, K., & Wolfe, J. M. (2016). Change blindness for cast shadows in natural scenes: Even informative shadow changes are missed. [journal article]. Attention, Perception, & Psychophysics, 78(4), 978-987. doi: 10.3758/s13414-015-1054-7
  3. Sareen, P., Ehinger, K., & Wolfe, J. M. (2015). Through the looking-glass: Objects in the mirror are less real. Psychonomic Bulletin & Review, 22(4), 980-986. doi: 10.3758/s13423-014-0761-8

More articles can be found here:

http://search.bwh.harvard.edu/new/publications.html

Scene Size x Clutter Database

Description: This set has 36 categories of indoor scenes with 12 pictures in each category. The scenes are categorized by size, i.e. how many people would fit in the scene/room, and by clutter, i.e. how full they are, from empty to completely packed (Park, Konkle, & Oliva, 2015).

License: not known

Link: http://konklab.fas.harvard.edu/#

Reference:

Park, S., Konkle, T., & Oliva, A. (2015). Parametric Coding of the Size and Clutter of Natural Scenes in the Human Brain. Cerebral Cortex, 25(7), 1792-1805.

http://cercor.oxfordjournals.org/content/25/7/1792.full.pdf+html

Scene Categories by Size

Description: This set has 288 pictures taken inside buildings, e.g. apartments or stadiums. The pictures are arranged by size into 18 groups with 16 different scenes in each group. Size is defined by how many people would fit inside the room/scene; scenes expected to hold the same number of people are grouped together (Park, Konkle, & Oliva, 2015).

License: not known

Link: http://konklab.fas.harvard.edu/#

Reference:

Park, S., Konkle, T., & Oliva, A. (2015). Parametric Coding of the Size and Clutter of Natural Scenes in the Human Brain. Cerebral Cortex, 25(7), 1792-1805.

http://cercor.oxfordjournals.org/content/25/7/1792.full.pdf+html

“Shuffle stimulus set”

Description: This database was used to study our ability to notice change. Most of the pictures are of indoor scenes, and some are of outdoor scenes. All are 600 × 450 pixels. Each scene was photographed three times from different angles: the new viewpoint was obtained either by the photographer taking a large step, or by tilting the tripod-mounted camera forward or to the side. For one of these angle changes, one object was removed from the scene.

License: see website.

Link: http://search.bwh.harvard.edu/new/Shuffle_Images.html

References:

Josephs, E., Drew, T., & Wolfe, J. (2015). Shuffling your way out of change blindness. [journal article]. Psychonomic Bulletin & Review, 23(1), 193-200. doi: 10.3758/s13423-015-0886-4

More articles can be found here http://search.bwh.harvard.edu/new/publications.html

Outdoor scenes

Campus Scene Collection

Description: 90 pictures taken on a university campus in Texas. The pictures show buildings, vehicles, people and plants.

License: https://creativecommons.org/licenses/by/3.0/

Link: http://natural-scenes.cps.utexas.edu/db.shtml#nature_scene_collection

References: see website.

Burge, J., & Geisler, W. S. (2011). Optimal defocus estimation in individual natural images. Proceedings of the National Academy of Sciences, 108(40), 16849-16854. http://www.pnas.org/content/108/40/16849.short

Correlated photographs: Camera edition

Description: In visual memory experiments, we often ask people to memorize large numbers of very different photographs. In contrast, everyday visual input is correlated in time and space: the view we see at the moment is similar to other recent or neighboring views. People likely take advantage of this correlation and allocate attention accordingly.

The stimulus set Correlated Photographs – Camera Edition was created to provide researchers with materials depicting similar, but different scenes from a city walk.

The set contains two basic forms of correlated materials:

panorama – a set of photographs taken from the same point
track – a set of photographs taken along a straight line

The set contains 128 panoramas and 128 tracks with approx. 12 photographs in each group. The photographs are resized to a fixed height (2000 px).

License: https://creativecommons.org/licenses/by/4.0/

Reference: https://osf.io/vy3sz/

UCSD Campus Images Dataset

Description: This database contains 1400 photos from various locations on the UCSD campus, categorized into 3 classes: “train”, “test”, and “googleglass”. “Train” has 1204 pictures with GPS tags, “test” has 272 pictures with GPS tags, and “googleglass” has 66 pictures obtained from Google Glass without GPS tags.

License: Not known.

Link: http://vision.ucsd.edu/content/ucsd-campus-images-dataset

Reference: Not known.

Various scenes

Indoor and Outdoor scenes

Description: This set contains 800 photos of various scenes, including indoor scenes and both man-made and natural outdoor scenes. The images can be used in isolation, but the creators of this database showed each scene next to a scene-relevant object, e.g. a picture of a pan to the right or left of a photo of a kitchen.

License: see website.

Link: https://labs.psych.ucsb.edu/eckstein/miguel/research_pages/saliencydata.html

Reference: Koehler, K., Guo, F., Zhang, S., & Eckstein, M. P. (2014). What do saliency models predict? Journal of Vision, 14(3), 14, 1-27. doi:10.1167/14.3.14

Places Database

Description: The Places Database is “…a repository of 10 million scene photographs, labeled with scene semantic categories and attributes, comprising a quasi-exhaustive list of the types of environments encountered in the world” (from ref. 1).

License: see website.

Link: http://places.csail.mit.edu/

Reference: 

  1. Zhou, B., Khosla, A., Lapedriza, A., Torralba, A., & Oliva, A. (2016). Places: An image database for deep scene understanding. arXiv preprint arXiv:1610.02055.
  2. Zhou, B., Lapedriza, A., Xiao, J., Torralba, A., & Oliva, A. (2014). Learning deep features for scene recognition using places database. In Advances in neural information processing systems (pp. 487-495). 

Street View Text Dataset

Description: This set includes 349 photos taken from Google Street View. Text in the scenes is often of low resolution, but the creators note that the text mostly comes from signs, and business names are easily detected.

License: Not known.

Link: http://vision.ucsd.edu/~kai/svt/

Reference:

  1. Wang, K., Babenko, B., & Belongie, S. (2011, November). End-to-end scene text recognition. In 2011 International Conference on Computer Vision (pp. 1457-1464). IEEE.
  2. Wang, K., & Belongie, S. (2010, September). Word spotting in the wild. In European Conference on Computer Vision (pp. 591-604). Springer Berlin Heidelberg.

4672 Scenes grouped into categories

Description: “4672 pictures of scenes sorted in 160 different categories.”

License: Not known.

Link: http://bradylab.ucsd.edu/stimuli.html

Reference:

Brady, T. F., Konkle, T., Alvarez, G.A., and Oliva, A. (2013). Real-world objects are not represented as bound units: Independent forgetting of different object details from visual memory. Journal of Experimental Psychology: General, 142(3), 791-808.