Authors: Michael Arrigo, Stephanie Strassel, Christopher Caruso.
Data source: web collection.
Data type: still image, text.
Applications: keyword spotting, language identification, OCR decoding, script identification, text localization.
LDC number: LDC2022T07.
In English, Arabic, Chinese, Persian, Hindi, Japanese, Kannada, Korean, Russian, Tamil, Thai, Urdu, Vietnamese.
Title from resource home page (LDC website, viewed February 27, 2023).