Datasets that can be used to train or evaluate approaches to automatic metadata generation and extraction.