Dataset of Virginia Flue-cured Tobacco Leaf images based on stalk leaf position for classification tasks: A case of Tanzania

Publication date: 10/09/2024
Source: PubMed "Tobacco Plant"
Data Brief. 2024 Aug 10;56:110817. doi: 10.1016/j.dib.2024.110817. eCollection 2024 Oct.

ABSTRACT

Nicotiana tabacum is a plant cultivated for its leaves, which are used in manufacturing medicine and cigarettes. Commonly known as the tobacco plant, it is grown in many countries, including China, Indonesia, Malawi and Tanzania. The literature suggests a technical gap in the proper identification of grade labels for the various parts of the plant. In addition, manual grading has led to various gaps and biases. Mitigating this calls for a data-driven grading solution, which in turn requires relevant datasets from various countries to train grade classifiers. This article presents images focused on one tobacco plant position, the Leaf position, which normally carries 23 grade labels. Because heavy rainfall washed away the fertilizer applied to the tobacco plants on the farms, we could not obtain images of one grade; the dataset therefore covers 22 grade labels. Images of tobacco leaves based on the tobacco plant position were collected in Tanzania through participatory community research. Canon 5D Mark III cameras with a 100 mm micro lens were used to photograph the leaves. Domain experts labelled and cleaned the images according to the tobacco grade labels identified in Tanzania. The dataset contains 49,779 images, which can be used to develop machine learning models for tobacco leaf grade label identification. It can be used to train new models and to enhance the performance of pre-trained models in any country of interest.

PMID: 39252771 | PMC: PMC11381425 | DOI: 10.1016/j.dib.2024.110817