24 / 2023-05-14 16:20:51
Coal Gangue Image Classification Method Based on Vision Transformer
Coal gangue classification, Image classification, Vision Transformer, Transfer learning, CLAHE
Full paper accepted
Yanwei Wang / Heilongjiang University of Science & Technology
Junbo Ye / Heilongjiang University of Science & Technology
Kaiyun Chen / Heilongjiang University of Science & Technology
Coal gangue classification is an effective way to fully utilize coal resources and reduce environmental pollution. Traditional coal gangue classification methods such as manual sorting and mechanical wet selection are inefficient, costly, and environmentally polluting. In the context of the carbon peaking and carbon neutrality goals, these methods can no longer meet the requirements of modern smart mines for coal gangue classification. To address these issues, a coal gangue image classification method based on Vision Transformer is proposed in this study. Firstly, the CLAHE algorithm is used to enhance the contrast of coal gangue images, thereby improving their quality. Secondly, geometric transformation methods such as mirroring, rotation, and cropping are used to augment the coal gangue data, which increases the diversity and quantity of the data and thus enhances the model's generalization ability. Then, the Transformer is used as the feature extractor to obtain the global feature representation of coal gangue images, which significantly improves the model's classification performance. Finally, the multi-layer perceptron is used to complete the coal gangue image classification task.
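The geometric augmentation step described above (mirroring, rotation, cropping) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes a grayscale image stored as a list of pixel rows, uses only a 90-degree rotation and a single random square crop, and the function names (`mirror`, `rotate90`, `crop`, `augment`) are hypothetical.

```python
import random


def mirror(img):
    """Flip the image horizontally (left-right mirror)."""
    return [row[::-1] for row in img]


def rotate90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


def crop(img, top, left, height, width):
    """Cut out a height x width window whose top-left corner is (top, left)."""
    return [row[left:left + width] for row in img[top:top + height]]


def augment(img, crop_size, rng):
    """Return the original image plus three augmented variants.

    Assumption: crop_size is no larger than either image dimension.
    """
    h, w = len(img), len(img[0])
    top = rng.randint(0, h - crop_size)    # randint is inclusive on both ends
    left = rng.randint(0, w - crop_size)
    return [
        img,
        mirror(img),
        rotate90(img),
        crop(img, top, left, crop_size, crop_size),
    ]


if __name__ == "__main__":
    sample = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    for variant in augment(sample, crop_size=2, rng=random.Random(0)):
        print(variant)
```

Each source image yields several training samples, which is how augmentation increases both the quantity and the diversity of the data. In practice an image library would apply the same operations to full-resolution arrays.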

The Transformer was proposed by Vaswani et al. [1] for natural language processing and has since become the cornerstone of large language models such as GPT. Its ability to model long-range dependencies and attend to global features of the input has led to tremendous success in that field. Since 2020, researchers have explored how to apply the Transformer to computer vision. Google first proposed the ViT [2] architecture, which applies the Transformer to computer vision while preserving its original structure as far as possible, and achieved experimental results comparable to the state-of-the-art CNN models of the time. Since then, many excellent vision Transformers have emerged, such as Facebook's DeiT [3] and Microsoft Research Asia's Swin Transformer [4]. These models have been widely used in a variety of computer vision tasks.
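The global modeling ability mentioned above comes from self-attention, in which every token (an image patch, in ViT) is compared with every other token. The sketch below is a simplified illustration under stated assumptions: it uses a single head and identity projections (queries, keys, and values are the tokens themselves), whereas a real Transformer uses learned projection matrices and multiple heads. The function names are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: Q = K = V = tokens (no learned projections).
    Returns (outputs, attention_weights).
    """
    d = len(tokens[0])
    outputs, weights = [], []
    for q in tokens:
        # Similarity of this token to every token, scaled by sqrt(d)
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        w = softmax(scores)
        weights.append(w)
        # Each output is a weighted average over all tokens (global context)
        outputs.append([sum(wj * v[i] for wj, v in zip(w, tokens)) for i in range(d)])
    return outputs, weights


if __name__ == "__main__":
    patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    out, attn = self_attention(patches)
    print(out)
    print(attn)
```

Because every output row mixes information from all input tokens, a single layer already has a global receptive field, in contrast to the local windows of a convolution.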

In this study, we explore the application of the Vision Transformer (ViT) to coal gangue image classification and compare its experimental results with those of classical CNN classification algorithms and recent image classification algorithms. Transfer learning with pre-trained weights is used to train the overall model end to end, and the classification performance of the Vision Transformer on coal gangue images under different training strategies is analyzed through extensive experiments. The CLAHE algorithm is also introduced to enhance the features of coal gangue images, making it easier for the feature extractor to extract useful features. Additionally, to evaluate the effectiveness of the model, it is compared with advanced convolutional neural network classification algorithms on the coal gangue dataset used in this study, and the experimental results are analyzed. The results show that contrast enhancement improves the accuracy of the Vision Transformer model by 2.94% over the original model. With both contrast enhancement and transfer learning, the Vision Transformer model reaches a classification accuracy of 99.73% on coal gangue images, 2.51% higher than that of the VGG16 algorithm. The proposed method achieves higher classification accuracy and can be used effectively for coal gangue image classification tasks.
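The contrast-limiting idea behind CLAHE can be sketched as below. This is a deliberately reduced illustration, not full CLAHE and not the authors' code: it equalizes a single tile with a clipped histogram and omits the division into tiles and the bilinear interpolation between neighboring tiles. The function name `clipped_equalize` and its parameters are hypothetical, and pixels are assumed to be integers in the range 0 to `levels - 1`.

```python
def clipped_equalize(tile, clip_limit, levels=256):
    """Contrast-limited histogram equalization of one tile.

    Assumption: tile is a list of rows of integer pixels in [0, levels).
    """
    # Histogram of the tile
    hist = [0] * levels
    for row in tile:
        for p in row:
            hist[p] += 1

    # Clip each bin and spread the clipped excess uniformly over all bins,
    # which limits how strongly any single gray level can be stretched
    excess = sum(max(h - clip_limit, 0) for h in hist)
    hist = [min(h, clip_limit) + excess / levels for h in hist]

    # The cumulative distribution gives the gray-level mapping (lookup table)
    total = sum(hist)
    lut, running = [], 0.0
    for h in hist:
        running += h
        lut.append(round((levels - 1) * running / total))

    return [[lut[p] for p in row] for row in tile]


if __name__ == "__main__":
    dark_tile = [[100, 100], [100, 101]]
    print(clipped_equalize(dark_tile, clip_limit=4))  # limit never reached
    print(clipped_equalize(dark_tile, clip_limit=1))  # histogram is clipped
```

A lower `clip_limit` flattens dominant histogram peaks before equalization, which restrains noise amplification in nearly uniform regions such as a dark coal surface.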
Important dates
  • Conference dates: August 18-20, 2023
  • Draft submission deadline: July 7, 2023
  • Registration deadline: August 20, 2023


Organized by
International Committee of Mine Safety Science and Engineering
Hosted by
Heilongjiang University of Science and Technology