Appendix B: Datasets

  1. Google Speech Commands Dataset
    • Description: A set of one-second .wav audio files, each containing a single spoken English word.
    • Link to the Dataset
  2. VisualWakeWords Dataset
    • Description: A dataset tailored for TinyML vision applications, consisting of binary-labeled images that indicate whether or not a person is present in the image.
    • Link to the Dataset
  3. EMNIST Dataset
    • Description: A dataset of 28x28 pixel images of handwritten characters and digits. It extends the MNIST dataset, which contains only digits, to include letters as well.
    • Link to the Dataset
  4. UCI Machine Learning Repository: Human Activity Recognition Using Smartphones
    • Description: A dataset of recordings from 30 study participants performing activities of daily living (ADL) while carrying a waist-mounted smartphone with embedded inertial sensors.
    • Link to the Dataset
  5. PlantVillage Dataset
    • Description: A dataset comprising images of healthy and diseased crop leaves, categorized by crop type and disease type, which could be used in a TinyML agricultural project.
    • Link to the Dataset
  6. Gesture Recognition using 3D Motion Sensing (3D Gesture Database)
    • Description: This dataset contains 3D gesture data recorded using a Leap Motion Controller, which might be useful for gesture recognition projects.
    • Link to the Dataset
  7. Multilingual Spoken Words Corpus
    • Description: A dataset containing recordings of common spoken words in various languages, useful for speech recognition projects targeting multiple languages.
    • Link to the Dataset

Remember to verify the dataset’s license or terms of use to ensure it can be used for your intended purpose.
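
Before training on audio clips such as the one-second .wav files described in item 1, it can help to confirm that each file really has the expected length. The following is a minimal sketch using only Python's standard wave module. The helper names (wav_duration_seconds, write_silent_wav), the file name, and the 16 kHz, mono, 16-bit format are assumptions chosen for illustration; they are not taken from the dataset documentation, so check the real files for their actual format.

```python
import wave


def wav_duration_seconds(path):
    """Return the duration of a WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def write_silent_wav(path, seconds=1.0, sample_rate=16000):
    """Write a mono, 16-bit WAV file of silence.

    This is only a stand-in for a real clip so the example runs
    without downloading anything; the sample rate is an assumption.
    """
    n_frames = int(seconds * sample_rate)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)        # mono
        wav.setsampwidth(2)        # 2 bytes per sample = 16-bit audio
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * n_frames)


if __name__ == "__main__":
    # Hypothetical file name used only for this demonstration.
    write_silent_wav("example_clip.wav")
    print(wav_duration_seconds("example_clip.wav"))
```

With a real dataset, the same wav_duration_seconds check could be run over every file to flag clips that are shorter or longer than expected before they reach the preprocessing pipeline.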