Custom Sound

Create your own custom sound tags by uploading your own audio files and training on them on our platform. You can upload your sound files via either our Python script or our dashboard UI. Check out our full list of tags here.

0. Prepare your Data


1. Sort your files

Sort your data according to the following structure.

  • target_data: the top-level folder containing all your files.

  • target_labels.tsv: a TSV file containing the metadata of your audio files. Download the template format here.

    • File name: the name of the audio file you will use as data.
    • Start time (sec): the start of the audio snippet you want to use as data. It must not be less than 0 or greater than the end time.
    • End time (sec): the end of the audio snippet you want to use as data. It must not be less than the start time or exceed the file length.
    • File length (sec): the total duration of the audio file. It is compared with the actual file length to verify that the labels are written correctly.
    • Class name (optional): the name of the sound class you are training. Only a single class name is supported.
  • All your audio files (wav, mp3, or flac) should be placed directly inside the target_data folder, alongside target_labels.tsv.
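The time constraints listed above can be checked locally before uploading. The following is a minimal sketch, not the platform's own validator. The column names (file_name, start_time, end_time, file_length, class_name) are hypothetical placeholders; use the header from the official template instead.

```python
import csv
import io

# Hypothetical column names; the real header comes from the official template.
COLUMNS = ["file_name", "start_time", "end_time", "file_length", "class_name"]


def validate_row(row: dict) -> list:
    """Return a list of problems found in one label row (empty if the row is valid)."""
    problems = []
    start = float(row["start_time"])
    end = float(row["end_time"])
    length = float(row["file_length"])
    if start < 0:
        problems.append("start time is negative")
    if start > end:
        problems.append("start time is after end time")
    if end > length:
        problems.append("end time exceeds file length")
    return problems


def validate_tsv(text: str) -> dict:
    """Validate TSV text with a header row; map 1-based data row number to its problems."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    report = {}
    for i, row in enumerate(reader, start=1):
        problems = validate_row(row)
        if problems:
            report[i] = problems
    return report


sample = (
    "\t".join(COLUMNS) + "\n"
    "alarm_01.wav\t0.5\t2.0\t10.0\talarm\n"
    "alarm_02.wav\t3.0\t12.0\t10.0\talarm\n"
)
print(validate_tsv(sample))  # {2: ['end time exceeds file length']}
```

To check a real file, read it with `open(path, encoding="utf-8").read()` and pass the text to `validate_tsv`.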

Note

  • Label Requirements: Your target_labels.tsv file must meet at least one of the following conditions:
    • Contain 100 or more rows (labels).
    • Have a total labeled duration of at least 2 minutes.
  • Label Merging: Labels that overlap or are separated by a very short interval (less than 0.1 seconds) will be automatically merged into a single label. The label requirements are checked against these merged labels.
  • Training Time: Training is expected to take approximately 30 minutes if there is no queue.
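The merging rule and the label requirements in the note can be approximated locally to estimate whether a dataset is likely to qualify. This is a sketch under assumptions: that merging happens per file and that a gap strictly below 0.1 seconds triggers a merge. The platform's exact behavior may differ, and the function names here are illustrative only.

```python
from collections import defaultdict

MERGE_GAP_SEC = 0.1       # labels closer together than this are merged
MIN_LABELS = 100          # requirement: 100 or more labels, or...
MIN_DURATION_SEC = 120.0  # ...at least 2 minutes of labeled audio


def merge_labels(labels):
    """Merge overlapping or near-adjacent labels within each file.

    `labels` is an iterable of (file_name, start_sec, end_sec) tuples.
    Returns a dict mapping file_name to a sorted list of merged (start, end) tuples.
    """
    by_file = defaultdict(list)
    for name, start, end in labels:
        by_file[name].append((start, end))

    merged = {}
    for name, spans in by_file.items():
        spans.sort()
        out = [spans[0]]
        for start, end in spans[1:]:
            last_start, last_end = out[-1]
            if start - last_end < MERGE_GAP_SEC:
                # Overlapping or nearly touching: extend the previous label.
                out[-1] = (last_start, max(last_end, end))
            else:
                out.append((start, end))
        merged[name] = out
    return merged


def meets_requirements(merged):
    """Check the merged labels against the count or total-duration requirement."""
    count = sum(len(spans) for spans in merged.values())
    total = sum(end - start for spans in merged.values() for start, end in spans)
    return count >= MIN_LABELS or total >= MIN_DURATION_SEC


labels = [
    ("a.wav", 0.0, 1.0),
    ("a.wav", 1.05, 2.0),   # 0.05 s after the previous label, so it is merged
    ("a.wav", 3.0, 4.0),
    ("b.wav", 0.0, 130.0),
]
merged = merge_labels(labels)
print(merged["a.wav"])             # [(0.0, 2.0), (3.0, 4.0)]
print(meets_requirements(merged))  # True
```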

Good data preparation practices

Here is a list of considerations from our researchers that can greatly improve the quality of your dataset:

  1. Recording environment consistency: Was the data recorded with the actual service equipment? (e.g., the actual microphone or recorder)
  2. Data diversity:
    • Are all collected variations within the intended scope of the target sound? (e.g., only the specific alarm you want to detect)
    • Does the data include variations in distance, direction, and intensity?
    • Does the data include various real-world background noises? (e.g., office noise, traffic, etc.)
  3. Negative samples: Is there a sufficient amount of 'unlabeled' portions? These portions, referred to as 'negative samples', should contain only various background noises without the target sound.
  4. Labelling accuracy:
    • Do the label start/end points match the sound precisely? Try to minimize silence before/after the sound.
    • Are sounds with gaps of more than 0.5s separated into distinct labels to clearly separate sound events?
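To get a rough sense of how much negative-sample material a file contains, you can compare its labeled duration with its total length. This is an illustrative sketch only: the document does not state a target ratio, the function name is hypothetical, and the calculation assumes the labels within a file do not overlap.

```python
def unlabeled_fraction(file_length_sec, spans):
    """Fraction of a file not covered by labels.

    `spans` is a list of (start_sec, end_sec) labels, assumed not to overlap.
    """
    labeled = sum(end - start for start, end in spans)
    return max(0.0, 1.0 - labeled / file_length_sec)


# A 10-second file with 3 seconds of labeled sound is about 70% unlabeled.
fraction = unlabeled_fraction(10.0, [(1.0, 2.0), (5.0, 7.0)])
print(f"{fraction:.0%} of the file is unlabeled")
```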

2. Zip your file

Create a zip file of your target_data folder (for example, target_data.zip).
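If you prefer to script this step, the Python standard library can build the archive. The sketch below keeps the target_data folder itself as the top-level entry of the zip. Whether the platform expects the folder or only its contents at the archive root is not stated here, so treat that layout as an assumption. The function name is illustrative.

```python
import shutil
from pathlib import Path


def zip_target_data(folder: str) -> str:
    """Zip `folder` so that the archive contains the folder itself at its top level.

    Returns the path of the created archive, written next to the folder.
    """
    path = Path(folder).resolve()
    # make_archive appends ".zip" to base_name and returns the archive path.
    return shutil.make_archive(
        base_name=str(path),        # writes <parent>/<folder name>.zip
        format="zip",
        root_dir=str(path.parent),  # archive paths are relative to the parent
        base_dir=path.name,         # include only the folder itself
    )
```

Calling `zip_target_data("target_data")` from the folder's parent directory would produce target_data.zip alongside it.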

1. Create your sound tag


1.1 Upload on Our Web Dashboard

(1) Access the “Custom Sound” tab on our web dashboard projects page

Log in to your Cochl.Sense Dashboard and click on the “Custom Sound” tab on our main projects page.

(2) Create a new custom sound tag

Click on the Add sound tag button. Enter the new sound tag name and drag your target_data.zip file under the Training data section.

(3) Custom sound tag creation will start

If the files are all valid, the process will start on the web dashboard.

(4) Notification upon completion or failure

An email notification will be sent to you when the process is complete or if it fails.

1.2 Use our Python Script

(1) Install the Cochl.Sense Cloud API

Install the Cochl.Sense Cloud API client. You can follow the tutorial here.

(2) Create a new file

In the same project, create a .py file anywhere outside of your target_data folder. Copy the following script:

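from typing import Optional

# Import path assumed from the Cochl.Sense Cloud API Python client; adjust if your installation differs.
import cochl.sense as sense

# Enable the custom sound feature in the API configuration.
api_config = sense.APIConfig(
    custom_sound=True,
)

if api_config.custom_sound:
    custom_sound = sense.CustomSound(
        'YOUR_ORGANIZATION_KEY',
        api_config=api_config,
    )

    # Upload the zipped training data under the given tag name.
    cs_error: Optional[dict] = custom_sound.upload(
        custom_sound_tag='YOUR_CUSTOM_SOUND_TAG',
        zip_file_path='YOUR_ZIP_FILE_PATH',
    )

    # None means the upload succeeded; otherwise print the returned error.
    if cs_error is None:
        print('successfully uploaded')
    else:
        print(cs_error)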

2. Use your Custom Sound


1. Create a new Edge SDK project

Click the New project button. Enter your project name and select Edge SDK as the project type. You can select your custom sound tags under the Custom Sound tab on the right, or find them among the first few sound tags on the All tab. Click Create Project to add your project.

2. Use your custom sound tags

You can use your custom sound tags in your project just like our other supported tags.

See also


This is the end of the Custom Sound tutorial. See the links below for other ways to incorporate our technology into your projects.

Speaker Recognition: Transcribe and detect voices.

Got a question? Contact us