Cochl. Sense

We are surrounded by millions of different sounds that carry important clues about our surroundings. For example, if you hear someone screaming, you know there is an emergency; if you hear a siren, you know a fire truck is approaching. Humans are adept at understanding and contextualizing sound, but this remains hard for computers. How great would it be if AI could understand “sounds” as well?

Cochl. Sense allows computers to understand what’s going on around them by listening to their surroundings. Simply send an audio file, or an audio stream from a microphone, and our system will let you know what is happening.

Cochl. Sense is split into the following service categories:

  • Emergency detection
  • Human interaction
  • Human status
  • Home context
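
Sending audio for analysis can be as simple as an HTTP upload. The sketch below shows what this might look like in Python; note that the endpoint URL, header name, and API key here are hypothetical placeholders used for illustration only, so refer to your project in the dashboard for the actual values.

  # Minimal sketch of sending an audio file for analysis.
  # NOTE: the endpoint URL, header name, and API key are hypothetical
  # placeholders, not the documented Cochl. Sense API.
  import requests

  API_URL = "https://api.example.com/v1/analyze"  # hypothetical endpoint
  API_KEY = "your-project-api-key"                # issued per project

  def analyze_file(path: str) -> dict:
      """Upload an audio file and return the parsed JSON result."""
      with open(path, "rb") as f:
          response = requests.post(
              API_URL,
              headers={"X-API-Key": API_KEY},  # hypothetical header name
              files={"file": f},
          )
      response.raise_for_status()
      return response.json()

  result = analyze_file("doorbell.wav")
  print(result["service"], "-", len(result["events"]), "events detected")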

Requirements

Sampling rate >= 22050 Hz

For the best results, we recommend sending audio with a sampling rate of at least 22050 Hz.

If your audio is sampled below this value, don’t resample it yourself: our system supports lower sampling rates as well.

Minimum duration: 1 second

The audio we analyze must be at least 1 second long.
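
Both requirements can be checked locally before sending anything. The sketch below uses Python’s standard-library wave module, so it applies to WAV files only; it is an illustration, not part of Cochl. Sense itself.

  # Sketch: check a WAV file against the input requirements.
  # Standard library only; handles WAV files exclusively.
  import wave

  MIN_SAMPLE_RATE = 22050  # Hz, recommended minimum
  MIN_DURATION = 1.0       # seconds, required minimum

  def check_wav(path: str) -> None:
      with wave.open(path, "rb") as wav:
          rate = wav.getframerate()
          duration = wav.getnframes() / rate
      if duration < MIN_DURATION:
          raise ValueError(f"{path}: {duration:.2f} s is shorter than 1 second")
      if rate < MIN_SAMPLE_RATE:
          # Lower rates are supported too, so do not resample yourself;
          # this is just an advisory warning.
          print(f"{path}: {rate} Hz is below the recommended 22050 Hz")

  check_wav("doorbell.wav")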

Emergency detection

The following events can be detected when creating an emergency project in the dashboard:

  'Fire_smoke_alarm'
  'Glassbreak'
  'Scream'
  'Siren'
  'Others'
  'Skipped' #only with smart filtering

Human interaction

The following events can be detected when creating a human-interaction project in the dashboard:

  'Clap'
  'Finger_snap'
  'Knock'
  'Whisper'
  'Whistling'
  'Others'
  'Skipped' #only with smart filtering

Human status

The following events can be detected when creating a human-status project in the dashboard:

  'Burping'
  'Cough'
  'Fart'
  'Hiccup'
  'Laughter'
  'Sigh'
  'Sneeze'
  'Snoring'
  'Yawn'
  'Others'
  'Skipped' #only with smart filtering

Home context

The following events can be detected when creating a home-context project in the dashboard:

  'Baby_cry'
  'Dining_clink'
  'Dog_bark'
  'Electric_shaver_or_toothbrush'
  'Keyboard_or_mouse'
  'Knock'
  'Toilet_flush'
  'Water_tap_or_liquid_fill'
  'Others'
  'Skipped' #only with smart filtering

Result example

Here is an example of the output of Cochl. Sense when a 3-second audio file was analyzed with the “human-interaction” API.

  {
    "service": "human-interaction",
    "events": [
        {
          "tag": "Finger_snap",
          "probability": 0.7919,
          "start_time": 0,
          "end_time": 1
        },
        {
          "tag": "Finger_snap",
          "probability": 0.7945,
          "start_time": 0.5,
          "end_time": 1.5
        },
        {
          "tag": "Whistling",
          "probability": 0.8945,
          "start_time": 1,
          "end_time": 2
        },
        {
          "tag": "Whistling",
          "probability": 0.9919,
          "start_time": 1.5,
          "end_time": 2.5
        },
        {
          "tag": "Others",
          "probability": 0.9937,
          "start_time": 2,
          "end_time": 3
        }
    ]
  }
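
In this example, each event covers a one-second window and consecutive windows overlap by half a second, so the same sound is typically reported more than once. One simple way to consume the output is to keep only the most confident detection per tag, as sketched below (the 0.5 threshold is an arbitrary example value).

  # Sketch: keep the single most confident detection per tag.
  # The 0.5 threshold is an arbitrary example value.

  # The example response shown above:
  result = {
      "service": "human-interaction",
      "events": [
          {"tag": "Finger_snap", "probability": 0.7919, "start_time": 0, "end_time": 1},
          {"tag": "Finger_snap", "probability": 0.7945, "start_time": 0.5, "end_time": 1.5},
          {"tag": "Whistling", "probability": 0.8945, "start_time": 1, "end_time": 2},
          {"tag": "Whistling", "probability": 0.9919, "start_time": 1.5, "end_time": 2.5},
          {"tag": "Others", "probability": 0.9937, "start_time": 2, "end_time": 3},
      ],
  }

  def summarize(result: dict, threshold: float = 0.5) -> dict:
      """Return the highest-probability event per tag, above the threshold."""
      best = {}
      for event in result["events"]:
          tag, prob = event["tag"], event["probability"]
          if prob >= threshold and prob > best.get(tag, {}).get("probability", 0):
              best[tag] = event
      return best

  for tag, ev in summarize(result).items():
      print(f"{tag}: {ev['probability']} ({ev['start_time']} s - {ev['end_time']} s)")
  # Finger_snap: 0.7945 (0.5 s - 1.5 s)
  # Whistling: 0.9919 (1.5 s - 2.5 s)
  # Others: 0.9937 (2 s - 3 s)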