Reference - REST

The Cochl.Sense API detects what sound data contains. Send audio data over the internet to discover which sounds it contains.

npx @openapitools/openapi-generator-cli generate -i openapi.json -g python -o python-client

v1.4.0 Download specification (tested with openapi-generator v5.3.0)

Authentication

API_Key

We use a simple API key to authenticate requests on the backend. API keys are scoped to a given project. To get an API key, go to the dashboard projects page. One key is available for each project. If no projects are present, create one by clicking “add new project”.

Once you have the key, pass it in the HTTP request headers as x-api-key: YOUR_API_KEY

Security Scheme Type | Header parameter name
apiKey | x-api-key
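
As a minimal sketch of the header described above, the following builds an authenticated request without sending it. The base URL is a placeholder, since this reference does not state the host, and the helper names auth_headers and build_request are hypothetical.

```python
import urllib.request

# Hypothetical base URL: this reference does not state the real host.
BASE_URL = "https://api.example.invalid"


def auth_headers(api_key):
    """Return the header that carries the project API key."""
    if not api_key:
        raise ValueError("api_key must not be empty")
    return {"x-api-key": api_key}


def build_request(method, path, api_key, body=None):
    """Build (but do not send) an authenticated request."""
    return urllib.request.Request(
        BASE_URL + path, data=body, headers=auth_headers(api_key), method=method
    )


request = build_request("GET", "/audio_sessions/SESSION_ID/status", "YOUR_API_KEY")
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would send it; any other HTTP client works the same way as long as it sets the x-api-key header.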

Audio Session

Sense API Audio Session is the heart of the API. An audio session represents one audio source that will be inferenced*. An audio source can be either a file or a stream.

*inferenced: analyzed by our deep learning neural network

Lifetime

An audio session goes through three states during its lifetime:

writable -> readonly -> deleted

  • A chunk can be uploaded if and only if the session is writable
  • The status can be read if and only if the session is not deleted

A session becomes readonly:

  • By manually updating the session with a PATCH request
  • When an error occurs
  • When the session's total_size has been reached

A session becomes deleted:

  • By manually deleting the session with a DELETE request
  • When the session becomes inactive*

*inactive: a session is considered inactive if no chunks have been uploaded for more than one hour
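
The lifetime rules above can be modelled client-side as a small state machine. This is only a sketch: the names are hypothetical, and it assumes that a writable session can be deleted directly (by a DELETE request or by inactivity) without first becoming readonly.

```python
from enum import Enum


class SessionState(Enum):
    WRITABLE = "writable"
    READONLY = "readonly"
    DELETED = "deleted"


# Forward transitions only; a session never returns to an earlier state.
# Assumption: writable -> deleted is allowed (DELETE request or inactivity).
_TRANSITIONS = {
    SessionState.WRITABLE: {SessionState.READONLY, SessionState.DELETED},
    SessionState.READONLY: {SessionState.DELETED},
    SessionState.DELETED: set(),
}


def can_transition(current, target):
    """Return True if the session may move from current to target."""
    return target in _TRANSITIONS[current]


def can_upload_chunk(state):
    """Chunks are accepted only while the session is writable."""
    return state is SessionState.WRITABLE


def can_read_status(state):
    """Calls on a deleted session return 404, so status is unreadable."""
    return state is not SessionState.DELETED
```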

Upload

Audio must be sent to the server in chunks. The maximum allowed chunk size is 1 MiB (1 MiB = 1024 KiB).

  • For a stream, we recommend sending half-second-long audio chunks. This gives the shortest latency between audio recording and inference results.
  • For a file, we recommend sending the file in chunks of 1 MiB.

One big difference between a stream and a file is how they are decoded. For a stream, every chunk must be decodable*, whereas, for a file, the concatenation of all the chunks needs to be decodable.

*decodable: an audio source is decodable if by having only the content-type and the raw data, it is possible to decode the audio
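
For a file, the chunking described above might look like the following sketch. The function name is hypothetical, and it assumes the 1 MiB limit applies to the raw audio bytes of each chunk rather than to the encoded request body.

```python
MAX_CHUNK_SIZE = 1024 * 1024  # 1 MiB, the documented maximum chunk size


def split_into_chunks(data, chunk_size=MAX_CHUNK_SIZE):
    """Yield (sequence, chunk) pairs, with sequence starting at 0."""
    if not 0 < chunk_size <= MAX_CHUNK_SIZE:
        raise ValueError("chunk_size must be between 1 byte and 1 MiB")
    for start in range(0, len(data), chunk_size):
        yield start // chunk_size, data[start:start + chunk_size]


# Placeholder payload standing in for the bytes of an audio file.
audio = bytes(2 * MAX_CHUNK_SIZE + 10)
for sequence, chunk in split_into_chunks(audio):
    print(sequence, len(chunk))
```

Only the concatenation of the chunks has to be decodable for a file, so the split points do not need to respect audio frame boundaries. For a stream, each chunk would instead have to be produced so that it decodes on its own.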

Inference

Once chunks are received, our server begins inference automatically.

  • For a stream, chunks are inferenced immediately
  • For a file, chunks are inferenced when the session state becomes readonly

Status

Inference results can be retrieved on the status endpoint.

To get all results of a given session, it is recommended to use next_token. It guarantees that all results are read in the correct order, exactly once.
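
A minimal sketch of iterating with next_token follows. It takes a caller-supplied function fetch_status (hypothetical) that performs the status request for a given token and returns the decoded SessionStatus body; the field names follow the SessionStatus, Sense and Page schemas.

```python
def iter_all_results(fetch_status):
    """Yield every result once, in order, by following next_token.

    fetch_status(next_token) must return a SessionStatus dict; it is
    called with None for the first page.
    """
    token = None
    while True:
        status = fetch_status(token)
        inference = status["inference"]
        yield from inference.get("results", [])
        token = inference.get("page", {}).get("next_token")
        if not token:
            # No next_token: the end of the collection has been reached.
            return


# Fake pages standing in for real status responses.
pages = {
    None: {"inference": {"results": ["r0", "r1"], "page": {"next_token": "t1"}}},
    "t1": {"inference": {"results": ["r2"], "page": {}}},
}
print(list(iter_all_results(lambda token: pages[token])))
```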

Note: In the future, results from the deleted audio session will be accessible on another endpoint

Error

If an error occurs during a session, the session becomes readonly and pending chunk inferences are canceled. An error message is then available in the session status.

A typical error is a content-type decoding error. For instance, audio/wav data may have been misclassified as audio/mp3 data.

Create Session

POST /audio_sessions/

Create a new session. An API key is required. Session parameters are immutable and can be set at creation only.

Authorization

Request Body

Format | Required | Schema
application/json | true | CreateSession

Responses

Code | Description | Schema
200 | The session was created successfully | SessionRefs
400 | The parameter is missing or not formatted properly | GenericError
401 | Authentication failed. For instance, API key is missing or invalid | GenericError
500 | Unexpected server error. If the error persists, you can contact support@cochl.ai to fix the problem. | GenericError
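
The request body and the SessionRefs response of this endpoint might be handled as in the sketch below. The helper names are hypothetical, and "audio/wav" is used as content_type on the assumption that it is an accepted value; this reference does not list the accepted content types.

```python
import json


def create_session_request(content_type, audio_type, total_size=None):
    """Return (method, path, body) for creating a session."""
    if audio_type not in ("stream", "file"):
        raise ValueError("type must be 'stream' or 'file'")
    body = {"content_type": content_type, "type": audio_type}
    if total_size is not None:
        body["total_size"] = total_size
    return "POST", "/audio_sessions/", json.dumps(body).encode("utf-8")


def parse_session_refs(raw_body):
    """Extract (session_id, chunk_sequence) from a SessionRefs response."""
    refs = json.loads(raw_body)
    return refs["session_id"], refs["chunk_sequence"]


method, path, body = create_session_request("audio/wav", "file", total_size=2048)
print(method, path, body.decode("utf-8"))
```

Because session parameters are immutable, everything the session needs has to be present in this first body.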

Delete Session

DELETE /audio_sessions/{session_id}

Change the state of the session to deleted. All future calls on the session will return 404.

Params

Parameter | In | Type | Required | Description
session_id | path | string | true | Session id represents a unique identifier for an audio session

Responses

Code | Description | Schema
204 | The session was successfully deleted | -
404 | Resources don’t exist or have been deleted | GenericError
500 | Unexpected server error. If the error persists, you can contact support@cochl.ai to fix the problem. | GenericError

Update Session

PATCH /audio_sessions/{session_id}

Update a session

Params

Parameter | In | Type | Required | Description
session_id | path | string | true | Session id represents a unique identifier for an audio session

Request Body

Format | Required | Schema
application/json | true | UpdateSession

Responses

Code | Description | Schema
204 | The session has been updated successfully | -
400 | The parameter is missing or not formatted properly | GenericError
404 | Resources don’t exist or have been deleted | GenericError
500 | Unexpected server error. If the error persists, you can contact support@cochl.ai to fix the problem. | GenericError
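
For a file session, this endpoint is what starts inference when total_size was not set at creation. The sketch below, with hypothetical helper names, decides whether the PATCH is needed and builds it using the make_readonly field of the UpdateSession schema.

```python
import json
from urllib.parse import quote


def needs_manual_readonly(total_size, uploaded_bytes):
    """Return True when a PATCH is needed to make the session readonly."""
    if total_size is None:
        return True
    # The session turns readonly by itself once total_size bytes arrived.
    return uploaded_bytes < total_size


def make_readonly_request(session_id):
    """Return (method, path, body) that sets the session to readonly."""
    body = json.dumps({"make_readonly": True}).encode("utf-8")
    return "PATCH", "/audio_sessions/" + quote(session_id, safe=""), body


if needs_manual_readonly(total_size=None, uploaded_bytes=4096):
    print(make_readonly_request("SESSION_ID"))
```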

Upload Chunk

PUT /audio_sessions/{session_id}/chunks/{chunk_sequence}

Upload a new audio chunk

Params

Parameter | In | Type | Required | Description
session_id | path | string | true | Session id represents a unique identifier for an audio session
chunk_sequence | path | integer | true | The chunk number. It must be a counter that starts at 0 and increases by one on each request

Request Body

Format | Required | Schema
application/json | true | AudioChunk

Responses

Code | Description | Schema
200 | The chunk was successfully uploaded | SessionRefs
400 | The parameter is missing or not formatted properly | GenericError
403 | The session is not writable anymore | GenericError
404 | Resources don’t exist or have been deleted | GenericError
409 | The chunk sequence is invalid | GenericError
413 | Audio chunk size must be smaller than 1MiB | GenericError
500 | Unexpected server error. If the error persists, you can contact support@cochl.ai to fix the problem. | GenericError
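
A sketch of building one upload request follows. The AudioChunk body carries the raw audio as base64 in its data field, and the SessionRefs response gives the sequence number to use next. The helper names are hypothetical, and the size check assumes the 1 MiB limit applies to the raw bytes before base64 encoding.

```python
import base64
import json

MAX_CHUNK_SIZE = 1024 * 1024  # assumption: measured on the raw audio bytes


def upload_chunk_request(session_id, chunk_sequence, audio):
    """Return (method, path, body) for uploading one audio chunk."""
    if chunk_sequence < 0:
        raise ValueError("chunk_sequence starts at 0")
    if len(audio) > MAX_CHUNK_SIZE:
        raise ValueError("chunk exceeds 1 MiB; the server would answer 413")
    body = {"data": base64.b64encode(audio).decode("ascii")}
    path = f"/audio_sessions/{session_id}/chunks/{chunk_sequence}"
    return "PUT", path, json.dumps(body).encode("utf-8")


def next_chunk_sequence(raw_body):
    """Read the sequence of the next chunk from a SessionRefs response."""
    return json.loads(raw_body)["chunk_sequence"]


method, path, body = upload_chunk_request("SESSION_ID", 0, b"\x00\x01\x02")
print(method, path)
```

Taking the next sequence number from the response rather than counting locally keeps the client aligned with the server, which rejects an unexpected sequence with 409.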

Read Status

GET /audio_sessions/{session_id}/status

Get session status

Note that if some chunks have not finished being inferenced, the server waits for at least one result to be available in the requested page range before returning. This waiting can cause HTTP requests to time out, so we recommend implementing client retry logic.

Params

Parameter | In | Type | Required | Description
session_id | path | string | true | Session id represents a unique identifier for an audio session
offset | query | integer | false | How many existing elements to skip before returning the first result
count | query | integer | false | Maximum length of the returned results array; use it to limit the size of the returned payload
next_token | query | string | false | Token taken from a previous page result. It allows iterating through all the following elements of a collection. If next_token is set, offset and count are ignored

Responses

Code | Description | Schema
200 | Successful operation | SessionStatus
400 | Parameter is missing or not formatted properly | GenericError
404 | Resources don’t exist or have been deleted | GenericError
500 | Unexpected server error. If the error persists, you can contact support@cochl.ai to fix the problem. | GenericError
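
Since a status request can time out while the server waits for results, a client might wrap the call in a retry helper such as the sketch below. The helper name, the attempt count and the exponential backoff are illustrative choices, and it assumes the HTTP client signals a timeout by raising TimeoutError; other clients raise their own exception types.

```python
import time


def call_with_retry(call, attempts=5, delay=1.0, sleep=time.sleep):
    """Run call(), retrying on TimeoutError with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return call()
        except TimeoutError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last timeout
            sleep(delay * (2 ** attempt))


# Fake status call that times out twice before answering.
state = {"calls": 0}


def flaky_status():
    state["calls"] += 1
    if state["calls"] < 3:
        raise TimeoutError("status request timed out")
    return {"state": "readonly"}


print(call_with_retry(flaky_status, delay=0.01))
```

Retrying the same request with the same next_token is safe in this sketch because reading the status does not modify the session.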

Schemas

AudioChunk

Key | Type | Required | Description
data | string | true | Raw audio encoded as base64

AudioContentType

Type | Description
string | Type of audio to send.

AudioMetadatas

Key | Type | Required | Description

AudioType

Type | Description
string | Whether the audio that will be sent is a stream or a file

possible values

  • stream

  • file

CreateSession

Key | Type | Required | Description
content_type | AudioContentType | true | Type of audio to send.
default_sensitivity | DefaultSensitivity | false | If set, provides a default adjusted sensitivity for all tags. The sensitivity adjustment ranges over [-2, 2]; 0 is used if not set. A positive value reduces tag appearance, a negative value increases it
metadatas | AudioMetadatas | false | Metadata of the audio session. Check the metadatas documentation for an exhaustive list of available metadata
tags_sensitivity | TagsSensitivity | false | If set, adjusts the sensitivity of each tag in this list. The sensitivity adjustment ranges over [-2, 2]; a value of 0 preserves the default sensitivity. A positive value reduces tag appearance, a negative value increases it
total_size | integer | false | If set, the session state automatically changes to readonly once at least total_size bytes of audio chunks have been uploaded. For a stream, this can be used to limit the maximum duration of the session. For a file, this automatically starts inference once the whole file has been sent. We recommend setting total_size for files, as it saves the API call otherwise needed to start inference
type | AudioType | true | Whether the audio that will be sent is a stream or a file
window_hop | WindowHop | false | If set, adjusts the window hop. It can be set to either 0.5s or 1s. The default is 0.5s
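
The constraints listed for CreateSession can be checked client-side before sending, as in the sketch below. The function name is hypothetical. It assumes that tags_sensitivity is a mapping from tag name to an integer adjustment, which the empty TagsSensitivity table does not confirm, and the tag name "Siren" is only an illustration.

```python
ALLOWED_TYPES = {"stream", "file"}
ALLOWED_WINDOW_HOPS = {"0.5s", "1s"}


def _valid_sensitivity(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and -2 <= value <= 2


def validate_create_session(payload):
    """Return a list of problems found in a CreateSession payload."""
    problems = []
    for key in ("content_type", "type"):
        if key not in payload:
            problems.append(f"{key} is required")
    if "type" in payload and payload["type"] not in ALLOWED_TYPES:
        problems.append("type must be 'stream' or 'file'")
    if "window_hop" in payload and payload["window_hop"] not in ALLOWED_WINDOW_HOPS:
        problems.append("window_hop must be '0.5s' or '1s'")
    if "default_sensitivity" in payload and not _valid_sensitivity(payload["default_sensitivity"]):
        problems.append("default_sensitivity must be an integer in [-2, 2]")
    # Assumption: tags_sensitivity maps a tag name to an integer adjustment.
    for tag, value in payload.get("tags_sensitivity", {}).items():
        if not _valid_sensitivity(value):
            problems.append(f"tags_sensitivity[{tag!r}] must be an integer in [-2, 2]")
    return problems


payload = {"content_type": "audio/wav", "type": "stream",
           "window_hop": "1s", "tags_sensitivity": {"Siren": -1}}
print(validate_create_session(payload))
```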

DefaultSensitivity

Type | Description
integer | If set, provides a default adjusted sensitivity for all tags. The sensitivity adjustment ranges over [-2, 2]; 0 is used if not set. A positive value reduces tag appearance, a negative value increases it

GenericError

Key | Type | Required | Description
error | string | true | Human-readable description of the error. Note that the value should not be used programmatically, as the description might change at any moment
status_code | integer | true | HTTP status code returned

Page

Key | Type | Required | Description
count | integer | true | The number of elements that have been returned
next_token | string | false | The next token can be used in the next page request to get the following results. If not present, it means that the page has reached the end of the collection
offset | integer | true | Index of the first returned element
total | integer | true | The total number of available elements in the collection at the moment

Sense

Key | Type | Required | Description
in_progress | boolean | false | Is true when there are still some pending chunks that were uploaded but are not inferenced yet
page | Page | false | Contains the range of elements that have been returned for a given collection
results | array | false | Contains paginated results of what has been inferenced so far

SenseEvent

Key | Type | Required | Description
end_time | number | true | End of the window on which inference was done, in seconds. Note that end_time is window_length after start_time
start_time | number | true | Start of the window on which inference was done, in seconds. Note that start_time increases by window_hop on every step
tags | array | true | Contains the results inferenced for this time window

SenseEventTag

Key | Type | Required | Description
name | string | true | Name of the sound recognized during the inference.
probability | number | true | Probability that the event occurred: 0.0 means impossible and 1.0 means certain
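
Given the SenseEvent and SenseEventTag schemas above, a client might keep only the tags above a chosen probability, as in this sketch. The function name, the 0.5 threshold and the tag names in the sample data are illustrative assumptions, not values defined by the API.

```python
def detected_tags(events, threshold=0.5):
    """Yield (start_time, end_time, name, probability) for likely tags."""
    for event in events:
        for tag in event["tags"]:
            if tag["probability"] >= threshold:
                yield (event["start_time"], event["end_time"],
                       tag["name"], tag["probability"])


# Sample events shaped like SenseEvent; the tag names are made up.
events = [
    {"start_time": 0.0, "end_time": 1.0,
     "tags": [{"name": "Siren", "probability": 0.91},
              {"name": "Speech", "probability": 0.12}]},
    {"start_time": 0.5, "end_time": 1.5,
     "tags": [{"name": "Siren", "probability": 0.48}]},
]
for row in detected_tags(events):
    print(row)
```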

SessionRefs

Key | Type | Required | Description
chunk_sequence | integer | true | Chunks are uploaded in sequence. This is the sequence number of the next chunk to upload
session_id | string | true | Id of the session, used to interact with the API

SessionStatus

Key | Type | Required | Description
error | string | false | An error occurred during the session
inference | Sense | true | Inference related status
refs | SessionRefs | true | List of session links
state | string | true | State in which the session is

TagsSensitivity

Key | Type | Required | Description

UpdateSession

Key | Type | Required | Description
make_readonly | boolean | false | If set to true, sets the session state to readonly. Note that setting make_readonly to false once the session is readonly does not make the session writable again

WindowHop

Type | Description
string | If set, adjusts the window hop. It can be set to either 0.5s or 1s. The default is 0.5s

possible values

  • 0.5s

  • 1s