Reference - REST
The Cochl.Sense API detects what is contained in sound data. Send audio data over the internet to discover what it contains.
npx @openapitools/openapi-generator-cli generate -i openapi.json -g python -o python-client
v1.1.0 Download specification
Authentication
API_Key
We use a simple API key to authenticate requests on the backend. API keys are scoped to a given project. To get an API key, go to the dashboard projects page. One key is available for each project. If no projects are present, create one by clicking “add new project”.
Once the key is retrieved, it must be passed in the HTTP request headers: x-api-key: YOUR_API_KEY
Security Scheme Type | Header parameter name |
---|---|
apiKey | x-api-key |
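As a minimal sketch of the header described above, the snippet below builds (but does not send) a request that carries the key. The base URL is a placeholder, since this reference does not state the real host, and `authorized_request` is a hypothetical helper name.

```python
import urllib.request

# Placeholder host: the real base URL is not given in this reference.
BASE_URL = "https://sense.example.invalid"


def authorized_request(path: str, api_key: str, method: str = "GET") -> urllib.request.Request:
    """Build a request object that carries the API key in the x-api-key header."""
    return urllib.request.Request(
        BASE_URL + path,
        method=method,
        headers={"x-api-key": api_key},
    )


req = authorized_request("/audio_sessions/", "YOUR_API_KEY", method="POST")
```

HTTP header names are case-insensitive, so it does not matter that `urllib` stores the name as `X-api-key` internally.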
Audio Session
Sense API Audio Session is the heart of the API. An audio session represents one audio source that will be inferenced*. An audio source can be either a file or a stream.
*inferenced: analyzed by our deep learning neural network
Lifetime
An audio session goes through 3 states during its lifetime :
writable -> readonly -> deleted
- A chunk can be uploaded if and only if the session is writable
- The status can be read if and only if the session is not deleted
A session becomes readonly:
- By manually updating the session with a PATCH request
- When an error occurs
- When the session total_size has been reached
A session becomes deleted:
- By manually deleting the session with a DELETE request
- When the session becomes inactive*
*inactive: a session is considered inactive if no chunks have been uploaded for more than one hour
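The rules above can be mirrored on the client side to avoid requests that are bound to fail. This is only a sketch: the function names are hypothetical, and the server remains the authority on the actual session state.

```python
from datetime import datetime, timedelta

# A session with no uploaded chunk for more than one hour is considered inactive.
INACTIVITY_LIMIT = timedelta(hours=1)


def can_upload_chunk(state: str) -> bool:
    """Chunks can be uploaded if and only if the session is writable."""
    return state == "writable"


def is_probably_inactive(last_chunk_at: datetime, now: datetime) -> bool:
    """Client-side estimate of whether the server treats the session as inactive."""
    return now - last_chunk_at > INACTIVITY_LIMIT
```

A client could call `is_probably_inactive` before resuming a long-paused upload and create a fresh session instead of hitting a 404.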
Upload
Audio must be sent to the server in chunks. The maximum allowed size of a chunk is 1MiB (1MiB == 1024KiB).
- For a stream, we recommend sending half-second-long audio chunks. This gives the shortest latency between audio recording and inference results.
- For a file, we recommend sending the file in chunks of 1MiB.
One big difference between a stream and a file is how they are decoded. For a stream, every chunk must be decodable*, whereas for a file, only the concatenation of all the chunks needs to be decodable.
*decodable: an audio source is decodable if by having only the content-type and the raw data, it is possible to decode the audio
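The chunking recommendations above can be sketched as follows. The file helper splits arbitrary bytes into pieces of at most 1MiB. The stream helper computes the byte size of half a second of audio under the assumption of uncompressed PCM, which is an illustration only: the supported stream content types are not listed in this reference, and both function names are hypothetical.

```python
from typing import List

MAX_CHUNK_BYTES = 1024 * 1024  # 1MiB == 1024KiB


def split_file_bytes(data: bytes, chunk_size: int = MAX_CHUNK_BYTES) -> List[bytes]:
    """Split file content into consecutive chunks no larger than the 1MiB limit."""
    if not 0 < chunk_size <= MAX_CHUNK_BYTES:
        raise ValueError("chunk_size must be between 1 byte and 1MiB")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def half_second_pcm_bytes(sample_rate: int, channels: int, bytes_per_sample: int) -> int:
    """Byte size of half a second of raw PCM audio (assumed format, whole frames only)."""
    frames = sample_rate // 2
    return frames * channels * bytes_per_sample
```

For a file, concatenating the returned chunks reproduces the original bytes, which matches the requirement that only the concatenation needs to be decodable.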
Inference
Once chunks are received, our server begins inference automatically.
- For a stream, chunks are inferenced immediately
- For a file, chunks are inferenced when the session state becomes readonly
Status
Inference results can be retrieved on the status endpoint.
To get all results of a given session, it is recommended to use next_token. It guarantees that all results are read in the correct order, exactly once.
Note: In the future, results from the deleted audio session will be accessible on another endpoint
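A minimal sketch of iterating with next_token is shown below. It assumes a caller-supplied `fetch_status` function (hypothetical) that performs the status request with the given token and returns the decoded SessionStatus JSON as a dictionary; the loop stops when the page no longer carries a next_token.

```python
from typing import Callable, Iterator, Optional


def iter_results(fetch_status: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every result of a session in order by following next_token."""
    token: Optional[str] = None
    while True:
        status = fetch_status(token)
        inference = status["inference"]
        # "results" and "page" are optional fields of the Sense schema.
        yield from inference.get("results", [])
        token = inference.get("page", {}).get("next_token")
        if token is None:
            # No next_token: the end of the collection has been reached.
            return
```

Keeping the transport behind `fetch_status` makes the loop easy to combine with retry logic or to test against canned responses.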
Error
In case an error occurs during a session, that session will become readonly and pending inferences of chunks are canceled. An error message will be available in the session status.
A typical error is a content-type decoding error. For instance, audio/wav data may have been misclassified as audio/mp3 data.
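Since the error message is exposed in the session status, a client may want to turn it into an exception. The sketch below assumes the status has already been decoded into a dictionary shaped like the SessionStatus schema; `SessionError` and `raise_for_session_error` are hypothetical names.

```python
class SessionError(RuntimeError):
    """Raised when a session status carries an error message."""


def raise_for_session_error(status: dict) -> dict:
    """Return the status unchanged, or raise if its optional 'error' field is set."""
    error = status.get("error")
    if error:
        session_id = status.get("refs", {}).get("session_id", "unknown")
        raise SessionError(f"session {session_id} failed: {error}")
    return status
```

The error text is meant for humans, so the helper only forwards it and does not try to parse it.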
Create Session
POST /audio_sessions/
Create a new session. An API key is required. Session parameters are immutable and can be set at creation only.
Authorization
Request Body
Format | Required | Schema |
---|---|---|
application/json | true | CreateSession |
Responses
Code | Description | Schema |
---|---|---|
200 | The session was created successfully | SessionRefs |
400 | The parameter is missing or not formatted properly | GenericError |
401 | Authentication failed. For instance, API key is missing or invalid | GenericError |
500 | Unexpected server error. If the error persists, you can contact support@cochl.ai to fix the problem. | GenericError |
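As an illustration of the request described above, the following sketch builds (without sending) a session creation request from the CreateSession fields. The base URL is a placeholder because the real host is not given here, `build_create_session` is a hypothetical helper, and `audio/wav` is used only because it appears as an example content type in this reference.

```python
import json
from typing import Optional
import urllib.request

# Placeholder host: the real base URL is not given in this reference.
BASE_URL = "https://sense.example.invalid"


def build_create_session(api_key: str, content_type: str, audio_type: str,
                         total_size: Optional[int] = None) -> urllib.request.Request:
    """Build a POST /audio_sessions/ request with a CreateSession JSON body."""
    if audio_type not in ("stream", "file"):
        raise ValueError("audio_type must be 'stream' or 'file'")
    body = {"content_type": content_type, "type": audio_type}
    if total_size is not None:
        # Optional: lets the session become readonly once this many bytes are uploaded.
        body["total_size"] = total_size
    return urllib.request.Request(
        f"{BASE_URL}/audio_sessions/",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
    )


create_req = build_create_session("YOUR_API_KEY", "audio/wav", "file", total_size=2048)
```

Passing the request to `urllib.request.urlopen` would send it; a 200 response body is expected to follow the SessionRefs schema.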
Delete Session
DELETE /audio_sessions/{session_id}
Change the state of the session to deleted. All future calls on the session will return 404
Params
Parameter | In | Type | Required | Description |
---|---|---|---|---|
session_id | path | string | true | Session id represents a unique identifier for an audio session |
Responses
Code | Description | Schema |
---|---|---|
204 | The session was successfully deleted | |
404 | Resources don’t exist or have been deleted | GenericError |
500 | Unexpected server error. If the error persists, you can contact support@cochl.ai to fix the problem. | GenericError |
Update Session
PATCH /audio_sessions/{session_id}
Update a session
Params
Parameter | In | Type | Required | Description |
---|---|---|---|---|
session_id | path | string | true | Session id represents a unique identifier for an audio session |
Request Body
Format | Required | Schema |
---|---|---|
application/json | true | UpdateSession |
Responses
Code | Description | Schema |
---|---|---|
204 | The session has been updated successfully | |
400 | The parameter is missing or not formatted properly | GenericError |
404 | Resources don’t exist or have been deleted | GenericError |
500 | Unexpected server error. If the error persists, you can contact support@cochl.ai to fix the problem. | GenericError |
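A minimal sketch of the update request, used here to make a session readonly via the UpdateSession body. The base URL is a placeholder and `build_make_readonly` is a hypothetical helper. This reference lists authorization only under session creation, so no API key header is added; add one if your requests are rejected without it.

```python
import json
import urllib.parse
import urllib.request

# Placeholder host: the real base URL is not given in this reference.
BASE_URL = "https://sense.example.invalid"


def build_make_readonly(session_id: str) -> urllib.request.Request:
    """Build a PATCH /audio_sessions/{session_id} request that sets make_readonly."""
    safe_id = urllib.parse.quote(session_id, safe="")
    return urllib.request.Request(
        f"{BASE_URL}/audio_sessions/{safe_id}",
        data=json.dumps({"make_readonly": True}).encode("utf-8"),
        method="PATCH",
        headers={"Content-Type": "application/json"},
    )


patch_req = build_make_readonly("my-session-id")
```

For a file session created without total_size, this is the call that lets inference start once the last chunk has been uploaded.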
Upload Chunk
PUT /audio_sessions/{session_id}/chunks/{chunk_sequence}
Upload new audio chunk
Params
Parameter | In | Type | Required | Description |
---|---|---|---|---|
session_id | path | string | true | Session id represents a unique identifier for an audio session |
chunk_sequence | path | integer | true | The chunk number. It must be a counter that starts at 0 and increases by one on each request |
Request Body
Format | Required | Schema |
---|---|---|
application/json | true | AudioChunk |
Responses
Code | Description | Schema |
---|---|---|
200 | The chunk was successfully uploaded | SessionRefs |
400 | The parameter is missing or not formatted properly | GenericError |
403 | The session is not writable anymore | GenericError |
404 | Resources don’t exist or have been deleted | GenericError |
409 | The chunk sequence is invalid | GenericError |
413 | Audio chunk size must be smaller than 1MiB | GenericError |
500 | Unexpected server error. If the error persists, you can contact support@cochl.ai to fix the problem. | GenericError |
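The upload request can be sketched as below: the raw audio is base64-encoded into the `data` field of an AudioChunk body and sent with PUT to the chunk path. The base URL is a placeholder, `build_upload_chunk` is a hypothetical helper, and applying the 1MiB limit to the raw bytes (rather than to the encoded payload) is an assumption.

```python
import base64
import json
import urllib.parse
import urllib.request

# Placeholder host: the real base URL is not given in this reference.
BASE_URL = "https://sense.example.invalid"
MAX_CHUNK_BYTES = 1024 * 1024  # 1MiB


def build_upload_chunk(session_id: str, chunk_sequence: int,
                       raw_audio: bytes) -> urllib.request.Request:
    """Build a PUT /audio_sessions/{session_id}/chunks/{chunk_sequence} request."""
    if chunk_sequence < 0:
        raise ValueError("chunk_sequence starts at 0 and grows by one per request")
    if len(raw_audio) > MAX_CHUNK_BYTES:
        # Assumption: the size limit is checked against the raw audio bytes.
        raise ValueError("chunk exceeds the 1MiB limit")
    body = {"data": base64.b64encode(raw_audio).decode("ascii")}
    safe_id = urllib.parse.quote(session_id, safe="")
    return urllib.request.Request(
        f"{BASE_URL}/audio_sessions/{safe_id}/chunks/{chunk_sequence}",
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


put_req = build_upload_chunk("my-session-id", 0, b"\x00\x01\x02\x03")
```

After a successful upload, the `chunk_sequence` field of the returned SessionRefs gives the sequence number to use for the next chunk.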
Read Status
GET /audio_sessions/{session_id}/status
Get session status
Note that if not all chunks have been inferenced yet, the server waits for at least one result to be available in the requested page range before returning. This waiting can cause HTTP request timeouts, so we recommend implementing client retry logic.
Params
Parameter | In | Type | Required | Description |
---|---|---|---|---|
session_id | path | string | true | Session id represents a unique identifier for an audio session |
offset | query | integer | false | How many existing elements to skip before returning the first result |
count | query | integer | false | Limits the length of the returned results array, which controls how many results are received and bounds the size of the returned payload |
next_token | query | string | false | The next token can be taken from a previous page result. It allows iterating through all the following elements of a collection. If next_token is set, offset and count will be ignored |
Responses
Code | Description | Schema |
---|---|---|
200 | Successful operation | SessionStatus |
400 | Parameter is missing or not formatted properly | GenericError |
404 | Resources don’t exist or have been deleted | GenericError |
500 | Unexpected server error. If the error persists, you can contact support@cochl.ai to fix the problem. | GenericError |
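One possible shape for the recommended client retry logic is sketched below. It retries a caller-supplied function on timeouts with exponential backoff. The helper name, the default attempt count, and the delays are arbitrary choices for illustration, and which exception your HTTP client raises on timeout depends on the library, so `retry_on` is left configurable.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(call: Callable[[], T],
                    attempts: int = 5,
                    base_delay: float = 0.5,
                    retry_on: Tuple[Type[BaseException], ...] = (TimeoutError,),
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Run call(), retrying on the given exceptions with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return call()
        except retry_on:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the last failure.
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

Only timeouts are retried by default, so a 404 for a deleted session fails immediately instead of being repeated.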
Schemas
AudioChunk
Key | Type | Required | Description |
---|---|---|---|
data | string | true | Raw audio encoded as base64 |
AudioContentType
Type | Description |
---|---|
string | Type of audio to send. |
AudioMetadata
Key | Type | Required | Description |
---|---|---|---|
AudioType
Type | Description |
---|---|
string | Whether the audio that will be sent is a stream or a file |
Possible values:
- stream
- file
CreateSession
Key | Type | Required | Description |
---|---|---|---|
content_type | AudioContentType | true | Type of audio to send. |
metadata | AudioMetadata | false | Type of audio session’s metadata. |
total_size | integer | false | If set, the session state automatically changes to readonly once at least total_size bytes of audio chunks have been uploaded. For a stream, this can be used to limit the maximum duration of the session. For a file, this automatically starts inferencing once the whole file has been sent. We recommend setting the size for files, as it saves one API call to start the inferencing |
type | AudioType | true | Whether the audio that will be sent is a stream or a file |
GenericError
Key | Type | Required | Description |
---|---|---|---|
error | string | true | Human-readable description of the error. Note that the value should not be used programmatically as the description might be changed at any moment |
status_code | integer | true | HTTP status code returned |
Page
Key | Type | Required | Description |
---|---|---|---|
count | integer | true | The number of elements that have been returned |
next_token | string | false | The next token can be used in the next page request to get the following results. If not present, it means that the page has reached the end of the collection |
offset | integer | true | Index of the first return element |
total | integer | true | The total number of available elements in the collection at the moment |
Sense
Key | Type | Required | Description |
---|---|---|---|
in_progress | boolean | false | Is true when there are still some pending chunks that were uploaded but are not inferenced yet |
page | Page | false | Contains the range of elements that have been returned for a given collection |
results | array | false | Contains array of SenseEvent as results of what has been inferenced so far |
SenseEvent
Key | Type | Required | Description |
---|---|---|---|
end_time | number | true | Represents the end, in seconds, of the window where inference was done. Note that end_time is window_length after start_time |
start_time | number | true | Represents the start, in seconds, of the window where inference was done. Note that start_time will increase by window_hop on every step |
tags | array | true | Contains results of what has been inferenced at the same time |
SenseEventTag
Key | Type | Required | Description |
---|---|---|---|
name | string | true | Name of the sound recognized during the inference. |
probability | number | true | Probability that the event occurred. 0 means not possible at all and 1 means that it is certain |
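To show how the SenseEvent and SenseEventTag fields fit together, the sketch below keeps only the tag names whose probability reaches a threshold. The helper name, the 0.5 default threshold, and the tag names in the sample event are hypothetical; this reference does not list the actual tag names.

```python
from typing import List


def confident_tags(event: dict, threshold: float = 0.5) -> List[str]:
    """Return the names of the tags in a SenseEvent with probability >= threshold."""
    return [tag["name"] for tag in event["tags"] if tag["probability"] >= threshold]


# Sample event shaped like the SenseEvent schema (tag names are made up).
sample_event = {
    "start_time": 0.0,
    "end_time": 1.0,
    "tags": [
        {"name": "example_sound_a", "probability": 0.9},
        {"name": "example_sound_b", "probability": 0.1},
    ],
}
names = confident_tags(sample_event)
```

The right threshold depends on the application, so it is left as a parameter.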
SessionRefs
Key | Type | Required | Description |
---|---|---|---|
chunk_sequence | integer | true | Chunks are uploaded in sequence. This represents the sequence number of the next chunk to upload |
session_id | string | true | Session id of the session that can be used to interact with API |
SessionStatus
Key | Type | Required | Description |
---|---|---|---|
error | string | false | An error occurred during the session |
inference | Sense | true | Inference related status |
refs | SessionRefs | true | List of session links |
state | string | true | State in which the session is |
UpdateSession
Key | Type | Required | Description |
---|---|---|---|
make_readonly | boolean | false | If set to true, sets the session state to readonly. Note that setting make_readonly to false once the session is readonly will not make the session writable again |