WebSocket - Connection
The WebSocket protocol is used for three purposes:
- Streaming media
- Receiving responses
- Sending flow-control messages to the service
  - Currently, only one flow-control message is supported; see the "End of stream" section below.
If you use a different protocol to stream media (e.g. RTMP), the WebSocket connection is used only for receiving responses and sending flow-control messages.
WebSocket URL
Base URL
The base URL for the service is: wss://speech.verbit.co/ws
Note that the service is hosted in the US, which may affect latency.
Service Region - Please contact us if your use case requires the service to run in another region.
Authentication
There are two ways to authenticate when establishing a WebSocket connection to our API.
Connecting to a pre-booked session
In this scenario, we assume you have already placed an Order using the Ordering API.
Within the response received from the Create Order endpoint, you will find the following section which contains the URL you need to use for your WebSocket connection. Here is an example:
...
"delivery": [
  {
    "type": "websocket",
    "connection_params": {
      "url": "ws://speech.verbit.co/ws?token=eyJhbGciO...."
    }
  }
]
...
Notice that the URL already includes a token parameter. This token identifies and allows access to the Order you have created.
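As a rough illustration of reading that section, the sketch below pulls the WebSocket URL and its token out of a Create Order response. The sample body is a trimmed, hypothetical response with a placeholder token, and the helper names `websocket_url_from_order` and `order_token` are illustrative, not part of the API.

```python
import json
from urllib.parse import parse_qs, urlsplit

# Hypothetical, trimmed Create Order response. Only the "delivery" section
# mirrors the documented example; the token value is a placeholder.
SAMPLE_RESPONSE = """
{
  "delivery": [
    {
      "type": "websocket",
      "connection_params": {
        "url": "ws://speech.verbit.co/ws?token=ORDER_LEVEL_TOKEN"
      }
    }
  ]
}
"""


def websocket_url_from_order(response_body: str) -> str:
    """Return the URL of the first delivery entry whose type is "websocket"."""
    order = json.loads(response_body)
    for entry in order.get("delivery", []):
        if entry.get("type") == "websocket":
            return entry["connection_params"]["url"]
    raise ValueError("no websocket delivery entry found in the response")


def order_token(url: str) -> str:
    """Extract the Order-level token from the URL's query string."""
    query = parse_qs(urlsplit(url).query)
    return query["token"][0]


url = websocket_url_from_order(SAMPLE_RESPONSE)
print(url)
print(order_token(url))
```

In practice the URL can usually be passed to a WebSocket client as-is; extracting the token separately is only needed if you build the request path yourself.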
As with all of our service APIs, you must also include your short-lived Customer-level token in the Authorization header. For more details about the general authentication scheme, see Authentication.
Here is an example of a complete WebSocket upgrade request:
GET /ws?token=<ORDER_LEVEL_TOKEN> HTTP/1.1
Host: speech.verbit.co
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13
Authorization: <SHORT_LIVED_CUSTOMER_LEVEL_TOKEN>
Origin: http://speech.verbit.co
And a successfully authenticated upgrade response:
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
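Most WebSocket client libraries validate the upgrade response for you. For readers writing or debugging a handshake by hand, the following sketch shows how the `Sec-WebSocket-Accept` value in the response relates to the `Sec-WebSocket-Key` sent in the request, following the rule in RFC 6455. The function names are illustrative.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def expected_accept(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value the server should return."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def handshake_succeeded(status_code: int, accept_header: str, sent_key: str) -> bool:
    """True if the response is a 101 whose accept value matches the key we sent."""
    return status_code == 101 and accept_header == expected_accept(sent_key)


# Values taken from the request and response examples above.
print(expected_accept("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```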
Connecting to an ad-hoc session
This scenario is intended for quick-start streaming where the default settings are sufficient, i.e. there is no need for any of the special configuration that can be specified when placing an Order.
It removes the need both to pre-book an Order and to obtain a short-lived token through Authentication.
However, because there is no Order-level token, there is no way to reconnect to the same session if the connection drops. If that happens, you will need to start a new ad-hoc session, which results in a series of distinct sessions that are not formally related to each other. This will hinder your ability to merge the sequence and run any further processing that assumes a complete record of the corresponding real-life event.
This is the main caveat of the ad-hoc mode.
To start a new ad-hoc session, you can directly connect to the global WebSocket URL, using your permanent customer token (no need to use the Authentication API to get a short-lived token).
Here is an example of an ad-hoc session WebSocket upgrade request:
GET /ws HTTP/1.1
Host: speech.verbit.co
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13
Authorization: <PERMANENT_CUSTOMER_LEVEL_TOKEN>
Origin: http://speech.verbit.co
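The two upgrade requests shown in this section differ only in the request path and in which token goes into the Authorization header. The sketch below builds either request as raw text to make that difference explicit. The helper names are hypothetical, the `Origin` header is simply copied from the examples, and a real client would normally let a WebSocket library generate the handshake.

```python
import base64
import os
from typing import Optional


def new_websocket_key() -> str:
    """Random 16-byte nonce, base64-encoded, as RFC 6455 requires."""
    return base64.b64encode(os.urandom(16)).decode("ascii")


def build_upgrade_request(authorization: str, order_token: Optional[str] = None) -> str:
    """Build the upgrade request text.

    With order_token set, this is the pre-booked form (short-lived token in
    Authorization). Without it, this is the ad-hoc form (permanent token).
    """
    path = "/ws" if order_token is None else f"/ws?token={order_token}"
    lines = [
        f"GET {path} HTTP/1.1",
        "Host: speech.verbit.co",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {new_websocket_key()}",
        "Sec-WebSocket-Version: 13",
        f"Authorization: {authorization}",
        "Origin: http://speech.verbit.co",
    ]
    # HTTP headers are CRLF-separated and terminated by an empty line.
    return "\r\n".join(lines) + "\r\n\r\n"


# Ad-hoc session: no Order-level token, permanent customer token.
print(build_upgrade_request("<PERMANENT_CUSTOMER_LEVEL_TOKEN>"))
```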
Media Parameters
This set of parameters describes the format of the media which the client is going to be streaming.
Name | Request Parameter | Required | Default |
---|---|---|---|
Format | format | False | S16LE |
Audio sample rate | sample_rate | False | 16000 |
Audio sample width | sample_width | False | 2 |
Number of audio channels | num_channels | False | 1 |
Request "Transcript" type responses | get_transcript | False | |
Request "Caption" type responses | get_captions | False | |
Here is an example of a URL with media format parameters:
wss://speech.verbit.co/ws?format=S16LE&sample_rate=16000&sample_width=2&num_channels=1&get_transcript=True&token=eyJhbGciO....
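A URL like this can be assembled programmatically rather than by hand. The following is a minimal sketch that starts from the base URL given earlier and the parameter names in the table. The function name `build_stream_url` and its keyword arguments are hypothetical, and the token is a placeholder.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "wss://speech.verbit.co/ws"


def build_stream_url(
    token: Optional[str] = None,
    audio_format: str = "S16LE",
    sample_rate: int = 16000,
    sample_width: int = 2,
    num_channels: int = 1,
    get_transcript: Optional[bool] = None,
    get_captions: Optional[bool] = None,
) -> str:
    """Build a WebSocket URL carrying the media parameters from the table."""
    params = {
        "format": audio_format,
        "sample_rate": sample_rate,
        "sample_width": sample_width,
        "num_channels": num_channels,
    }
    # Optional flags are only added when explicitly requested.
    if get_transcript is not None:
        params["get_transcript"] = get_transcript
    if get_captions is not None:
        params["get_captions"] = get_captions
    # The Order-level token is present only for pre-booked sessions.
    if token is not None:
        params["token"] = token
    return f"{BASE_URL}?{urlencode(params)}"


print(build_stream_url(token="ORDER_LEVEL_TOKEN", get_transcript=True))
```

For a pre-booked session, the URL returned by the Create Order endpoint already carries the token, so an alternative is to append the media parameters to that URL instead of rebuilding it.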
It is important to specify the format of your input media correctly in order to avoid quality degradation or errors due to inaccurate decoding.
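To see why these values matter, it can help to work out how many bytes of audio they imply. The sketch below assumes that `sample_width` is expressed in bytes per sample, which is consistent with the default of 2 for S16LE but is not stated explicitly here. The 100 ms chunk duration is an arbitrary illustration, not a service recommendation.

```python
def bytes_per_second(sample_rate: int = 16000, sample_width: int = 2, num_channels: int = 1) -> int:
    """Raw PCM byte rate implied by the media parameters (sample_width in bytes)."""
    return sample_rate * sample_width * num_channels


def chunk_size(duration_ms: int, sample_rate: int = 16000, sample_width: int = 2, num_channels: int = 1) -> int:
    """Size in bytes of a chunk holding duration_ms of audio, aligned to whole frames."""
    frame_bytes = sample_width * num_channels      # one sample for every channel
    frames = sample_rate * duration_ms // 1000     # whole frames in the chunk
    return frames * frame_bytes


print(bytes_per_second())   # 32000 with the default parameters
print(chunk_size(100))      # 3200 bytes for 100 ms of default-format audio
```

If the declared parameters differ from the actual stream, for example 44100 Hz audio declared as 16000 Hz, the same bytes are interpreted as a different duration of audio, which is one way decoding goes wrong.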
End of stream
In order to signal that the media stream has come to an end, the client needs to send an explicit message.
This allows the API to distinguish an unintended disconnection from an intended end of session.
The explicit End-of-Stream message is used in both scenarios: whether the media is streamed over the WebSocket or pulled from an external source (such as RTMP).
To signal the end of a stream, send the following message over the WebSocket connection:
{
  "event": "EOS"
}
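A minimal sketch of producing and sending this message follows. The `send_text` callable stands in for whatever send method your WebSocket client exposes, and sending the message as a text frame is an assumption rather than something stated here.

```python
import json
from typing import Callable


def end_of_stream_message() -> str:
    """Serialize the End-of-Stream flow-control message."""
    return json.dumps({"event": "EOS"})


def finish_stream(send_text: Callable[[str], None]) -> None:
    """Send the End-of-Stream message through the given send function.

    send_text is a placeholder for a WebSocket client's send method.
    """
    send_text(end_of_stream_message())


# Stand-in for a real connection: collect what would have been sent.
sent = []
finish_stream(sent.append)
print(sent[0])  # {"event": "EOS"}
```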