The WebSocket protocol is used for three purposes:

- Streaming media
- Receiving responses
- Sending flow-control messages to the service
  - Currently, only one flow-control message is supported; see the "End of stream" section below.

If you stream media over a different protocol (e.g. RTMP), the WebSocket connection is used only for receiving responses and sending flow-control messages.

WebSocket URL

Base URL

The base URL for the service is: wss://speech.verbit.co/ws
Note that the service is hosted in the US, which may affect latency.

👍
Service Region - Please contact us if your use case requires using our service in another region.

Authentication

There are two ways to authenticate when establishing a WebSocket connection to our API.

Connecting to a pre-booked session

In this scenario, we assume you have already placed an Order using the Ordering API.
The response from the Create Order endpoint includes the following section, which contains the URL to use for your WebSocket connection. Here is an example:

...
"delivery": [
  {
    "type": "websocket",
    "connection_params": {
      "url": "ws://speech.verbit.co/ws?token=eyJhbGciO...."
    }
  }
]
...

Notice that the URL already includes a token parameter. This token identifies and allows access to the Order you have created.
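
As an illustration of reading that section, here is a minimal Python sketch that pulls the WebSocket URL out of a parsed Create Order response. The function name `find_websocket_url` is hypothetical, and the sketch assumes the `delivery` list sits at the top level of the parsed JSON; the excerpt above elides the surrounding structure, so adjust the lookup if it is nested differently.

```python
from typing import Any, Optional


def find_websocket_url(order_response: dict[str, Any]) -> Optional[str]:
    """Return the URL of the first "websocket" delivery entry, or None."""
    # Assumption: "delivery" is a top-level key of the parsed response.
    for entry in order_response.get("delivery", []):
        if entry.get("type") == "websocket":
            return entry.get("connection_params", {}).get("url")
    return None


# Hypothetical response fragment; the token value is a placeholder.
sample_response = {
    "delivery": [
        {
            "type": "websocket",
            "connection_params": {
                "url": "wss://speech.verbit.co/ws?token=<ORDER_LEVEL_TOKEN>"
            },
        }
    ]
}

print(find_websocket_url(sample_response))
```

Returning `None` rather than raising lets the caller decide how to handle an Order that has no WebSocket delivery entry.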

As with all of our APIs, you must include your short-lived Customer-level token in the Authorization header. For more details about the general authentication scheme, see Authentication.

Here is an example of a complete WebSocket upgrade request:

GET /ws?token=<ORDER_LEVEL_TOKEN> HTTP/1.1
Host: speech.verbit.co
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13
Authorization: <SHORT_LIVED_CUSTOMER_LEVEL_TOKEN>
Origin: http://speech.verbit.co

And a successfully authenticated upgrade response:

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
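
Most WebSocket client libraries validate the upgrade response for you. If you need to check it by hand, the sketch below shows how the Sec-WebSocket-Accept value is derived from the Sec-WebSocket-Key under the standard WebSocket handshake (RFC 6455). The helper names `expected_accept` and `is_successful_upgrade` are illustrative and not part of the service's API.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def expected_accept(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for a given Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def is_successful_upgrade(status_code: int, headers: dict[str, str], sent_key: str) -> bool:
    """Check the status code, the Upgrade header, and the accept value."""
    # Header names are case-insensitive, so normalize them before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    return (
        status_code == 101
        and normalized.get("upgrade", "").lower() == "websocket"
        and normalized.get("sec-websocket-accept") == expected_accept(sent_key)
    )


# Values taken from the example request and response above.
print(expected_accept("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```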

Connecting to an ad-hoc session

This scenario is intended for quick-start streaming where the default settings are sufficient, i.e. you do not need any of the special configuration that can be specified when placing an Order.
It removes the need both to pre-book an Order and to obtain a short-lived token through Authentication.

However, because there is no Order-level token, you cannot reconnect to the same session if the connection drops. After a connection drop, you will need to start a new ad-hoc session, which results in a series of distinct sessions that are not formally related to each other. This makes it harder to merge the sequence and to run any further processing that assumes complete coverage of the corresponding real-life event.
This is the main caveat of the ad-hoc mode.

To start a new ad-hoc session, you can directly connect to the global WebSocket URL, using your permanent customer token (no need to use the Authentication API to get a short-lived token).

Here is an example of an ad-hoc session WebSocket upgrade request:

GET /ws HTTP/1.1
Host: speech.verbit.co
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13
Authorization: <PERMANENT_CUSTOMER_LEVEL_TOKEN>
Origin: http://speech.verbit.co
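
The two upgrade requests differ only in the request path and in which token goes into the Authorization header. The following sketch builds either request as raw text so the difference is easy to see. `build_upgrade_request` is a hypothetical helper, and the Host and Origin values simply mirror the examples above; in practice a WebSocket client library generates these headers for you.

```python
import base64
import os
from typing import Optional
from urllib.parse import quote


def build_upgrade_request(authorization: str, order_token: Optional[str] = None) -> str:
    """Build the upgrade request text for a pre-booked or an ad-hoc session.

    Pass order_token for a pre-booked session (with a short-lived customer
    token as authorization), or omit it for an ad-hoc session (with a
    permanent customer token as authorization).
    """
    path = "/ws" if order_token is None else f"/ws?token={quote(order_token, safe='')}"
    # The handshake key is 16 random bytes, base64-encoded (RFC 6455).
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    lines = [
        f"GET {path} HTTP/1.1",
        "Host: speech.verbit.co",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
        f"Authorization: {authorization}",
        "Origin: http://speech.verbit.co",
    ]
    # HTTP header lines end with CRLF; a blank line terminates the header block.
    return "\r\n".join(lines) + "\r\n\r\n"


print(build_upgrade_request("<PERMANENT_CUSTOMER_LEVEL_TOKEN>"))
```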

Media Parameters

This set of parameters describes the format of the media which the client is going to be streaming.

Name	Request Parameter	Required	Default
Format	format	False	`S16LE`
Audio sample rate	sample_rate	False	16000
Audio sample width	sample_width	False	2
Number of audio channels	num_channels	False	1
Request "Transcript" type responses	get_transcript	False
Request "Caption" type responses	get_captions	False

Here is an example of a URL with media format parameters:

wss://speech.verbit.co/ws?format=S16LE&sample_rate=16000&sample_width=2&num_channels=1&get_transcript=True&token=eyJhbGciO....
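
A URL of this shape can be assembled with the standard library instead of by string concatenation. This is a minimal sketch: `build_stream_url` is a hypothetical helper, the parameter names come from the table above, and the token is a placeholder.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "wss://speech.verbit.co/ws"


def build_stream_url(order_token: Optional[str] = None, **media_params: object) -> str:
    """Append media parameters (and an optional Order token) to the base URL."""
    query = dict(media_params)
    if order_token is not None:
        query["token"] = order_token
    # urlencode percent-encodes values and converts non-strings with str().
    return f"{BASE_URL}?{urlencode(query)}" if query else BASE_URL


url = build_stream_url(
    "<ORDER_LEVEL_TOKEN>",
    format="S16LE",
    sample_rate=16000,
    sample_width=2,
    num_channels=1,
    get_transcript=True,
)
print(url)
```

Omitting `order_token` produces a URL suitable for an ad-hoc session, where the token travels in the Authorization header instead.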

📘
It is important to specify the format of your input media correctly in order to avoid quality degradation or errors due to inaccurate decoding.
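
To see how the media parameters relate to the raw data you send, the sketch below computes the byte rate of an uncompressed PCM stream and splits a buffer into fixed-duration chunks. It assumes that `sample_width` is expressed in bytes (consistent with the default of 2 for `S16LE`). The 100 ms chunk duration is an arbitrary choice for illustration, not a documented requirement of the service.

```python
from typing import Iterator


def bytes_per_second(sample_rate: int = 16000, sample_width: int = 2, num_channels: int = 1) -> int:
    """Byte rate of raw PCM audio; sample_width is assumed to be in bytes."""
    return sample_rate * sample_width * num_channels


def iter_chunks(
    pcm: bytes,
    chunk_ms: int = 100,  # hypothetical chunk duration, chosen for illustration
    sample_rate: int = 16000,
    sample_width: int = 2,
    num_channels: int = 1,
) -> Iterator[bytes]:
    """Yield chunks that each hold a whole number of audio frames."""
    frame_size = sample_width * num_channels
    frames_per_chunk = sample_rate * chunk_ms // 1000
    chunk_size = frames_per_chunk * frame_size
    for start in range(0, len(pcm), chunk_size):
        yield pcm[start:start + chunk_size]


# With the default parameters, one second of audio is 32000 bytes.
print(bytes_per_second())
# 8000 bytes (a quarter of a second) split into 100 ms chunks.
print([len(chunk) for chunk in iter_chunks(bytes(8000))])
```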

End of stream

To signal that the media stream has come to an end, the client needs to send an explicit message.
This allows the API to differentiate between an unintended disconnection and an intended end of session.

The explicit End-of-Stream message is used in both scenarios, whether the media is streamed over the WebSocket or pulled from an external source (such as RTMP).

To signal the end of a stream, send the following message over the WebSocket connection:

{  
   "event": "EOS"  
}
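
A minimal sketch of producing and sending this message from Python follows. The `send_text` callable stands in for whatever send method your WebSocket client exposes, and sending the message as a text frame is an assumption; both helper names are hypothetical.

```python
import json
from typing import Callable


def end_of_stream_message() -> str:
    """Serialize the End-of-Stream message shown above."""
    return json.dumps({"event": "EOS"})


def send_end_of_stream(send_text: Callable[[str], object]) -> None:
    """Send the End-of-Stream message through the given send function."""
    # Assumption: the message is sent as a text frame, not a binary frame.
    send_text(end_of_stream_message())


# Usage with a stand-in sender that records what would be sent.
sent_messages: list[str] = []
send_end_of_stream(sent_messages.append)
print(sent_messages[0])
```

Send this message only after the last media chunk, since it marks the intended end of the session.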
