A video API is an interface that provides audio and video transmission services. Video APIs fall into two main types: static video APIs and live video APIs.
A static video API provides video file playback services: the provider stores video files in the cloud, distributes them via CDN, and serves them through protocol interfaces such as HTTP and RTMP. YouTube and Instagram, for example, use static video APIs.
A live video API, by contrast, transmits audio and video in real time. Live.me, Yalla, and Uplive, for example, use live video APIs.
The static video API is easy to understand: it simply pulls video files from the cloud over a streaming protocol.
The live video API is more complicated: how does it ensure that video can be transmitted to the other end quickly, smoothly, and clearly?
In this article, we will introduce the logic behind the live video API in detail.
With the continuous improvement of network bandwidth and device performance, the live video API is being applied ever more widely, making many new scenarios possible.
A live video API must ensure that video data completes its end-to-end transmission within 500 ms while simultaneously guaranteeing a clear picture and a smooth call.
In other words, the live video API has to move large volumes of audio and video data quickly and reliably from end to end, and a complex system is required to guarantee its availability.
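To make the 500 ms end-to-end target concrete, here is a rough latency-budget sketch. The per-stage figures are illustrative assumptions, not measurements from any particular SDK:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LatencyBudget {
    /** Sums per-stage latencies so they can be checked against an end-to-end target. */
    public static long totalMs(Map<String, Long> stages) {
        return stages.values().stream().mapToLong(Long::longValue).sum();
    }

    public static void main(String[] args) {
        // Illustrative per-stage figures (assumptions, not SDK measurements).
        Map<String, Long> stages = new LinkedHashMap<>();
        stages.put("capture", 33L);        // ~1 frame at 30 fps
        stages.put("preprocess", 10L);
        stages.put("encode", 30L);
        stages.put("network", 100L);       // one-way transit on a reasonable path
        stages.put("jitterBuffer", 120L);
        stages.put("decode", 20L);
        stages.put("render", 17L);         // ~1 vsync at 60 Hz
        long total = totalMs(stages);
        System.out.println("end-to-end: " + total + " ms, within budget: " + (total <= 500));
    }
}
```

Even with generous stage estimates, the sum has to stay under the budget, which is why every module in the pipeline is optimized for latency.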
As shown in the figure, the live video API is mainly built from six functional modules:
Audio and video capture is the source of the data: it is collected from cameras, microphones, the screen, video files, recording files, and other channels.
It involves color spaces such as RGB and YUV, and audio characteristics such as sample rate, number of channels, bit rate, and audio frame size.
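As a concrete illustration of color spaces, converting a single RGB pixel to YUV can be done with the standard BT.601 full-range formula. This is a sketch of the textbook conversion, not any SDK's internal implementation:

```java
public class RgbToYuv {
    /** Converts one 8-bit RGB pixel to YUV using the BT.601 full-range formula. */
    public static int[] toYuv(int r, int g, int b) {
        int y = (int) Math.round( 0.299 * r + 0.587 * g + 0.114 * b);
        int u = (int) Math.round(-0.169 * r - 0.331 * g + 0.500 * b) + 128;
        int v = (int) Math.round( 0.500 * r - 0.419 * g - 0.081 * b) + 128;
        return new int[] { clamp(y), clamp(u), clamp(v) };
    }

    private static int clamp(int c) {
        return Math.max(0, Math.min(255, c));
    }

    public static void main(String[] args) {
        int[] yuv = toYuv(255, 0, 0); // pure red
        System.out.println("Y=" + yuv[0] + " U=" + yuv[1] + " V=" + yuv[2]);
    }
}
```

Cameras typically deliver frames in a YUV format (e.g. NV21 or I420), because separating luminance from chrominance is what makes later subsampling and compression efficient.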
Audio and video preprocessing processes the captured data one or more times, tailored to the business scenario, to optimize the user experience.
For example: echo cancellation, noise suppression, automatic gain control, and beauty filters.
Audio and video encoding ensures that the data can be transmitted quickly and safely over the network.
Commonly used encoding formats are H.264 and H.265 for video, and Opus and AAC for audio.
Audio and video transmission is the most complex module in a video API. To get the data to the other end quickly, stably, and with high quality across unpredictable network environments, streaming protocols such as RTP, HTTP-FLV, HLS, and RTMP can be used.
On top of the protocol, multiple algorithms must handle anomalies such as data redundancy, packet loss, out-of-order delivery, flow control, adaptive frame rate and resolution, and jitter buffering.
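To illustrate one of these mechanisms, the sketch below shows a toy jitter buffer that reorders out-of-order packets by sequence number before playout. It is a simplified illustration; real implementations also handle loss concealment, sequence-number wrap-around, and adaptive buffering delay:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

public class JitterBuffer {
    // Min-heap ordered by sequence number: {sequenceNumber, payload}.
    private final PriorityQueue<int[]> buffer =
            new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
    private int nextSeq; // next sequence number the player expects

    public JitterBuffer(int firstSeq) {
        this.nextSeq = firstSeq;
    }

    /** Insert a packet. Arrival order may be scrambled by the network. */
    public void push(int seq, int payload) {
        buffer.add(new int[] { seq, payload });
    }

    /** Drain every packet that is playable in order, stopping at the first gap. */
    public List<Integer> drainReady() {
        List<Integer> out = new ArrayList<>();
        while (!buffer.isEmpty() && buffer.peek()[0] == nextSeq) {
            out.add(buffer.poll()[1]);
            nextSeq++;
        }
        return out;
    }

    public static void main(String[] args) {
        JitterBuffer jb = new JitterBuffer(0);
        // Packets arrive out of order: seq 0, then seq 2 (seq 1 is delayed).
        jb.push(0, 100);
        jb.push(2, 300);
        System.out.println(jb.drainReady()); // only seq 0 is playable; seq 1 is missing
        jb.push(1, 200);
        System.out.println(jb.drainReady()); // now seqs 1 and 2 play in order
    }
}
```

The trade-off at the heart of this module is latency versus smoothness: the longer packets wait in the buffer, the fewer gaps the player sees, but the further the call drifts from real time.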
So when judging whether a video API is worth choosing, focus on whether the vendor has outstanding audio and video transmission capabilities.
Audio and video decoding means that once the data arrives at the receiving end, the receiver decodes it according to the encoding format it was sent in.
Video can be decoded in hardware or in software; software decoding generally uses the open-source library FFmpeg. Audio is typically decoded in software, using an FDK-AAC or Opus decoder depending on the audio encoding format.
Audio and video rendering is the last step in the video API pipeline. It looks very simple: you only need to call a system interface to render the data to the screen.
In practice, however, a fair amount of logic is needed to keep the video picture aligned with the audio. Fortunately, there is now a standard approach to keeping audio and video in sync.
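The common approach is to treat the audio clock as the master and adjust video playout against it. Below is a minimal sketch of that decision logic; the thresholds are illustrative assumptions, not values from any particular SDK:

```java
public class AvSync {
    public enum Action { DROP_FRAME, RENDER, WAIT }

    // Illustrative thresholds in milliseconds (assumptions, not standardized values).
    private static final long BEHIND_THRESHOLD_MS = 40;
    private static final long AHEAD_THRESHOLD_MS = 10;

    /**
     * Compares a video frame's presentation timestamp (PTS) against the
     * audio clock, which serves as the master clock.
     */
    public static Action decide(long videoPtsMs, long audioClockMs) {
        long diff = videoPtsMs - audioClockMs;
        if (diff < -BEHIND_THRESHOLD_MS) {
            return Action.DROP_FRAME; // video is too far behind audio: skip this frame
        }
        if (diff > AHEAD_THRESHOLD_MS) {
            return Action.WAIT;       // video is ahead of audio: delay rendering
        }
        return Action.RENDER;         // close enough: render now
    }

    public static void main(String[] args) {
        System.out.println(decide(1000, 1100)); // frame is 100 ms late -> DROP_FRAME
        System.out.println(decide(1000, 1005)); // nearly in sync      -> RENDER
        System.out.println(decide(1200, 1000)); // frame is 200 ms early -> WAIT
    }
}
```

Audio is chosen as the master because listeners notice audio glitches far more readily than a dropped or briefly delayed video frame.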
Building a complete real-time audio and video system is complex work. Fortunately, many video APIs handle these underlying complexities for us, so we only need to focus on the upper-level business logic.
The following sections show how to use the ZEGOCLOUD Video API to implement a video calling function.
The following diagram shows the basic process of User A playing a stream published by User B:
The following sections explain each step of this process in more detail.
Before creating a ZegoExpressEngine instance, we recommend adding the UI elements needed for basic real-time audio and video features, such as a view for the local video preview and a view for the remote video.
To create a singleton instance of the ZegoExpressEngine class, call the createEngine method with the AppID of your project.
/** Define a ZegoExpressEngine object */
ZegoExpressEngine engine;
ZegoEngineProfile profile = new ZegoEngineProfile();
/** AppID format: 123456789L */
profile.appID = appID;
/** General scenario */
profile.scenario = ZegoScenario.GENERAL;
/** Set application object of App */
profile.application = getApplication();
/** Create a ZegoExpressEngine instance */
engine = ZegoExpressEngine.createEngine(profile, null);
To log in to a room, call the loginRoom method.
/** create a user */
ZegoUser user = new ZegoUser("user1");
ZegoRoomConfig roomConfig = new ZegoRoomConfig();
/** Token is generated by the user's own server. For an easier and convenient debugging, you can get a temporary token from the ZEGOCLOUD Admin Console */
roomConfig.token = "xxxx";
/** onRoomUserUpdate callback can be received only by passing in a ZegoRoomConfig whose "isUserStatusNotify" parameter value is "true".*/
roomConfig.isUserStatusNotify = true;
/** log in to a room */
engine.loginRoom("room1", user, roomConfig, (int error, JSONObject extendedData)->{
// (Optional callback) The result of logging in to the room. If you only pay attention to the login result, you can use this callback.
});
Then, to listen for and handle various events that may happen after logging in to a room, you can implement the corresponding event callback methods of the event handler as needed.
engine.setEventHandler(new IZegoEventHandler() {
/** Common event callbacks related to room users and streams. */
/** Callback for updates on the current user's room connection status. */
@Override
public void onRoomStateUpdate(String roomID, ZegoRoomState state, int errorCode, JSONObject extendedData) {
/** Implement the callback handling logic as needed. */
}
/** Callback for updates on the status of other users in the room. */
@Override
public void onRoomUserUpdate(String roomID, ZegoUpdateType updateType, ArrayList<ZegoUser> userList) {
/** Implement the callback handling logic as needed. */
}
/** Callback for updates on the status of the streams in the room. */
@Override
public void onRoomStreamUpdate(String roomID, ZegoUpdateType updateType, ArrayList<ZegoStream> streamList, JSONObject extendedData){
/** Implement the callback handling logic as needed. */
}
});
To start the local video preview, call the startPreview method, passing the view for rendering the local video to the canvas parameter.
You can use a SurfaceView, TextureView, or SurfaceTexture to render the video.
/**
* Set up a view for the local video preview and start the preview with SDK's default view mode (AspectFill).
 * The preview_view below is a SurfaceView, TextureView, or SurfaceTexture object on the UI.
*/
engine.startPreview(new ZegoCanvas(preview_view));
To start publishing a local audio or video stream to remote users, call the startPublishingStream method with the corresponding stream ID passed to the streamID parameter.
/** Start publishing a stream */
engine.startPublishingStream("stream1");
Then, to listen for and handle various events that may happen after stream publishing starts, you can implement the corresponding event callback methods of the event handler as needed.
engine.setEventHandler(new IZegoEventHandler() {
/** Common event callbacks related to stream publishing. */
/** Callback for updates on stream publishing status. */
@Override
public void onPublisherStateUpdate(String streamID, ZegoPublisherState state, int errorCode, JSONObject extendedData){
/** Implement the callback handling logic as needed. */
}
});
To start playing a remote audio or video stream, call the startPlayingStream method, passing the corresponding stream ID to the streamID parameter and the view for rendering the video to the canvas parameter.
You can use a SurfaceView, TextureView, or SurfaceTexture to render the video.
/**
* Start playing a remote stream with the SDK's default view mode (AspectFill).
* The play_view below is a SurfaceView/TextureView/SurfaceTexture object on UI.
*/
engine.startPlayingStream("stream1", new ZegoCanvas(play_view));
To stop publishing the local audio or video stream to remote users, call the stopPublishingStream method.
/** Stop publishing a stream */
engine.stopPublishingStream();
If a local video preview was started, call the stopPreview method to stop it as needed.
/** Stop local video preview */
engine.stopPreview();
To stop playing a remote audio or video stream, call the stopPlayingStream method with the corresponding stream ID passed to the streamID parameter.
/** Stop playing a stream */
engine.stopPlayingStream("stream1");
To log out of a room, call the logoutRoom method with the corresponding room ID passed to the roomID parameter.
/** Log out of a room */
engine.logoutRoom("room1");