How to develop a WYSIWYG editor in Android

by Ankit Kumar, January 28th, 2019

Develop a WYSIWYG editor for Android in one day.

If you are a developer, you may have tried out Medium’s editor. It is a world-class service for bloggers and its user experience is an industry standard for blogging sites.

Index —

What is a WYSIWYG Editor?
Internal Architecture of the Editor
How to store the content?

This post will help you understand how to think when you want to build an editor: what the building blocks are and which basic rules you must pay attention to.

I’ve also given a link to my sample at the end of this post, in case you get stuck somewhere or want to go straight to the code.

So let's begin.

What is a WYSIWYG Editor?

WYSIWYG is an acronym for “what you see is what you get”. A WYSIWYG editor is a system in which content can be edited in a form closely resembling its appearance when printed or displayed as a finished document.

How does the Medium editor look?

Medium Editor (Android)

Above is a screenshot of the editor. It has a clean and intuitive interface. All the editing tools are presented on the bottom toolbar. The user can type their content and format it accordingly.

At first glance, it seems to be a custom EditText which uses Spannables to format its content. But with Spannables, it would be difficult to manage the formatting of long text, say 10k words. According to the docs,

This is the interface for text to which markup objects can be attached and detached.

Attach the specified markup object to the range start...end of the text.

At the core, every span is attached to a range of the text, from a start index to an end index (see [Spans](https://developer.android.com/guide/topics/text/spans)). On each character change in the EditText, the ranges of the affected spans have to be updated and the text laid out again. With a very long, heavily formatted document, that is a lot of work to finish in a short time, so it can hurt UI rendering performance and produce a terrible experience when the user types or deletes content quickly.

So to verify, let’s get into depth and try to observe the ‘bars behind the concrete’.

Internal Architecture of the Editor

X-ray of the editor

Layout bounds of editor

The picture shows the Medium app with the developer option for showing layout bounds turned on. The active cursor is inside a box. This box is still an EditText. Now let's look at another screenshot, taken after we type something.

On Typing something

When we press Enter, the cursor moves to a new line and another box is created. Hence we can conclude that it is not a single EditText, and that Spans are not used to lay out the whole document.

If we go on pressing enter, new boxes are added. So here is the concept —

Think of the editor as a container to which elements are added, or from which they are removed, based on user actions (Enter or Backspace). Each element is an independently stylable entity, be it Bold, H2, Blockquote, Image, List Item, and so on.

Now let's explore this further.

Basics of the internal architecture

There is a parent view which stacks its child views vertically. In Android, it can simply be a subclass of LinearLayout with vertical orientation.

Initially, a single EditText is added to the stack and given focus. Until the user presses Enter, all content goes into that same EditText.

After Enter is pressed, a new EditText is inserted into the stack. The parent view keeps track of the index of each view added to the stack, and of the view that is currently in focus. Have a look at the skeleton:

Skeleton of our editor
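
To make the skeleton concrete, here is a minimal, framework-free sketch in plain Java. It is only a model of the idea, not the code from the sample repository: the class `EditorSkeleton` and its method names are assumptions made up for this illustration, and a list of strings stands in for the stack of EditText views inside the LinearLayout.

```java
import java.util.ArrayList;
import java.util.List;

// Framework-free model of the editor container (hypothetical names).
// In a real app the container would be a vertical LinearLayout and
// each list entry would be a child view.
class EditorSkeleton {
    // Text of each block, ordered from top to bottom.
    private final List<String> blocks = new ArrayList<>();
    // Index of the block that currently holds the cursor.
    private int focusedIndex;

    EditorSkeleton() {
        blocks.add("");   // the editor starts with one empty text block
        focusedIndex = 0; // and that block has focus
    }

    // Until Enter is pressed, typed content goes into the focused block.
    void type(String text) {
        blocks.set(focusedIndex, blocks.get(focusedIndex) + text);
    }

    // Enter inserts a new empty block right below the focused one
    // and moves focus to it.
    void pressEnter() {
        focusedIndex++;
        blocks.add(focusedIndex, "");
    }

    int focusedIndex() { return focusedIndex; }
    int blockCount() { return blocks.size(); }
    String textAt(int index) { return blocks.get(index); }
}
```

Typing "Hello", pressing Enter and typing "World" leaves two blocks in this model, with focus on the second one.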

I think the internal skeleton is clear. Now, let’s look at the formatting part.

Basic building blocks of Editor

In our basic editor, we will support —

Heading styles like H1, H2…H5
Bold, Italics
Blockquotes
Image
Ordered/Unordered Lists
Horizontal dividers

We can sort this list into groups:

Group 1: Heading styles, Bold, Italics, Blockquotes, Lists

Group 2: Image

Group 3: Horizontal dividers

Here, we’ve grouped the elements based on the properties required to render them. In other words, the groups are text, image (multimedia), and horizontal divider.

The actions we can perform on the Group 1 —

We can use an EditText to take input.
Apply heading styles using fonts.
Show a blockquote by changing the background of the EditText.

Hence we can create a custom EditText. This view will store all the configurations applied to it. We will read all of them when we collect the editor data (covered later) to store it on the device or send it to the server.

For Group 1: We can have a custom view with all the parameters. This block (view) will be added to or removed from the editor.

The actions we can perform on the Group 2—

We use an ImageView to show the image.
Image captions can be taken via an EditText.

This can be a custom view composed of an ImageView and an EditText (for the caption on the image). It will be responsible for uploading the image to the server and storing the returned URL. It also needs to store the text entered into the EditText.

For Group 2: We can have a custom view with an ImageView and an EditText.

The actions we can perform on the Group 3 —

This is the simplest part. It will be a simple view which draws a line.

For Group 3: We can have a view with line drawable.
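
The three groups can be modelled as three kinds of block. The following is a rough sketch in plain Java with the Android views left out; all type and field names (`BlockType`, `TextStyle`, `TextBlock`, `ImageBlock`, `DividerBlock`) are assumptions chosen for this example and are not taken from the sample repository.

```java
import java.util.EnumSet;
import java.util.Set;

// The three kinds of block the editor container can hold (hypothetical names).
enum BlockType { TEXT, IMAGE, DIVIDER }

// Styles that apply to a whole text block (hypothetical names).
enum TextStyle { H1, H2, H3, H4, H5, BOLD, ITALIC, BLOCKQUOTE, ORDERED_LIST, UNORDERED_LIST }

abstract class Block {
    abstract BlockType type();
}

// Group 1: text plus the styles applied to it.
// In the app this data would live in the custom EditText.
class TextBlock extends Block {
    String text = "";
    final Set<TextStyle> styles = EnumSet.noneOf(TextStyle.class);

    @Override
    BlockType type() { return BlockType.TEXT; }
}

// Group 2: an image with a caption.
// In the app this would be the custom view with an ImageView and an EditText.
class ImageBlock extends Block {
    String url = "";     // set once the upload has returned a URL
    String caption = "";

    @Override
    BlockType type() { return BlockType.IMAGE; }
}

// Group 3: a horizontal divider, which carries no data of its own.
class DividerBlock extends Block {
    @Override
    BlockType type() { return BlockType.DIVIDER; }
}
```

Keeping the styles in a set on the text block matches the idea that a style belongs to the whole block, not to a range of characters inside it.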

To make the editor feel stable, it must react to user actions in a natural and predictable way. So there are some basic rules which need to be followed.

Rules of Editor

Initially, there will be an EditText with the cursor in it.
If a user is typing a paragraph and presses Enter, we need to add another row (EditText).
If a user applies a style, it applies to the whole content of the block.
If a user puts the cursor in the middle of the text and presses Enter, take the content after the cursor, insert a new block, and copy the content into it.
If a user tries to insert an image, take the index of the currently focused view and insert a new image block below it. Also insert an EditText block below the image (so that the user can tap below the image and continue writing).
If a user deletes the whole content of a block, move the cursor to the previous edit block.

These are only a few of the basic rules; many more can be added in a similar fashion.
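
A few of these rules can be sketched in plain Java as follows. This is a simplified model under stated assumptions, not the actual implementation: `Row` and `RuleEditor` are hypothetical names, the cursor position is passed in as a plain index, focus is assumed to move to the new text row after an image is inserted, and an emptied row is assumed to be removed when the cursor moves up.

```java
import java.util.ArrayList;
import java.util.List;

// One row of the editor: either a text row or an image row (hypothetical model).
class Row {
    final boolean isImage;
    String text; // text of a text row, or caption of an image row

    Row(boolean isImage, String text) {
        this.isImage = isImage;
        this.text = text;
    }
}

class RuleEditor {
    final List<Row> rows = new ArrayList<>();
    int focused = 0;

    RuleEditor() {
        rows.add(new Row(false, "")); // start with one empty text row in focus
    }

    // Enter: move the content after the cursor into a new row below.
    // With the cursor at the end, the new row is simply empty.
    void onEnter(int cursor) {
        Row current = rows.get(focused);
        String tail = current.text.substring(cursor);
        current.text = current.text.substring(0, cursor);
        rows.add(focused + 1, new Row(false, tail));
        focused++;
    }

    // Insert image: add an image row below the focused row and an empty
    // text row below the image so the user can continue writing.
    // Moving focus to that text row is an assumption of this sketch.
    void insertImage() {
        rows.add(focused + 1, new Row(true, ""));
        rows.add(focused + 2, new Row(false, ""));
        focused += 2;
    }

    // Backspace in an empty row: drop the row and move focus to the row above.
    // The first row is never removed.
    void onBackspaceInEmptyRow() {
        if (focused == 0 || !rows.get(focused).text.isEmpty()) {
            return;
        }
        rows.remove(focused);
        focused--;
    }
}
```

For example, with "HelloWorld" in the first row, calling `onEnter(5)` leaves "Hello" in the first row and puts "World" in a new, focused second row.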

How to store the content (text, image…)?

So far we have covered the structure of the editor. Now let’s get to content storage. We cannot store just the plain text and URLs, because there is a lot of styling involved. This part covers how the content should be stored on the server, and on the device as well, so that it can be rendered properly.

Data Storage Structure

We will have a helper class which traverses the parent container and collects the following information from each block, according to the groups we defined earlier.

From Group 1: type of entity (text), text, and styles.
From Group 2: type of entity (image), image URL and caption text
From Group 3: type of entity.

This is one possible structure for storing a blog post:

A basic structure of editor data
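
As an illustration of such a structure, here is a small plain-Java sketch that turns a list of entities into a JSON array. The field names (`type`, `text`, `styles`, `url`, `caption`) and the type values (`text`, `image`, `hr`) are assumptions made for this example and may differ from the format used in the sample repository. The string escaping is deliberately minimal; on Android, a JSON library such as `org.json` would normally do this work.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// One stored entity (hypothetical field names).
class Entity {
    final String type;         // "text", "image" or "hr"
    final String text;         // text, or image caption; null for "hr"
    final String url;          // image URL; null otherwise
    final List<String> styles; // style names of a text entity

    private Entity(String type, String text, String url, List<String> styles) {
        this.type = type;
        this.text = text;
        this.url = url;
        this.styles = styles;
    }

    static Entity text(String text, String... styles) {
        return new Entity("text", text, null, Arrays.asList(styles));
    }

    static Entity image(String url, String caption) {
        return new Entity("image", caption, url, Collections.<String>emptyList());
    }

    static Entity hr() {
        return new Entity("hr", null, null, Collections.<String>emptyList());
    }
}

class EditorSerializer {
    // Minimal escaping: backslash, double quote and newline only.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n") + "\"";
    }

    // Builds a JSON array with one object per entity, in document order.
    static String toJson(List<Entity> entities) {
        StringBuilder out = new StringBuilder("[");
        for (int i = 0; i < entities.size(); i++) {
            Entity e = entities.get(i);
            if (i > 0) out.append(",");
            out.append("{\"type\":").append(quote(e.type));
            if (e.type.equals("text")) {
                out.append(",\"text\":").append(quote(e.text)).append(",\"styles\":[");
                for (int j = 0; j < e.styles.size(); j++) {
                    if (j > 0) out.append(",");
                    out.append(quote(e.styles.get(j)));
                }
                out.append("]");
            } else if (e.type.equals("image")) {
                out.append(",\"url\":").append(quote(e.url));
                out.append(",\"caption\":").append(quote(e.text));
            }
            out.append("}");
        }
        return out.append("]").toString();
    }
}
```

In this sketch, a document containing a single H1 text block "Hello" serializes to `[{"type":"text","text":"Hello","styles":["h1"]}]`.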

After we collect data in this format, we can send it to the server or store it as a draft.

Rendering the data

Now, if a user comes back to a draft of the blog post, we need to read the JSON structure and rebuild the views from it. Here is how it is done:

Loop through the array of entities in our JSON data.
Instantiate an entity of the matching type (Text, Image or HR) and configure it with the given data.
Add the entity to the parent view.
Bring focus to the first child.

We have now successfully restored the structure of the draft from raw data.
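
The rendering steps above can be sketched in plain Java like this. The sketch assumes the JSON has already been parsed into a list of maps, and that the keys are `type`, `text`, `url` and `caption`; those keys, and the names `RenderedBlock` and `DraftRenderer`, are hypothetical. Styles are left out to keep the example short, and a list stands in for the parent view.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Stand-in for a view that has been created and configured (hypothetical).
class RenderedBlock {
    final String kind; // "text", "image" or "hr"
    final String text; // text or caption; null for "hr"
    final String url;  // image URL; null otherwise

    RenderedBlock(String kind, String text, String url) {
        this.kind = kind;
        this.text = text;
        this.url = url;
    }
}

class DraftRenderer {
    // Stands in for the parent view that holds the blocks.
    final List<RenderedBlock> parent = new ArrayList<>();
    // Index of the focused child, or -1 when the draft is empty.
    int focusedIndex = -1;

    void render(List<Map<String, String>> entities) {
        parent.clear();
        // Step 1: loop through the array of entities.
        for (Map<String, String> entity : entities) {
            String type = entity.get("type");
            if (type == null) {
                continue; // skip malformed entries
            }
            // Step 2: instantiate the block of the matching type and configure it.
            RenderedBlock block;
            switch (type) {
                case "text":
                    block = new RenderedBlock("text", entity.get("text"), null);
                    break;
                case "image":
                    block = new RenderedBlock("image", entity.get("caption"), entity.get("url"));
                    break;
                case "hr":
                    block = new RenderedBlock("hr", null, null);
                    break;
                default:
                    continue; // unknown types are ignored in this sketch
            }
            // Step 3: add the block to the parent.
            parent.add(block);
        }
        // Step 4: bring focus to the first child, if there is one.
        focusedIndex = parent.isEmpty() ? -1 : 0;
    }
}
```

Skipping unknown entity types instead of failing is a design choice of this sketch: it lets an older version of the app open a draft that contains block types it does not know yet.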

A sample repository for reference —

I have implemented a version of the editor which serves as a WYSIWYG Markdown editor. It is currently being used in the 1Ramp App (a social media application).

You can view the source code on GitHub.

Feel free to ask your questions in the comments.