Disclaimer: This story is focused on how I identified and removed the sponsored posts, not on why I removed them. It’s an occasion to learn more about the DOM.
Today I opened Facebook and noticed something: my adblocker wasn’t working. As a developer, I decided to investigate. First things first, I inspected the structure of these sponsored posts to see if there was a way to identify them, so I could remove them with a script.
The structure looks pretty simple: we have an element with role “article” that contains a div with a class starting with “feed_subtitle”, and inside this last div, something like a bazillion spans with random words. Dude, seriously, WTF?
They are using a trick to display the word “Sponsored”: some of the spans are visible, some aren’t.
And to make things simpler… sometimes the parent is visible, but the child isn’t, and vice-versa.
Time to start constructing a script to get rid of this useless stuff. I left the subtitle div selected in the Chrome inspector and ran “$0.textContent” in the console:
Well, this was kind of expected: we need a function that finds the real text if we want to remove these ads.
To do this, we need a recursive function: it will obtain the list of child nodes of the element and remove the ones that are hidden.
The DOM is a big tree of nodes. The nodes that compose the page are usually of type “Element” and “Text”. It is important to notice that each node type allows a different set of child node types. The document node (the root), for example, can have a DocumentType node as a child, while the other nodes can’t.
The elements that compose the post are all under a node of type Element, which means we can find only these types of nodes: Element, Text, ProcessingInstruction, and Comment. [Specification].
We don’t have any interest in Comments and ProcessingInstructions, because they are not rendered by the browser. For this reason, we filter them out.
Back to removing the hidden nodes from the list. Only nodes of type Element can have a style, so they are the only ones that can be hidden (together with their children). The other node types don’t have a style, and we cannot use “getComputedStyle” on them. This means we need to check the style only on the Element nodes:
https://gist.github.com/maury91/fa02938f3ad4c9153de4c35ad4d831ac
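For readers who can’t open the gist, here is a minimal sketch of the filtering step described above. It is my own approximation, not necessarily the gist’s code: the name `visibleChildNodes` is made up, and it assumes a node is hidden only when its computed `display` is “none”. The `getStyle` parameter is injectable so the function can also be tried outside a browser.

```javascript
// Numeric node types defined by the DOM standard
// (Node.ELEMENT_NODE === 1, Node.TEXT_NODE === 3).
const ELEMENT_NODE = 1;
const TEXT_NODE = 3;

// Hypothetical helper: returns the child nodes that can contribute visible text.
// Assumption: "hidden" means a computed `display` of "none".
function visibleChildNodes(element, getStyle = (el) => window.getComputedStyle(el)) {
  return Array.from(element.childNodes).filter((node) => {
    // Text nodes have no style of their own, so they are always kept.
    if (node.nodeType === TEXT_NODE) return true;
    // Comments and processing instructions are never rendered: drop them.
    if (node.nodeType !== ELEMENT_NODE) return false;
    // Only Element nodes can be asked for a computed style.
    return getStyle(node).display !== 'none';
  });
}
```

In the Chrome console this could be tried with `visibleChildNodes($0)` on the selected subtitle div.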
Then, for each of these visible nodes, we collect the visible nodes inside it.
https://gist.github.com/maury91/13f1f7fbf61d02f9e51e93a10bbff78a
But we have a problem if we get the nodes of the elements recursively: the ones at the end (the leaves) will not have any child nodes.
We need to stop when we reach an element that contains only text and return what we are interested in: the text content.
https://gist.github.com/maury91/836cdeb0e1699d046dc643d9129e3529
Perfect! Now we have everything for our recursive function: the recursion step and the stop condition. Let’s merge all the pieces.
https://gist.github.com/maury91/76401e97b8309958ec2fef28f11a1a5f
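As a rough illustration of how the pieces fit together, here is one way the merged function could look. This is a sketch under my own assumptions, not the gist itself: the name `getVisibleText` is hypothetical, hidden again means a computed `display` of “none”, and the stop condition is the text node itself rather than “an element that contains only text”, which amounts to the same result.

```javascript
const ELEMENT_NODE = 1; // Node.ELEMENT_NODE
const TEXT_NODE = 3; // Node.TEXT_NODE

// Hypothetical helper: returns the text a user would actually see inside `node`.
function getVisibleText(node, getStyle = (el) => window.getComputedStyle(el)) {
  // Stop condition: a text node is a leaf, so its content is the answer.
  if (node.nodeType === TEXT_NODE) return node.textContent;
  // Comments and processing instructions are never rendered.
  if (node.nodeType !== ELEMENT_NODE) return '';
  // Assumption: a hidden element (and everything under it) has display "none".
  if (getStyle(node).display === 'none') return '';
  // Recursion step: join the visible text of every child, in document order.
  return Array.from(node.childNodes)
    .map((child) => getVisibleText(child, getStyle))
    .join('');
}
```

The `getStyle` argument is only there so the function can be exercised with fake nodes; in the browser console, `getVisibleText($0)` is enough.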
Time to try the function on the selected element:
Perfect! It works!
Now we need a function that just says “yes” or “no” when we ask if that is the subtitle of a sponsored post. And just to be safe, it should work even when the subtitle is missing.
https://gist.github.com/maury91/7fee276352fd7b18441323327c3eda5b
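A minimal sketch of such a yes/no check could look like this. The names are mine: `getText` stands for any function that returns the text a user really sees in an element (such as a recursive visible-text function), and I’m assuming the label is exactly the English word “Sponsored”.

```javascript
// Hypothetical helper: `getText` is supplied by the caller and must return
// the visible text of an element.
function isSponsoredSubtitle(subtitle, getText) {
  // A missing subtitle (null or undefined) simply means "not sponsored".
  if (!subtitle) return false;
  // Assumption: the visible label is exactly "Sponsored" (English UI).
  return getText(subtitle).trim() === 'Sponsored';
}
```

Returning `false` for a missing subtitle keeps the later filtering step from throwing on posts that have no subtitle at all.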
Now that we have a way to know when a subtitle is a sponsored one, we just need to obtain all the sponsored posts on the page.
Let’s start with the easy bits: let’s get all the posts and keep only the ones that are sponsored.
https://gist.github.com/maury91/55432f39268010c2f24981420006830e
Now we need a function to know if a post is sponsored. We already have a function that identifies whether a subtitle belongs to a sponsored post, so all we need is to pass the subtitle of the post to that function.
https://gist.github.com/maury91/18f292082a276d8d612ce18eba503be3
All the pieces are coming together!
Now we need a function to remove these articles.
https://gist.github.com/maury91/60b89407de2e1d515f521c5987b33299
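Putting the last three steps side by side, a sketch could look like the following. Again, this is an approximation and not the gists’ code: the function names are hypothetical, the selectors are my reading of the structure described earlier (role “article”, class starting with “feed_subtitle”), and `isSponsoredSubtitle` is passed in as a one-argument predicate.

```javascript
// Hypothetical helper: returns every post under `root` whose subtitle is sponsored.
// `isSponsoredSubtitle` receives the subtitle element (or null) and returns a boolean.
function findSponsoredPosts(root, isSponsoredSubtitle) {
  // Every post is an element with role "article".
  const posts = Array.from(root.querySelectorAll('[role="article"]'));
  // Keep only the posts whose subtitle passes the check;
  // querySelector returns null when a post has no subtitle.
  return posts.filter((post) =>
    isSponsoredSubtitle(post.querySelector('[class^="feed_subtitle"]'))
  );
}

// Detaches each post from the page using the standard Element.remove().
function removePosts(posts) {
  posts.forEach((post) => post.remove());
}
```

In the browser this would be called as `removePosts(findSponsoredPosts(document, check))`, where `check` is whatever subtitle test you are using.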
Let’s try it!
It works!
Now only one piece is missing: removing these posts before I can see them.
There are surely a lot of smart ways to do that, but I’m very lazy, so I will simply observe all the new DOM elements added to the feed. To do this I can use MutationObserver, a useful feature that executes a callback every time a node is added to or removed from the subtree of a specific element.
When we create a new MutationObserver, we pass to the constructor a callback that will be executed when a change happens on the element we are observing. The changes we get notified of are chosen in the observe method, which we will see later. We are notified of these changes through a mutationList, a list of MutationRecord objects describing what happened on that Element. There are 3 types of mutations we can be notified of: “attributes”, “characterData” and “childList”. Each MutationRecord describes what changed; for example, a MutationRecord will be of type “attributes” if an attribute of that Element has changed.
Creating a MutationObserver isn’t enough: we also need to make it observe an Element. After creating it, we can use the method “observe”, which takes as arguments the Element to observe and what to observe. We can observe changes to the attributes of the Element, and/or changes to the hierarchy of the Element (nodes added or removed under the Element).
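To make the constructor/observe pair more concrete, here is a small generic sketch. The function name `watchForNewElements` and its parameters are made up for illustration; the `Observer` argument defaults to the browser’s `MutationObserver` and exists only so the function can be tried with a fake class outside a browser.

```javascript
// Hypothetical helper: calls `onElement` for every Element added under `target`.
function watchForNewElements(target, onElement, Observer = MutationObserver) {
  const observer = new Observer((mutationList) => {
    for (const mutation of mutationList) {
      // We only care about changes to the hierarchy, not attributes or text.
      if (mutation.type !== 'childList') continue;
      for (const node of mutation.addedNodes) {
        // 1 === Node.ELEMENT_NODE; skip text nodes, comments, and so on.
        if (node.nodeType === 1) onElement(node);
      }
    }
  });
  // childList: report added/removed children; subtree: include all descendants.
  observer.observe(target, { childList: true, subtree: true });
  return observer;
}
```

In the browser, something like `watchForNewElements(document.body, (el) => console.log(el))` logs each element as it is added; calling `disconnect()` on the returned observer stops the notifications.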
When Facebook adds a new post, it is always an element with an id starting with “u_fetchstream”.
https://gist.github.com/maury91/a549e72de82c4bfef41b537d93bf8e9c
I will break it down because it can seem a little bit complex:
Perfect, now we have a list of every post that is added to the page!
We just need to use the earlier functions to detect which of these posts are sponsored and remove them. Time to see the final result:
https://gist.github.com/maury91/b6b30e7158896a69def366663846bdaf
Now the only thing remaining is to run this code automatically every time I open Facebook.
To do so, I will use Greasemonkey (or Tampermonkey in Chrome).
You can find the user script here: https://gist.github.com/maury91/d054d9f38650d70b64ec583845231f20/raw/00af0da1d5f8de4934fb0874effda6dce9e11daf/removeFacebookAdsNew.user.js
Have fun getting rid of the new Facebook ads! (I know it will not work for long.)