Understanding Android to bits and bytes [Part 1] by @aditibhatnagar

by Offgrid, September 23rd, 2020

Too Long; Didn't Read

When you click Run, Android Studio builds your Android app using different tools and compilers, then combines all of the resulting files into one APK file. The APK is actually just a zipped file: call unzip on it and it'll open up. Finally, the APK is signed (for example with jarsigner) and aligned (and the two just rhymed).


Often, as engineers, we end up spending most of our time and focus on writing high-level code for our applications. We write code and click the Run button; something happens, and the app gets installed on the device.

This post is about that "something".

So what happens when you click that Run button in Android Studio?

One of the main things it does is build your app.

How is an Android app built? Think about it: what is there to build? You have Java code, resource files (XML), the Android manifest file, third-party libraries, drawables and so on. That covers several different types of files: Java files, XML files, image files etc. How are these different types of files built and then packed into a single APK file?

The flow somewhat goes like this:

Ref: https://stuff.mit.edu/afs/sipb/project/android/docs/tools/building/index.html#detailed-build

Different tools and compilers take care of each of these steps.

What happens to resources and XML files?

As the flow above shows, XML resource files are processed by AAPT, the Android Asset Packaging Tool, which generates R.java. The source code uses R.java to refer to the resources. AAPT also compiles the XML resources into a binary form and produces a resource table file named resources.arsc.

One can use Androguard to decode this format.

The resources.arsc file contains all the meta-information about the resources. Some of those are:

  1. the XML nodes (e.g. LinearLayout, RelativeLayout, etc.),
  2. the attributes (e.g. android:layout_width),
  3. the resource IDs.

The resource IDs refer to the real resources in the APK file. A resource ID is a 32-bit number of the form PPTTNNNN. PP is the package the resource is for; TT is the type of the resource; NNNN is the name of the resource in that type. For application resources, PP is always 0x7f.

The attributes are resolved to a value at runtime. The resolution process is smart about any re-direction (@dimen/... as opposed to 4dp or @color/... as opposed to "#FFaabbcc") and returns a usable value (a dimen value is resolved differently than a color value).

Source: StackOverflow
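To make the PPTTNNNN layout concrete, here is a minimal Java sketch that splits a resource ID into its three fields with shifts and masks. The class name `ResourceIdParts` and the example ID `0x7f020001` are made up for illustration; real IDs come from your app's generated R.java.

```java
// Sketch: split a 32-bit resource ID of the form PPTTNNNN into its fields.
// ResourceIdParts is a hypothetical helper, not part of the Android SDK.
class ResourceIdParts {
    // PP: top 8 bits, the package (0x7f for application resources).
    static int packageId(int resId) {
        return (resId >>> 24) & 0xFF;
    }

    // TT: next 8 bits, the resource type.
    static int typeId(int resId) {
        return (resId >>> 16) & 0xFF;
    }

    // NNNN: low 16 bits, the entry within that type.
    static int entryId(int resId) {
        return resId & 0xFFFF;
    }

    static String describe(int resId) {
        return String.format("package=0x%02x type=0x%02x entry=0x%04x",
                packageId(resId), typeId(resId), entryId(resId));
    }

    public static void main(String[] args) {
        int exampleId = 0x7f020001; // made-up ID, for illustration only
        System.out.println(describe(exampleId));
        // prints: package=0x7f type=0x02 entry=0x0001
    }
}
```

Which number maps to which resource type (layout, drawable, string and so on) is decided at build time, so the sketch reports the raw type number rather than guessing a name.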

Other than the XML files, resources such as images (PNG/JPG) remain uncompiled and are packed as they are into the final APK.

What happens to Java code?

Java source files get compiled to .class files by the Java compiler. As we know, Java gets compiled to bytecode. These .class files are then further compiled into a .dex file. That's also bytecode.

Hey, if both are bytecode, what is the difference?

This paper explains it beautifully, but here it is in brief:

  1. Android applications are usually written in Java language and are executed in the Dalvik Virtual Machine (DVM), which is different from the classical Java Virtual Machine (JVM).
  2. The DVM is developed by Google and optimized for the characteristics of mobile operating systems (especially for the Android platform).
  3. The bytecode running in Dalvik is translated from traditional JVM bytecode to the dex format by converting Java .class files with the conversion tool dx. Contrary to the DVM, the JVM uses pure Java class files.
  4. JVM bytecode is composed of one or more .class files (each of these contains one Java class). During run time, the JVM dynamically loads the bytecode for each class from the corresponding .class file. Dalvik bytecode, in contrast, is composed of only one .dex file, containing all the classes of the application.
  5. After the Java compiler has created JVM bytecode, the Dalvik dx compiler deletes all .class files and recompiles them to Dalvik bytecode. Afterwards dx merges them into one .dex file.
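One easy way to see that the two formats really are different files is to look at their first bytes. A JVM .class file starts with the magic number 0xCAFEBABE, while a .dex file starts with the ASCII characters "dex" followed by a newline. The sketch below is a small, assumption-laden helper (the class name `BytecodeSniffer` and its return labels are made up) that tells the two apart from a header.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

// Sketch: tell JVM bytecode from Dalvik bytecode by the file's magic bytes.
// BytecodeSniffer is a hypothetical helper written for this post.
class BytecodeSniffer {
    static String identify(byte[] header) {
        if (header.length < 4) {
            return "unknown";
        }
        // .class files begin with the bytes CA FE BA BE.
        if ((header[0] & 0xFF) == 0xCA && (header[1] & 0xFF) == 0xFE
                && (header[2] & 0xFF) == 0xBA && (header[3] & 0xFF) == 0xBE) {
            return "jvm-class";
        }
        // .dex files begin with "dex\n" followed by a version string.
        if (header[0] == 'd' && header[1] == 'e'
                && header[2] == 'x' && header[3] == '\n') {
            return "dalvik-dex";
        }
        return "unknown";
    }

    // Reads up to the first 8 bytes of a file.
    static byte[] readHeader(String path) throws IOException {
        byte[] buffer = new byte[8];
        try (InputStream in = Files.newInputStream(Paths.get(path))) {
            int read = in.read(buffer);
            return Arrays.copyOf(buffer, Math.max(read, 0));
        }
    }

    public static void main(String[] args) throws IOException {
        for (String path : args) {
            System.out.println(path + ": " + identify(readHeader(path)));
        }
    }
}
```

Run it against a compiled .class file and against the classes.dex you get from unzipping an APK to see both answers.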

Also, the opcodes for the two differ in length and definition. (Btw, Dalvik has 218 opcodes, which are essentially different from the 200 opcodes in Java.) More on opcodes later.

But I hope you got the gist of it; .dex is more optimized for Android.

What happens when Resources and Java Code have been compiled?

The Android package builder combines all of it into one .apk file. An APK file is actually just a zipped file. You can simply run the unzip command on it and it'll open up. Try it!
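Since an APK is a zip archive, the standard java.util.zip classes can read it too. Here is a minimal sketch that lists the entries of an APK; the class name `ApkLister` is made up, and the path is whatever APK you pass on the command line.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Sketch: treat an APK as a plain zip archive and list what is inside.
// ApkLister is a hypothetical helper written for this post.
class ApkLister {
    static List<String> listEntries(String apkPath) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipFile zip = new ZipFile(apkPath)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
        }
        Collections.sort(names); // sorted only to make the output stable
        return names;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ApkLister <path-to-apk>");
            return;
        }
        for (String name : listEntries(args[0])) {
            System.out.println(name);
        }
    }
}
```

On a typical APK you should see entries such as AndroidManifest.xml, classes.dex, resources.arsc, a res/ folder and a META-INF/ folder, which are exactly the build outputs discussed so far.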

What happens when APK has been built?

It is signed and aligned (and the two just rhymed).

One can use jarsigner to sign the APK. If you're curious to learn more about signing, check out my previous blog post.
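A JAR-style signature leaves visible traces inside the archive: a META-INF/MANIFEST.MF file, a signature file ending in .SF, and a signature block file ending in .RSA, .DSA or .EC. The sketch below only checks whether those entries are present in a list of entry names; it does not verify anything cryptographically, and the class name `JarSignatureCheck` is made up for illustration.

```java
import java.util.List;
import java.util.Locale;

// Sketch: look for the files that JAR-style signing adds under META-INF/.
// This is a presence check only, NOT a signature verification.
class JarSignatureCheck {
    static boolean looksJarSigned(List<String> entryNames) {
        boolean hasManifest = false;
        boolean hasSignatureFile = false;
        boolean hasSignatureBlock = false;
        for (String name : entryNames) {
            String upper = name.toUpperCase(Locale.ROOT);
            if (!upper.startsWith("META-INF/")) {
                continue;
            }
            if (upper.equals("META-INF/MANIFEST.MF")) {
                hasManifest = true;
            } else if (upper.endsWith(".SF")) {
                hasSignatureFile = true;
            } else if (upper.endsWith(".RSA") || upper.endsWith(".DSA")
                    || upper.endsWith(".EC")) {
                hasSignatureBlock = true;
            }
        }
        return hasManifest && hasSignatureFile && hasSignatureBlock;
    }
}
```

To actually verify a signature, rely on the real tooling (for example `jarsigner -verify`) rather than a check like this one.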

And then it's zipaligned! Zipalign is an archive alignment tool that provides important optimization to Android application (APK) files. The purpose is to ensure that all uncompressed data starts with a particular alignment relative to the start of the file. Specifically, it causes all uncompressed data within the APK, such as images or raw files, to be aligned on 4-byte boundaries. This allows all portions to be accessed directly with mmap() even if they contain binary data with alignment restrictions. The benefit is a reduction in the amount of RAM consumed when running the application.

(Source: https://developer.android.com/studio/command-line/zipalign)
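The "4-byte boundary" idea boils down to simple arithmetic: if a piece of uncompressed data would start at an offset that is not a multiple of 4, some padding has to be inserted before it. The sketch below shows only that arithmetic; it is not how zipalign itself is implemented, and the class name `AlignmentMath` and the sample offsets are made up.

```java
// Sketch: the arithmetic behind aligning data on 4-byte boundaries.
// AlignmentMath is a hypothetical helper; it does not modify any archive.
class AlignmentMath {
    static final int ALIGNMENT = 4;

    static boolean isAligned(long offset, int alignment) {
        return offset % alignment == 0;
    }

    // Number of padding bytes needed so that the data starts on a boundary.
    static long paddingFor(long offset, int alignment) {
        long remainder = offset % alignment;
        return remainder == 0 ? 0 : alignment - remainder;
    }

    public static void main(String[] args) {
        long[] sampleOffsets = {40, 41, 42, 43}; // made-up offsets
        for (long offset : sampleOffsets) {
            System.out.println("offset " + offset
                    + " aligned=" + isAligned(offset, ALIGNMENT)
                    + " padding=" + paddingFor(offset, ALIGNMENT));
        }
    }
}
```

For the sample offsets this reports padding of 0, 3, 2 and 1 bytes respectively, so every entry ends up starting at a multiple of 4.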

Isn't it cool? I'll dig more into ZipAligning in one of the upcoming posts.

So that explains what happens when the APK is built! But remember, this is only the part where the code and resources get compiled, zipped, signed and aligned to produce a single APK file. What happens when you "install" the APK? How does it load and function? How does the underlying hardware understand the bits and execute the instructions? There is a lot more to it, and we'll be talking about all of it!

So stay tuned and look forward to the next blog post! Like and share with people who might find this useful!

Interested in learning similar stuff? Join us @Unschool (a free and open cohort for devs and hackers).

I'd be glad to engage in conversations on similar topics. Feel free to reach out to me on Twitter @4ditiBhatnagar.

That's all for this post folks! Kudos, you made it till the end. I'll see you in the next post soon, till then keep hacking!

Relevant Interesting Links:

Dalvik OpCode Sheet

Comparison of Dalvik and Java Byte Code

Previously published at https://www.digitised.in/post/understanding-android-to-bits-and-bytes